Lesson Description

The "Context Placement" Lesson is part of the full, Practical Prompt Engineering course featured in this preview video. Here's what you'd learn in this lesson:

Sabrina discusses why the placement of context matters. Providing context at the beginning and end of the prompt is much more effective than placing it in the middle.


Transcript from the "Context Placement" Lesson

[00:00:00]
>> Sabrina Goldfarb: The next thing we are going to talk about is probably my favorite thing, which is context placement. Where context is placed in a prompt matters. The beginning of the prompt, as a research paper on the next slide shows, is better than the end of the prompt, which in turn is better than the middle of the prompt for information retention.

[00:00:34]
Models struggle with the middle of long contexts, so critical info should go first and supporting details should go last. This is a really hard prompting technique to actually utilize in production, but it's something that you really need to know, especially as models get larger and context windows get larger. So the next slide is our research. There was a research paper, "Lost in the Middle: How Language Models Use Long Contexts," and what this paper did is analyze the performance of language models on multi-document question answering.
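A minimal sketch of that placement idea in Python (the helper and its sections are hypothetical, just to make the ordering concrete): critical instructions go first, bulk reference material sits in the middle, and the key supporting details come last.

```python
# Hypothetical sketch: assemble a prompt so critical information sits at the
# beginning, bulk material sits in the middle, and supporting details come last.

def build_prompt(critical_instructions, bulk_context, supporting_details):
    """Order sections so the most important text avoids the middle of the prompt."""
    parts = [
        "## Critical instructions (read first)",
        critical_instructions,
        "## Reference material",
        *bulk_context,
        "## Key details to keep in mind",
        supporting_details,
    ]
    return "\n\n".join(parts)

prompt = build_prompt(
    critical_instructions="Answer only from the documents below and cite the document number.",
    bulk_context=["Document 1: quarterly sales figures.", "Document 2: product roadmap notes."],
    supporting_details="If the documents do not contain the answer, say so explicitly.",
)
print(prompt)
```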

[00:01:18]
So basically there were a bunch of documents, right? And the researchers asked the model a question whose answer was placed somewhere within those documents. So sometimes the answer was in an early document, sometimes it was in a late document, and sometimes it was right in the middle. And we can see that accuracy changes based on the position of the document with the answer.
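One way to act on that finding when you control the document order (a sketch with made-up document names, not code from the lesson): sort retrieved documents by relevance, then interleave them so the strongest candidates land at the beginning and end of the context and the weakest end up in the middle.

```python
# Hypothetical sketch: given documents already sorted by relevance (best first),
# re-order them so the strongest documents sit at the beginning and end of the
# prompt and the weakest end up in the middle, where retrieval quality suffers.

def order_for_context(docs_by_relevance):
    front, back = [], []
    for i, doc in enumerate(docs_by_relevance):
        # Alternate: best doc to the front, second-best to the back, and so on.
        (front if i % 2 == 0 else back).append(doc)
    return front + back[::-1]

docs = ["doc A (most relevant)", "doc B", "doc C", "doc D", "doc E (least relevant)"]
print(order_for_context(docs))
# ['doc A (most relevant)', 'doc C', 'doc E (least relevant)', 'doc D', 'doc B']
```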

[00:01:50]
This is really, really interesting because our brains and these large language models operate extremely similarly. There's something we have called the primacy bias, which means that we remember things better at the beginning of context. So if someone asks you to remember a list of words, you're more likely to remember the first few words on the list. It's called primacy bias. It's a psychology thing, right?

[00:02:18]
Then there's also something called recency bias, which means if I ask you to remember a bunch of things, you're also more likely to remember the items at the end of that list. So we have primacy bias and we have recency bias, and so do LLMs, which is pretty cool. And I guess we could kind of have made that assumption, because even though we didn't really get into neural networks today, there is a neural network behind LLMs, just like our brains have neural networks, right?

[00:02:49]
So it does make sense that the psychology humans have and the psychology LLMs have can be kind of similar in these ways. But what was really interesting in this study, my absolute favorite part that is just wild to understand, is that the LLMs performed worse when the answer was in the middle of the documents than when they had no documents at all. We can see this in the graph, which plots accuracy against the position of the document containing the answer.

[00:03:23]
Once the answer was in roughly the 7th position through the 16th position, accuracy fell below this red line, which is the closed-book baseline: nothing was provided to the model at all, and it still got the answer correct 56, 57% of the time. Which means when we provided too much context, even though the language model could take in all that context, it literally did worse than if we had provided no context at all. So I think that is just the coolest thing: models may have larger and larger context windows, but they cannot necessarily utilize those context windows to their fullest potential.

[00:04:05]
Stuff literally got lost in the middle. Information got lost in the middle. We can think about this again with humans, right? So if I'm really thinking about it, if I read a book that's 1,200 pages long with a lot of factual data in it, I'm more likely to remember some of that data at the beginning, some at the end, maybe something really cool in the middle, but I'm not going to remember everything, right?

[00:04:30]
There's just no way. There's too much to remember. And it's the exact same thing when it comes to these large language models. So, again, this is very hard for us to properly implement and do correctly, right? Because when do we say, let's start a new chat? When do we say we've given too much context? How do we know, right? And unfortunately, we can't know for sure.

[00:04:57]
That's part of the art, right? We're learning the science, and now we have to adjust to the art. And so this is my favorite thing that we talked about today, just because it really shows that you sometimes have to use a little bit of your gut and go along with how the model is performing. I keep telling you, if the performance in your chat is degrading, then change the model.

[00:05:25]
But what does that really mean, right? It means maybe I'm getting stuck, right? There's a bug that keeps popping up. Have you ever been in that situation where you're working on a feature and a bug pops up, and you say, hey, can you fix this bug? And it says, sure, I fixed it, and you say, you didn't fix it. And then it says, sure, now I fixed it, and it just keeps adding more and more garbage, right?

[00:05:49]
That's when you need to stop and try to revert all those changes, and hopefully you committed at some point recently so it's easy to revert them. That's when you need to say, let's start a new chat, because something is getting lost. It's also important to say: if I have critical information for my model, I put it at the beginning. If I have important information that comes up later on, I put it at the end.

[00:06:19]
But remember, as your chat continues to grow, what you put at the end is slowly moving toward the middle of your context. So bring it up again if you need to. If you have things that are important, continue to bring them up to avoid this lost in the middle effect, okay? I just think that's so exciting. So we're not going to continue on our code journey for this one, just because I think this is more theoretical, really thinking about what we're doing. But I would highly suggest, if anyone here has not started a new chat since we've started, do it now, right?
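A rough sketch of that "bring it up again" idea, with a hypothetical message list and an arbitrary length threshold: once the conversation grows, append the critical constraint again so it sits near the end of the context instead of drifting toward the middle.

```python
# Hypothetical sketch: re-surface critical instructions in a growing chat.
# The message format follows the common role/content pattern; the threshold of
# 10 messages is an arbitrary illustration, not a recommended value.

CRITICAL = "Only modify files under src/ and never touch the database migrations."

def with_reminder(messages, threshold=10):
    """Append the critical constraint again once the chat has grown long."""
    if len(messages) >= threshold:
        return messages + [{"role": "user",
                            "content": f"Reminder of the key constraint: {CRITICAL}"}]
    return messages

chat = [{"role": "system", "content": CRITICAL}]
chat += [{"role": "user", "content": f"Working on step {i}"} for i in range(12)]
chat = with_reminder(chat)
print(chat[-1]["content"])  # the constraint now sits at the end of the context again
```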

[00:07:05]
Say, hey, we've written a whole lot, we've gotten a whole lot of output. Something is about to get lost in the middle. I'm going to start a new chat right now. And that's all I want you to take from that section.
>> Student: Is the lost in the middle problem specifically for larger prompts, or does it come up with smaller prompts too?
>> Sabrina Goldfarb: It comes up with smaller prompts too. I would highly recommend everybody look into this research paper and read it.

[00:07:32]
It starts at just a couple thousand tokens, actually. And so if we go back and think about the fact that we have a system message, right? And we also have our own inputs, and we might also be attaching other things on top of that. Those 2,000 tokens can get filled up really quickly. Now, I'm not saying that 2,000 tokens is a threshold where it's guaranteed to happen, right? But it is where it happened in some of the research.
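If you want a rough feel for how quickly those couple thousand tokens fill up, here's a small sketch using the tiktoken tokenizer as one example; the exact tokenizer, and whether 2,000 tokens is the right budget to watch, depend on the model you're actually calling.

```python
import tiktoken

# Rough sketch: estimate how much context is already consumed by the system
# message, the user's input, and an attached snippet, before the model has
# produced any output. tiktoken's cl100k_base encoding is used as one example.
enc = tiktoken.get_encoding("cl100k_base")

def count_tokens(*texts):
    return sum(len(enc.encode(text)) for text in texts)

system_message = "You are a careful coding assistant. Follow the project style guide."
user_input = "Please review this function for bugs and suggest a fix."
attachment = "def add(a, b):\n    return a - b  # pasted snippet under review"

used = count_tokens(system_message, user_input, attachment)
print(f"~{used} tokens used before the model has said anything")
```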

[00:08:01]
So we just have to remember to be a little bit careful, because it's not just our inputs and outputs that affect how much context we've used, right? Other people have control over that as well, especially if you're using an AI application that's also connected to another AI application. So maybe I'm utilizing an AI application that has a system message that comes from Anthropic, but then there's also some sort of developer message being added by the AI application that's wrapped around it.
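To picture how those layers stack up, here's a purely illustrative message list; the vendor system prompt and the wrapper's developer preamble are made-up stand-ins, but they occupy context on every turn.

```python
# Illustrative sketch: the context the model actually sees is layered. The
# vendor's system prompt and the wrapping application's preamble are stand-ins
# here, but they take up tokens on every turn and never fall out of the window.
effective_context = [
    {"role": "system", "content": "(system prompt set by the model vendor)"},
    {"role": "system", "content": "(developer preamble added by the wrapping AI application)"},
    {"role": "user", "content": "(your actual question)"},
    {"role": "assistant", "content": "(the model's previous answer)"},
    {"role": "user", "content": "(your follow-up)"},
]

for message in effective_context:
    print(f"{message['role']:>9}: {message['content']}")
```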

[00:08:31]
So now all of a sudden I'm filling up even more and more context, and that context isn't going to fall off. It's just going to stay there forever. So now you might really experience this lost in the middle effect even more. It's just something to consider, and it's really tough to know exactly what to do with that information all the time. But especially if you are going to utilize a ton of documents or books, just keep in mind the fact that the model literally did better, right?

We'll go back to the chart. The model did better at getting the right answer with nothing than with too much.
