Lesson Description

The "Few-Shot Prompt" Lesson is part of the full, Practical Prompt Engineering course featured in this preview video. Here's what you'd learn in this lesson:

Sabrina introduces few-shot prompting. This technique provides two or more examples in the prompt, including edge cases. Models learn nuances and variations from a diverse set of inputs and outputs.


Transcript from the "Few-Shot Prompt" Lesson

[00:00:00]
>> Sabrina Goldfarb: All right, so now we're going to discuss few-shot prompting for those more complex tasks. So let's take a second and think, right? First, we covered the standard prompt. Then we saw the zero-shot prompt, very similar but subtly different, which had zero examples within our prompt. Then we saw a one-shot prompt, which had one example in its prompt, right? So I think we can probably surmise what few-shot prompting is, right?

[00:00:32]
And I promise this is the last time we're going to be talking about how many shots we're adding to our prompt, which, when you think about it, is actually kind of an interesting naming choice, but okay. So few-shot means more than one, right? Two or more shots, two or more examples in our prompt. And we want to do this when we have complex logic to consider, when we have many formats to consider, when we have a lot of edge cases to consider.

[00:01:04]
So we provide these two-plus examples to establish patterns and edge cases. The model is going to learn these nuances and variations from having very diverse examples. Now, when we talked about the one-shot prompt, we talked about how important it was to send a very generalized example, the easiest one for the model to replicate, so that it could generalize from it.

[00:01:39]
But here it's important to include variety. We want different inputs, different outputs, different edge cases, and we might even want to send error cases, right? This way the model can learn how we want each of these different kinds of cases handled. And research has shown that few-shot prompting becomes more effective as models get larger. Why is this important and why do we care? Well, models are trending larger, right?
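To make the "variety" point concrete, here is a minimal sketch of assembling a few-shot prompt from a diverse example set. The classification task, field names, and example texts are hypothetical illustrations, not from the lesson; the point is that typical cases, an edge case, and an error case all appear as shots.

```python
# Hypothetical few-shot classification examples: typical cases, an edge
# case with mixed intent, and an error case showing how to handle junk input.
examples = [
    {"input": "Order #123 arrived two days late.", "label": "shipping_delay"},
    {"input": "Still waiting on my package, it's been a week!", "label": "shipping_delay"},
    {"input": "Late delivery AND the item was damaged.", "label": "damaged_item"},
    {"input": "asdf qwerty", "label": "unclassifiable"},
]

def build_few_shot_prompt(examples, query):
    """Concatenate the labeled shots, then append the new query."""
    parts = ["Classify each support message into a category.\n"]
    for ex in examples:
        parts.append(f"Message: {ex['input']}\nCategory: {ex['label']}\n")
    # Leave the final category blank for the model to fill in.
    parts.append(f"Message: {query}\nCategory:")
    return "\n".join(parts)

prompt = build_few_shot_prompt(examples, "My tracking number shows no movement.")
print(prompt)
```

The resulting string would be sent as a single prompt; the diverse shots are what teach the model how each kind of case should be handled.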

[00:02:08]
So, whereas the early GPTs could only have a context window of about 4,000 tokens, like I said, now we have some models with one- or two-million-token context windows. So the larger these models get, the better few-shot prompting works, and the more we should use it. This is a really important technique for us to learn. So we're going to talk about the research behind this, and there was a research paper called "Language Models are Few-Shot Learners," okay?

[00:02:42]
And in this paper, there was evidence that zero-shot and one-shot prompting techniques got more effective as the language models got larger. But few-shot prompting increased more rapidly in accuracy as the models got larger. So as models scaled, we saw a much steeper increase in the accuracy of the model's answers with this type of prompting.

[00:03:11]
So it's really important to know that if zero-shot prompting got slightly better and one-shot prompting got slightly better, but few-shot prompting got significantly better, then when we're really struggling to get a complex task done, few-shot prompts are the types of prompts we want to use. Now, what about the ideal number of shots, right?

[00:03:34]
I'm sure everyone's wondering, well, how many examples should I provide? Because when I first thought of this, I just figured, okay, three, right? Because zero, one, and then two or more, so, sure, probably three. The paper that I mentioned, "Language Models are Few-Shot Learners," and a couple of other papers tend to disagree. But one thing we can say is that the ideal number of shots is however many diverse, high-quality examples you can provide.

[00:04:06]
Usually we stick to about four to eight, we see diminishing returns after ten, and some models actually degrade in performance with too many shots. So keep that in mind when you're utilizing these, right? Stick to around four to eight, but it is more important that our examples are diverse and high quality. When hand-engineered examples are low quality, we see model performance degrade as well.

[00:04:38]
So we don't necessarily want to use few-shot prompting for everything, right? Someone in chat even asked earlier, hey, why wouldn't we just use the best technique all the time? I mean, if we have the best technique in front of us, we might as well use it. But if it takes us more time to craft the prompt than to finish the entire solution ourselves, we shouldn't bother, right?

[00:05:04]
Complex cases are where you're really going to want to use this. But because it's so important that these hand-engineered examples are good, faithful not only to the solution you want but also to your actual inputs and outputs, creating the proper shots can take so much time and human energy that you might be better off just writing the algorithm, or whatever you're working on, yourself.

[00:05:37]
So when would I actually use this? I would use it for complex patterns with multiple variations. I would use it for classification tasks that have a lot of categories. Maybe if you want to standardize formats and you have a diverse array of inputs, right? This would be good for that. Or if you have really domain-specific tasks, tasks that require context. So if I'm working in a very large codebase and it requires some really domain-specific knowledge, that might be a really good time to use few-shot prompting.

[00:06:15]
And again, my pro tips: diversity in examples, right? Include edge cases and failure cases. Keep your examples concise but complete, as concise as you can make them without giving up any of that quality. And with few-shot prompting, you might want to test in a few different chats, changing the number of examples each time, to see which gets the best results. So I might open up, let's say, Claude and try with three shots, then open up Claude in another chat and try with eight shots, and see if the performance got better, got worse, or stayed the same, right?

[00:06:54]
So if I have all the time in the world, I can really test that. And that's going to come in handy when you're building AI applications more than anything. If I'm just building and I can put that first example in and get an answer with no problem, then great, why would I add more shots? But if I'm creating an AI application that has to be correct, you know, 90%, 95%, 99% of the time, then it's really worth experimenting with how many shots are ideal, not only for your prompt, but for the model you decide to use.
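The experiment described above, trying three shots in one chat and eight in another, could be automated with a small harness like the following sketch. Here `ask_model` is a stand-in for whatever chat API you call; it, the example format, and the eval set are all assumptions for illustration, not a real library interface.

```python
def make_prompt(examples, query, k):
    """Build a prompt using only the first k (input, output) example pairs."""
    shots = examples[:k]
    lines = [f"Input: {i}\nOutput: {o}" for i, o in shots]
    lines.append(f"Input: {query}\nOutput:")
    return "\n\n".join(lines)

def compare_shot_counts(examples, eval_set, ask_model, counts=(3, 8)):
    """Score each shot count against a small labeled eval set.

    `ask_model` is a hypothetical callable: prompt string in, answer out.
    Returns a dict mapping shot count -> accuracy on the eval set.
    """
    results = {}
    for k in counts:
        correct = 0
        for query, expected in eval_set:
            answer = ask_model(make_prompt(examples, query, k))
            correct += int(answer.strip() == expected)
        results[k] = correct / len(eval_set)
    return results
```

With a labeled eval set, you could then compare `results[3]` against `results[8]` to see whether extra shots help, hurt, or make no difference for your chosen model.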

[00:07:32]
Something else to note is that some models perform better with few-shot prompts than others. There's the very odd case of models that actually perform slightly worse with few-shot prompting. It's very rare; it's come up maybe once, and it's covered in that research paper if anyone's interested in learning more. But in theory, this technique should get more and more helpful as models keep getting larger.

[00:08:02]
So let's go over to our Claude chat and put in an example of a few-shot prompt. Now, remember, with few-shot prompts we're going to be providing at least two examples, and often more, so these are going to be pretty long prompts. Feel free to follow along with me or just watch, whatever you're most comfortable with, but we're going to go through at least one of these together.

[00:08:33]
Okay, so let's say we want to analyze a business decision with the requested level of detail, right? Now, I'm going to write one of these out and then copy in some other shots, and we'll go through them. So the decision is going to be opening a new store location. The analysis level is a quick take, right? I just want a quick take. My response is: new location appears viable given foot traffic data and competitor absence, though initial investment is substantial.

[00:09:24]
So I'm going to stop here for one second and say, okay, here we're giving it an example of a shot, with a decision, an analysis level, and a response all attached to it. I'm going to copy in a couple of other shots and what I want my assistant to answer for me. And this is something you could find in any AI application, where maybe your users input some information and pick an analysis level, and you return a response at that level.
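In an application, the templating just described might look something like this sketch. The first shot is the one from the walkthrough; the extra shots she copies in aren't shown in the transcript, so only a placeholder comment marks where they would go, and the field names are our own assumptions.

```python
# Example shots for "analyze the business decision with the requested level
# of detail". Only the first shot is taken from the lesson walkthrough.
SHOTS = [
    {
        "decision": "Opening a new store location",
        "level": "quick take",
        "response": ("New location appears viable given foot traffic data and "
                     "competitor absence, though initial investment is substantial."),
    },
    # ...further shots (e.g. a "standard review" example) would go here.
]

def render_prompt(decision, level):
    """Prepend the example shots, then the user's decision and chosen level."""
    blocks = ["Analyze the business decision with the requested level of detail.\n"]
    for shot in SHOTS:
        blocks.append(
            f"Decision: {shot['decision']}\n"
            f"Analysis level: {shot['level']}\n"
            f"Response: {shot['response']}\n"
        )
    # Leave the response blank for the model to complete.
    blocks.append(f"Decision: {decision}\nAnalysis level: {level}\nResponse:")
    return "\n".join(blocks)

print(render_prompt("Implementing remote work policy", "standard review"))
```

The user-supplied decision and analysis level slot into the same template as the shots, which is exactly what steers the model toward a response of the matching length and style.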

[00:09:58]
So now I'm saying, okay, the decision is implementing a remote work policy, and the level I'm asking Claude for is a standard review. If we look at our standard review shot over here, we see that it's a couple of sentences long, so what we're really hoping for is a standard review, but about implementing remote work. So let's see what Claude has to say.

[00:10:29]
Okay, standard review. Remote work policy implementation presents balanced trade-offs worth careful consideration. Employee satisfaction surveys and retention data from similar companies show remote flexibility significantly improves recruitment and reduces turnover by 20 to 35%. Cost savings on office space could reach $8,000 to $15,000 per employee annually if transitioning to a hybrid model. And then we get some additional key considerations we didn't ask for.

[00:11:02]
So again, we're not controlling our output 100%. We will be covering that in another section, but we can see that we got exactly what we were looking for: a standard review on implementing a remote work policy. The shots gave Claude the idea of how much detail to go into and the kinds of stats we were looking for, and we got what we wanted from this.

[00:11:31]
Now, obviously this isn't a super complex example. So this is also something I probably could have done with a one-shot technique, right? But maybe it's really important for my AI application that I have the correct length of response, and that's why I wanted to provide those diverse examples and use a few-shot technique.
