Lesson Description
The "Agent Loop Overview" Lesson is part of the full, AI Agents Fundamentals, v2 course featured in this preview video. Here's what you'd learn in this lesson:
Scott explains the agent loop, showing how it manages tasks with uncertain steps and adapts to changing requirements. He demonstrates creating the loop, handling LLM responses, stop conditions, and streaming tokens for smoother interaction.
Transcript from the "Agent Loop Overview" Lesson
[00:00:00]
>> Scott Moss: Last time we left off, we did some single-turn evals to understand the sweet pseudoscience slash hand-wavy slash psychological thriller of trying to get our agent to do something. I went and looked at jobs that were hiring specifically for that. And yeah, there are a lot of companies hiring specifically for that, and, you know, paying well over six figures. So please take that seriously as far as evals go, and we'll touch more on evals later.
[00:00:26]
But in this lesson, we are going to cover the agent loop. The agent loop is essentially kind of what it sounds like. I've been hinting at it pretty much the last few lessons, and now we're going to build it. We got it for free with the Vercel AI SDK if you just use the stopWhen parameter, but now we're going to build it from scratch. So, just to go over the agent loop, I have a lot of notes here about it, but I'll mostly just talk freely about it.
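For reference, here's a minimal sketch of what that looks like when the SDK does the looping for you, assuming AI SDK v5; the model choice, step limit, and the messages and tools variables are all illustrative:

```ts
// Minimal sketch, assuming AI SDK v5. `messages` and `tools` are
// placeholders for your conversation history and tool definitions.
import { generateText, stepCountIs } from 'ai';
import { openai } from '@ai-sdk/openai';

const result = await generateText({
  model: openai('gpt-4o'),
  messages,                  // your running conversation history
  tools,                     // your agent's tool definitions
  stopWhen: stepCountIs(10), // SDK keeps looping tool calls, up to 10 steps
});
```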
[00:00:56]
It's like, so if you think about a workflow, right? A workflow is a set of, you know, nodes. Each node is like some work that needs to be done, and then there's a path where a node, you know, traverses to another node. So in a workflow, I might say, when this email comes in, then do this, and then depending on, let's say the email is this long, go do this other thing, or if it's not this long, go do this other thing.
[00:01:21]
These are deterministic paths that, if you were making the workflow, you would define yourself. Now, imagine if you think of that workflow as like a tree, right? Or even a graph that you can traverse. If you know how many nodes there are, you could just traverse it with a finite loop. It's known; you can just put a number in there and traverse it. But if it's unknown, well, then you have to do either recursion or a while loop. You know, you have to keep looping until you reach some type of end state.
[00:01:52]
That's essentially what the agent loop is. We don't really know what the end goal is, so we have to give all that agency to the LLM and let it decide when it's time to finish, whether it reached all of its goals and it's ready to answer, or we take control and say, oh, there's an error, or, you know, we don't want to spend too much money, so we're going to stop after this many steps, or this is taking too long, so we're going to stop, or you've used too many tokens, so we're going to stop you.
[00:02:20]
But if it's up to the LLM, it'll just keep going until it's ready to respond. And that's the point of an agent versus a workflow. Kind of like what I said at the beginning of the course, agents are really good when there's an unknown number of steps, or the steps themselves aren't clear and need some definition. Otherwise you could just use a workflow. And to do that, we're going to create a loop.
[00:02:45]
So I talked about the spectrum of agency before, you know, having a slider, giving things full agency. I kind of mentioned that agents aren't a solution for most problems. I would say most business use cases are probably easily solved with workflows, and you can just add LLMs into different nodes on that workflow. That's probably the more resilient, accurate, deterministic, cost-efficient approach, because most business problems are well defined and they don't need to be reinvented every time you start the workflow.
[00:03:14]
They're usually always the same. Maybe they change and then you change the workflow, but they definitely don't change every single time, at least for most use cases. Now, obviously there are some that do, and that's where agents are useful. That's why you also see people make general agents. The best use case for an agent is to be extremely general, because in being general there is no specific problem to solve.
[00:03:39]
So therefore the only thing that can solve it is something that can keep going until it is solved. That's a loop, that's an agent, right? So that's why general agents are typically the number one use case you see for agents. We're just going to dive right into it. There are a lot of ways to do it. Essentially, what we're going to do, I wrote some pseudo code here, but it's just going to be a while loop that we keep running.
[00:04:07]
We'll talk about the stop cases, but what we want to do is continue doing what we did before: we give the LLM the new user message and say, hey, generate your response. The LLM is either going to say, cool, I'm done, loop finished, or it's going to say, hey, can you call this tool for me and give me back the results? If the LLM asks for a tool call, then the loop continues, because the LLM is saying, I need the results of this tool in order to figure out what I need to do next, right?
[00:04:42]
That's what the LLM is doing, it's just trying to figure out what to do next. So in order for us to do that, we just need to append the results from the tool call to the messages array and feed that back into the LLM. When it sees it at the top of the loop, there'll be a new message at the bottom of that array, which will be the result of a tool call, and then it's up to the LLM to decide what to do next.
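As a rough sketch, that loop might look something like this, assuming AI SDK v5, where tools with execute handlers are run by the SDK and their results come back in response.messages; the model name and the ./tools module are hypothetical:

```ts
// A minimal hand-rolled agent loop sketch, assuming AI SDK v5.
import { generateText, type ModelMessage } from 'ai';
import { openai } from '@ai-sdk/openai';
import { tools } from './tools'; // hypothetical module with tool definitions

async function runAgent(messages: ModelMessage[]) {
  while (true) {
    const result = await generateText({
      model: openai('gpt-4o'),
      messages,
      tools,
    });

    // Append whatever the model produced (text, tool calls, and tool
    // results) to the bottom of the messages array, so the LLM sees it
    // at the top of the next iteration.
    messages.push(...result.response.messages);

    // If the model stopped for any reason other than wanting tool
    // results, it's done generating: exit the loop with the answer.
    if (result.finishReason !== 'tool-calls') {
      return result.text;
    }
  }
}
```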
[00:05:08]
Is it another tool call? Is it another set of tool calls? Is it a question it wants to ask the user? What is it, right? Along the way, we can also interrupt and do things like approvals and, you know, stop for errors and things like that outside of the LLM. That'll be something that we put in our loop. We don't want to give the LLM control of that; that would be a waste of tokens. Like, if we knew that a tool threw an error and that error should kill our server, like it was a 500 error or something, we don't need to feed that to the LLM to be like, hey, our server broke.
[00:05:40]
Can you generate a token to let us know our server broke? That's just a waste of money; just, you know, recover your server. So there are times when we would step in. Some of the stop reasons, like I said: you hit a token limit, this is costing us too much money, max iterations. User intervention, which is really hard to do, of course, depending on your infrastructure. We're running in a terminal, so obviously somebody can just hit Ctrl+C to stop it, but, you know, some UIs don't let you just stop an LLM while it's in the middle of thinking.
[00:06:13]
So it really just depends on the infrastructure you have. You can have an error threshold. I've seen this before: if you don't have some type of durable execution, some durable runtime where things can be retried transactionally and things like that, you might just say, hey, if we get three errors from tools, then something's going on, something's bad, let's just stop.
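As a sketch of what those guards could look like wrapped around the loop, with every limit being a made-up example rather than a recommendation, and callModel standing in for the generateText call from the loop above:

```ts
// Illustrative stop conditions around the agent loop; all limits and
// the callModel helper are hypothetical.
const MAX_STEPS = 10;       // cost / runaway guard
const MAX_TOOL_ERRORS = 3;  // error threshold
const MAX_TOKENS = 50_000;  // token budget

let steps = 0;
let toolErrors = 0;
let tokensUsed = 0;

while (true) {
  if (++steps > MAX_STEPS) break;
  if (toolErrors >= MAX_TOOL_ERRORS) break;
  if (tokensUsed > MAX_TOKENS) break;

  const result = await callModel(messages); // hypothetical LLM call
  tokensUsed += result.usage.totalTokens ?? 0;
  messages.push(...result.response.messages);

  // Count failed tool executions toward the error threshold here.

  if (result.finishReason !== 'tool-calls') break; // LLM says it's done
}
```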
[00:06:36]
So there's really no wrong way. It really just depends on the experience, the infrastructure you have, and things like that. The other thing we want to introduce here, not because we have to, but just because it's cool, is streaming the tokens versus generating them all at once. Up until this point, we've just been using generateText, which waits and buffers all the tokens and then shows them to us at one time.
[00:07:02]
That's not really a great experience, so we want to stream them as they come in. It's just a better experience, it's more performant. It's an experience we all expect from some type of, you know, chat, whether it's with an AI or not. We just prefer to see those things stream in versus having to wait. Sometimes you do have to wait and it's, you know, really hard to solve that problem, that's a UX problem, but for the most part, you can just stream them in.
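Swapping generateText for streamText might look something like this rough sketch, again assuming AI SDK v5; the model and messages are illustrative:

```ts
// Minimal streaming sketch, assuming AI SDK v5. streamText returns
// immediately; textStream is an async iterable of text chunks.
import { streamText } from 'ai';
import { openai } from '@ai-sdk/openai';

const result = streamText({
  model: openai('gpt-4o'),
  messages, // your running conversation history
});

// Write each chunk to the terminal as it arrives, instead of waiting
// for the full buffered response.
for await (const chunk of result.textStream) {
  process.stdout.write(chunk);
}
```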
[00:07:32]
You can check out lesson 4, or you can continue where you left off. Inside the run function, whatever you had in there, which probably isn't much at this point, you can just get rid of. We only had a little bit in there. We're pretty much redoing this whole function now to do the loop. If you checked out the lesson branch like an hour ago, there might have been some stuff in here; I've pushed it up since, and there's nothing in here now.
Or like I said, continue with what you had. Either way, we don't need what was already in there. That was just to demonstrate an LLM call. Now we're making a loop, which is completely different, so there's really nothing in there worth saving.