Lesson Description
The "Running an Agent Loop" Lesson is part of the full, AI Agents Fundamentals, v2 course featured in this preview video. Here's what you'd learn in this lesson:
Scott demonstrates executing tool calls sequentially, updating the UI to show progress, and pushing results into the messages array to maintain conversation flow. He also highlights presenting results in a user-friendly way for non-technical users.
Transcript from the "Running an Agent Loop" Lesson
[00:00:00]
>> Scott Moss: Well, let's keep it moving. So this one's pretty simple. Here, because we can have parallel tool calls, although our agent probably won't be doing parallel tool calls, we just want to get the tool calls from the tool calls array that we created and appended to, and then for each one of them, we just want to execute it, right? So we want to get the results, we want to await executeTool, and this takes two arguments: the first is the name, right?
[00:00:29]
And then it also takes the args. Pretty simple. So we get the result of that. Once we get the result, we just want to update the UI to say, hey, we are done running this tool call. In this case, the UI is showing you the name of a tool and a spinner, so we want to stop showing the spinner and the name of this tool in the UI to let the user know that we finished this.
[00:01:00]
So we could say toolCall.name or toolName. And then, if you had a UI where you wanted to show the results of the tool (I know Claude Code will show you the results, like it hit this web fetch or it hit this API or it crawled this website, and it'll show you a preview of the results it got back), you can pass in the results and do something there if you want to. It's up to you.
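A minimal sketch of that step in TypeScript, assuming a hypothetical executeTool helper and a hypothetical ui object driving the spinner (the arguments property on a tool call is args or input depending on your AI SDK version):

```ts
// Sketch: run each generated tool call in order and update the UI.
// executeTool and ui are hypothetical stand-ins for this project's pieces.
type ToolCall = { toolCallId: string; toolName: string; args: unknown };

async function runToolCalls(
  toolCalls: ToolCall[],
  executeTool: (name: string, args: unknown) => Promise<string>,
  ui: { finishToolCall: (name: string, result?: string) => void },
) {
  for (const toolCall of toolCalls) {
    // Execute the matching function with the arguments the LLM generated.
    const result = await executeTool(toolCall.toolName, toolCall.args);
    // Stop the spinner for this tool; optionally pass the result along
    // so the UI can render a preview of what came back.
    ui.finishToolCall(toolCall.toolName, result);
  }
}
```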
[00:01:20]
I think there still needs to be some work done on UX patterns for that, because showing a non-technical user a bunch of JSON that a tool returned would not be a good experience for someone who's not technical. So I still think there's a lot of work left to be done on the design side. And then, like I said before, we have to push the results of the tool call into the messages array.
[00:01:48]
We already pushed the tool call that was generated from the LLM, and now we have to respond to that with the results of that tool call in the same order. Order comes for free because we're using an array, so we don't have to worry about that. But yes, in this case, you do have a different role; this role would be tool. There are, well, technically four main roles: you'll have the system role, which is the system prompt that you'll probably only ever use one time at the beginning.
[00:02:16]
You've got the assistant role, right, which is the AI; you've got the user role, which is the user; and you have the tool role, which is when you're dealing with tool calls, right? Content in this case would not be a string; it would be an array of tool call responses. So in this case, the type would be tool result. The toolCallId is the toolCallId that we collected. If you do not add this, you will have issues.
[00:02:52]
The toolName is the toolName, and output is where you would put the results. This is type text, there we go, and the value is the result. So, type text. Does that mean that the result of every tool call, every function call that you do, has to be a string? Yes, it does. Does that mean it has to be human language? No. You can put a JSON object here; JSON is a string. It just has to be a string. Everything you return to the LLM has to be a string.
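Continuing those variables, the message being pushed looks roughly like this (AI SDK v5-style shapes; treat the exact property names as version-dependent):

```ts
// Respond to the assistant's tool call with a tool-role message.
messages.push({
  role: 'tool',
  content: [
    {
      type: 'tool-result',
      toolCallId: toolCall.toolCallId, // must match the LLM's tool call
      toolName: toolCall.toolName,
      output: {
        type: 'text',
        // Everything returned to the LLM has to be a string. A JSON
        // object is fine once stringified, because JSON is a string.
        value: typeof result === 'string' ? result : JSON.stringify(result),
      },
    },
  ],
});
```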
[00:03:30]
So remember that. OK, so the last thing we want to do is outside of the loop, because the loop is done now. So let's revisit that. As you can see, we don't have a break case in here, right? So once we push these tool results, we hit the end of our loop, and it's just gonna loop again. It's gonna go all the way back up and call streamText again. But what has changed since the last time you looped?
[00:04:02]
Anybody know? There's only one thing that's changed: the messages array. Exactly. The messages array has changed, and that is the conversation in the LLM's eyes. So all we have to do is keep appending to the messages array, keep manipulating it, and allow the loop to continue, and we'll either, A, hit an error that we throw or break on; B, hit a finish reason that is not tool calls, as in, I'm done;
[00:04:33]
or C, the AI wants to run some more tools, and that continues the loop. That's it. And then from there, we just want to update the UI to say, hey, I am done, here's the full response, and then we would just want to return all the messages here. So yesterday, in run.ts, we kind of just manually called runAgent and passed in some parameters. Do we still need to do that if we want it to run?
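Stitched together, the control flow of the loop looks something like this sketch. Here model, tools, executeTool, toolResultMessage, and ui are stand-ins for this project's own pieces; 'tool-calls' is the AI SDK's finish-reason string, and ModelMessage is the v5 message type (CoreMessage in v4):

```ts
import { streamText, type ModelMessage } from 'ai';

// Sketch of the whole agent loop. The only thing that changes between
// iterations is the messages array, which is the conversation.
async function runAgent(messages: ModelMessage[]): Promise<ModelMessage[]> {
  while (true) {
    const result = streamText({ model, messages, tools });

    // Consume the stream, then see how and why the model stopped.
    const text = await result.text;
    const finishReason = await result.finishReason;
    const toolCalls = await result.toolCalls;

    // Push the assistant's turn (text and/or tool calls) first.
    messages.push(...(await result.response).messages);

    // Exit when the model stops for any reason other than tool calls.
    if (finishReason !== 'tool-calls') {
      ui.showFinalResponse(text); // hypothetical "hey, I am done" UI update
      return messages;
    }

    // Otherwise run the tools, push their results, and loop again.
    for (const toolCall of toolCalls) {
      const output = await executeTool(toolCall.toolName, toolCall.args);
      messages.push(toolResultMessage(toolCall, output)); // as sketched above
    }
  }
}
```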
[00:05:04]
No, we can use the UI. Yep. Now that it's an actual conversational thing and the UI is expecting something conversational, we can use the UI, and I'm gonna show you how to use that. And hopefully that works. Not all LLMs support streaming, so you have to make sure; I would say most of them do, any good one does, but not all of them do. I think the AI SDK handles a lot of that for you.
[00:05:34]
Let's give this a run. So to get this to run, first, I'm just gonna do an npm install here to make sure nothing's broken. You need to do an npm run build. You might get some TypeScript errors; if you do, just go put any on all that stuff and call it a day. Once you build it, because it is a CLI that needs to run on your machine, you just need to install it locally. Let me make this bigger, sorry.
[00:06:07]
You do npm install -g, which installs it globally. Once it's installed globally, the command is agi. Then I can just say hello, and it streams back, right? And now what we can do is try to test its memory. So let's run through to see if it remembers stuff, because if we're doing it right, we're appending to the messages array every single time, so whatever context is in that array, it should be able to reference in this conversation.
[00:06:32]
Now, if you close this and start it back up, it'll forget, because this isn't persisting to a file somewhere. We're just putting everything in the messages array; there's no persistence here, but it works as long as we have it open. Actually, let's just start over from scratch so you can see everything. So if I say, my name is Scott: nice to meet you, Scott. So it definitely saw that initial message.
[00:07:08]
Then if I say, what is my name? Your name is Scott. OK, it's correct, so our implementation of appending to that messages array works. The next thing we want to do is test tool calling. At this point, you probably only have one tool unless you went and made some more, which is the get datetime tool. So that's pretty much the only tool that we have. Let's see how that works. In fact, I can ask it, what tools do you have?
[00:07:47]
Let's see what it says. I have two callable tools available: getDatetime and multi-tool use parallel. I'm guessing the AI SDK or the provider put that second one in there. This is actually a security concern. You know how people try to get the system prompt out of things like Claude Code or Cursor? There are open source repos of people figuring out how to get the system prompt of Claude Desktop or something, and they'll put it in a repo that people can go look at and use to try to hijack it.
[00:08:16]
So there's a lot of agent engineering involved in getting your agent not to tell your users what tools it has, what it can do, and what its instructions are, because if they knew that, they could figure out how to get around some of the guardrails and things like that. By default, GPT-4o mini is like, oh yeah, I'll tell you everything I can do, no problem, which I thought was pretty funny.
[00:08:43]
What is the current date and time? As you can see, I spelled time wrong, but it'll figure it out. You can see it called get datetime, it thought some more, it passed the result back in, and it said, there you go, Scott, the current date and time is this. This is really cool. Oh, I think so, I think it's really cool too, right? You just built a chat thing right in your terminal. It's really impressive.
[00:09:09]
I like it. It's pretty magical to me. Yeah, I've seen ChatGPT, I've seen chat agents, and I don't know why this feels so much more impressive, but it feels that way because you did it, because you wrote the code. That's why you understand it now. That's why it's more impressive. Cool, any other questions on the agent loop? At its core, technically, that's all an agent is: it's just this loop. And as you saw earlier, we don't even have to write that loop; we get it for free with the Vercel AI SDK, so it's even less code.
[00:09:43]
So again, making the agent technically work is very simple. It's so easy; you don't even need to write code anymore, you can click and drag things around. But that doesn't mean it's good, right? The question has never really been, how do we make an agent? That's solved now. It's, well, how do we make something that's good? So keep that in mind; that is a combination of prompt engineering, tool choice, context management, as you'll see in a second.
[00:10:16]
There are so many different things involved other than just writing the code for the agent, right? So although we did make this agent and we had to do some manual stuff, you will get that stuff for free using almost anything, and the real work is making it good, reliable, and useful. All right, OK. That is an agent loop. You now know how to make one from scratch. Now, before we move on, I do want to clarify that I don't know if there's a name for this type of loop.
[00:10:47]
But I would just say this is a traditional agent loop. Every framework implements this loop: there's a messages array, which is essentially the conversation; there are tool calls that are generated from an LLM that supports tool calling; and there is a map from those generated names to functions that you have on your system. You call them, you get the results, you append them to the messages array, and you continue that loop to infinity or until the LLM says, I'm done, right?
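That name-to-function map can be as simple as a plain object. A minimal sketch with the lesson's datetime tool (the tool name and signatures here are illustrative):

```ts
// Map the tool names the LLM generates to real functions on your system.
const toolRegistry: Record<string, (args: unknown) => Promise<string>> = {
  get_datetime: async () => new Date().toISOString(),
};

async function executeTool(name: string, args: unknown): Promise<string> {
  const fn = toolRegistry[name];
  if (!fn) throw new Error(`Unknown tool: ${name}`);
  // Results go back to the LLM as strings, so every tool returns a string.
  return fn(args);
}
```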
[00:11:20]
That is a traditional tool call loop. It's not the only one that people do. In fact, there are several different types of loops that people use to get different results. For instance, take Cursor. I don't believe Cursor implements something like this; I think Cursor has something a little more sophisticated, where they're more focused on an action-predictor loop process. It's not just one LLM iterating over things; it might be a series of steps: let's observe the state, so let's look at the file system.
[00:11:58]
And then, based off that state, let's think of the next likely action to take and pick one, and then there might be other steps involved where other LLMs are judging or voting on those steps. It can be more than just this, but this is the foundation. When you see other things out there, I don't know the names; there's no real category for them. Some people call them reasoning frameworks.
[00:12:22]
It's a different way of teaching the agent how to reason, and this loop is just the default reasoning framework. But you'll see other things that might involve code, like the observe-and-react process I just described. You might see other things like chain of thought, which is more prompt engineering or done at the model level. There's tree of thought. There are so many different ways to think of and implement the loop when it comes to how you want to get your agent to reason.
[00:12:51]
This is just the most basic way with one LLM, if that makes sense, and most frameworks just give you this for free. I don't typically use this, because the things that I build aren't conversational; they're more like background jobs, so I don't need a messages array or a conversation array. Yes? Is there a way to set up breakpoints to check the responses? The question is, is there a way to set up breakpoints to check responses?
[00:13:19]
This is just JavaScript, so you could set up breakpoints the same way you would in any Node.js application. It depends on your editor; in some editors you can click a thing and set up a Node debugger and do that. Me personally, I can't remember the last time I ever suspended an application and set a breakpoint. I prefer the poor man's breakpoint, which is a console.log.
[00:13:48]
So you can do a console.log and just log things and see them there if you want. Do note that whatever change you make to this run file, or this agent in general, you will need to npm run build and npm install again to see your updates. So if you're making changes and you type agi and wonder, why is it not showing my changes? Well, you gotta build it again, so make sure you build it again.
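Collected in one place, that rebuild cycle looks like this (assuming the package's bin is named agi, as in the demo, and that a plain npm install -g from the project root reinstalls it):

```bash
npm run build      # recompile after any change to run.ts or the agent
npm install -g .   # reinstall the CLI globally so `agi` picks up the build
agi                # run the updated agent
```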
[00:14:11]
But yeah, there's nothing special about this; this is just Node. Whatever editor tools or browser tools you use to do debugging inside of a Node script, it's the same thing here. There's no difference. But no, I don't do that, so I can't show you how to do it, because I don't have it set up. I haven't done that in like 15 years.