Build an AI Agent from Scratch

Creating a Tool Runner

Scott Moss

Superfilter AI

The "Creating a Tool Runner" Lesson is part of the full, Build an AI Agent from Scratch course featured in this preview video. Here's what you'd learn in this lesson:

Scott demonstrates how to create a tool runner function that matches the tool call with the corresponding function and returns the result. He also shows how to save the tool response in the chat history and handle looping through tool calls.

Transcript from the "Creating a Tool Runner" Lesson

[00:00:00]
>> Scott Moss: Let's do it. So now we're going to create a system that can handle this: it's gonna take the tool that the LLM told us to run, try to run that tool, get the answer for it, feed it back to the LLM, and then hopefully get a final answer. We'll see what happens.

[00:00:22]
So the first thing we're gonna do is create a tool runner. So if we go into source, I'm gonna create a new file called toolRunner.ts. I'm sorry, we already have it, it's just empty, there we go. This is where we're actually gonna create the getWeather function.

[00:00:40]
We created the definition for the tool, but we didn't actually create the function that gets anything yet, so let's do that. So I'm gonna import the OpenAI type from openai, and I'm gonna make a function, const getWeather. It doesn't do anything right now, I'm just gonna hard-code a value.

[00:01:02]
I looked, and there were no good open weather APIs that didn't require auth, and I was like, screw it, we're just gonna fake it. Cuz trust me, I tried, there just were not any good ones that were meaningful. So we're just gonna say it's hot, 90 degrees. Put whatever you want, put whatever weather you want.
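
As a rough sketch, that faked tool in toolRunner.ts might look like this; the return string is just a placeholder, and nothing here calls a real API:

```ts
// toolRunner.ts — the faked weather tool
// No real weather API; just return a hard-coded answer for now.
const getWeather = async () => 'hot, 90 degrees'
```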

[00:01:24]
So the way runTool works is it's gonna take the tool call message, and then it's gonna take the original user message. And then it's essentially just going to switch over the name of that function and match it with a function that we have. That's really it, and if nothing matches, it'll throw an error.

[00:01:40]
If for some reason the AI decided to call a tool or a function that we didn't tell it about, it's like, where are you getting this from? I didn't say that. So that's where that comes from. Or maybe you don't wanna throw this error, and instead you have a function here that, when called, will return a message back to the AI: stop calling this function, it doesn't exist.

[00:01:58]
So then, that way you don't break the AI, you tell it to stop calling this function. And that will live in the chat history going forward, so it probably won't call that function anymore. So it's a different way of thinking, different way of programming, so let's do that.

[00:02:16]
>> Scott Moss: Gonna run the tool, it's gonna be async, these tools are gonna be async eventually. And it's gonna take the toolCall, which is that type, and then the userMessage, which is just gonna be a string, like that. Then I'm just gonna make the input I'm gonna pass to every single function.

[00:02:42]
It's gonna have the original userMessage, and then the toolArgs. And they always come in as JSON, so you wanna parse them.
>> Scott Moss: Assuming that they're there, so I'm just gonna safely default to an empty object here, because I don't want that to break.
>> Scott Moss: So we have that.

[00:03:06]
>> Scott Moss: And then we're just gonna do our switch statement on toolCall.function.name. So switch on toolCall.function.name, and if the case is the exact same name as the tool that we created, in this case, we called it, I'm gonna change mine back to get_weather, we called it get_weather. So if the toolCall has that name, I just want to return the getWeather function with this input.
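
Here's a sketch of that dispatcher, folding in the getWeather stub from above so it takes the shared input object. The type names come from the openai v4 Node SDK; treat the exact shapes as assumptions if your version differs:

```ts
// toolRunner.ts — a sketch of the runTool dispatcher
import type OpenAI from 'openai'

// Hard-coded for now; swap in a real API call later.
const getWeather = async (_input: { userMessage: string; toolArgs: unknown }) =>
  'hot, 90 degrees'

export const runTool = async (
  toolCall: OpenAI.Chat.Completions.ChatCompletionMessageToolCall,
  userMessage: string
) => {
  // Every tool gets the original user message plus the parsed arguments.
  // Arguments arrive as a JSON string, so parse them (defaulting to "{}").
  const input = {
    userMessage,
    toolArgs: JSON.parse(toolCall.function.arguments || '{}'),
  }

  switch (toolCall.function.name) {
    case 'get_weather':
      return getWeather(input)

    default:
      // The model asked for a tool we never defined.
      throw new Error(`Unknown tool: ${toolCall.function.name}`)
  }
}
```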

[00:03:42]
>> Scott Moss: That's it. And this goes back to the question we had earlier: how do I make sure it only runs things that I want and has the right arguments? You can pass them in here. So when you run a tool, I'm passing in a toolCall, I'm passing in the userMessage, and there's nothing stopping me from also passing in, like, here's the current signed-in user's ID.

[00:04:03]
I can pass that in inside of our agent, and then you can pass that down into here, and now you have the userId. So whatever function you need to run here that needs access to the user ID, you can run it here, you could literally pass whatever you want.
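
If you do want to thread something like the signed-in user's ID through, one hypothetical variation of the signature looks like this; the userId parameter is not part of the lesson's code, just an illustration, and the switch body is elided because it's the same as the sketch above:

```ts
import type OpenAI from 'openai'

// Hypothetical: widen runTool so every tool also gets extra context.
// Only the signature and the input object change.
export const runTool = async (
  toolCall: OpenAI.Chat.Completions.ChatCompletionMessageToolCall,
  userMessage: string,
  userId: string // or a session object, feature flags, whatever your tools need
) => {
  const input = {
    userMessage,
    userId,
    toolArgs: JSON.parse(toolCall.function.arguments || '{}'),
  }
  // ...then the same switch over toolCall.function.name as in the sketch above
}
```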

[00:04:15]
So this is what I do in production right now, works fine, works great.
>> Scott Moss: Okay, so we got that. The next thing we want to do, I think we already kinda did it, yeah, we already did this. So having the tools here, passing them in, everything works great.

[00:04:38]
I think we already did that, so we're fine, we need to just go adjust our agent. So right now, we have an if statement where we're just logging it. Now what we need to do is we need to call the actual function, run the function, get the result, save it in the database, we're gonna save it right now.

[00:05:00]
And we'll make a special function that saves the toolResponse for us right now. And then we'll just say, cool, we called it, we'll get the result. And then we'll stop, because after this, we need to loop, and that's the next step. That's a little more complicated to think about.

[00:05:17]
So let's go do that, we'll go into our agent.
>> Scott Moss: Here is our loop. So instead of logging it, what we can do, here's something that's really cool, at least with OpenAI, is that it's either gonna be content or it's gonna be tool_calls, it won't be both.

[00:05:38]
If you get a content back, that means the AI is ready to answer, it has a final answer, it's done. The loop is done, I have a final answer. If you get back tool_calls, it's still thinking. It's like, run this for me, right? So that's what I check.

[00:05:55]
There is another property that I think OpenAI recommends you check, the finish_reason, which is like, why did it stop? I stopped because I called a function, or so on. I don't look at that, I just look at content or tool_calls, it's one of those two, so that's just what I do.

[00:06:16]
You can also look at refusal. So we can say, if there's tool_calls, if it is there, it's always gonna be an array, even if you have parallel tool calls turned off. So what we wanna do then, in this case, is, if this was a parallel tool call, we would loop over this array and do all of this for each one of them in a Promise.all.

[00:06:36]
But we don't have parallel, so I'm just gonna do this. So I'll just say, toolCall,
>> Scott Moss: Is that, and then I want to get the response.
>> Scott Moss: And I'm gonna call my runTool function like that. It takes in the toolCall and it takes in the userMessage, like that.
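
As a quick sketch of that check: the helper name handleResponse and this exact shape are just for illustration, not the course's structure, and the type names are from the openai v4 Node SDK:

```ts
import type OpenAI from 'openai'
import { runTool } from './toolRunner'

// Either content (final answer) or tool_calls (keep going) — not both.
const handleResponse = async (
  response: OpenAI.Chat.Completions.ChatCompletionMessage,
  userMessage: string
) => {
  if (response.content) {
    // The model is done: it has its final answer.
    return response.content
  }

  if (response.tool_calls) {
    // Always an array, even with parallel tool calls turned off.
    return runTool(response.tool_calls[0], userMessage)
  }
}
```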

[00:07:10]
>> Scott Moss: Actually, let me update our loader, our little spinner, so it knows what it's doing. So before I execute it, I'll say, executing this thing, so we can see that status. We'll call it, and then what we want to do there is save that in our database. So I made a special method called saveToolResponse that literally just calls addMessages, but with the role of tool, and makes sure it puts the right ID in the right place.

[00:07:37]
So that way you don't have to do it every time, it's kind of annoying. So let's just swing over to memory and do that right quick. And we'll say export const saveToolResponse.
>> Scott Moss: And this is going to take in the toolResponse, which is gonna be a string.

[00:08:04]
And then the tool call, what did I call it here? Yeah, toolCallId and toolResponse. I'll do the same order so it's not confusing. So toolCallId, it's a string, and then toolResponse, a string. Whoa, that's not what I want you to do. I'm just going to return addMessages like that.

[00:08:33]
So addMessages is gonna get an object. The role has to be tool, role: tool. It's your answer to a tool call. Remember that, role tool is your answer to the tool call. The AI says, hey, run this function; you run the function, you get the answer, and you save another message with role tool.

[00:08:53]
With the result of that and the ID of the tool call that the agent told you to make, and you send it back. That's it, that's all this is for. I think it used to be function at some point, it's tool now.
>> Scott Moss: Okay, so now that we have that, wow, that thing just completed.
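
A sketch of that helper in memory.ts, assuming the addMessages function from the earlier lessons is defined in the same file:

```ts
// memory.ts — saveToolResponse just wraps addMessages (defined earlier in this file)
export const saveToolResponse = (toolCallId: string, toolResponse: string) =>
  addMessages([
    {
      role: 'tool',             // role "tool" is your answer to a tool call
      content: toolResponse,    // the result you got from running the tool
      tool_call_id: toolCallId, // must match the id on the assistant's tool call
    },
  ])
```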

[00:09:13]
I just wanted to go to the next slide. It autocompleted so fast. So saveToolResponse, like that. We're gonna pass in the toolCallId, which will be toolCall.id. And then the toolResponse is just going to be the toolResponse. Wait, did I pass an entire response here?
>> Scott Moss: No, we just want to get the,
>> Speaker 2: Come back.

[00:09:47]
>> Scott Moss: So we just want to put the results of that here, so the response would be, whatever this returns. Yeah, that's right, okay. So taking the original ID of the tool call that we found from the output of the agent and then passing in the result we got back from running that tool.

[00:10:06]
Now, I can update my loader and, cool, I'll say done, or I think there's like a stop method that I have, or update it. Sure, okay, so done with that, and then for now, we'll just stop and we'll just return. We won't do anything else. But now let's see here.

[00:10:27]
We're adding this user message.
>> Scott Moss: We still wanna add this message, well, now we have to move this up. So if we add this message after this, it's gonna be added after we added the response to it. So this has to be added before. So we got to move this up to like here now.

[00:10:46]
Because this is the actual tool call. This is the agent saying, hey, I just responded back to you. So if this is a toolCall, then now we get to return or save back the response to that tool. If it was the other way around, it would put a response before the tool call was actually saved.

[00:11:01]
So next time we run this, it'll see that in the history and be like, what the hell? You just tried to respond to a tool that I didn't even tell you to do yet. Why did you do that? So order matters.
>> Scott Moss: So we'll do that and yeah, I think I had that here.
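
Put together, the branch in the agent ends up looking roughly like the sketch below. The helper names and file paths (runLLM, addMessages, getMessages, saveToolResponse) are assumed from earlier lessons and the spinner is left as a comment; the important part is the order: the assistant message with the tool call is saved first, and the tool response goes in immediately after.

```ts
// agent.ts — a sketch of one pass through the agent (no looping yet)
import { runTool } from './toolRunner'
import { addMessages, getMessages, saveToolResponse } from './memory'
import { runLLM } from './llm' // assumed wrapper around openai.chat.completions.create

export const runAgent = async ({
  userMessage,
  tools,
}: {
  userMessage: string
  tools: any[]
}) => {
  await addMessages([{ role: 'user', content: userMessage }])

  const response = await runLLM({ messages: await getMessages(), tools })

  // 1. Save the assistant message first — it carries the tool_calls array.
  await addMessages([response])

  if (response.tool_calls) {
    const toolCall = response.tool_calls[0]
    // (update your loader/spinner here, e.g. "executing get_weather")

    // 2. ...then run the tool and save its result as the very next message.
    const toolResponse = await runTool(toolCall, userMessage)
    await saveToolResponse(toolCall.id, toolResponse)
  }

  // Looping back to the LLM with the updated history is the next lesson;
  // for now we just stop here.
  return getMessages()
}
```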

[00:11:16]
Yeah, I have it here. So, all right, I guess I'll add back that log message. Let's just add that back. It's like I'm logging somewhere else, I don't know where it is, but it's really annoying. I don't know where it is.

[00:11:30]
Is it in here? It's this thing. Got it, okay, cool. Okay, so now we have that. We're passing it a tool. We call it the weather tool, got our descriptions. Got some reasoning here, it's looking really good. Let's run it and see where we broke. Let me clear my history.

[00:11:59]
>> Scott Moss: And let's try to see what happens here. So, wow, look at that. I said, how is the weather? It said, can you specify the location for which you would like to know the weather? Interesting, why did it do that?
>> Speaker 2: Cuz it knows it needs a parameter for the weather.

[00:12:31]
>> Scott Moss: Does it? Let's see, get_weather. Let's see, get_weather, I don't have anything here for that. Let's see, runAgent, tools, LLM tool calls, I don't know, I guess it just got ahead of itself. It was like, give me a location. Let's see, I don't know, let's play along.

[00:12:59]
Let's just say, just smart enough to know, right? There it is. It caught it, yeah, I guess it was just like, yeah, I don't know. And this is what I'm talking about. So this is where I would then go in and change the description to be like, get_weather

[00:13:16]
does not need a city or location, which, to the AI, probably doesn't make sense. It's like, what am I getting the weather of, if you don't need a city or location? But don't worry about it, that's what I told you to do.
>> Speaker 2: I mean, if the context is like, the system already has their lat long or something like,

[00:13:35]
>> Scott Moss: Yep, now you're thinking. So, like, in the system prompt, I could have been like, here's the current user's location, right? And now it just assumes that, which is exactly what it would do. But, yeah, this is why you need evals, so you can test stuff like that.

[00:13:50]
But okay, cool. So we noticed that it ran it and then it stops, which is exactly what we wanted. Okay, where am I at? There we go. And if we go back, we should end off on a tool response, there we go. So obviously it responded back with the hard-coded value that I had, it's hot, it's 90 degrees.

[00:14:07]
The toolCallId matches this tool call's ID. So, in my system we do something called human in the loop. And I know I said I'll come back and talk about that. But that's, like, when the function that needs to be called is not a function that's on your system, it's a response that a user must give.

[00:14:28]
They must say, I approve or I reject. So in that case, or actually beyond that, we have to get an approval sometimes. So, like, if you want to send an email, if you want to book a calendar event, we won't just run that tool for you.

[00:14:41]
That tool is guarded by another tool called an approval tool. And the approval tool shows you what it's trying to do, and you have to say yes or no. But the AI doesn't call the approval tool. We tell the AI that it called the approval tool.

[00:14:56]
So we fake a tool call. So like I was telling you, the AI doesn't know who put these messages in this array. It doesn't know that it responded with this. It's just there. So when it sees role assistant, it thinks that's what it said, it doesn't know whether it did it or not.

[00:15:13]
So what we do is, when I see that you're trying to call a book calendar tool, I don't write that to the database. Instead, I put an approval tool call in front of it, and I have to make up my own ID, I have to do all this stuff. I have to put in the arguments that I think you're trying to use, and then I show that to the user.
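
As an illustration of what faking that tool call could look like: the approval tool name, the id format, and this exact message shape are made up for the example, not taken from the course code:

```ts
import { randomUUID } from 'node:crypto'

// Hypothetical: instead of saving the model's real book-calendar call, save a
// fabricated "approval" tool call so the user has to confirm it first.
const fakeApprovalToolCall = (argsFromModel: Record<string, unknown>) => ({
  role: 'assistant' as const,
  content: null,
  tool_calls: [
    {
      id: `call_${randomUUID()}`, // we have to invent our own id
      type: 'function' as const,
      function: {
        name: 'approval',                         // made-up guard tool
        arguments: JSON.stringify(argsFromModel), // what we think it's trying to do
      },
    },
  ],
})

// Only if the user approves do we then write back the original tool call.
```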

[00:15:31]
And then only if that's approved do I write back the original tool call that it was trying to make. So it could get pretty complicated, but the flexibility of being able to put whatever the hell you want in this message is kind of cool. So remember that you're not limited by, like, man,

[00:15:49]
I wish we could just, I wish the assistant would just say this, you can just say it. You can say the assistant said this, the AI said this, and then it's going to think it said that. And that's always there. So, yes, question?
>> Speaker 2: How many tokens do you recommend sending it?

[00:16:04]
Is it okay to send 10K as an initial system prompt?
>> Scott Moss: It's a hard question to answer because I don't know what model you're using, I don't know your constraints on your budget. I would say, if you have 10,000 tokens for a system prompt, I'll say this: all the folks making foundation models are moving towards a world where you don't need RAG anymore.

[00:16:38]
And the way that they're trying to do that is they just keep increasing the token limit. So in their world, you should be able to send as many tokens as you want, and it doesn't matter. Now, what you have to think about is, if you put 10,000 tokens in the system prompt every single time you run the LLM, you're gonna get charged for 10,000 tokens every single time.

[00:16:57]
In addition to the messages that are also being prepended to the array. So if that's not something you wanna deal with, then don't do it. If you're okay with that and it's necessary, then fine. I don't really know what the use case would be, if you're putting a whole PDF inside it or not, I don't know, it really depends. Yes?

[00:17:15]
>> Speaker 2: If a response from a tool is saved in chat history, what would happen if a similar question was asked again? Can you ensure that the AI calls the tool again instead of getting it from the chat history?
>> Scott Moss: That's a great question. So let's try it right now.

[00:17:32]
I already asked it what the weather was, so if I ask it again what the weather was, what do you expect to happen? I guess it depends on how you ask, right? If I ask, what was the weather again?
>> Scott Moss: Let's see what happens. It did not call the tool.

[00:17:57]
It was like, I already know the weather. And it's because of how I asked, yeah. But I think if I ask this way, tell me what the weather is right now though, I think it might run the tool again, I could be wrong, let me see. Yeah, now it ran the tool again.

[00:18:21]
So it really just depends on how you ask it [LAUGH], cuz it's like a human, so you gotta account for these things. I don't know, it's really hard to think about sometimes. But, yeah, and then you can add instructions. I could go in here, and this is, well, we'll get to the system prompt.

[00:18:40]
But this is where I would put an instruction: hey, prefer calling tools every single time, because data might be stale, and I don't want you giving someone outdated information, so always re-call the tool. Or you could just be really good, and after you respond back to the agent and you save this tool response here.

[00:19:06]
I don't know, maybe give the user a grace period of, I'm gonna let them do three follow-ups or X amount of follow-ups. After that, I'm deleting this data out of the chat history. So that would force the AI to call the tool again. Or I'll delete it as soon as I save it.

[00:19:23]
So if they do a follow-up, it'll just call it again, so it's never stale. Or I'll only put it there so it can answer the question they just asked; as soon as I get back this response, I'm deleting it out of the chat history because I don't want it reusing that again.
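
A tiny sketch of that pruning idea; the helper and the stored message shape here are hypothetical, not part of the course code:

```ts
// Hypothetical: drop tool calls and their tool responses from the saved history
// so the next weather question forces a fresh tool call instead of a stale answer.
type StoredMessage = {
  role: 'system' | 'user' | 'assistant' | 'tool'
  tool_calls?: unknown[]
  [key: string]: unknown
}

const removeStaleToolMessages = (history: StoredMessage[]) =>
  history.filter(
    (message) =>
      message.role !== 'tool' &&
      !(message.role === 'assistant' && message.tool_calls?.length)
  )
```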

[00:19:44]
So when you loop, when you loop and you come back, before you return back to the user, you could be like, now delete the tool call before this, so the AI has no history of ever calling it. But then you gotta think about the user experience.

[00:20:01]
How does that reflect in the chat app? What does that look like? So, yeah, it's kinda tough.
>> Speaker 2: Somebody got an error: an assistant message with tool calls must be followed by tool messages responding to each tool call ID.
>> Scott Moss: Yeah, the error you got was the same error that I had earlier.

[00:20:21]
Which means you have a message like this, it's role assistant, it has a tool_calls array. The very next message that you have absolutely has to be a role tool. I imagine you probably started running this example from where you ended on the last example, where we didn't answer the tool calls, so your database is in a broken state.

[00:20:45]
Just clear all your messages out of your array and try again.
>> Scott Moss: But the reason you're getting that error is because this has to come after this. I would imagine what's happening is you're getting this, and then you're immediately trying to send this. So it's messed up.
