Lesson Description

The "Read & Write" lesson is part of the full AI Agents Fundamentals, v2 course featured in this preview video. Here's what you'd learn in this lesson:

Scott demonstrates creating file system tools, covering organization, read and write tool implementation, input schemas, and execution steps. He emphasizes error handling, detailed messages, and guiding the AI for accurate task completion.


Transcript from the "Read & Write" Lesson

[00:00:00]
>> Scott Moss: All right, so let's make the file system tool then. It's not going to be too crazy. Like I said, it's mostly what you would think it is, but with just some of these different edge cases. So let me see the status. Yeah, I'm going to check out to lesson 6. Let me just delete that thing. Cool. OK. All right, so all we want to do is go to our tools folder inside of agent. If you have your datetime tool in here still, that's fine.

[00:00:52]
It's not going to hurt. In fact, you should keep it. You literally always need that tool. And then we just want to make the file tool, so we'll say file.ts. And instead of just making one file for every single one of these file tools, we'll just put them all in this file, call it like a tool set, so it's a collection of tools. They're not that big. Organization around this stuff is up to you.

[00:01:19]
If you use a framework, that'll have different opinions and stuff like that. There's no wrong answer. These files and tools are really small, so it's not that big of a deal. So implement our tool helper from AI like we always do. We want to import Zod. As you know, we need this for our input schemas so the agent knows what to pass in. We're going to import fs. If you haven't used Node before, this is short for file system.

[00:01:55]
And, oh, actually we want fs promises. Yep. I put node: in front of it. You don't have to do this. This is just a 100% guaranteed, for-sure way of importing the fs/promises module from Node and not some coincidentally named module called fs that we either npm installed or somehow have locally on our machine and are pointing to. This is just a direct way of saying: get the one from Node, not from npm or something that I have.

[00:02:26]
You don't need it because you definitely don't have something called fs installed. If somebody made an npm package called fs, they're being malicious. They're being malicious, 100%. Don't install it. Same thing with path. That's built in, so I'm going to do that as well. Cool. First thing is read file. So we'll say read file. Oops, that's going to be a tool like that. Let's give it a description.
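Before going further, here's roughly what the top of file.ts looks like so far, as a sketch. (I've gone ahead and used the nodePath rename that comes up a little later in the lesson, so the import doesn't collide with the path input on the tools.)

```ts
import { tool } from "ai";          // the tool helper from the AI SDK
import { z } from "zod";            // input schemas so the agent knows what to pass in
import fs from "node:fs/promises";  // node: prefix = Node's built-in module, not an npm package
import nodePath from "node:path";   // renamed so it won't be confused with the `path` input later
```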

[00:03:00]
There's no wrong answer here. Describe it how you want. Like I said, I think there are two ways to think about it. What does this tool do? As in, given blank, returns blank. And then, in addition, maybe when would you use it? So those are the only two things you might want to think about when you're writing these descriptions. So I might say, read the contents of a file at the given path, and I might even be more explicit: read the full contents of a file at the given path.

[00:03:49]
Use this always. Use this to read a file. Right. Just in case there's another tool somewhere that might read a file, like the shell command. And this is a good example of the MCP thing. If you added a bunch of MCP servers that have very similar descriptions, how does the agent know which one to pick? You can't really control the descriptions of an MCP server unless you made the server.

[00:04:09]
So which one does it pick? I've run into this a lot where, like, Claude Code has web fetch built in, where it can fetch things on the internet, but then I'll add another thing like Firecrawl, which is a very popular service that you can add to an AI agent. It's like Google for AI, essentially. Sometimes Claude Code would use Firecrawl, sometimes it'll use web search, right? And if I'm a developer from Firecrawl, I'm definitely going into our MCP server tool description.

[00:04:41]
I'm like, always pick this one. If there's another search tool, don't use that one, use ours instead. Prioritize this because you want someone's agent to use your thing so that they'll pay you, right? So that's why I was like, I don't know, MCP is great. It might not be the future because that's scary. And then, of course, we need to add an input schema using Zod. It's always going to be an object, even if it's empty, you put an object.

[00:05:09]
But in this case, we want the path, which is going to be a string, so we can say z.string here, and then we can put a describe, and then just describe what this is supposed to be: this is the path to the file to read. Right? And then you could be expressive. It could be just "the path", or it could specify relative or absolute, you know, whatever you want to put. This is where you would have to eval, right?

[00:05:46]
So we got that. And the execute is about as basic and simple as you would think. It's mostly just, again, handling the error. fs does the heavy lifting. We have our path here. I'm actually going to call this something else, so I'm going to call it nodePath up here so we don't get confused like that. And it's pretty simple, right? So we just want to do a try catch. We'll handle the error gracefully with the LLM later, and then from here we just get the content.

[00:06:23]
We want to await fs.readFile. It's a promise because it's fs/promises. Passing in the path. Do some encoding. You can make this an option too, right? If you wanted to, you could say, hey, here's an encoding option, you've got to pick the encoding, and you can actually give it an enum, right? You can say it's got to be one of these, right? And you can pick that if you want to support that.

[00:06:49]
So it supports enums, and so do input schemas. And then just return the content. That's it. Everything else is just for, well, I mean, obviously there are other things like big file sizes, but outside of those edge cases, we just want to get the error, and then depending on what the error is, we want to show a certain message. You can do what you want. I'm just going to generalize it here and be like, you know, well, actually I think it is quite helpful, so let's just do this.

[00:07:32]
I'll say this. There was an error reading the file. Here is the native error from Node.js. And then I'll just put in, if you just stringify an error, it just gives you the message, so I'm just going to do that. That'll just give it the message. Or you can just put what I had in the notes. It's up to you. There's no wrong thing. What I have in the notes is very specific because I think a very common error that you or an AI using this tool would have is trying to read a file that doesn't exist.

[00:08:02]
So to help it better understand that, I just checked for that error and like, hey, if I see this, you know, ENOENT error, that means the file doesn't exist, so I'm just going to go ahead and tell the LLM that the file was not found. But what I just put here live is enough for it to figure out as well, but that's just more context and that's kind of how you have to think about it is like how do I make it easy for this LLM to really understand what's going on, right?
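Putting those pieces together, here's a sketch of the read tool, using the imports from the snippet above. The exact description, the optional encoding enum, and the error wording are all up to you; the ENOENT check mirrors the version from the course notes.

```ts
export const readFile = tool({
  description:
    "Read the full contents of a file at the given path. Use this to read a file.",
  inputSchema: z.object({
    path: z.string().describe("The relative or absolute path to the file to read."),
    // Optional idea from above: let the model pick an encoding via an enum, e.g.
    // encoding: z.enum(["utf-8", "ascii", "base64"]).optional(),
  }),
  execute: async ({ path }) => {
    try {
      const content = await fs.readFile(path, "utf-8");
      return content;
    } catch (err) {
      // The most common failure is a path that doesn't exist (ENOENT), so call
      // that out explicitly; otherwise hand the native error message to the LLM.
      if ((err as NodeJS.ErrnoException).code === "ENOENT") {
        return `File not found at path: ${path}`;
      }
      return `There was an error reading the file. Here is the native Node.js error: ${
        err instanceof Error ? err.message : String(err)
      }`;
    }
  },
});
```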

[00:08:31]
I can switch on these error codes, I can get better detailed error messages, right? I can even hint at you should try this instead, right? Like whatever you need to do, this is where evaling and coming back to figure out like what's going on will help you. So you might write an eval for this with a mock tool that, you know, just will always return an error message and then you want to eval and see what does the agent do after that.

[00:09:05]
You expect it to either, A, respond with no tool call and instead ask the user for a different file path, or, B, do a list files, because you said this file doesn't exist, so the next tool it runs should be list files to see what files are available. So you eval that, right? So it can get pretty granular on how you want to do that. Cool. Write file. Pretty similar.
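Before moving on to the write tool, here's a rough sketch of that eval idea. The runAgent helper and the listFiles tool are hypothetical stand-ins for the agent loop and the list tool built elsewhere in the course; the point is just to mock the read tool so it always fails and then assert on what the agent does next.

```ts
// Hypothetical sketch: mock read_file so it always reports a missing file,
// run the agent once, and check what it does next.
const mockReadFile = tool({
  description: "Read the full contents of a file at the given path.",
  inputSchema: z.object({ path: z.string().describe("The path to the file to read.") }),
  execute: async ({ path }) => `File not found at path: ${path}`,
});

// runAgent is a hypothetical helper wrapping the agent loop from earlier lessons;
// assume it returns the ordered tool calls the agent made.
const calls = await runAgent("Summarize notes.txt for me", {
  read_file: mockReadFile,
  list_files: listFiles, // the list tool built elsewhere in this lesson set
});

// After the failed read, we expect either no further tool call (the agent asks
// the user for a different path) or a list_files call to see what's available.
const next = calls[1]?.toolName;
console.assert(next === undefined || next === "list_files", `unexpected next tool: ${next}`);
```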

[00:09:53]
Give it a description. Write to a file at the given path. Creates the file if it does not exist and will overwrite it if it does, right? So if you were not going to do that, if you were only going to write to a file that does exist, obviously you wouldn't say that, because then it would confuse the LLM. It's like, well, you told me that it would create a file. Then you're going to confuse it, then it's going to start panicking, and depending on the LLM it might try a couple times, it might get stuck in a loop, or eventually be like, I can't help you.

[00:10:27]
Like, my bad, my engineers are trolling me. They gave me a function that says it does one thing, but it doesn't. So, like, I'm done, sorry. You know, it will just give up, depending on the model. So, input schema. Object, we have our path, it's a string. You know, the path to the file to write to. And then we have the content. The content to write to the file. Pretty simple. Execute, which is going to take in that object with the path and content.

[00:11:38]
And same thing, we want to be able to handle these errors, so we'll try catch. And inside of here, get the directory: we'll say path, or sorry, nodePath.dirname of the file path. So what we want is the folder this file lives in. And basically we're trying to create the directory if it's not there, right? So await fs.mkdir with this directory, recursive true. Recursive true means if this path is in some directory, like basically if something in this path includes other directories that don't exist, make those too.

[00:12:23]
So if you give me a path like /tools/config/thing/other, and none of those folders existed, it's going to make those too. It's going to make the folders that prefix the final one you're putting the file in. It's recursively making those directories versus having to do each one by one, which would be so annoying, right? Await fs.writeFile to the path with the content and UTF-8 encoding.

[00:13:20]
And then just return a message: successfully wrote, and, you know, putting the number here is super helpful because it tells the LLM I wrote all your content, it's not partial, so content.length characters to path. Right, you don't have to do it. You could just return success, right, you could just return done or did it. Right? You might also say, you should verify by listing files, right?
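Here's the write tool put together as a sketch, again reusing the imports from earlier. The commented line at the end is where you'd tack on the "verify by listing files" nudge discussed next.

```ts
export const writeFile = tool({
  description:
    "Write to a file at the given path. Creates the file if it does not exist and overwrites it if it does.",
  inputSchema: z.object({
    path: z.string().describe("The path to the file to write to."),
    content: z.string().describe("The content to write to the file."),
  }),
  execute: async ({ path, content }) => {
    try {
      // Make sure the directory (and any missing parent directories) exists first.
      const dir = nodePath.dirname(path);
      await fs.mkdir(dir, { recursive: true });
      await fs.writeFile(path, content, "utf-8");
      // The character count signals the full content was written, not a partial write.
      // Optionally append a nudge like "You should verify by listing files."
      return `Successfully wrote ${content.length} characters to ${path}`;
    } catch (err) {
      return `I was not able to write to the file at ${path}. Here is the Node.js error: ${
        err instanceof Error ? err.message : String(err)
      }`;
    }
  },
});
```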

[00:13:50]
And this would hint that, if you have another tool in your toolbox that helps you list files, you should call that one after this. And then I could eval this and be like, every time this thing calls write file, the next thing it does better be list files, because I'm suggesting to do it here, right? And this is how you kind of super-suggest to an LLM what to do next.

[00:14:15]
You're like guiding it, like it's not, it's like a semi-workflow. It doesn't have to listen to you, but you're strongly suggesting that it might do this next. It's creating a little bit of determinism on what might happen. I might also argue that if you wanted to run that tool after this tool, you should just make a tool that does both. You should, you yourself should verify for the LLM by doing a list file and then return that, hey, I successfully wrote the characters to the path and I verified it by listing the file.

[00:14:43]
And then that way it doesn't need to do it, right? So there's so many approaches to this. Your tools don't have to be atomic. We'll get to that later, but right now, they are very much atomic, they do one thing, but as you build your agent out, you'll start to realize your tools will be combinations of many different things because the more atomic your tools are, the more steps your agent has to take and the more steps your agent has to take, the more error prone it'll be, right?
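A rough sketch of that less-atomic variation: write the file, then verify it for the LLM by listing the directory, so the agent doesn't need a second step. The writeFileAndVerify name is just illustrative, and it reuses the same imports as before.

```ts
export const writeFileAndVerify = tool({
  description:
    "Write content to a file at the given path, then verify the write by listing the files in that directory.",
  inputSchema: z.object({
    path: z.string().describe("The path to the file to write to."),
    content: z.string().describe("The content to write to the file."),
  }),
  execute: async ({ path, content }) => {
    try {
      const dir = nodePath.dirname(path);
      await fs.mkdir(dir, { recursive: true });
      await fs.writeFile(path, content, "utf-8");
      // Verify on the LLM's behalf instead of suggesting a follow-up list_files call.
      const files = await fs.readdir(dir);
      return `Successfully wrote ${content.length} characters to ${path}. Verified by listing ${dir}: ${files.join(", ")}`;
    } catch (err) {
      return `I was not able to write and verify the file at ${path}. Node.js error: ${
        err instanceof Error ? err.message : String(err)
      }`;
    }
  },
});
```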

[00:15:18]
If you do the math, if, let's say you get the best model in the world that's 99% accurate at, you know, solving a task. That sounds pretty damn good. OK. That 99% accuracy over a million tasks is like less than 50% accuracy now. Right? It's good when you only do it once, but the more you do it, the less that 1% difference, is that quadratic? I don't even know what rate of change that is, but it will be less than 50%, you know, at scale.
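A quick back-of-the-envelope check on that compounding: per-step success multiplies, so a 99%-per-step agent drops below 50% overall after about 69 consecutive steps, and it's effectively zero long before a million.

```ts
// Compounding per-step accuracy: p^n. At 99% per step, overall success
// drops below 50% after roughly 69 consecutive steps.
const perStep = 0.99;
for (const steps of [1, 10, 69, 100, 1_000]) {
  console.log(steps, (Math.pow(perStep, steps) * 100).toFixed(2) + "%");
}
// 1    -> 99.00%
// 10   -> 90.44%
// 69   -> 49.98%
// 100  -> 36.60%
// 1000 -> 0.00% (about 0.004%)
```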

[00:16:11]
So, you want to try to do as few tasks as possible. Handle the errors, so let's get the error here. And we can do something specific, but I'm just going to return: I was not able to write to the file at that path. Here is the Node.js error. Right, I mean if you were giving this thing full autonomy, you're like, I gave it a web search tool, so it should be able to look up the Node docs and look up the error messages itself, so I'm just going to be like, you should go look up the docs on what this error means. Like, there is literally nothing you can't do.

[00:16:51]
Like, the way you have to program these things is way different than what we're used to. The moment you start thinking about building an agent as similar to creating a step-by-step set of instructions for a new hire or something like that, the more it'll start clicking, like, how you have to work with these folks. So you're obviously not talking to a person, but it's more closely related to talking to something that can reason than to, like, the I/O of some functions, right?

[00:17:22]
It's not the same, so it's a little different. And these things matter. These things carry weight. Every token carries weight, so you have to be very specific in how you spend those tokens. That's why prompt engineering is the first thing that you would implement or try when you're trying to improve your evals. It's like, all right, let's just adjust these prompts. Let's play around with these prompts.
