Build an AI-Powered Fullstack Next.js App, v3

Structured Output with Zod

Scott Moss

Scott Moss

Superfilter AI
Build an AI-Powered Fullstack Next.js App, v3

Check out a free preview of the full Build an AI-Powered Fullstack Next.js App, v3 course

The "Structured Output with Zod" Lesson is part of the full, Build an AI-Powered Fullstack Next.js App, v3 course featured in this preview video. Here's what you'd learn in this lesson:

Scott uses the Zod library to supply the LLM with a schema for the results returned from the LMM. Types and descriptions are provided for each data property. Temperature is also explained in this lesson.


Transcript from the "Structured Output with Zod" Lesson

>> But as you can see, when it does work, we are getting back those things. So now we got to find a way to keep this consistent, parse this, extract it and then add it to our database. So, going down that thought of basically giving it a schema, we're gonna use some JavaScript schema library to create the schema for us, and then create this prompt for us, so we don't have to create it.

We'll just write code, and that code will generate this prompt with the schema embedded into it. And we'll get it back every single time, and we'll just get back a JavaScript object from this versus like a string. Sound pretty cool, right, yeah, all right. So let's do that, and that's kind of what langchain is for.

So few things here, we need to install something. We need to install something called Zod, you've never heard of Zod? Best way I could describe it is, it's just a schema library. You could just make schemas in JavaScript. That's basically it, it's probably the most popular one, I'll just show you what it looks like, okay.

I hate when people don't put examples in their README, I get it. You spend a lot of time on the website, it's nice. But just put one example in the README. So, that's a good example, okay, here we go. It looks something like this, right? So you can make a schema, you can give it like, what types.

Here's a better one. So you can make a schema that represents a user. It's an object that has a username, it's type of string. We're basically gonna do that, all right? So get into our code, we'll import Z from Zod like so. And then we need to import a few other things from langchain, but let's just work on the schema for now.

So let's just say schema = z., actually, no, there's a, we're gonna use langchain right now. I forgot langchain has a plug-in for Zyde that we're gonna use so we don't have to do it manually. So we're gonna import, from langchain, this one's going to be there. Where is it, I think it's like output parsers.

Yeah, I think it's just from output parsers, and that's Zod. I'll see they changed all their imports. Let's go look at the docs, Let's see. I'll put parsers underneath prompts, I'll put parser, Here we go, it is called yeah, structured output parser and then .from Zod Schema. Yeah, that's way different than what it used to be.

So structured output parser. And then basically what we're gonna do is just make this parser that takes in a Zod object. So what we can do is we can say, make our parser here, structured output part. Structured output parser.fromZodSchema. And then we can just kinda create our .schema, so the way it's gonna work, all we have to do is, in our case, it's an object with those four properties on it.

But we have to add this describe here. This describe is what langchain is gonna use to write the prompt for us. So this is the description that you wanna tell GPT what this field is. So you're only describing this field, whereas, if you look at the prompt that I wrote, I kind of described the whole thing.

I was like, this is what the whole thing looks like. What we're gonna be doing is we're only gonna be describing each field individually. I didn't do that, I just said, give me the summary, the mood, the subject. But I didn't describe what the mood was, I did describe what the summary was.

So it probably could have did a better job at that if I gave it a better description. So that's what we're gonna do. So inside of our AI, we're gonna make a z.object. It's gonna have a mood and then mood's gonna be a z.string, and we're gonna say .describe.

So here, all we have to do is just describe, yes.
>> Kinda an opinion question, but can you experience so far, at what point, at what level of complexity or usage do you find it's worth it to switch over to using a structured output parser? Be it Zod or another one, when using langchain or an API to connect with an LLM?

Is it really just like don't leave home without it or is there a certain level of complexity at which point you really do want to just be structuring and parsing your JSON, you get back?
>> Yeah, for the most part I think the most use cases out there probably aren't doing structured outputs, they're doing natural language outputs.

So they're using open AI to get back a natural language in which they do for semantic searching, they render documentation, they do text-to-voice. So for them they don't want structured output, so I would say that's probably most of it for the people doing structured output. If you're generating anything that has any type of formatting, you're gonna need this.

There's this, well, I mean, technically we're still relying on prompt engineering. This thing's just making a prompt for us. There's been other advancements that people have created to like get structured output. The best way to do it is to do something called fine-tuning, where you can train, so if you think about an LLM, let's say it's 99% trained.

You can train the last 1% of it, to do a very specific thing, you can say, here is a prompt that I will give you. And here's the output I would expect you to give me, and I'm gonna give you 200,000 examples of that. And then now you know exactly what the format that I'm looking for when I asked for something specifically.

So you can fine tune OpenAPI as well, and OpenAI I will give you a custom model ID that you can use to make API calls to. So you could do that. There's also something called, Yeah, that's mostly it. There's another thing but it's kind of complicated. So I'm not gonna get into it.

But fine tuning, finding another model, or in this case, prompt engineering is what we're doing. But we're using the tool to help us a profit engineering versus having to figure this out yourself, because there's gonna be a time where this is going to break, and I promise you, it will.

Because actually, when I built this course the first time, I didn't use the output parts. I was like, I'm just gonna do this, it's gonna be fine, and it worked and it worked. And then when I got here, it stopped working, I was okay, this ain't it. So, cool.

So now we just need to describe what the mood is. So I'll show you what, just so what this looks like, I'll go to the code that I already wrote so you can see what I put. So when I described the mood, I basically said, the mood of the person who wrote the journal entry.

And that's what I described the mood as. I'm really trying to describe to the AI what, I think the mood should be in much detail as I can but be very brief at the same time. So in this case, the mood of the person who wrote the journal entry is what I wanted to put.

So I'm actually just gonna put that there as well. And then for summary, same thing, z.string.describe. And for that one I put, Quick summary of the entire entry. So I'm going to do that, Also I had some problems when I didn't put periods on the end of these descriptions cuz I think the way that it can catch them together, if you don't put a period there it just creates this run on sentence, which I think OpenAI can figure out on its own, but I'd rather not leave it a chance.

So just putting the period there to let her know that I'm done with that. And you can try your own descriptions, you don't have to put these. And then for, what do we have? Whether it's negative or not. This will be z.boolean.describe. And I put, is the journal entry negative?

And then I put in parentheses. Example, does it contain negative emotions? Question mark. So this is kind of like a question. And then I give it a further reason or further thought to think about, hey, think about it like this. Does it contain negative emotions? And I think I did have to experiment with this one.

I was not getting back what I thought, I was given a Boolean but it was just never right. I would put in there like, I'm not feeling good today at all. I don't wanna get out of bed, and it was like this isn't negative. I'm like, no, that's pretty negative, I don't know.

And then the last one we wanna write for now will be color. So for color, we're using a hexadecimal. So I put a hexadecimal color code that represents the mood of the entry. And then I gave it an example of what a hexadecimal will look like for a certain mood.

So, I put example this color which is like blue for representing happiness. So it knows. Some I'm gonna do that, color: z.string.describe. So this is something that's called like a, well, if we only wrote just this prompt, this would be like a one shot, which basically just means, I gave it an example of what I expect the output to be.

This helps with steering. When you give the AI examples of like, hey, for instance, I asked the AI to generate me a React component and I was like, here's an example of how I write React components. That's me giving it an example that's called like a one shot, you can do many shot.

If you don't give an example it's called a zero shot as in like, I'm not even gonna tell you how to do this, just figure it out. So typically, depending on what you're asking and what you're doing, more examples could be better. Sometimes you don't need them, but in this case, we're giving an example.

So, cuz I wasn't sure if just saying hexadecimal, it was going to give me a hexadecimal. So I was like, this is what I mean, I mean this. So I'm steering it in the direction that I want to keep it more consistent. Cool, any questions on this so far?

>> I see you set temperature to zero, or are we gonna talk about temperature later?
>> That's right, we'll talk about temperature next. I did say I was gonna get back to that, okay. Any other questions on this? Okay, so temperature I did say I was gonna talk about this, the best way I can describe [LAUGH] temperature.

So basically, whenever you run completion or inference, there's a bunch of settings you can tweak to determine how you get a certain output with most LLMs. Temperature is one of those settings. Temperature basically just describes the variance and how random the outputs could be. A temperature of one means, so think about randomness as like, okay, if you think about how predictions work.

I'm gonna give you a prompt and then to be able to predict the next word, it basically has statistics on, let's say, the next word that it can write can be one of these 10 words, right? And then it ranks those words based off of how likely each word would be next.

So maybe this word is a 90% chance this is the next word. This is a 50% chance, this is the next word, right? Temperature basically allows that window on which it can select to be bigger. So, let's say temperature is zero, it'll probably only pick the most likely things to come next.

But with a temperature of one, you're like, I don't care, pick any one of those. Even one that's not likely to come next, but it's on the radar somewhere, pick that. So what does that look like in practice? It just makes your stuff look and sound more silly or more creative is what you would see.

So a higher temperature makes it more creative, a lower temperature makes it more not creative. But it's helpful for if you want factual things. If I'm using AI to ask questions about some PDF, I gave it. I definitely want the temperature to be zero cuz I want it to be factual and I want it to be real cuz as it gets more silly and creative, it starts making up stuff.

And that's called hallucination. So the AI will hallucinate, and that's actually a billion dollar problem. There's a lot of companies out there trying to solve AI hallucination. Because it sounds plausible, the census is coherent, the grammar is correct, but if I fact check that, that is completely wrong, and even putting temperature on 0, won't fix it.

And there's a lot of reasons because of that, the training cutoff for, this GPT-3 had a trading cutoff in like 2020 or 2021. So it knows nothing after that, and even the stuff that it knows before that, it's probably still gonna be wrong. So there's a lot of ways around that but yeah that's what temperature is.

It's the silliness level, or the, I call it the fun level, this crank it up to one if you want a lot of fun.

Learn Straight from the Experts Who Shape the Modern Web

  • In-depth Courses
  • Industry Leading Experts
  • Learning Paths
  • Live Interactive Workshops
Get Unlimited Access Now