Check out a free preview of the full AI Agent: From Prototype to Production course
The "Limitations of Structured Outputs" Lesson is part of the full, AI Agent: From Prototype to Production course featured in this preview video. Here's what you'd learn in this lesson:
Scott shares some limitations with structured outputs. One limitation is constraints on the schema, like the maximum number of objects or nesting levels. Another is string limitations, such as property length, size of the description, or JSON needing to be fully loaded before it can be parsed.
Transcript from the "Limitations of Structured Outputs" Lesson
[00:00:00]
>> Scott Moss: The thing about structured outputs, though, is that it's hard to, like, you lose some things. Like, for instance, we haven't done it here, but we do streaming, A lot of people do streaming, where they stream tokens by tokens. Which, you went to Chat GPT and you ask it a question, you see it just starts typing in.
[00:00:14]
If you really watch, it's not typing every letter, it's like, every token is kind of coming in. And if it is every letter, they're doing that on the front end. But they stream every single token cuz that's how the LLM spits stuff out, it spits it out by token.
[00:00:25]
You can't really do that with structured output, because structured output is JSON, so you can't render partial JSON. You can't call JSON dot parse on partial JSON, it'll just break. JSON isn't complete until you get the last bracket at the end, so you can't even parse it. So you lose that ability.
[00:00:43]
I mean, you could technically still stream it, you just can't do anything with it, or anything useful, I would say. So you lose out on some of that. But I think there's a lot of upside to it so, highly recommend checking it out. You can also do structured outputs with tool calling, which is what we've been doing.
[00:00:59]
So I said, we've already been doing it, that's where we've been doing it, is in the tool calling, which is, you can use that for structured outputs too. Before structured outputs, the way I got deterministic results from an AI was using the tool calling to get the output that I wanted.
[00:01:13]
So instead of, cuz you couldn't tell the AI, there was JSON mode, but JSON mode didn't guarantee a structure. You couldn't give it a schema. It would only guarantee that it would give you JSON, but not in any specific format. But the tool calling, even before structured output, would always try to use those arguments that you provided.
[00:01:31]
So, if I wanted some structured output, I would actually just make a tool that actually had the prompt in it that would describe what I wanted this LLM to do. And then you can pass in a property to the LLM that says, you have to use this tool no matter what.
[00:01:45]
You must use this tool. So, right now this tool called Auto, you could just give it a name of a tool, I forgot the exact syntax. I think it's like, function name, you can give it a name of a tool, and it has to call that tool, and then that way I'm forcing it to call my tool, which forces it to get the arguments.
[00:02:06]
And then all I really want is those arguments. That's the answer that I wanted. And then I won't actually respond back to the LLM, I'll just get those arguments and I'm done, right? So that was the hack that I did before structured outputs. Now you don't have to do that.
[00:02:17]
So, pretty cool. Any questions on structure outputs?
>> Speaker 2: They use the same logic to use with structured outputs to guarantee the shape of it, like with function calling so you can work reliably.
>> Scott Moss: Yeah, it's fine tuning. I actually attempted to do this, I tried to fine-tune, what was it?
[00:02:41]
Was it 3.5, I think it was 3.5, I tried to fine-tune 3.5 to do structure outputs. I got pretty close. It was pretty good. It worked like 75% of the time from zero. So yeah, it was pretty good. But, yeah, they fine-tune it. They have a blog post when they talk about how they did it and stuff like that.
[00:02:58]
But yeah, it's fine-tuned, I think they also have like an additional step with the structured outputs to guarantee that it always does that. And I think the joke in the community is that they just do what Lang Chang does is they just check it, [LAUGH] again and so they get it.
[00:03:12]
Or what you would also probably do is, you'd have a bigger LLM check the output of it to be like, hey, does this output match this schema? And if it doesn't, you do it, right? [LAUGH] So there's like, a big brother intervention type thing. So, yeah, I don't know, I don't know what they do, but I'm guessing some combination of some of those things.
[00:03:34]
>> Speaker 2: Sure.
>> Scott Moss: Those people are much, much smarter than me.
Learn Straight from the Experts Who Shape the Modern Web
- In-depth Courses
- Industry Leading Experts
- Learning Paths
- Live Interactive Workshops