The "Wrapping Up" Lesson is part of the full, AI Agent: From Prototype to Production course featured in this preview video. Here's what you'd learn in this lesson:
Scott wraps up the course with some tips for further researching emerging AI techniques and discusses Vercel's AI SDK.
Transcript from the "Wrapping Up" Lesson
[00:00:00]
>> Scott Moss: Cool, here are some emerging research areas, although everything's emerging research. So, just some things to keep you busy. One thing that I think is interesting is re-ranking. Highly recommend looking up re-ranking. I talked a little bit about this, but this is where you get the results back from your vector database.
[00:00:14]
You still need to rank them based on how relevant they are to the original user message, because the vector database only knows about similarity, not relevance, and that's a big thing. Creating feedback loops, highly recommend incorporating that into your app so you get that free data for your golden data set.
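As a rough sketch of the re-ranking idea Scott describes (not code from the course): once the vector store returns its nearest neighbors, you can ask a model to score each chunk against the original user message and keep only the best ones. The prompt, model name, and 0-10 scale here are illustrative assumptions.

```ts
import OpenAI from "openai";

const openai = new OpenAI();

// Re-rank retrieved chunks by asking the model how relevant each one is
// to the original user message. The vector store gave us neighbors by
// similarity; this pass orders them by usefulness for answering the question.
async function rerank(userMessage: string, chunks: string[], topK = 3) {
  const scored = await Promise.all(
    chunks.map(async (chunk) => {
      const res = await openai.chat.completions.create({
        model: "gpt-4o-mini",
        messages: [
          {
            role: "system",
            content:
              "Rate how relevant the document is to the user's question on a scale of 0-10. Respond with only the number.",
          },
          {
            role: "user",
            content: `Question: ${userMessage}\n\nDocument: ${chunk}`,
          },
        ],
      });
      const score = Number(res.choices[0].message.content) || 0;
      return { chunk, score };
    })
  );

  // Highest-scoring chunks first; keep only the top K for the final prompt
  return scored.sort((a, b) => b.score - a.score).slice(0, topK);
}
```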
[00:00:30]
Evals, evals, evals, look, start with evaluation, please do your evals. And yeah, from there, it's all about identifying specific use cases, getting data and things like that. So there's a lot to learn, but again, it's just more of the same of what I just taught you today. There isn't anything new.
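A bare-bones illustration of the eval loop he keeps pushing, assuming you have a golden dataset of input/expected pairs; the dataset shape and the LLM-judge prompt are assumptions for the sketch, not from the course.

```ts
import OpenAI from "openai";

const openai = new OpenAI();

type GoldenExample = { input: string; expected: string };

// Run every golden example through the model, score each output with a
// simple LLM judge, and return the average: the number you try to push up.
async function runEval(dataset: GoldenExample[]) {
  let total = 0;

  for (const { input, expected } of dataset) {
    const res = await openai.chat.completions.create({
      model: "gpt-4o-mini",
      messages: [{ role: "user", content: input }],
    });
    const output = res.choices[0].message.content ?? "";

    const judge = await openai.chat.completions.create({
      model: "gpt-4o-mini",
      messages: [
        {
          role: "user",
          content: `Expected: ${expected}\nActual: ${output}\nDoes the actual answer match the expected answer? Reply with 1 or 0.`,
        },
      ],
    });
    total += Number(judge.choices[0].message.content) || 0;
  }

  return total / dataset.length;
}
```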
[00:00:49]
It's just a new technique to do this thing, improve RAG, make this eval number go up. Make it faster, make it cheaper. Error handling, that's really it. A better way to do human in the loop, but you understand human in the loop. So, everything from here on out is just keeping up with the new ways that folks are doing it, figuring out if that's relevant for your implementation or not, and being aware of all this stuff.
[00:01:15]
So, this right here is literally so much that there are teams of people doing all this stuff. So, again, having this skill set I think is extremely relevant. And going forward, I think a lot of this is gonna be required for a lot of people, in my opinion.
[00:01:29]
I think you're just gonna have to know this stuff, because having an LLM stack in your company is gonna be synonymous with having Postgres. And right now there are specialties, people are specialists at this. But I think eventually this is gonna be a generalist thing that everybody just knows how to do.
[00:01:47]
Just like in 2012, if you were a frontend developer, you never touched the backend. Now, I don't even know if there's such a thing as just frontend anymore. You kinda just do everything, and really, frontend-only is probably a designer that knows how to code, whereas most frontend people can definitely spin up a full stack app and set up Postgres and do stuff like that.
[00:02:05]
Whereas in 2012 there was no way your frontend engineer was ever gonna do that, they would probably never even touch Firebase. Well, Firebase was 2013, 2014, but still, it's different. So I think a lot of people will be doing a lot of this stuff. So, any other questions?
[00:02:23]
>> Student 1: What do you think about SDKs for LLMs, like Vercel's or LangChain? Do you think OpenAI is all you need to launch an AI app?
>> Scott Moss: I personally do not recommend, so I can't talk too much about Vercel's AI SDK cuz I haven't used it since they first made it, when it was a preview.
[00:02:42]
I thought it was cool back then, and it probably is good now. From what I understand from their documentation, their AI SDK is heavily geared towards using it in a Next.js environment. They actually have this really cool thing in there. Let me show you. I think this is really cool.
[00:03:00]
You can give your AI a render function the same way you would give React a render function. You can use a generator to send back and stream back a React component, which works pretty much only in Next.js because of server components. So that's really cool. So if you're using Next.js on Vercel, yeah, I would check out the AI SDK.
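A loose sketch of the generator-based UI streaming he's pointing at, based on the AI SDK's React Server Components support (`streamUI` from `ai/rsc`); option names may differ between SDK versions, and `fetchWeather` and `WeatherCard` are hypothetical stand-ins.

```tsx
import { streamUI } from "ai/rsc";
import { openai } from "@ai-sdk/openai";
import { z } from "zod";

// Inside a Next.js server action: the generator yields a loading component
// right away, then returns the final component once the data is ready,
// so the model streams UI back instead of plain text.
const result = await streamUI({
  model: openai("gpt-4o"),
  prompt: "What's the weather in San Francisco?",
  text: ({ content }) => <p>{content}</p>,
  tools: {
    showWeather: {
      description: "Show the weather for a city",
      parameters: z.object({ city: z.string() }),
      generate: async function* ({ city }) {
        yield <p>Loading weather for {city}...</p>;
        const data = await fetchWeather(city); // hypothetical data fetcher
        return <WeatherCard city={city} data={data} />; // hypothetical component
      },
    },
  },
});
```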
[00:03:18]
I think it's really cool. It looks like they have a lot more things in here now than what they did have. As far as LangChain, I personally would never use it, I think it's overkill. I think the patterns are bad. It looks like the AI SDK has really good patterns because it's people who know JavaScript.
[00:03:34]
The LangChain SDK looks like people who've never written JavaScript in their life, only use Python, and then ran all their code through ChatGPT and said, turn this into JavaScript. That's what it looks like. Honestly, the abstractions that they have are just so insane to me. I don't know how anybody uses it.
[00:03:51]
So, me personally, no, I would never use LangChain. Vercel AI looks really cool. I might give it another look. But I'm not using Next.js right now, I'm building a mobile app, so I don't think I'll get a lot of the value from it. But for me, I think just using the OpenAI SDK is probably the best move.
[00:04:07]
They're kinda like the Tesla charger of LLM SDKs. Everybody just uses their standard, so if you wanna swap out to Anthropic, you change out the URL, you change the API key, you're done, everything's the same, assuming the same functionality for those models, since every model is just adhering to that standard.
[00:04:24]
So I would just use the OpenAI SDK, in my opinion. Maybe there are some tiny mom-and-pop open-source things out there that haven't blown up, that are quite useful and do one thing very well. Go check those out. But for the most part, no, just the OpenAI SDK.
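A minimal sketch of that swap with the OpenAI SDK, assuming the other provider exposes an OpenAI-compatible endpoint; the environment variable names and model below are placeholders, so check your provider's docs for the real base URL.

```ts
import OpenAI from "openai";

// The OpenAI SDK only needs a base URL and an API key, so pointing it at
// another OpenAI-compatible provider is a config change, not a rewrite.
const client = new OpenAI({
  baseURL: process.env.LLM_BASE_URL, // e.g. the provider's OpenAI-compatible endpoint
  apiKey: process.env.LLM_API_KEY,
});

const response = await client.chat.completions.create({
  model: process.env.LLM_MODEL ?? "gpt-4o-mini",
  messages: [{ role: "user", content: "Hello!" }],
});

console.log(response.choices[0].message.content);
```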
>> Student 2: For doing evals on real user input, do you need to have a mechanism for users to grade your output, such as a thumbs up, thumbs down?
[00:04:50]
>> Scott Moss: Yeah, exactly, almost every chat app has that. They have a thumbs up or a thumbs down to indicate whether or not this was the expected output. That way, they can annotate that or not. And frameworks,
>> Scott Moss: Let me see, Humanloop, humanloop.com, it's a prompt tool thing. They have this ability, you could do that as well, so you can collect the data.
[00:05:18]
They even give you APIs for you to hit when someone hits thumbs up or thumbs down. I mean, Braintrust does this too. Almost any tool that helps you do evals and stuff is gonna have that ability. So yeah, if you're doing evals, you need to collect that feedback and try to encourage users to do it in some way.
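A hypothetical sketch of collecting that feedback yourself, here as a Next.js route handler; the `db` client and table are stand-ins, and tools like Humanloop or Braintrust expose their own APIs for logging the same thing.

```ts
import { NextResponse } from "next/server";
import { db } from "@/lib/db"; // hypothetical database client

// Record a thumbs up / thumbs down for a specific assistant response so the
// pair can be folded into the golden dataset used for evals later.
export async function POST(req: Request) {
  const { messageId, userMessage, assistantResponse, rating } = await req.json();

  await db.feedback.create({
    data: {
      messageId,
      userMessage,
      assistantResponse,
      rating, // "up" | "down"
      createdAt: new Date(),
    },
  });

  return NextResponse.json({ ok: true });
}
```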
[00:05:34]
For instance, thumbs up, thumbs down is cool, but almost nobody presses it. So, we're experimenting with different ways to incentivize the user to tell us things. So one thing that we're thinking of doing is showing the user an accuracy score of how accurate their assistant is, and then once a week send them a push notification like, hey, it's been a week.
[00:05:58]
Do you wanna review my work for the week? And then, we kinda just go over this email came in, I said it was important, do you agree, yes or no? And then it's a swipe thing, and they can just do that really quickly, and they can see their score go up and down.
[00:06:09]
So it's kinda gamified, but it's giving us that data of, they didn't expect this, they did expect this, they did expect this, they didn't expect this. And we get that data for free. So trying to create some incentive for them to label that data for us. And then maybe, hey, if you get your accuracy to 90%, we'll give you something, because it's cheaper for us if your thing is more accurate.
[00:06:30]
So that might be an incentive. You gotta play around with it. You gotta figure out what works for you, but yeah, you definitely wanna do that. Okay, well, if there are no other questions, that's it. Thanks for coming to the course, and I hope you all have learned a lot about getting AI and LLM apps into production.
[00:06:46]
>> [APPLAUSE]