Vercel AI Gateway

Lesson Description

The "Vercel AI Gateway" Lesson is part of the full, Build a Fullstack Next.js App, v4 course featured in this preview video. Here's what you'd learn in this lesson:

Brian introduces using AI to generate article summaries, explaining its benefits and showcasing services like Vercel’s AI SDK, OpenAI, Anthropic, and OpenRouter. He demonstrates setting up an AI key, creating a summarization file, and using the SDK to interact with different models.

Transcript from the "Vercel AI Gateway" Lesson

[00:00:00]
>> Brian Holt: We are now going to talk about, um, AI as I like to call it, just kidding, AI, um. So, I worked this in because I think most of the products that you work on, if they're not already using some sort of AI inference, they're probably about to, because as a product manager, you just want to shove it in everything. It's pretty fun and frustrating to the engineers, which is also fun for me. I thrive on that.

[00:00:26]
Um, so we are going to do, we're going to make our summaries be AI, right? So right now, if you go to like our Wiki Masters homepage, uh, this is just displaying the actual content, right? It's actually not giving you a summary of any variety, um, which is, and you can see that like it has like formatting and stuff like that, not great. So instead, we're going to use AI to read the content of the article, read the title, and then put out like a one sentence AI summary, so that this displays a bit better on this page.

[00:01:02]
We are spoiled for choice here. There's so many different ways to do AI, right? Um, but luckily Vercel has a, uh, an SDK for this called, uh, it's the AI SDK. Um, it's actually, I think one of the best things that, uh, Vercel puts out. Um, it's nice because you don't actually have to use it with their gateway, you can use it directly to Anthropic, directly to OpenAI. Um, and they make like the user experience really easy to switch between the two.

[00:01:30]
So if you have keys for both OpenAI and Anthropic, all you have to do is like switch the actual like endpoint and everything just works, and the AI SDK actually absorbs all the complexity for you. Beyond that, let's go to Vercel. Do I have Vercel up here? I don't think I do. So let's go to Vercel.com. AI Gateway is up here. Um, I have a key up here. Um, they give you $5 for free. Um, you do have to put a credit card in.

[00:02:03]
So that's, um, I don't know how you want to handle that, um, but I looked at basically all of the AI services that give any amount of free tier, and they all require a credit card, because otherwise people would just be using free tokens, right? So if you're not into that, I would say just observe, see what it's like. You can even write the code if you want to. That being said, you can see, I did a lot of testing on this course, and I only used 4 cents' worth of tokens, so that does actually go fairly far.

[00:02:35]
Um, but if you do have, um, tokens already on OpenAI or Anthropic, you can use those with the AI SDK, which is nice. The other one that I recommend to people is OpenRouter. It's a very, very similar idea, um, but again, their free tier requires you to, actually, they might not even have a free tier, I don't know. This is what I normally use personally when I'm trying a bunch of models all at once.

[00:03:00]
The nice thing about them is they have, like, every model known to mankind, right? And it's really easy to switch between a bunch. Um, I do that so I can try to see, does this work in this, does this work in this, all that kind of stuff. And I should mention the company I work for, Databricks, does have an AI gateway as well. It's just not very oriented towards developers today, which is why we're not using it.

[00:03:30]
Uh, OK. So, um, yeah, put in your credit card, um, you don't actually have to, it won't charge you anything, right? It'll just give you your $5 worth of, uh, free credit. Yeah, as far as I know, unless someone can tell me otherwise, there's no totally free way to do AI. You know what you could do, you could run Ollama locally. It's like I anticipated this, uh, you can use Ollama. Check here if you need Ollama instructions.

[00:03:59]
All right, so yeah, this actually would get you started with Ollama. But this will only work locally, obviously, right, because that's how that's going to work. So, uh, go here to AI Gateway. If you're going to try and follow along with the free credits here, put your credit card in, um, and then you should get to something that looks like this. You can go to API keys, I'm going to create a new key, uh, we're going to call this FEM Wiki Masters, create key.

[00:04:41]
Copy. So put that in there. That gives you an AI key here. Um, and now we're going to go to our command line, and we're going to say npm i ai. They got the ai name. I wonder how much they paid for that. Having bid on, uh, npm package names before, people charge a lot of money for them. So getting something as short and as powerful as ai probably cost them quite a bit. OK, uh, awesome. We, uh, we're going to go.
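The transcript does not say where the key ends up, so the following is only a minimal sketch of failing fast when it is missing. The variable name AI_GATEWAY_API_KEY and the helper requireGatewayKey are assumptions made for illustration; check the name your own setup uses.

```typescript
// Hypothetical helper (not from the lesson): look up the gateway key in an
// env-like object and fail loudly if it is missing or blank.
// ASSUMPTION: the key was saved under the name AI_GATEWAY_API_KEY.
type Env = Record<string, string | undefined>;

function requireGatewayKey(env: Env): string {
  const key = env.AI_GATEWAY_API_KEY?.trim();
  if (!key) {
    throw new Error("AI_GATEWAY_API_KEY is not set; add it to your environment file");
  }
  return key;
}
```

In a Node or Next.js server context this would typically be called as requireGatewayKey(process.env) once at startup, so a missing key surfaces before the first model call.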

[00:05:20]
I'm going to just run dev again, because why not? And I'm going to make a new directory here. In my app, it's going to be a folder called ai. This one, I didn't create the AI client here, because it's actually just not really conducive to that. But we're going to make a file here called summarize.ts. And here we're going to import generateText from ai.

[00:06:12]
And we're going to export async function summarizeArticle. And it's going to take in a title, which is a string, and it's going to take an article, which is a string. No, that needs to be a comma. OK. Uh, I have the return type here, but it should be obvious, so I'm not going to put that, um, and you're going to say, if no article or no article trim. You can live without the title, but you definitely need the body of the article.

[00:06:56]
So I'm just going to say throw new error, article content is required to generate a summary. This is not a course in uh prompt engineering. I would normally make a much longer prompt for this, but we're just going to copy and paste it because no one wants to write English. OK, but I put just like a really basic prompt here. If you were doing this for real, I would invite you to like actually go and do like a very long succinct prompt of like, do these things, don't do these things, um, if you haven't seen like the Anthropic like 10 step prompt, that's a really good thing for you to go look at of like how to prompt things really well.

[00:07:32]
Don't do 10 steps every single time, but it would be good to tell like, here's the tone I want, here's the audience I want, here's the call to action I want, here's the article content, here's the title. Blah blah blah, bunch of stuff like that, you'll get a much better summary at the end of it, the more kind of bounding boxes of English you can put around the LLM. It's worth spending a lot of time getting a prompt right, that's really what I want to say, which I didn't do here.
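One way to picture those "bounding boxes" is a small prompt builder. This is a sketch only: the function buildSummaryPrompt, its field names, and the instruction wording are all made up for illustration and are not the prompt used in the lesson.

```typescript
// Hypothetical prompt builder (names and wording are illustrative, not from
// the lesson). Each field adds one constraint around the model's output.
interface SummaryPromptOptions {
  title: string;
  article: string;
  tone: string;
  audience: string;
  callToAction?: string;
}

function buildSummaryPrompt(opts: SummaryPromptOptions): string {
  const lines = [
    "Summarize the article below in one sentence.",
    `Tone: ${opts.tone}`,
    `Audience: ${opts.audience}`,
  ];
  // Only mention a call to action when the caller supplied one.
  if (opts.callToAction) {
    lines.push(`Call to action: ${opts.callToAction}`);
  }
  lines.push(
    "Do not use markdown or any other formatting.",
    "",
    `Title: ${opts.title || "(untitled)"}`,
    "",
    "Article:",
    opts.article.trim(),
  );
  return lines.join("\n");
}
```

Keeping the constraints as named fields makes it easier to change one of them (say, the audience) and compare summaries without rewriting the whole prompt.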

[00:08:20]
So, uh, don't do what I do. Do what you should do. Deep thoughts with Brian. All right, equals await generateText. OK, I'm going to do model. I have this going to openai/gpt-4o-mini. With something like this, I mean, by all means, go try a bunch of these, and you should kind of spend time every 2 to 3 months reevaluating which model you're on. It's just how fast the industry moves.

[00:08:47]
Um, GPT-4o-mini for specifically summarizing text at the lowest cost was what I found the balance at, and I spent 20 minutes messing around with Haiku and Mistral and a bunch of these other ones. This one seemed to be pretty good. That being said, like summarizing text is like the perfect LLM use case, they're all pretty good at it. In which case you can kind of choose whatever one's like fastest and smallest, or maybe, yeah, fastest and cheapest rather.

[00:09:39]
"You are an assistant that writes concise factual summaries." And then prompt, and return text or nothing. And then I have found that, um, I don't know if it was this model, but there's some models that will just put annoying white space at the beginning or the end, so I always throw on that trim. And then here I'm going to say export default summarizeArticle. If you've ever written prompting code to, like, the ChatGPT API or something like that, this is a lot nicer to write.
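Pieced together from the narration, the function might look roughly like the sketch below. To keep it runnable without the SDK installed, the text-generation call is injected as a parameter; in the lesson that role is played by generateText imported from ai. The GenerateFn type, the prompt wording, and the choice to pass the "You are an assistant" sentence as the system field are assumptions.

```typescript
// Minimal structural type for the single call this sketch needs.
// ASSUMPTION: it mirrors the subset of generateText options used here.
type GenerateFn = (args: {
  model: string;
  system: string;
  prompt: string;
}) => Promise<{ text: string }>;

async function summarizeArticle(
  title: string,
  article: string,
  generate: GenerateFn,
): Promise<string> {
  // The title is optional in practice, but the article body is required.
  if (!article || !article.trim()) {
    throw new Error("Article content is required to generate a summary.");
  }

  // Placeholder wording; the lesson pastes in its own basic prompt.
  const prompt = `Summarize this article in one sentence.\n\nTitle: ${title}\n\n${article}`;

  const { text } = await generate({
    model: "openai/gpt-4o-mini", // "provider/model" string used in the lesson
    system: "You are an assistant that writes concise factual summaries.",
    prompt,
  });

  // Some models pad their output with whitespace, so always trim.
  return (text ?? "").trim();
}
```

With the SDK installed, something like summarizeArticle(title, body, (args) => generateText(args)) should wire it to the real call, though that wiring is untested here.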

[00:10:17]
They just like kind of like corralled it into like a very common use case, don't get me wrong, it's less flexible. I don't even know if that's necessarily true, but like it requires, it just requires less. Let's, let's say that we wanted to go instead of to Vercel's AI Gateway, we want to go directly to Anthropic. Um, it would look something like this, and you can like, uh, not follow along if you're not going to do that.

[00:10:59]
You'd import anthropic from, and then you would have to install, @ai-sdk/anthropic. And instead of the model here, let's just do this, it would look like this. You would say anthropic and you would put something in here like claude-haiku-4.5. And all of a sudden you're no longer querying against Vercel's AI gateway, you're now querying directly to Anthropic's API. What's nice about this is, like, I already have money in my, uh, OpenAI account, uh, just from doing other stuff.

[00:11:37]
So if I wanted to use that spend instead of my Vercel spend, you could do it with this, and instead of this, it would be openai, and so on and so forth, you just wrap this with openai. And I think this would just be gpt-4o-mini like this, and now this would be going instead to OpenAI. I think, go check this one, this one I'm just writing from memory, but does that make sense?
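The difference between the two routes comes down to what goes in the model field. The comments below show the shapes described above, and the helper is a hypothetical illustration (not part of the AI SDK) of how a gateway-style ID splits into the pieces a direct provider call would need. As the speaker says, the direct-provider form is from memory and worth checking.

```typescript
// Shapes of the `model` value, as described in the lesson:
//   model: "openai/gpt-4o-mini"      -> routed through Vercel's AI Gateway
//   model: openai("gpt-4o-mini")     -> direct to OpenAI (provider package)
//   model: anthropic("<model-id>")   -> direct to Anthropic (@ai-sdk/anthropic)

interface ModelRef {
  provider: string;
  modelId: string;
}

// Hypothetical helper: split "provider/model" at the first slash.
function parseGatewayModelId(id: string): ModelRef {
  const slash = id.indexOf("/");
  if (slash <= 0 || slash === id.length - 1) {
    throw new Error(`Expected "provider/model", got "${id}"`);
  }
  return { provider: id.slice(0, slash), modelId: id.slice(slash + 1) };
}
```

For example, parseGatewayModelId("openai/gpt-4o-mini") yields the provider "openai" and the model ID "gpt-4o-mini", which is the string the direct OpenAI wrapper takes in the transcript.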

[00:12:01]
It's one of the reasons I actually really do like the, I like that they don't tie you into the Vercel Gateway, and also I think it's just like brilliant marketing because now everyone's using it, it makes it very easy to just like, oh, I want to try this other model. I guess I'll just use Vercel's AI gateway because I'm already using their SDK because it worked. I did do that. OK. Yeah, it makes me sad because I do have a Claude Pro subscription.

[00:12:51]
It doesn't come with any API credit though, which makes me sad. I feel like I should get some nominal amount of money per month. So, Anthropic, you should hire me to do pricing. Just kidding, don't hire me to do pricing. I hate doing pricing. Um, I just wanted to show you as well. Uh, it's in here. Where is it? This, yeah. They support a lot of models, a lot, a lot, a lot. DeepSeek is on here, that's always an interesting one to try.

[00:13:26]
Mistral always puts out some interesting ones. There's Grok, if you want it. Um, don't sleep on the Gemini models, like Gemini 2.5 Flash is really, really, really good for some things. My general thinking is, uh, for just all tasks I work on, I usually start with Sonnet 4.5. It just seems to be kind of good at everything, it's kind of fast. Um, it's just expensive on here, right?

[00:13:51]
Like their input tokens are 3 bucks here, as opposed to the one that we're going to be using, which is 5 cents per million tokens. And then this Gemini 2.5 Flash here, obviously by the time that this gets edited and put out, these are going to be like Gemini 7, right, or something like that. Um. But yeah, don't sleep on these, definitely try these ones. It was one where I was always surprised at how well it does on something versus how much it costs.
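To make the price gap concrete, here is a small arithmetic sketch using the two input prices quoted above, reading the "3 bucks" as also being per million tokens. The function name and the example token count are made up, and output-token pricing is ignored.

```typescript
// Cost of input tokens only, given a price quoted per million tokens.
function inputCostUsd(tokens: number, pricePerMillionUsd: number): number {
  return (tokens / 1e6) * pricePerMillionUsd;
}

// Illustrative workload (hypothetical): 1,000 articles at about 2,000 input tokens each.
const totalInputTokens = 1000 * 2000;

const pricierModelCost = inputCostUsd(totalInputTokens, 3); // the "3 bucks" model
const cheaperModelCost = inputCostUsd(totalInputTokens, 0.05); // the "5 cents" model
```

For that workload the totals come to about 6 dollars against about 10 cents, a 60 times difference on input alone.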

[00:00:00]
Um, yeah, so feel free to get these, uh, OpenAI, uh, the GPT-4o ones are always interesting as well. Yeah, some of these I don't even know like I don't even know what Zhipu is. Never tried it. Anyway, that's what's fun about the AI SDK though. All you have to do is swap the string and all of a sudden you're using a different model.
