Build an AI-Powered Fullstack Next.js App, v3

Vector Database Overview

Netflix

Lesson Description

The "Vector Database Overview" Lesson is part of the full, Build an AI-Powered Fullstack Next.js App, v3 course featured in this preview video. Here's what you'd learn in this lesson:

Scott introduces vectors and vector databases. Vectors group related data points which simplify the amount of context required for the OpenAI API to respond to a prompt accurately. Each entry will be added to a vector database, and only those closely matching the subject of the prompt will be used in the search analysis.

Join Now

Preview

Transcript from the "Vector Database Overview" Lesson

[00:00:00]
>> So let's do the AI part. This part is really cool. Don't worry if you can't get the form down. It's just a regular form. You can get to it later. I also have the code on GitHub. Okay, AI stuff, this one's gonna take some explaining. [LAUGH] Because it's way different than what we did here, much different.

[00:00:17]
But still pretty exciting. So I'm gonna make a new function in our AI file, and I'm just gonna call it QA. It's an async function. Okay, let's talk about this. In order to do what we're trying to do, I have to explain to you how the AI works.

[00:00:37]
So in-order for something like a LLM to be able to answer a question based off some information, which is what we wanted to do, we wanna give it. So think about how did the the sentiment analysis, we had to give it the entry, right? If we didn't give it the content of the entry, how would it have done the sentiment analysis?

[00:00:56]
Okay, so if we want to have it answer a question across all of our entries, what do you think we need to do?
>> You have to patch it and feed it to OpenAI as well, or is there a different line chain module?
>> Or feed what to OpenAI?

[00:01:17]
>> Yeah, all the [CROSSTALK].
>> All the entries?
>> Yeah.
>> So, yeah, you do. You wanna feed all the entries. The only problem is, there's one huge caveat, is that most LLMs have a context limit on how many tokens you can feed it. A token, you can think of a token on average is basically like four letters, every four letters is a token.

[00:01:38]
And most LLMs, or actually every LLM has a token limit on how many tokens you can feed it. And it depends on what model, GPT-3, every new one that comes out has a bigger window. I think the biggest hosted big one right now is probably. They just released one that has like a over a 100k token limit, which is massive.

[00:02:01]
I forgot what GPT 3.5 Turbo is, but it's not big. And eventually you'll hit that limit. We might not hit it with our small data set, but you will hit it. I mean, you can go to ChatGPT right now, go copy a long blog post and paste it in GPT and ask it a question about it, and it'll throw an error.

[00:02:17]
They're like, I can't, this is too much. You passed my context window. So we got to find a way to get around that. The best way to get around that is like, well, do you always need everything to answer this question? Maybe you don't. Maybe you don't need everything.

[00:02:35]
That's one thing to think about. Maybe you only need a specific set of things. If I have a history book that's got 100,000 pages in it, and I'm asking a question about a specific date and time, do I need to tell GPT about this entire history book or only the part that references this date and time?

[00:02:53]
That's a solution. Another solution is to make multiple GPT calls, each call filling up the context window until it is aware of all of it, and then you kinda like summarize every single time. But basically, it's recursive, you would feed it some context, ask the question that you want, get that answer.

[00:03:16]
Feed it some more context, you'd be like, is this an answer still valid based on this new context? And you'll keep doing that until you're done with all the data that you had, and then eventually you'll get the final answer. That works too. There's a lot of ways.

[00:03:27]
We're in the number one, the first way, where we're just like, maybe you don't need all this data. We're only gonna send GPT the data that it needs to answer your question. But we're gonna use AI to figure out what data that we need to answer the question.

[00:03:44]
Actually, we're gonna use something called a vector database. Anybody ever heard of that? Okay, let's talk about that for a second. So what is a vector? Vectors are just numbers, really. If you've ever done matrix math or anything like that, you could think of a vector as just like an array of points, right?

[00:04:04]
It's an array of numbers, a number between 1 and 0, and it can have many different numbers in that one vector. The amount of numbers in it is called, how many dimensions it is, right? So in the case of OpenAI's capabilities, you can have vectors that are over like 1500 dimensions.

[00:04:26]
Too many to plot in physical space, right? You can't plot 1500 dimensions, we don't even know what 4D looks like, yet alone 1500. So why is that important? Well, basically what's gonna happen is we're gonna use AI to take the content, in this case, our entries, the text of that, convert those into vectors or embeddings.

[00:04:48]
An embedding is like a collection of vectors. And basically we are taking that string and turning it into a number that represents that string. It represents the meaning of it, the sentiment, everything about that string is being converted to a number, okay? We're then gonna store that number in a database, a special database called a vector database.

[00:05:08]
And then when you ask a question, we're gonna do the same thing. We're gonna take that question, we're gonna convert that to a number. We're gonna put that in the vector database as well. And now, forget about 1500 dimensions, just think about two dimensions. If I had two-dimensional points plotted on a chart, okay, and they're just scattered all over the place, and then I get this other one that comes in and I plot it, I can do math to figure out which points are closer to this one, right?

[00:05:33]
And I'm gonna grab the ones that are closer, let's say the top five, and I'm gonna say, cool, these are the top five points closer to this point. That's basically what we're gonna do. So we're gonna turn your question into a vector. And then we're gonna see what vectors are the closest to it in this space, which will be what entries are closest to this in meaning and reasoning and whatever algorithm that AI is choosing.

[00:05:55]
And we're gonna pull those out and we're gonna feed that as context to ChatGPT, just like we fed this single entry itself. That sounds like a lot of work, [LAUGH] very hard. And there's companies literally that are raising billions of dollars to do just this. There's vector databases that are worth billions of dollars now that just came out two years ago.

[00:06:16]
Pinecone is one of the biggest vector databases out there. It's amazing. We're just gonna do one in-memory. We're gonna do in-memory vector database. But a lot of databases are specialized for this, some of them extended. So you can do vector database in Postgres, Redis, Superbase. You can get a standalone vector database, ChromaDB, Pinecone, Milvus, Weaviate.

[00:06:36]
There's so many of them. We're just gonna do one, in-memory, because we're just not gonna do that. I've messed around with all of them, they're all just too much to set up and too annoying. So let's do one, in-memory. Okay, any questions on any of that? That was a lot.

[00:06:52]
Actually, yeah?
>> Are they similar to a KNN algorithm?
>> KNN, I'm not familiar with the KNN algorithm, so I can't tell. I think they might be describing how the search on the algorithm or the search on the vector databases, and there's different algorithms for that. One is cosine similarity, which is what we're gonna do.

[00:07:14]
I think KNN might be one of those algorithms, in which you can search vectors on, but I'm not a machine learning expert or mathematician, so I don't really know, but could be. Yeah, if you go to OpenAI's website, if you go to their blog, they actually have a really good, They have a really good illustration of vectors.

[00:07:43]
Let me see if I can find it. I think it's on this other one where they talked about embeddings.
>> Nearest neighbor algorithm, is what they were referring to.
>> Nearest neighbor. I think you could use a nearest neighbor for that. I think the nearest neighbor gets more complicated the more dimensions you have, though, so I don't know.

[00:08:03]
It's actually not that complicated once you look at it. Okay, so here's a 2D example of it. So imagine all of these are journal entries, right? They're plotted based off meaning. The closer they are, the more they relate in meaning or reasoning, right? So like canine companion, say, that's very related to wolf, all right?

[00:08:26]
Feline friends, say, meow, those two are very related. And these are animals, so they're kinda grouped together, right? So everything's kinda grouped together by their meaning and relations. And then, basically, the question that you ask gets turned into one of these two, and it too gets stuck on this graph.

[00:08:44]
And then from there, we're just gonna pick the ones that are closest to it. Take those out, feed it to GPT and keep it already to the context window. And then we just ignore everything else, cuz you didn't need all of that to answer this question. That makes sense?

[00:08:59]
Okay, and I have like 3D one too. All right, here's what I wanted to show. Yeah, this one. So you can kinda see how these things are plotted, right? This is just a 3D representation. There's only three dimensions, but it goes higher than that. So you can see these things are grouped together.

[00:09:13]
Here's the text for them and why they are grouped together. So these are like all animal stuff, athletes, some film stuff, something about a village, and transportation, and they're all grouped in that way. So when you ask a question, we're just gonna throw yours in here too. Let's say that one's gonna be a yellow dot.

[00:09:30]
Which one of these dots are closer to the yellow dot, take those. Yeah, it's kinda cool, yes?
>> How deep have you gone into vectors and matrices? Is it fairly superficial understanding?
>> Yeah.
>> [INAUDIBLE] or you are like, no, keep keep learning what you can.
>> Yeah, so to be able to do what we're doing now, what I just told you is all you need to know.

[00:09:57]
You don't need to know any of this stuff in order to be able to build applications with this stuff.
>> But you don't, so as a boot camp developer, like don't worry about it?
>> No, you do not need to worry about any of this stuff. The only reason I went deep on this is to, one, part of my job is taking a lot of these technical pitches.

[00:10:16]
I'm usually talking to PhD scientists from MIT who's like pitching me a company, and they talk very deeply about this stuff. So I'm like, okay, I need to go understand this. So that's one. And then two, I've always just been interested in math and stuff like that. I'm not a mathematician, if you know me, you know I'm terrible at math.

[00:10:33]
But it was always something I was interested in, so I just never was excited about it. This actually got me excited to go learn. So I actually asked ChatGPT to build me a math curriculum to teach me more about AI and machine learning. So then I just went and followed that curriculum and-

[00:10:47]
>> So cool.
>> That's how I know a lot of stuff. But I would even say, I'm not good at it. I don't understand it as well as the people who created it. I can have a conversation about it, but don't ask me to write an equation about this stuff, cuz I don't know.

[00:11:00]
I really don't know.

Learn Straight from the Experts Who Shape the Modern Web

In-depth Courses
Industry Leading Experts
Learning Paths
Live Interactive Workshops

Get Unlimited Access Now