Build an AI-Powered Fullstack Next.js App, v3

Similarity Searches with Vectors

Scott Moss

Superfilter AI

The "Similarity Searches with Vectors" Lesson is part of the full, Build an AI-Powered Fullstack Next.js App, v3 course featured in this preview video. Here's what you'd learn in this lesson:

Scott codes the helper method that receives a question, creates an in-memory vector database from the journal entries, and performs a similarity search against them. The result is a set of relevant entries that can be sent to the OpenAI API for analysis.


Transcript from the "Similarity Searches with Vectors" Lesson

[00:00:00]
>> Let's get started. So, where's my GitHub? There we go. Okay, so if we go into the AI one... I don't have this one open, and there's a lot going on here, so I want to make sure I get it exactly the way I had it.

[00:00:14]
So the first thing: if we go into our AI file, we have our function called qa. qa just needs to take two things: the question you're asking, and then all the entries, because we have to index all the entries into the memory database every time, since we're using an in-memory database.

[00:00:30]
If this were a hosted database, we wouldn't have to take the entries every time; we'd just index them whenever we save them in our PlanetScale database. Every time an entry gets saved in PlanetScale, we'd also save it in the vector database. But we don't have a hosted vector database, just an in-memory one.

[00:00:46]
So we have to create it every single time we ask a question. So we'll take the entries, and then, I'll just give an overview of everything first, and then we'll talk about the specifics of what's actually happening. We want to turn everything into what's called a LangChain Document.

[00:01:02]
Nothing fancy about it; it's just an object that represents some string. And you can add metadata to it, like where it came from, when it was created, whatever you want. This is helpful if you want the AI to cite sources on where it learned information from, because that's very beneficial.

[00:01:19]
If you ask the AI a question, you can say, can you cite where you learned that from? You put that metadata in your document so the AI can cite where the information came from, so you know it's not just making stuff up.

[00:01:32]
Really cool. So we'll take that, then we'll create a model, and from there we'll do this thing called a chain. We'll talk about chains in a minute; they're a concept inside of LangChain. We'll create embeddings, embeddings are just a group of vectors. And then we'll store those documents in our memory vector database.

[00:01:53]
We'll perform that similarity search that I talked about, where we take your question, turn it into a vector, plot it, and grab all the similar ones. And then we'll make our API call to the AI and get back our result. Cool, so let's do that.

[00:02:12]
So the first thing is, let's take our entries and turn them into docs. So we'll say docs = entries.map, and we get an entry here. Basically we wanna return a new Document that we can get from LangChain. Let's see if it auto-completes for me... I don't know why the autocomplete pulls from dist.

[00:02:38]
Let's see if it pulls from anywhere else. It must be the way LangChain set their package up, because the autocomplete never works for their stuff, but I'm pretty sure it comes from-
>> /dist/document/chain/dist...
>> Yeah, typically you don't wanna pull from dist. That's typically just their output folder.

[00:03:00]
But there should be another one, right? Let's see, where did I pull it from? Yeah, it's just /document, right? But I don't know why the autocomplete doesn't see that. I think it's just how they set their package up. But yeah, you typically don't pull from dist.

[00:03:16]
Yeah. And then it has this green icon, and that's what throws me off. Usually when you see the green there, to me that says it's a type file, a file with just types in it and not a TypeScript file, so it always throws me off.

[00:03:33]
So I don't know why they have that there. Anyway, that's just some OG stuff I guess [LAUGH]. So let me make a new Document. And then inside the document, you can put the pageContent; the pageContent is just gonna be entry.content. And then we can have the metadata, which is just an object.

[00:03:55]
And then I'll put, I want the ID of the entry in here, so we know where this came from. Maybe I want when this was created, so I'll get entry.createdAt. And this is helpful if I ask a query like, well, how did I feel within the last two days?

[00:04:14]
Well, now we can calculate that cuz there's some metadata here on when this was created. So it's helpful to do that. You can put whatever you want. I mean, you can put the whole entry in here if you want, it doesn't matter. I'm just gonna put those two things.
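For reference, the mapping being written here comes out to roughly this (the entry fields match what's described above; the langchain/document import path is from the pre-0.1 langchain package this course uses):

```ts
import { Document } from "langchain/document";

// Wrap each journal entry in a LangChain Document. The metadata
// (id, createdAt) lets answers be traced back to a specific entry.
const docs = entries.map(
  (entry) =>
    new Document({
      pageContent: entry.content,
      metadata: { id: entry.id, createdAt: entry.createdAt },
    })
);
```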

[00:04:28]
So now we have an array of documents. And from there, there's really no right order for what we wanna do here. But I think from there, what we can do is load up the model and then create the chain. So for the model, because this is a QA use case where we want it to be very strict, we can say model = new OpenAI({ temperature: 0, modelName: "gpt-3.5-turbo" }), like that.
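Spelled out, the model setup looks like this (the langchain/llms/openai path is an assumption based on the package version from this era; other releases place OpenAI elsewhere):

```ts
import { OpenAI } from "langchain/llms/openai";

// Temperature 0 makes the output as deterministic as possible,
// which is what you want for strict question answering.
const model = new OpenAI({ temperature: 0, modelName: "gpt-3.5-turbo" });
```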

[00:05:14]
And then we're gonna create a chain. A chain is basically exactly what it sounds like: it allows you to chain multiple LLM calls together. You can string them together and have the output of one be the input of another. That's basically what it is. LangChain has many different types of chains.

[00:05:31]
They have a chat chain, an LLM chain, agent chains, different types of agent chains; the list is actually exhaustive. I don't think I've even tried half of them yet. So what we're gonna do is use the one they have called loadQARefineChain, so let's do that.

[00:05:48]
So we'll say chain = loadQA... it doesn't autocomplete for me. No, this one's from langchain/chains. Yeah, they just changed all their imports; you used to be able to import everything from langchain until about a week ago, so I'm still learning these imports. Load... not loadQAChain.

[00:06:21]
loadQARefineChain. We'll load the QA refine chain and pass in the model. So what is the QA refine chain? I think it's helpful just to talk about that. In the docs it's under, I believe, Indexes, Document QA. Here we go. So, the QA refine chain.
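The chain setup is a one-liner once the import is right:

```ts
import { loadQARefineChain } from "langchain/chains";

// Builds a refine-style question-answering chain around the model.
const chain = loadQARefineChain(model);
```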

[00:06:57]
They talk about it right here: this chain iterates over the input documents one by one, updating an intermediate answer with each iteration. It uses the previous version of the answer and the next document as context. It's suitable for QA tasks over a large number of documents, so that's why we're using it.

[00:07:16]
There are other ones too, like map-reduce, and stuff. You can use stuff if you don't have that many documents; you can just stuff them all into one prompt, that's why it's called stuff. Map-reduce is, how do I describe this one? It basically maps over the documents and tries to summarize them into something smaller.

[00:07:43]
It tries to do things in parallel. So what it'll do is make multiple calls to the AI in parallel, then take the results of all of them, combine those results, and make another call to get the final result. So it literally maps over all of them and then reduces them into one, map-reduce.
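For comparison, all three document-QA strategies mentioned here are loaded the same way from langchain/chains (a sketch; model is the OpenAI instance created earlier):

```ts
import {
  loadQAStuffChain,
  loadQAMapReduceChain,
  loadQARefineChain,
} from "langchain/chains";

// stuff: cram every document into a single prompt (fine for a few docs)
const stuffChain = loadQAStuffChain(model);

// map_reduce: process docs with parallel LLM calls, then combine results
const mapReduceChain = loadQAMapReduceChain(model);

// refine: iterate doc by doc, improving an intermediate answer each pass
const refineChain = loadQARefineChain(model);
```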

[00:08:02]
We're gonna use refine; I think that will get the best results for what we have. But as this app grows, you might find you need something else. Okay, the next thing we're gonna do is create the embeddings that we wanna use, and then we'll use a vector store with them.

[00:08:19]
So we can create the embeddings. There are many embedding models; we're just gonna use the one that OpenAI has. So we can say new embeddings, which I believe comes from /embeddings. Let's see, langchain/embeddings, yep. And we wanna get the OpenAIEmbeddings, like so. We don't have to pass in any args here; it's gonna read the environment variable.

[00:08:51]
This is gonna return something any chain can use to create embeddings. Remember, embeddings are a group of vectors. It's basically saying, this is how I want you to create embeddings. In this case, it's going to make an API call to OpenAI to create an embedding.

[00:09:07]
OpenAI has an embedding endpoint where you can send it some text and it'll send you back some vectors. Because you don't have access to OpenAI's model on your machine, you have to use theirs, hosted. So that's what this is gonna do.
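In code, that's just this (the langchain/embeddings path matches what's shown on screen; newer releases moved it to langchain/embeddings/openai):

```ts
import { OpenAIEmbeddings } from "langchain/embeddings";

// No args needed: it reads OPENAI_API_KEY from the environment and
// calls OpenAI's hosted embedding endpoint to turn text into vectors.
const embeddings = new OpenAIEmbeddings();
```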

[00:09:22]
So we have our embeddings. The next thing we wanna do is create our store. This is our vector store, where we can store everything. And this is just going to be inside of, where's this one? vectorstores/memory. So we'll do that, and then we'll just do the fromDocuments and pass in the docs and the embeddings.

[00:09:41]
So let's import that: langchain/vectorstores/memory. We'll bring in the MemoryVectorStore. We'll make a new MemoryVectorStore.from... actually, I don't think you even need the new. It's just MemoryVectorStore.fromDocuments. We're gonna pass in our docs, and then we're gonna pass in our embeddings. So we got that.
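Put together, the store creation looks like this:

```ts
import { MemoryVectorStore } from "langchain/vectorstores/memory";

// Embeds all the docs and keeps the vectors in process memory, which
// is why the store has to be rebuilt from the entries on every question.
const store = await MemoryVectorStore.fromDocuments(docs, embeddings);
```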

[00:10:22]
Yes.
>> And just to clarify, a Document is kind of the required data type from LangChain to be able to do this?
>> Yeah, if you wanna be able to use the stores they abstract away, you need the Document. LangChain has wrappers for every popular vector database out there, so let me show you.

[00:10:51]
Vector stores. So you can see these are some of the top vector databases out there. They had to create a standard format for how you store things in those vector databases, and Document is that standard format. You can interact with these databases outside of LangChain, and then they might have their own way of storing text.

[00:11:10]
It might not be a Document, but if you're using LangChain, you might as well use their abstraction. And there are many ways to make a document. They have document loaders that can pull in a PDF file. Actually, let me show you.

[00:11:26]
Here's a file loader; this can take multiple files. It can take CSVs, ePubs, docx, JSON files, PDF files, Notion markdown, a text file. You can take in a website, [LAUGH] you can take in a GitBook, a Figma file, a GitHub repo. It doesn't matter, it'll load all that in and create documents which you can then create embeddings for.
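As one example among many, a plain text loader looks like this (the file path is made up for illustration; the fs/text path is from the reorganized langchain imports):

```ts
import { TextLoader } from "langchain/document_loaders/fs/text";

// Reads a text file from disk and returns Document[] ready for embedding.
const loader = new TextLoader("./journal.txt");
const loadedDocs = await loader.load();
```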

[00:11:49]
And this is a small list compared to the other stuff that's out there on LlamaIndex, if you've ever heard of that. You can pull anything from anywhere with that thing, it's absolutely insane. And that works with LangChain too. But that's getting off topic. Cool, so now we have the store.

[00:12:10]
Now all we wanna do is perform our similarity search, so we can say store.similaritySearch, pass in our question, and get some relevant docs. So we'll say, cool, these are the relevantDocs, and we're gonna await the store.similaritySearch and pass in our... why did it do that?

[00:12:29]
What? There we go, similaritySearch, and then pass in the question. The lines are freaking out, I don't know why. What do you want? Does not exist on... it does? I use this all the time. You have to await this, I'm sorry, it's a promise.

[00:12:54]
There we go. Okay, so now we have our relevant docs. Now we're ready to make the call to the AI, because at this point we know which entries you need to answer this question. That's what this is: based on the question you gave us, we know these are the only entries you need to answer it, because we already put them all in the database.
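The search step being described is just:

```ts
// Embeds the question, compares it against the stored vectors, and
// returns only the closest documents (a handful by default).
const relevantDocs = await store.similaritySearch(question);
```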

[00:13:17]
And this right here takes your question, runs it against the database, and comes back with the relevant docs. So now we're ready to make that call. And we can do that with any chain: you just call call, and then we can pass in our input documents and our question.

[00:13:31]
So let's just do that: res = await chain.call. It's gonna be relevantDocs... or, I'm sorry, I think it's input_docs. Yeah, input_docs is gonna be relevantDocs, and then question is just question. Input_documents, sorry. Where are you at? Now we have a response, and the response, I think it always has .output_text or something like that.

[00:14:13]
Yeah, so we can just say return res.output_text, and that'll be our answer. So this is significantly different from what we did with the first AI thing; this is way more advanced. In fact, this is a really good foundation for building some pretty crazy stuff. I don't even know where to begin to give you examples of what you can do with this, but you can do pretty much anything.
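Putting the whole lesson together, here's a minimal sketch of the finished helper. The Entry type and the qa name are assumptions standing in for the app's own types; import paths match the pre-0.1 langchain package used throughout:

```ts
import { Document } from "langchain/document";
import { OpenAI } from "langchain/llms/openai";
import { loadQARefineChain } from "langchain/chains";
import { OpenAIEmbeddings } from "langchain/embeddings";
import { MemoryVectorStore } from "langchain/vectorstores/memory";

// Stand-in for the journal-entry shape stored in the app's database.
type Entry = { id: string; content: string; createdAt: Date };

export async function qa(question: string, entries: Entry[]) {
  // Wrap each entry in a Document so LangChain's stores can index it.
  const docs = entries.map(
    (entry) =>
      new Document({
        pageContent: entry.content,
        metadata: { id: entry.id, createdAt: entry.createdAt },
      })
  );

  const model = new OpenAI({ temperature: 0, modelName: "gpt-3.5-turbo" });
  const chain = loadQARefineChain(model);
  const embeddings = new OpenAIEmbeddings();

  // Rebuilt per question, since nothing persists between requests.
  const store = await MemoryVectorStore.fromDocuments(docs, embeddings);
  const relevantDocs = await store.similaritySearch(question);

  // Run the refine chain over only the entries relevant to the question.
  const res = await chain.call({
    input_documents: relevantDocs,
    question,
  });

  return res.output_text;
}
```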

[00:14:41]
I have some code on my computer where I give it a URL and it'll crawl that website: go to the sitemap, find every URL on the website, download every single page, create embeddings for every page, put them in the database. And now I can ask questions about that website, right?

[00:14:58]
Then I can have it answer questions across many websites. Great for doing research, great for learning. I can have it crawl Next.js's documentation, and then I can just say, write me some code in Next.js 13, and it knows how, because it knows the whole website. Stuff like that.

[00:15:14]
And that's still a small example of all the things you can do.
