Transcript from the "Connecting GPT to Data" Lesson
>> So if you ask me, I think this is kind of magic that you can give more context to the prompt. And that's how you do that. So if you want to create a chat in your own website that answers questions from your frequently asked questions. Well, you need to pass to GPT, you are frequently asked question.
[00:00:25] And then the user question. Doesn't make sense and you need to explain, you will answer only from this set of questions. Things like that, that's how it's done. But is there any other way? So because we cannot fine tune LLMs such as such, chatGPT. We need to go back to the prompt.
[00:00:47] But if you wanna connect GPT to our data, our databases or our PDF files, The magic happens in the context. And we know that we need to pass a lot of information In this chat format that we have. Typically we pass that in the system message remember that we have a dialogue and we can pass a system you are this and that well there you can add all the information or the context that you want to provide.
[00:01:19] The problem is that sometimes the context is too large. GPT doesn't have memory so we have to inject the prompt on every call, which is costly first and seems like a problem, right? So for large databases and documents there is a design pattern. To split the data and do a search outside of the model before the prompt.
[00:01:48] Let me explain how that works, for example we can connect GPT with our own data for a chatbot that can answer questions locally. To search transform summarize our own data. Or something that is kind of a final use case that we see lately a lot is chat with, chat with a PDF, chat with a video.
[00:02:15] So you can chat with these videos you're watching right now. In this case We use the caption, we need to use text for now, the captions of my voice speaking. And we can put this into this process that I will explain in a minute. And then you can chat with I know some kind of myself in the metaverse of GPT, at least in this video.
[00:02:38] So how does it work? The idea is that if you have an idea of what you're looking for you first search. So for example, someone is asking about how to make from this video how to make a ChatGPT plugin. Okay, so you know, that has to do with ChatGPT plugin.
[00:02:57] And I haven't mentioned ChaGPT in the last hour, so we know that it's not in that park. But there is a part of the video where we mentioned that. So we search that we can search in a normal database in files or data collection. And then you pass the result just the piece, the chunk of text of that part of the video to ChatGPT as a context.
[00:03:27] I know it's a silly. But sometimes it's easy when not simple to understand where in the whole document, we have the answer for the user. It's difficult just by searching by keyword. It's not so simple. For example, if you asked me about GPT, I mentioned GPT over the whole video 1000 times, so then you will pass the whole video text?
[00:03:52] No, we have another solution for that. And that solution is called embeddings. And we use a different kind of database for that. That open AI is not helping you with that part. So, because this is an open AI and ChatGPT course. I will give you some hints on where to look for these APIs, but we won't cover that today.