The "Document QA Systems" Lesson is part of the full, Build AI-Powered Apps with OpenAI and Node.js course featured in this preview video. Here's what you'd learn in this lesson:

Scott introduces document QA (Question Answering) and compares it with semantic search. Document QA systems provide direct answers to questions, rather than just returning links like search engines. Use cases for document QA systems are discussed as well as benefits including efficiency, accuracy, and scalability.

Transcript from the "Document QA Systems" Lesson

[00:00:00]
>> Let's talk about the next one, which is very similar to semantic search, but it just goes a step further, which just makes it harder to deal with, but also pretty cool. And this one is Document QA. So we all know what a QA system is. I mean, you ask it a question.

[00:00:15] It gives you back an answer. Every website has an FAQ that already has those things built out. This is just more live, right? It's literally a QA. So this is a little different than search engines, right? So a search engine like Google is gonna return links that you can go to, but they're not gonna tell you the answer that you're looking for, right?

[00:00:39] Like if I'm searching for the weather, I think this is a bad example, cuz Google does show you the weather now, and every other search engine will try to show you a widget of the weather. But you could think of that as a document QA. So what they're doing is they're going somewhere, getting the weather and they're answering it for you right there.

[00:00:55] Versus traditionally, they would show you the links of places on which you can go find the weather yourself. That would be more like a search. Like, you search for weather, here. Here's the results. Go click on it and figure out how the weather's there. A QA is like, we already figured it out for you.

[00:01:11] We clicked on that link, we got the data and then here is the weather, right? So that's what a Document QA is. It's more of like, here are the hits versus like, here's the answer that you wanted, right? And then it's also, typically, whereas a search might be indexed across multiple things, like in our case, movie titles, a Document QA could just be one thing.

[00:01:35] It could just be one movie, and I want to do QA on one movie. I wanna ask questions about that movie, right? It's not always against multiple things. And that's the difference. You wouldn't need to search if you were only indexing one movie. If you only had one movie in your database, you wouldn't need a semantic search, you only have one.

[00:01:53] But you could definitely make a Document QA about that movie and ask questions about it. If you have a wiki page about that movie and you index that, that's great for Document QA, terrible for search. Okay, so some of the use cases, like I said, knowledge bases and wikis.
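The search-versus-QA distinction described above can be sketched in a few lines. This is a minimal illustration, not code from the course: the `Doc` type, the keyword-overlap `score` function, and the `search` and `buildQaPrompt` helpers are all hypothetical names, and a real system would likely rank with embeddings rather than word matching. Search hands back links to go read; QA gathers the text and asks a model to answer from it, and it still works when there is only one document.

```typescript
// Hypothetical document shape used only for this sketch.
interface Doc {
  title: string;
  url: string;
  text: string;
}

// Toy relevance score: how many query words appear in the document text.
// Assumption: a real system would use embeddings; this keeps the sketch dependency-free.
function score(query: string, doc: Doc): number {
  const words = query.toLowerCase().split(/\W+/).filter(Boolean);
  const text = doc.text.toLowerCase();
  return words.filter((w) => text.includes(w)).length;
}

// Search: "here are the hits". Returns ranked links and leaves the reading to the user.
function search(query: string, docs: Doc[]): string[] {
  return docs
    .map((doc) => ({ doc, s: score(query, doc) }))
    .filter((r) => r.s > 0)
    .sort((a, b) => b.s - a.s)
    .map((r) => r.doc.url);
}

// Document QA: "here's the answer". Builds a prompt that asks a model to answer
// from the document text itself. Works fine with a single document.
function buildQaPrompt(question: string, docs: Doc[]): string {
  const context = docs.map((d) => d.text).join("\n---\n");
  return `Answer the question using only the context below.\n\nContext:\n${context}\n\nQuestion: ${question}`;
}
```

The string returned by `buildQaPrompt` would then be sent to a chat model; that call is left out so the sketch runs without an API key.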

[00:02:07] If you have a wiki at your company, and you wanna be able to ask questions on the wiki, like, for instance, you guys have an engineering style guide of like, this is how you need to write these functions. This is how this stuff should work. Here's our pull request thing.

[00:02:18] This is great for that. You can just have a bot. Imagine a Slack bot that knew everything that an onboarding engineer needed to know, and they can just talk to it. And like, how do I get started? How do I set up the repo? What's the best way to write functions?

[00:02:34] Do we use semicolons, right? If it was indexed on that data, it would answer those questions perfectly and accurately, otherwise it wouldn't give you an answer at all cuz it can't give you an answer that isn't based off reality. Or at least [LAUGH] that's how you want it to be.
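One common way to push a model toward "answer from the docs or don't answer at all" is to say so in the prompt. The sketch below is an assumption about how that could look, not the course's implementation: `buildGroundedMessages` and `NO_ANSWER` are hypothetical names, and the `system`/`user` roles follow the usual chat message shape. A prompt like this encourages grounded answers but does not guarantee them.

```typescript
// Chat message shape (role + content), as used by chat-style model APIs.
type ChatMessage = { role: "system" | "user"; content: string };

// Hypothetical fallback sentence the model is told to use when the docs are silent.
const NO_ANSWER = "I don't know based on the provided documents.";

// Builds messages that restrict the model to the supplied wiki or style-guide chunks.
function buildGroundedMessages(contextChunks: string[], question: string): ChatMessage[] {
  // Number the chunks so an answer can point back at its source.
  const context = contextChunks.map((c, i) => `[${i + 1}] ${c}`).join("\n");
  return [
    {
      role: "system",
      content:
        "You answer questions for onboarding engineers. Use ONLY the context provided. " +
        `If the context does not contain the answer, reply exactly: ${NO_ANSWER}`,
    },
    { role: "user", content: `Context:\n${context}\n\nQuestion: ${question}` },
  ];
}
```

The returned array is what would be passed as the messages of a chat completion request; the caller can compare the reply against `NO_ANSWER` to detect a refusal.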

[00:02:48] How you get there is a whole nother question. So it's good for that. Customer support's really good too. You can have like, we're gonna make a bot that is indexed on all of our customer support policies and everything, our warranty, our legal policy, whatever. So when someone sends in an email, we can take that email, turn that into a question, run that against our policy, and we can get back some type of response that if we feel safe about it and we feel good about it, we can just automatically send that back.

[00:03:17] Or we can just put it in draft and a human just has to either go add to it or clean it up and then they can send it themselves. But it does a lot of the work. So, in fact, there are a lot of customer support AI tools out there that do a lot of this, because it works so well.
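The send-it-or-draft-it decision for support emails could be sketched as a simple routing step. Everything here is hypothetical: the `DraftReply` shape, the `confidence` score (however your pipeline chooses to estimate it), and the `0.9` default threshold are assumptions for illustration, not values from the course.

```typescript
// Hypothetical result of running a customer email against the indexed policies.
interface DraftReply {
  text: string;
  confidence: number; // assumed to be in the range 0..1
}

type RoutedReply = { action: "send" | "draft"; text: string };

// If we "feel safe" about the reply, send it automatically;
// otherwise leave it as a draft for a human to add to or clean up.
function routeReply(reply: DraftReply, autoSendThreshold = 0.9): RoutedReply {
  const action = reply.confidence >= autoSendThreshold ? "send" : "draft";
  return { action, text: reply.text };
}
```

Starting with a high threshold, or with every reply routed to draft, keeps a human in the loop until you trust the answers.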

[00:03:33] Research and academia actually use this a lot. Whenever a new whitepaper comes out, if you don't know what a whitepaper is, it's just like a scientific paper. It's just like a format in which scientists release their studies. It's called a whitepaper. And I'll always go look at a lot of the AI whitepapers that come out.

[00:03:48] And I just can't make sense of them, so I will take it, and I will stick it in ChatGPT, and then I can do QA on it. I can be like, okay, how is this different than this? Explain this to me. What does this mean? How does this work?

[00:04:00] And that way I'll learn a lot faster than just like reading the entire whitepaper, which confuses the hell out of me, because I don't know any of the stuff they're talking about. So I use this a lot for a lot of science stuff, and before GPT, I was never able to learn any of that stuff.

[00:04:15] Legal and compliance is really good. Obviously, there's a lot of information there. I talked about some of the policies and things you can do there. So, it's pretty good. It's efficient. It can be accurate, which helps eliminate the hallucination problem, right? So now if there's a source of truth in which the AI has to pull from, it helps eliminate the fact that the AI is just making shit up, right?

[00:04:35] We don't want you just making stuff up. We need you to answer from this truth here. And it is actually pretty scalable. You can handle a lot of stuff here because you're guiding it towards a direction. You're using it that way versus it just trying to figure stuff out without any direction.

[00:04:53] So the more you can steer the AI, the more efficient it'll be, and the more accurate it will be, but the less flexible it'll be. But that's fine cuz you probably don't want it to be flexible in a Document QA. You want it to be accurate. You don't want it to be flexible.

[00:05:07] You want it to be sure and exact. Imagine if you were a doctor looking up a diagnosis and it was like, well, I'm kind of flexible on my answer on what this diag, no, no, that's not good. I want it to be accurate and factual. I don't want you to be flexible here.

[00:05:21] Whereas like, hey, help me come up with a children's book title. Yeah, please be flexible. Show me something creative here.