Transcript from the "Indexes in PostgreSQL" Lesson
>> There's obviously a lot more you can do with these JSON data types. I just wanted to introduce you to them make you aware that they're out there. Kind of show you this really fun way where you can have schema and schema less data living next to each other.
[00:00:14] It's a really powerful feature of Postgres. And it's made it a lot more popular since they introduced it, because this fits now a lot of use cases that only MongoDB could've fixed beforehand. Just like in MongoDB, performance and cost is always a concern. So let's take a look at, let's examine one particular query here.
[00:00:34] Select comment ID, user ID, and time and comment sorry left. Comment 20 from comments where board ID equals 39, Orders order by time, Descending limit 40. So as you can, this is something that you might imagine that you'd be doing a lot for a message board, right? You might be querying that message board database for, give me all the comments by board.
[00:01:17] If I thought if I visit the page for Board Number 39, right? Which is the coffee board, you would probably be querying the database to give you back all of the comments for that particular board. Well, let's look at the performance profile of this particular query that we've done here.
[00:01:36] So just like how MongoDB you can do explain with SQL, you just put, explain in front of whatever query you're about to run. And it'll tell you, here's my plan to do that. So this cost, it's a relative number actually don't even really know what it's measuring. But it's 65 is a lot [LAUGH] I'll just tell you that much.
[00:01:58] It's not a good query, when you're that high and the thing just like we were afraid of seeing called scan MongoDB. The thing that we're afraid of seeing in Postgres is this one here. Sequential scan, you can see a lot of this cost here is coming from this particular sequential scan here.
[00:02:22] This means it's looking at every single comment in the database every single time that we run this query. Now imagine that you're running a Reddit scale and you're running this a lot in production. You're gonna have an either an astronomical bill or downtime or both more than likely.
[00:02:37] We can basically say ahead of time, this is a kind of query that we're gonna be doing a lot. Please index this for us, right? So what I wanna do now is I'm gonna say create index on comments board ID like that. So on the comments board or on the comments table, index on the board ID because that's something I'd be [INAUDIBLE] by quite a bit.
[00:03:06] Okay, again, don't just go run this in production, this is a very hefty command. This only works because I'm on my computer, and it's not doing anything else right now. Now, if I run this again, instead of seeing sequential scan here on comments, what I'm seeing is bitmap index scan.
[00:03:26] This is the name of the index. And this went from being 65 cost to being 33 costs. Much better, right? Much, much better cuz this bitmap index scan and bitmap heap scan, these are going to work significantly faster than what we had before. So something else that we'd wanna do.
[00:03:53] Let's say we wanna make sure that all users names have a unique username. We can do that with a unique index just like we saw before. So we gonna you say, create unique index, we can give it a name of the index, right? We'll just call this one username index like that.
[00:04:18] On users username, like that, okay? So now, not only do we have an index on usernames in the users index. But we can also be totally assured that all of them are gonna be unique going on forward. This it will fail if there's already a duplicate in there.
[00:04:38] So it's gonna make you all right, go do duplicate this yourself and then once you're done we can create this index. So, select star or select username from users where user ID equals ten, right? So we got this one, what happens if I tried to actually I'll just copy this from over here.
[00:05:07] This one right here. I know this Aaizikovj I know that one exists in my database. So if I try and run this with a duplicate username. Like this it's gonna say, duplicate key value virus, unique constraints. I'm not gonna let you do this because you promised me this is gonna be unique and this isn't unique.
[00:05:35] So that's another way of doing that. Postgres has a ton of different types of indexes. I just showed you two of them. I left a link there in the course notes for you to go, click on and see all the different types of indexes. Cool, and that's just kind of like a whirlwind tour of indexes.
[00:05:54] I'm gonna be the same advice even for MongoDB. In general, it's useful to see how you're using your database and where the expensive queries are. And then go and create indexes for those. Usually you don't wanna go too much in advance and prematurely optimizing. Because bad indexes and useless indexes just slow down your computer or slow down your searches and waste a lot of space.
[00:06:16] Yeah, indexes are not small, because it's creating like a whole data structure to represent your table in a different way, so that I can search it more easily.