Complete Intro to Databases Indexes in Neo4j
Transcript from the "Indexes in Neo4j" Lesson
>> So we've been talking a lot about query performance and cost and Neo4j is going to be no different, right? If you write really expensive queries, running Neo4j in production's gonna end up being very, very expensive. So it also has the ability to add indexes. So let's say for example here, you're writing another app, and this app will tell you who was born in the same year as you.
[00:00:24] I don't know if that's an interesting app to someone, probably is, I'm pretty sure it exists. But let's say that's something that we're gonna do with our graph database here. So if we copy this query here, put it over here, it's a really simple query, it's gonna say match person where p.born equals some year, right?
[00:00:47] So if I play that R let's just put p.name. You can see here we have seven different actors and actresses and directors, because Lily Wichowski is a director, in our database that are born in that particular year. Now this actually ends up being a fairly expensive query, because it's actually going to go look at every single node in our database and say what year were you born, is it this year?
[00:01:22] Again, if you have a huge data, if you are IMDB and you have the entire graph of all movies, that's going to be a very expensive thing for you to look at that every single time. So it's actually come back over here, so same query RETURN p.name, okay.
[00:01:41] Let's run EXPLAIN, it's nice that these all work relatively the same way, right? Explain this, and it's gonna tell you its plan of how it plans to evaluate this. So it's gonna go in here and it's gonna do a node by label scan. And it's going to look at 140 rows variable.
[00:02:04] So let's look at 140 nodes in our graph, guess how many nodes in our graph we have 140, right? So this is not a very efficient query. Let's go ahead and give that an index and see how that affects it. Yeah, same thing. Yeah, so this node by label scan is really where you need to be afraid of what's happening here.
[00:02:41] So let's say CREATE INDEX FOR, (p:Person), On (p.born) And try that, create index for p:Person and p.born. So added one index now. We're now indexing all of our celebrities by their birth year. And if you run that, explain again. Let's see if I can just make this smaller for just one second.
[00:03:24] You can see here instead of node label scan, we're getting a node index seek, and this is much faster. So this was available after 23 milliseconds. This one up here where I ran it the last time. So, I mean, we have a very small data set so you're not going to probably see a huge increase.
[00:03:48] When I was running this earlier, I actually saw like a six x speed up but I was probably running a bunch of other stuff in the background as well. So this is how you end up creating indexes in Neo4j. And again this is really nice because, We're not looking at the same, that's why p.name, Instead of looking at 140 records in our database, we're looking at just three, which is much faster, right?
[00:04:22] So those hops and those trees end up being a lot faster. There's a bunch of different kinds of indexes in Neo4j. If you have paths that you're looking at frequently, or if you're doing a lot of relationship examination. You can index relationships, you can index nodes. There's a bunch of different kinds.
[00:04:43] Yeah, so feel free to go ahead and check that out.