Complete Intro to Databases

Graph Databases

Brian Holt

Brian Holt

SQLite Cloud
Complete Intro to Databases

Check out a free preview of the full Complete Intro to Databases course

The "Graph Databases" Lesson is part of the full, Complete Intro to Databases course featured in this preview video. Here's what you'd learn in this lesson:

Brian explains that graph databases are used to represent data that has complex relationships with other data within a given database, and adds that a node is a graph database that represents an entity. Multiple nodes have relationships or edges between each other. Neo4j is the graph database used in this section.


Transcript from the "Graph Databases" Lesson

>> So the two things I've shown you so far MongoDB and Postgre are very general purpose kinds of databases. As in like, if you're gonna be writing a new app tomorrow there's decent chance you're gonna use one of these too, right? They're meant to be very flexible. They're meant to fit nearly every use case, they're obviously very powerful databases.

We're gonna get into one that's a more specialized use case. So you're not gonna use a graph database for your everyday e-commerce website, for example. You're gonna be using it probably in conjunction with another database. You're gonna be using graph databases with another database for the most part.

So graph databases are more niche, they're more intended to solve a specific problem and less meant to be general-purpose, right? This is not meant to power your e-commerce website. So the example that we're gonna use today is kind of the classic graph database kind of example problem, which is movies and actors, right?

And we'll kind of get into how that works. So, let's get into some of the terminologies. Again, this is one of those things where I use the terms interchangeably, and I'm very sorry about it, but it's also just kind of how this is organized in my brain. And it's probably useful for you to hear all the terminologies mixed together cuz that's kind of how it ends up working.

So the first thing we'll talk about in a graph database is a node or an entity. The correct term for Neo4j, which is the name of the graph database that we're gonna be looking at today is a node, but sometimes you might hear me call it an entity.

A node is just a thing. It's basically a row or a document in a graph database. It, yeah represents a thing. So in our case, if we're looking at movies and people, right, those are gonna be the two different kinds of nodes that we're gonna have. Those are both going to be considered nodes, right?

There's gonna be a person node, and there's gonna be a movie node. Something that's peculiar to Neo4j is one node can actually be multiple different things. So let's say instead of having just a person label as they're called, you can have one person be both an actor label, or actress label or director label.

I node can have multiple different kinds of labels. In this particular case, the only two labels that we're gonna be looking at today are persons, right? Which will represent both actor, actresses and directors. But you could also have a label for each one of those depending on kind of how you choose to solve that problem.

And then, the other label will be a movie, and a movie is kind of a thing, right? So you can think of nodes as documents or rows in a graph database. Now the interesting thing and where graph databases really kind of get their power is they have this things called relationships.

Or frequently these are called edges as well. The correct term for Neo4j is relationship. And it just describes this node is related to this other node, and we're gonna draw some sort of connecting line between the two of those things. And we'll kind of visualize this here in just a second.

That's the power of graph databases, is you can describe this kind of graph, like a web of relationships between different things. A really good use case for graph databases, and probably where they really got pushed forward a lot, is social networks, right? So you can imagine like on LinkedIn, you're connected to this person, this person is connected to that person and that person is connected to that person.

And that's how LinkedIn knows how to recommend. Is like, hey, you probably know this person because you know these 10 people and they all know this person and they're connected to them and you're not. Decent chance that as so many people in your network know this person, you also probably know that person.

So that's where these kind of interesting insights of connections can kind of lead to these insights. Yeah, using these relationships. One thing about Neo4j, that's again kind of peculiar to Neo4j is every relationship is directional, right? So it's not that just, well, let's use an example from a social network.

On Facebook when you are connected to someone, you're both friends. Like you can't be connect to someone you both have to acknowledge we are both friends. That relationship has no direction, right? I'm friends with Mark, Mark is friends with me, and we can't have that be one way.

Compare that to something like Twitter. If I follow Ayesha, right? She doesn't necessarily have to follow me back. That's a directional relationship. So in Neo4j, all of these relationships have directions associated with them. And sometimes you can just ignore these directions if you don't really care about them, but every relationship is going to have a direction.

So in this particular case, if we were describing the relationship between me and Ayesha on Twitter, and we both follow each other. We would actually need two different relationships. One, from me to her and one from her to me to fully describe that relationship. Whereas in Facebook, you just need one relationship because there's only one way that relationship can go, cool.

So, there's things called properties or attributes, I probably will mix up those terms. Cuz they're relatively similar, but in this particular case, they're gonna be the same thing. Every relationship and every node can have properties associated with them, right? So, for example for talking about like an actress, like we have Charlize Theron or something like that.

She can have a birthday. She can have, I don't know hair color, eye color, those kinds of those will be attributes of that particular node. And then a relationship can also have a node. You can also have if Charlize Theron acted in Arrested Development, that's what I'm watching right now.

That's why it's top of mind. It can have like the role that she played in it. It can have the year that that happened. It can have the episodes that she was in, right? So those are kind of the attributes that a relationship can have as well. So both entities and relationships can both have properties.

So that's kind of the high level terminology of what goes into a graph database. We'll start kind of getting into it, and using them and sort of become a bit more concrete. We're gonna be using Neo4j, that's kind of like the de facto. If you're gonna use an open source graph database, it's kind of one that people reach for the most or at least it's the most well known.

There are other ones, there are other graph databases, you can make MongoDB and Postgres kind of acts like graph databases. There's a lot of different ways to accomplish this. And then there's obviously, things like again, Cosmos DB and some of those cloud based graph databases that are available to you as well.

Learn Straight from the Experts Who Shape the Modern Web

  • In-depth Courses
  • Industry Leading Experts
  • Learning Paths
  • Live Interactive Workshops
Get Unlimited Access Now