This course has been updated! We now recommend you take the Complete Intro to Computer Science course.
Transcript from the "Graphs" Lesson
[00:00:04]
>> Brian Holt: Graphs, I think this is really cool. I've now worked for two major, social networks. I worked at Reddit and I worked at LinkedIn. I now work for Microsoft. I'm trying to get a new job every time I come back and give one of these. [LAUGH] Mostly successful so far.
[00:00:25] But a big part of social networks are graphs. And actually, a surprisingly amount of computer science, especially as we go forward. We're finding out more and more that graphs are a central concept to what we want to achieve as engineer. So the basic idea, well, let's just talk about an example of a graph.
[00:00:49] So Facebook's social graph, is probably one of the more well known uses of the term graph. The idea being that I'm friends with you, you're friends with her, she's friends with him, right? And he doesn't like me, so he's not friends with me. It's okay, I don't blame you.
[00:01:08] So that's the idea that this graph that we have all these nodes, we're still gonna call them nodes. And then, we have all these connections between the various nodes for whatever reasons, right? You can call these connections edges. That's the computer sciency term for it is the various edges in the graph.
[00:01:27] Now, what I just described to you, the Facebook social graph is what we call bidirectional graph. Which means, if I'm friends with you, you are friends with me. Sorry, that's the way it is. But because it's connected in one way, it's implicit that you are connected in the other way.
[00:01:45] A good example of a unidirectional graph would be something like Twitter, where like I follow you, but you don't follow me. So in this particular case we are going to be unidirectional graphs. But almost all of these things work apply as well with directional graphs, as well. In Facebook's example, each person would be a node.
[00:02:05] And the node represents some entity, you can think that entity is almost a row in an SQL database. It's an object, it could be a document in a Mongo database. Or hopefully, if you're doing graphs you are using something more like a graph database like Neo4j or something like that, cool.
[00:02:26] Graphs are just everywhere, various social networks, a lot of internet of things that have mesh networking, right? Those are graphs of sorts as well. Neural networks, so like machine learning and AI, that's just graphs up and down, right? Especially, neural networks, they're just neuron nodes that just fire and strengthen graph edges and things like that.
[00:02:49] It's all turtles all the way down. It's just grass everywhere
>> Brian Holt: Yeah, machine learning in particular, very very cool. And I think that, this is just more and more technology beginning to mirror what nature's, right? In terms of nature, everything is connected to each other, right? And so, what we're doing with computer science now is we're modeling how things connect together, so really cool stuff.
[00:03:28] And some of these things that we're about to get into is gonna be kinda familiar because it's gonna sound like things we've already talked about, weird. So let's picture that we have this social network, that I'm right there. And I'm connected to both Bob and Maria. And Bob is connected to Sally and Alice.
[00:03:47] And Alice is connected to Bob and Maria.
>> Brian Holt: In this particular example, that we're gonna do together, it's gonna be like if LinkedIn and Twitter had a horrible social network baby. It's a unidirectional graph that's a professional network, right? So let's say, I wanted to find the most common job title in my social network of the people that I follow.
[00:04:17] So I would check, if I wanted to check just one degree of separation, so people that I'm actually connected to. Then I would check Bob and Maria, right? But if I wanted to check two degrees of separation, I would check Bob and Maria. They're both connected to Alice, so the key here is that I only check Alice once, because Alice is just one person.
[00:04:38] But I would also check Sally, so two degrees of separation. And we can go on and kind of fan out around this network. Now, you might be looking at this and say, well this is, kind of back to trees again, right? I don't have it diagrammed out. But I would essentially, be the root node in this particular tree, right?
[00:05:01] And I'd be traversing this graph, look at Bob and look at Maria and then I'd look at all their children. And then, I'll look at those children and those children and those children, right? So it's again very, very much like a tree. In fact, it's kind of the opposite relationship that trees are just kind of a special form of graphs.
[00:05:19] [LAUGH] So what's cool about that, is we get to apply the kind of the same lines of thinking, the same algorithms, and it still works with graphs, which is really, really cool. There's just some more concerns that we have to worry about, right? We have to worry about there are circular references, right?
[00:05:40] I'm connected to Bob, who's connect to Alice, who's connected to Maria, who's connected to me again. I don't want to count myself twice as well, right? So it's kind of these circles that we kind of have to be careful of, but all you have to do is keep track of yeah, I wrote to Alice before.
[00:05:53] And now, I'm not going to look at her again, right? The same thing as it was like a blocked node in our pathfinding algorithm that we were just doing. So let's say that I have a network, I follow a hundred people and a wanna check within four levels of separation the most common job title.
[00:06:16] Well I'm trying to process all these nodes in a tree and I need to check each one of them, and I wanna stay close to the root node, right? Because I don't wanna go all the way out to the 17 millionth connection level. This sounds a lot like breadth first traversal.
[00:06:31] So I'm sorry, but I'm gonna make you write it again. [LAUGH] This is all just breadth first traversal, except I'm just limiting how deep I'm gonna go. Rather than going all the way to the bottom of the tree, I'm just gonna go four levels deep and then I'm done right?
[00:06:49]
>> Brian Holt: So let's make sure that I cover everything I wanted to. Yeah, I mean, that's the big difference here with grass versus trees. With grass, there's no clear Pparent-child relationship, it's just peer to peer connections really. This could be solved with depth-first traversal, just don't. [LAUGH] You just have to make sure that you weren't going to deep and that would work as well.
[00:07:18] It lends itself very well to that while loop, right? We were doing the while until the queue was empty. Now, we're just gonna do a four loop that only goes x iterations, until you've gone as deep as you wanna go. It just really works well that way. So questions about graphs?
[00:07:40]
>> Brian Holt: Any questions about graphs before we move on?
>> Speaker 2: Is there a different name for bidirectional and unidirectional edges? I feel like I've heard something different, but I can't remember where it was.
>> Brian Holt: The answer's almost assuredly, yes. [LAUGH] Could not tell you off the top of my head.
[00:08:08]
>> Brian Holt: Cool, so just as we have talked about, I worked at LinkedIn. This is stuff that LinkedIn is constantly doing. It's constantly investigating clusters of people and they're trying to figure out where there are holes in the job market. So they could just say, hey, just so you know, you're looking for a job, you're looking for education.
[00:08:29] You could fill this need because in Alaska there are lacking orthodontists or something like that, right? These are the really interesting insights that graphs can give you.