Complete Intro to Computer Science Graphs Practice
Transcript from the "Graphs Practice" Lesson
>> All right, so let's pop over to our code sandbox again. We're gonna go into the graph here. So the first thing here is you're gonna see there is a job site js. This is a very long file with a lot of people with randomly generated names, titles and companies.
[00:00:19] So again, I didn't write these, I generated these. And these people are all connected to random people. Right, so this person 27 Reuven Dowbekin is connected to id 33, 43, 503, 940, 175, 825. So these are all the connections that this person has. And some people, this one has only three, this person has four right?
[00:00:50] So there's a varying amount. So this describes our social network of people I think there are 1000 people in this particular set. Okay. If we head over to our function here I kind of describe a little bit. This is what the items the objects look right? They have an id, they have a name accompany a title and a connection.
[00:01:20] And I gave you here a function that you can just call like get user 12 right? And this will hand you back. The user that's associated with id 12, right? So that's what this function is there for. So actually, if I said get user 308, it would literally hand me back this object Okay.
[00:01:48] So what I want you to do, now if you can get rid of this, this is not true anymore. You can ignore that code pen link that's in there. So, find most common title, what I want you to do is I'm gonna give you an id. So I'll give you id 30 on this one, and then I'll give you degrees of separation, right?
[00:02:17] So, the one degree of separation would be all the people that I'm connected to. And two degrees of separation would be the people I'm connected to and the people they are connected to, right? So on to three degrees, which is the people those people are connected to and all the people that are inside of that.
[00:02:33] And the last one here which is the extra credit one, user id 1with seven degrees of separation, which is all 1000 users except five, right? Yeah, so that's what I want you to do. I want you to find the most common job title amongst all of these people in these networks.
[00:03:00] There should be exactly one answer. If you did help here down in this yeah, I actually do have this little function that I commented out here that logs things out in a very useful way. So that can be helpful to you as well. If you need something that's going to log something out for you So first thing we're gonna do down here in the findMostCommonTitle, we're gonna create a queue.
[00:03:33] So let queue = and then we're just gonna just queue up ourselves, okay? We're gonna create a new set, so const seen = new set. And with one fun thing with sets, you can give them an array of things to see their data with. You don't want to requeue the id right there.
[00:03:58] So we're going to add ourselves to the already seen set. And then we're going to have a jobs objects where we're going to keep track of all the jobs that we're gonna see, okay? Okay, now we're gonna write a for loop, let i = 0, i is less than or equal to degrees of separation.
[00:04:22] Right because we're only gonna go so many degrees and then i++ const new queue. New queue equals empty array. And then here we're gonna say while queue.length. So the first thing we do is I'm gonna say const user get user with queue.shift. So we're gonna pull the first user off.
[00:05:09] Okay, and then here I'm gonna queue up the next iteration. And the way I'm gonna do that is I'm gonna save for. Let j=0, j is less than users, user rather.connections.length. J++. I'm gonna say const connection = user.connections(j). Then here, I'm just going to say if I've already seen this user before.
[00:05:59] Or rather if I haven't seen this user before. Seen.has(connection). So if I've seen this user before I'm not gonna add them to the new queue. Otherwise, I'm gonna say newQueue.push(connection) seen.add(connection) so I don't add the user again. Okay, so this is the part where I'm queuing up for the next particular iteration.
[00:06:38] And then down here, I'm just gonna say jobs[user.title] = jobs, user.title. So if I've seen the user before, or sorry, the job title. So let's say it's program manager. If I've seen it before, if it already exists, then what I want to do is jobs(user.title) +1, or I want to be 1, right?
[00:07:18] So if I've never seen it before, then I'm going to start it with one. So if this is two, then I wanna make it three. If it doesn't exist, then I wanna make it one. That's what this little bit of logic is doing. Okay? And then at the end of this I'll have this newQueue, right?
[00:07:43] Which is gonna be the new things, the next level of connections I wanna process. All I have to do is at the end of this. Loop here is I'll just say queue = newQueue. Okay, so this four loop is gonna determine how many degrees breadth first traversal I'm gonna do it on my connections.
[00:08:13] And what this is doing is this is queuing up the next level of connection. So for example, if this is three right I actually will have the fourth level of connection's ready to process, but it just won't do it right. Because this will prevent me from looping again.
[00:08:32] Okay, now I'm going to end up with this object called jobs which kinda look something like program manager. Designer 3 right and I need to turn this into a sorted list. There's a bunch of ways to do that. I'm happy with any way that you choose to do it.
[00:09:00] I'm just gonna go with object keys. So I'm gonna say const jobkeys. This actually probably would be better off broken into a different function. I just didn't, Object.keys[jobs]. This will give me back an array of all the keys that are in the jobs object. And then here, I'm just gonna loop over it and say let biggestNumber=jobs.
[00:09:29] Excuse me [COUGH] sorry, let biggestNumber=Jobs (jobkeys)(0) Let jobName =jobKeys(0) And then I'm just gonna do a for loop over this and try and find the one that has the biggest number. You could absolutely do this with like a reduce as well. If you're into functional programming like that, but I find reduces are frequently pretty hard to read.
[00:10:13] So I'm just gonna do with a for loop i is less than jobkeys.length i++. Const currentJob =jobkeys(i). If jobs. CurrentJob. Is bigger than biggestNumber then jobName =currentJob and biggestNumber is assigned jobs(currentJob). Okay, at the end of this little bit right here, I will have the job name which will be the job that has the most connections in my social network.
[00:11:23] Okay, and then down here we can just say return jobName. Okay, we'll move skip down here. And we will run this. And if you look down here at graph, you can see here that graph is passing. So one more we'll see if that works with extra credit. And looks like it is.
[00:12:04] So just for kind of funsies, let's go grab over here from the solution this little bit here, which is I just wrote a little utility to log out. All of the. See all the job titles sorted. All right so let's try running that. Play, you can see down here this is actually logging out for us all the various different job keys.
[00:12:43] Let's get that too. And then we'll clear this and run it again. So you can see here geological engineer that has 16. The rest of us have 15 that's the last one. That's the extra credit. The one here with user 307 with 4 degrees of separation. Pharmacist is 11.
[00:13:06] Graphic designer on this one is 5. And the top one here, sorry yeah, that would be 0.5. Librarian here is 3 with 2 degrees of separation. Cool, so any questions on graph? Is it always going to have or is any solution always going to have that amount of time complexity in it for graph traversal?
>> Yeah, okay, I get your question. So, I guess the question really is like what's the complexity of this? Cuz it's not n cubed. So I know I told you to look out for yeah, that's good question. I know I told you to look for for nested loops, but in reality here it's not like this is looping over everything in the particular array, right?
[00:14:03] It's looping over a very small subsection of things to the point that we can basically ignore it, right? Because, like if we look back at our jobs array, right? If it was n cubed, that means for every additional person that we put in this that we would be increasing the amount of processing that we'd have to do by a significant amount right?
[00:14:22] This means that everything would have to relate to everything else in the array twice in like a exponential fashion, which is not true, right? Everything that we add to the jobs array actually really only adds basically, a negligible amount of complexity, right? So and the reason for that, if you're looking at this, I'm looping over everything in the array.
[00:14:52] So it's important to know like what you're actually looping over. In this particular case, we're not looping over everything in the array, we're just looping over connections, right. And those connections is never very much, now, if every user is connected to every other user in the array, then we be in a world of hurt here, right?
[00:15:13] But in this particular one, we're just really asking, have I seen this connection before, no, cool, I'm good. So the complexity here is really determined on how many degrees of separation we're going out there. The more degrees of separation that we're fanning out too, we're getting exponentially more complexity.
[00:15:36] So any big how, here is going to have to include however many degrees of separation that we're going to go out. It's going to be in terms of the degrees of separation. And as far as what a few.
>> How deep in the graph, we're like the tree like structure that we're going, that's just going to exponentially increase.
>> Exactly, okay, that makes sense. Thank you.
>> Yeah, of course. I mean, have you heard of the degrees of Kevin Bacon right? Any actor can be connected to Kevin Bacon with like 7 degrees of separation. That's because once you go out 7 degrees of separation, you're now interacting with half of the world's populace.
[00:16:14] Right or something weird like that. So, yeah, it's exponential based on that. So yeah, good question. Thank you.
>> If we use the max heap to sort the traversals would it be more efficient than just a linear iteration on the results array?
>> If you used. So here, we use a max heap.
[00:16:39] So no, I don't think so. So, this is a linear search, right this is basically the definition of a linear search, which a linear search is just and right because we look at every single item in the array. And when you have an unsorted array, which is basically what this is, your best case scenario is that you look at every item in the array.
[00:17:05] Whereas if you're making a max heap, it's going to be n plus a bit more because you're doing some unnecessary comparisons to get ready to do like heapify and be able to do either dq or like a heap sort. So you're doing extra work that you don't actually really need to do.
[00:17:27] So I really appreciate where that person's head is going because they are thinking about it as, hey, I need to find the biggest number on this array. Heaps are very good for that. So that's a good brain space to be in. And I think that's like, don't lose that.
[00:17:42] But in this particular case, literally the most effective thing you can do is just a for loop saying, are you the biggest number? Nope. Are you the biggest number? Until you eventually get to the end, you will have found the, biggest number. Now the difference in complexity here is miniscule, to the point that I wouldn't really care so much if I saw that in code.
[00:18:03] It's just overkill. And in reality, we could've just used, I think there's an -.max, right? That's probably what I would've used because that does that automatically for you, but yeah, cool, good question. And yeah, definitely go check out that Neo4j stuff, if you enjoyed this. We just basically implemented a bunch of stuff that Neo4j just does for you, allows you to write queries that like, hey, give me the most common job title within 5 degrees of my network.
[00:18:39] And Neo4j is like, cool, I got that and it can just go do all that processing for you automatically just by writing really cool queries. And now you have the foundational knowledge that you would need to understand what's actually going on.