The "Kruskal Algorithm" Lesson is part of the full, The Last Algorithms Course You'll Want (Part 2) course featured in this preview video. Here's what you'd learn in this lesson:

ThePrimeagen explains Kruskal's algorithm for finding the minimum spanning tree of a graph. They start by sorting the edges of the graph and then proceed to add the edges one by one, making sure not to create cycles. They also introduce the union-find data structure and explain how it can be used to implement Kruskal's algorithm efficiently. Finally, they discuss the time complexity of Kruskal's algorithm and compare it to Prim's algorithm.


Transcript from the "Kruskal Algorithm" Lesson

[00:00:00]
>> All right, step one of Kruskal, sort edges. All right, so that means I'm gonna have this, I'm going to have (e, f, 1), I'm gonna have (c, b, 1), I'm going to have (f, g, 2), I'm gonna have (e, c, 3), beautiful e, by the way, I'm going to have, where is it?

[00:00:30] On top of the e, [SOUND] (e, g, 3), I'm going to have, what is it? (f, d, 4), I'm gonna have (d, b, 8), I'm gonna have (c, d, 8), and then finally, (b, a, 9). So there we go, we've ordered all the edges, right? So we have every single edge possible and now, we just simply start here and go, all right, this is our first edge we need to add.
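To make that concrete, here's a minimal TypeScript sketch of step one — the Edge tuple shape and the variable names are assumptions for illustration, not the course's own code:

```typescript
// Step one of Kruskal: sort every edge by weight, ascending.
// Edge shape ([from, to, weight]) is an assumed representation.
type Edge = [string, string, number];

const edges: Edge[] = [
  ["e", "f", 1], ["c", "b", 1], ["f", "g", 2],
  ["e", "c", 3], ["e", "g", 3], ["f", "d", 4],
  ["d", "b", 8], ["c", "d", 8], ["b", "a", 9],
];

// sort() mutates in place; compare on the weight (index 2).
edges.sort((a, b) => a[2] - b[2]);
```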

[00:01:02] I'm gonna use red, so everyone can see we're still producing the same graph, but we're going about it in a different way. So I'm gonna put a little red line right here, e to f is called group red, all right, we're all fans of group red. So we process this first one, it's in the graph, how do we know it should be in the graph?

[00:01:22] Well, minimum spanning trees don't create cycles, and we take the smallest edges first, okay, easy. We're gonna go to the next one, my goodness, I gotta get these lids organized, hold on, I'm having lid anxiety right now. My goodness, I'm gonna get them all wrong. Beautiful, all right, now, we're gonna process this one, all right, so that means we go C to B.

[00:01:49] I'm gonna actually color this with a different color, I'm gonna color this one green, all right? So we have two groups kind of forming here, we have the C, B's, and we have the E, F's. Next, we need to add the next item, which is gonna be F to G of a weight 2.

[00:02:07] Since F's already a part of a colored group, we'll call it red, that means G will join him in this group, fantastic. So we do the next item, my goodness, I just colored on myself. We do E to C, so now, we go, hey, E, E's in group red, C's in group green.

[00:02:24] We can totally add these two together, they're a part of two different groups, therefore, we have no cycles. There we go, fantastic, green, you're there, I just want green in there for just a second. Even though it's my favorite color, it just looks really bad on camera. All right, next we go e to g with 3, well, e is a part of group red, g is a part of group red.

[00:02:49] If we were to add a line right here, we'd create a cycle, you don't wanna add it, we don't do that around here. All right, next one, where are we? We're right here, F to D, D does not have a color, F has a color. Therefore, we can add, my goodness, that's like the thinnest line I've ever done.

[00:03:06] There we go, we can go F to D, now, we go again, we can go D to B, D is in red, B is in red. Therefore, we do not add this, because if we did, it would create a cycle. We're gonna go C to D, C is in red, D is in red, we cannot add that.

[00:03:24] Therefore, we're gonna go B to A, B is in red, A does not have a color, and boom, we just draw the line right here. Notice we created, again, the exact same spanning tree, but Kruskal's algorithm is really, really simple, right? Like that one is way simpler than whatever happened over here.
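That walkthrough is the whole algorithm in miniature. Here's a hedged skeleton of the loop in TypeScript, where sameGroup and merge are hypothetical placeholders for the group coloring — union-find, introduced next, is what actually implements them:

```typescript
// Kruskal's core loop: walk the sorted edges, skip any edge whose
// endpoints already share a group (it would create a cycle),
// otherwise keep it and merge the two groups into one color.
declare function sameGroup(a: string, b: string): boolean; // hypothetical
declare function merge(a: string, b: string): void;        // hypothetical
declare const edges: [string, string, number][];           // sorted by weight

const mst: [string, string, number][] = [];
for (const [from, to, weight] of edges) {
  if (sameGroup(from, to)) continue; // same color: adding would cycle
  merge(from, to);                   // two groups become one color
  mst.push([from, to, weight]);
}
```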

[00:03:42] Like that was a whole thing, and this was really straightforward. And so it feels like Kruskal's is really easy, but then you start thinking about how in the heavens did we do this group coloring. Cuz that's like not easy, right, that's a very hard problem. It's actually a pretty easy problem, okay?

[00:03:59] So, [LAUGH] the greatest turnaround ever, all right, so we're gonna do what is called union find, all right, union find. This is a pretty fantastic algorithm, okay? So let's just imagine we take this whole graph right here and we have A, B, C, D, E, that is not an E, F, G, [SOUND] all right, there we go.

[00:04:28] Now, I'm not gonna use this graph as a guide, I'm just gonna use the same nodes. And they all point to themselves, they're all their own little group. A little group unto themselves, how great is this? Everyone's happy, happy little groups, and now we want to start unioning them together.

[00:04:46] So how that works is you have to create something called a bijection, which is a fancy term for a one-to-one mapping. So we're gonna have A map to something, B map to something, C map to something, and typically, you'd have them all in a different order, because it makes the example more clear.

[00:05:00] So I'll say we're gonna kind of reverse the order, so A = 7, let's see, B = 4, C = 2, D = 1, E = 3, F = what is left? 5, and G = 6. Honestly, you don't have to do them in some crazy order, you could literally just make A = 1, B = 2, C = 3, D = 4.

[00:05:25] But if I did that, you may not get what's actually happening underneath the hood. Then I'm gonna have an array underneath that answers, what group do I belong to? So A has a 7 inside of it, so position 7 has a 7. B, let's see, let's go backwards.

[00:05:45] So 6 goes to 6, 5 goes to 5, 4 goes to 4, 3 goes to 3, 2 goes to 2, 1 goes to 1. And so if you look at the numbers, they each match their own index, and that's how you know it's a root node, its own color: when your index is the value.

[00:06:01] So 3 goes to 3, that means I mapped myself, I am the root spot. We're all roots to begin with, so let's just pretend we're gonna union E and G. So I start at E, I go, hey, E, let me actually write each one of these numbers. My goodness, we're gonna move this slightly.

[00:06:18] So much work, lost completely, all right, A, B, C, D, E, F, G, awesome. They're gonna map to themselves, to themselves, to themselves, to themselves, to themselves, to themselves, to themselves, all right. So this will be A, B, C, D, E, F, G, okay, fantastic, we see all these things now, right?
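A minimal sketch of that setup, assuming we store the bijection in a Map and the groups in a plain parent array — sequential indices instead of the scrambled board numbering, since they behave identically:

```typescript
// The bijection: each node name maps to an array index.
const index = new Map<string, number>([
  ["a", 0], ["b", 1], ["c", 2], ["d", 3], ["e", 4], ["f", 5], ["g", 6],
]);

// parent[i] === i means node i is its own root, its own color.
// Everyone starts as their own happy little group.
const parent: number[] = [];
for (let i = 0; i < index.size; i++) parent.push(i);
```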

[00:06:46] Everyone's there, all right, so we're gonna union, let's do E and G, we're gonna create a union between these two. So I go to E and I say, hey, E is its own parent, it's a 3, and because position 3 has a 3 in it, it's its own parent.

[00:07:00] I go to G, G is a 6, its bijection is also a 6, it's its own parent. All right, so therefore I'll just make G point to E, all right, fantastic, so that means this becomes a 3. So now when I go to G, it says, hey, different index, I jump to that index, it's now E, awesome.

[00:07:21] So let's union B and C together, B and C, exact same situation. I have no rhyme or reason which one I'm picking right now, I'm just picking one. There we go. If you keep counts for groups, you could always have the group with the smaller count get merged into the one with the bigger count, versus the other way around.

[00:07:37] There's extra things you could do, but we're not gonna do that. So that means C now points to B, which is 4, all right, now, we're gonna do D and A. So A will point to D, so A is 7, D is 1, so there we go. I think we have everything all mapped out except for F, F is by itself.

[00:07:57] And so now, if we wanted F to union with D, we would take F and point it at D. And we do that by going, hey, F, all right, position 5, that's still you, you're your own parent. D, okay, you point to yourself, you're your own parent, so I'll just update F to point to that.

[00:08:14] Now, what happens if we wanted to union B with A? This is where things get a little bit wild, are they a part of the same group? Well, B goes to, did we not update? We didn't update, we didn't update B. Look, I actually did it backwards, look at that, B's pointing to C, but C's pointing to B, that's incorrect, I picked this intentionally, okay?

[00:08:38] So we go to B, B points to C, so we have C as its parent, and then we go to A, A points to D, D points to itself, so C and D, two different groups. So therefore, we can union these two groups together. So that means I'll just make, which one were we doing again?

[00:08:54] Yeah, B and A, so I'll simply make C, instead of pointing to itself, point to D. Because we took the parent and redirected the parent, so the parent becomes D. So that means B points to C, C points to D, D points to itself, all right, so that is unioning, and we can keep on going.
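A sketch of that union step, assuming the parent array from above — the find here is the naive walk-to-root version; path compression comes in a moment:

```typescript
// Naive find: walk parent pointers until a node points to itself.
function findNaive(parent: number[], i: number): number {
  while (parent[i] !== i) i = parent[i];
  return i;
}

// Union: find both roots; if they differ, redirect one root at the other.
// Redirecting the *root* (not the node you started from) is the key move
// from the walkthrough above.
function union(parent: number[], a: number, b: number): boolean {
  const rootA = findNaive(parent, a);
  const rootB = findNaive(parent, b);
  if (rootA === rootB) return false; // same group: this edge would cycle
  parent[rootA] = rootB;
  return true;
}
```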

[00:09:18] What happens if we do E and B? Well, B points to C, C points to D, so we're unioning B, and we're gonna also do, what did I say, E? E points to itself, therefore, we have E and D, two different groups, we can now union them. I'll just pick the larger group, cuz I can do that with my eyeballs, so it goes right here, E now points to D.

[00:09:38] So that means if we were to go to any of these, we can find which group are you part of, if I say, hey, F, what group are you in? F goes to D, D goes to itself, it's a part of D group. If I go to B, B points to C, C points to D, it's part of the D group, and so you can tell right away what group you're in.

[00:09:53] But still, this isn't necessarily the most efficient algorithm. You can imagine a situation where you have A points to B, B points to C, C points to D, D points to E, E points to F, and it just becomes this really long path. That is where we have to apply a second algorithm here.

[00:10:10] We're gonna apply a second one called path compression. Now, this is pretty clever, because if you think about it, as you walk this array, you're eventually gonna find out who the parent is. So if you unwalk the array, meaning that like say you use recursion, you'll be able to update every single node along the path to point to the root.

[00:10:34] So let's do that again, let's pretend we do path compression and we're gonna start with B, and let's pretend F. Yeah, look, my head's just in here, let's pretend F points to itself and B is right here. So we're gonna use path compression to kind of fix this.

[00:10:48] So I'll write a very simple little findParent, right? So findParent should literally be as follows: if we are at our own index, we have found it, we will return out our index, right? Very, very simple, if we refer to ourselves, we return ourselves out. Else, I'm gonna move this, the code goes right here.

[00:11:10] This is like classic interview whiteboarding right now, moving code all over the place. It's gonna be like this, root = findParent, right? We'll pretend you can grab the parent index, whatever it is, call findParent with the next one, it goes to the next one, keeps on going, returns it out.

[00:11:29] And when we do that, we'll do, we'll call it like, self.parent = root, return root. So along the way back, we update every single index. So if you had something that looked like this, F, G, D, A, and we wanted to union it with E, wait, did I already use E?
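Here's that whiteboard findParent written out as a runnable TypeScript sketch — recursive find with path compression, with names assumed for illustration:

```typescript
// Recursive find with path compression: walk up to the root, then, on
// the way back out of the recursion, point every node along the path
// directly at the root.
function find(parent: number[], i: number): number {
  if (parent[i] === i) return i;        // I refer to myself: I'm the root
  const root = find(parent, parent[i]); // keep walking up
  parent[i] = root;                     // compress: repoint to the root
  return root;
}
```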

[00:11:55] I'm gonna use different letters now, cuz I forgot which ones I've already used, H, I, J. So if we're like, hey, union D and J, we'd go, okay, D goes to G, G goes to F, F goes to A. Therefore, update F to point to A, therefore, update G to point to A, therefore update D to point to A, we're unioning A, J, I, H, E, update H, update I, update J.

[00:12:25] And now, we've updated them all along the path. And so that means at any one point, the path gets more and more compressed. Every single time you look it up, you may hop a couple times, but the next time you'll never hop again. And so it's amortized constant time lookup, you just have to look up the path one time, that's it, pretty cool?

[00:12:46] Pretty neat, and so that's how you make Kruskal's algorithm super efficient, but it's still dictated by this, what is the runtime of any good sort?
>> N log N.
>> N log N, boom, this is E though, so we do E log E, so Kruskal's gonna be E log E, Prim's gonna be E log V proper.

[00:13:06] When I say proper, if you really think about it, the size of E is proportional to V squared in the worst-case situation. It's technically V * (V - 1) / 2, but don't worry about that, that's still V squared. Because you can just imagine an adjacency matrix, and every single one has a 1 on it on both sides of the adjacency matrix.

[00:13:30] So you have an adjacency matrix completely filled up, therefore, there are V squared edges. So, if you wish to do some sweet math, you could put a V squared in there, then move the square out front as a 2, and then get the 2 out of there.
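Written out, that "sweet math" is the standard bound — a LaTeX sketch of the argument:

```latex
% E is at most V(V-1)/2 in a simple graph, so:
E \le \frac{V(V-1)}{2} = O(V^2)
\quad\Rightarrow\quad
E \log E \le E \log V^2 = 2E \log V = O(E \log V)
```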

[00:13:43] And then now, you got E log V, but I feel like that's cheating. So I'm not gonna do that, I feel cheated when I do that, it makes me upset and angry. So we're gonna keep it at E log E, all right, there we go, that's the difference.
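Putting all the pieces together, here's a hedged end-to-end sketch of Kruskal's with union-find — the function shape and Edge tuple are my assumptions, not necessarily the course's kata API:

```typescript
type Edge = [number, number, number]; // [from, to, weight], assumed shape

function kruskal(nodeCount: number, edges: Edge[]): Edge[] {
  // Every node starts as its own root.
  const parent = Array.from({ length: nodeCount }, (_, i) => i);

  function find(i: number): number {
    if (parent[i] === i) return i;
    const root = find(parent[i]);
    parent[i] = root; // path compression
    return root;
  }

  // Step one: the O(E log E) sort that dominates the runtime.
  const sorted = [...edges].sort((a, b) => a[2] - b[2]);

  const mst: Edge[] = [];
  for (const [from, to, weight] of sorted) {
    const a = find(from);
    const b = find(to);
    if (a === b) continue; // same group: this edge would create a cycle
    parent[a] = b;         // union the two groups
    mst.push([from, to, weight]);
  }
  return mst;
}
```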

[00:13:56] So when should you use Kruskal's? Well, if you have a graph that's sparse, sorting's not that expensive, it's pretty low, and the algorithm's really simple. Well, I mean, sort of simple, right? But Prim's, it's kind of expensive, right? Like you're doing 2E log V, then you're also popping everything else off, which is like log V.

[00:14:17] And then you have to do that a whole bunch, you have to do that V times, and so there's just a lot to Prim's. Because really what you're seeing underneath the hood is (E + V) log V. But since we had that whole E is greater than or equal to V thing, the V becomes irrelevant.

[00:14:34] And therefore, you drop it off, cuz remember, you could technically put in that V squared. You do all that, and then you always keep the highest-order term, and you get the idea, okay? Awesome, so that is tree stuff, we just made more trees, we made trees and graphs.

[00:14:50] Everyone like that, everyone's happy about that? Do we have any questions? Just some tree stuff, you know what I'm talking about? Just a little bit of tree stuff.
>> Is every graph problem something that can be reduced to a tree?
>> I don't know, I don't know if that's true. We did see that all trees are graphs, but not all graphs are trees, right?

[00:15:15] A tree is a directed acyclic graph, but not all directed acyclic graphs are trees. It's kinda like the whole square rectangle thing. You can imagine a graph that looks like the following, why did I choose green? We already talked about how we should not choose green. This is not a valid tree, but it is a valid directed acyclic graph, no cycles, it is directed.
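A concrete example of that square/rectangle point — a hedged sketch of a DAG that isn't a tree (the diamond shape is my example, not necessarily the one drawn on screen):

```typescript
// A diamond: directed and acyclic, but D has two parents, so this is a
// valid DAG and not a valid tree.
const dag: Record<string, string[]> = {
  A: ["B", "C"],
  B: ["D"],
  C: ["D"],
  D: [],
};
```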

[00:15:36] There you go, so graphs can be trees, you can't break all things down into trees, but you can break down some of them. And so this is pretty exciting, I like this one, this one made me very excited, I really did enjoy this. All right, we have one more really kick-ass graph algorithm coming up, okay?

[00:15:54] I hope everyone is very excited about the graph algorithm.