Transcript from the "Priority Queues" Lesson

[00:00:00]

>> So we can make this faster and this is where we need to figure out how to make it faster. I can see it behind everybody, priority Q called it. Absolutely, we're gonna do a heap and that is how we're gonna pull out things. So, does everybody here know what a heap is?

[00:00:14] Min heap, max heap. We got one yes, two yes, three yes, four yes, five. Was that peer pressure, yes, or was that a real yes?

>> No.

>> Okay, I believe and then chat and you probably all know it. But just in case you don't know it, a priority queue is effectively a binary tree.

[00:00:31] It's like really not a tree, but it is totally a tree. They're very, very cool. Where there's something called a heap condition. A heap condition is either the min or the max. Meaning that let's pretend this is A, B and C are heap conditioned according to B or A.

[00:00:50] Meaning that if we have the minimum, A is the absolute minimum out of all elements, but B could be larger than C or could be less than C. The heap condition only works up the tree, meaning that B's children, we'll call it x and y, are both larger than B.

[00:01:05] Now, they may also be smaller than C, they may both be larger than C. The ordering does not matter this way or across the level, it only matters going up or down. So that's a very important part of a heap. Some fun facts about a heap, if you wanna visit the nodes, you always start at some place say A, to visit this child, you have to do 2 your index.

[00:01:28] Index of the root node is 1, so 2 index would be 2, which happens to be this, 2 index plus 1 would be your other side. That's because underneath it's actually stored in an array which makes it so neat. So A would be here, then B, C. And so if we wanted to find X, starting from i, it would be 2i.

[00:01:50] So since B is 2, we'd have to go down to 4, so that'd be 1, 2, 3, 4. And it'd go right here, this is X, and this is Y, because that's 2i plus 1. Pretty fun, and this keeps on going. So heaps are kind of cool. If you go into any position, you're actually jumping around inside an array using some fancy math.

[00:02:08] Quick maths right here, it's only 2's, 2's and pluses of 1's. That's the thing about being a computer scientist. You never have to add more than one, and you never have to multiply more than two. It's very, very nice. I absolutely love that. As a teacher once said, all answers are 2 to the n or 2 to the n minus 1.

[00:02:26] That was 2 to the n right there. That were two to the n in r right there. All right, so we're going to use that idea to now keep track of our edges but there is a cost to using a heap when you add something to the heap.

[00:02:42] No matter where it is, you always add it to the end of the underlying array, the length of the array, or the position within our array that is last defined is the next position to add to. It's kind of one of the optimizations of a heap. Then you have to heapify up as it's called.

[00:02:58] Am I less than my parent, if I am swap it up. Am I less than my parent, If I am swap it up, if not you're done, heapified great you're dead. I don't know I like saying that phrase, it's just super fun. All right, there we go, so that is a heap.

[00:03:15] Hopefully, everyone understands that and so we are gonna be putting our edges into a heap. That means we are gonna have to potentially walk a binary tree that is perfectly balanced at all times. So what is the runtime of walking up a tree that is perfectly balanced?

>> Log h.

[00:03:36]

>> h or log of n. Technically, h is log of n in this case. h equals log of n. So it always is gonna be that, because it's perfectly balanced. So every single time you insert into a heap, it is log of n. Every time you want to remove the minimal element, it is log of n.

[00:03:56] So the exact same thing. So that means we go from having to do this in E time to having to do a log N lookup, or in this case, it'd be log E. E being N, because we're using edges. All right, which also means our add to collection now gets changed.

[00:04:15] Our add to collection, every time you add an edge to the collection, that is a log. We now actually have a log situation going on here. Obviously, you can totally check for visited, so you're not doing as many, but practically speaking, we are E log E at this point.

[00:04:36] And so now we have dynamically changed our run time of our algorithm from V.E to, whoops, that is totally wrong, to E log E. But we can still do even better than that. We can make it faster, stronger, and more amazing. Are you ready for the more amazing version of this?

[00:04:58] I'm gonna write that over here. You're probably thinking, it can't get any better. Well, if he acts now, we can. Alright, so instead of doing this whole, just dropped the idea of edges, you don't need to think about edges. You're thinking about this right now, this is actually one of my favorite data structures of all time.

[00:05:15] It's a priority queue, except for it's indexed, meaning we're literally gonna use the PIQ data structure. If you don't know who PIQ is, great guy. So we're actually using the data structure named after a very swell fella. So, the PIQ data structure, a priority indexed queue. What we're gonna do here is that we're going to initialize a priority queue, except the priority queue, instead of having edges in it, is going to have vertices with the minimum known distance.

[00:05:45] I know pretty exciting right? Which means we're going to do this. Let's just say again we start at E, that means we're going to have a priority queue of A, B, C, D, E and G. E will be zero, we already know the distance right? We don't know any other distance.

[00:06:03] All of them are infinity. And since this is a pick, a priority indexed queue, I can say, hey, get me C. O of 1, look up for C. Get out C, and I can say, hey, is 3 less than infinity? It is, we'll put a 3 there, right?

[00:06:19] We've removed E, by the way. I forgot to do that step. We remove E, now we're adding in its edges. And so we look up C, we do that, now we have to heapify it because we've changed its weight. So we have to heapify it within the heap and it's going to be moved higher up to the top.

[00:06:34] So that's going to be a log V operation because there's only V items in there. So log V, V is always smaller than E, other than this one single case. After that, it's always larger, right? You will always be able to do the same size or larger. So just don't bring up the two node case, okay?

[00:06:58] It's just unfair mathematics. All right, so we do that we can do with F. F turns into a 1, G turns into a 3, and now we've just done this with log V time, E log V, right? And so we've properly made it into an E log V algorithm.

[00:07:19] And just for the sake of completeness, I am gonna run through this really really quickly. That means we take this, we've now visited E, and we now are gonna pop off the next item which will be F. So now, we'll have the line with the priority queue right here.

[00:07:35] Then we'll update all the distances, G becomes 2, D becomes 4. And then now we can pop off G. And you can kinda see this thing go on from here on out, right? Everyone's following along, everyone's tracking, same algorithm. We're now using nodes with their best known distance as the way we pop things off the priority queue.

[00:07:54] Super, super cool, right? And we've now managed to get it all the way down to E log V proper. And so I think that's a super cool. I didn't know about this trick until not too long ago. I didn't realize you could organize it by nodes and do that.

[00:08:06] I always thought it was E log E. So super cool, blew my mind when I saw that, and I was just like, wow, that's awesome. All right, we need to go on. We need this to be even better. We're gonna now look at Kruskal's algorithm. His algorithm is a little bit different.