Transcript from the "Heap" Lesson

[00:00:00]

>> We're back in, we just got done doing a binary search tree. We did the actual finding of it, we explained how insertion and deletion, some of the problems that go with it. We're back in, we're feeling ready, we're very, very, very excited. At least, I am super excited.

[00:00:14] And then I forgot to do this, we already talked about that actually, we need to do all that. So we're gonna move on to our next structure. No, we're not gonna be doing any bouncing of a tree cuz of course, like I said, it's a very long operation.

[00:00:23] And I think that you if we were to do a follow up course with this, we'd probably cover things like tree rotations, and other fun trees that we're not covering here. So I always make this joke, if you just say the word priority queue, I'm gonna hire you in an interview.

[00:00:36] I just wanna hear you say that phrase because those things are useful. I know it's a joke, but it's not a joke, but it's definitely not really a joke. Is that really a joke? And so this is a very important data structure, it's often referred to as a heap or binary priority queue.

[00:00:54] And a simple way to put this is that it's a binary tree in which every child or grandchild is smaller, that's called the max heap, or larger, that's a min heap, than the current node. Meaning that whenever a node is added, we must adjust the tree. And whenever a node is deleted, we must adjust the tree.

[00:01:11] You don't traverse this tree, though we could traverse the tree, you don't traverse the tree. And it'd be actually extremely simple to traverse it in breadth first ordering. So you don't even need a data structure at that point to do it, it's already in that order. So let's first go over what a heap really is.

[00:01:28] So I'm gonna restate it, but this time with sweet circles and arrows. So I'm just gonna put a value, 50, let's start with a min heap. So a min heap means that the top value must be the smallest. Obviously, if you have duplicates, say we had more 50s, they could be children of 50, that's just fine.

[00:01:48] But that means both of these must be larger, so I can put something like 71, I can put something here like 100. Now here's where the ordering gets a little bit goofy. I can have children here, which they have to be larger than 71, so we can put it say as 101, and we can put this one as 80, right?

[00:02:07] Now, you're probably thinking, well, shouldn't that be right there? Well, no, it shouldn't be right there cuz heaps maintain a weak ordering, meaning that they are ordered, but it's not perfectly ordered, right? It's not in order of anything specific, but there is a rule at every single point.

[00:02:24] And so, if we were to do some right here, we could have 200, and we could also have 101, right? And it is fine, just as long as it keeps on getting larger as we go. And so, what this is called is the heap condition. The heap condition effectively states, if it is a min heap, that every node below me must be larger than or technically equal to.

[00:02:44] If it's a max heap, it states that every node below me must be smaller than or equal to. Cuz obviously, a max heap means the maximum item is the root, a min heap means the minimum item is the root. And so, this means if we needed the smallest item out of this tree, it's really simple, it's right there, it's at the tippity top.

[00:03:02] We could grab that item, or we could pick it at least at over 1. So it's super super fast to be able to inspect and say, that's the smallest item. Obviously, we couldn't get the median very easily right now because we don't actually have that available. There is a trick to get the median if you use two heaps.

[00:03:17] We're not gonna do that, it's a stupid Google interview question and doesn't really show you anything other than you can say the word priority queue. All right, so now that we have that, let's think about how we would add a node to the stream. Now one condition I didn't say is, is it always complete like this?

[00:03:35] Yes, a binary tree or heap is usually full or complete tree, meaning that it's always adding from left to right, and it never has any empty spaces. That would never exist in a heap because that is an empty space, meaning, there is nothing on the left-hand side, that just does not exist.

[00:03:51] So every single node is filled in from the left to the right, always tree level, and it's always complete at each level, right, there's no gaps, there's no holes. All right, so now that we did that, we can kind of go on. So how would we add a new node?

[00:04:06] Well, the simplest way to put it is that we go to the final spot in our tree, and we add the node. So let's say this node value is 3. All right, we've added 3, is our heap condition ordered? To min heap, and that's definitely really small, so the answer is no, of course it's not.

[00:04:27] So what do we do? Well, we just have to bubble up, we say hey, am I smaller than my parent? I am, okay, so let's swap us. This becomes 101, this becomes 3. We do it again, am I smaller than my parent? I am, so this becomes 3, this becomes 71.

[00:04:47] Am I smaller than my parents? Yes, I am. Now the root becomes 3 and this one becomes 50. Is our heap now correct? Yes, it is. And we can do this again, we can simply add to the next spot and do the same bubbling up as we go.

[00:05:04] Let's just say we added 200. And so we don't have to do any bubbling at all, we're good, we're larger than our parents, so therefore, we do not bubble up. Awesome, but what about deleting? Deleting is, we do have one question.

>> Sorry, is this a doubly linked list implicitly then?

[00:05:21] Or how would you establish the relationship between 100 and the new red node?

>> So the question was is this a doubly linked list implicitly? Or how's it? Or what's happening? How do you establish that parent child relationship? The answer is sort of. I'll show you here in a moment, we'll get there cuz that part is just, [SOUND] right, it's fantastic.

[00:05:43] So it's gonna use a data structure we've already created as our backing data structure, we only just simply understand it in a certain way. It's beautiful. All right, so how do we delete? We have 3 at the top, we want our most minimum item out of this heap.

[00:05:59] Well, let's go like this. We now remove 3, we're gonna return 3, but now we have a hole in our tree, right? The top node is empty, so what do we do? Well, we take our very last node in the tree, we delete that out, and we put it at the top.

[00:06:18] Now, we need to heapify down, right? So, we bubbled it up or heapify up, now we need to heapify down to put it back into a correct spot. So, what do we do? Well, since we're using a min heap, we're gonna wanna take the minimum of the two children and compare against that.

[00:06:36] So what's smaller, 50 or 100? Clearly, 50 is smaller than 100. So is 200 greater than the smallest of our two children? Yes, it is. So let's swap. 50 takes its rightful place as head of the queue, 200 goes here. We do it again. We take the smallest of our two ,which is 71 or 81, it's 71, so we do it again.

[00:06:58] So let's do that. So is this thing larger? Yes, it is larger, so swap. 71 makes it all the way up here and 200 goes right here. Again, we do one last time, and of course 200 and 101 will be swap. I'm out of room at this point, and this becomes 200.

[00:07:17] Now notice that we still maintain that property where we fill things in from left to right, there is no gaps and our heap condition has been maintained. Every node that is below any other node must be larger than or equal to. All right, that's pretty fancy. So now hopefully, if you've been following along thus far, you're probably thinking, that sounds great now, but how the hell do you get this node, right?

[00:07:42] I'd have to traverse my tree to know where the next node needs to be added, right? It would actually be really, really hard to know that answer given how we built trees thus far, which is like a node based data structure, correct? Yeah, it would be really, really hard.

[00:07:57] So let's not do that. Let's not do that at all. We need to kind of reimagine how we can store a tree. This is reimagining by the way, and that's the light bulb going off. [LAUGH] All right, fantastic. So let's put an index to each one of our nodes.

[00:08:12] So this one is 0, 1, 2, 3, 4, 5, 6. Okay, do we see anything going on here? Do we see anything? Why start at 0? Wow, that sounds a lot like an array prime, that's because it could very well be an array prime. So let's think about this.

[00:08:34] If we could store this in an array, then 50 could just be in the 0 position, right? Then the next two children which is 71, and what's the other side? 100, this side, what is that one? 101, 80, 200, no, running out of space. 200, 101, and then I think the last one, which I forgot to put a number by, which would be 7, would be right here, 200 again.

[00:09:06] So, look at that. 0, 1, 2, 3, 4, 5, 6, 7, 7 for 200, 7 for 200. That worked out, we were able to put it correctly into an array.

>> This would be implemented in an array?

>> This would be implemented in an array. So we'll technically could implement it in an array list.

[00:09:28] I mean, you can see why here in just one second, because that way it's growable. All you have to build an array list is a growable amount, because you don't know how many items are gonna be in the heap, right? So they can keep on inserting items, you're gonna have to keep on growing the underlying data structure.

[00:09:40] So great. We found a way to store it, but now the obvious question, how do you do parent/child relationship, right? There is no next, there is no previous, there is no left, there is no right, there is no parent, how do you do it? All right, well, I'm pretty sure we could come up with a formula to be able to do this.

[00:09:58] So we could do this fun game where we try to guess our way there, or I can just tell you. So let's just look at this one right here. 2 has a child of 5 and 6, so I can say 2 multiplied, whatever my index is, +1 is the right-hand side, correct?

[00:10:18] 2 times 2 is 4, plus 1 is 5. I can do 2i + 2 to be the right-hand side, 2 times 2 is 4 plus 2 is 6. But you're saying I don't know if this shenanigan does it really actually pan out? So that'd be 7, 8, 9, 10, 11, 12, 13, 14, right?

[00:10:40] That would be the next set of children in the placement if we were to do that. So 6 would have a children of 13 and 14. So let's find out. 6 times, what's this, it's not i, it'd be 2(6) because that is i, + 1 should be equal to 13, simple inspection, 13.

[00:11:03] 2(6) + 2 should be equal to 14, it's 14, awesome. Look at that, we just found a way to math ourselves into our children's area, right? I don't know how to describe that, we've traversed to our children, right? That is actually pretty dang incredible, cuz there's no links, right, there's no management of links.

[00:11:21] So in some sense, it makes management of this data structure way, way easier, cuz you don't have to be, okay, my parent, crap, I delete the wrong one, my data is gone, okay, let's rework this, right? You don't have to do any of that because it makes it so much simpler.

[00:11:36] But we still have a problem, we have solved the generic case for going downwards, but we haven't solved the generic case for going upwards. So to 2i + 1, 2i + 2. Now we could go like this, okay, if my index is odd, therefore I could do a left-hand equation to jump up.

[00:11:59] Or if my index is even, I could do a right-hand equation. I just picked a number, I don't actually even know, it's true, it true based on here, that is right-hand side, perfect. So that would have worked out, we could do some sort of, case statement for the parent, but we don't actually need to do that.

[00:12:13] We can think of it as a single equation. Now, in real programming languages that aren't JavaScript, when you divide an integer by another integer, you don't get a floating point number. So sorry, Python, you actually get an integer, correct? So 5 divided by 2 is 2, this is just saying, it's a very important concept here.

[00:12:34] So let's try this out. Let's try the idea of i- 2, obviously, not 2, what, -1 divided by 2. So let's go back to our six case of 13 and 14. So that means we'd have 14- 1 divided by 2. Well, that's 13 divided by 2. Again, real programming language, that equals 6, right, because we don't include the 0.5.

[00:13:03] There we go. So in other words, since we're using JavaScript, we'll need to do the floor operation. We'll have to floor out the value to get that,. Which means if we did same thing, but we did 13, 13-1 divided by 2 is the same thing as 12 divided by 2, which is clearly 6.

[00:13:18] So both 13 and 14 can go up to 6 by doing this formula right here. And so now we had just done both the child, the right-hand, the left-hand, and a generic parent calculation, and now we can traverse our list. But there is one more problem, how do we get the end node?

[00:13:36] The length, just keep track of how many items you have in your queue and that will be literally where you need to insert into your array. [SOUND] All right, like we've solved all the problems. That means we heapifyDown, we start at 0 because we're always gonna be deleting the head.

[00:13:56] We heapifyUp, always starting at our length. We already know these two things, so we'll know the indices to start with, which means we can actually write them, and do all that. So that is pretty dang incredible, right? It's actually not all that hard. There is one more operation that we won't be doing today, but we will be covering later cuz it greatly affects how an algorithm runs, which is updating.

[00:14:17] So, if you ever want to update a node, update a value in a priority queue, you really have to keep track of every single node with a value to index or index to value, kind of hash map. That way you can say, hey, I'm 17, where am I?

[00:14:34] I'm right here. Okay, therefore, let's bubble me up or bubble me down, because I'm gonna change my value from 17 to 25, right? So we're gonna need to know how do we bubble ourselves in that. So that one just adds a lot of complexity to it, but it will also make some algorithms way faster if we can do that.

[00:14:53] Typically, priority queues you usually don't consider update as something you do, but in this case, we could. So does that make sense? Do we feel like we have a heap all heaped up in our brain? This is like the funnest data structure, I think of them all. It's really, really kinda crazy, it definitely makes you think about the world completely different, right?

[00:15:13] We've always been doing this node, node, node, node, node, node. Hey, let's not do it, let's just imagine what an array could be. I just think that is just super cool. All right, just really prioritizes my queue, if you know what I mean. All right, so [LAUGH] that's a weird joke.

[00:15:28] Anyways, this thing is self-balancing, right, it always maintains a perfectly balanced binary tree, cuz we're only really removing or adding right at where the length is, and then bubbling it up into the correct position. It can be used for priority, so think of thread scheduling, think of all the great stuff right there.

[00:15:45] And of course, it's the funnest data structure to implement, easy to get wrong, but still the funnest one to do.