Transcript from the "QuickSort Algorithm" Lesson
>> So now, we're actually to the end of the ArrayList section. Actually, for real, this is it, we're gonna do one last algorithm, and that's it. So hopefully, we all feel pretty dang good about this. Remember, you may have forgotten, but I said we're gonna do two sorts, right?
[00:00:18] We're gonna do bubble sort, now we're doing it, we're doing it. And two, QuickSort. QuickSort is a fantastic algorithm, it's shockingly simple, and at the same time it's shockingly brilliant how someone could come up with this. To me, it just blows my mind. I don't think I could come up with it by myself.
[00:00:34] But whoever did it, maybe the constraints of having a really slow CPU would have made me come up with this, but I just didn't have that problem, right? I've been born in the gigahertz days, right? Kinda makes you forget how great things are. There's actually an algorithm technique known as Divide and Conquer.
[00:00:49] So normally in algorithm's book, often you'll see Merge Sort as the one they use, but this one we're gonna use QuickSort, cuz I wanted to be different, you can go read about someone else's Merge Sort. That way you have more breadth of things people explain. So what is Divide and Conquer?
[00:01:04] Effectively, divide and conquering is to be able to split your input into two chunks, three chunks, four chunks, whatever that splitting is. And then be able to go over those smaller subsets and solve things faster. Then you could resplit it again, and resplit it again, and resplit it again.
[00:01:19] It becomes something in which progressively gets smaller and smaller until it gets to some sort of fundamental unit, in which you can solve really easily. So think about sorting, an array of one element is always sorted. So that would be, say the fundamental unit in Merge Sort, is once you get down to that point, your array is sorted.
[00:01:37] Now, we need to just do the Undo function, which involves merging, hence the name, Merge Sort. All right, there are a bunch of different algorithm strategies. I didn't technically really name any of the algorithm strategies as we go, there's greedy. So linear search is a greedy search. We go until we find the first one, we're done.
[00:01:54] There's also dynamic programming. [SOUND] Dynamic programming is one of those things that, until you really get it, it's really, really hard. And for whatever reason, almost every single company asks you the same, if you had a stock chart, what is the best day to invest in? Which is just a dynamic programming problem.
[00:02:12] It's very annoying. All right, so first, we're gonna go over QuickSort algorithm, it really is a super fun algorithm. Yeah, by far, one of my most favorite ones, I think I've get the program. I have a fun story about it which I'll go over here in a little bit.
[00:02:26] But for now, let's just talk QuickSort. All right, you liked that the like, we can even use QuickSort. All right, so QuickSort works in such a way that it divides and it conquers. So let's just pretend that we pick some element out of the array. Doesn't really matter what the element is, we're just gonna call it p.
[00:02:49] And the reason why I call it p is, often this element is referred to as the pivot. And so let's just say, for our sake we'll just make it simple. We pick the last item in the array. We pick the last item, and then what we do is we have an index that hangs out over here, and we walk our entire array.
[00:03:08] So then we have a second one, that's gonna walk our entire array. Any element that is less than or equal to our pivot will be put in the first little runner's position, right? So we have that first pointer that just starts at the beginning, a second one that just walks the array.
[00:03:24] Every time we find one that's smaller than our pivot, we put it in that first position, increment our first runner. So we're putting all the ones that are smaller than, or equal to our pivot on one side of the array, then our pivot, then all the ones that are bigger than it on the other side.
[00:03:38] So, starting to kinda look like sort, you can already see a sorted item in here. So eventually, what we should see is that our first pivot should be somewhere in the middlish, right? Who knows where it's going to be? And everything on this side is going to be less than or equal to, everything on this side should be greater than.
[00:03:59] This is referred to as a weak sort. At this point, you'll see other data structures tomorrow that actually take advantage of an idea called weak sort, to make super fast operations. And so it's actually pretty cool. And then at this point, we need to split our problem in twain, yes, twain.
[00:04:17] We are gonna take the positions betwixt the pivot and the ending, and betwixt the other side of the pivot and the beginning, and we repeat the process again. Pick another pivot, we'll call this one pivot 1. Eventually, pivot 1 will be somewhere, say in the middle. And pivot 2 will be somewhere, say in the middle, and then we'll split it again.
[00:04:40] Or we don't include the pivot, but we include everything around the pivot. And we just keep on doing this operation until we get to a specific case. Of course, you can imagine an element of nothing, or an array of nothing, or an array of, say a single item, they're always sorted.
[00:04:58] Once we get to that point, we're completely sorted. Makes sense? Hopefully, it does. And what can we say about, say P1? Everything on this side is greater than P1, but we can also say something else. Everything right here, it's still smaller than or equal to our parent pivot.
[00:05:19] So it becomes progressively more sorted at this point, progressively more sorted at this point. So let's kind of think of it in a different sense, let's draw it in a different way. How I'm gonna draw it is in a tree. This really would help visualize also the execution of it all.
[00:05:35] And from here on out, I'll probably be referring to this thing as the low, one side, right? And the high, the other side. Should kind of be reminiscent of binary search. But there is gonna be one difference. As I program and as I do things, I'm gonna have my low be inclusive, and I'm gonna have my high be inclusive.
[00:05:54] Very unusual, most algorithms don't do this, I find that this one's much easier to be inclusive instead of exclusive. You'll see why as we get there, just remove some plus ones that can make things tricksy. All right, that's Gollum for tricky. All right, so let's just say we have an array that is 32 big, and so we have from 0 to 31.
[00:06:16] And we pick a pivot and it ends up being the middle element, which is going to be 16. That means when we do it again, we're gonna have two subarrays, right? And that subarray is gonna be from 0 to 15, we don't include 16 cuz we've already sorted that singular item.
[00:06:35] And it's going to be from 17 to 31. Does that make sense? Yeah, all right, then again, we get the middle element. We can do it on both sides. What is the middle element here? 24, that seems to make sense. I'll just keep on going down this one side because I don't think you need to see both sides being created, just repeat it.
[00:06:57] That means we're gonna have to do two more. We're gonna have to have 1,7 and we're going to have 9-15, right? Oops, I'm changing up my symbols there. I used a comma for whatever reason. All right, awesome. Again, we pick a pivot, it's in the center, we do it again.
[00:07:15] So what is the center of this one? Is it 4? I think it's 4, and this one, we're gonna have the center of here of 12, right? So that means we're going to have 9- 11 and 13- 15, awesome. Man, I'm running out of space, aren't I? Then we're gonna have 5,7.
[00:07:38] Apparently, I'm using commas every now and then. Things happen, your brain just makes new decisions as time goes on. And eventually, we get down to the point where if we pick another pivot, we'll have 2, then we'll have 1, 1, 3, 3, 4, 4, oops, it's 4, 4.
[00:07:57] 5, 5, 6, 7, 7, 9, 9, 10, 11, 11, 13, 13, 14, 15,15. See how that kinda broke down? We slowly kept dividing the array in half and half and half. And eventually, when we're at this bottom part, all of those pieces are sorted at this point. We're done sorting, we have them all.
[00:08:26] And so again, what we're seeing here is that we're seeing a way that we break the problem down more and more to the point where everything eventually becomes sorted. Obviously, everything in this direction is less than or equal to, which this direction is less than or equal to, less than or equal to.
[00:08:44] And so eventually, then it becomes simply sorted. Then this whole thing is sorted, then this whole thing is sorted, then this whole thing is sorted, then this whole thing is sorted, right? It walks back up and it does the whole sorting, and it can do it all in place, which is really impressive.
[00:08:58] Instead of having to create a bunch of temporary structures for holding values. And I think that that is a very kind of key point to this whole QuickSort business. Again, this is why we did recursion before we did this step, is because you can see right away, we have this branching factor within how we're solving.
[00:09:15] The branching factor is, of course, two, we have two branches every single time we execute QuickSort. And something else you should probably observe right away, which is based on this little thing that I've drawn right here, we had to go over all of the input once, right? Then when we were at this level, we had to go over still all of the input, right?
[00:09:35] Save this one little pivot. Then we went over all of the input again, save this one little, the couple pivots that we've now created. We keep having to scan the entire input every single time until we get down to the very bottom level. Which means that we have to do n operations all the way down to when n equals 1, which of course, is our beautiful equation of n over 2k = 1, remember that?
[00:10:01] Do we all remember what that equals? Log N, there we go. Which means our running time for doing this is going to be log N, or n log N. We scan n times, we do an n log of N, we do a log n amount of n's. Yeah, there we go.
[00:10:21] I knew eventually we'd come out of it, eventually I'd get there. But is that true? Is that true? Is that what's gonna happen? Does anyone see any problems with how I kinda describe this?
>> I'm thinking like, what if you don't get in the middle?
>> Boom, that is exactly it.
[00:10:36] QuickSort doesn't always sort quickly. I know, it's a very deceiving name. You never know what you're gonna get out of this one. So, can anyone think of a condition, based on how I've described the algorithm, which will actually produce the worst possible answer? I'll let you go back up, and I'll show you how I did it.
[00:10:59] Make it a little bit easier.
>> If you started at the beginning.
>> If we were right there, yeah, so, my, perfect! It's ridiculous, a reverse sorted array, right? If you are handed an array that goes 10, 9, 8, blah, blah, blah, blah, blah, down to 1, and I pick this number, what happens to our pivot?
[00:11:19] Well, our pivot looks something like this, 1, cuz it's just hanging out there by itself. Cuz remember, we put everything on this side of the array that is smaller or equal to it, and then we put it at the pivot point. That kinda sucks, cuz we now are sorting n- 1, right?
[00:11:36] Then we do it again, which is now 2, goes over, goes over, goes over. It goes all the way up for n amount of scans, n- 1 amount of scans, n- 2 amount of scans. What's the running time of this one again?
>> N squared.
>> N squared, that is right, because it follows the old third grader formula of n(n + 1) over 2, fantastic.
[00:12:00] We have now just done that, which of course, breaks down into the classic n squared over 2 + n over 2, which drop all the constants and lower all the stuff. And all we get left is that, right there. So that's what happens. So QuickSort is not just simply n log n, it can also be n squared, which totally sucks.
[00:12:21] That's not what you want. And so there are some strategies. One of the big strategies is always pick the middle element. If it's reverse sorted, you should get perfect bisecting. If it's sorted, you should get zero swaps. If it's just completely random, you'll have a random chance of being somewhere nearest the middle, right?
[00:12:42] I think over time, you'll probably fall within a certain middle region more often than the fringes. I'm pretty sure everything usually always ends up being a normally distributed function, but you'll get somewhere in this n log n to n squared range. So this is why QuickSort is kinda tricky.
[00:12:58] If you read about Merge Sort and you see it, it always does n log n time. But the constant in front of Merge Sort is rather large, because it has to do this whole array copying, kinda moving things around, type item that isn't necessarily the most conducive to high speed, low memory algorithms.
[00:13:15] And so QuickSort is often the one that is reached for, that does really, really well, but it can have poor performance, if you just hit all the right conditions. Again, the chances of you hitting all the right conditions all the way through to get n squared is exceptionally low, I would assume.
[00:13:31] So, it's somewhere between n log n best case, n squared worst case, somewhere in betwixt there. So, pretty fun one, it's kinda interesting one. It's one that you can't really fit into a running time, which I think just makes it more fun in general to learn about.