Transcript from the "Binary Search Algorithm" Lesson

[00:00:00]

>> So, let's keep on going because we kinda need the blades a little bit faster than this. All right, so algorithm time, we're gonna go to our second algorithm, this one is a doosy, if you will. This is the most classic off-by-one problem I think there is in the universe.

[00:00:15] And of course, this is a basis for other algorithms, or how we will look at other algorithms. All right, so an important question you should always ask yourself when it comes to your data set, is it ordered? If your data set is ordered, you have new advantages you can take with that data, all right?

[00:00:34] So first, let's come up with the algorithm ourselves, okay? So I wanna kinda derive this in some sort of sense. So, let's just go like that, we'll create a new one, plus plus it, and let's kind of derive this algorithm ourselves. So, if you have an array, but it is ordered, we don't have to go here and say hey, is X0 equal to the value we're looking for, right?

[00:01:01] We don't have to start in the first position. We don't have to go to the second position cuz we can make some set of optimization, right? So one thing that immediately probably comes to everybody's mind who is not familiar with this problem is that well, you know what we could do is we could say, jump 10% of N, right?

[00:01:19] I could jump 10% of N and say, hey, Xi are you equal to my value? You're not, then I'll jump 10% more. Hey, Xi are you equal to my value? You're not but you're now officially larger than me? Well, I better walk back 10% and then linear search my way through that, right?

[00:01:39] Cuz that means the value is somewhere between the previous one and my current one. We can make some kind of guesses here. Now this is why this type of style of algorithm kinda breaks down, which is first, let's ask ourself, what is the worst possible case of this algorithm?

[00:01:56] Well, the worst possible case is we jumped by 10%. We're larger than the current value, we still need to keep jumping forward. We need to keep looking. We jump by 10% again. It's still not there, we're still large. We're still larger. We're still large, we go all the way to the end of the array.

[00:02:14] We're still larger, which means we have some portion of the very last bit of the array unsearched approximately 10% of the last part of the array unsearched. Let's now walk through that last portion and look. It wasn't there at all. It was larger than every element in our array.

[00:02:29] We've totally failed. This sucks. So what is the running time of something like that. Well, the problem is we used N in our jumping, right? We jumped by 10% of N. So that means we're gonna do 10.1. We're effectively gonna do, what is it? We're gonna do, what is that there?

[00:02:50] 10 operations of jumping. There you go, sorry. 10 operations of jumping plus 1.1 of N, right? Cuz that's effectively what we're doing right here.. So if we denote that in big O, what is rule number two for big O?

>> Ignore constants.

>> Ignore constants, so bam and bam.

[00:03:11] So what is our running time right now? It's still N. Isn't that just emotionally defeating? This is still way more efficient. And if you can do something like this, obviously it's much better than linear searching, right? Practically speaking, but theoretically speaking, if your elements go from a 100 to a million, your algorithm is gonna run that much slower, right?

[00:03:34] Whatever that number is, I should have chosen a better one, 10,000 to a million. Your algorithm is gonna be a 100x slower, even though it's still 10 times faster than your every single index. And so that is problematic. We didn't improve the algorithm. We just improved it practically speaking.

[00:03:50] So what happens if we could jump differently? So how can we jump differently in this? Well, let's do another thing. Well, what happens if we just jump to the middle of the array, all right? And say hey, middle value, are you equal to V? If it's not, well, then we can check only one half of the array.

[00:04:08] If it is the value, awesome. So let's just say that our value is larger than this value, what do we do? Well, if we just simply linearly scan back this way, well, then what happens? Worst case, 1/2 of N. Well then, you can say, well, let's just do one more time.

[00:04:25] We could half it again, right? Okay, so let's just half it again. So now we are only looking over half the space to 1/2N. So we half it, we add a 1/4 of an N. And if it is larger or smaller, we can then scan. Well, again, what's the problem if you scan at any point?

[00:04:43] You drop the constant, it's just simply linear. So we can't do that. We can't scan. So, what we could do instead, though, is we could, well, let's see. We could do another one, this is Xi = V. It's not? Well, we could then half it again, right? So let's go here, here, we then half it again.

[00:05:04] We can go here, this is if we chose a really small number, and we could keep on halving it until either there's no half left to look at or we find our value. And so let's think about how that works. That means effectively, we'd have an array of size N, right?

[00:05:21] Then we half it. So now we have a value space that we're looking at that is really 1/2N. Then we half it again. So now we have a value space. I'm just gonna start writing like this. That's 4, then we have a value space, that's 8. And we have a value space that's 16.

[00:05:37] And it keeps getting smaller until a specific condition is met. Or in other words, N/2k = 1, right? So let's do this. This is the most mathy part of the whole course we're gonna do. So let's do the classic multiply both sides. There we go. Or in other words, N = 2k.

[00:05:58] And of course, for those that remember their logarithms, you can say log of 2, [SOUND] let's see, of N [SOUND] = k, all right? Log of base 2 of N = k, or in other words, our running time is log of N, right? Cuz until we get down to 1, we have to keep halving it.

[00:06:21] So we're going to half it a log of N amount of times. So practically speaking, if our array was 4096 and we half it, we'll have 2048, then we'll have 1024, 512, 256, 128, 64, 32, 16, 8, 4, 2, 1, right? And what is the logarithm of 4096?

[00:06:49] It is 12. How many steps do we do? 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, right? At that point, you don't technically count the first one, I counted wrong. So this would be the 12th position because that's the starting position. You didn't have that one to get there.

[00:07:05] You get the idea though. There you go. So you do 12 halvings to get to 1. And so at that point, you would know whether or not the value's in the array. So the worst case situation is when the value is not in the array, and you have to have it over and over again till the end.

[00:07:19] You will do a log N amount of halving. So this type of searching is referred to as binary search. Does that make sense? Binary, either 1 or 0, right? There's only two possible states, technically. I mean, in real sense it's kind of like a trinary search cuz it is the value, the happy case.

[00:07:39] But in this case, it's either it's this side or to this side, right? You're always splitting it in half. You're always traversing down one path of execution. So let's look at this. Blah, blah, blah, yeah, another good little trick of Big O, by the way, is when you have the input at each step, it's either log N or N log of N, right?

[00:08:00] So just remember that. That's depending if you're scanning that input or not. We're not scanning, we're only looking at a value and then dividing it into the half, looking at a value, dividing it in half. We're not scanning anything. So that's usually a pretty good trick to think about.