Transcript from the "Binary Search" Lesson

[00:00:00]

>> Bianca Gandolfo: So we are going to switch gears a little bit, not too much. We're going to keep searching. I feel like just reflecting on this class today, it's like we've been going through this journey. It all started with waking up in the morning, trying to be better, and finding things, exploring things.

[00:00:23] Doesn't this is feel like the hero's journey or something? Anyone else feeling that? No, just me? [LAUGH]. But anyway, it's like we've been searching a lot for things, and we're gonna keep searching. [LAUGH] That's life. So we're gonna look for something specific, and let's just take a look, different strategies for searching.

[00:00:50] So we have a linear search, which is what you'd expect, you start at the beginning, and we're looking for eggs. Because that was the recommendation from the graph. So we're looking look for eggs, we start at the beginning, and then we loop through until we find it, and then we find it.

[00:01:15] So that's linear search. What's the time complexity for linear search?

>> Speaker 2: Linear.

>> Bianca Gandolfo: Linear, yeah, great.

>> Bianca Gandolfo: Well, let's talk about binary search. So we're still dealing with an array, but we have a first index and a last index that we keep track of. First index is gonna be 0, last is gonna be 5.

[00:01:40] And the first thing we wanna do is find the midline. Because what we're gonna do with binary search is we're going to cut the problem down in half every time. The reason we can do this, and this is important, is because it's sorted, right, it's in alphabetical order, hopefully.

[00:01:58] Because it's sorted, at the midpoint we can derive if the value is gonna be on the left or on the right. And so we're gonna continuously cut our problem in half and look both left and right each time. So what does that look like? So first we have the start and the end as 0 and 5.

[00:02:19] So the entire data structure, we find the middle, which is 2. So we are going to look, is it gonna be less? Is eggs gonna be less than c, like is e less than c? No, but it would be greater. So then we wanna increment our midpoint, or we need to update our midpoint.

[00:02:48] And add it as, we update our midpoint, so the one after cereal, we want that to be our new search area, right? Cuz we already know it's not gonna be on the left. So we wanna only look here, right?

>> Speaker 2: How did we take out Apple?

>> Bianca Gandolfo: What's that?

[00:03:08]

>> Speaker 2: How did we remove apple?

>> Bianca Gandolfo: We don't remove it, we don't actually change the data structure. We're just not gonna look there anymore.

>> Speaker 2: Yeah, sorry, that's not what I meant. How did we decide not to look for apple?

>> Bianca Gandolfo: Because we know that since this is all in alphabetical order-

[00:03:22]

>> Speaker 2: Yeah.

>> Bianca Gandolfo: We're at c.

>> Speaker 2: I see, okay.

>> Bianca Gandolfo: We know that e is after, right?

>> Speaker 2: Right.

>> Bianca Gandolfo: So I mean, we're not actually checking left or right, we're just basing it off of this value. The arrows are kinda just of just demonstrative. Yeah, and so we wanna update our first index to 3.

[00:03:43] So this part, we know it's not gonna be there. So now our problem is half as big as when we started, right? And all we had to do is look in the middle, and then we know that we need to look over here. So then we need to calculate our next midpoint, which is just simple math, right?

[00:04:00] We subtract the first index from the last index, and then we divide it by 2.

>> Bianca Gandolfo: So that gives us 4, which actually lands us where we need to be, right? Cuz it's 0, 1, 2, 3, 4, and we found it. But you could imagine that this could be really big.

[00:04:22] And we would have to keep doing this step where we figure out if it's before or after, and then you cut the problem in half. And then you keep cutting the problem in half until you land on your solution.

>> Speaker 2: What if the what you're looking for is bread?

[00:04:42] Because it's a c, then you'll go that way, and then that's how you're gonna continue, all right, cool.

>> Bianca Gandolfo: Exactly, exactly.

>> Speaker 2: But that's only if it's alphabetical, but what if it's just by, if it was by index, then it would be something else, or okay.

>> Bianca Gandolfo: Yeah, it has to be sorted, and it has to be.

[00:04:56] And because for something to be sorted, you have to be able to compare it, right? So easy things for us to compare are numbers and letters. That's like has an explicit ordering. So I just got tired of all these data structures and algorithm stuff with just numbers all the time.

[00:05:23] I have to do it, cuz it does simplify. But when I can sneak in some words, and I think I was hungry while I was making this. So I always find, these are all breakfast foods by the way, fudge included, just thought you should know. So that's binary search.

[00:05:40] So binary search, the time complexity is something called O (log n). Who's familiar with this? This is a little trickier than constant and linear, right? Let's hear it, Sonny.

>> Speaker 3: What, do you want me to tell you?

>> Bianca Gandolfo: Yeah.

>> Speaker 3: So every time you split the range, you're looking into half, so it becomes 2 to the power n.

[00:06:07]

>> Bianca Gandolfo: Yeah, so whenever you're seeing a problem that gets halved every time. So how we know this is halved is because, we need to identify n. So n is the length of this array, right?. So every time we get closer to our goal, we're only looking through half of the data structure.

[00:06:31] I mean, each look, we only look once, right? So,

>> Bianca Gandolfo: When we get here, we look at cereal, and we know it's to the right. So we don't look at apple or bread. So we're halving the amount of work that we need to do, etc, etc. So you keep doing that over time.

[00:06:53] And so if n is 6, or yeah, n is 6, then the first time we look, there's only gonna be 3 left, and then 2, and 1, for example. So that is, anything that's gonna be halved every time is gonna be logarithmic time. And it's kind of like the opposite of an exponent, I guess you could think about it like that.

[00:07:25] But I don't like to get too much into the mathy stuff. I just see okay, this problem set is being halved each time, logarithmic. When you get into like sorting kind of stuff, then it's n log n, cuz it's linear and logarithmic. And that's pretty interesting, and something to look into if you're not familiar with it.

[00:07:44] I have other courses on Frontend Masters that talk about that. All right, so let's talk about binary versus linear search really quickly. So we have our binary search as log n. It's gonna be faster, especially for really large data sets. And then we have linear for our linear search.

[00:08:04] I guess that makes sense, right? The nice thing about linear search, though, is that it doesn't have to be sorted. But for binary search, it definitely has to be sorted. Whenever you get thrown a problem that's here's a list of things, you need to search for it, always ask, is it sorting?

[00:08:26] Cuz if it's sorted, then you know that they're probably going for a binary search type question. And if not, you can always ask, can I sort it? So you can take into consideration the amount of time it takes to sort an input, which is O(n log(n)) usually. There's a little asterisk with a link to big O cheat sheet, which will talk about different sorting methods and their big O notation.

[00:08:58] But in general, you could say n log n, and people will believe you. They're not gonna really want to ask you specifics. Do you mean quick sort, do you mean merge sort, etc, etc. You can say either one and it would be right enough for this case.