Transcript from the "Introducing Space & Time Complexity" Lesson

[00:00:00]

>> Bianca Gandolfo: We're gonna talk about time complexity. Time complexity is how we reason about the speed of an algorithm. So as you can imagine, speeding up an algorithm allows us to process more and more data inputs. However, we can't time the start and the finish of an algorithm, so we use a way of estimating the speed of algorithms.

[00:00:30] So if we run some code on a machine, there's a lot of factors we can't control, such as the speed of the machine, and language, the environment that the language is executing in, all of these things. So instead of saying, this algorithm is three seconds long, we say that this algorithm, as the input grows, it will grow in this proportion.

[00:00:58] And so we're gonna talk about that a little more in depth. So there's two things, there's space complexity, which talks about how much memory is used. And there's time complexity, which tends to be a little more difficult for people, which is, how many operations are executed to solve your problem?

[00:01:19] And we wanna think about this in respect to input size. So as your input grows, how many more operations do we have to complete? And we also are pessimists in our field, and so we always wanna assume the worst case scenarios. And so we're thinking about always worst case scenario and,

[00:01:41]

>> Bianca Gandolfo: Yeah, but there are other ways to think about it as well, okay, questions?

>> Bianca Gandolfo: No, [INAUDIBLE], okay, all right, so let's look at a few examples. So here is kayak.com, let's say you're an engineer here. Your PM or your whoever's telling you what to do says, okay, given this list of hotels, you need to return the minimum and the maximum price value for all of the results from some search query.

[00:02:20] So how might we solve this?

>> Bianca Gandolfo: Cool, one note, so as more data is being processed. So say we have a list of 3 hotels versus a list of 1000 hotels. You can expect, right, you're gonna be doing more operations, that's the obvious part. So that's why we like to think about it in terms of proportion.

[00:02:46] So as you're adding one, how many more operations need to be completed to get to your solution?

>> Bianca Gandolfo: And the cost as a result, as your data set grows, the cost can either be really great, or it can grow really slow.

>> Bianca Gandolfo: All right, so one approach to find the range is, we simply have some nested loops going on and we compare all numbers, all prices, to the other prices.

[00:03:19] And so you can imagine,

>> Bianca Gandolfo: That we won't compare one to another. But we say, okay, so is 50 greater than 200? No, and we're just gonna keep track of that. And then we compare, keeping track of the lowest and the highest. As we're looping through, we have some variable that we're updating.

[00:03:44] But we have some nested loops cuz we wanna make sure that we're comparing everything to everything. And so you can see, every time we do this comparison, and we are doing some operation, right, which is the greater than or less than operation. And so every time our list grows, so say we have three items, kind of like this graph here.

[00:04:11] Right, we're gonna do nine compares, versus if we have five, it's definitely 25. Because our table is gonna grow, as we compare numbers, our x and ys will compound. So 10 is gonna be 10x, 10y, and so there will be 100 comparisons. So this is what we call quadratic time.

[00:04:38] So as our number of n, which is the number of hotels, increases, our number of operations increases by the power of 2. So 3 times 3, 5 times 5, 10 times 10, and this is how I want you to think about time complexity. As our input grows, how much more work do we need to do?

[00:05:04]

>> Bianca Gandolfo: Okay, here's another approach, what if we have two loops? On the first loop, we check what's the minimum, in the second loop, we check the maximum. How many operations do we need? So for our first loop, we'd say, is 200 the maximum, yes, is 50 the maximum, no, so we just keep 200.

[00:05:28] You can imagine we might do that a bunch of times. Is 175 the maximum, no, we'll keep it at 200. And then we have another loop it's gonna loop through. Is 200 the minimum, at the very first point, yeah. And then is 50 in the minimum, yes, now it's the new minimum.

[00:05:42] And we keep going, and then 50 is still the minimum. And so in this technique, we are looking and doing operations on each item one at a time, so this is two loops, not nested. The first example, nested loops, where every item gets compared to every other item.

[00:06:08] This one, every item gets looked at once, we just keep track of that min. And we'll look at it a second time and we keep track of that min. And so if you had to put this in terms of n, I gave you the answer, I always do that.

[00:06:30] What do you think, if you had to guess, so if the other one was n squared-

>> Unidentified Male: Two times n?

>> Bianca Gandolfo: Yeah, two times n, exactly, so for the first loop, that's n, right? So from zero or one to three, the second one, one to three. Cool, and so yeah, so every time we increase n, our work increases by 2n, questions here?

[00:07:02]

>> Bianca Gandolfo: Is everyone clear on the difference between 2n and n squared, okay.

>> Speaker 3: This is shorter.

>> Bianca Gandolfo: Yeah, it's faster, less comparisons are made. And you can make this table and you can see that this one only has two here, and this row here is n, or the number of hotels.

[00:07:27]

>> Bianca Gandolfo: All right, so what if the list was already sorted, how might we find the min and the max there? How about,

>> Bianca Gandolfo: Who do I wanna choose? So many people, I like all of you. How about Josh?

>> Josh: I missed the first part of that.

>> Bianca Gandolfo: Yeah, so if the list was sorted, how might we find the min?

[00:07:57]

>> Josh: If it was sorted, well, there's that thing where you just divide it, and then you divide it or whatever into chunks.

>> Bianca Gandolfo: That's true, that's if you're searching for something

>> Josh: I guess one end, you just go through the list and-

>> Bianca Gandolfo: Mm-hm, but if it's sorted, you know the minimum is where?

[00:08:13]

>> Josh: Or you just go to the first.

>> Bianca Gandolfo: Yeah, exactly, so we just know exactly where it is, so we'll go to the first one. And Sarah Rose, what about the max?

>> Sarah Rose: Well, you'd go to the last one.

>> Bianca Gandolfo: Yeah, we'd just go to the last one.

[00:08:24] So that speeds things up considerably there, right? No loops, definitely no nested loops, and so every time, the number of operations is really, no matter what, no matter how long the list is, it's still going to be two.

>> Unidentified Female: Yeah, why is it three, I was kind of wondering.

[00:08:44]

>> Bianca Gandolfo: I don't know why, there might be something else in here.

>> Unidentified Female: Because it's sorted?

>> Unidentified Male: It has to sort.

>> Bianca Gandolfo: No, it's already sorted, because the sort, if there's a sort on there, then the time complexity would change. This is the correct slide for, okay, so if it's sorted, we just do two operations, no matter the length of the array.

[00:09:07] So we can have a list that's 100, it's still we get the first, we still get the last. And so the number of operations doesn't change.

>> Bianca Gandolfo: Cool, questions?

>> Bianca Gandolfo: And here's just a review of that, so if we compare all the numbers to one another in that nested loop situation.

[00:09:32] The very first one, that's gonna be n squared, because as our n increases, each loop is also going to increase, and so we need to multiply it by itself. Anytime you have one loop, and it's looping from the beginning to the end of n, that's gonna be just n, right?

[00:09:49] Because when it's 10, you loop 10 times, when it's 11 you loop 11 times, right? So 10 is n, 11 is n, so we just have n. And in this case we have 2n because we have one loop, and then we finish that loop, and then we do a second loop, and we finish that loop, they're not inside each other.

[00:10:08] And then with just the 2 here, we just go directly to the source. Right, either the beginning or the very end, because our list is already sorted, and that's nice, cool, all right.

>> Bianca Gandolfo: Here are some fancy names for these things. So when we say n squared, that's called quadratic time.

[00:10:34] We talk about time complexity using something called big O notation. And so we would say, this is O of n to the 2, or O of n squared, or quadratic time, that's how we talk about it. And then we say O of n which is linear for our 2n example.

[00:10:54] And then O of 1, constant time for our, this is another corrected slide, sorry guys. For our case where we're going directly to the min and the max here.

>> Bianca Gandolfo: So the next logical question might be, what's with these 2s? This one says 1, that one just says n, then we have 2n, then we have 2.

[00:11:28] When we are thinking about time complexity, we don't think about every small detail. We wanna have a big picture estimation, and so we drop what we call non-significant numbers, which is anything that is not pretty much listed here. So if it's not n or log n or n squared or k to the n, we just drop it, essentially.

[00:11:54] And we just give it a big label versus getting to the nitty gritty of, if you take an algorithmic analysis class, you will do all of the calculations that go into this. But in the interview setting, it's not expected that you're gonna be proving this. So you could just think about it like that, and what else do I wanna tell you about this?

[00:12:19]

>> Unidentified Female: Bianca, is this the entire range of the big O notations?

>> Bianca Gandolfo: Yeah, well, more or less, yeah, it's the major ones. But there's also n log n, which is another major one, which is linear times logarithmic. But we'll get to that when we get to merge sort we'll talk about that a little bit.

[00:12:44] A good question, thank you. And also, the fastest algorithms are in constant time, and the slowest are in exponential time. As this input grows, it's like an exponent, it grows exponentially, and this one grows n times n. This one grows just by n, so it kind of is a straight line, or not a straight line, but a straight diagonal.

[00:13:10] And then we'll talk about logarithm in a second, and then constant is always gonna be the same. No matter how your input grows, it's always gonna be the same number of operations.

>> Bianca Gandolfo: Okay, here's a graph, so here is our constant time rise. No matter what, it's always gonna be the same.

[00:13:29] Here's some exponential one, here's a cubic one, quadratic. These are all really, really, really slow, look how fast they grow as the input grows. So when your inputs are sufficiently large, you just can't compute anymore. It'll take a billion years and you won't be able to solve your algorithm.

[00:13:46] And that's why it's important to know how slow it is or fast, and then be able to reason about how to speed it up, or understand the constraints, that maybe you can't do that. And you might need to make some trade-offs, like maybe algorithmic correctness, which is this idea that, is your solution to the algorithm the optimal solution or the correct solution?

[00:14:08] Sometimes you have to make a trade-off, because your algorithm otherwise would be so slow that it would never compute. So it's better to have some solution that's good enough, versus no solution at all.

>> Bianca Gandolfo: What else, and then I'll talk a little bit about logarithmic in a second.

[00:14:27] But what is good to know is that as log grows, you can see that this is really steep at first, and then it tapers off. So as on the logarithmic solution, as the inputs grow for a logarithmic solution it grows, but then it kind of tapers off. So as the work increases, it doesn't,

[00:14:55]

>> Bianca Gandolfo: The delta, or the increase, lessens, we'll talk about that more, yep.

>> Unidentified Female: So it looks like there are some places on the graph where the quadratic is actually more efficient than the logarithmic if you had a very small data set. So would there ever be a point where you might choose that over the other-

[00:15:24]

>> Bianca Gandolfo: Usually time complexity isn't so important when your data set is small. So I would just, if your data set it small, I would just focus on readability rather than performance. When your dataset is large is when you wanna start thinking about the speed. Really, you don't wanna start thinking about speed until it's lagging and someone's complaining about it.

[00:15:46] You don't wanna pre-optimize, but yeah, you really don't need to start thinking about it until your data set grows, and there's some sort of lag in your computation. It was a good question.