Complete Intro to Computer Science Big O Trade-Offs
Transcript from the "Big O Trade-Offs" Lesson
>> So we just talked about spatial complexity. Let's go ahead and pop into a whole section just on tradeoffs. And again, I actually think this is the most important part of this course. I'm gonna show you a bunch of novel techniques for sorting things and storing information, all that kind of stuff.
[00:00:21] But the point here is, I wanted to show you can do things multiple different ways. And you can decide which one's better or worse depending on what the problem constraints are. And I think, again, when you're getting into interviews and talking to an interviewer at a tech company, if you can discuss the tradeoffs with them, you're gonna be a leg up on everyone else, right?
[00:00:49] So you can get in there and it's like, okay, well, we could do it this way, but we're trading off time. Or we can do it this way, and it's gonna cost more on our servers, blah, blah, blah. And you can discuss money versus customer experience or things like that, right?
[00:01:04] To me, I have, at this point, done way too many interviews. I find that as an interviewer very, very impressive when a developer is able to kind of discuss these kind of, I guess business logic kind of things based on the constraints that they have. So I'm just giving you a big hint there that I think that's a really good idea.
[00:01:33] And then just like when you're writing code, looking at something and just kinda getting these, people call them code smells. But the idea of just these intuitions of hey, I have a for loop here, and I'm putting another for loop inside of it. That's what n square looks like, this might be a bad idea, right?
[00:01:51] That's another thing that I'm kinda trying to help you understand here. Yeah, and just thinking about something in terms of, could this be done better? And, again, better is a very subjective term, and I'm trying to help you understand the trade offs so that you can decide what better means in any particular case.
[00:02:14] So my first rule here that I put, and I'll just zoom in here a little bit on this is, there are no rules here, right? There is no always do blink, right? That's one of the very, very big keys here. We talked about best and worst case scenarios, right?
[00:02:31] Sometimes you wanna lean into those best case scenarios, and sometimes a less effective algorithm in terms of average case might have a better best case. And you might always have a best case scenario in front of you and you know that, in which case, you wanna lean into the best case.
[00:02:47] But sometimes, you don't know what you're gonna get, in which case, you really wanna lean into the worst and average case scenarios, right? So that's one thing here is, just learn that there are no rules and you have to reevaluate every problem in a unique light. There are frequently multiple right choices.
[00:03:09] It's not like someone's up there grading you saying you should have chosen quick sort instead of merge sort this time. Frequently, those are both very effective answers to the same question. Remember how I told you Big O says ignore the 3 and the 3x squared? Sometimes you can't ignore the 3, right?
[00:03:31] So even though I taught you Big O and Big O ignores all of the coefficients, the incidentals, right? If you have a for loop that does has 10,000 lines in it, sometimes that coefficient ends up being enormous and you can't ignore it, right? So, again, Big O is just one tool, it's a very blunt instruments in terms of that it ignores a lot of the incidental details, and sometimes the incidental details are actually critical.
[00:04:05] So be aware of that. Yeah, Sometimes Big O can even be a bad tool to use in the sense of, let's imagine, and this is actually a real story from my career. I had a job that ingested a bunch of data from a partner once a day and ran and did a bunch of processing and then stored a bunch of stuff in our database.
[00:04:36] And I spent just tons of time trying to optimize that cuz it was a lot of data, right? But here's the thing, that job happened once a day, and it didn't matter if it took 30 minutes. No one was waiting on it, right? It was just updating our database with the freshest data, right?
[00:04:53] So this was really early in my career. I spent just hours, and days, and weeks, trying to optimize this to get this down from, I think I had it 45 seconds I was trying to get it done. I was like, this should run in under a second. It was a massive waste of time, it didn't matter that it took 45 seconds, right?
[00:05:11] So in this particular case, I was doing myself and my company a big disservice by focusing on something not important. So those things are only important when a user's being affected or it's causing downtime or things like that. If you're just making a computer wait a little bit longer, who cares, right?
[00:05:32] Who cares that that job took 45 seconds? No one. I guess, except me, right? I cared [LAUGH]. This is probably the most important thing in here. Readability and maintainability are actually the most important things in code. I think I heard Kyle Simpson say this, I don't know if he invented it.
[00:05:55] He's a smart guy, but it might have been someone else. But he said that code is communication and that it's more important that a human can read it than a computer can read it. And that really challenged my worldview, the first time I heard it. And then I realized further and further it's like, if the only thing was important was computer time, we would just be writing assembly, right?
[00:06:34] It's a lot easier to write as well, but it's a lot easier to read and it's a lot easier to maintain. Why do we like TypeScript? TypeScript is even easier to maintain. I maintain that this is at all kind of derived from the fact that code is communication.
[00:06:50] Code is communication to your future self, it's communication to future developers that have to maintain your code. So always, always optimize for readability and maintainability. If you have something that's very simple and it's n squared and you have something that's hyper optimized and it's n, the n squared one might still be the better answer, if it's a lot easier for you to maintain later.
[00:07:13] Only go for that n code, if the constraints actually call for it, right? So that's what I'm saying here, readability and maintainability should be number 1 on your list, and only optimize when you actually need to optimize. Always err on the side of simple. If you can choose from complex and simple, simple code tends to just have less bugs because you can understand it more holistically.
[00:07:43] So where you can, try and write simple code. Human time is almost always more valuable than computer time. In the sense of what I'm saying is, if you can throw an additional server at the problem and it solves your problem, as opposed to making you write a lot more code, usually, it's cheaper to run a server than to pay you to write that code.
[00:08:04] So usually it's the right choice for you to spend your time doing something else. Talked about this already, but don't prematurely optimize. Try to have a perf problem before you solve a perf problem. Again, these are high level principles. I refer you back to rule number 1 here is that there are no rules.
[00:08:21] Sometimes you can look at something as, I know this is gonna be a problem and I'm going to fix it right now. But just know that you rarely know your problems you're about to have. None of us are code psychics, or if you are, I would like to hire you [LAUGH].
[00:09:20] So they can drop into the internals of the engine, sort your code and hand it back to you frequently faster than you can do it yourself. But the other thing I'm gonna say is that .sort has been so tested that it's pretty much guaranteed to be bug free.
[00:09:37] If you hit a bug on .sort, you are doing some just wild stuff and one, I congratulate you [LAUGH]. And two, it would actually be really cool thing for you to go fix. But suffice to say, .sort has been called on just about every website out there on the Internet right now, which means that you're probably not gonna hit the edge case on it.
[00:09:59] Whereas, if you write your own sorting algorithm, I guarantee you're gonna have some bugs that you just don't think about. So where you can, try and lean into what other people have done. Install that module off of NPM, call the .sort method, whatever you need to do, but you'll have less surface area for bugs.
[00:10:18] And again, I'd rather spend my time writing new and novel cool stuff for my company, rather than solving bugs. I don't know, maybe I'm strange in that capacity, but I don't think so. So the question is, you go out there you do what I just tell you, which is, do the unoptimized thing, do the simple thing, and then you hit hockey stick growth.
[00:10:41] And now instead of having 10 users, you now have 100,000 users and you find that you're gonna run into a lot of performance problems. So one, congratulations, [LAUGH] you're living the dream right there, which is that you hit scaling problems. And that's kind of the advice that I give startups, it's the advice that I give developers, try and have good problems.
[00:11:10] And having scaling problems are good problems. Now they're terrifying and they're very difficult to solve, right? But it's why people say never write Ruby on Rails. It's this unoptimized things and look, Twitter had to rewrite and GitHub had to rewrite and blah, blah, blah, and all that kind of stuff.
[00:11:27] Well, one, they're billion dollar companies, and they can now pay people to solve those problems. And if you hit hockey stick growth and go from 10 to 100,000 users, I hope that you're making enough money now that you can hire someone to go back and solve those scaling problems.
[00:11:42] I think the YC thing, the Y Combinator, which is the name of a VC firm is, do unscalable things. I quite agree with that in the sense of, again, normally you're not gonna, when you have 10 users, you typically don't know what problems you're actually gonna have when you have 100,000 users.
[00:12:03] You only know that you're gonna have those problems when you hit 100,000 users. Which means you're probably not gonna solve the right problems. You think, okay, this is a bottleneck, this for loop is definitely gonna get bugged down, and frequently it's actually over here and it's a different problem and it's way bigger, right?
[00:12:21] So yes, that's gonna happen and you should feel very happy when those things happen. Does that answer your question? Cool, other questions?
>> I have one follow up question. So should we wait for that situation when there is an increase for user audience in our app, or we need to do some performance testing and fix code?
[00:12:49] And when should we do that performance testing?
>> Yes, I think that the crux of that question is, when should we do that user tests, load testing and how much should we do in those kinda things? It depends, [LAUGH] I think the answer to that question is, how many users do you anticipate in the very near term are gonna be using your application, right?
[00:13:21] So let's say I work at a startup and that startup has 10,000 users daily. I would probably try and load tests for up to 30,000 users or something like that, or 50,000 or 100,000 or something like that. Something that's within an order of magnitude difference, so that I'm not gonna just fall over my next, kinda scale up next little growth area.
[00:13:44] But I'm not gonna use or test, or I'm not gonna load test for a million. I'm not gonna load test for a billion, right? Really only Facebook gets to do that. [LAUGH] So, yeah, try and keep it within, Growing distance, within sight. But it's gonna really depend on what you anticipate growth and then I guess it also depends on what else is pressing, right?
[00:14:09] If you're doing that instead of shipping new critical features, then you probably wanna minimize that and go work on the critical features. So it's really a pretty delicate balance depending on where you are in your startup's growth pattern.
>> So what you're saying mean that we need to be prepared for the most nearest future.
>> Not far away, but in a way for reasonable future, yeah, okay.
>> Yeah, I think that's a good summation is be prepared for the visible future. Cuz it's possible to spend just infinite cycles on optimization, and that's not useful to anybody, right? What's useful is providing value for your users.
[00:15:00] Let's go with unnamed company, one that many of you have probably heard of. I had one particular problem, their search bar was incredibly slow, just while the inappropriately slow. And I was so upset, and I still use the products I liked the product, but the search bar drove me just bananas.
[00:15:20] And I asked one of the engineers, I was like, why is it so slow? I'm so upset all the time. [LAUGH] And basically what they said is, we looked at that and is just the gnarliest problem of how we did this when we first started it. They were loading just this huge amount of data set into a device that has no business loading that much data.
[00:15:39] And they basically said, it's gonna require a huge rewrite of this, and we have to ship a bunch of stuff before we can get back to this performance problem. And I was upset on one hand as a user, on the other hand, I totally agree with that approach of it did work, and those features that they were shipping, they did need.
[00:15:57] So in that case, ignoring the performance problem temporarily was actually the correct call. And it's fixed now, right? I still give them endless crap about it, and they deserve it, yeah. Anyway, it's a balance. Okay, let's get into writing some code. You're all probably sick of listening to me talk at you.
[00:16:24] We are gonna get into Bubble Sort.