Practical Problem Solving with Algorithms Optimizing the Lookup Function
Transcript from the "Optimizing the Lookup Function" Lesson
>> So I wanna talk about several things that are less than optimal, we'll say. And again, I wanna start with the LOOKUP function, because that one was a little bit easier to reason about, we didn't have to do quite so much thinking to get to that solution. The fact that we had to do anything in the LOOKUP function should be a little bit of a code smell to us.
[00:00:24] We have all of that information there. We know all of the symbols upfront, that's not a thing that needs to be computed later. We know all of that, and we're not saving any of that information, we're not remembering any of that information. Just think about the fact that we call the LOOKUP, you could call it multiple times with the same symbol and it would research every time.
[00:00:45] It's not remembering any of its work, and it's having to search potentially through the whole list every single time. So there are red flags that should be going off all over the place, just about those few for lines of code. Now, this is not gonna break anybody's performance bank.
[00:01:03] This is not gonna be the reason why your app fails, something like this, but it's a paper cut. In fact it's a couple dozen paper cuts and we know that paper cuts add up, you've heard of that adage the death of 1,000 paper cuts, we know that they add up.
[00:01:20] And an algorithmist says, if I can avoid the paper cuts, I should. Can't always avoid them, you're not gonna get a perfect solution. But if I can look at something and I've got a red flag that tells me there's a paper cut there that I could avoid, then we should think about it.
[00:01:37] And in fact, here's another principle, it could have been one of the ones that I put on the slides earlier at the beginning of the workshop, in the intro to the workshop. Here's another one of those takeaways. If you know something up front, remember it. Don't throw it away and recompute it later.
[00:01:53] Don't be lazy like, I can always recompute the answer to that question later. If you've discovered it, remember it. That's gonna be a key to developing more optimal algorithms. So when we load up the whole list of elements here, we effectively know all of the mappings between an element's symbol name and the element entry, but we're not remembering any of that when we throw it all into an array.
[00:02:22] So the first thing that we're gonna do is remember that. And you might recall that I discussed this notion of creating indexes, we're not gonna duplicate our data structure, our data structure is what our data structure is. That's how it comes from the JSON, there's no reason for us to construct a new data structure in this case.
[00:02:40] We're just gonna leave it that way. But we are gonna construct an index that allows us to look up something with a single operation rather than having to loop through the whole thing. Constructing an index sounds a whole lot more complicated and sophisticated than it actually is. So, go back to your co-workers the next time you're at work and say, I constructed an index and it increased the optimization of my algorithm.
[00:03:04] And they'll buy you lunch that day, I promise. Like cool, constructed an index to optimize the algorithm, is pretty cool. All we're gonna do is create another object with some properties in it, it's not that fancy, but it is gonna get the job done. All right, so first thing I wanna do is, in addition to this element, so I'm gonna create a variable here that I'm gonna call symbols, that's gonna be my index, it's gonna be an object, it's gonna be my index.
[00:03:29] So I'll go ahead and initialize it to an empty object. And this is the right place to do that work, I'm not gonna lazily discover these indexes, I'm gonna upfront do the work. Technically, you could lazily discover and build the index as you go, but we're just gonna build it all up front, especially since it's only 118 elements, no big deal.
[00:03:52] So what am I gonna do? After I got that array, I'm gonna go through the elements. I do a for of them. I occasionally get questions, why don't you use for each or why aren't you getting maps and reduces here? I'm not doing any maps and reduces cuz this is not the functional programming course and that would be more mental context for me to throw at you.
[00:04:15] Hey, now you got to go learn what a MapReduce is, map or reduce or filter whatever. I don't used those unless I'm doing functional programming, so that's the answer that. I use forEach because that's not, it's neither good imperative coding nor is it functional programming. It's just the bastard child, so I don't use forEach.
[00:04:32] I'm gonna use For Loop 9,000 times out of 10. All right, so I got my for of loop here, let me zoom back in because I had zoomed out a little bit, it's a little hard to read. We got our for of loop here. This is about as straightforward as you could probably make it.
[00:04:48] Remember we have symbols, which is an object. And we're just gonna store properties in this symbol and what property name do you think we ought to use? Anybody think what property? Well, let me show you what we're gonna assign. We're gonna assign the element. So what are we gonna put here for the property name in the symbols object?
[00:05:09] Anybody got an idea? What's our task at hand? What are we trying to do? What is the LOOKUP function trying to do? Yes.
>> I've got a couple of different answers here.
>> Symbol, element dot symbol, number.
>> Could use number but those are not probably the best keys for us to use, and I'll explain why.
[00:05:45] Any others?
>> Lots of people saying symbol.
>> Okay, we are gonna wanna use the symbol as the key but I just wanna make sure everybody understands why, because that's what the LOOKUP function is doing. The LOOKUP function is mapping from a symbol to an element. So we're gonna create an index that maps from the symbol, the string, the lowercase one or two characters which is unique to the element entry.
[00:06:13] So that's what we're gonna do up here. We're gonna say symbols of, and the property name that we're gonna use is element.symbol.toLowerCase. So, at the end of that, if we were to log out that object, we're gonna have an object with a whole bunch of properties in it, 118 properties to be specific, and each one of those is a string property that has a value.
[00:06:39] That is not a copy of the element entry, it is simply a link, a reference into our main data structure, our array, it's pointing at each one of the entries in that array. And now with this, separate data structure I'll call it, with this index having been created, this code gets real easy.
[00:07:03] Return symbols [LAUGH] of elementSymbol. You see why this is better, because we did all of that looping. We did 118 iterations loop once up front and remembered some information, instead of re-looping for every time we needed to do the lookup. A little hard to show both those snippets on the same screen, but this is the important stuff.
[00:07:41] We're doing that loop here, this is effectively building an index. I'm using some air quotes here, that's not quite as sophisticated as a database index might be. But we're basically doing a helper data structure that's gonna allow us to index into the main data structure in what we call an O of one operation.
[00:09:03] When I say O of one for something like that, big O notation, which we'll come back to a bit later in the workshop. But big O notation is a way that computer scientists talk about the complexity both the time processing complexity and the memory complexity that a particular algorithm takes, in best case, average case, and worst case.
[00:09:28] It's a way of describing what's the worst possible outcome if this algorithm was given the worst possible data. What's its worst possible performance, that's what we call this big O. O of 1 is what we call constant time because it's a single operation, single unit. We'll talk about all the other ones that are above it, but it doesn't get any better than constant time other than no operation at all, which I guess the best operation is when you don't perform.
[00:09:58] But if you got to perform an operation, the best performance is O of 1. Yes.
>> So we just turned this from O of n to O of 1?
>> Yeah, exactly. So actually, it was a little bit worse than O of n, because we're doing O of n for each letter, or for each symbol, I guess.
[00:10:18] So it was really kind of like O of n times m, m being the number of symbols in our return value. And we went from whatever n times m was, which wasn't big, but whatever it was, to 1. So that's good.