Transcript from the "LRU Cache" Lesson
[00:00:12] And he gave the interview and he gave me a really hard Rx problem, and of course at that point I was really good at interviewing, so what's the thing I did? I convinced the interviewer to give me the answer through a series of questions, and I just wrote out what he was saying, like, yeah, it's exactly what I was doing.
[00:00:30] Switch latest, switch, switch latest, wrote out his exact answer and he told the other person interviewing that I was the best Rx candidate he has interviewed to that date and I didn't know anything that happened. I couldn't even tell you what I programmed, it was terrible, but sometimes you got to play the game, okay, and I played the game.
[00:00:50] All right, so what is an LRU so for those that don't know what an LRU is, it is a fantastic data structure and ultimately at Netflix I did end up implementing an LRU that did run for quite a few years on the television platform and on the website.
[00:01:06] And for a little bit even on mobile before we switch to native. So an LRU stands for least recently used, and what that means it's a caching mechanism that says that we will evict the least recently used item. Now there's a couple of things about caching that makes it unique in a sense.
[00:01:29] So let's think about how we would do that, how would we accomplish first the eviction of the least recently used. Well, one thing we can think about is that cash really is some sort of container, right? We're gonna build some sort of node container based system in which contains some sort of value.
[00:01:47] Now this value is what the user gave us to cache so you can think of website caching when you go to www.google.com, it's gonna pull up likely a lot of cache, right? And it has ways in which the VIX the cache based on expiry headers blah blah there's I'm sure there's plenty of different ways in which it can do expiration.
[00:02:05] So anyways we have a value we want to store it, bu at least recently used cache has one more item in which it has a linked list that has other values inside of it, so value 0, value 1, value 2, value 3 and so forth, right, it just keeps on going for some amount of time.
[00:02:30] Now the thing about this is that all these values need to be stored in such a way that if the user says hey, do I have value 2? I should be able to pull out value 2 and say yes you do. Here's the value, but more than just that, I need to take value 2, and I need to move it to the head of the queue, right?
[00:02:56] Because now it has been used. So here I'll just go like this. So now value 1 connects to value 3. And value 2 is now at the front of the list, this is our most recently used, this is our least recently used, item within the cache. So that is the basic idea of an LRU, now the execution of LRU gets a little bit tricky so obviously, what's a data structure we're going to need to use for an LRU.
>> The doubly linked.
>> Doubly linked list is a great answer, right, because I mean [LAUGH] if you look right there, obviously it's a doubly linked list, we're gonna definitely want to use that but there's a second more important problem that's happening. How did I ask for value 2?
[00:03:57] Well, what do you think back to our Falkor days, this is where I implemented it for we've talked about this multiple times, we would hand in some sort of path that looks like video 1, 2, 3, I don't know, title. So I need some sort of way to be able to look this up quickly.
[00:04:15] Now in the Falkor case, it was a trial, right, it was a trial that we discussed earlier. But in this case, it doesn't actually have to be a trial, does it? No, it can just be any key, right? I need some sort of key, so you can imagine, hey, I want to request this piece of data from the back end.
[00:04:32] Do I have this piece of data in my cache? So some sort of way in which you can key, the value so you can go right to it. So what does that sounds like?
>> HashMap, my goodness. It's like we were just discussing this, so we also have the potential for a HashMap here, but how can we use these two things at the exact same time?
[00:04:55] That is really the hard question, right? Well, hold your horses right there. So the idea is this is that you have a HashMap that has two generics, right? There's always two generics associated with a HashMap a key and a value. But the problem is, of course, that our key and our value, well, what is our value?
[00:05:22] Well, our value needs to be a node within the link list so we can instantaneously jump to that point within the list. So if you think about it, this could solves the traversing effect of a linked list, you don't have to go through it sequentially you jump right to the value that exists, if it doesn't you don't jump right to it so we used a tree.
[00:05:42] Inside a trial tree, a pixel tree, a dizzal tree whatever they call it, we use that to jump straight to the value. Here we're gonna use a key of some sort to jump straight to the value, and the value is actually gonna be a container class of node, which means we're literally going to have a HashMap that has items inside of it that also have pointers to other items.
[00:06:09] Right, and so makes it pretty kind of confusing. You can see why this is probably not the coolest question asked somebody when I get to an interview, I don't feel like this is the greatest question, I've literally asked this question too. So I have to feel somewhat bad.
[00:06:23] But that's the effective idea of an LRU. And I figured we could try to implement one of these, I've only implemented it once in the last month, so hopefully it doesn't just completely break down live right here. But let's talk about running time before we do that. What is a HashMap lookup?
>> Constant, 0, 1, let's go. What is the breaking from a linked list and what's the insertion in the front of a linked list?
>> Depends on the traversal, right? So if the traversal to the node is constant, it does not matter how many nodes are within the link list, you have to break two connections on each side to pull it out, you have to put two connections to insert it into the front plus adjusting the head.
[00:07:17] So a total of 7 0(1) operations, pretty neat. So that means we're gonna have to have some bookkeeping to be able to accomplish this. What we're really doing right now is we're now composing data structures, we have left the realm of simply defining a data structure. We are now composing them using multiple of them to create a pretty nice behavior, and we have actually unconstrained ourselves from a linked list.
[00:07:45] And so it makes it kind of a unique way to be able to accomplish this. All right, so we can probably just jump right into programming. Shall we jump right into programming and see how this thing is done? Let's see if I can see how this thing is done, all right-
>> Sorry, where did the 7 come from?
>> The 7 the 7 yes, well, when you remember when you break a note out of here, you need to take its previous and pointed to your next one operation, you need to take if your next and pointed to the previous two operations.
[00:08:13] You need to take your next and set it to undefined three or that'd be three operations, you need to set your previous undefined four operations, we've now broken it out of the linked list and made sure that the connections are good. Second, we have to come up here.
[00:08:26] We have to set our next to the head. We have to set our heads previous to the value we're about to insert, then we have to set our head, three operations. So seven operations that all take O of one time. Boom, so we effectively just do a prepend, we do a delete prepend.
[00:08:44] Yeah, yeah, now the tricky part about this is when we actually remove an item from the LRU, so how we remove an item from the LRU is we do need to still have a tail pointer, right? We need to be able to point to the we are able to point to the front of the list, the front is for when we update the cache, the tail is for when we need to remove from the cache an item.
[00:09:08] So let's say we set our capacity to 10, the moment we have 11 items inside of our cache, we now need to downgrade and remove the oldest possible item which always will be at the tail because every time we get something, we have to update the cache. Make sense?
[00:09:22] Awesome. Now I did have a problem with the unit test earlier, I believe I fixed it. Hopefully it didn't leave it on my previous computer and then not commit the fix unit tests just disastrous. And so for those I assume we remember the tail removing operation, the pop operation, effectively it's a 3 step operation.
[00:09:42] We set the tail to the previous, right, tailed off brief. Or we don't actually, then we also save off a value to the tail then we unhook the saved off value to the tail and we unhook it the other direction and then we can return out the tail.
[00:09:57] So that would be that type of operation. In an LRU, we don't return out the last item in the cache, we just get rid of it. We no longer need it, if it isn't in a more traditional language, we'd have to free it or delete it. So awesome.
[00:10:11] So is there any other questions with the operations of an LRU? I know that was pretty brief. But I made the assumption that you remember how a map works. You remember how a linked list works. So we're just combining these two data structures to make a superstructure. I always wanna use the term superstructure.