Lesson Description
The "Caching" Lesson is part of the full, Backend System Design course featured in this preview video. Here's what you'd learn in this lesson:
Jem introduces caching as storing frequently used data close to where it is needed to reduce latency and load. He covers common caching strategies and their trade-offs between performance, consistency, and complexity.
Transcript from the "Caching" Lesson
[00:00:00]
>> Jem Young: All right, let's get to it. It's a fun one. They're all fun. Let's talk about caching. Caching is a form of data storage. That's why we say data storage, not database. Let me softball this one out. What's the difference between, say, caching and writing to a database or something like that? Are they the same thing? Caching is temporary? Yeah. Any others? If you've lost power, your cache might go down, but your database is permanent. Well, not truly permanent, but yeah.
[00:00:44]
Would an analogy be that in a computer, the hard drive is like a database, but the RAM is like a cache? That's a fair analogy, yeah. You could have a disk cache though, so it's not guaranteed we're always writing to memory. But generally, when we think about a cache, yeah, it's a form of data storage. You're not doing complex queries on a cache because a cache should be fast. So with a cache, we think key-value store.
[00:01:12]
When we say key-value store, you're thinking of something that starts with an R? Redis, thank you. Yeah, they're just hacks. It's always going to be Redis. Key-value store, cache, it's probably going to be Redis. Caching is about storing frequently accessed data as close as you can to the thing that needs that data. So there are different levels of caching, which we'll talk about here. That's really the difference between, say, a cache and a database.
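To make that key-value idea concrete, here's a minimal sketch (not from the lesson itself) of the get/set-with-expiry shape a cache like Redis gives you, written in TypeScript with an in-memory Map standing in for the real thing; the class and key names are made up for illustration.

```ts
// Minimal in-memory key-value cache with a TTL, standing in for something like Redis.
type Entry<V> = { value: V; expiresAt: number };

class KeyValueCache<V> {
  private store = new Map<string, Entry<V>>();

  // ttlMs: how long the entry stays valid, in milliseconds
  set(key: string, value: V, ttlMs: number): void {
    this.store.set(key, { value, expiresAt: Date.now() + ttlMs });
  }

  get(key: string): V | undefined {
    const entry = this.store.get(key);
    if (!entry) return undefined;           // cache miss
    if (Date.now() > entry.expiresAt) {     // expired entry counts as a miss
      this.store.delete(key);
      return undefined;
    }
    return entry.value;                     // cache hit
  }
}

const cache = new KeyValueCache<string>();
cache.set("user:42:name", "Jem", 60_000);   // cache for one minute
console.log(cache.get("user:42:name"));     // "Jem"
```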
[00:01:41]
The database is going to be much further down your stack generally. A cache could be at different layers, and you could have a cache at every layer, but it's about what data do I need frequently and how close can I get it to the thing that's making the call? And I have this very intuitive diagram explaining why we cache. Which one's going to be faster? Not a trick question. It's going to be the shortest hop.
[00:02:09]
So if we can make a client cache, oh, that's the dream. Client caching is the dream, as much as you can. And you can have caching at every layer, and the more caching you have at each layer, the less you have to go further down the stack. The fewer hops you have to make, the faster and lower latency your network requests are going to be. So we're always going to be caching. Do I have that here?
[00:02:38]
Yes. Always be caching. ABC. It's not a free lunch though. There are trade-offs with caching because, you know, nothing's free, but caching is a very good strategy. If you're talking about improving your performance, the first thing most experienced engineers are going to say is, do you have a cache? It's like a cheat code to performance. So caching is going to reduce latency, which is going to improve your user experience.
[00:03:05]
Hey, that was snappy. That snappiness, that's the cache. It's going to decrease your system load because we're not making all these hops to the database anymore, so that's nice. And it's going to be cheaper. We touched on costs here. Cost doesn't usually come up when we're talking about designing a system because it's all paper designs, but in real life, a cache is going to save you a lot of money because it turns out, again, data access is not evenly distributed.
[00:03:31]
There's going to be data that's frequently accessed. If data is frequently accessed, just put it in the cache, and you don't have to do a lookup. So there are lots of different types of caches. There are caches at every layer. Local storage is, you know, used as a cache on the client. HTTP has a built-in cache that does it automatically for you. It's pretty great. Your API is going to have a cache.
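As a rough illustration of that built-in HTTP cache (this snippet isn't from the lesson), here's roughly how a server opts a response into it with a Cache-Control header, using Node's built-in http module; the port and max-age values are arbitrary.

```ts
import { createServer } from "node:http";

// Tell browsers and intermediaries they may reuse this response for an hour
// instead of coming back to the server every time.
createServer((_req, res) => {
  res.setHeader("Cache-Control", "public, max-age=3600"); // cache for 1 hour
  res.setHeader("Content-Type", "application/json");
  res.end(JSON.stringify({ message: "cache me" }));
}).listen(3000);
```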
[00:04:00]
If you have really expensive computations, I don't know, I need to think of better examples, but calculating stock market real-time diffs or something like that, I don't know, something some quants are doing, you can pre-compute those calculations and save them instead of doing it every single time. You could have an in-memory cache, so it's writing to memory. That's something you all do automatically most of the time.
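Here's a small sketch of that kind of pre-computed, in-memory cache (again, not the lesson's own code); computeDiff is a hypothetical stand-in for whatever calculation is actually expensive.

```ts
// A tiny in-memory cache for an expensive, pre-computable calculation.
const diffCache = new Map<string, number>();

function computeDiff(symbol: string): number {
  // pretend this is a heavy calculation (the quant example from above)
  return symbol.length * Math.random();
}

function cachedDiff(symbol: string): number {
  const hit = diffCache.get(symbol);
  if (hit !== undefined) return hit;     // already computed: serve from memory
  const result = computeDiff(symbol);    // expensive path, only on a miss
  diffCache.set(symbol, result);
  return result;
}
```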
[00:04:26]
You're just writing to memory, very, very cheap cache, very, very easy to use, but you can also write to disk if you want. That also works. Databases have caches built into them. That's the cool thing about database technology. I know I said they're not my favorite thing to work with, but under the hood, all of them have caches. You can have a query cache, so, hey, here's a query that people run all the time, select star from some user table.
[00:04:55]
That's going to get cached because that's a common query, and it's an expensive one to pull all that in. Yeah, do you have to explicitly set that in the database for it to cache, or does it just do it automatically? Yep, yeah. It's pretty cool. You can configure it, everything's configurable if you really want to, but yeah, it's better to let the database figure that out for you. Hey, this has been a query that's been called 10 times in the last second.
[00:05:24]
I'm just going to cache this. Cool. You can put a cache in front of your database, which we'll talk about in a second here, but you can preempt that as well, so you can save the database cache for maybe tertiary results, ones that aren't as popular but are still somewhat popular, that you wouldn't want in your main cache. And CDNs, what do they stand for? Content delivery networks, yeah. Yeah, we all use CDNs.
[00:05:56]
CDNs are everywhere. It's this idea of, hey, what are our static assets, and we put them as close to the user as we can. Open Connect is an example of a CDN. It's Netflix's content delivery network, the biggest one in the world, actually, and the appliances actually sit in, not data centers, ISPs. There are little Open Connect appliances in ISPs around the world. You probably have one somewhere close to you.
[00:06:26]
That's how Netflix is so fast. Literally hardware that's in the ISP? Yep, yeah, it's one of the few pieces of hardware we actually produce ourselves. But yeah, maybe some Netflix lore: that came about because Netflix takes up, you know, 30% of internet traffic, which makes a lot of people unhappy, and we're not going to get into a debate about common carriers and whether the internet should just be a bunch of dumb pipes or whether you should have to pay for metered use of the pipes.
[00:06:56]
But you know, we avoid all that debate by just saying, cool, we hear you, we don't want to use up all your bandwidth, let us put Open Connect, our CDN, in your data center. Better for you, better for our customers, win-win. Well, the trade-off there is it's complex, so we have to have a team to manage all that, but it's a great use of caching. A fantastic use of caching. And we ask, which one of these caches should we use?
[00:07:21]
All of them, as much as possible, wherever possible, you always want to be caching. A well-implemented system is going to have caching at every level. So this isn't just a nice-to-have, it's something you should be doing automatically. Now, because we're still talking about data storage, nothing's ever straightforward and there are different ways of doing things.
[00:07:50]
So you have lazy loading, sometimes called cache-aside. So what happens is, and we won't talk about a specific database here, you can take this and put it wherever you'd like. The same pattern applies wherever you'd like to implement a cache. The first thing that happens is the server, or whatever it is, says, hey, let me check the cache to see if the data I need exists. If it doesn't exist, let me read from my data store.
[00:08:20]
And then we're going to update the cache. The benefit here with cache-aside is that you're not caching everything, you're only caching things that were actual real requests. Deciding what to cache is an art in itself, and lazy loading helps with that because it's a real request. Hey, this request is coming in. We know that at some point someone needed it. Versus some of the other strategies, where it just caches everything.
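A minimal sketch of that cache-aside read path, assuming plain Maps as stand-ins for the cache and the data store (the names here are made up, not from the lesson):

```ts
// Cache-aside (lazy loading): the application checks the cache first, falls
// back to the data store on a miss, then fills the cache itself. Only data
// that was actually requested ever ends up cached.
const cacheStore = new Map<string, string>();
const dbStore = new Map<string, string>([["user:1", "Jem"]]);

async function readCacheAside(key: string): Promise<string | undefined> {
  const hit = cacheStore.get(key);
  if (hit !== undefined) return hit;   // cache hit: no extra hop

  const value = dbStore.get(key);      // cache miss: go to the data store
  if (value !== undefined) {
    cacheStore.set(key, value);        // populate the cache for next time
  }
  return value;
}
```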
[00:08:47]
So that was a read scenario, and if you're writing, it's even simpler: it just writes to the database and writes to the cache at the same time. Cache-aside is going to be the most common caching strategy for most things because it's pretty straightforward to reason about. Then you have a write-through cache. The service is going to write to the cache and then write to the database. So it's kind of like the cache is in between.
[00:09:23]
The tricky thing here is everything's getting cached. You could have logic in the caching service itself to say, here's what I actually want to cache, here's what I don't want to cache, but what's the trade-off there? Computation. Yeah, you're slowing down the cache at that point. It's not just a dumb, I take some data, I write it to disk or memory. I have to think about it.
[00:09:46]
I have to do computation, which is going to slow down the cache. That's a trade-off you might want to make. But that's the difference between write-through and, say, cache-aside: with write-through, everything's getting cached, versus cache-aside, where the server can decide, do I actually want to cache this or not. There's already some latency there, but at that point the service is already returning that data back, so you're not really hindering the user in any way.
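And a comparable sketch of write-through, under the same assumptions (plain Maps standing in for the cache and the database; not the lesson's own code):

```ts
// Write-through: the service writes to the cache first, then to the database,
// and only returns once both writes are done, so the cache always has the
// freshest copy. The downside: everything gets cached.
const cacheStore = new Map<string, string>();
const dbStore = new Map<string, string>();

async function writeThrough(key: string, value: string): Promise<void> {
  cacheStore.set(key, value);   // the cache gets the write first
  dbStore.set(key, value);      // then the database; only now does the caller continue
}
```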
[00:10:19]
It's more that you're adding more load on the application, but you're keeping latency low. So then you have this idea of a read-through, which is, hey, I'm going to read from the cache, and if there's a cache hit, I'm just going to return immediately. If there's a cache miss, I read from the database, and then when it comes back through, I write it to the cache. So in this one, the cache itself is doing all the work.
[00:10:45]
Which again, that's a trade-off you think about. Do I want my cache to do anything or do I want to just keep it really, really easy? This is a really simple one. I know it seems like a lot of arrows, but it's kind of exactly how you think a cache would work. Hey, does this thing exist? Nope, let me make another hop. When it comes back through, I'm just going to write it to the disk on my way through.
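A rough sketch of read-through, where the cache owns the miss handling; the loader callback here is a hypothetical database lookup, not anything from the lesson:

```ts
// Read-through: callers only ever talk to the cache, and the cache itself
// handles misses by loading from the database and writing the result on the
// way back through.
class ReadThroughCache {
  private store = new Map<string, string>();

  constructor(private loadFromDb: (key: string) => Promise<string | undefined>) {}

  async get(key: string): Promise<string | undefined> {
    const hit = this.store.get(key);
    if (hit !== undefined) return hit;                    // cache hit: return immediately

    const value = await this.loadFromDb(key);             // cache miss: the cache reads the db
    if (value !== undefined) this.store.set(key, value);  // write it on the way through
    return value;
  }
}

// The caller never touches the database directly:
const db = new Map([["user:1", "Jem"]]);
const userCache = new ReadThroughCache(async (key) => db.get(key));
```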
[00:11:12]
There could be extra latency here though, because you are making multiple hops, but it's a pretty straightforward way of thinking about caching. If you had a very read-heavy workload, this would be perfect for that because most things are already going to be in the cache, you don't even have to check the database. Then you have write-behind. This is an interesting one. So write-behind, the application or service immediately writes everything to the cache and then returns.
[00:11:49]
Then later, at some point, the cache says, oh, I'm going to write to the database. Which is pretty interesting. With most transactions you do, you wait for the database to say, hey, this transaction was good, now I can return. Otherwise, hey, I'm going to retry and then I'm going to return. But this one automatically says, cool, you did it, you wrote to the database, so long. The application doesn't even know that it wasn't actually writing to the database.
[00:12:28]
So you have some advantages here and some disadvantages. What's the advantage? Performance, more available. Performance, yeah. Yeah, more available. It's faster, much faster. You could run your database at a smaller scale, in theory. Yeah. Will this continue to retry if it fails, like the cache writing to the database? Is it kind of like a contract, like, okay, I've committed to doing this, so I will continue to try until it happens?
[00:12:52]
That's where it gets tricky, because it didn't actually write to the database yet, so we can't guarantee that data is stored. This is a lot faster. Writing to a database is going to be one of the slower operations, and by slow I mean, you know, it's still very fast, but in terms of computing time it's one of the slowest things you can do, especially in a transactional system, because a transaction has to write the transaction and then run through and make sure, hey, this actually lines up with the database schema and everything, and then it returns, so it's going to be pretty slow.
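A rough sketch of write-behind under the same assumptions: the caller's write returns immediately, and a hypothetical flush loop pushes dirty keys to the database later.

```ts
// Write-behind: the service writes to the cache and returns right away; the
// cache flushes to the database later, asynchronously. Very fast, but if that
// flush never happens, the write is lost; there's no consistency guarantee.
class WriteBehindCache {
  private store = new Map<string, string>();
  private dirty = new Set<string>();

  constructor(private writeToDb: (key: string, value: string) => Promise<void>) {
    setInterval(() => void this.flush(), 5_000); // hypothetical flush interval
  }

  set(key: string, value: string): void {
    this.store.set(key, value);   // caller is done here: no waiting on the database
    this.dirty.add(key);
  }

  private async flush(): Promise<void> {
    for (const key of this.dirty) {
      await this.writeToDb(key, this.store.get(key)!); // best effort, some time later
      this.dirty.delete(key);
    }
  }
}
```

A real system would batch, retry, and worry about what happens if the process dies before the flush runs, which is exactly the consistency trade-off discussed next.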
[00:13:31]
This is going to be faster, by far. However, we're not guaranteed consistency. So Nick, like you're saying, highly available, not guaranteed consistency, that's the trade-off we're making here. Is this Google Drive? Is this like an example of, if you're on your desktop machine and you're using Google Drive, I feel like it always gets messed up. That's where your files go to die if you have the desktop solution for it.
[00:14:03]
Yeah, so that's an interesting example. If we're implementing autosave, that's an interesting one to think about. I could cheat, and this is what a lot of systems do. I'm going to say your data is autosaved, but actually what it's doing is writing to your local cache, and at some point, whenever the network's available, then we're actually going to send it through, but you don't know that, because generally we consider the network pretty reliable.
[00:14:32]
It all works out. That's how you get into a weird state sometimes. You're like, hey, it said it's autosaved, but you open up on a different machine and it's not there. Why? Because it didn't actually do that. It was asynchronously doing it. But whenever something's async, you can't guarantee it happens. So is this the concept behind optimistic updates? Yeah, you mean as a user, like a user pattern, yeah, because on the frontend you're like, let's hope it works, and then if it doesn't, you'll get an error later.
[00:15:02]
Yeah, optimistic updates are this idea of, hey, kind of like write-behind, hey, it happened, don't worry about it, even though it didn't actually happen yet; we're telling the user that for the sake of performance. And it's really serving the 99% case, because 99% of the time it's eventually going to write and it's all good. You don't have to worry about it. However, anybody who's done weird UI stuff has seen it: hey, it said it saved and then it didn't save, and maybe it unchecks that box or something like that. That's what happens.
Hey, it was an optimistic update. It told me it did it, but it didn't actually happen. And then the system comes back later and is like, hey, that write didn't happen, send it again, and then you get into a weird user state. But that's a trade-off you have to make. Do I want to serve the 99% and not think about the edge cases, or do I want to make sure the edge cases are all covered?
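A small sketch of that optimistic-update pattern (not from the lesson; showSaved, showError, and sendToServer are hypothetical UI and network helpers):

```ts
// Optimistic update: tell the user the save succeeded right away, send the
// write in the background, and only surface a problem if it later fails.
async function sendToServer(_data: string): Promise<void> {
  // imagine a fetch() to the backend here
}

function showSaved(): void { console.log("Saved"); }
function showError(): void { console.log("Save failed, please retry"); }

function saveOptimistically(data: string): void {
  showSaved();                    // the 99% case: claim success up front
  sendToServer(data).catch(() => {
    showError();                  // the 1% case: the write didn't happen, tell the user
  });
}
```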