Video Upload Architecture & Services

Netflix

Lesson Description

The "Video Upload Architecture & Services" lesson is part of the full Backend System Design course featured in this preview video. Here's what you'd learn in this lesson:

Jem outlines a video upload system with a web server, source video storage, a notification service, a processing service, and a metadata database, emphasizing asynchronous workflows and clear separation of responsibilities.

Transcript from the "Video Upload Architecture & Services" Lesson

[00:00:00]
>> Jem Young: OK, we have a pretty good workflow here. Are we ready to start drawing this out? Where's the enthusiasm? Remember I was like, oh Jem, what are we going to diagram? More like, here we are. It's that time. It's like, uh, I don't know about this. In my case, it was more that I was wondering if this was, like, not quite a trick, because I know you said you weren't going to be doing this anymore, but more like if there's any extra steps you should be doing.

[00:00:32]
This is my own fault for like asking so many rhetorical questions. Like, what do you think, is that correct? Do you think that's the right answer here? Do we care about error handling at this point? No, OK, yeah. We're not, there's different levels of the weeds we can get into. We'll keep it simple for now. So I think if we design a good high level flow, we can add in and say like, cool, what's going to go wrong here, and then we start filling that in.

[00:01:02]
But if we try to do that from the start, we're going to end up nowhere. We're going to end up overdesigning one particular part and the video actually never got processed or something like that. OK, just out of curiosity, um, I feel like the first thing we did was the functional requirements and the entities. Is this kind of just the user flow of the app, I guess you would call this step if you had to call it something?

[00:01:28]
Yeah. It's good to think out loud what's actually happening because most of these problems, the hard part is the domain. Yeah, we can design like, what was the uh, you know, we can design something that supports a million users. Is that the most important thing right now? In our old app, we determined it was the database. What is the problem here? Like what is the thing that's going to bottleneck us?

[00:01:51]
We can't really say until we understand the domain. And unless you already work in video streaming, you'd be like, this is going to be the problem, I've seen it before. That's unlikely. You have to, it's good to start slow and say, I as a user, what am I trying to do? And that puts me into the space where I can start thinking, OK, these are the problems I might run into. So, over time, you do this enough, you probably get faster.

[00:02:15]
You say like, hey, I already have this flow in my head, I don't need to write it down. But it doesn't hurt to talk out loud and say, here's what I'm thinking. Did I miss anything? And right then they're going to be like, yes, you missed this thing, you missed the whole point of the interview, or no, it seems about right, directionally correct. Let's keep going. Yeah. Would you practice designing different apps at home, kind of?

[00:02:44]
So, I'm going to design Twitter and you have a whiteboard and you kind of practice doing that out personally. I'm just thinking about the preparation schedule like if I do that once a week or something. Maybe I'm getting too into the weeds about it, but I, I won't say don't practice. That's bad advice. I would say you could do this without trying too hard. As long as you're vaguely familiar with it, you said design Twitter.

[00:03:12]
We could do that now. We can just talk out loud. What is the flow of a user? What are they trying to do on Twitter? Is it we're designing the social feed? Is it someone types a post and they send it out? What happens there? That might be the first question we ask. Well, I mean, it's just read and write tweets, and it's read and write tweets. But we'd ask more questions like, are we designing the flow for a tweet that gets posted into the service and then fans out to all the other services so people can read it, or is it, we're designing the news feed overall and they're pulling in all these fresh tweets, or is it everything?

[00:03:55]
But you could do that. You could sit at your desk, close your eyes, and pick any scenario you want. We're designing a car. I love it. Alright, straight back to it. You're in the car, what do I need to do to make the car run? You know, and you can just walk through that in your head. And that's a skill that I don't know if you practice as much as just relax and think about it. Don't overthink it. I want to design an autocomplete search bar, something like that.

[00:04:27]
What do I, what would I do in that scenario? Hmm, so I'm typing, step one, I'm typing something, I'm looking for something, but I don't want to type all the way to the end, because then it would defeat the purpose of autocomplete. At what stage does the autocomplete come in? What happens when I hit the button? You know, and you start thinking of that flow, that makes the whole thing a lot easier versus diving right in and saying, oh, so this is a searching problem, and it needs to be a fast search.

[00:04:55]
Here's the database I'm going to use and all that. That's really tempting to do instead of walk through the workflow. The workflow seems almost, I guess, obvious when you say it, but if you haven't thought through it, that's what I'm trying to hammer. It's like, we took this whole course and we're all nodding along, you're like, yeah, yeah, it makes sense, it makes sense to me because I wrote the slides.

[00:05:18]
But when you're sitting there and you're throwing out a whole new scenario, it's really, really difficult to do. So it's good to follow these steps. We asked the right questions. We got answers. We made some assumptions here. Now we're going to get to the high-level design and to do that, we need to talk about what's going to happen. I think this step is usually skipped in a lot of training materials I've seen for system design.

[00:05:44]
It's just like, oh, we know the user is going to do this and this. You're like, how did they get there though? In this case, we're not assuming we got there. Yeah. I don't know what the moral of the story is. Practice, don't practice. Well, there's also domain modeling, I think hits that a lot, especially domain driven design. Yeah, and you do it with your group, it's even better. A different way we've approached this was we could have done this a different way.

[00:06:18]
We could have created the API, which is essentially doing the same thing. We think about, hey, what is the user, what is the flow? And the flow is defined by the API and that's a really easy way of doing this. Instead of doing that, which maybe we should have done, we just said, hey, what's the user flow? And from here we can design the endpoints. But yeah, generally, yeah, the API modeling is taking, it takes longer, but it's going to get you the same end result.

[00:06:51]
You could take each of those bullet points and essentially just convert that into one or two HTTP action requests, yeah, because you use your sites to plug in video plus request video. Let's do that. It's our party. We can do what we want. So, what endpoint do I need? Post videos. Make it simple. Upload. Some sort of endpoint to upload to. User is notified when the upload is successful. Notify. Anything else?

[00:07:31]
Yeah, authentication. We'll leave that out for now. Fair point though. Teaching me my own lesson on security. Don't assume authentication. I'm like, don't worry about it. We'll keep that out of scope for now. But yeah, authentication. I haven't uploaded a video for a long time or added that functionality. When you're streaming a file, can you provide a request body alongside that? To provide the metadata, do we need another endpoint for that, or is that?

[00:08:11]
You know. The video has metadata embedded, like every image, every video. If you're naming it. It doesn't though. Yeah, that's something we assume all the time, but a video doesn't inherently have metadata attached to it. Yeah, but do we need another endpoint to do that? Because the, we'll say the user uploads a video. We could have said one before that, which is user gives the video a title. We'll keep the metadata simple for now.

[00:08:45]
But yeah, that's not inherently attached to a video file. Now do we need a new endpoint for that, or can we just assume that metadata is encompassed within the video itself? If it's user metadata, then we'd want it to be different, because you have metadata associated with the video itself, which you usually communicate through headers and whatnot, right? Content type, the file extension, and all that kind of stuff.

[00:09:12]
Some of it. For something like a title, that's going to be different. That's what I mean: for the title and data like that, which the user is purposely putting out, not just data that the computer extracts from the file itself, you'd want to be able to record that separately, I would assume. Yeah, the question is, do we need a different endpoint for that? Yeah, I'm going to say no. Let's assume our video upload is just a JSON blob.

[00:09:36]
It has a title and some sort of reference to the video itself. Keep it easy. Then when it hits our system, that's one of the things we want to do is pull the metadata apart. I think because we want to reattach it down the road when it's all done, we have to reattach it to every single copy. Someone in chat suggested a videos endpoint. Videos, what would it do? You did not specify. I was thinking the same thing.
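The upload body described here, a JSON blob with a title and a reference to the video, could look something like the following. This is only a minimal sketch: the field names `title` and `video_ref`, the `VideoUpload` type, and both helper functions are assumptions made for illustration, not something defined in the lesson.

```python
import json
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class VideoUpload:
    title: str      # user-supplied metadata
    video_ref: str  # hypothetical pointer to the raw video bytes


def parse_upload(body: str) -> VideoUpload:
    """Validate the JSON body of a hypothetical POST to the videos endpoint."""
    data = json.loads(body)
    if not isinstance(data, dict):
        raise ValueError("body must be a JSON object")
    title = data.get("title")
    video_ref = data.get("video_ref")
    if not isinstance(title, str) or not title.strip():
        raise ValueError("title is required")
    if not isinstance(video_ref, str) or not video_ref:
        raise ValueError("video_ref is required")
    return VideoUpload(title=title.strip(), video_ref=video_ref)


def split_upload(upload: VideoUpload) -> Tuple[Dict[str, str], str]:
    """Pull the user metadata apart from the video reference."""
    return {"title": upload.title}, upload.video_ref
```

Keeping the metadata and the video reference as two separate values from the moment the request is parsed makes it easier to send each one to a different store later.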

[00:10:23]
I think if you're thinking about it RESTfully, with resources, videos is your resource instead of upload. I'll take that. And you could have POST on it. POST videos. Or what is reload? Oh, I don't know why I put reload. My head must be in a different place. It's, OK. Now what? Can we start designing? Yeah, I think we've got enough here to be dangerous. OK, dangerous. So I'm taking a web server and it's called, I'm not even going to name it, it's just a server.

[00:11:17]
And it's our API. Say the user is making a POST to videos. Let's call it videos, we'll need to see the POST part. Videos, so now we're here. Some sort of server. So the server right now has the video and has the metadata. We haven't separated out the video and the audio yet, so it's something we're going to need to do at some point, but we haven't got there. We need to trigger the processing once the server's got it.

[00:11:52]
Probably async. We want like static file storage. Yes, yeah, because what's the thing about user video, uncompressed video? It's big, very big. We're trying to get it small. Is it still true that audio is surprisingly large? I know I got bit by that before. I think in some of the formats, the audio is bigger than the video, there's multiple tracks. If you have like WAV files and uncompressed audio, that's not removing different frequencies from higher or lower ranges, that would be way bigger.

[00:12:40]
All this uploaded storage or source video storage. So the video is going here. We could shortcut this and just say the client directly uploads to some blob storage like S3, but, you know, we'll just say it hits a server and the server redirects. In the real world, that's not optimal. We're definitely going to just have the client talk directly to that, and then we're actually just uploading the metadata, which points to the S3 bucket or wherever we're storing it.

[00:13:11]
But, you know, it's our exercise, we can do it how we want. If you were letting the client go straight to an S3 bucket or whatever, how would you have some sort of wall for authenticating them and that whole business? Yeah, you'd have to credential them for one, and then you need a way of notifying the server that the uploading is done as well, and notifying the client the uploading is done, because it's going to return which S3 bucket it uploaded to.
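One way to picture "credentialing" a client for a direct upload is for the server to hand out a short-lived, signed grant that the storage front door can check. The sketch below is a generic HMAC illustration under stated assumptions: `SECRET`, the grant fields, and all three functions are hypothetical, and this is not the signing scheme S3 actually uses. Real object stores provide their own presigned-URL mechanisms, which would be used instead.

```python
import hashlib
import hmac
import time
from typing import Optional

# Assumption: a key shared by the API server and whatever checks uploads.
SECRET = b"demo-secret"


def sign_upload(user_id: str, object_key: str, expires_at: int) -> str:
    """Sign the tuple that identifies who may upload what, and until when."""
    message = f"{user_id}:{object_key}:{expires_at}".encode()
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def issue_upload_grant(user_id: str, object_key: str,
                       ttl_seconds: int = 900,
                       now: Optional[float] = None) -> dict:
    """Called by the server after it has authenticated the user."""
    current = time.time() if now is None else now
    expires_at = int(current) + ttl_seconds
    return {
        "user_id": user_id,
        "object_key": object_key,
        "expires_at": expires_at,
        "signature": sign_upload(user_id, object_key, expires_at),
    }


def verify_upload_grant(grant: dict, now: Optional[float] = None) -> bool:
    """Called before accepting bytes: reject expired or tampered grants."""
    current = time.time() if now is None else now
    if current > grant["expires_at"]:
        return False
    expected = sign_upload(grant["user_id"], grant["object_key"],
                           grant["expires_at"])
    return hmac.compare_digest(expected, grant["signature"])
```

The grant only covers the upload itself. The server still needs a separate signal that the upload finished, which is the notification problem discussed next.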

[00:13:44]
So yeah, you'd have to credential them, or you can just have an open S3 bucket. You could, I mean, people do that all the time. They do, yeah, oh yeah. Yeah. It's really tempting because you don't have to think about it too much. Before, before we do processing, we did say that we want to notify that the video is complete or the video uploads, we did say that, yeah. So once the web server gets the video, it needs to notify the client.

[00:14:35]
Or if it was done on the same request, they might already know, but probably not. So I'll do the cheat code. This will get you unstuck. This gets me unstuck. We need something to get done, just create a service. Notification service. And notification service is going to read from source video storage. And it's going to notify the clients when the upload is done. And Kyle, you mentioned earlier, we don't, we still have to trigger the process.

[00:15:14]
So notification service can also talk to a processing service. Sure, video processor. We already went microservices. Is this an easy way to diagram and think about it, even if in the code it's going to be different than this? Just treating each box as its own microservice is an easy way to reason about it. Yeah. Everything's either a service or a database, really, when you think about it.

[00:15:39]
No, we're not throwing load balancers in here. We're not talking about replication. We only hit a database per se. You can say video storage is a form of database. We'll let it slide for now. This is a good way of getting yourself unstuck. Just talk through what are the steps you need, what are the steps along the way that we need to talk about and let's create a service for all those. If you underestimate and say that actually this process is multiple steps, it should be multiple services, cool, we can build that out.

[00:16:15]
But yeah. I don't want to say don't overthink it, because like that's, I don't know, it's almost insulting in some way. But also don't make it hard on yourself. Just create a service for it. We're not paying for these, these aren't real. OK, so we also need, we're going to run out of room, so let me move this up. Alright, so we got the video, it's in source video storage. Notification service is watching for that video to be done.

[00:16:44]
Do we need another notification service to know when the process has finished and when it can, like, I presume, like get sent to the database? We will, we'll lean on this one for a while, OK, because that's one key thing is like something needs to know what's going on. We could also say, I don't know how complicated we want to make this. We could say the notification service needs to know when an upload started and it needs to know who started it in order to speak back to the client.

[00:17:18]
But I'll leave that out for now. We'll connect it later, otherwise we end up with a lot of arrows. But yeah, we're leaning heavily on this notification service for now. Who knew that would be a critical component of video uploading? We didn't, so we wrote it all out. We also need something to do with metadata, because we didn't do anything with that yet. I was wondering if the processing service could handle pulling the video, audio, thumbnails, and metadata apart.

[00:18:03]
Is there a reason why I couldn't? Is that what you want to do? OK. And tell her that's a problem. So we're saying source video storage has the video and metadata. Damn, well, so I was thinking the metadata. Oh, we're triggering it. I wouldn't, I would want to leave this source video as raw as possible and it when it hits the processing service, that would pull the metadata out, write it to the database.

[00:18:37]
If there's thumbnails, pull that out, write that elsewhere. If the audio and video are stored separately, or wherever they're stored, the processed version gets stored. Does the metadata also get embedded in the video at this point and stored in the database? Not yet. First, OK, that's what you're thinking. So we said that once it hits the source video storage, it notifies the notification service, which notifies the client but also notifies the processing service, and I would think the processing service can read from the source video storage.
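The flow just described (storage is written, the notification service hears about it, the client is told, and the processing service is triggered asynchronously) can be sketched in a single process. Everything here is a stand-in: the class names mirror the boxes on the diagram, the in-memory `deque` plays the role of a message queue, `handle_post_videos` is a hypothetical handler, and the "processing" is a placeholder that only records the byte count.

```python
from collections import deque
from typing import Callable, Deque, Dict, List


class SourceVideoStorage:
    """In-memory stand-in for blob storage holding the raw uploads."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}
        self._on_stored: List[Callable[[str], None]] = []

    def subscribe(self, callback: Callable[[str], None]) -> None:
        self._on_stored.append(callback)

    def put(self, key: str, data: bytes) -> None:
        self._blobs[key] = data
        for callback in self._on_stored:  # tell watchers the upload landed
            callback(key)

    def get(self, key: str) -> bytes:
        return self._blobs[key]


class NotificationService:
    def __init__(self) -> None:
        self.client_messages: List[str] = []         # stand-in for client pushes
        self.processing_queue: Deque[str] = deque()  # stand-in for a message queue

    def on_upload_stored(self, key: str) -> None:
        self.client_messages.append(f"upload complete: {key}")
        self.processing_queue.append(key)  # processing happens later


class ProcessingService:
    def __init__(self, storage: SourceVideoStorage) -> None:
        self.storage = storage
        self.processed: Dict[str, int] = {}

    def drain(self, queue: Deque[str]) -> None:
        while queue:
            key = queue.popleft()
            raw = self.storage.get(key)     # read the raw source video
            self.processed[key] = len(raw)  # placeholder for real processing


def handle_post_videos(storage: SourceVideoStorage, key: str, data: bytes) -> int:
    """Hypothetical web server handler: store the bytes and return at once."""
    storage.put(key, data)
    return 202  # HTTP 202 Accepted: the work is queued, not finished
```

The web server never calls the processing service directly. It only writes to storage, so the slow step can fall behind or be retried without holding the upload request open.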

[00:19:31]
Oh, we didn't send the metadata along, so we have to. Yeah, we have to find that somewhere, or maybe it goes from a web server to the database if we already have it. Yeah. So what we need, because all this is stateless. Let's get a new one, loop, because there's going to be a difference in speed here. So we're going to call this metadata DB. Why though? Why, why am I passing this to a database and not just passing it along?

[00:19:59]
It's more to deal with and it's a different type of data. Yeah, yeah, because blob storage is more optimized for video and audio, and a database is great for text, yeah. Yeah, exactly. But also remember our goal of system design is we want to keep it stateless. We could keep the metadata on the server until the uploading is done. But then we run into this problem where the upload's done, but there's no metadata.

[00:20:26]
Then you have a really hard problem: you have a bunch of uploads with nothing attached to them, they have no home, and they're kind of useless, a waste of space. We can't delete them, we can't upload, we can't move on. So, and metadata is pretty small. It's going to be a blob or a text of some sort. So we'll just write it to a database. If we do something a bit later, fine. But I think something else to think about with metadata is that it's not really going to change if we're just saying it's the title, there's nothing else attached to it, it's just the title of the video.

[00:00:00]
We're going to attach it later, but we're not doing too much with it, so we probably don't have to process it, just hold it for now.
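A minimal sketch of the metadata DB idea, assuming the only metadata is the title and that each row is keyed by the video's storage key. It uses Python's built-in `sqlite3` with an in-memory database purely for illustration. The table name, column names, and helper functions are hypothetical, and a real system would likely use a separate database server.

```python
import sqlite3
from typing import Iterable, List, Optional


def open_metadata_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE video_metadata ("
        " video_key TEXT PRIMARY KEY,"
        " title TEXT NOT NULL)"
    )
    return conn


def save_metadata(conn: sqlite3.Connection, video_key: str, title: str) -> None:
    """Write the title as soon as the web server has it, so the server stays stateless."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO video_metadata (video_key, title) VALUES (?, ?)",
            (video_key, title),
        )


def title_for(conn: sqlite3.Connection, video_key: str) -> Optional[str]:
    """Look the title up later, when it needs to be reattached."""
    row = conn.execute(
        "SELECT title FROM video_metadata WHERE video_key = ?", (video_key,)
    ).fetchone()
    return row[0] if row else None


def orphaned_uploads(conn: sqlite3.Connection,
                     stored_keys: Iterable[str]) -> List[str]:
    """Stored videos with no metadata row: the 'no home' uploads."""
    known = {row[0] for row in conn.execute("SELECT video_key FROM video_metadata")}
    return sorted(key for key in stored_keys if key not in known)
```

Because the title is written before any processing starts, a finished upload can always be matched back to its metadata, and anything `orphaned_uploads` returns is a candidate for cleanup.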
