
Lesson Description
The "The Scale Phase" lesson is part of the full Cloud Infrastructure: Startup to Scale course. Here's what you'd learn in this lesson:
Erik introduces the scale phase, which focuses on supporting multiple teams, reducing deployment friction, and architecture solutions for long-term support of the application. Terraform is required for this phase, and the Terraform CLI should be installed.
Transcript from the "The Scale Phase" Lesson
[00:00:00]
>> Erik Reinert: And so you can see that in this phase, that's one of our first key points, we wanna support multiple teams, right? You could probably look at App Runner and go like, yeah, sure, like it could work with multiple teams. But I would argue that at some point somebody would probably trip over somebody else and something would happen.
[00:00:17]
And so we want a better environment for multiple people who are gonna be touching it, right? We want minimal deployment friction. That in itself is a really big challenge: if you have, I don't know, five or six teams of 10 people each and they're all deploying at the same time, how do you make sure that there's no friction there?
[00:00:40]
How do you make sure all of those deployments are going out? We're talking hundreds of deployments a day, potentially to hundreds of different microservices that exist in the system. We want long-term architected solutions. Again, this kind of goes back to the whole App Runner thing. App Runner is a great tool, especially when you're starting or when you're a smaller team.
[00:01:03]
But there are other solutions that can be explored with containers especially, and even infrastructure in general. There's a lot that we are not really in control of right now. We don't have control of the load balancer. We don't have control over really any part of the container orchestration stack.
[00:01:21]
We really just have the ability to set a port, set CPU and RAM, and then the things that you would really just need to configure a service. That's all we can really care about. I would say that this last one is more than likely a part of the scale part, which is: we have some money to spend.
[00:01:43]
In the first two phases, you're probably also hearing we need to save as much as we can. Do it for cheap, do it free if you can. That is very common in a startup or even at growth scale, because you don't want to spend tons of money yet where you don't necessarily need it.
[00:02:00]
You'd rather put that towards whatever your platform or your service is. But there does hopefully come a point in an organization where you are able to kind of take a step back after running so much and go like, okay, we've got some money to spend, let's clean up here, let's optimize where we can.
[00:02:17]
Hopefully you shouldn't be at a point where pretty much your whole experience there is just, you know, "we can't because of budget." That's not always a good answer. Most of the time it's fine, you know, but the reality of it is that at some point, if you're making a decision to better things and it does cost money, that's part of business.
[00:02:40]
That's just like, here, take money, solve problem. There are bigger solutions that come out of that, right? And so I do believe that at some point you will have to spend some money, but it's about allocating that money in a way where it actually rewards more back.
[00:02:56]
Like, okay, we are going to spend money on this, but we're not going to have to manage that. And that's actually an example of the whole CI system at my work. Again, as I said before, we don't really worry about our CI/CD system at all. And one of the reasons for that is because a long time ago my team and I decided that we didn't want to manage the CI/CD system scheduler.
[00:03:26]
We didn't want to manage that part because that's the part that sucks, that's the part that's annoying to manage. The queue went down, the database isn't working, the system's locked up, we've got to restart the... All that stuff sucks. All I want to do is just be like, you tell me when I need to run a job and I'll give you an instance.
[00:03:45]
And that is what's made it so that CI/CD has been thoughtless and not worried about at all, because it's just provisioning. And as long as it keeps provisioning, we're fine. So having some money to spend there and saying, okay, we'll use CircleCI and spend that, we know that that's a cost that we want.
[00:04:06]
But then we know that in return, because we're not managing that part of the system, we don't have to worry about it. Downtime for us hopefully would be never, unless it's our fault. Okay, so the phase goals are going to be, first, create infrastructure automation. We haven't touched infrastructure automation at all yet.
[00:04:27]
I want to kind of point out here: think about that, right? After everything we did, we haven't touched Terraform at all. That is kind of a sign, and something I wanted to, you know, exemplify, which is you don't need to write infrastructure as code to be successful.
[00:04:45]
You don't. If you really want, you can go directly to a minimal MVP, get it out the door, and start. That's how a lot of companies are actually making tons of money right now. You know, they don't care about what their deployment system is or whatever.
[00:05:02]
Like they only care about their application that they're building and how quickly they can get it out the door. And so I have a friend who works on a really big, very popular bot on Twitch, and I talk to him often and I'm like, how are you doing this?
[00:05:17]
Are you using containers? He's like, no, dude, it's a VPS just running as big as it possibly can. He scales vertically, he doesn't scale horizontally. And I was like, why do you do that? And he's like, because it's easier to manage and I get better performance.
[00:05:30]
And I was like, yeah, okay, that makes sense. So there are a lot of different cases where you just don't need infrastructure automation yet. Although it can always help, there's still a cost sunk into how much time it takes to set up and manage and all that stuff.
[00:05:50]
So we want to create cloud environments as well. Basically, once we have our infrastructure automation set up, we want to be able to compose environments with that automation. I want to be able to say, okay, this is staging and then this is prod and so forth and so on.
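To make the idea of composing environments concrete, here is a minimal Python sketch of the pattern, not the Terraform used in the course: one shared service definition plus per-environment overrides. The `ServiceConfig` fields, the `BASE` values, and the override numbers are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ServiceConfig:
    """Hypothetical settings for one service; the fields are illustrative."""
    name: str
    port: int
    cpu_units: int
    memory_mb: int
    desired_count: int


# Shared definition: written once and reused by every environment.
BASE = ServiceConfig(name="api", port=8080, cpu_units=256,
                     memory_mb=512, desired_count=1)

# Per-environment overrides (made-up values, not from the course).
OVERRIDES = {
    "staging": {},
    "prod": {"cpu_units": 1024, "memory_mb": 2048, "desired_count": 3},
}


def compose(env: str) -> ServiceConfig:
    """Return the base config with the named environment's overrides applied."""
    if env not in OVERRIDES:
        raise ValueError(f"unknown environment: {env}")
    return replace(BASE, **OVERRIDES[env])
```

Calling `compose("staging")` returns the base definition unchanged, while `compose("prod")` returns the same service with more CPU, memory, and instances. The point of the pattern is that staging and prod differ only in the values passed in, not in how the infrastructure is described.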
[00:06:07]
We want to be able to create a promotion process as well. So we want to be able to say, okay, cool, I have merged into my main branch and that'll go to staging. And then whether I tag or push to a branch, whatever, then it'll go to production if we want it to.
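The promotion rule described here can be sketched as a small Python function that maps a pushed Git ref to a target environment. This is only an illustration under assumptions: the function name `target_environment` is hypothetical, and the choice of `v`-prefixed tags as the production trigger is assumed, since the transcript leaves open whether a tag or a branch push promotes to production.

```python
from typing import Optional


def target_environment(git_ref: str) -> Optional[str]:
    """Map a pushed Git ref to the environment it should deploy to.

    Assumed rules (illustrative only):
      - a push to the main branch deploys to staging
      - a tag starting with "v" promotes to production
      - anything else triggers no deployment
    """
    if git_ref == "refs/heads/main":
        return "staging"
    if git_ref.startswith("refs/tags/v"):
        return "prod"
    return None
```

For example, `target_environment("refs/heads/main")` yields `"staging"`, and a feature branch yields `None`, so nothing reaches production without an explicit promotion step.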
[00:06:30]
So we want to actually have a promotion process, because right now we've just been basically pushing to prod. And then, create application observability. We had a little bit of observability yesterday, but we want to be able to create more if possible. Something like App Runner will really only vendor-lock us into CloudWatch and Amazon products.
[00:06:52]
But with something like ECS, there are a few more options on the table because you have access to the VM, right? So you can install your own log push and stuff like that. Yeah, so those are our phase goals. Requirements are pretty much exactly the same. Again, this whole project revolves around pretty much these requirements, so there really shouldn't be anything that you guys need to set up.
[00:07:14]
The only thing I will say is, again, just make sure. Actually, the one thing that we're missing on here is the one thing we need today, which is Terraform. I don't know why that's not there, but yeah, make sure you have Terraform installed. We're going to be using Terraform heavily today.