Aggregates

Stripe

Lesson Description

The "Aggregates" lesson is part of the full Domain Modeling for Humans and AI course featured in this preview video. Here's what you'd learn in this lesson:

Mike explains aggregates as a way to keep related data consistent in complex systems. He shows how setting clear boundaries helps make sure updates happen together safely. Using the garden example, he demonstrates how picking the right group of data helps keep things reliable, even if other parts of the system aren't perfectly in sync.

Transcript from the "Aggregates" Lesson

[00:00:00]
>> Mike North: The last topic I wanna cover today is dealing with this concept of aggregates in domain-driven design. And this has to do with thinking about consistency, and by that I mean data consistency. So imagine this scenario: we're able to move plants within a raised bed. And let's say we build a system that is modeled this way, where we would say we've got a task that needs to be performed.

[00:00:47]
>> Mike North: And we build a nice little semantic action, right? Instead of saying, I have this big update API call and I'm gonna set the new array of your items within your bed and here are the XY coordinates, let's say we had a really nice thing that kind of looks like this.
>> Mike North: Right, something like this, spiritually.
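The code shown on screen at this point is not captured in the transcript. As a rough sketch of what a semantic action could look like, the example below names the intent (move one plant within one bed) instead of sending a whole replacement array. The names `MovePlantWithinBed` and `apply_move`, and the dictionary used to model a bed, are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MovePlantWithinBed:
    """Hypothetical semantic action: it states the intent, not a full new item array."""

    bed_id: str
    plant_id: str
    new_x: int
    new_y: int


def apply_move(bed: dict[str, tuple[int, int]], action: MovePlantWithinBed) -> None:
    """Apply the action to one bed, modeled here as plant_id -> (x, y)."""
    if action.plant_id not in bed:
        raise KeyError(f"plant {action.plant_id} is not in bed {action.bed_id}")
    bed[action.plant_id] = (action.new_x, action.new_y)


# Only the tomato moves; every other plant in the bed is left alone.
bed_1 = {"tomato": (0, 0), "basil": (1, 0)}
apply_move(bed_1, MovePlantWithinBed("bed-1", "tomato", 2, 3))
```

Because the action touches a single bed, it is a single write in a single place, which is the property the next part of the lesson relies on.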

[00:01:23]
And let's say our app's really popular and we end up having to get big beefy databases to handle all of this. And we decide that what we can do is implement something called sharding, which is the idea of saying we actually have multiple databases, and Seth's raised beds and my raised beds can actually be on different databases.

[00:01:51]
It's fine. We can kind of spread the data out a little bit, and there's no ambiguity. Like, Mike's raised beds are always on database A, Seth's are on database B, and everything's fine. And maybe in fact we can sort of mix them up a little bit, where I might have some beds on database A and some on database B, but it doesn't affect this, because you can think of this as one write, happening in one place.
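One way to picture the "no ambiguity" property is a deterministic routing function: a given bed always maps to the same database. The sketch below is an assumption for illustration, not the scheme used in the course. The shard names and the `shard_for_bed` helper are hypothetical, and hashing the bed id is just one possible routing rule.

```python
import zlib

# Hypothetical shard names standing in for "database A" and "database B".
DATABASES = ["database-a", "database-b"]


def shard_for_bed(bed_id: str) -> str:
    """Map a bed to exactly one database, the same one on every call."""
    checksum = zlib.crc32(bed_id.encode("utf-8"))
    return DATABASES[checksum % len(DATABASES)]


# A write that only touches bed-1 only ever talks to this one database.
home_of_bed_1 = shard_for_bed("bed-1")
```

With a rule like this, one person's beds can end up on different databases, and a change confined to a single bed is still a single write against a single database.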

[00:02:16]
But what happens when we want to write across those different concepts? You could imagine how we would say we've got bed one and bed two and we've got a plant.
>> Mike North: When we start getting into this, it gets really interesting where we're trying to move this into another bed.

[00:02:47]
Now if we were to say, look, there's an update plant position, or let's say it's something like this.
>> Mike North: Think about what might happen if one side of this fails and the other succeeds. This inevitably happens once in a while. We think about our big distributed systems as having some number of nines of reliability and sometimes errors are thrown.

[00:03:22]
But imagine a world where I'm trying to move plant from bed 1 to bed 2 and the remove plant operation fails, but the add plant operation succeeds. And so now I've cloned this plant somehow. Like there are two of it. Or you could have it disappear entirely where you remove it and then add plant fails.
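The cloning case can be reproduced with a small simulation. This is a sketch under stated assumptions: `BedStore` and `ShardDown` are hypothetical names, each `BedStore` stands in for a bed living on its own database, and the `available` flag fakes an outage.

```python
class ShardDown(Exception):
    """Raised when a simulated database is unavailable."""


class BedStore:
    """One raised bed living on its own (simulated) database."""

    def __init__(self, plants: set[str], available: bool = True) -> None:
        self.plants = set(plants)
        self.available = available

    def add_plant(self, plant_id: str) -> None:
        if not self.available:
            raise ShardDown("add failed")
        self.plants.add(plant_id)

    def remove_plant(self, plant_id: str) -> None:
        if not self.available:
            raise ShardDown("remove failed")
        self.plants.discard(plant_id)


def move_plant_naively(source: BedStore, target: BedStore, plant_id: str) -> None:
    """Two independent writes with nothing tying them together."""
    target.add_plant(plant_id)
    source.remove_plant(plant_id)


bed_1 = BedStore({"tomato"}, available=False)  # bed 1's database is down
bed_2 = BedStore(set())

try:
    move_plant_naively(bed_1, bed_2, "tomato")
except ShardDown:
    pass  # the add already succeeded, so the plant now exists in both beds

duplicated = "tomato" in bed_1.plants and "tomato" in bed_2.plants
```

Swapping the order of the two calls does not fix anything. It only trades the duplicated plant for a plant that disappears when the add fails after the remove.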

[00:03:44]
And you could say, well, let's check the success of the first call before proceeding with the second. But sometimes you lose state if you do that. Okay, imagine you check and remove plant is successful, but add plant fails. Are you gonna add it back into the prior raised bed?

[00:04:06]
What you're trying to do there, if you go in that direction, is you're trying to create this illusion that there is an atomic operation, that both of these things, the removal from the old bed, the addition to the new, is a single operation that either all succeeds or all fails.
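The "add it back" approach is often called a compensating write. A minimal sketch, assuming hypothetical `Bed` and `WriteFailed` names and a `failing_ops` set that simulates which operations a database rejects:

```python
class WriteFailed(Exception):
    """Raised when a simulated remote write does not go through."""


class Bed:
    """A bed on its own simulated database; failing_ops names the calls that fail."""

    def __init__(self, plants: set[str], failing_ops: frozenset[str] = frozenset()) -> None:
        self.plants = set(plants)
        self.failing_ops = failing_ops

    def add_plant(self, plant_id: str) -> None:
        if "add" in self.failing_ops:
            raise WriteFailed("add")
        self.plants.add(plant_id)

    def remove_plant(self, plant_id: str) -> None:
        if "remove" in self.failing_ops:
            raise WriteFailed("remove")
        self.plants.discard(plant_id)


def move_with_compensation(source: Bed, target: Bed, plant_id: str) -> bool:
    """Return True if the plant moved, False if the move was undone by hand."""
    source.remove_plant(plant_id)
    try:
        target.add_plant(plant_id)
    except WriteFailed:
        # Compensating write: put the plant back where it came from.
        # This is itself a remote write and can fail too, and anything
        # not remembered here (such as the old position) is lost.
        source.add_plant(plant_id)
        return False
    return True


bed_1 = Bed({"tomato"})
bed_2 = Bed(set(), failing_ops=frozenset({"add"}))
moved = move_with_compensation(bed_1, bed_2, "tomato")
```

In this run the undo works, but between the remove and the re-add there is a window in which the plant is in neither bed, so a reader can still observe the partial state. That is why this only imitates an atomic operation.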

[00:04:25]
And this is really important when we think about what is the aggregate here. And I would say in this case it's this,
>> Mike North: Garden. Because what you could do is you could say, no, no, no, it's really garden.
>> Mike North: It's really like this. And what you could do, if you get deeper into databases and things, is create what's called a database transaction, which is basically doing the whole thing in one atomic operation that either all succeeds or all fails.

[00:05:09]
It's the removal of the plant from bed 1 and the placement of the plant in bed 2. Inevitably, especially if you work on something with significant complexity, you're going to have to make these choices. You're gonna have to decide what are the atomic operations you can perform. And usually what that means is we would say this, actually, I'm gonna change the model here a little bit.
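To make the database transaction concrete, here is a small sketch using Python's built-in `sqlite3` module with an in-memory database. The table name `bed_plants` and the helpers `move_plant` and `plants_in` are assumptions for illustration. It only works because both beds live in the same database, which is exactly what choosing the garden as the boundary buys you.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bed_plants ("
    " bed_id TEXT NOT NULL,"
    " plant_id TEXT NOT NULL,"
    " PRIMARY KEY (bed_id, plant_id))"
)
conn.executemany(
    "INSERT INTO bed_plants (bed_id, plant_id) VALUES (?, ?)",
    [("bed-1", "tomato"), ("bed-2", "basil")],
)
conn.commit()


def move_plant(conn, plant_id, from_bed, to_bed):
    """Remove from one bed and add to another as a single transaction."""
    # "with conn" commits if the block finishes and rolls back if it raises.
    with conn:
        conn.execute(
            "DELETE FROM bed_plants WHERE bed_id = ? AND plant_id = ?",
            (from_bed, plant_id),
        )
        conn.execute(
            "INSERT INTO bed_plants (bed_id, plant_id) VALUES (?, ?)",
            (to_bed, plant_id),
        )


def plants_in(conn, bed_id):
    rows = conn.execute(
        "SELECT plant_id FROM bed_plants WHERE bed_id = ? ORDER BY plant_id",
        (bed_id,),
    )
    return [row[0] for row in rows]


# Successful move: both statements take effect together.
move_plant(conn, "tomato", "bed-1", "bed-2")

# Failed move: the INSERT violates NOT NULL, so the DELETE is rolled back too.
try:
    move_plant(conn, "basil", "bed-2", None)
except sqlite3.IntegrityError:
    pass
```

After the failed call, the basil is still in bed 2: the removal that had already run inside the transaction was undone along with the failed insert.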

[00:05:39]
We'd say this whole thing (and I'm gonna have to move this to the back), this is the aggregate, that whole larger rectangle. And the garden, we would say, is, as a lot of aggregates have... Can I move this to the front, please? Yep, perfect, and this to the front.

[00:06:12]
>> Mike North: Well, we can move it in here. It's fine. You'd say, well, garden is kind of like the root node. And so often when you pick an aggregate, you have to say, what is the entity that's sort of the main purpose of this thing? And yes, there may be a lot of other things embedded within it, but the important thing is choosing the transactional boundary.
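In code, treating the garden as the root usually means that changes to beds go through the garden rather than through the beds directly. The sketch below is one possible shape, not the model drawn in the lesson. The `Garden` class, its `beds` mapping, and the method names are assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Garden:
    """Aggregate root: the single entry point for changing the beds inside it."""

    garden_id: str
    beds: dict[str, set[str]] = field(default_factory=dict)  # bed_id -> plant ids

    def move_plant(self, plant_id: str, from_bed: str, to_bed: str) -> None:
        # Validate everything before mutating, so a bad request changes nothing.
        if from_bed not in self.beds or to_bed not in self.beds:
            raise KeyError("both beds must belong to this garden")
        if plant_id not in self.beds[from_bed]:
            raise ValueError(f"{plant_id} is not in {from_bed}")
        self.beds[from_bed].remove(plant_id)
        self.beds[to_bed].add(plant_id)

    def beds_containing(self, plant_id: str) -> list[str]:
        return [bed_id for bed_id, plants in self.beds.items() if plant_id in plants]


garden = Garden("garden-1", {"bed-1": {"tomato"}, "bed-2": set()})
garden.move_plant("tomato", "bed-1", "bed-2")

# A request that fails validation leaves the garden exactly as it was.
try:
    garden.move_plant("tomato", "bed-1", "bed-2")
except ValueError:
    pass
```

When the whole `Garden` is loaded and saved as one unit, for example inside one database transaction, the move is a single operation from the point of view of anything outside the aggregate.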

[00:06:36]
And in doing that you're making choices. You're saying, well, within this brown box here, that's where we are internally consistent. You're never gonna be able to load the page at a weird time and see a plant has been added to a new zone but not removed from another zone.

[00:06:56]
And that would be internally inconsistent. But it does also mean that when you make choices like this, you're designing for inconsistency, at least momentarily, to appear in other places where it may be okay. In a neighborhood where you're moving a bunch of plants around, with two different gardens, maybe one is totally fresh data and the other is somewhat stale.

[00:07:25]
But at least the totality of data within each of those two gardens will be the same level of freshness, if that makes sense. So thinking about this as part of your domain modeling is really important, especially working at Stripe, when you're thinking about transactions and refunds and ledgers that have to all add up.

[00:07:46]
So that at any moment in time when you load this page, you're not seeing that the balance in your account is different than what you're seeing in another page. So sometimes these can be really important to your user. And personally, I find this to be kind of one of the trickiest areas to have those relatable discussions with users.

[00:08:12]
And often you wanna home in on this idea of freshness. Is it okay to view this data if it's stale, but at least it's all the same amount of stale and you're never seeing a partial state there? And if you talk to an accountant about that, they're like, you're gonna show me a ledger where certain items are still waiting to sort of percolate through? That's useless.

[00:08:38]
That's gonna be a real problem there. So this is the concept of aggregates and designing transactional boundaries with intent.
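The freshness trade-off described above can be sketched as a read-side store that only ever swaps in whole-garden snapshots. Everything here is an assumption for illustration: `GardenSnapshot`, `ReadReplica`, and the per-garden `version` number are hypothetical. Two gardens may sit at different versions, but a reader never sees half of a garden updated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GardenSnapshot:
    """An immutable copy of one whole garden at one version."""

    garden_id: str
    version: int
    beds: tuple[tuple[str, frozenset[str]], ...]  # (bed_id, plant ids)


class ReadReplica:
    """Hypothetical read-side store that replaces whole-garden snapshots."""

    def __init__(self) -> None:
        self._snapshots: dict[str, GardenSnapshot] = {}

    def publish(self, snapshot: GardenSnapshot) -> None:
        current = self._snapshots.get(snapshot.garden_id)
        # Ignore late or duplicate deliveries of an older version.
        if current is None or snapshot.version > current.version:
            self._snapshots[snapshot.garden_id] = snapshot

    def read(self, garden_id: str) -> GardenSnapshot:
        return self._snapshots[garden_id]


before_move = GardenSnapshot(
    "garden-a", 1, (("bed-1", frozenset({"tomato"})), ("bed-2", frozenset()))
)
after_move = GardenSnapshot(
    "garden-a", 2, (("bed-1", frozenset()), ("bed-2", frozenset({"tomato"})))
)

replica = ReadReplica()
replica.publish(before_move)
replica.publish(after_move)
replica.publish(before_move)  # arrives late and is ignored
replica.publish(GardenSnapshot("garden-b", 1, (("bed-9", frozenset({"basil"})),)))

latest = replica.read("garden-a")
```

Here `garden-a` is at version 2 while `garden-b` is at version 1, so the two gardens differ in freshness, yet within `garden-a` the tomato is in exactly one bed.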
