Check out a free preview of the full Introduction to MongoDB course:
The "Modeling Data Q&A" Lesson is part of the full, Introduction to MongoDB course featured in this preview video. Here's what you'd learn in this lesson:

Questions are posed about how to structure your data in Mongo, benefits of denormalizing data, general questions about how to architect databases, and if it's possible to mix databases.

Get Unlimited Access Now

Transcript from the "Modeling Data Q&A" Lesson

[00:00:00]
>> Scott Moss: Yeah what's up?
>> Speaker 2: So how do you sort of decide when to just make a really deep tree of data versus a very sort of normalized or sort of split apart, more relational data.
>> Scott Moss: What I found out is that Mongo is really good at changing how you model data.

[00:00:20] Because if you think about modeling data from a SQL perspective, it has nothing to do with application, it's just data. It's just how do I model the data so it fits the database? Whereas in Mongo it's so flexible, you actually wanna model the data how it fits the application.

[00:00:34] So my answer to that is how are you querying the data? What needs to see it? And what does it need to see? If you have a screen on your app that's got the user profile, it's got this information about settings, and it's got all the posts. Maybe you should put that all in one thing, so you just hit it once and grab it all, right?

[00:00:53] Cuz it's built for your application versus built for the database. Now if you've built SQL you might be thinking, that's disgusting. Yeah, in SQL it is disgusting. But like in Mango it actually works, it's better for you if you do it that way. So for me, we model our data around what the application needs.

[00:01:08] That way we can just hit one route, do one database call and get everything we need for that page, and then we're done. We don't have to do like multiple database queries to like get this, and then this, and then this, and then this just to put it on the page.

[00:01:19] It's like, this one page needs all this, put it in one thing. Done.
>> Speaker 2: This was on Firebase that I saw this, which is also NoSQL, but I've seen instances where they literally write the same content in different places, just duplicates of the same content.
>> Scott Moss: Yeah, yeah, they denormalize, yeah.

[00:01:37]
>> Speaker 2: So literally, the blog post will go under user/posts.
>> Scott Moss: Mm-hm.
>> Speaker 2: And also it will just go under posts, and also it will go under category/categoryname/posts in there. And so you've written that blog post content in three places. Is that madness or is that cool?
>> Scott Moss: That takes it to an extreme level.

[00:01:54] I think at some point it's just like, that's a lot of work, because you gotta do a lot of middleware to keep track of all that, right? But there is some really good benefits of denormalizing your data, right? So if you start off with normalizing it, as in they got a reference here but no reference here.

[00:02:08] But eventually you start to denormalize the words in two places. I think that actually really is a good place to be, but only if your application needs it. If your application's not asking for that, like if I have a org and a project and they both reference each other.

[00:02:22] When I ask for a project do I need the org there? Why do I need that, you know what I mean? So you gotta think about the queries that you're asking for. And again, it's really gonna be based off what is it that you're asking for? At some point, if your data structures look like that, if you're modeling data like that, then I would imagine your user experience is probably bad to.

[00:02:40] So you should figure out, you gotta take a step back and realize what is the information you're trying to show you users? And then how do you model the data but you know, denormalizing is not that bad. I would just stay away from like arrays and arrays together.

[00:02:56] Just working with arrays in general is always gonna be tough if you've probably noticed, those are some of the most challenging thing in exercises today where arrays. So, I would try my best to stay away from that as much as possible and add a reference on the other side as much as I can, unless there's just no way I can get around it because the way I need to query data, so, yeah.

[00:03:14] You had a question?
>> Speaker 3: Yeah, so it relates to the previous question is that, you said whatever links together should be put in a single document. So I work for a commerce application and we have a lot of things going on in one page.
>> Scott Moss: Yeah.
>> Speaker 3: We have shopping cart, we have shopping list, we have items and all the items available, and lots of stuff.

[00:03:37]
>> Scott Moss: Right.
>> Speaker 3: So if we cram all these into one document, then we are at a disadvantage, so.
>> Scott Moss: Well, you can't put the whole app in one document cuz if you put the whole app in document, and then you query one document, you've queried the whole database and that's just-

[00:03:50]
>> Speaker 3: Yes, but just for like, to show just the homepage we need all these stuff loading up?
>> Scott Moss: Yeah, yeah.
>> Speaker 3: So-
>> Scott Moss: So the way I would think about that, if you had a homepage, what specifically is on a homepage and only on a homepage? Put all that in one document.

[00:04:07] You probably have other stuff around the homepage that exists on other pages. Maybe those don't go in the homepage document then, they go in some other document cuz they are gonna be used somewhere else. They just happen to be in the homepage. So I would just put specific things that are always together in the document together and try to.

[00:04:23] For instance, if I was doing SQL, I would have a user table and then I would have a user settings table. And then I would join them get the settings for the user. Whereas that doesn't really make sense cuz I'm always gonna get the settings for the user, always, so I'll just put the settings embedded in the user then I'm done, right?

[00:04:37] So like you kind of got to take it step by step like that, to understand why am I even joining this when I'm always going to get this anyways, like default,I'm always going to get the settings for this user. Why not just put it here. So it's like that type of thing so, yeah, you can't take it to the point where you're putting everything in there because a document has a collection of a top of like sixteen, I can't remember if it's 16 megabytes or 10 megabytes, the limit of what a document can be.

[00:05:03] I think it's 16 megabytes. Yeah, I'm pretty sure 16 megabytes, that's the biggest a document can be, so you can't do that. 16 megabytes is just ridiculous, so yeah, don't push it that far but group things that make sense together as much as possible. Especially if you're fighting around transactions and undoing things because things break, you gotta go back like all right how can we change our data model to where we can just do all of this in one operation and take advantage of the nested structure.

[00:05:36]
>> Speaker 4: In this case would it be possible to have multiple instances of node? I mean, let's say like shopping cart, I mean that's a good example. Even if someone is on a home page, you still need to keep up our system shopping cart with items and everything. Would it be possible to have one server only handle like this direct content on the homepage for example, another server taking care of only the shopping cart.

[00:06:02]
>> Scott Moss: Yes, that's like a market service architecture, a lot of people do that today, or they have these little servers that do one thing, this server does authentication, this server only handle users, this server handles like, whatever, yeah. So that's a very common pattern, that right there, that architecture, it can get pretty crazy cuz now you gotta figure out a way to communicate between those different architectures.

[00:06:25] You need a license to operate the queueing system, all types of different things. You might need a gateway that is a proxy that sits in front of it. So that just introduces a lot more problems, but yes, a lot of people do micro service architecture and there is nothing wrong with that.

[00:06:38] They can all talk to the same database too, cuz you can connect with just a string.
>> Speaker 5: So you'd probably just do a mix, a relational database also with Mongo? Or, because I mean that will complicate things, I don't know.
>> Scott Moss: I think you can. I think if you're the Targets in the Googles of the world, I think you can't just use one database.

[00:06:56] It's just not gonna solve your problem. You're gonna have different scenarios and you're gonna have different needs. So there's nothing wrong with mixing two databases. We use Postgres for our analytics database, but we use Mongo for our application database and they work together just fine.