Complete Intro to SQL & PostgreSQL JSON vs. JSONB
Transcript from the "JSON vs. JSONB" Lesson
>> For those of you that are either joining us for the first time, or just getting restarted, or you need to just like need a fresh copy of your database, from this point forward you can actually just use the complete-intro-to-sql container that I set up for you. So let me just show you how to get that up and running really quick.
[00:00:20] Okay, so I have multiple of these running, I'm just going to kill my pg one cuz I'm not gonna work on that one anymore. And I'm gonna say docker kill sql as well. All right, so what I'm gonna do is I'm gonna say docker run. I renamed mine SQL just so I could differentiate between pg and SQL, you can call whatever you want, doesn't really matter.
[00:00:44] And basically everything's the same, I'm gonna put this on 5432 but this is relatively the exact same command that you ran, except, I changed the name to SQL here. And instead of Postgres:14, I put btholt/complete-intro-to-sql. I already had you pull this one at the beginning of the course.
[00:01:05] So if you run that, that'll come preceded with all of the various different database things that you're gonna need. You might see a warning about something like this isn't built for your computer, it should still work, so don't worry too much about it. You might lose some performance, but it's not a big deal.
[00:01:22] If your database is fine and working as from yesterday you don't have to change that, that's totally fine. You will at some point, but you don't have to do it now. So now if I say docker exec, make sure you're running against SQL not pg if you call it the same thing as me.
[00:01:41] If you call the pg, then you'll put pg here,and this is just whatever you call the container. You could also literally copy and paste this whole hash here that would be fine. And then you run that, and if you hit /l, you should see omdb, that's a new one for you.
[00:01:55] You should see recipeguru, which is a copy of what you did yesterday. Just to prove my point here, you can see these are relatively the same tables. If you're running that, and you're not seeing the recipeguru one, let me know cuz I'm curious cuz someone else saw that.
[00:02:21] But you might have to pull again, just when I say pull, you'll you'll kill your container. And then you'll say docker pull, btholt/complete-intro-to. SQL you might even put latest there or something like that, so that's implied. You can see here for me it's just saying yours is up to date so don't worry about it.
[00:02:43] All right, so I'm gonna reconnect mine docker exec that. I'm gonna connect to my recipeguru, that and then I'm going to clear my screen. I just say Ctrl+L by the way if anyone's curious I'm clearing my screen, I don't know if you care but that is what I'm doing.
[00:03:02] Okay, we are going to start in the JSONB chapter, or section, which is section five here,right here. So, JSONB. So I imagine most of you have probably heard of MongoDB, right? Or have you some sort of document store before. When MongoDB came on the scene, I don't remember exactly, when it came out.
[00:03:36] But I remember when I first heard of it which would've been 2011 is when my company that I was working at said this is a really interesting piece of technology we wanna put this in our stack. The first tech conference I ever went to was Mongo World in San Francisco.
[00:03:54] First time I went to San Francisco, I was so excited. It was a really, really cool piece of technology cuz we could do databases but they didn't have to have these like hard schemas. You could put objects in there. You could have heterogeneous objects, like this object has this shape to it, this object has this shape to it.
[00:04:12] They exist in collections is all really, really exciting. Still is, it's a really cool database that has really good uses for it. I still use it today stripe uses it a bunch and a bunch of different places. And if you're interested in a course on it, I left a little link here to Scott masters course in FrontMasters if you wanna learn more about specifically the Mongo database.
[00:04:34] I also talked about it my complete-intro-todatabases. So it's a document or your database, we talked about that. Postgres is a relational database. But Mongo excels at data that is unstructured. Which is to say, you have data that's incoming, that you don't have like a future knowledge if this is exactly how this is going to look, right?
[00:04:58] That's where Mongo is really, really useful. So a great example, this is our photos table. We have a separate table for our photos because we don't know how many photos we're gonna get, right? If we knew exactly upfront we were gonna have exactly five photos for every recipe, we wouldn't need a separate table.
[00:05:19] We would have five columns on our, Recipe schema. But we don't know that, right, it can be 0, it can be 1, it can be 100. This is where Mongo would be really useful, because in Mongo you can just put an array, right? You can just say, this is gonna have an array of photos that I don't know how many photos it's gonna have, or maybe even has photos.
[00:05:48] So, You can do this with Postgres now. With I think it was Postgres 9, don't quote me on that, it doesn't matter, so no one check. [LAUGH] But at some point, Postgres added the ability to be able to attach a column to your database that is JSON. And then you can query, not only the column, but you can query into the JSON.
[00:06:12] This is called the JSON data type. So it's actually a data type, we'll verify that it's valid JSON and then you can actually make queries into the column and into the JSON structure underneath. And they came up with this JSON data type and people loved it. But they found out that they could actually make it even better.
[00:06:31] What they were doing before as far as I know, storing it in plain text. And then later they went back and said, we can actually make this better. We can actually optimize how much space this takes and how fast it is to query. So they came out with a second data type called JSONB.
[00:06:50] The B literally stands for better, I'm not joking it actually stands for better. So is JSON better, in other words you should basically never use the plain JSON data type for Postgres, you should always use the better one. I thought that was hilarious. They literally put in the name, that this is the better one.
[00:07:09] You might ask, is there any reason to ever use the plain JSON datatype? Because it does more processing on those JSONB ones, the better ones, if you don't care about queering it very frequently. If you don't actually wanna see that data very often, it actually technically is faster to store the JSON ones because they do no processing on it.
[00:07:33] The other thing, is that, if you care about the whitespace in your JSON, which one why. I have questions for you, but if you do, JSONB drops all the insignificant white space, JSON will put there at the white space for you. I think there's probably some other minor nuances, but those are the two big ones I could research when I was looking it up.
[00:07:55] In other words, I'm just gonna tell you, just don't use JSON, even though there's such minor like things that you can publish, don't just get away with using JSONB. The one thing that they called out on their particular website is that, if you're doing it for logging purposes, right, and if you're writing a lot and not reading a lot from it, which was what people do with logs, JSON could be potentially useful in that situation.
[00:08:20] But everything I'm gonna show you after this point, you can actually literally do on JSON or JSONB, all the querying works exactly the same on both of the columns. It's just how they're stored that's different. There's a whole blog post here from Citus Data, which is one of the best Postgres companies.
[00:08:38] It's actually a company that Microsoft bought, because they were better at Postgres than Microsoft was. So I got to work with those people they were nice people to work with. I'm gonna throw out here that there's another thing called hsore. And this is closer to like Redis, is like a key value store for Postgres.
[00:09:00] Every time that I go to use hstore, JSONB is a better fit for my thought modeling. So I ended up always using JSONB, so I decided to not talk about it because I think JSONB is just overall more useful thing. But if you're finding yourself, like I need some sort of key value store like Redis, but I don't wanna use Redis or memcache or something like that, hstore would be for that.
[00:09:25] You get basically all the functionality of hstore from JSO B for the most part. Okay, that's kind of an overview of what we're about to talk about. We're gonna add a little bit more to our recipeguru database. We're gonna add tagging, but just so you know, I showed you how to do the one-to-many relationship from recipe to many photos, right?
[00:09:55] These days I would use JSONB to do the recipephotos, I would not use a separate table. I would actually use JSONB, because it's way easier to deal with. You don't have to do any joins, you can just query the column, right? So if you have those one to many relationships, high probability that you probably wanna use JSONB instead of trying to use another table.
[00:10:20] The many-to-many relationships, you're still gonna have to do the intermediary tables, the ingredients to recipes connection, that's still going to be another table. I guess you could model in JSONB, but I probably wouldn't, I'd probably still do the relationships with the tables.