Check out a free preview of the full Intermediate Gatsby, v2 course

The "Add Custom Data" Lesson is part of the full, Intermediate Gatsby, v2 course featured in this preview video. Here's what you'd learn in this lesson:

Jason walks through adding custom data to Gatsby's GraphQL layer by using the sourceNodes API to pull in data from objects set in a .json file. The basic Gatsby node structure and how to determine if the node has changed is also covered in this segment.


Transcript from the "Add Custom Data" Lesson

>> The first thing I'm gonna do, I think, is start committing these changes. So I will check out a new branch called progress. And I'm going to add everything here, and we'll say, set up for dev and create a customer page, all right? So then I'm gonna just push to this steadily as we go.

So that if, at any point, you get stuck, you will be able to head over to this repo here, which just opened in the wrong browser, so let's pull it over this way. Any point you get stuck, you can just pull up this progress branch, and you'll be able to see whatever we're doing.

You can pull that down and get back up-to-date with where we are. So next up, let's get some custom data into Gatsby. And this is something that I have so much fun with, because it's just a powerful approach. You can do some really, really cool things with custom data in Gatsby.

And there's some work that was done between Gatsby 1, Gatsby 2, and now Gatsby 3 that took this from being really esoteric and kinda, you had to know a whole bunch about the internals. To now, you still have to know what the things are, but it doesn't feel nearly as mentally taxing.

I don't feel like I have to know exactly how Gatsby works. I feel like I just need to know these APIs, and that is really exciting to me. So why don't we start up by going into our gatsby-node again. And I wanna work with this authors and books data.

So looking inside these files, I put together, we're gonna build a little book club website here today, right, that's our goal. So I found a few authors and put those into a JSON file, and then I took a few books. And these books, I grabbed their ISBN, so that's a universal identifier.

The name, the author slug, and whether or not it's part of a series. So these three are part of a series, these ones are not. And anyways, these are some books. They're good books, if you haven't read them, you should. And I wanna pull these in and make these available as custom data types on my Gatsby site.

Start this up before we get rolling. So the site, when it's in development mode, is gonna generate this GraphQL API that we can run. Now, it's important to remember, this GraphQL layer is only available during the build time. So this isn't something that you can query from the client side.

But what's nice about it is that it gives us the ability to kinda explore our data. And as you can see right now, there's not much in here, because we haven't added a lot of data. But we can see that, here's all of our pages that we've created.

So our custom page is there, our home page is there. The dev 404, which is for during development, if you try to create something that's, or try to hit a route that doesn't exist, it'll give you some hints. Which I can show you, actually, if I go over to the 404 page.

This is their dev 404. So this only shows up in development, it doesn't show up in production. But it's really handy, because if you get lost and you're trying to find a site, you can look here instead of having to build a way to dump the data, or something like that.

So let's get some customer data in. So what I wanna do is, I wanna import my authors. And we're gonna get that from /src/data/authors. And then I'm gonna duplicate that, and we're gonna get our books. So once we have our books, I wanna be able to load up data into Gatsby.

And the way that Gatsby does that is through a hook that it calls sourceNodes. And sourceNodes is another API hook. It's a cycle that runs during the build. We can see it up here somewhere. They all happen in order, source and transform nodes is when that one's happening.

I don't know why that one has a different name than these other ones that are just called what they are. But sourceNodes will give us an actions object, and then it also gives us some helpers that we're gonna need. I wanna be able to create a node ID.

We'll talk a little bit about node structure here in a second. And then we also need a contentDigest, which is a way to determine whether or not the node has changed. Which is really important for making Gatsby build faster, it's part of the caching mechanism. So let's talk about what a node structure is in Gatsby.

So a Gatsby node is going to be an object, and inside of that object, it needs to have some kind of an ID. And the way that you generally create a node ID is to just use this helper. And you can put any string in here, as long as it's unique.

Then we need to say whether or not there's a parent node. In a lot of cases, this is gonna be null. But if you're doing something where you want, you're deriving data from a node? A good example of this is if you have images in Gatsby. You'll have the file, and then underneath the file, you have the child Gatsby Sharp.

That's a parent relationship, in that case you would set the parent. In this case, we're not gonna deal with it. Children is also an empty array, there are no child nodes here. And then there's this internal block, and these are things that Gatsby uses to determine whether or not this node has changed.

So the first thing we're gonna do is, we're gonna set a type. And that can be any arbitrary string, it just needs to be unique, so MyCustomType or whatever. And then we wanna set what the content is. This one's optional, but it's kind of a nice way to make sure that you've got whatever your nodeContent is.

And this would be an object, I'll show this in an actual example here in a second. We're just looking at the details, and then you need a contentDigest. And so what a contentDigest is is, if you've ever seen a checksum or other way of validating that a file's contents match what they did at the time of publishing, that's what we're doing here.

We're creating a digest of what's inside of this node, so that Gatsby can say, is this the same node that it was before? If so, skip it, don't do any work, cuz we know that nothing changed. If the digest did change, then we need to blow this node away and create it fresh, because there's new information.

If we don't set this, Gatsby will recreate every node every time, and that's gonna slow down our build. So setting this is very important, and you would do that again by setting the nodeContent. And you can put whatever you want in here. If you only have two fields that are unique and one of them's randomly generated, cool, whatever.

I don't know why you would have a randomly generated field in data, but you do have that option. You only have to digest what needs to be digested, and you can make the call for what needs to change, or what should change and when. So this is the basic structure, and what I wanna do is, I want to create nodes, we'll we'll do using a helper from the actions object.

This is another nice one, so this one's called createNode, and we pulled that out of actions. Inside of this, I'm going to then just loop through my authors. And I want to, for each of these, create a node. So I'm gonna create a node, and it's gonna start by just putting in the author fields.

Because you'll notice down here, these are the fields that are required by Gatsby. I didn't put any data in here, but I can put whatever I want. Right, and then this is what I'll actually be able to query inside of Gatsby, is I'd be able to get my MyCustomType sum, and then that would give me the value of data, right?

That's what a node looks like. So I'm dropping in the author so that we get access to its slug and name. Then we're gonna create an ID, and we're gonna use createNodeId and pass in some kind of a unique string. So this is a way to do it that's guaranteed unique.

In the off chance that you had another section of people, and you had a people object that included N K Jemisin, you wouldn't want a collision where the author and the people would both have the same slug. So then the data types would collide, because they'd have the same ID.

This is a way to just put the type name in there, and then you're guaranteed it's gonna be unique. Then we're gonna set, there's no parent. I'm actually gonna just copy-paste this part, because why type it twice? So here's my parent and children, and then for the internal, our type is going to Author.

Our content will be json.stringify(author). And the contentDigest is gonna be createContentDigest(author). Okay, so if I get rid of this, Let's just give that a try, right? Let's see how that goes. So I've now created, theoretically, three new nodes inside of Gatsby. Cuz we've got three entries in this Author object.

All right, so now that this is running, what we should see is, when this pops open, we should be able to query for an author type in Gatsby. And remember, that's not available now. So without refreshing the page, here's Directory, File, Site, SiteBuildMeta, Function, Page, Plugin, that's it, right?

So let me refresh, hey, there's our author. And inside of it, look, we've got filters, we've got the ability to sort, limit, skip. And then I can get into my nodes, and I can get the name and the slug. And if I query these, look at that. We now have custom data in Gatsby, what a great day.

Learn Straight from the Experts Who Shape the Modern Web

  • In-depth Courses
  • Industry Leading Experts
  • Learning Paths
  • Live Interactive Workshops
Get Unlimited Access Now