Create a Cron Job

Lesson Description

The "Create a Cron Job" Lesson is part of the full, Build a Fullstack Next.js App, v4 course featured in this preview video. Here's what you'd learn in this lesson:

Brian shows how to create an API directory and set up endpoints in `src/app`, covering authorization checks and implementing a job to summarize articles. He also discusses using API endpoints and cron jobs for lightweight tasks.

Transcript from the "Create a Cron Job" Lesson

[00:00:00]
>> Brian Holt: So we don't have an API directory, so we're going to go create one. So in src/app, we don't have any API endpoints. We've been doing everything with server actions. So we're going to create a folder here called api. The only way you can run these jobs is as API endpoints. And I'll also mention here, I would only use this way of doing a cron job for very lightweight things, and if I only have one or two jobs to run over the course of the lifecycle of the app.

[00:00:31]
As soon as I had like 30 jobs running, I would no longer use Next for that. I would use some service, right? Either I'd have a cron service that I ran myself, or there's plenty of job frameworks that'll do that for you. Okay, we're going to go into app, we're going to create api. We're going to create another folder in here called summary.

[00:00:58]
And then we're going to create a new file in here called route.ts. This is how you do API routes with Next. Again, if you're taking this course, you've probably seen this in previous courses. But now we will have an API endpoint that will be localhost:3000/api/summary, right? It'll invoke automatically the route file in here.
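
For reference, a route file for this endpoint might look like the minimal sketch below. It assumes the file lives at src/app/api/summary/route.ts and uses the standard Web Response type, which Next.js route handlers accept, in place of the NextResponse helper the lesson imports. The response body is only a placeholder.

```typescript
// src/app/api/summary/route.ts (path assumed from the lesson)
// Next.js maps an exported function named after the HTTP method (GET here)
// to requests for /api/summary.
export async function GET(): Promise<Response> {
  // Placeholder body; the real job logic is built up later in the lesson.
  return new Response(JSON.stringify({ ok: true }), {
    status: 200,
    headers: { "content-type": "application/json" },
  });
}
```

The function name has to be the uppercase HTTP method for Next.js to pick it up.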

[00:01:25]
Okay, and we're going to make a job that just runs however, whenever we define it to. Trying to think if this is that interesting or, yeah, I mean, we'll do it. Okay, so we're going to import NextRequest and NextResponse from next/server. We're going to import eq and isNull from drizzle-orm. We're going to import summarizeArticle from AI/summarize.

[00:02:01]
We're going to import db from db. We're going to import articles from db/schema. We're going to import redis from cache. Okay, we're going to export async function GET, and req is a NextRequest. I don't think you even need NextResponse here, to be honest with you. Maybe you do. No, you do. I lied. Okay, so here's the secret sauce here.

[00:02:50]
I hope you're thinking about: we're making a public API for a cron job. What's to stop someone from just calling it from the public internet? It's an astute question. I'm glad you asked that. And you're going to say if process—so if we're in process, sorry, process.env.NODE_ENV. So if we're in dev, we're just going to say, run it, right?

[00:03:24]
Because otherwise we won't be able to test these things locally. That's what this will test. And what Vercel's going to do for you here is they're going to give you a header: if req.headers.get("authorization") is not equal to "Bearer" followed by this thing called the cron secret, process.env.CRON_SECRET. So let's see, where am I wrong here?

[00:04:12]
Oh, and I need to have this like that. Okay, so this says if we're in dev, ignore this, just run it anyway, and then we can just run it locally as a normal endpoint. If you are running in production, they set an environment variable called the cron secret, and Vercel will pass it to your endpoint as your authorization header, and if that's present, then you'll know that the request is coming from Vercel.

[00:04:35]
If it's not present, then it's just some random request coming from the public internet. So at this point it'll say like, you don't have the cron secret, I'm going to bounce you out of here. You're just going to say return NextResponse.json, and you're going to say error: "Unauthorized", and give it a status of 401. What am I missing here? JSON error.
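
The authorization check described above can be sketched as a small pure function. This is a sketch under assumptions, not the lesson's exact code: the function name isCronRequestAllowed is hypothetical, and the guard against a missing secret is an addition that the lesson does not mention.

```typescript
// Returns true when the request may run the job.
// authHeader would come from req.headers.get("authorization"),
// nodeEnv from process.env.NODE_ENV, cronSecret from process.env.CRON_SECRET.
function isCronRequestAllowed(
  authHeader: string | null,
  nodeEnv: string | undefined,
  cronSecret: string | undefined,
): boolean {
  // In development, skip the check so the endpoint can be called locally.
  if (nodeEnv === "development") return true;
  // Extra guard (not in the lesson): with no secret set, the header
  // "Bearer undefined" would otherwise be accepted.
  if (!cronSecret) return false;
  return authHeader === `Bearer ${cronSecret}`;
}

// Inside the GET handler this would be used roughly as:
//   if (!isCronRequestAllowed(...)) {
//     return NextResponse.json({ error: "Unauthorized" }, { status: 401 });
//   }
```

Keeping the check in a pure function makes it easy to exercise the production branch without building and starting the app.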

[00:05:31]
One of the brackets needs to be after authorized. Closing bracket. The closing bracket needs to be after authorized before the comma. Oh yeah, you're right. And then drop one of those. Okay, awesome. Okay, so now this will bounce the person out here of like, you're unauthorized, you can't call this API. Again, you might be looking at this and think like that seems like a pretty big hack.

[00:06:05]
Kind of is. It feels a little gross to me, but again, for like—I'll tolerate this that I don't have to set up another service right now. I have one job for my entire service, by all means, just do it this way. Okay, const rows equals await db. We're going to go find everything in the database where the summary is null.

[00:06:50]
So dot select, and we're going to select here the id: articles.id, title: articles.title, content: articles.content. Okay, that from articles, and where isNull articles.summary. Okay, so this is going to select everything from our articles table where the summary is not set yet, and it's going to get the id, the title, and the content.
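
To make the semantics of that query concrete, here is a self-contained sketch that models the same filter and projection over an in-memory array. The Article type and the sample rows are made up for illustration; the Drizzle call in the comment is what the spoken description appears to correspond to.

```typescript
type Article = {
  id: number;
  title: string | null;
  content: string;
  summary: string | null;
};

// In-memory stand-in for the articles table (sample data is invented).
const articlesTable: Article[] = [
  { id: 1, title: "Intro", content: "First article", summary: "Already done" },
  { id: 2, title: "Cron", content: "Second article", summary: null },
  { id: 3, title: null, content: "Third article", summary: null },
];

// Rough equivalent of:
//   db.select({ id: articles.id, title: articles.title, content: articles.content })
//     .from(articles)
//     .where(isNull(articles.summary))
// which is SELECT id, title, content FROM articles WHERE summary IS NULL.
function selectUnsummarized(table: Article[]) {
  return table
    .filter((a) => a.summary === null)
    .map(({ id, title, content }) => ({ id, title, content }));
}

console.log(selectUnsummarized(articlesTable).map((r) => r.id));
```

Only rows whose summary is still null come back, which is what lets a later run pick up anything an earlier run missed.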

[00:07:21]
We're going to use that to go call the summary function, and then we'll just update the database with it. Okay, and then we'll keep track of this. We'll go say let updated equals 0. Console.log, and we'll put a robot face here. Starting AI summary job. Okay, and then we're going to say for const row of rows. And then we're going to wrap this in a try-catch because, again, we never want to crash our server because of a job that's running.

[00:08:05]
That's another reason why I find this a little hacky is that you're putting additional load on your server kind of artificially, and if you have a lot of these jobs to run, you end up kind of bogging down your app server, which is going to slow down your users, which is not what you want. And here we're just going to call the summary API, await summarizeArticle, row.title or empty string, and row.content.

[00:08:57]
And again, you could put here empty string if you want, but it should be fine either way. Const summary equals that. Then we're going to say if summary—and so, I guess we already do this. We do this already in, I guess, maybe not. We're going to put this in here. If you get back an empty summary, then you wouldn't want to insert that into the database.

[00:09:50]
Dot length, sorry, dot length. Okay, and then we're going to say await db.update articles dot set summary, where eq articles.id is equal to the id that was passed in, row.id. Okay, that all looks good. And then we're going to say updated plus plus. This is more so that we can log it out at the end of this. We put a catch here, e, and here I'm just going to say console.error something like "Failed to summarize id" row.id.

[00:10:40]
And then you just put a continue here because might as well try and figure out how to do the other ones. It's unnecessary. I just like putting continue here just so it's very obvious like I'm explicitly opting in to continue. It doesn't want me to do that, so sure, whatever. Yeah, Mark. So that cron secret is just defined by Vercel and we never have to add it to our .env, correct?
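
The loop described above can be sketched in isolation as follows. The summarize and save parameters are hypothetical stand-ins for the lesson's summarizeArticle helper and its Drizzle update (db.update(articles).set({ summary }).where(eq(articles.id, row.id))), injected so the sketch runs without a database or AI provider. The explicit continue is left out, since it is unnecessary at the end of a catch block.

```typescript
type Row = { id: number; title: string | null; content: string };

async function summarizeRows(
  rows: Row[],
  summarize: (title: string, content: string) => Promise<string>,
  save: (id: number, summary: string) => Promise<void>,
): Promise<number> {
  let updated = 0;
  for (const row of rows) {
    try {
      const summary = await summarize(row.title ?? "", row.content);
      // Skip empty results so a blank summary is never written.
      if (summary && summary.trim().length > 0) {
        await save(row.id, summary);
        updated++;
      }
    } catch (e) {
      // One failure (for example a rate limit) must not abort the whole run;
      // the loop simply moves on to the next row.
      console.error("Failed to summarize id", row.id, e);
    }
  }
  return updated;
}
```

Because the try-catch sits inside the loop, a failed row stays unsummarized and is picked up again on the next run.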

[00:11:13]
Yep, this is it. This is all you have to do. You shouldn't define this cron secret. It's only going to be defined by Vercel. All right, so if updated is greater than 0, then we want to clear cache, right? Because everything will have changed at that point. So we're going to say await redis.del articles:all. Oh, catch e, console.warn.

[00:11:51]
We'll put a warning symbol in here, something like that. Failed to clear articles cache. Something I didn't do for this course, but you definitely could have done, is we could have made some sort of logging framework. You see I'm just throwing around emojis willy-nilly, like it probably would be better to be a little structured about this.

[00:12:13]
We could have used some sort of Next logging framework or something like that, or even just written our own, but wouldn't be that hard to replace here. Console.log. The nice thing is that Vercel makes these very searchable no matter what. All right, we'll put a robot face in here. We'll say "Concluding AI summary job. Updated" updated rows.
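
The cache-clearing step and the closing log line could be sketched like this. The del parameter is a hypothetical stand-in for the cache client's delete call, the function name finishJob is invented, and the cacheCleared field is added only so the behavior is observable; the lesson's response carries ok and, optionally, updated.

```typescript
async function finishJob(
  updated: number,
  del: (key: string) => Promise<unknown>,
): Promise<{ ok: true; updated: number; cacheCleared: boolean }> {
  let cacheCleared = false;
  // Only invalidate the cached article list when something actually changed.
  if (updated > 0) {
    try {
      await del("articles:all");
      cacheCleared = true;
    } catch (e) {
      // A cache failure is logged as a warning but does not fail the job.
      console.warn("Failed to clear articles cache", e);
    }
  }
  console.log(`Concluding AI summary job. Updated ${updated} rows.`);
  return { ok: true, updated, cacheCleared };
}
```

The returned object is what the handler would serialize into its JSON response.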

[00:13:10]
Okay, and then we're going to say return NextResponse.json, and it'll be ok: true, and you can put updated in here as well if you want to. Okay, so looks like this should be okay. So because we're in development mode, we should just be able to go to /api/summary, like this. And this will actually take a while because it's going to go run on 24 different things, so we had one that had been summarized, gotten a summary.

[00:13:41]
We had 24 that hadn't. You can see here it's starting AI summary job. I might even get rate limited here, because it's going to do 24 right in a row. We'll see. Yeah, see, I think that's what's going to start happening here. So you can see that I got through 7 before Vercel said, you're a free tier, we're going to rate limit you.

[00:14:05]
So you can see it, I did 4. And so if I go refresh my page here. But I mean, this is actually a perfectly good example of failing kind of gracefully here. This is not the job that we wanted to bring down the service, like we didn't want this to crash our Next server, so we were pretty defensive with our try-catch statements.

[00:14:38]
But some of these, it's probably down here at the bottom. You can see some of these end up with summaries. Yeah. But the worst case thing is, some things didn't get summarized, that's very okay. And now we can go schedule this so that it'll run again, eventually over the course of however long that we set up for our cron jobs.

[00:15:06]
They'll all eventually get summarized, right? So let's go. So far this is just an API endpoint that we can invoke in development. How do we make it a cron job now? Well, let me show you. We're going to create a new file in the root directory, so not in source, just in the root, and it's called vercel.json. And this is a specific configuration that you can pass to Vercel.

[00:15:34]
There's other things you can put in here, but we're going to put crons, and it's going to be an array of just jobs that you want to run. So we're going to put a path, so you're going to define the API path, which is going to be /api/summary. And we're going to give it a schedule, which is just going to be a normal cron schedule.

[00:16:05]
If you don't know how to write crons off the top of your head, what's wrong with you? No, I'm just kidding. I have to go to crontab.guru every single time. You can do whatever you want. I did 0 0 * * 0. This will run every Sunday at midnight GMT. Now, if this was my actual production website, I would want this maybe every hour, or every day.
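
The configuration described here is plain JSON. The sketch below builds the same structure as a typed TypeScript object and prints it, assuming the weekly schedule is written in the standard five-field cron form 0 0 * * 0 (minute, hour, day of month, month, day of week). The CronJob and VercelConfig type names are invented for illustration.

```typescript
// Shape of the cron section of vercel.json as described in the lesson.
type CronJob = { path: string; schedule: string };
type VercelConfig = { crons: CronJob[] };

const vercelConfig: VercelConfig = {
  crons: [
    {
      // The API route the scheduler should call.
      path: "/api/summary",
      // Minute 0, hour 0, any day of month, any month, day of week 0 (Sunday).
      schedule: "0 0 * * 0",
    },
  ],
};

// Printing the object yields the JSON that would go in vercel.json.
console.log(JSON.stringify(vercelConfig, null, 2));
```

Each additional job would be another entry in the crons array with its own path and schedule.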

[00:16:29]
The reason why I'm doing every week is because I know a lot of you are going to be following along with me, and if I put every minute on this, it's just going to eat through all of your Vercel and Neon free tiers, which I did not want that for you. Because if you look at your articles or the summarize here—no, part of wherever our route was, we went our route.ts.

[00:17:00]
This will wake up your Neon database from db and the crontab will actually start your server if it spins down to zero, right? So again, it would waste a lot of your free tier. So that's why we're doing this once a week. But if you've never been to crontab.guru, this is a very useful website for this. 0 0 * * 0. It puts in plain English what a crontab means.
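
To show how the five fields of a schedule are read, here is a deliberately tiny matcher. It is a sketch that supports only * and single numbers, evaluated in UTC; ranges, lists, steps, and cron's special handling when both day fields are restricted are left out. The function name cronMatches is hypothetical.

```typescript
// Fields, in order: minute, hour, day of month, month, day of week.
function cronMatches(expr: string, date: Date): boolean {
  const fields = expr.trim().split(/\s+/);
  if (fields.length !== 5) throw new Error("expected 5 cron fields");
  const actual = [
    date.getUTCMinutes(),
    date.getUTCHours(),
    date.getUTCDate(),
    date.getUTCMonth() + 1, // cron months are 1-12, JavaScript months are 0-11
    date.getUTCDay(), // 0 = Sunday
  ];
  // Every field must either be a wildcard or equal the current value.
  return fields.every((f, i) => f === "*" || Number(f) === actual[i]);
}

// 7 January 2024 was a Sunday.
console.log(cronMatches("0 0 * * 0", new Date(Date.UTC(2024, 0, 7, 0, 0)))); // true
console.log(cronMatches("0 0 * * 0", new Date(Date.UTC(2024, 0, 8, 0, 0)))); // false
```

Read this way, 0 0 * * 0 means minute 0 of hour 0 on any date whose weekday is Sunday.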

[00:17:32]
So if we did something like this, I gotta remember. I don't even know. See, I can't even remember how to do this off the top of my head. At 5 o'clock in August. Some of these will, I don't know if this works in Vercel, but you can also sometimes do things like @daily or something like that, and it'll just work like that, a little bit more plain English.

[00:18:16]
I think that works in Linux. It just depends. In any case, this works for now. This will now run with Vercel. It'll run every Sunday at midnight. So let's try this. I'm going to say npm run build. This will build my production version of my site. And then I'm going to say npm run start, which will run the production version of this.

[00:18:48]
So in theory, I should be able to still come to my Wiki Masters. This is working, but this is now running in production mode and not in dev. And I think if we go to /api/summary, we'll get an unauthorized because it's not in development mode anymore, which is what you want, right? Yeah, is there a problem with long-running crons on serverless?

[00:19:09]
Not on Next, I don't think Next will cut you off. I think you'll just pay for the compute. Generally speaking, yes, 100%, right? Like if you're running it on Lambda or Azure Functions, those have hard limits. Lambda used to be on the order of minutes, and now I think Lambda is an hour. It's been a long time since I've run into that issue with Lambda.

[00:19:33]
But the general thought process here of like, I should be worried about how long my serverless jobs take is wise. Yeah, when you call the cron, would it automatically summarize the articles that have no summary? Yep. And if you have one that's edited ever, would it ever go back to that one? Would it be able to tell? Like if someone edited enough that the summary should be different.

[00:20:08]
Oh, I see what you're saying, and then the summarize job fails. That's actually, I mean, that's actually a really good point. That would be a corner case I hadn't really thought of. What you'd really want to do in that particular case. Let's go look at our articles action. So here, if summary came back, because this would come back as empty string, right?

[00:00:00]
This would get unset. So actually, no, I think you'd be fine, because then it would get caught by the next job, right? If you actually failed in the middle here. No, because then the database wouldn't get updated. So it would just fail the entire update. I think you'd be okay from what I can mentally muster at the moment to debug that, but it's a good point.
