Check out a free preview of the full The Hard Parts of Servers & Node.js course:
The "Introducing the File System API" Lesson is part of the full, The Hard Parts of Servers & Node.js course featured in this preview video. Here's what you'd learn in this lesson:

Will plans out a way to use the file system API, fs, to save the tweets into the file system and then apply modifications using JavaScript.

Get Unlimited Access Now

Transcript from the "Introducing the File System API" Lesson

[00:00:04]
>> Will Sentance: We have much of our Twitter app, folk, now set up, handling, inspecting, the inbound request message. What format's a request message in, everyone together? Begins with an H, and ends in TTP.
>> Speaker 2: HTTP.
>> Will Sentance: Good job, all right.
>> [LAUGH]
>> Will Sentance: They're known as requests. It's the core of our app.

[00:00:27] People, it's the core of Node. And it's the core of servers. This is Node. Everything else is ancillary. It's really additional to this core model of inbound message, send back. Because that's what that's all we really care about, really at the heart is user, client, wants stuff. Look at what they want.

[00:00:52] Luckily we've got a message that's gonna be sent back. Add stuff to that message, send that back, add data, send back. It's the heart of it. But Node can do even more. We have a huge archive of tweets, 1.5 gigabytes. Unfortunately, they're saved on our computer. They're not in our little JavaScript memory up here.

[00:01:11] Although it can handle that much, but we're not storing them up there, are we? We're storing them long-term permanently in here. We wouldn't want to have to reinitialize them all every single time we turn on Node. We're going to store them permanently down here. It's much more permanent stored, in the memory of the computer, in the hard drive of the computer.

[00:01:29] [INAUDIBLE] a store. Could we load them into JavaScript in order to run a function that removes the bad tweets, the mean tweets? We wanna do this, right? 37% of Twitter bad tweets were auto removed in the last year, rather than by manual cleaning. This is much higher than previously, which was 0%, apparently, good job by them.

[00:01:56] We can use a feature of the computer. There is a file system or file storage, the hard drive. Node, C++ has access to that. It's a feature, no problem. I love it. It's a built-in feature that speaks to the damn filesystem. It's known as fs. And we've got a label for it here, which we typically, well, it's fs, it's a label for this feature.

[00:02:16] Okay, there might be some issues with the file being that large, though, because I'm gonna bring all the file in, 1.5 gigabytes, and then start cleaning it. If you're experienced in solving problems with code, you're probably going, wow, that sounds like a really unoptimized way of dealing with that problem.

[00:02:38] Let me give you a little preview. Every 64,000 bytes, every 64 kilobytes of data that gets pulled in, I have a feeling a little message, a little event might be flashed out. I wonder if we could do something with that information, but hold that thought for now. All right, well, by the way, what is 1.5 gigabytes of data?

[00:03:00] Every character in a Tweet is worth 1 byte. What is 1 byte? It's eight zeroes and ones, which have apparently 256 combinations. So you could represent the letter A, B, C, D, E in different combinations of zeroes and ones of eight of them. Some things which we see as single characters are actually two sets of eights, zeroes, and ones.

[00:03:25] Like emojis, they can't be done in one set of eight zeroes and ones. There's not enough. There's too many emojis, not enough zeroes and one combinations in eight of them. So you have two bytes to capture that. But that means 64,000 bytes of the letter A is 64,000 letter As.

[00:03:45] 1.5 gigabytes, [SOUND] 1.5 billion characters? Maybe, rough, I think it's right. I think that's right. I was right! Hundreds, I'm right, definitely right. That's a lot of tweets. But by the way, it also takes, people, a long time to get pulled out of the hard drive. Accessing one megabyte from the hard drive takes, one millisecond.

[00:04:14] Yeah, one millisecond. Yeah, yeah, one millisecond. So this is gonna take a bunch of time. Multiple, multiple seconds to get out of or at least up to, multiple milliseconds, multiple seconds, to get this data out of our hard drive into JavaScript. That's very interesting. I wonder if during that time, I wish I could get started on it and start cleaning it rather than waiting.

[00:04:41] That would be amazing, right? That'd be my dream, but who knows.