Digging Into Node.js Transform Stream
Transcript from the "Transform Stream" Lesson
>> Kyle Simpson: We do want to actually do some processing on our streams as they go by. And the way we're gonna do that is we're gonna set up a transform stream. So I'm going to pull in let's see what it's called here at transform from the stream.
>> Kyle Simpson: There's another built in node model called stream, and we're gonna just pull off specifically the transform string constructor from it.
[00:00:31] We're gonna make a transform stream and the reason we're making a transform stream is so that we can step in the middle of a stream pipe and item by item we can process what's going through it. So, instead of having output stream immediately. In there what we're just gonna do is step in the middle of it.
[00:00:53] So let's make an upperStream that's gonna say new Transform. And the constructor takes an object that can have several configurations in it but the one we want to care about is that it needs a method called transform. And this method will receive some inputs, it will receive a chunk.
[00:01:18] It will receive an encoding which we don't care about in here. Doesn't even always know the encoding and then it seems a method to let us know that it's finished. I'll call it next, or maybe we'll call it cb. This is just a notifier that we have finished.
[00:01:36] The way that this transform stream works, it may seem a little bit strange, but it's basically kind of like an array. So if we wanna put something into our stream from that chunk, we literally just say this.push and give it something. Okay, so we're not passing it to callback, but we do need to execute callback so that the stream knows this chunk has been processed and it can move on.
[00:01:59] And that allows you to do asynchronous processing if you need to. Here we literally just wanna take the chunk, turn it into a string and upper case it. So we'll do that we'll say chunk.toString. Chunk of course is a buffer. We'll take chunk.toString which turns it into a string and then we'll call it toUpperCase on it.
>> Kyle Simpson: That pushes the transformed chunk into our upperStream. Now, we need to get input stream into upperStream. Anybody remember how we do that?
>> Kyle Simpson: Think about the shapes of the streams that we're dealing with. UpperStream is going to be a writable.
>> Speaker 2: Would it be outstream.pipe with upperStream as the argument?
>> Kyle Simpson: And output stream won't have a .pipe on it because the output stream is
>> Speaker 2: nstream.pipe
>> Kyle Simpson: So it's going to be inStream.pipe and we're going to give it a readable stream so upperStream and what's gonna come back from that is-
>> Speaker 2: A readable stream.
>> Kyle Simpson: outStream.
>> Kyle Simpson: So we're taking inStream in to this thing and then we're transforming it.
[00:03:23] Now I'm just gonna do one extra little intermediary step, which is gonna set us up for stuff we're doing later. When I assign outputStream there we're just gonna use outputStream. It's gonna allow us to do a bunch of transformations that we'll get to. So we're just gonna keep reassigning this variable outStream is equal toi outStream.pipe into something else, and then we'll do that over and over and over again, okay?
[00:03:45] And now we have a targetStream to write to, and we can finally say at the very end outStream into our targetStream and it should dump those contents. So let's go over to our code. And we're gonna try it with the Lauren.text. And we'll see that it was able to uppercase all of them, but to prove to you that it's happening one chunk at a time, now that we have a transform stream, but I'm actually gonna do is insert a delay.
[00:04:19] Into each step of the transformation, make it obvious what's happening. So I'm gonna just do a simple old set timeout. So I can just say cb and let's just delay by 500 milliseconds for each chunk and then when we run this you're gonna see that each chunk is getting processed.
[00:04:41] Every 500 milliseconds.
>> Kyle Simpson: So it's roughly 11 or 12 chunks to go through the whole megabyte file and then it's finished. So that proves to us it's not doing everything all at once. It's doing it efficiently. It's using these internal buffer sizes. I assume that they have some conditions within it where maybe they adjust those or even configure those, but I think the default is 65K.
[00:05:18] All right, let's take that timer back out. That was just for illustration purposes. Make sure you call cb.