This course has been updated! We now recommend you take the Functional-Light JavaScript, v3 course.
Transcript from the "Proper Tail Calls" Lesson
[00:00:00]
>> Kyle Simpson: So back to our summation recur that we've been working on before we did the exercise. This was representation of that function and you notice we did that peek ahead. But there's something that we all have to come to grips with around this recursion, which is that recursion is one of those things that in theory works great.
[00:00:27] But there are some practical limitations that computer scientists have always had to deal with. To understand those practical limitations, we do have to think a little bit. We have to go a little bit deeper under the covers, if you will to how function calls. And how something like recursion would actually get implemented, how it would get processed, by an engine that was running this code.
[00:00:50] And what I'm about to say really doesn't have anything specific to do with JavaScript. It really is just about the idea of how a computer could express something like recursion for languages that support that. It really goes all the way back to some of the original, earliest languages, the lists, they had recursion very early on and they had to deal with this particular problem.
[00:01:13] So we want to try to understand what that is gonna imply about our usage of recursion. You'll notice that we make a call to sumRecur here on line three, we make this call to sumRecur. And that is actually no different than calling any other function. It's not like it's special because it happens to be calling itself.
[00:01:35] It's calling any other function. So we could really generalize this to saying that could have just been a called a foo or a bar or something. But any time one function makes a call to another function. All right, question has to get asked which is, what is the current state of things, the current set of variables, the current position within, the statements and expressions when that call dispatches.
[00:02:02] Because we need to save all of that information, so that we can restore it when we're done with that function call and we come back. So we could ask ourselves, what is gonna happen after the call to sumRecur on line three? Say it's gone off and done its work and now it's given us a value back, what's gonna happen after it?
[00:02:22] Well, clearly what's gonna happen after it is that it's gonna get added. Its results are gonna get added to the sum variable and then returned. So there's a little bit of work left to do, we're not fully done with this function. Because we're not fully done, preserving that state is super important.
[00:02:40] So the way computers generally implement this idea of a function that could call another function that could call another. In memory they reserve an area of memory that's given the name a stack frame, generally. It's a small area of memory that keeps track of all the variables, sometimes referred to as the variables that are on the stack.
[00:03:00] It keeps track of all of the variables that it knows about, keeps track of the program counter, the position it is in the program, things of that nature. So there's some state that it has to keep track of. And if you've ever heard of the call stack, we use that word stack because, as a data structure we talk about an item that comes here and then a stack, another item on, and another item, and another item.
[00:03:23] And when we're done, we pop that off the stack, and pop that one, and pop the next one. So calls are kinda modeled the same way. We have a call to the current function, and then when we need to make a new call to another function, it's gonna need its own area of memory.
[00:03:37] So we make another stack frame and put it on the call stack. And then if it calls to another function, we make another stack frame and put it on, and they stack up and they stack up. Now this is not a lot of memory being used, but it is a small amount of memory being used.
[00:03:52] Maybe couple dozen bytes, 100, 200 bytes. Something like that. It's a not a ton of memory, but it is getting used. And we know that machines run with, have a fixed amount of memory that they can use. So it probably doesn't take too much of a leap of understanding to realize that that means that a call stack only has a fixed number of depth that it can go before we're gonna exhaust all the memory in the system.
[00:04:20] And when you think back to the 50s when lists and recursions were kinda being wrestled with the first time. [COUGH] They had the exact same problem, but they had a lot less memory than we did. They had a lot less memory in their computers than we did. So it was a very pronounced problem for them.
[00:04:39] And so what we observed back then was recursion works if I'm going to just like a callstack would work. Recursion works if I'm gonna do a small data set. But if I'm gonna end up, for example, summing up a list of 10,000 numbers, now I'm gonna have 10,000 stack frames.
[00:05:00] What about 100,000 numbers? That's 100,000 stack frames. At some point, there's a physical limit to the system. At some point we've run out. Now when you weren't doing recursion. If you just did foo calls, bar calls, bass, you're almost never gonna run out of call stack space because you're never gonna have a million separate functions most likely in your application.
[00:05:23] And you're certainly not gonna wire up a million different function calls like that. You probably at most have, I don't know 10, maybe 15 calls stacked up whenever recursion isn't in play. So a lot of people think this problem is specifically related to recursion. It's really related to any function calls, but you almost never would observe it unless recursion was involved.
[00:05:48] So, what are we gonna do? Because we're developers, we like to extrapolate things to their extremes. This is all well and good if I know I'm only gonna have 5 or 10 or 100 numbers. But what somebody in production gives me a million numbers? Am I gonna run out of space, how many can I go?
[00:06:07] How do I even know how far I can go? Cuz I don't know how much memory is on somebody's mobile phone or on the device, how do I even know? Well when we talk about JavaScript, there's a specific answer to the question. But in the general case they didn't really have a generalized solution to that problem.
[00:06:23] So you just kind of ran until you ran out of memory. You crossed your fingers that you didn't run out of memory. That obviously was an untenable solution for them. So the early days of the computer scientist, he said, we have to come up with some solution to this problem.
[00:06:39] That's not gonna work in general. If that's the way recursion's gonna happen, we're never gonna really be able to use recursion practically in any applications. So they said, okay, what could we do to reduce the amount of memory that was having to get created every time. And maybe not just reduce it, but is there some way that we could eliminate needing to create new memory allocation for each function call in a recursive call stack.
[00:07:09] In other words, could we run in fixed memory, a fixed amount of memory, instead of an infinitely growing amount of memory? Whatever that fixed memory is, we know we can fit that within the device, whatever device it's running on. So I began to think, why is it that we need a stack frame to be kept around?
[00:07:28] It's because we need to restore in case we're gonna do anything after the function call. And so they begin to think well, what if we could arrange it, so there wasn't any work to do after the function call? If we could arrange it so that the function call was the last thing to happen, then we don't need to actually keep that memory around.
[00:07:51] And so when we go to make the next call, we could just reuse the same stack frame, we could just override it with. The memory, we could reuse that memory for the next stack frame. Or a different implementation, an alternate implementation is to create a new stack frame and then just throw away the old one.
[00:08:08] Either way, we could run essentially forever and only have one stack frame in memory at any given time. That only works if we can somehow arrange our algorithm so that we don't have any work to do at the end afterwards. So it's this part right here, on line 3, it's the sum plus part that's creating this problem for JavaScript.
[00:08:30] We are [COUGH] arranging this, so that there is work to do after the recursive bit and that's what's gonna keep those stack frames around. As via 6, the spec requires the JavaScript engine to be smart enough that if you orient it, so there's no more work to do, then it should be able to run your programs without the limits.
[00:08:53] But when there are limits in place, either because you're in a browser that doesn't support that. Or because you don't arrange yourself in that form which we gonna talk about that form in a second. Then JavaScript is gonna not allow to just run out all the memory on the device, they're gonna have a limit.
[00:09:11] And you've probably seen that error that's been thrown before. You may have run some code and either on purpose or accidentally created a really long call stack. And then, they got the range error maximum call stack size exceeded. Now that error used to be really low, like I remember back in IE 4, 5, 6 that that limit was 13.
[00:09:36] Literally you could not call, you could do a chain of 13 function calls, but when you went to the 14th one it would throw a range error which was nuts. At the time I was writing a compiler, and the compilers were recursive descent. And I very quickly said, well, there's a whole bunch of programs I can't implement because the syntax of the language is more than 14 levels deep.
[00:10:01] And so I can't compile that language. [COUGH] So modern days, thankfully, we don't have a limit of 13. And it seems, although I'm not sure cuz I don't work on the browser engines. This is all left entirely up to the implementation to decide. But it seems like the engines nowadays have an adaptive limit, where they just say, I am gonna stop you at some arbitrary point that they determine.
[00:10:26] And they might determine that differently depending on the device. But on my laptop, when I run in a modern day Chrome, that limit seems to be somewhere around 20,000. Somewhere between 20 and 25,000 it will spit out a range error on me. But if I do less 20,000, that's a good size.
[00:10:45] And theres are a lot of data sets that that will work against. But, as a good computer scientist, I don't want to write something if it's gonna accidentally throw an error in production, given data that I can't predict. So that's an unacceptable limitation, even though it's better than it used to be.
[00:11:02] And I think it's for that reason that JavaScript developers have essentially not used recursion very much for the majority of the time JavaScript's been around. Because we have this limitation and we don't have an engine that's even smart enough to do it, even if we rearrange. So that's why it was such a big deal when ES 6 came along and said, we're finally gonna put in this optimization.
[00:11:24] Or we're gonna put in this characteristic, that if you write your code in what is so called PTC, which stands for Proper Tail Calls form. If you write your code in Proper Tail Call form, then what we will do is ensure as the engine, the engine will ensure that you will never run into that range error limit.
[00:11:42] You'll never run out of memory. So I'm guaranteeing that it will somehow run in a fixed space of memory. Now there was a lot of early writing on this topic and I was one of them that referred to this as TCO, Tail Call Optimization. Since then I've realized that these are actually adjacent topics but not the same.
[00:12:06] Proper Tail Calls is specifically that the code can run in a fixed amount of memory. In other words, never run the device out of memory. It could run infinitely and not run the device out of memory, if you arrange it as a Proper Tail Call. Tail Call Optimization says within that space, there's lots of different ways that we might implement that.
[00:12:27] Some of them are faster than others. So Tail Call Optimization is a set of additional things that can be done, like reusing the stack frame or other sorts of tricks. So it's kind of like PTC is the umbrella and TCO is the way that an engine might choose to do different sorts of optimizations, make things not slow.
[00:12:47] As it turns out in the general sense, creating a new stack frame and throwing an old stack frame array. If the engine just started doing that, that would actually make every function in your program, every single time recursive or not slower. So just implementing that feature that they could run in the recursive case in fixed memory would make all functions, even non-recursive ones, run slower.
[00:13:12] So they wanted to leave the door open to say, there's lots of tricks you can perform to optimize that. And there's a whole series of tricks, we won't really get into that, but there's a whole series of tricks called TCO. So when you refer to this feature, refer to it as PTC.
[00:13:26] Because that's the part that's required by the spec. TCO is not required by the spec, but it's allowed. The engines are allowed to figure out how to optimized it however they want. So here's what Proper Tail Calls looks like, in JavaScript specifically. Here I have this foo function that calls this bar function and you'll notice something right off the bat that I've inserted that you haven't seen in my previous code snippets I've switched this into strict mode.
[00:13:56] PTC requires strict mode. If you don't have strict mode on, PTC is not gonna be in effect and therefore, those limits are gonna be in effect for you. So you have to make sure to remember to switch to strict mode. If you're not already using strict mode, this is just one more of the plethora of reasons that you should be using strict mode.
[00:14:15] So you're gonna make sure to remember that and I always forget that. And for some reason I always forget it and I try a little test and it doesn't work and, dang it, I forgot strict mode. Still getting myself into the habit of remember to writing, even for small, little test snippets, you're gonna have to have it there.
[00:14:32] As a matter of fact, I literally have some files on my desktop. I have JavaScript file and an HTML file on my desktop of my system. Test PTC, and anytime I'm doing anything around recursion, I just drop it in there so I don't forget that that's the environment I need to set myself up in.
[00:14:49] But now, we ask obviously bar on line 8, that doesn't make any call. So it's already gonna run in fixed space. But the foo function does dispatch to bar. But does it do any work after it returns? And the answer is no, it just simply returns its value back.
[00:15:08] So this is what we call PTC form. That all of the calls that are made from that function to another function are in a position where they are at the very end of the flow control of that function. And specifically, the return keyword has to be part of the call, even if it's not gonna return anything.
[00:15:30] It should, because, hey, we're functional programmers and real functions should be returning things. But even if technically you weren't returning things, to take advantage of PTC, you're gonna have to have it in this specific form. You're gonna have to say return bar
>> Kyle Simpson: There's one exception to that, which is, you're allowed to have a ternary in the return clause.
[00:15:59] And as long as the function call is in a position in that clause where it's the last thing to run, then that's okay. So the, because the ternary is gonna get run first before the function call happens. But generally speaking you PTCs are gonna look like return bar.
[00:16:16] Now that's a non-recursive scenario. If we switched to a recursive scenario Here foo was calling itself. So, I have the return foo position. The return foo any position, where it's the last thing. Does have to be the last line of the function, it just has to be that in every path that we can dispatch to a function, that path has to be in PTC form.
[00:16:39] So we come back then to this example, and we say well, this is not PTC. It's not PTC because of that sum that happens after the function call finishes. So this code would unfortunately, even with strict mode on, would unfortunately not be able to take advantage of PTC.
[00:16:59] It would run within the limits and then pro arrange here if we ran it too far. So we wanna talk about some strategies that we could use for optimizing or rearranging or our codes so that they would be able to take advantage of PTC. That sum + sum + sum part is the problem, so we have to ask ourselves what could we do to get rid of the sum + sum + sum part, it's the plus part really, what would we do to get rid of that part?
[00:17:35] And to be completely honest with you, it is not, even to this moment with my experience with this, It is not terribly self obvious to me. Until somebody explained it to me a few hundred times, that I started to kinda get it through my thick head. So don't feel bad if this feels a little unnatural.
[00:17:54] But the question to ask yourself is, what is being kept on the stack frame. It is specifically the sum variable that's being kept on the stack frame and if I want to get rid of this stack frame then what can I do with that variable? If I can't keep it in a stack frame then what can I do with it?
[00:18:16] And the answer is I can pass it along as an argument to the next call. Instead of holding onto it now, I can calculate right away what the addition is and pass that partial total along. So essentially if I reserved the first argument for my sum recur function, if I just reserved an argument that was kind of the accumulator, and whenever you called it you could start with that accumulator position 0 and then have your list of numbers then I could just always update the accumulator and forward pass it along and then there would be no need for the stack and I would be able to say directly just return sumRecur and then pass along that argument.
[00:19:05] So because that function might have a signature that is awkward for the user to use. I wanna hide that awkward signature from the outside world. So here's that awkward signature where I've got that item in the first position. That you would normally have to pass a 0 in.
[00:19:26] And what I'm then doing is calling the recur function inside of it. So you'll notice there I'm calling recur, and I'm passing along that list of numbers. And I could pass along the 0 there in that first position if I wanted to. Now this means that from the outside world, I still get to use my function like I'm doing on line 14.
[00:19:48] I still get to use it exactly the same way I'm used to, which is just passing 3, 4, 5, 6, 7, 8, 9. Under the covers, it's an implementation detail that it chooses to do so recursively. And it's a further implementation detail that it chooses. To do so in a PTC form.
[00:20:04] Cuz you'll notice there's now a PTC form in both of the places where function calls are happening. It's return function call. And what I do here on line 10 is I pass along sum in that first position, there on line 10. That's because I've already calculated on line 8, I've gone ahead and pre-calculated the current iterations partial sum.
[00:20:26] I took whatever sum was, I added the first numb to it, I've recalculated that partial sum, I either returned that partial sum on num 9 because I'm done or I forward pass that partial sum along to the next iteration of the recursion. So as long as I have that first item in that position that I'm allowed to kind of do with it, whatever I need to.
[00:20:49] I can just use that as my stack. Now because this is simple addition, all I need to do is add a number, but your case might be that that first position needs to be an array, and you need to stick stuff in an array, and so you're gonna kinda use the array as a heap almost.
[00:21:05] A set of memory that you could grow and contract or stick stuff in or whatever, but you're reserving essentially that first position as your memory instead of the stack frame. And now it's just gonna be able run unbounded according to the JavaScript engine. Of course if you grow that, if you use an array there and you grow that array infinitely You're eventually gonna crash your device and run out of memory, too.
[00:21:29] So you have to be careful about that, but at least you won't be limited artificially by the engine. You have the freedom. And here, we're not gonna grow any memory cuz we can just do a numeric addition. That's all we need to do to accumulate. I've done recursion where that first position is a function that gets more and more composed That's a way of doing it.
[00:21:48] I've done it where it's an array and I stick a few values in the array and then I collapse those down and a few more into the array and collapse those down. I've done them with numbers, with strings, but essentially that first position or maybe a couple positions can be used as your accumulator instead of your stack frame.
[00:22:07] But, there's a different problem that we've created. Other than the fact that this is now a bit more verbose, there's some more overhead here. And, we've lost a little bit of the graceful nature of the recursive notation. It's still pretty good. Lines seven through eleven are still pretty straight forward to our implementation.
[00:22:26] But this extra boiler plate of having to wrap it around is unfortunate but there is a deeper problem. Not a huge one but a deeper problem we should be aware of. Which is every time I call the outer function sumRecur, that is every time I ran line 14 I would actually be recreating the inner function on line 7.
[00:22:45] So if I only use the utility once no big deal, I just create it once and use it a bunch of times. But if I was doing line 14 thousands and thousands of times, I'd be unnecessarily recreating that inner function over and over. So that's an unfortunate detail, cuz I'm kinda shooting myself in the foot creating a much of memory turn that I don't want.
[00:23:08] Now why am I putting that recur on line seven? Why am I putting it inside anyway? The reason generally is, because it's not a function I want to expose to the outside world. I normally would name it with an underscore or something. It's an implementation detail. So generally, encapsulation says keep that thing hidden.
[00:23:29] Your library might give you a place to put your private functions, and then that would've been fine to stick it there. But given that we wanna just implement our own self-recursive function, we wanted to hide it, and that's why we stuck it inside of that scope. That's not the only way to hide something in a scope, though.
[00:23:48] Instead of making our sumRecur be a function that does that, we can instead make the outer scope be a one time use iffy that creates the scope and it hides it. And the if field returns our function. And that's the way of getting around this that the inner recur can get recreated.
[00:24:08] So, I'm just gonna rewrite it slightly and now some recur is pointing at that returned function. And that return function is inside of a scope where it has accessed to the inner implementation recur on line nine. So I only create those two inner functions once ever, free to call some recur as many times as I want without doing unnecessary work.
[00:24:39] Now at this point, we've certainly created a pretty fair bit of overhead. And if you're in the position that I am, where you're trying to make code more declarative, you kind of wince at the fact that, sure, I feel more clever about using recursion. But man, this is not nearly as graceful maybe this wasn't worth it.
[00:25:03] I'm not ultimate pragmatist, I'm always asking yeah, so great I use recursion but at the end of the day, this is a lot uglier. It's not really what I want. Sometimes, the performance aspect of it, sometimes you're doing an algorithm that really needs recursion because the iterative form of expression is much harder to do.
[00:25:23] Summation of nums is not one of those cases. But if you were doing recursive descent to the binary tree, that's a lot harder to write iteratively. You probably wanna do it recursively and then if performance matters, you kinda just have to do the uglier form of it. But [COUGH] when I was preparing this material sometime ago, this was kind of my final slide in this discussion when I landed here and I said, just so you know this is the one of the techniques we can use.
[00:25:52] But sometimes this is just the unfortunate trade off, the code has to get a little bit uglier so that it will perform aka not crash in production. Sometimes that's what we have to make that concession. But something about this code caught my eye one time, when I was looking through it.
[00:26:16] And I think it's because of the discussion around the point free coding. And something caught my eye when I looked up at line four. And I said, I'm creating this function that will call my inner function and you'll notice that the signature of those two is compatible, it's identical.
[00:26:35] In other words, as point free coding told us earlier in the course, I don't really need that wrapper function on line four. Couldn't I just return recur itself? And as soon as I got to that point I was like wait a minute, I don't need this whole wrapping scope thing anyway.
[00:26:52] The way I originally expressed recur, as the signature, where I had those two separate parameters, it's already compatible with this forward passing concept. So, I don't need to hide that as an implementation detail, I can just simply expose recur as is. That signature is compatible to expose to the outside world, so now my PTC form recursive edition just looks like that.
[00:27:22] Now that's different than what it was six or seven slides ago because now on line four where pre calulating the sum, the partial sum and we're forward passing it on line six. That's a bit different than what we were doing before but now that is sufficient to get our algorithm into the PTC form, to take advantage of tail calls.
[00:27:48]
>> Kyle Simpson: In my own coding, I've had mixed success with this kind of refactored into PTC form. The general strategy is usually, try to figure out what the stack needs and then figure out how to pass it in the first argument. Sometimes that works well, sometimes it ends up creating something really super awkward.
[00:28:08] Hard to debug or whatever and I just give up, and I don't do recursive. So I've had some mixed success with it but at least there are times where I can implement something recursively and take advantage of PTC. Well, I could take advantage of PTC if the browser engines would implement this spec.
[00:28:29] So, the final point I'm gonna make before we take a break here, is unfortunately, as things stand, there's only one browser that has faithfully implemented this part of the spec. There are a number of browser which have claimed that they have implemented 100% of ES6. And they have not because they have not implemented this requirement for PTC.
[00:28:52] Safari, which has generally of late been sort of trailing behind a lot of the web in a lot of features. Is the only browser, as things stand today, that has actually implemented and rolled into production PTC support. As a side note, this is just a little rabbit trail that I sometimes go on.
[00:29:16] The folks at Apple that drive Safari, they don't put features into their browser unless there is a really good reason why they want to. So my suspicion is that either something that's already shipped or something that's shipping soon on their road map, distinctly and directly requires them to have PTC support.
[00:29:35] And so they elevated this implementation to be something they spent the time on. They don't generally just do things like that out of the goodness of their heart. There's usually a reason why a feature gets prioritized and especially a feature like this which to some extent has been a bit controversial.
[00:29:53] The reason it's not in the other engines is not because it's too hard for them. They don't want to implement it. They've decided that implementing this feature, even though they voted to put it in the spec, they've decided that implementing this feature is gonna unnecessarily hamper other performance things that they wanna do.
[00:30:11] And so, they're pushing back on the TC39 committee saying, we wanna take that out of this spec. And the WebKit folks are like, well, we already shipped it like six months ago. So, we don't wanna take it out of the spec, we like it, we think it's good.
[00:30:25] So, there's an open debate at TC39 right now, where the engines have basically then in opened defiance of the spec or just really dragging their feet to note implement it. Because they want to either take it out or change it in some way. Here is a number of proposals that they've got on the board for it.
[00:30:45] So as things currently stand unfortunately, you can only take advantage of PTC if your code's gonna run in Safari. And you've put it in strict mode. I hope that that changes soon because I think it's an untenable position for all these major browsers to be in defiance of the spec or not in agreement with each other.
[00:31:08] Somebody is gotta give in this process and we don't have an infinite timeline here for them to make that decision, I think it unhealthy in terms of the state of things. So I genuinely hope that they get over their objections and figure out how to do it, how the way WebKit did and then we just get PTC and it lands, and then we're done with it.
[00:31:27] But as things currently stand, it's important for you to know that caveat is still a bit in flux.