Inferring Function Calls

Lesson Description

The "Inferring Function Calls" lesson is part of the full Write a Compiler That Understands Types course featured in this preview video. Here's what you'd learn in this lesson:

Richard discusses maintaining consistency between function calls, arguments, and return types to avoid type mismatches. He also discusses handling type errors by continuing the unification process to provide comprehensive error messages.

Transcript from the "Inferring Function Calls" Lesson

[00:00:00]
>> Richard Feldman: So here, we have our function definition up here. And one of the many things that you can do with a function is you can call it, and sometimes you can call it multiple times, and sometimes you can call it multiple times with different types. So here, we have a call to inc, passing 42, which should work out great, because it expects a number.

[00:00:16]
And here we have one passing a string, which we want to not work out great. So here we have a goal of one success and one type mismatch. So we wanna see that this one succeeds because it's passing a number, and then we wanna see that this one fails because it's passing a string.

[00:00:29]
Again, we have a slide real estate problem. So I've actually got some ellipses here. These numbers are not gone. They're just hiding [LAUGH] so that I can fit the more relevant ones all on the screen. Okay, so here we have 0 through 3 up here. And then we have 6, 7, 8, and 9, which are currently null.

[00:00:44]
Let's go ahead and assign those. So here, again, we're using inc. Since we already have inc in scope, we're saying, yeah, we're not gonna bother creating a new one and symlinking them. We're just gonna say, go straight to the source. This one is 0.

[00:00:58]
This one is also 0. We do need a new one for this, because this is an expression that we haven't seen anywhere else, so that's going to go in 6. 7 is going to be the entire call itself. So this is what the call evaluates to. So if we said, like, const whatever equals this call, then whatever would be similar to 7.

[00:01:14]
Same basic deal here, except we're gonna use 8 and 9 because we already used 6 and 7 earlier. Okay, so the next thing we're gonna do is unify 6 and number. (The slide would say number here, but again, screen real estate.) So 6 right here is essentially saying, hey, I am a number literal.
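The ID assignment just described can be sketched roughly as follows. This is only an illustration: the names `CallNode`, `fresh`, and `assignCall` are hypothetical, and the starting counter of 6 assumes, as the slide suggests, that IDs 0 through 5 were handed out earlier.

```typescript
// Hypothetical sketch: which type ID each part of the two calls receives.
type TypeId = number;

interface CallNode {
  callee: TypeId; // ID of the function being called
  arg: TypeId;    // ID of the argument expression
  result: TypeId; // ID of the whole call expression
}

// inc is already in scope, so both calls reuse its existing ID
// instead of creating a new one and symlinking.
const INC: TypeId = 0;

// Assumption: 6 is the first unused ID (0 through 5 were assigned earlier).
let nextId: TypeId = 6;
const fresh = (): TypeId => nextId++;

function assignCall(callee: TypeId): CallNode {
  const arg = fresh();    // the argument is an expression we have not seen before
  const result = fresh(); // what the entire call evaluates to
  return { callee, arg, result };
}

const callWithNumber = assignCall(INC); // the call passing 42
const callWithString = assignCall(INC); // the call passing a string
```

With these assumptions, the first call gets 6 and 7 and the second gets 8 and 9, while both point at 0 for the callee.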

[00:01:29]
So no problem there. It's a number literal. That's all that's happening there. Then we're gonna unify 6 and 2, and the reason for that is that this is where we get into the function calling rules. It's pretty intuitive: if you're calling a function passing a particular type, that function's argument at that position had better have the same type.

[00:01:49]
So we're going to unify them. So here's our 6 and then here's that 2, and we're going to unify those. This is an example of the way that you are calling a function actually influencing the function's type itself. So if we didn't know what this thing's type was, the fact that we've unified these would cause it to say, like, now this function's argument is a number, when previously maybe it wasn't.

[00:02:10]
Although we will see a little bit later why that's not actually what we want, and what we're really gonna do after we move on a little bit is something slightly different than this. Then we're gonna unify 7 and 3, so this is basically the entire expression corresponding to the return value.

[00:02:26]
So, in exactly the same way that when I call this passing this argument, this argument had better have the same type as what we're passing in, it's also the case that whatever this whole expression evaluates to had better be the return type of that function. So that's what's putting those together. Then basically we're just gonna do the same thing for string: this with the argument, and then that with the return type.
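One way to picture the calling rule is as a function that, for each call, produces the two pairs of IDs to unify. This is a sketch under assumptions: `FnIds`, `CallIds`, and `constraintsForCall` are hypothetical names, and the literal unifications (6 with number, 8 with string) are left out because they are not part of the calling rule itself.

```typescript
// Hypothetical sketch of the function-calling rule as a list of ID pairs.
type TypeId = number;

interface FnIds {
  param: TypeId; // ID of the function's argument type
  ret: TypeId;   // ID of the function's return type
}

interface CallIds {
  arg: TypeId;    // ID of the expression passed in
  result: TypeId; // ID of the whole call expression
}

// Each call contributes two unifications:
// the argument with the parameter, and the call expression with the return type.
function constraintsForCall(fn: FnIds, call: CallIds): Array<[TypeId, TypeId]> {
  return [
    [call.arg, fn.param],
    [call.result, fn.ret],
  ];
}

const inc: FnIds = { param: 2, ret: 3 };
const numberCall: CallIds = { arg: 6, result: 7 };
const stringCall: CallIds = { arg: 8, result: 9 };

// The pairs that still need to be unified, in the order the lesson walks them.
const pending: Array<[TypeId, TypeId]> = [
  ...constraintsForCall(inc, numberCall),
  ...constraintsForCall(inc, stringCall),
];
```

The resulting pairs are (6, 2), (7, 3), (8, 2), and (9, 3), matching the unifications listed in the lesson.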

[00:02:47]
So just the same stuff repeated, but with string and the different numbers. Okay, so unifying 6 to the number, very straightforward, that just becomes a number. Unifying 6 and 2, we basically just go do the check. Okay, that's also a number, no problem, leave it alone. 7 and 3: now 7 is symlinked to 3. 8 and string: 8 is now string.

[00:03:06]
And now we have a type mismatch. So everything was working great until we tried to unify ID number eight with ID number two because eight is already the concrete type string from the previous step. And two is already the concrete type number from back earlier in the unification process.
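The sequence of unifications up to the mismatch can be traced with a small sketch. It is not the course's implementation: the `Entry` shape, `resolve`, `unifyConcrete`, and `unifyIds` are hypothetical, and the starting contents of slots 2 and 3 (both number) are an assumption, since the lesson only states that 2 is already number.

```typescript
// Hypothetical sketch: a type table with null, concrete, and symlink entries.
type TypeId = number;
type Entry =
  | null                               // not known yet
  | { kind: "concrete"; name: string } // e.g. number or string
  | { kind: "link"; to: TypeId };      // symlink to another ID

const types = new Array<Entry>(10).fill(null);
types[2] = { kind: "concrete", name: "number" }; // inc's argument (from earlier)
types[3] = { kind: "concrete", name: "number" }; // inc's return type (assumed)

// Follow symlinks until reaching an entry that is not a link.
function resolve(id: TypeId): TypeId {
  let current = id;
  for (;;) {
    const entry = types[current] ?? null;
    if (entry === null || entry.kind !== "link") return current;
    current = entry.to;
  }
}

// Unify an ID with a known concrete type such as a literal's type.
function unifyConcrete(id: TypeId, name: string): boolean {
  const r = resolve(id);
  const entry = types[r] ?? null;
  if (entry === null) {
    types[r] = { kind: "concrete", name };
    return true;
  }
  return entry.kind === "concrete" && entry.name === name;
}

// Unify two IDs: link an unknown one to the other, or compare concrete types.
function unifyIds(a: TypeId, b: TypeId): boolean {
  const ra = resolve(a);
  const rb = resolve(b);
  if (ra === rb) return true;
  const ea = types[ra] ?? null;
  const eb = types[rb] ?? null;
  if (ea === null) { types[ra] = { kind: "link", to: rb }; return true; }
  if (eb === null) { types[rb] = { kind: "link", to: ra }; return true; }
  return ea.kind === "concrete" && eb.kind === "concrete" && ea.name === eb.name;
}

const results = [
  unifyConcrete(6, "number"), // 6 becomes number
  unifyIds(6, 2),             // both already number, nothing changes
  unifyIds(7, 3),             // 7 becomes a symlink to 3
  unifyConcrete(8, "string"), // 8 becomes string
  unifyIds(8, 2),             // string vs. number: type mismatch
];
```

Only the last call returns false, which corresponds to the one success and one mismatch the lesson set as its goal.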

[00:03:24]
So okay, sure enough, we did get what we wanted. We wanted to get one success, so this one did successfully unify. And then one failure, this one indeed was a type mismatch. So hey, the system works. Fantastic. Also, if you want, you can keep going and unify nine here, but at this point, that's kind of pointless because we're never gonna actually make use of it.

[00:03:42]
I mainly bring this up in the context of when you hit a type mismatch, there are a few different things the compiler can do. This once again is another one of those things where it's sort of up to you as the compiler author to decide how you wanna handle that.

[00:03:54]
One thing you can do, which I would not advise, is you can say, we are done. As soon as we hit our first type mismatch, you get that one type mismatch reported, and we will never do any more type inference ever again, and you get to go solve that one.

[00:04:04]
Once you solve that one, then I'll maybe give you another one. Nobody wants that. [LAUGH] People would definitely prefer to get all the type mismatches at once. Or well, I guess maybe not, if it's gonna be repeats or redundant ones, but that's another topic. The point is, it might seem pointless to keep going with something like this, where it's like, we're already in a type mismatch.

[00:04:23]
What are you doing? Just stop working. But it's like, no, actually, if you want to give people nice error messages, multiple error messages, and there are other type mismatches, you really should keep going after a type mismatch. So these, like other unifications, you generally tend to want to proceed with.

[00:04:37]
And like I said, there's a couple different ways to handle this, but quite often, what you'll do when you get to this point is you'll record a special type in here that's not quite concrete, and not quite a symlink, and also not quite null, but it's this fourth thing that's called, like, an error.

[00:04:50]
And error has its own special set of unification rules that are kind of designed to be like, just let me keep going and doing my thing. I know that we blew up here and this doesn't make sense, that we've reached this sort of logical impossibility where this thing is trying to be two different things at once.

[00:05:06]
So we're not gonna pretend it's a string, because that could cause problems downstream. We are gonna write down that the type of this thing is a "something bad happened," basically. And then in the different branches of unify, you just sort of handle that differently depending on what it's trying to unify with.
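A minimal sketch of what such an error entry might look like follows. Everything here is an assumption rather than the course's code: the `error` variant, the choice to mark the first operand as the error, and the rule that an error silently unifies with anything are one plausible design among several.

```typescript
// Hypothetical sketch: an "error" entry that lets unification keep going.
type TypeId = number;
type Entry =
  | null
  | { kind: "concrete"; name: string }
  | { kind: "link"; to: TypeId }
  | { kind: "error" }; // the "fourth thing": something bad happened here

const types = new Array<Entry>(10).fill(null);
// Assumed state just before the mismatch in the lesson:
types[2] = { kind: "concrete", name: "number" }; // inc's argument
types[3] = { kind: "concrete", name: "number" }; // inc's return type (assumed)
types[8] = { kind: "concrete", name: "string" }; // the string argument

const mismatches: string[] = [];

function resolve(id: TypeId): TypeId {
  let current = id;
  for (;;) {
    const entry = types[current] ?? null;
    if (entry === null || entry.kind !== "link") return current;
    current = entry.to;
  }
}

function unify(a: TypeId, b: TypeId): void {
  const ra = resolve(a);
  const rb = resolve(b);
  if (ra === rb) return;
  const ea = types[ra] ?? null;
  const eb = types[rb] ?? null;
  if (ea === null) { types[ra] = { kind: "link", to: rb }; return; }
  if (eb === null) { types[rb] = { kind: "link", to: ra }; return; }
  // An error unifies with anything: the problem was already reported once.
  if (ea.kind === "error" || eb.kind === "error") return;
  if (ea.kind === "concrete" && eb.kind === "concrete" && ea.name === eb.name) return;
  // Record the mismatch, then mark the entry as an error instead of stopping.
  mismatches.push(`type mismatch between #${a} and #${b}`);
  types[ra] = { kind: "error" };
}

unify(8, 2); // string vs. number: recorded once, 8 becomes an error
unify(9, 3); // inference continues: 9 is symlinked to 3
unify(8, 2); // revisiting the error produces no duplicate report
```

Under these rules the mismatch is reported exactly once, slot 8 no longer claims to be a string, and the remaining unification for 9 still runs.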

[00:05:21]
So I'm not gonna get into that for purposes of this workshop. But just to be aware, this is the point where, in a real production compiler, I wouldn't stop here and say, forget about that last unify, it's pointless. I would, in fact, unify it, and I also would not have left this as a string.

[00:05:35]
I would have changed it to some sort of error type. Okay, so we did the thing. We had a goal of one success and one mismatch, and we achieved our goal. Fantastic. All right, to summarize part 5, we talked about inferring conditionals. So we started off with just the ternary.

[00:05:48]
We had a little bit of carryover from our previous multiplication example, where the part right before the question mark had to be a bool. That was hardcoded, no questions asked. But it was also a little bit different than the multiplication, because the two different branches didn't have to be a number like they did in the case of the operands for multiplication.

[00:06:07]
But they did have to be unified with each other. They have to be quote unquote identical, essentially, and then also they had to be unified with the expression as a whole. So there was sort of a three-way unification. At no point in that unification are we doing anything hardcoded.

[00:06:21]
It's just the relationship these three things have to have with each other. And of course, there are opportunities for type mismatches there, as we saw. Then we talked about inferring just the function declaration without taking into account any calls. And we saw that we already have considerations like the return type, the return statements, and how those interact.

[00:06:39]
So however many return statements you have, you need a unify for every single one of those to unify that with the return type, which can not only affect the return type but also can be a source of type mismatches if they happen to disagree. Then we got the argument types, then of course the function body as a whole, which introduced our new concrete type that has additional information.

[00:06:58]
Which requires additional branches in unify, which we will start to see as we progress. And then finally we got into function calls, where we saw that, yeah, not only does the function definition set these types and give you a way to affect them, but also, as the calls participate in the unification process, you can actually either end up changing the function's type or causing type mismatches.

[00:07:18]
Because the way that, you know, one call site is doing it disagrees with another call site, or one call site disagrees with the definition itself, which is what we saw here.
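The two outcomes mentioned in this summary, a call changing the function's type and one call site disagreeing with another, can be illustrated with a deliberately simplified sketch. It assumes a hypothetical function whose argument type is still unknown (unlike inc in this lesson, whose argument was already number), and it skips symlinks entirely; `slots`, `PARAM`, and `unifyWithConcrete` are made-up names.

```typescript
// Hypothetical sketch: each slot holds a concrete type name, or null if unknown.
const PARAM = 2; // ID of the function's argument type
const slots = new Array<string | null>(3).fill(null);

// Unify a slot with the concrete type of a literal passed at a call site.
function unifyWithConcrete(id: number, name: string): boolean {
  const current = slots[id] ?? null;
  if (current === null) {
    slots[id] = name; // the call site decides the function's argument type
    return true;
  }
  return current === name; // later call sites must agree with it
}

const firstCall = unifyWithConcrete(PARAM, "number");  // fills in the unknown
const secondCall = unifyWithConcrete(PARAM, "string"); // disagrees with the first
```

Here the first call succeeds and changes the argument type from unknown to number, and the second call is then a mismatch against what the first call established.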
