Check out a free preview of the full Advanced Elm course:
The "Constraint Unification" Lesson is part of the full, Advanced Elm course featured in this preview video. Here's what you'd learn in this lesson:

Elm's compiler performs type inference through a process called constraint unification. Richard walks through how it works.

Get Unlimited Access Now

Transcript from the "Constraint Unification" Lesson

>> Richard Feldman: Part two extensible data. Let's talk about a few things here, first we're gonna talk about Constrain Unification which is the process Elms Compiler uses to type check everything. We'll talk about Open versus Closed Records, we'll talk about why Open Records Exist, and finally we gonna talk about Extensible Custom Types.

[00:00:16] All right, let's start with Constrain Unification. So the way Elms Compiler works is it basically starts with a series of known facts about time constraints. So here are some examples of those. Stuff is a string literal. All it says, okay, I just hard coded. I know what string literals are.

[00:00:32] They are all strings. I know what floats are, they look like that. Every time I see that literal, that's a float. Every time I see one of these, that is a list of strings which it knows because this is a list literal and this is a string literal.

[00:00:44] It also knows that empty list is a list literal with unbound type variable, so list of A. So these are things that her compiler just knows, they're hard coded, it doesn't need to do any type inference to figure them out, it just knows them. So these are the known facts that it starts out with and then it basically goes through your program and starts inferring new facts, so here's an example of how it might do that.

[00:01:03] Let's say we have this expression, Hi, my name is ++ name ++! Okay, this is a string, but also, I know that's a string. Here is other string, and you know that's a string. ++ combines two strings, so I know that, by influence, whatever name is, must be a string.

[00:01:18] So now let's add those two, its a list of known fact. It says, okay, I've used what is known as constraint unification, to say, by virtue of what I already know about these other constraints. This constraint must be a string constraint. That's what the name's type must be.

[00:01:35] And as it continues going through the code it keeps doing this. So let's say we have something called working, which is assigned to not isBroken. So it says, okay, well,, not is a function from Bool to Bool. Which means that since we're passing isBroken to not, well isBroken must be a Bool.

[00:01:50] So it's another form of constraint unification. And then because working is assigned to the result of not isBroken and not return. It says, okay, well working, must also be a Bool. Then let's say later on on the same let expression, we have caption = String.toUpper working. Now it says, okay, well, String.toUpper takes a String and returns to a String which means we're working must be a string.

[00:02:11] We've concluded two facts that are at odds with one another. In one case, it claims that working as a Bool, in the other case it claims working as a string, these are contradictory facts, this is how we get a type mismatch. So as soon as it encounters a contradiction like that it's like, okay, our constraints have unified to something that does not makes sense, therefore type mismatch.

[00:02:31] Now with [INAUDIBLE] types, it's a little bit trickier than that. It's not as straightforward. So here are two different append functions. Sting.append and List.append. Both of them do the conceptual operation of squishing two things together. So String.append takes String -> String -> String. List.append List a -> List a -> List a.

[00:02:49] Now if I call String.append on a string and something that's not a string, well, it stands to reason that's going to be a type mismatch because it says, well, these both need to be a string. The second thing you gave me is not a string. But you can't quite extend that same pattern to parameterized types.

[00:03:05] So here I'm appending a list of strings to a list of a. Now those are not exactly the same type, but as we all sort of intuitively know, this will still work. It'll give us back a which a list of string. So how did it know to do that?

[00:03:18] I mean, we understand how this works when they are literally different types, but in this case they still are literally different types, it's just that it figured out, okay, but these are compatible and it put them together. And not only did it figure out that they're compatible, but it concluded that it could resolve them to a single type, which in this case it decided it was going to be a list string.

[00:03:34] How did it decide list string over list a, how does it figure those things out differently than it figures them out with string and float up here? So this is the process of constraint unification in the presence of type parameters. So if you have two types that are exactly the same.Including the parameters.

[00:03:50] They're identical types. Okay, then it's going to unify to the same thing because they are identical. There's no unification that needs to happen. If one of them is more constrained than another, for example, List String is more constrained than an unbound type. Then it says, okay, we're gonna go with the more constrained one.

[00:04:07] We're gonna unify to the one that's more constrained of the two. And if they're incompatible, then it says okay, type mismatch. So let's see an example of this. List.append a, b and c. These two are identical, so they're going to unify from list string to list string. It says, okay, those unify to List String.

[00:04:26] Let's say we do a, b, and empty list. It says, okay, List String, List a. Well, List String is more constrained than List a. So it says, okay, those unify to List String. The more constrained of the two. And if we put two empty lists together, once again, they're identical.

[00:04:39] So of course, List a, List a, List a, identical. Another constraint we can have is number. So number is a constraint that is somewhere in-between an unbound type variable and a concrete variable like int or float. So number is one of Elm's three constrained type variables. So number means this is either an int or a float.

[00:05:01] The other two constrained type variables are appendable and comparable. So appendable means it's either a Strings or lists, and comparable means it's a int, list, sorry, int, float, there's a list of these. String, if you ever get a mis-match on trying to put a non-comparable thing into a dictionary, you'll see that error message, and it lists, there's like seven of them.

[00:05:24] Regardless, they are more constrained than your typical unbound type variable or your concrete parametrized type. But when they're identical, they work out the same way. If we have 1, 2 and 3, so just from this list literal, we know that this is a list of numbers. But Elm, just based on this syntax alone, if we just put this directly into Elm REPL, it doesn't know if these are ints or floats yet, we haven't given it enough information.

[00:05:49] A literal starts out as a number, and then it might get more specific later, depending on how it's used. But at first, it's nothing more than a number. Okay, we can also append a list of number to a list of unbound type variable. And that unifies to number because number is more constrained than unbound type variable.

[00:06:08] So although we still have a type variable in here, it's not concrete yet. When you unify these two, you still do get something that is more constrained than what you started out with on one side of the expression.
>> Richard Feldman: Now if you unify a number with a float, now okay it says the more constrained of the two is float, so now we unify it to float.

[00:06:29] Okay,
>> Richard Feldman: We can still get type mismatches when it comes to number. So remember when we did a list of a with a list of string, that unified successfully the list of string. The problem is number is not compatible with string. So even though this is a type variable, it is constrained and it's constraint does not support String.

[00:06:46] Which means that if you try to unify a List String with a List number, you'll get a type mismatch. But List String with List of unbound, totally fine. Questions about that before we move on to records?
>> Speaker 2: So I guess for a number, it follows the pattern of lowercase as a type variable.

>> Richard Feldman: Right.
>> Speaker 2: This is kind of like a built in Elm thing.
>> Richard Feldman: Yes.
>> Speaker 2: It's variable because it can be multiple things, but there is a pre-described meaning to the domain of those variables?
>> Richard Feldman: Yes, exactly, so the reason for the syntactic choice of having it look like a type variable is that in all ways it behaves like a type variable, except for the extra constraint.

[00:07:28] So for example, you can choose the name, it just has to start with number. You can put number a, number b, number c if you wanna have multiples of them, it gets replaced by a more concrete type such as float if it's unified with one. So in most ways it behaves like one, except for the additional constraint that it has baked in.

[00:07:45] Some languages have these, and they have a different syntax for them. So there's a lot of trade-offs in language design, and so the choice here was just to go with looks the same as a type variable, but has some extra properties to it.
>> Speaker 3: So in comparison to if it was a custom type where you would now have to be able to handle like adding in-

>> Richard Feldman: Yeah, so there is a, conceivably a way you can do that with phantom types but that's a tangents [LAUGH]. There are other possible designs for this. But the important thing is the way that the constraints work out is that you can have something that represents either an int or a float and they don't unify with strings, but they do unify with ints and floats, and they unify to the more constraint of the two.

[00:08:36] Yeah, there's a lot of bikeshedding around [LAUGH] how to do constraints, syntactically or otherwise.