Check out a free preview of the full Building Your Own Programming Language course:
The "Steps to Build an AST" Lesson is part of the full, Building Your Own Programming Language course featured in this preview video. Here's what you'd learn in this lesson:

Steve demonstrates how to write an AST to make it readable for a babel-like compiler.

Get Unlimited Access Now

Transcript from the "Steps to Build an AST" Lesson

[00:00:00]
>> Steve Kinney: So how might we build an AST? Well, we can iterate through the array of tokens for each number or string. You add that to the same level of the tree, right, their peers. For each parentheses though, that's like a function call in a Lisp-like language, right? And so we know that, okay, that's gonna be a CallExpression.

[00:00:19] The first thing is the function name, right? And they use subsequent ones are the arguments, right? So we can kind of go and use the use data we collected. White spaces matter, it was just to break apart our token. Quotes didn't matter, it was just to identify string, so we threw those away, we didn't even keep those.

[00:00:34] But parentheses, parentheses won't make it into our AST, cuz they're not important to the actual meaning of the language, they just denote as we're trying to figure out that meaning. Once we figured out what they were there for, we can throw those out, too. We really just need the identifiers and the values, right?

[00:00:50] So make CallExpressions based on those and kind of start to nest those as well. So there's kind of a few ways to do it. What we're gonna kinda create, is we're gonna use, the parentheses are gonna inspire nested arrays, right? And we can use those nested arrays cuz again, this is a kinda Lisp-like syntax where everything is a list, arrays are lists.

[00:01:13] It all works out great for us. And so yeah, you can see that the first one we have the name, 1 and 2. So this is for that subtract. Add 1 and 2, and subtract 4 and 2 kind of nested structure that we saw before. You can see, here's an add, so on and so forth.

[00:01:31] This is what we would get from the parsing when we start to put it in, we start to use those parentheses to figure stuff out. And this is the AST that we get out the other end, right? And so you can see a lot of metadata about this.

[00:01:43] Okay, we know that it's a function that we're calling, right, versus a function declaration. Again, I've mostly took inspiration from Babel for these, cuz that's the kind of target language that we're gonna try to hit. And we can kind of dust them and see along those lines and what happens when the Babel comes across, this is, okay, it's a call expression.

[00:02:02] I know what those look like, it's actually different than the way Lisp does it. It's the function name, and then the parentheses. And so it'll be able to rearrange all that stuff for us, cuz we've expressed the meaning of our code in this intermediate representation. As I kinda said before, they're not just for programming languages.

[00:02:18] You can you can use them again, like a lot of our LinkedIn tools, a lot of our code formatting, right? There's a lot of use cases for this stuff above and beyond this.