Building Your Own Programming Language Internal Representations & ASTs
Transcript from the "Internal Representations & ASTs" Lesson
>> Steve Kinney: Now we have our bag of tokens, right? What we wanna kind of do next is figure out what does it all mean, right? And so, parsing is really the combination of both lexical analysis and syntactic analysis. But normally, you'll hear either lexing or tokenization for getting the tokens and then parsing for figuring it all out.
[00:00:20] But together, they're all really parsing.
>> Steve Kinney: So once we've broken out, we need to figure out what does anything of this mean. So we turn to what is normally called an intermediate representation, right? It's not the end product that we're spinning out to, it is some data structure that we can play with and manipulate.
[00:00:38] So one thing to check out is AST Explorer. AST Explorer is really cool. Actually, I'll open it up real fast.
>> Steve Kinney: So here's a very simple one with just an ad save, and this one is using Babel. And you can see, effectively what Babel takes is relatively simple function and breaks it up into, right?
[00:01:01] So it's got a file, it actually keeps the start and the end. And this is useful. I told you we do that side by side thinking in the HTML. Well, it's not using Babel, it's using HTML parser, too. That's how we kinda annotate the nodes and figure out where I need to move stuff.
[00:01:14] But you see, it's a function declaration, and there's an identifier, and that identifier, see if I can open it.
>> Steve Kinney: Is the word add. So on and so forth, right? Like it kind of breaks that code into a meaningful structure that says the intent of what we just kind of tokenized and figures it all out for us.
>> Steve Kinney: And most things that we're doing have some form of an AST, right? There's also ways to visualize it. So here's the same thing for just var x equals 2. console.log('hello'). It's breaking it all into it's meaningful pieces so that we can figure stuff out. And this is how Babel does its thing.
[00:01:56] So most things are doing a lot of transpiling will go ahead and work. We'll actually take the AST that we produce. We are going to cheat a little bit, which is let's just say the AST that we are going to choose produce is our intermediate representation was inspired by what Babel spits out, right?
[00:02:17] So that we don't have to do too much transformation. It is a different language, so we have to do a little bit of transformation, but we're not gonna have to do too much cuz it'll be pretty close to Babel to begin with. But theoretically, you could design an AST that was bespoke for your language and all the stuff you need it to do.
[00:02:31] And if you can morph that tree, cuz it's just a data structure, you can traverse up and down it, mutate it and do everything you need. You can mutate it into what the other language can accept. Or you can at least serialize into that other language yourself. Cuz really, you can just go and say, okay, like a white space or whatever.
[00:02:49] You can rearrange the stuff and be able to put it back into a string effectively the reverse of that. Or if you can use that languages generator, you can do that as well. You can translate your language into any other language you want, right? So it's console log.