Building Your Own Programming Language

Internal Representations & ASTs

Steve Kinney

Temporal

The "Internal Representations & ASTs" lesson is part of the full Building Your Own Programming Language course. Here's what you'd learn in this lesson:

Steve explains that once the code is broken into tokens, the abstract syntax tree (AST) organizes them into a meaningful structure that captures the intent of what was just tokenized. The AST makes it easier to transpile the newly created language into JavaScript.

Transcript from the "Internal Representations & ASTs" Lesson

[00:00:00]
>> Steve Kinney: Now we have our bag of tokens, right? What we wanna kind of do next is figure out what does it all mean, right? And so, parsing is really the combination of both lexical analysis and syntactic analysis. But normally, you'll hear either lexing or tokenization for getting the tokens and then parsing for figuring it all out.

[00:00:20]
But together, they're all really parsing.
>> Steve Kinney: So once we've broken it out, we need to figure out what any of this means. So we turn to what is normally called an intermediate representation, right? It's not the end product that we're spitting out, it is some data structure that we can play with and manipulate.
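
As a rough illustration of going from a flat list of tokens to an intermediate representation, here is a minimal sketch. The token shape (`{ type, value }`), the parenthesized call syntax, and the `parse` function are all assumptions made for this example; the transcript does not show the course's actual tokens or parser at this point. The node names loosely follow the Babel-style naming mentioned later in the lesson.

```javascript
// Hypothetical token shape: { type, value }. The parenthesized call
// syntax is an assumption used only for illustration.
const tokens = [
  { type: "Parenthesis", value: "(" },
  { type: "Name", value: "add" },
  { type: "Number", value: 2 },
  { type: "Number", value: 3 },
  { type: "Parenthesis", value: ")" },
];

// Syntactic analysis: turn the flat token list into a nested tree.
function parse(tokens) {
  let position = 0;

  function parseExpression() {
    const token = tokens[position++];
    if (token === undefined) {
      throw new SyntaxError("Unexpected end of input");
    }
    if (token.type === "Number") {
      return { type: "NumericLiteral", value: token.value };
    }
    if (token.type === "Name") {
      return { type: "Identifier", name: token.value };
    }
    if (token.type === "Parenthesis" && token.value === "(") {
      // The first expression after "(" is what gets called.
      const callee = parseExpression();
      const args = [];
      while (position < tokens.length && tokens[position].value !== ")") {
        args.push(parseExpression());
      }
      if (position >= tokens.length) {
        throw new SyntaxError("Missing closing parenthesis");
      }
      position++; // consume ")"
      return { type: "CallExpression", callee, arguments: args };
    }
    throw new SyntaxError(`Unexpected token: ${token.value}`);
  }

  return parseExpression();
}

const ast = parse(tokens);
console.log(JSON.stringify(ast, null, 2));
```

The result is a plain nested object, which is the point being made above: the intermediate representation is just a data structure that later steps can inspect and change.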

[00:00:38]
So one thing to check out is AST Explorer. AST Explorer is really cool. Actually, I'll open it up real fast.
>> Steve Kinney: So here's a very simple one with just an add function, and this one is using Babel. And you can see, effectively, Babel takes a relatively simple function and breaks it up, right?

[00:01:01]
So it's got a file, it actually keeps the start and the end. And this is useful. I told you we do that side-by-side thing in the HTML. Well, that's not using Babel, it's using htmlparser2. That's how we kinda annotate the nodes and figure out where I need to move stuff.

[00:01:14]
But you see, it's a function declaration, and there's an identifier, and that identifier, see if I can open it.
>> Steve Kinney: Is the word add. So on and so forth, right? Like it kind of breaks that code into a meaningful structure that says the intent of what we just kind of tokenized and figures it all out for us.
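
To make the shape being described a little more concrete, here is a hand-written, heavily abbreviated approximation of a Babel-style tree for a small `add` function. The exact source text of the function is an assumption, and real parser output carries many more fields (location objects, comments, and so on) than this sketch shows.

```javascript
// Assumed source text; the function shown in the video may differ.
const source = "function add(a, b) {\n  return a + b;\n}";

// Hand-written, abbreviated approximation of a Babel-style tree.
const nameStart = source.indexOf("add");
const ast = {
  type: "File",
  start: 0,
  end: source.length,
  program: {
    type: "Program",
    body: [
      {
        type: "FunctionDeclaration",
        start: 0,
        end: source.length,
        id: {
          type: "Identifier",
          name: "add",
          start: nameStart,
          end: nameStart + "add".length,
        },
        params: [
          { type: "Identifier", name: "a" },
          { type: "Identifier", name: "b" },
        ],
        // The statements inside the function body are omitted here.
        body: { type: "BlockStatement", body: [] },
      },
    ],
  },
};

// The start/end offsets let a tool map a node back to the original text.
const id = ast.program.body[0].id;
console.log(source.slice(id.start, id.end)); // prints "add"
```

Keeping `start` and `end` on each node is what makes it possible to annotate or highlight the matching region of the source.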

[00:01:37]
>> Steve Kinney: And most things that we're doing have some form of an AST, right? There's also ways to visualize it. So here's the same thing for just var x equals 2, console.log('hello'). It's breaking it all into its meaningful pieces so that we can figure stuff out. And this is how Babel does its thing.

[00:01:56]
So most things that are doing a lot of transpiling will go ahead and work this way. We'll actually take the AST that we produce. We are going to cheat a little bit, which is, let's just say the AST that we are going to produce as our intermediate representation was inspired by what Babel spits out, right?

[00:02:17]
So that we don't have to do too much transformation. It is a different language, so we have to do a little bit of transformation, but we're not gonna have to do too much cuz it'll be pretty close to Babel to begin with. But theoretically, you could design an AST that was bespoke for your language and all the stuff you need it to do.

[00:02:31]
And if you can morph that tree, cuz it's just a data structure, you can traverse up and down it, mutate it and do everything you need. You can mutate it into what the other language can accept. Or you can at least serialize into that other language yourself. Cuz really, you can just go and say, okay, like a white space or whatever.
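
Because the tree is just a data structure, walking and mutating it needs nothing special. The sketch below is an assumption-laden illustration: the `traverse` helper is hypothetical (it is not Babel's traversal API), it only walks downward, and the tree is a hand-written, abbreviated Babel-style representation of `var x = 2`.

```javascript
// Hand-written, abbreviated tree for: var x = 2
const tree = {
  type: "Program",
  body: [
    {
      type: "VariableDeclaration",
      kind: "var",
      declarations: [
        {
          type: "VariableDeclarator",
          id: { type: "Identifier", name: "x" },
          init: { type: "NumericLiteral", value: 2 },
        },
      ],
    },
  ],
};

// Hypothetical helper: visit every node (any object with a string
// `type`), depth first, parents before children.
function traverse(node, visit) {
  if (node === null || typeof node !== "object") return;
  if (Array.isArray(node)) {
    node.forEach((child) => traverse(child, visit));
    return;
  }
  if (typeof node.type === "string") visit(node);
  Object.values(node).forEach((child) => traverse(child, visit));
}

const visited = [];
traverse(tree, (node) => {
  visited.push(node.type);
  // Mutate in place: turn every `var` declaration into `let`.
  if (node.type === "VariableDeclaration" && node.kind === "var") {
    node.kind = "let";
  }
});

console.log(visited.join(" > "));
console.log(tree.body[0].kind);
```

A real transform would usually track parent nodes as well, so it can move upward or replace a node entirely; this sketch leaves that out to stay short.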

[00:02:49]
You can rearrange the stuff and be able to put it back into a string, effectively the reverse of that. Or if you can use that language's generator, you can do that as well. You can translate your language into any other language you want, right? So it's console log.
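
Putting a tree back into a string can be sketched as a small recursive function. The `generate` function below is a hypothetical, hand-rolled generator that handles only the handful of node types needed for `console.log('hello')`; it is not Babel's generator, and the tree is a hand-written, abbreviated Babel-style approximation.

```javascript
// Hypothetical generator covering only the node types used below.
function generate(node) {
  switch (node.type) {
    case "Program":
      return node.body.map((statement) => generate(statement)).join("\n");
    case "ExpressionStatement":
      return generate(node.expression) + ";";
    case "CallExpression": {
      const args = node.arguments.map((arg) => generate(arg)).join(", ");
      return `${generate(node.callee)}(${args})`;
    }
    case "MemberExpression":
      return `${generate(node.object)}.${generate(node.property)}`;
    case "Identifier":
      return node.name;
    case "StringLiteral":
      return JSON.stringify(node.value);
    case "NumericLiteral":
      return String(node.value);
    default:
      throw new TypeError(`Unsupported node type: ${node.type}`);
  }
}

// Hand-written, abbreviated tree for: console.log('hello')
const program = {
  type: "Program",
  body: [
    {
      type: "ExpressionStatement",
      expression: {
        type: "CallExpression",
        callee: {
          type: "MemberExpression",
          object: { type: "Identifier", name: "console" },
          property: { type: "Identifier", name: "log" },
        },
        arguments: [{ type: "StringLiteral", value: "hello" }],
      },
    },
  ],
};

console.log(generate(program)); // prints console.log("hello");
```

Each `case` is the reverse of a parsing rule: where the parser turned text into a node, the generator turns that node back into text.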

[00:03:04]
You can kinda see, it's not super surprising. There's some kind of different meta layers in there that kind of tell you what things mean on a larger scale. We're gonna skip a lot of those. But for our purposes, there is kind of a somewhat agreed upon spec for JavaScript ASTs that a lot of tools, like Acorn and Babel, use.

[00:03:29]
And pretty sure as well. Even stuff like Prettier and use ASTs to figure out what the meaning of your code is to tell you you made a boo boo. And so, if you can get pretty close to that spec, which is what we are kind of purposely doing with our very simple language, it will become incredibly easy to transpile it into JavaScript.
