Check out a free preview of the full Building Your Own Programming Language course

The "Parser Generators" Lesson is part of the full, Building Your Own Programming Language course featured in this preview video. Here's what you'd learn in this lesson:

Steve explains that parser generators are domain specific languages for generating domain specific languages, and gives an overview of a few parser generators such as Antler and Chevrotain.


Transcript from the "Parser Generators" Lesson

>> Steve Kinney: I do wanna talk about one other thing, which is, there's a whole class of things called parser generators. And these are effectively, it's kind of meta. It's a domain specific language, or there's various ones of domain-specific languages for generating domain-specific languages, right? And I am of mixed feelings on them, one, in learning them, you're kind of learning that parser generator.

Some of them are based on like Bison or Yak, or something along those lines. But generally speaking, you're learning that kind of parser generator, and not any of the stuff that goes on behind it. But there's a trade-off and so here's a little, the bold part is the important one.

But this idea that they're compiler compilers, but they only take a tiny percentage of the job cuz I'll show you some syntax of some of them. Where they become interesting is when, we suffer from the problem where our back end is in Go and our front end is in JavaScript.

And sometimes what we show in the preview has to be run through the compiler as well as what happens on the back end, right? And so in the places where you can maybe use a parser generator to generate a JavaScript parser and a Go parser, that is kinda the dream.

We don't, in fact, do that, but that would be the hope in this case. But generally speaking, I'll show you a few of them, if they're interesting to you, I think they're worth taking a look. But they have never effectively solved the problem for me. And it's not that I'm writing them off, I see promise and merit in them.

But I think it's always worth evaluating the trade-offs of a thing that you have kinda the complete flexibility and control over, versus something where you're kind of bending to the way it does things. So you see, this is even, to kind of do a very kind of simple syntax, you're basically writing a bunch of all the rules for a language in a very linguistic sense.

This is one called Jison, which is based on Bison, which is based on Yak. Get it, it's a pun cuz it's JavaScript. And so it's not like reading, I'll tell you that much, right? But it is kind of powerful that you could theoretically generate, this is what Handlebars uses like under the hood to generate the parser.

But then also, there's some additional stuff along the way. So it is another area, if this is a thing you're really into, there are things like Antler, which is kind of Java-based, that do some really kind of impressive stuff as well. Worth looking into, but I think for me, I've a lot of times got a lot of benefit of just a piece or two of some of these concepts and being able to use them.

So there's Jison and here's, this one's I think Chevron, where you kind of define everything. You get these kind of cool graphs, but there's a learning curve that you have to kinda make the choice on. Nearley is another one, you can kind of see, their syntaxes [INAUDIBLE]. This one will match a hexadecimal color and CSS, or different colors, right?

I would argue that we have the skill set now to write a parser to do that too, right? And so you kinda have to make the choice in those cases, or you don't have to make the choice, right? And there's some other list-specific things that we didn't get into.

One of the things I said during one of the breaks was, a lot of times, when implementing a list, it'll be a series of arrays. Ours was that we also kind of had one foot in the tree realm. And the reason for that was, I think there's a lot of value in understanding the visitor pattern on the ASTs and transpiling to Babel.

And that kind of got a little hairy and complicated with a bunch of arrays and casting a bunch of stuff. There's a bunch of really cool stuff that you can do. So this is creating a lambda, which you see we could add to the environment. I think what's kind of cool here is, you basically are making a function.

And you're using those parameters and you're kind of merging them with the environments. You're creating like ideas of scope, right, by having, okay, whatever's in this scope, if there's an x in our local scope, it overrides the global scope. But then when we're back out of that one, we're using the global scope again.

And you can implement a lot of those things, this is effectively it, right? For implementing it in JavaScript, we'd have to tweak our parser in a way that would have made it a little, the value wasn't really there, right, for the extra work that we'd have to do there.

But I think that as you begin to think about different concepts in languages, it does also help to help you understand how they work when you start to implement them as well. So on top of the very practical things, I think there's also a lot of rich language we use.

We effectively are surrounded by these things and we use them every day, right, [LAUGH] in all sorts of ways. Obviously, we use stuff that is compiled, but especially in the JavaScript land, our JavaScript is run through a JavaScript compiler, right, and all of those things. I think it's kind of interesting to see how some of this stuff works in practice.

Learn Straight from the Experts Who Shape the Modern Web

  • In-depth Courses
  • Industry Leading Experts
  • Learning Paths
  • Live Interactive Workshops
Get Unlimited Access Now