[HN Gopher] Compiling Expressions
___________________________________________________________________
Compiling Expressions
Author : tosh
Score : 83 points
Date : 2024-02-08 11:49 UTC (1 days ago)
(HTM) web link (craftinginterpreters.com)
(TXT) w3m dump (craftinginterpreters.com)
| kunley wrote:
| This book is a pearl, esp. the second part with a VM-based
| interpreter.
|
| Not sure why the OP posted this particular chapter; maybe
| because, IIRC, this is where the sketch of a Pratt parser
| implementation starts.
| cube2222 wrote:
| There is a very cool paper about structuring query compilers
| that I read a couple of years ago and came back to recently
| while doing some experimentation.
|
| The paper[0], "How to Architect a Query Compiler, Revisited",
| presents an approach to structuring compilers based on Futamura
| Projections. In practice this means that you write most of your
| high-level code as though it was an interpreter, but then mainly
| rely on constructors and operator overloading (and occasionally
| special functions) to emit compiled code for your expressions and
| nodes under the hood.
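|
| A toy sketch of that staging idea (the names here are
| invented, not the paper's actual framework): expression
| objects overload the arithmetic operators, so code written in
| interpreter style emits target code instead of computing
| values.

```python
class Code:
    """Wraps a fragment of generated target code (a C-like string)."""

    def __init__(self, src):
        self.src = src

    def __add__(self, other):
        return Code(f"({self.src} + {lift(other).src})")

    def __mul__(self, other):
        return Code(f"({self.src} * {lift(other).src})")


def lift(value):
    # Turn a plain Python value into a code fragment.
    return value if isinstance(value, Code) else Code(str(value))


def poly(x):
    # Written as if interpreting: 3*x*x + 2*x + 1.
    return x * x * 3 + x * 2 + 1


# Running the "interpreter" on a symbolic input specializes it:
print(poly(Code("x")).src)  # ((((x * x) * 3) + (x * 2)) + 1)
```

| Feeding poly an ordinary number still just evaluates it, which
| is the appeal: one definition serves as both interpreter and
| code generator.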
|
| The paper should be interesting to anybody who is also interested
| in the posted link.
|
| [0]: https://www.cs.purdue.edu/homes/rompf/papers/tahboub-
| sigmod1...
| kaba0 wrote:
| For an actual implementation of these ideas (in a non-db
| context), you might want to take a look at GraalVM's Truffle
| part, or PyPy.
| chc4 wrote:
| "based on Futamura Projections" is kind of cheating: they don't
| have a `mix` specialization function, which is the entire point
| of Futamura's paper. They're doing automatic symbolic execution
| using operator overloading, but I'm looking extremely
| suspiciously at them referring to that as "the first Futamura
| Projection without mix".
| _false wrote:
| Working through _Crafting Interpreters_ atm. Both an educational
| and enjoyable read. I find the author's ability to reason
| across programming languages a skill to aspire to. It feels
| like a meta way of thinking about programming languages: any
| language is an instantiation of a set of parameters fixed by
| its design decisions.
| owlstuffing wrote:
| I've always considered recursive descent easier to understand and
| maintain than Pratt. Shrug.
| caditinpiscinam wrote:
| Even with recursive descent, you still need a strategy to deal
| with left-associative binary operators. This is where Pratt
| parsing comes in: not to replace recursive descent, but to
| supplement it.
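|
| A minimal Pratt-style sketch (my names, not the book's code):
| each operator gets a binding power, and a loop keeps folding
| what has been parsed so far into the left operand, which is
| exactly what makes the result left-associative.

```python
BINDING_POWER = {"+": 1, "-": 1, "*": 2, "/": 2}


def parse_expr(tokens):
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def parse(min_bp):
        nonlocal pos
        left = tokens[pos]  # assume a literal or name token
        pos += 1
        while peek() in BINDING_POWER and BINDING_POWER[peek()] >= min_bp:
            op = tokens[pos]
            pos += 1
            # Parsing the right side at (power + 1) makes operators
            # of the same power group to the left.
            right = parse(BINDING_POWER[op] + 1)
            left = (op, left, right)
        return left

    return parse(0)


print(parse_expr(["1", "-", "2", "-", "3"]))  # ('-', ('-', '1', '2'), '3')
print(parse_expr(["1", "+", "2", "*", "3"]))  # ('+', '1', ('*', '2', '3'))
```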
| owlstuffing wrote:
| Yes, however there is a much simpler approach to removing
| left recursion. For example, a left-recursive additive
| expression grammar:
|
|     additive-expression:
|         <multiplicative-expression>
|         <additive-expression> + <multiplicative-expression>
|         <additive-expression> - <multiplicative-expression>
|
| Left recursion removed from this _grammar_ is:
|
|     additive-expression:
|         <multiplicative-expression> <additive-expression2>
|
|     additive-expression2:
|         + <multiplicative-expression> <additive-expression2>
|         - <multiplicative-expression> <additive-expression2>
|         <null>
|
| Then implement additive-expression2 as a simple loop. No need
| for Pratt and implementation follows more closely with BNF.
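|
| Concretely, the transformed grammar turns into a sketch like
| this (my code, with a stubbed-out multiplicative level), where
| the loop plays the role of additive-expression2 and folding
| from the left preserves left associativity:

```python
def parse_multiplicative(tokens, pos):
    # Stub: treat a single token as a multiplicative expression.
    return tokens[pos], pos + 1


def parse_additive(tokens, pos=0):
    # <additive-expression> ::= <mult> <additive-expression2>
    node, pos = parse_multiplicative(tokens, pos)
    # <additive-expression2> ::= (+ | -) <mult> <additive-expression2>
    #                          | <null>
    while pos < len(tokens) and tokens[pos] in ("+", "-"):
        op = tokens[pos]
        right, pos = parse_multiplicative(tokens, pos + 1)
        node = (op, node, right)  # fold left: (a - b) - c
    return node, pos


tree, _ = parse_additive(["a", "-", "b", "-", "c"])
print(tree)  # ('-', ('-', 'a', 'b'), 'c')
```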
| a_e_k wrote:
| Yep. And whenever I've implemented a recursive-descent
| parser and dealt with left-associative operators, I've just
| inlined the equivalent of additive-expression2 and the
| loop. E.g.:
|
|     def parseAddExpr():
|         parseMultExpr()
|         while peekToken in ["+", "-"]:
|             consumeToken()
|             parseMultExpr()
|
| Honestly, I really like recursive-descent parsing, even for
| expressions. It's very straightforward once you've learned a
| few simple patterns like that.
|
| Decently large precedence hierarchies (15-20) don't bother
| me; I'll typically just number the levels rather than use
| names like "term", "factor", "additive", or
| "multiplicative". It can get a little verbose, yes, but
| it's easy to write and skim through. (Golang-like from what
| I've read, though I'm not a user of Golang.)
|
| But I also find it scales better once you start adding
| things like ternary expressions, function calls, and type
| casts -- things that aren't just simple binary operators.
| Handling those feels a lot less clean to me with Pratt,
| precedence climbing, shunting yard, etc.
|
| One real downside I'll concede for recursive descent is that
| it's a bit harder to support user-defined operators in the
| language. I don't like those much, though, so that's not much
| of a loss to me.
| sneed_chucker wrote:
| Recursive descent is great for parsing most of an Algol-like
| programming language.
|
| Once you get into expressions, you'll really want to do
| precedence climbing or Pratt parsing to get things working
| correctly.
|
| This assumes that you want PEMDAS-like operator precedence in
| the first place. It does seem like a waste once you've
| implemented it and realize how little it's actually utilized by
| most code. Maybe the Lisp and Forth people have a point...
| norir wrote:
| > Second, we get to write an actual, honest-to-God compiler. It
| parses source code and outputs a low-level series of binary
| instructions.
|
| Your compiler doesn't need to emit byte or machine code to be
| real or useful. High level to high level compilers can be
| incredibly useful and relatively easy to write. For optimal
| runtime performance, you can target C or some other low-level
| language and let it deal with register allocation, instruction
| selection, and shared lib/executable generation.
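|
| As a toy illustration of that approach (everything here is
| invented, not norir's code): walk an expression AST and print
| C source, leaving register allocation and instruction
| selection to the C compiler.

```python
def emit_c(node):
    """Emit a C expression for an AST of nested (op, left, right) tuples."""
    if isinstance(node, tuple):
        op, left, right = node
        return f"({emit_c(left)} {op} {emit_c(right)})"
    return str(node)  # literal or variable name


ast = ("+", ("*", "x", 2), 1)
print(f"int f(int x) {{ return {emit_c(ast)}; }}")
# int f(int x) { return ((x * 2) + 1); }
```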
| BeetleB wrote:
| That's called a transpiler.
| samatman wrote:
| A transpiler is a sort of compiler. A single compiler might
| emit LLVM, C, or JavaScript, depending on the target. That
| doesn't mean that it's only a compiler sometimes, it means
| that some of the modes of the compiler transpile.
___________________________________________________________________
(page generated 2024-02-09 23:01 UTC)