[HN Gopher] Compiling Expressions
       ___________________________________________________________________
        
       Compiling Expressions
        
       Author : tosh
       Score  : 83 points
       Date   : 2024-02-08 11:49 UTC (1 days ago)
        
 (HTM) web link (craftinginterpreters.com)
 (TXT) w3m dump (craftinginterpreters.com)
        
       | kunley wrote:
       | This book is a pearl, esp. the second part with a VM-based
       | interpreter.
       | 
       | Not sure why the OP posted this particular chapter, maybe because
       | IIRC there starts a sketch of Pratt parser implementation
        
       | cube2222 wrote:
       | There is a very cool paper about structuring query compilers that
       | I've read a couple years ago, and came back to recently when
       | doing some experimentation.
       | 
       | The paper[0], "How to Architect a Query Compiler, Revisited",
       | presents an approach to structuring compilers based on Futamura
       | Projections. In practice this means that you write most of your
       | high-level code as though it was an interpreter, but then mainly
       | rely on constructors and operator overloading (and occasionally
       | special functions) to emit compiled code for your expressions and
       | nodes under the hood.
       | 
       | The paper should be interesting to anybody who is also interested
       | in the posted link.
       | 
       | [0]: https://www.cs.purdue.edu/homes/rompf/papers/tahboub-
       | sigmod1...
        
         | kaba0 wrote:
         | For an actual implementation of these ideas (in a non-db
         | context), you might want to take a look at GraalVM's Truffle
         | part, or PyPy.
        
         | chc4 wrote:
         | "based on Futamura Projections" is kind of cheating: they don't
         | have a `mix` specialization function, which is the entire point
         | of Futamura's paper. They're doing automatic symbolic execution
         | using operator overloading, but I'm looking extremely
         | suspiciously at them referring to that as "the first Futamura
         | Projection without mix".
        
       | _false wrote:
       | Working through _Crafting Interpreters_ atm. Both an educational
       | and enjoyable read. I find the author's ability to reason across
       | programming languages as a skill to aspire to. It feels like a
       | meta way to think of programming languages. As any language is an
       | instantiation of a set of parameters, made through the design
       | decisions.
        
       | owlstuffing wrote:
       | I've always considered recursive descent easier to understand and
       | maintain than Pratt. Shrug.
        
         | caditinpiscinam wrote:
         | Even with recursive decent, you still need a strategy to deal
         | with left-associative binary operators. This is where Pratt
         | parsing comes in: not to replace recursive descent, but to
         | supplement it.
        
           | owlstuffing wrote:
           | Yes, however there is a much simpler approach to removing
           | left recursion. For example, a left recursive additive
           | expression grammar:                 additive-expression
           | <multiplicative-expression>         <additive-expression> +
           | <multiplicative-expression>         <additive-expression> -
           | <multiplicative-expression>
           | 
           | Left recursion removed from this _grammar_ is:
           | additive-expression         <multiplicative-expression>
           | <additive-expression2>              additive-expression2
           | + <multiplicative-expression> <additive-expression2>
           | - <multiplicative-expression> <additive-expression2>
           | <null>
           | 
           | Then implement additive-expression2 as a simple loop. No need
           | for Pratt and implementation follows more closely with BNF.
        
             | a_e_k wrote:
             | Yep. And whenever I've implemented a recursive-descent
             | parser and dealt with left-associative operators, I've just
             | inlined the equivalent of additive-expression2 and the
             | loop. E.g.:                   def parseAddExpr():
             | parseMultExpr()             while peekToken in ["+", "-"]:
             | consumeToken()                 parseMultExpr()
             | 
             | Honestly, I really like recursive-descent parsing, even for
             | expressions. It's very straightforward once you've learn a
             | few simple patterns like that.
             | 
             | Decently large precedence hierarchies (15-20) don't bother
             | me; I'll typically just number the levels rather than use
             | names like "term", "factor", "additive", or
             | "multiplicative". It can get a little verbose, yes, but
             | it's easy to write and skim through. (Golang-like from what
             | I've read, though I'm not a user of Golang.)
             | 
             | But I also find it scales better once you start adding
             | things like ternary expressions, function calls, and type
             | casts -- things that aren't just simple binary operators.
             | Handling those feels a lot less clean to me with Pratt,
             | precedence climbing, shunting yard, etc.
             | 
             | One main downside that I'll admit to recursive descent
             | though is that it's a bit harder to have a language with
             | user-defined operators. I don't like those much, though, so
             | that's not much of a loss to me.
        
         | sneed_chucker wrote:
         | Recursive descent is great for parsing most of an algol-like
         | programming language.
         | 
         | Once you get into expressions, you'll really want to do
         | precedence climbing or Pratt parsing to get things working
         | correctly.
         | 
         | This assumes that you want PEMDAS-like operator precedence in
         | the first place. It does seem like a waste once you've
         | implemented it and realize how little it's actually utilized by
         | most code. Maybe the Lisp and forth people have a point...
        
       | norir wrote:
       | > Second, we get to write an actual, honest-to-God compiler. It
       | parses source code and outputs a low-level series of binary
       | instructions.
       | 
       | Your compiler doesn't need to emit byte or machine code to be
       | real or useful. High level to high level compilers can be
       | incredibly useful and relatively easy to write. For optimal
       | runtime performance, you can target c or some other low level
       | language and let it deal with register allocation, instruction
       | selection and shared lib/executable generation.
        
         | BeetleB wrote:
         | That's called a transpiler.
        
           | samatman wrote:
           | A transpiler is a sort of compiler. A single compiler might
           | emit LLVM, C, or JavaScript, depending on the target. That
           | doesn't mean that it's only a compiler sometimes, it means
           | that some of the modes of the compiler transpile.
        
       ___________________________________________________________________
       (page generated 2024-02-09 23:01 UTC)