[HN Gopher] Flattening ASTs (and Other Compiler Data Structures)
       ___________________________________________________________________
        
       Flattening ASTs (and Other Compiler Data Structures)
        
       Author : aw1621107
       Score  : 40 points
       Date   : 2025-01-10 19:23 UTC (3 hours ago)
        
 (HTM) web link (www.cs.cornell.edu)
 (TXT) w3m dump (www.cs.cornell.edu)
        
       | ww520 wrote:
       | This is a fantastic idea. AST works well in an array based
       | allocation block since it has no need for freeing individual
       | nodes. It's an add-only allocation.
        
       | dmagyari wrote:
       | "Instead of allocating Expr objects willy-nilly on the heap,
       | we'll pack them into a single, contiguous array." Zig compiler
       | pipeline (AST, Zir, Air, Sema) does exactly this on all layers.
       | Not only contiguous, but instead of array-of-structs it is
       | struct-of-arrays, so walking the tree is even more cache
       | friendly. For AST see:
       | https://github.com/ziglang/zig/blob/master/lib/std/zig/Ast.z...
        
       | cardanome wrote:
       | Amazing article, very good advice to keep your data structures
       | flat.
       | 
       | Adding to that, it also makes editing the AST vastly more
       | efficient.
       | 
       | I have discovered that principle on my own when I worked on an
       | editor that directly operated on the AST instead of text. I found
       | manipulating the tree-style AST so painful, constantly traversing
       | the tree and all. Once I made it flat, my life was a hell lot
       | easier. You can just directly index any part of AST in linear
       | time.
        
       | emptysea wrote:
       | Rust-analyzer uses a similar technique for parsing
       | https://github.com/rust-lang/rust-analyzer/blob/master/crate...
       | which then gets fed into https://github.com/rust-analyzer/rowan
       | (lossless syntax tree)
        
       | ndesaulniers wrote:
       | Cool! Carbon is doing exactly this. I had asked leads if there
       | was a paper on this approach, but they didn't have anything for
       | me. I'll send them this post!
        
       | kazinator wrote:
       | > _Instead of allocating Expr objects willy-nilly on the heap,
       | we'll pack them into a single, contiguous array._
       | 
       | This happens naturally if you bump-allocate them in a garbage-
       | collected run-time, particularly under a copying collector. Free
       | lists also tend to co-locate because they are produced during
       | sweep phases of GC which run through heaps in order of address.
       | 
       | Don't make me bring out the L word for the billionth time.
       | 
       | > _A flat array of Exprs can make it fun and easy to implement
       | hash consing_
       | 
       | OK, it's not a case of L-ignorance, just willful neglect.
        
         | samps wrote:
         | FWIW I did acknowledge this in the article:
         | 
         | > A sufficiently smart memory allocator might achieve the same
         | thing, especially if you allocate the whole AST up front and
         | never add to it
         | 
         | > Again, a really fast malloc might be hard to compete with--
         | but you basically can't beat bump allocation on sheer
         | simplicity.
        
       ___________________________________________________________________
       (page generated 2025-01-10 23:00 UTC)