[HN Gopher] Exo: Exocompilation for productive programming of ha...
       ___________________________________________________________________
        
       Exo: Exocompilation for productive programming of hardware
       accelerators
        
       Author : gnabgib
       Score  : 41 points
       Date   : 2025-03-14 18:35 UTC (4 hours ago)
        
 (HTM) web link (github.com)
        
       | erdaniels wrote:
       | I'm not the target audience, but the GitHub and website
       | getting-started pages feel so poorly explained. What the hell
       | is a schedule?
        
         | gnabgib wrote:
         | This MIT article covers it a bit more (with a slightly too
         | generic title): _High-performance computing, with much less
         | code_
         | https://news.mit.edu/2025/high-performance-computing-with-mu...
         | (https://news.ycombinator.com/item?id=43357091)
        
         | imtringued wrote:
         | https://github.com/exo-lang/exo/blob/main/examples/avx2_matm...
         | 
         | I personally am not convinced.
        
         | ajb wrote:
         | A schedule is the order in which machine instructions get
         | executed.
         | 
         | So, I've done this professionally (written assembler code, and
         | then scheduled it manually to improve performance). Normally
         | you don't need to do that these days, as even mobile CPUs use
         | out-of-order cores which dynamically schedule at runtime.
         | 
         | It's only going to be useful if you're writing code for some
         | machine that doesn't do that (they give examples of TPUs etc.).
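
To make "schedule" in the sense used above concrete, here is a minimal sketch in plain Python (not Exo's API). It assumes a hypothetical single-issue, in-order machine where an instruction stalls until its source registers are ready; the `Instr` type, register names, and latencies are all made up for illustration.

```python
from typing import NamedTuple


class Instr(NamedTuple):
    name: str
    dst: str        # register written by this instruction
    srcs: tuple     # registers read by this instruction
    latency: int    # cycles until dst becomes available


def cycles_in_order(program):
    """Cycle at which the last result is ready on a toy in-order machine."""
    ready_at = {}   # register -> cycle at which its value is available
    cycle = 0       # earliest cycle at which the next instruction may issue
    for ins in program:
        # Stall until every source operand has been produced.
        start = max([cycle] + [ready_at.get(r, 0) for r in ins.srcs])
        ready_at[ins.dst] = start + ins.latency
        cycle = start + 1   # one instruction issued per cycle
    return max(ready_at.values())


ld_a = Instr("ld_a", "a", (), 3)
add_b = Instr("add_b", "b", ("a",), 1)
ld_c = Instr("ld_c", "c", (), 3)
add_d = Instr("add_d", "d", ("c",), 1)

naive = [ld_a, add_b, ld_c, add_d]       # each add waits on its load
scheduled = [ld_a, ld_c, add_b, add_d]   # second load hides the first's latency

print(cycles_in_order(naive), cycles_in_order(scheduled))
```

Both orders compute the same values; only the stall time differs, which is the kind of gain manual scheduling targets on hardware that does not reorder at runtime.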
        
           | almostgotcaught wrote:
           | > out-of-order cores which dynamically schedule at runtime.
           | 
           | OOO architectures don't reschedule dynamically - that's
           | impossible - they just have multiple instruction buffers that
           | can issue the instructions. So scheduling is still important
           | for OOO; it's just at the level of the DDG (data dependence
           | graph) instead of literally the linear order in the binary.
           | 
           | Edit: just want to emphasize
           | 
           | > It's only going to be useful if you're writing code for
           | some machine that doesn't do that
           | 
           | There is no architecture for which instruction scheduling
           | isn't crucial.
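
As a rough illustration of what "the level of the DDG" can mean, the sketch below builds a data dependence graph for a toy program and checks whether a proposed reordering respects it. It is plain Python with hypothetical names, and it assumes each register is written exactly once, so only read-after-write dependences are tracked.

```python
from typing import NamedTuple


class Instr(NamedTuple):
    name: str
    dst: str      # register written (assumed to be written only once)
    srcs: tuple   # registers read


def build_ddg(program):
    """Map each instruction name to the names of the instructions it reads from."""
    producer = {}   # register -> name of the instruction that wrote it
    deps = {}
    for ins in program:
        deps[ins.name] = {producer[r] for r in ins.srcs if r in producer}
        producer[ins.dst] = ins.name
    return deps


def respects_ddg(deps, order):
    """True if `order` contains every instruction once, each after its producers."""
    if sorted(order) != sorted(deps):
        return False
    pos = {name: i for i, name in enumerate(order)}
    return all(pos[p] < pos[n] for n, ps in deps.items() for p in ps)


program = [
    Instr("ld_a", "a", ()),
    Instr("add_b", "b", ("a",)),
    Instr("ld_c", "c", ()),
    Instr("add_d", "d", ("c",)),
]
ddg = build_ddg(program)

print(respects_ddg(ddg, ["ld_a", "ld_c", "add_b", "add_d"]))  # legal reordering
print(respects_ddg(ddg, ["add_b", "ld_a", "ld_c", "add_d"]))  # add_b before its load
```

Any order that passes this check is a topological order of the graph and computes the same values under the stated assumption; changing the graph itself is a different kind of transformation and needs a separate correctness argument.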
        
             | ajb wrote:
             | If you're talking about modifying the DDG, I would not call
             | that scheduling. Because then you need to do serious work
             | to prove that your code is actually doing the same thing.
             | But I haven't spent a lot of time in the compiler world, so
             | maybe they do call it that. Perhaps you could give your
             | definition?
        
       | almostgotcaught wrote:
       | "[compilation] for productive programming of hardware
       | accelerators"
       | 
       | But 93% of the codebase is Python lol. Whatever you think about
       | Python, it is not a systems programming language. Takeaway: this
       | is not a serious compiler project (and it's clearly not, it's a
       | PhD student project).
       | 
       | Deeper take: this is just a DSL that behind the scenes calls a
       | (SMT) solver for tuning (what they call "scheduling"). There are
       | a million examples of this approach for every architecture under
       | the sun. My org is literally building out the same thing right
       | now. Performance is directly a function of how good your model of
       | the architecture is. At such a high level, it's very likely to
       | produce _suboptimal_ results because you have no control over
       | ISA/intrinsic level decisions. Lower-level implementations are
       | much more robustly "powerful".
       | 
       | https://dl.acm.org/doi/10.1145/3332373
        
         | rscho wrote:
         | Well, this is clearly an attempt at abstracting the kind of
         | low-level stuff you describe. Perhaps it doesn't work (yet),
         | but that shouldn't prevent people from trying? Involving an
         | SMT solver suggests that the solver is doing the heavy
         | lifting, not Python. PhDs often produce inapplicable stuff,
         | but they are also the basis for industry/application R&D,
         | such as what your org is doing... PhDs are the slaves of
         | science. They make stuff happen for peanuts in return and
         | deserve our respect for that, even if what happens is
         | oftentimes a dead end. It's really sad seeing people shitting
         | on PhDs.
        
         | QuadmasterXLII wrote:
         | Any sufficiently powerful compiler is going to run an
         | interpreted language at compile time, and there's no reason it
         | can't be Python instead of C++ template metaprograms or CMake.
        
         | fancyfredbot wrote:
         | Your take seems to contradict the article? You say SMT solvers
         | give "no control over ISA/intrinsic level decisions" but their
         | design.md says "user-defined scheduling operations can
         | encapsulate common optimization patterns and hardware-specific
         | transformations". Are they wrong about this? Can you explain
         | why?
        
       | alex7o wrote:
       | Halide does something similar, but uses C++ as the host language
       | for its DSL: https://halide-lang.org/
        
         | gotoeleven wrote:
         | The Exo docs mention Halide as an example of a language that
         | is similar to Exo but "lowering-based", while Exo is
         | "rewrite-based." This seems to mean that Halide is more of a
         | DSL where what you want is specified at a higher level, while
         | Exo is a set of transformations you can apply to an existing
         | kernel of code (though at least in the examples the kernel is
         | written in Python and then somehow C is generated from it,
         | after the transformations are applied). Do you know what the
         | relative strengths of these two approaches are?
         | 
         | Also, are there any languages in this vein that take more of a
         | declarative (as opposed to imperative) approach?
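
One way to picture "a transformation applied to an existing kernel" is a loop split that leaves the result unchanged. The sketch below is plain Python written by hand, not Exo's API or its generated code; the kernel, the function names, and the tile size are hypothetical, and the split version is simply what such a rewrite might produce.

```python
def scale_add(n, a, x, y):
    """Baseline kernel: out[i] = y[i] + a * x[i] for i in [0, n)."""
    out = list(y)
    for i in range(n):
        out[i] += a * x[i]
    return out


def scale_add_split(n, a, x, y, tile=4):
    """Same kernel after splitting the i loop into outer/inner loops of size `tile`."""
    out = list(y)
    for io in range((n + tile - 1) // tile):   # outer loop over tiles
        for ii in range(tile):                 # inner loop within one tile
            i = io * tile + ii
            if i < n:                          # guard for the partial last tile
                out[i] += a * x[i]
    return out


x = list(range(10))
y = [1] * 10
print(scale_add(10, 3, x, y) == scale_add_split(10, 3, x, y))
```

The point of the rewrite framing is that the starting kernel stays the specification, and each transformation is expected to preserve its behavior while changing the loop structure.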
        
       | LegNeato wrote:
       | See also https://github.com/rust-gpu/rust-gpu and
       | https://github.com/rust-gpu/rust-cuda
        
       ___________________________________________________________________
       (page generated 2025-03-14 23:00 UTC)