[HN Gopher] Exo: Exocompilation for productive programming of ha...
___________________________________________________________________
Exo: Exocompilation for productive programming of hardware
accelerators
Author : gnabgib
Score : 41 points
Date : 2025-03-14 18:35 UTC (4 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| erdaniels wrote:
| I'm not the target audience, but the GitHub README and the
| website's getting-started page feel so poorly explained. What
| the hell is a schedule?
| gnabgib wrote:
| This MIT article covers it a bit more (with a slightly too
| generic title) _High-performance computing, with much less
| code_ https://news.mit.edu/2025/high-performance-computing-
| with-mu... (https://news.ycombinator.com/item?id=43357091)
| imtringued wrote:
| https://github.com/exo-lang/exo/blob/main/examples/avx2_matm...
|
| I personally am not convinced.
| ajb wrote:
| A schedule is the order in which machine instructions get
| executed.
|
| So, I've done this professionally (written assembler code, and
| then scheduled it manually to improve performance). Normally
| you don't need to do that these days, as even mobile CPUs use
| out-of-order cores which dynamically schedule at runtime.
|
| It's only going to be useful if you're writing code for a
| machine that doesn't do that (they give TPUs etc. as examples).
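The latency-hiding effect being described can be shown with a toy model. The sketch below (one-wide in-order issue, invented latencies, hypothetical instruction names — none of this corresponds to a real ISA) shows why hoisting independent loads ahead of the multiplies that consume them shortens the total run time:

```python
# Toy one-wide in-order pipeline: one instruction issues per cycle, but an
# instruction stalls until every operand it reads is available.
# Latencies and the dependence graph are invented for illustration.

LATENCY = {"load": 3, "mul": 2, "add": 1}

def run(schedule, deps):
    """Return total cycles to drain `schedule` on the toy in-order core."""
    ready = {}  # instruction name -> cycle its result becomes available
    cycle = 0
    for name, op in schedule:
        # Stall until all inputs this instruction reads are ready.
        cycle = max([cycle] + [ready[d] for d in deps.get(name, [])])
        ready[name] = cycle + LATENCY[op]
        cycle += 1  # issue the next instruction on the following cycle
    return max(ready.values())

# a = (l1 * x) + (l2 * y): two loads feed two muls, which feed an add.
deps = {"m1": ["l1"], "m2": ["l2"], "a": ["m1", "m2"]}
naive   = [("l1", "load"), ("m1", "mul"), ("l2", "load"), ("m2", "mul"), ("a", "add")]
hoisted = [("l1", "load"), ("l2", "load"), ("m1", "mul"), ("m2", "mul"), ("a", "add")]
print(run(naive, deps), run(hoisted, deps))  # prints: 10 7
```

Same instructions, same dependences, different order: the hoisted schedule finishes in 7 cycles instead of 10 because the second load's latency overlaps the first multiply. An out-of-order core performs this overlap in hardware; an in-order core (or accelerator) needs the compiler or programmer to do it.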
| almostgotcaught wrote:
| > out-of-order cores which dynamically schedule at runtime.
|
| OOO architectures don't reschedule dynamically - that's
| impossible - they just have multiple instruction buffers that
| can issue the instructions. So scheduling is still important
| for OOO; it's just at the level of the DDG (data-dependence
| graph) instead of the literal linear order in the binary.
|
| Edit: just want to emphasize
|
| > It's only going to be useful if you're writing code for
| some machine that doesn't do that
|
| There is no architecture for which instruction scheduling
| isn't crucial.
| ajb wrote:
| If you're talking about modifying the DDG, I would not call
| that scheduling. Because then you need to do serious work
| to prove that your code is actually doing the same thing.
| But I haven't spent a lot of time in the compiler world, so
| maybe they do call it that. Perhaps you could give your
| definition?
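One common compiler-side reading of the terms in this subthread, sketched concretely (the instruction encoding below is invented for illustration — real compilers also track write-after-read and write-after-write hazards):

```python
# Minimal data-dependence graph (DDG) builder. An instruction is modelled
# as (destination, [sources]); edges record read-after-write constraints.

def build_ddg(instrs):
    """Return (producer_index, consumer_index) edges for `instrs`."""
    last_writer = {}  # register name -> index of the instruction that wrote it
    edges = []
    for i, (dest, srcs) in enumerate(instrs):
        for s in srcs:
            if s in last_writer:
                edges.append((last_writer[s], i))
        last_writer[dest] = i
    return edges

# r1 = ...; r2 = ...; r3 = f(r1, r2); r4 = g(r3)
prog = [("r1", []), ("r2", []), ("r3", ["r1", "r2"]), ("r4", ["r3"])]
print(build_ddg(prog))  # prints: [(0, 2), (1, 2), (2, 3)]
```

Any topological order of these edges computes the same result, which is why a scheduler that only permutes instructions within the constraints of the DDG does not need a separate proof that the code still does the same thing.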
| almostgotcaught wrote:
| "[compilation] for productive programming of hardware
| accelerators"
|
| But 93% of the codebase is Python lol. Whatever you think about
| Python, it is not a systems programming language. Takeaway: this
| is not a serious compiler project (and it's clearly not, it's a
| PhD student project).
|
| Deeper take: this is just a DSL that behind the scenes calls a
| (SMT) solver for tuning (what they call "scheduling"). There
| are a million examples of this approach for every architecture
| under the sun. My org is literally building out the same thing
| right now. Performance is directly a function of how good your
| model of the architecture is. At such a high level it's very
| likely to produce _suboptimal_ results because you have no
| control over ISA/intrinsic-level decisions. Lower-level
| implementations are much more robustly "powerful".
|
| https://dl.acm.org/doi/10.1145/3332373
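For readers unfamiliar with what "tuning" means in this context: the approach described amounts to searching a space of candidate schedules and ranking them with a model of the architecture (via a solver, a heuristic, or measurement). A deliberately crude sketch — the cost model, the candidate tile sizes, and the cache figure are all invented for illustration:

```python
# Toy autotuner: enumerate candidate tile sizes and rank them with an
# invented analytical cost model. Real systems would use an SMT solver,
# a learned model, or direct measurement instead of this formula.

def cost(tile, n=1024, cache_lines=64):
    """Invented model: penalize tiles that overflow the 'cache' and the
    loop overhead of very small tiles."""
    overflow = max(0, tile - cache_lines) * 10  # spill penalty, made up
    overhead = n // tile                        # more tiles -> more loop overhead
    return overflow + overhead

candidates = [4, 8, 16, 32, 64, 128]
best = min(candidates, key=cost)
print(best)  # prints: 64
```

The commenter's point carries over directly: the search can only be as good as `cost` is faithful to the real machine, and a model at this level of abstraction cannot express instruction-selection decisions at all.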
| rscho wrote:
| Well, this is clearly an attempt at abstracting the kind of
| low-level stuff you describe. Perhaps it doesn't work (yet),
| but that shouldn't prevent people from trying? Involving an
| SMT solver suggests that the solver is doing the heavy
| lifting, not Python. PhDs often produce inapplicable stuff,
| but they are also the basis for industry/application R&D,
| such as what your org is doing... PhDs are the slaves of
| science. They make stuff happen for peanuts in return and
| deserve our respect for that, even if what happens is
| oftentimes a dead-end. It's really sad seeing people shitting
| on PhDs.
| QuadmasterXLII wrote:
| any sufficiently powerful compiler is going to run an
| interpreted language at compile time, and there's no reason it
| can't be Python instead of C++ template metaprograms or CMake
| fancyfredbot wrote:
| Your take seems to contradict the article? You say SMT solvers
| give "no control over ISA/intrinsic level decisions" but their
| design.md says "user-defined scheduling operations can
| encapsulate common optimization patterns and hardware-specific
| transformations". Are they wrong about this? Can you explain
| why?
| alex7o wrote:
| Halide does something similar, but uses C++ for its DSL:
| https://halide-lang.org/
| gotoeleven wrote:
| The Exo docs mention Halide as an example of a language similar
| to Exo but "lowering-based", while Exo is "rewrite-based." This
| seems to mean that Halide is more of a DSL where what you want
| is specified at a higher level, while Exo is a set of
| transformations you can apply to an existing kernel of code
| (though, at least in the examples, the kernel is written in
| Python and C is somehow generated from it after the
| transformations are applied). Do you know what the relative
| strengths of these two approaches are?
|
| Also, are there any languages in this vein that take more of a
| declarative (as opposed to imperative) approach?
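To make the lowering-based vs. rewrite-based distinction concrete, here is a hypothetical miniature of the rewrite-based style. The `Loop`/`split` names are invented for illustration and are not Exo's actual API; the idea is that each scheduling step takes a program and returns an equivalent, restructured program:

```python
# Hypothetical miniature of rewrite-based scheduling: the kernel is an
# explicit loop-nest IR and each scheduling step is a source-to-source
# rewrite of that IR. Names are invented, not Exo's API.

from dataclasses import dataclass

@dataclass
class Loop:
    var: str
    extent: int
    body: list  # nested Loop nodes or statement strings

def split(loop, factor):
    """Rewrite `for i in range(n): ...` into an outer/inner loop pair."""
    assert loop.extent % factor == 0, "sketch only handles exact splits"
    inner = Loop(loop.var + "i", factor, loop.body)
    return Loop(loop.var + "o", loop.extent // factor, [inner])

kernel = Loop("i", 8, ["C[i] += A[i] * B[i]"])
tiled = split(kernel, 4)
print(tiled.var, tiled.extent, tiled.body[0].var, tiled.body[0].extent)
# prints: io 2 ii 4
```

Roughly: in a lowering-based system like Halide the schedule is declared up front and the compiler lowers algorithm plus schedule in one pass, while in a rewrite-based system the user applies steps like this one to an existing kernel, so every intermediate program is a concrete, inspectable artifact.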
| LegNeato wrote:
| See also https://github.com/rust-gpu/rust-gpu and
| https://github.com/rust-gpu/rust-cuda
___________________________________________________________________
(page generated 2025-03-14 23:00 UTC)