[HN Gopher] Compilers and IRs: LLVM IR, SPIR-V, and MLIR
___________________________________________________________________
Compilers and IRs: LLVM IR, SPIR-V, and MLIR
Author : matt_d
Score : 36 points
Date : 2022-10-29 19:19 UTC (3 hours ago)
(HTM) web link (www.lei.chat)
| thechao wrote:
| I have an irrational dislike of SPIR-V. On the flip side of the
| coin I think MLIR is a work of genius -- especially as a
| springboard for ideas in developing custom IR.
| fooker wrote:
| MLIR is also siloing compiler research and development.
|
| Everyone and their mother has their own proprietary MLIR
| dialect nowadays, and the era of competitive open source
| compilers is sort of fading.
| k4st wrote:
| At Trail of Bits, we are creating a new compiler front/middle end
| for Clang called VAST [1]. It consumes Clang ASTs and creates a
| high-level, information-rich MLIR dialect. Then, we progressively
| lower it through various other dialects, eventually down to the
| LLVM dialect in MLIR, which can be translated directly to LLVM
| IR.
|
| Our goals with this pipeline are to enable static analyses that
| can choose the right abstraction level(s) for their goals and,
| using provenance, cross abstraction levels to relate results back
| to source code.
|
| Neither Clang ASTs nor LLVM IR alone meets our needs for static
| analysis. Clang ASTs are too verbose and lack explicit
| representations for implicit behaviours in C++. LLVM IR isn't
| really "one IR," it's two IRs (LLVM proper, and metadata),
| where LLVM proper is an unspecified family of dialects (-O0, -O1,
| -O2, -O3, then all the arch-specific stuff). LLVM IR also isn't
| easy to relate to source, even in the presence of maximal debug
| information. The Clang codegen process does ABI-specific lowering
| that takes high-level types/values and transforms them to be more
| amenable to storing in target-cpu locations (e.g. registers).
| This actively works against relating information across levels,
| something that we want to solve with intermediate MLIR dialects.
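|
| To make the ABI-lowering point concrete, here is a minimal sketch
| in Python. It assumes the x86-64 System V convention, where a
| by-value struct of two 32-bit ints is typically passed as one
| 64-bit integer. The function names are hypothetical and only
| model the packing; they are not part of Clang or VAST.

```python
import struct


def coerce_pair_to_i64(a: int, b: int) -> int:
    """Pack two signed 32-bit fields into one unsigned 64-bit word.

    Models (as an assumption) how `struct { int a; int b; }` ends up
    in a single integer register after ABI lowering.
    """
    return struct.unpack("<Q", struct.pack("<ii", a, b))[0]


def fields_from_i64(word: int) -> tuple[int, int]:
    """Recover the two fields; only possible if you know the layout."""
    return struct.unpack("<ii", struct.pack("<Q", word))


word = coerce_pair_to_i64(3, -1)
print(hex(word))              # 0xffffffff00000003
print(fields_from_i64(word))  # (3, -1)
```

| After the coercion, the value is just an opaque 64-bit word: the
| field names and the struct type are gone unless something outside
| the lowered IR records the layout.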
|
| Beyond our static analysis goals, I think an MLIR-based setup
| will be a key enabler of library-aware compiler optimizations.
| Right now, library-aware optimizations are challenging because
| Clang ASTs are hard to mutate, and by the time things are in LLVM
| IR, the abstraction boundaries provided by libraries are broken
| down by optimizations (e.g. inlining, specialization, folding),
| forcing optimization passes to reckon with the mechanics of how
| libraries are implemented.
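|
| A toy sketch of what a library-aware rewrite could look like
| while the library calls are still visible as high-level ops. The
| op names (`str.length`, `str.concat`, `add`) are hypothetical and
| the tuple encoding is an assumption made for illustration; this
| is not MLIR's or VAST's API.

```python
# Ops are nested tuples: (op_name, operand, ...). Leaves are variable names.
def rewrite(expr):
    """Rewrite length(concat(a, b)) into add(length(a), length(b))."""
    if not isinstance(expr, tuple):
        return expr  # a leaf such as "a"
    op, *args = expr
    args = [rewrite(arg) for arg in args]
    if (
        op == "str.length"
        and args
        and isinstance(args[0], tuple)
        and args[0][0] == "str.concat"
    ):
        _, lhs, rhs = args[0]
        # Recurse so that nested concats are rewritten as well.
        return (
            "add",
            rewrite(("str.length", lhs)),
            rewrite(("str.length", rhs)),
        )
    return (op, *args)


before = ("str.length", ("str.concat", "a", ("str.concat", "b", "c")))
after = rewrite(before)
print(after)
```

| The pattern only needs to know what the library operation means.
| Once `str.concat` has been inlined into loops and memory copies,
| the same rewrite would have to recognise that implementation
| instead.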
|
| We're very excited about MLIR, and we're pushing full steam ahead
| with VAST. MLIR is a technology that we can use to fix a lot of
| issues in Clang/LLVM that hinder really good static analysis.
|
| [1] https://github.com/trailofbits/vast
| erichocean wrote:
| > _LLVM dialect in MLIR, which can be translated directly to
| MLIR_
|
| Should be:
|
| LLVM dialect in MLIR, which can be translated directly to _LLVM
| IR_
|
| Otherwise, great project! We're also using MLIR internally and
| it's been awesome, game-changing even when considering how much
| can be accomplished with a reasonable amount of effort.
| k4st wrote:
| Typo fixed! Thanks :-)
|
| I think the next big problems for MLIR to address are things
| like: metadata/location maintenance when integrating with
| third-party dialects and transformations. With LLVM
| optimizations, getting the optimization right has always
| seemed like the top priority, and then maybe getting metadata
| propagation working came a distant second.
|
| I think the opportunity with MLIR is that metadata/location
| info can be the old nodes or other dialects. In our work, we
| want a tower/progression of IRs, and we want them
| _simultaneously_ in memory, all living together. You could
| think of the debug metadata for a lower level dialect being
| the higher level dialect. This is why I sometimes think about
| LLVM IR as really being two IRs: LLVM "code" and metadata
| nodes. Metadata nodes in LLVM IR can represent arbitrary
| structures, but lack concrete checks/balances. MLIR fixes
| this by unifying the representations, bringing in structure
| while retaining flexibility.
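|
| A minimal sketch of the "tower of IRs" idea, where the location
| of a lower-level op is simply the higher-level op it was lowered
| from. The `Op` class, the `lower` and `provenance` helpers, and
| the op names used are hypothetical, chosen for illustration
| rather than taken from MLIR or VAST.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Op:
    dialect: str
    name: str
    origin: Optional["Op"] = None     # higher-level op this was lowered from
    source_loc: Optional[str] = None  # only the top of the tower needs this


def lower(op: Op, dialect: str, name: str) -> Op:
    """Create a lower-level op that keeps a link to its origin."""
    return Op(dialect, name, origin=op)


def provenance(op: Op) -> list:
    """Walk from a low-level op up through every level it came from."""
    chain = [op]
    while chain[-1].origin is not None:
        chain.append(chain[-1].origin)
    return chain


high = Op("hl", "for_range", source_loc="main.cpp:12")
mid = lower(high, "cf", "cond_br")
low = lower(mid, "llvm", "br")

chain = provenance(low)
print(" <- ".join(f"{o.dialect}.{o.name}" for o in chain))
# llvm.br <- cf.cond_br <- hl.for_range
print(chain[-1].source_loc)  # main.cpp:12
```

| Because every level stays in memory, an analysis that reports a
| finding on the lowest op can walk `origin` links to whichever
| abstraction level it wants to report at.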
| manv1 wrote:
| Funny that there was no mention of GCC, since its IR was probably
| one of the first that anyone encountered IRL. If I remember
| correctly, one motivation for Clang/LLVM was that GCC's IR was
| so bad.
|
| I knew people that wrote backends for gcc, and they pretty much
| all agreed it was a nightmare.
___________________________________________________________________
(page generated 2022-10-29 23:00 UTC)