[HN Gopher] How to learn compilers: LLVM Edition
___________________________________________________________________
How to learn compilers: LLVM Edition
Author : AlexDenisov
Score : 62 points
Date : 2021-11-04 21:00 UTC (1 day ago)
(HTM) web link (lowlevelbits.org)
(TXT) w3m dump (lowlevelbits.org)
| anonymousDan wrote:
| I have to say personally I find general program analysis (e.g.
| for security) a much more interesting topic than most vanilla
| compiler courses. For example I recently came across this course
| by the maintainers of soot:
| https://youtube.com/playlist?list=PLamk8lFsMyPXrUIQm5naAQ08a...
|
| Any pointers to similar courses much appreciated!
| the_benno wrote:
| Anders Moeller and Michael Schwartzbach's book [1] on static
| program analysis is a fantastic resource, with (I think) a
| great balance of theory and practice. If you want to get really
| deep into the theory of program analysis, Patrick Cousot just
| published an incredibly thorough book on abstract
| interpretation (just got my copy this week, so haven't fully
| explored enough to have much of an opinion on it as a
| pedagogical resource)
|
| [1] cs.au.dk/~amoeller/spa
| andrewchambers wrote:
| More great things:
|
| - https://c9x.me/compile/
|
| - https://github.com/vnmakarov/mir
| tester34 wrote:
| Since there are many compiler hackers here, I'd like to ask a
| question:
|
| How do you distribute your frontend with LLVM?
|
| Let's say that I have a lexer, parser and emitter written in, e.g.,
| Haskell (random example).
|
| I emit LLVM IR and then use LLVM to generate something else.
|
| The problem is that I need the LLVM binaries, and I'd rather
| avoid telling people who want to contribute to my OSS project to
| install LLVM, because it's a painful process.
|
| So I thought about just adding a "binaries" folder to my repo and
| putting the executables there, but they're huge as hell! Also,
| when you're on Linux, you don't need the Windows binaries.
|
| Another problem is that the LLVM installer doesn't include all the
| LLVM components that I need (llc and wasm-ld), so I have to compile
| LLVM myself and tell CMake (iirc) to generate those.
|
| I thought about creating a second repo holding binaries compiled
| for all platforms (Mac, Linux, Windows); after cloning my_repo1,
| the instructions would point to the specific binaries to download.
|
| How do you people do it?
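|
| A minimal sketch of the "second repo with per-platform binaries"
| idea: map the contributor's OS and CPU to the archive they should
| download. The archive names and the table layout are hypothetical;
| nothing here is prescribed by LLVM.

```python
import platform

# Hypothetical archive names for a separate "binaries" repository.
ARCHIVES = {
    ("linux", "x86_64"): "llvm-tools-linux-x86_64.tar.gz",
    ("darwin", "x86_64"): "llvm-tools-macos-x86_64.tar.gz",
    ("darwin", "arm64"): "llvm-tools-macos-arm64.tar.gz",
    ("windows", "x86_64"): "llvm-tools-windows-x86_64.zip",
}


def normalize_machine(machine):
    """Fold common aliases (AMD64, aarch64) into one spelling."""
    m = machine.lower()
    return {"amd64": "x86_64", "aarch64": "arm64"}.get(m, m)


def pick_archive(system=None, machine=None):
    """Return the archive name for the given (or current) platform."""
    system = (system or platform.system()).lower()
    machine = normalize_machine(machine or platform.machine())
    try:
        return ARCHIVES[(system, machine)]
    except KeyError:
        raise RuntimeError(
            f"no prebuilt LLVM tools for {system}/{machine}"
        ) from None
```

| A setup script could call `pick_archive()` with no arguments and
| fetch only that one file, so Linux users never download the Windows
| binaries.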
| staticfloat wrote:
| In the Julia world, we make redistributable binaries for all
| sorts of things; you can find lots of packages here [0], and
| for LLVM in particular (which Julia uses to do its codegen) you
| can find _just_ libLLVM.so (plus a few supporting files) here
| [1]. If you want a more fully-featured, batteries-included
| build of LLVM, check out this package [2].
|
| When using these JLL packages from Julia, it will automatically
| download and load in dependencies, but if you're using it from
| some other system, you'll probably need to manually check out
| the `Project.toml` file and see what other JLL packages are
| listed as dependencies. As an example, `LLVM_full_jll` requires
| `Zlib_jll` [3], since we build with support for compressed ELF
| sections. As you may have guessed, you can get `Zlib_jll` from
| [4], and it thankfully does not have any transitive
| dependencies.
|
| In the Julia world, we're typically concerned with dynamic
| linking (we `dlopen()` and `dlsym()` our way into all our
| binary dependencies), so this may not meet all your needs, but I
| figured I'd give it a shout out as it is one of the easier ways
| to get some binaries; just `curl -L $url | tar -zxv` and you're
| done. Some larger packages like GTK need to have environment
| variables set to get them to work from strange locations like
| the user's home directory. We set those in Julia code when the
| package is loaded [5], so if you try to use a dependency like
| one of those, you're on your own to set whatever environment
| variables/configuration options are needed in order to make
| something work at an unusual location on disk. Luckily, LLVM
| (at least the way we use it, via `libLLVM.so`) doesn't require
| any such shenanigans.
|
| [0] https://github.com/JuliaBinaryWrappers/
| [1] https://github.com/JuliaBinaryWrappers/libLLVM_jll.jl/releas...
| [2] https://github.com/JuliaBinaryWrappers/LLVM_full_jll.jl/rele...
| [3] https://github.com/JuliaBinaryWrappers/LLVM_full_jll.jl/blob...
| [4] https://github.com/JuliaBinaryWrappers/Zlib_jll.jl/releases
| [5] https://github.com/JuliaGraphics/Gtk.jl/blob/0ff744723c32c3f...
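|
| For a project setup script, the `curl -L $url | tar -zxv` step could
| be done roughly like this with only the Python standard library. The
| function names are made up for illustration, and the URL is whatever
| release tarball you picked; only unpack archives you trust.

```python
import io
import tarfile
import urllib.request


def extract_tarball(data, dest):
    """Unpack gzip-compressed tar bytes into dest; return member names."""
    with tarfile.open(fileobj=io.BytesIO(data), mode="r:gz") as tar:
        names = tar.getnames()
        # Only do this for trusted archives: members are written to disk
        # under dest exactly as the archive names them.
        tar.extractall(dest)
    return names


def fetch_and_extract(url, dest):
    """Download a .tar.gz and unpack it (urlopen follows redirects)."""
    with urllib.request.urlopen(url) as response:
        return extract_tarball(response.read(), dest)
```

| `fetch_and_extract(url, "deps/llvm")` would then leave the unpacked
| files under `deps/llvm`; any dependencies listed in `Project.toml`
| still have to be fetched the same way.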
| xrisk wrote:
| You can statically link LLVM, no problem.
|
| In fact, you never have to call any binaries specifically; just
| do it through code and everything should link at compile-time
| and become one big binary.
| 10000truths wrote:
| This is in fact what Zig does. Everything is statically
| linked into one binary that is used for compiling, linking,
| building, testing etc.
| HowardStark wrote:
| Not a compiler hacker and unfamiliar with the scene but is
| there a specific reason that `git-lfs` wouldn't work? It's the
| first thing that came to mind reading this. You can also pretty
| easily fetch specific objects as opposed to everything, so in
| your README you could direct contributors to only fetch
| specific binaries for given tasks.
| jcranmer wrote:
| When you have an LLVM frontend, what you generally do is have
| your driver run the optimization and code generation steps
| itself using the LLVM APIs rather than using opt/llc binaries
| to drive this step. That way, you don't need the LLVM binaries,
| just the libraries that you statically link into your
| executable.
|
| For example, all of the code in clang to do this is located in
| https://github.com/llvm/llvm-project/blob/main/clang/lib/Cod...
| tester756 wrote:
| What if my frontend isn't written in C++, e.g. Haskell, JS,
| Java, C#, etc.?
| jcranmer wrote:
| You use the LLVM-C bindings via your favorite FFI mechanism
| to generate the code then, usually.
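|
| As a rough illustration of the FFI route, here is a sketch that
| calls a few LLVM-C functions through Python's `ctypes`. It assumes a
| shared libLLVM is installed where `find_library` can see it and
| returns None otherwise; the helper name `empty_module_ir` is made up
| for this example. Other languages would do the same through their
| own FFI.

```python
import ctypes
import ctypes.util


def empty_module_ir(name):
    """Return the textual IR of an empty module, or None without libLLVM."""
    # Assumption: a shared libLLVM is discoverable on this system.
    path = ctypes.util.find_library("LLVM")
    if path is None:
        return None
    try:
        llvm = ctypes.CDLL(path)
        create = llvm.LLVMModuleCreateWithName
        to_string = llvm.LLVMPrintModuleToString
        dispose_message = llvm.LLVMDisposeMessage
        dispose_module = llvm.LLVMDisposeModule
    except (OSError, AttributeError):
        return None

    # Declare the C signatures; opaque LLVM refs are plain pointers here.
    create.argtypes = [ctypes.c_char_p]
    create.restype = ctypes.c_void_p
    to_string.argtypes = [ctypes.c_void_p]
    to_string.restype = ctypes.c_void_p  # keep the raw pointer to free it
    dispose_message.argtypes = [ctypes.c_void_p]
    dispose_message.restype = None
    dispose_module.argtypes = [ctypes.c_void_p]
    dispose_module.restype = None

    module = create(name.encode())
    text_ptr = to_string(module)
    try:
        return ctypes.string_at(text_ptr).decode()
    finally:
        # LLVM owns both allocations, so release them through its API.
        dispose_message(text_ptr)
        dispose_module(module)
```

| A real frontend would go on to add functions and instructions with
| the builder part of the same C API before handing the module to code
| generation.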
| [deleted]
| chrisaycock wrote:
| I learned a lot about LLVM by looking at the compiler output from
| Clang:
|
|     clang -emit-llvm -S sample.cpp
|
| The article mentions Clang's AST, which can also be emitted:
|
|     clang -Xclang -ast-dump -fsyntax-only sample.cpp
|
| And for checking compiler outputs across lots of languages and
| implementations, there's always Matt Godbolt's Compiler Explorer:
| https://godbolt.org
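|
| To run both Clang invocations over many files, a small wrapper along
| these lines may help. The helper names are made up; the flags are
| the ones quoted above, and the runner returns None when the tool is
| not on PATH.

```python
import shutil
import subprocess


def emit_llvm_cmd(source):
    """Command that writes textual LLVM IR next to the source (.ll)."""
    return ["clang", "-emit-llvm", "-S", source]


def ast_dump_cmd(source):
    """Command that prints Clang's AST for the source to stdout."""
    return ["clang", "-Xclang", "-ast-dump", "-fsyntax-only", source]


def run_if_available(cmd):
    """Run cmd and return its stdout, or None if the tool is missing."""
    if shutil.which(cmd[0]) is None:
        return None
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout
```

| For example, `run_if_available(ast_dump_cmd("sample.cpp"))` returns
| the AST dump as a string when clang is installed.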
| CalChris wrote:
| 1. _Getting Started with LLVM Core Libraries_
|
| It's a bit dated (covers DAGISel rather than GlobalISel) but it
| gives a thorough introduction.
|
| 2. LLVM Developer Meeting tutorials
|
| These are _really_ good although you'll have to put them in
| order yourself. They will be out of date, a little. LLVM is a
| moving target. Also, you don't have to go through every tutorial.
| For example, MLIR is not for me.
|
| 3. LLVM documentation
|
| I spent less time reading this than going through the Developer
| Meeting tutorials. I generally use it as a reference.
|
| 4. Discord, LLVM email list, git blame, LLVM Weekly
|
| ... because you will have questions.
|
| 5. MyFirstTypoFix (in the docs)
|
| ... when it comes time to submit a patch.
|
| 6. Mips backend
|
| If you're doing a backend, you will need a place to start. The
| LLVM documentation points you to the horribly out of date SPARC
| backend. Don't even touch that. AArch64 and x86 are very full
| featured and thus very complex (100 kloc+). Don't use those
| either. RISC-V is ok but concerns itself mostly with supporting
| new RISC-V features rather than keeping up to date with LLVM
| compiler services. Don't use that either although _definitely_
| work through Alex Bradbury's RISC-V backend tutorials. Read the
| Mips backend. It is actively maintained. It has good GlobalISel
| support almost on par with the flagship AArch64 and x86 backends.
|
| BTW, Chris Lattner is a super nice guy.
| jcranmer wrote:
| > 4. Discord, LLVM email list, git blame
|
| and don't forget IRC!
| UncleOxidant wrote:
| > LLVM Developer Meeting tutorials
|
| Are these all in one place or scattered about?
| CalChris wrote:
| Either llvm.org under Developer Meetings or the LLVM Youtube
| channel. The advantage of llvm.org is that it has a lot of
| the PDFs for the presentations as well as some old, pre-
| Youtube tutorials.
___________________________________________________________________
(page generated 2021-11-05 23:00 UTC)