[HN Gopher] Why am I writing a Rust compiler in C?
       ___________________________________________________________________
        
       Why am I writing a Rust compiler in C?
        
       Author : todsacerdoti
       Score  : 104 points
       Date   : 2024-08-25 21:08 UTC (1 hours ago)
        
 (HTM) web link (notgull.net)
 (TXT) w3m dump (notgull.net)
        
       | prologist11 wrote:
       | This is super cool but what's interesting is that this same kind
       | of bootstrapping problem exists for hardware as well. What makes
       | computers? Previously built computers and software running on
       | them. The whole thing is really interesting to think about.
        
         | durumu wrote:
         | Which came first, the computers or the code?
        
           | hughesjj wrote:
           | Code, unless you count the abacus etc
        
           | nine_k wrote:
           | (The code, of course; the code drove music boxes and looms
           | centuries before computers. Same for chicken and egg: eggs
           | are maybe a billion years older.)
        
         | akira2501 wrote:
         | Then you look at the assembly for the old Cray-1 computers
         | (octal opcodes) and the IBM System/360 computers (word
         | opcodes), and you realize, they made it so amazingly simple you
         | can mostly just write the opcode bytes and assemble by hand if
         | you like.
         | 
         | Then x86 came along, without the giant budgets or the big
         | purchasers, and so they made that assembly as efficient and
         | densely packed as is possible; unfortunately, you lose what you
         | might otherwise conveniently have on other machines.
        
         | sekuntul wrote:
         | yep
        
       | fsckboy wrote:
       | TL;DR his goal is rust, but for bootstrapping a first rust
       | compiler for a new environment, the work is already done for C
       | 
       | the article is interesting, and links to some interesting things,
       | but that's what the article is about
       | 
       | his project is https://codeberg.org/notgull/dozer
       | 
       | he references bootstrappable builds https://bootstrappable.org/ a
       | systematized approach to starting from the ground up with
       | something very simple: a 512 byte "machine coder" (more basic
       | than an assembler), building up from there to rudimentary tools,
       | a "small C subset compiler" which compiles a better C compiler,
       | etc, turtles all the way up.
        
       | IshKebab wrote:
       | For bootstrapping it still feels weird to target C. You could
       | easily target a higher level language or just invent a better
       | language. You don't care about runtime performance. Feels like
       | you don't really gain that much by forcing yourself to jump
       | through the C hoop, and the cost of having to write an entire
       | compiler in C is huge.
       | 
       | Like, how hard would it be to go via Java instead? I bet you can
       | bootstrap to Java very easily.
        
         | ronsor wrote:
         | Every platform, for better or worse, gets a C compiler first.
         | Targeting C is the most practical option.
        
           | cozzyd wrote:
           | Right, but once you have C it's fairly straightforward to use
           | an interpreted language implemented in C (python, perl,
           | guile, lua, whatever).
           | 
           | Obviously such a compiler would likely be unusably slow, but
           | that's not important here.
        
             | trueismywork wrote:
             | You overestimate the comprehensiveness of the C standard,
             | with half the things being optional. It's not a given that
             | python will compile on a minimal conforming C compiler.
        
               | cozzyd wrote:
               | True, but Lua probably will :)
        
         | syntheticnature wrote:
         | I'd expect it to be harder. I used to work on a large embedded
         | device that ran some Java code, and there was a specialist
         | vendor providing Java for the offbeat processor platform.
         | 
         | After a little digging, I found a blog post about it, and it
         | does sound denser than the poster's plans to bootstrap Rust:
         | https://www.chainguard.dev/unchained/fully-bootstrapping-jav...
        
         | fsckboy wrote:
         | > _feels weird to target C_
         | 
         | he's not targeting C, he's targeting rust; he's using C
         | 
         | it's an important distinction, because _he's not writing the C
         | compilers involved_, he's leveraging them to compile his target
         | rust compiler which will be used to compile a rust-on-rust
         | compiler. The C compiler is the compiler he has available; with
         | any other solution he would have to write that compiler too,
         | but his target is rust.
        
       | iTokio wrote:
       | It's a huge project, I wonder if it wouldn't be simpler to try to
       | compile cranelift or mrustc to wasm (that's still quite
       | difficult) then use wasm2c to get a bootstrap compiler.
        
         | umanwizard wrote:
         | The resulting C would not be "source code".
         | 
         | Edit to explain further: the point is for the code to be
         | written (or at least auditable) by humans.
        
         | fallingsquirrel wrote:
         | That's the approach Zig is taking:
         | https://ziglang.org/news/goodbye-cpp/
        
       | foldr wrote:
       | Very cool project.
       | 
       | I'm not totally sold on the practical justification (though I
       | appreciate that might not be the real driving motive here). This
       | targets Cranelift, so it gives you a Rust compiler targeting the
       | platforms that Rust already supports. You _could_ use it to cross
       | compile Rust code from a non-supported platform to a supported
       | one, but then you'd be using a 'toy' implementation for
       | generating your production builds (rather than just to bootstrap
       | a compiler).
        
       | nitwit005 wrote:
       | I'm not sure I see the point. To generate functional new binaries
       | on the target machine, rustc will need to support the target. If
       | you add that support to rustc, you can just have it build itself.
        
         | jeffparsons wrote:
         | It's about having a shorter auditable bootstrap process much
         | more than it is about supporting new architectures.
        
           | bawolff wrote:
           | Regardless, the process is so long that it seems unauditable
           | in practice.
           | 
           | Like i guess i can see the appeal of starting from nothing as
           | a kind of cool achievement, but i don't think it helps with
           | audited code.
        
           | dathery wrote:
           | Not dismissing the usefulness of the project at all, but
           | curious what the concrete benefits of that are -- is it
           | mainly to have a smaller, more auditable bootstrap process to
           | make it easier to avoid "Reflections on Trusting Trust"-type
           | attacks?
           | 
           | It seems like you'd need to trust a C compiler anyway, but I
           | guess the idea is that there are a lot of small C compiler
           | designs that are fairly easy to port?
        
       | nijaar wrote:
       | if this works would this make the rust compiler considerably
       | smaller / faster?
        
         | josephg wrote:
         | Smaller? Yes. Faster? Almost certainly not.
         | 
         | It really doesn't make sense to optimize anything in a
         | bootstrapping compiler. Usually the only code that will ever be
         | compiled by this compiler will be rustc itself. And rustc
         | doesn't need to run fast - just fast enough to recompile
         | itself. So, the output also probably won't have any
         | optimisations applied or anything like that.
        
       | Someone wrote:
       | If I were to try bootstrapping rust, I think I would write a
       | proto-rust in C that has fewer features than full rust, and then
       | write a full rust compiler in proto-rust.
       | 
       | 'proto-rust' might, for example, not have a borrow checker, may
       | have limited or no macro support, may never free memory (freeing
       | memory isn't strictly needed in a compiler whose only goal in
       | life is to compile a better compiler), and definitely need not
       | create good code.
       | 
       | That proto-rust would basically be C with rust syntax, but for
       | rust aficionados, I think that's better than writing a rust
       | compiler in "C with C syntax", which is what this project aims
       | for.
       | 
       | Anybody know why this path wasn't taken?
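
The "may never free memory" idea in the comment above can be sketched as a bump allocator in C. This is a minimal illustration, not code from dozer or any real compiler: the names `Arena`, `arena_init`, and `arena_alloc` are hypothetical, and it assumes the requested alignment is a power of two no larger than what `malloc` guarantees.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bump allocator: hands out memory and never frees it. */
typedef struct {
    unsigned char *base; /* start of the backing buffer */
    size_t cap;          /* total bytes available */
    size_t used;         /* bytes handed out so far */
} Arena;

/* Reserve one big block up front; returns 0 on success, -1 on failure. */
static int arena_init(Arena *a, size_t cap) {
    a->base = malloc(cap);
    a->cap = cap;
    a->used = 0;
    return a->base ? 0 : -1;
}

/* Return `size` bytes at an offset aligned to `align` (a power of two),
 * or NULL if the arena is exhausted. There is deliberately no free(). */
static void *arena_alloc(Arena *a, size_t size, size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    size_t start = (a->used + align - 1) & ~(align - 1);
    if (start > a->cap || size > a->cap - start) {
        return NULL;
    }
    a->used = start + size;
    return a->base + start;
}
```

A compiler built this way would call `arena_init` once at startup, allocate every AST node and string through `arena_alloc`, and let the operating system reclaim everything at exit. That is tolerable only because the process is short-lived and compiles a single, known program.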
        
         | sjrd wrote:
         | This is what we did for Mozart/Oz [1]. We have a compiler for
         | "proto-Oz" written in Scala. We use it to compile the real
         | compiler, which is written in Oz. Since the Scala compiler
         | produces inefficient code, we then _re_compile the real
         | compiler with itself. This way we finally have an efficient
         | real compiler producing good code. This is all part of the
         | standard build of the language.
         | 
         | [1] https://github.com/mozart/mozart2
        
         | TwentyPosts wrote:
         | So now you're writing two compilers.
         | 
         | What did you actually gain from this, outside of more work?
        
           | Etheryte wrote:
           | Two simpler pieces of work as opposed to one complex one.
           | Even if the two parts might be more volume, they're both
           | easier to write and debug.
        
           | returningfory2 wrote:
           | Writing a small compiler in C and a big compiler in Rust is
           | simpler than writing a big compiler in C.
        
       | jhatemyjob wrote:
       | This is why the C ABI won't die for a very very long time, if
       | ever. It was never about performance or security, it's about
       | compatibility, and that's what most people don't understand. Bell
       | Labs was first.
        
         | cozzyd wrote:
         | Yes, I think rust made a big mistake in not going for a stable
         | (or at least mostly stable like C++) ABI (other than the C
         | one). "Statically link everything" is fine for desktops and
         | servers, but not for e.g. embedded Linux applications with
         | limited storage. It's too bad because things like routers are
         | some of the most security sensitive devices.
        
       | wrs wrote:
       | Given that we're this far along, bootstrapping is purely an
       | aesthetic exercise (and a cool one, to be sure -- I love
       | aesthetic exercises). If it were an actual practical concern,
       | presumably it would be much easier to use the current rustc
       | toolchain to compile rustc to RISC-V and write a RISC-V emulator
       | in C suitable for TinyC. Unless it's a trust exercise and you
       | don't trust any rustc version.
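
To give a sense of scale for the "RISC-V emulator in C" suggestion above, here is a deliberately tiny fetch-decode-execute loop. It is a sketch under strong assumptions: it handles only the RV32I `ADDI`, `ADD`, and `ECALL` instructions, has no memory, branches, or system calls, and the names `Cpu` and `cpu_run` are hypothetical. Running a real rustc binary would need a far more complete emulator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical minimal CPU state: 32 integer registers and a program counter. */
typedef struct {
    uint32_t x[32];
    uint32_t pc;
} Cpu;

/* Execute 32-bit instructions from `code` until ECALL.
 * Returns 0 on ECALL, -1 on an unsupported instruction or running off the end. */
static int cpu_run(Cpu *cpu, const uint32_t *code, size_t n_words) {
    while (cpu->pc / 4 < n_words) {
        assert(cpu->pc % 4 == 0);
        uint32_t insn = code[cpu->pc / 4];
        uint32_t opcode = insn & 0x7f;
        uint32_t rd = (insn >> 7) & 0x1f;
        uint32_t funct3 = (insn >> 12) & 0x7;
        uint32_t rs1 = (insn >> 15) & 0x1f;
        uint32_t rs2 = (insn >> 20) & 0x1f;
        uint32_t funct7 = insn >> 25;
        uint32_t result;

        if (insn == 0x00000073u) {
            return 0; /* ECALL: treated here as "halt" */
        }
        if (opcode == 0x13 && funct3 == 0) {
            /* ADDI: sign-extend the 12-bit immediate by hand */
            uint32_t imm = insn >> 20;
            if (imm & 0x800) {
                imm |= 0xfffff000u;
            }
            result = cpu->x[rs1] + imm;
        } else if (opcode == 0x33 && funct3 == 0 && funct7 == 0) {
            /* ADD */
            result = cpu->x[rs1] + cpu->x[rs2];
        } else {
            return -1; /* anything else is unsupported in this sketch */
        }
        if (rd != 0) {
            cpu->x[rd] = result; /* x0 is hard-wired to zero */
        }
        cpu->pc += 4;
    }
    return -1;
}
```

The code sticks to fixed-width unsigned arithmetic and avoids implementation-defined signed shifts, which is the kind of restraint a small compiler such as TinyC would reward.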
        
       | modevs wrote:
       | Rewrite it in C. Great idea. Just don't tell the rust
       | community...
       | 
       | I like to see that programmers like you still exist and believe in
       | what they do.
       | 
       | Remembered this article...
       | https://drewdevault.com/2019/03/25/Rust-is-not-a-good-C-repl...
        
       ___________________________________________________________________
       (page generated 2024-08-25 23:00 UTC)