[HN Gopher] Show HN: Onramp Can Compile Doom
       ___________________________________________________________________
        
       Show HN: Onramp Can Compile Doom
        
       Author : ludocode
       Score  : 60 points
       Date   : 2024-12-30 06:40 UTC (1 days ago)
        
 (HTM) web link (ludocode.com)
 (TXT) w3m dump (ludocode.com)
        
       | in-pursuit wrote:
       | Since this hasn't gotten much attention, I just wanted to say
       | that I think this is a cool project. Nice work!
        
       | nenadg wrote:
       | as an alpine linux enthusiast, i can say that this is fantastic.
       | keep it clean
        
       | purple-leafy wrote:
       | Cool project, love the bit about aliens
        
         | purple-leafy wrote:
         | Can you do a blog about the goals of your project in terms of
         | tech-archaeology? Fascinating topic
        
       | tekknolagi wrote:
       | This is so great. I've been watching the project develop and it's
       | really neat to see this milestone!
        
       | gcr wrote:
       | TL;DR: this is an exercise in implementing a C compiler from
       | scratch. "From scratch" here means "without an existing
       | gcc/clang," so consider civilization destruction scenarios,
       | aliens reading our source code, EMP strike that takes out all
       | smart silicon, corporate policy won't let you download
       | development tools, you only have a javascript console and
       | gumption, etc...
       | 
       | To do this, you must:
       | 
       | 1. Implement a small tool that turns hexadecimal into binary (you
       | can do this in any language)
       | 
       | 2. Use whatever you have (python, POSIX shell, alien crystal
       | substrate, x86-64 machine code, ...) to implement a small VM that
       | runs simple bytecode. The VM has 16 registers and 16MB of working
       | memory. There are sixteen opcodes to implement for arithmetic,
       | memory manipulation, and control flow. There are also twelve
       | syscalls for fopen/fread/fwrite/unlink(!)/etc.
       | 
       | After these two steps (that you have to repeat yourself post-
       | civilization collapse), everything's self-hosted:
       | 
       | 3. Use the VM to write a manual linker that resolves labels
       | 
       | 4. Use the linker to write an assembler for a custom assembly
       | language
       | 
       | 5. Use the assembler to implement a minimal C compiler /
       | preprocessor, that then compiles a more complex C compiler, that
       | can compile a C17 compiler, that then compiles doom
       | 
       | See also: nand2tetris (focus is on teaching, less pragmatism),
       | Cosmopolitan C (x64 as actually portable runtime)
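
Step 2 above can be sketched as a tiny register machine. This is only an
illustration of the general shape, not Onramp's actual VM: the opcode names,
the (op, a, b, c) tuple encoding, the 32-bit word size, and the little-endian
memory layout are all assumptions made for this example, and the syscalls
are left out entirely.

```python
MEM_SIZE = 16 * 1024 * 1024  # 16 MB of working memory, as in the description
WORD_MASK = 0xFFFFFFFF       # assumption: 32-bit registers


def run(program, mem_size=MEM_SIZE):
    """Execute a list of (op, a, b, c) tuples and return the register file."""
    regs = [0] * 16              # sixteen registers
    mem = bytearray(mem_size)    # flat working memory
    pc = 0                       # index of the next instruction
    while True:
        op, a, b, c = program[pc]
        pc += 1
        if op == "imm":          # regs[a] = constant b
            regs[a] = b & WORD_MASK
        elif op == "add":        # regs[a] = regs[b] + regs[c]
            regs[a] = (regs[b] + regs[c]) & WORD_MASK
        elif op == "sub":        # regs[a] = regs[b] - regs[c]
            regs[a] = (regs[b] - regs[c]) & WORD_MASK
        elif op == "stw":        # store regs[a] at address regs[b]
            addr = regs[b]
            mem[addr:addr + 4] = regs[a].to_bytes(4, "little")
        elif op == "ldw":        # load regs[a] from address regs[b]
            addr = regs[b]
            regs[a] = int.from_bytes(mem[addr:addr + 4], "little")
        elif op == "jz":         # jump to instruction b if regs[a] is zero
            if regs[a] == 0:
                pc = b
        elif op == "jmp":        # unconditional jump to instruction a
            pc = a
        elif op == "halt":
            return regs
        else:
            raise ValueError(f"unknown opcode: {op!r}")


# Hypothetical program: sum 5 + 4 + 3 + 2 + 1 into r0, then round-trip the
# result through memory into r3.
PROGRAM = [
    ("imm", 0, 0, 0),      # 0: r0 = 0 (accumulator)
    ("imm", 1, 5, 0),      # 1: r1 = 5 (counter)
    ("imm", 2, 1, 0),      # 2: r2 = 1 (decrement)
    ("jz", 1, 7, 0),       # 3: leave the loop when the counter hits zero
    ("add", 0, 0, 1),      # 4: r0 += r1
    ("sub", 1, 1, 2),      # 5: r1 -= 1
    ("jmp", 3, 0, 0),      # 6: back to the loop test
    ("imm", 4, 0x100, 0),  # 7: r4 = address 0x100
    ("stw", 0, 4, 0),      # 8: mem[r4] = r0
    ("ldw", 3, 4, 0),      # 9: r3 = mem[r4]
    ("halt", 0, 0, 0),     # 10: stop and return the registers
]
```

Calling `run(PROGRAM)` returns the register list with 15 in both r0 and r3.
A real bootstrap VM would decode instructions from bytes in memory instead
of Python tuples; the tuple form just keeps the dispatch loop easy to read.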
        
       | jjnoakes wrote:
       | > Security: Compiler binaries can contain malware and backdoors
       | that insert viruses into programs they compile. Malicious code in
       | a compiler can even recognize its own source code and propagate
       | itself. Recompiling a compiler with itself therefore does not
       | eliminate the threat. The only compiler that can truly be trusted
       | is one that you've bootstrapped from scratch.
       | 
       | It is a laudable goal, but without using from-scratch hardware
       | and either running the bootstrap on bare metal or on a from-
       | scratch OS, I think "truly be trusted" isn't quite reachable with
       | an approach that only handles user-space program execution.
        
         | ludocode wrote:
         | Indeed! An eventual goal of Onramp is to bootstrap in
         | freestanding so we can boot directly into the VM without an OS.
         | This eliminates all binaries except for the firmware of the
         | machine. The stage0/live-bootstrap team has already
         | accomplished this so we know it's possible. Eliminating
         | firmware is platform-dependent and mostly outside the scope of
         | Onramp but it's certainly something I'd like to do as a related
         | bootstrap project.
         | 
         | A modern UEFI is probably a million lines of code so there's a
         | huge firmware trust surface there. One way to eliminate this
         | would be to bootstrap on much simpler hardware. A rosco_m68k
         | [1] is an example, one that requires no third-party firmware
         | at all aside from the non-programmable microcode of
         | the processor. (A Motorola 68010 is thousands of times slower
         | than a modern processor so the bootstrap would take days, but
         | that's fine, I can wait!)
         | 
         | Of course there's still the issue of trusting that the data
         | isn't modified getting into the machine. For example you have
         | to trust the tools you're using to flash EEPROM chips, or if
         | you're using an SD card reader you have to trust its firmware.
         | You also have to trust that your chips are legit, that the
         | Motorola 68010 isn't a modern fake that emulates it while
         | compromising it somehow. If you had the resources you'd
         | probably want to x-ray the whole board at a minimum to make
         | sure the chips are real. As for trusting ROM, I have some crazy
         | ideas on how to get data into the machine in a trustable way,
         | but I'm not quite ready to embarrass myself by saying them out
         | loud yet :)
         | 
         | [1]: https://rosco-m68k.com/
        
       | fuhsnn wrote:
       | I wonder what the author's view on Forth is. It seems like the
       | role of the bytecode VM here might be interchangeable with a
       | Forth implementation.
        
         | ludocode wrote:
         | Author here. I think my opinion would be about the same as the
         | authors of the stage0 project [1]. They invested quite a bit of
         | time trying to get Forth to work but ultimately abandoned it.
         | Forth has been suggested often for bootstrapping a C compiler,
         | and I hope someone does it someday, but so far no one has
         | succeeded.
         | 
         | Programming for a stack machine is really hard, whereas
         | programming for a register machine is comparatively easy. I
         | designed the Onramp VM specifically to be easy to program in
         | bytecode, while also being easy to implement in machine code.
         | Onramp bootstraps through the same linker and assembly
         | languages that are used in a traditional C compilation process
         | so there are no detours into any other languages like Forth (or
         | Scheme, which live-bootstrap does with mescc.)
         | 
         | tl;dr I'm not really convinced that Forth would simplify
         | things, but I'd love to be proven wrong!
         | 
         | [1]: https://github.com/oriansj/stage0?tab=readme-ov-file#forth
        
           | entaloneralie wrote:
           | You might get a kick out of the C compiler in DuskOS (a
           | bare-metal Forth system).
           | 
           | https://git.sr.ht/~vdupras/duskos/tree/master/item/fs/doc/co.
           | ..
        
       | smurpy wrote:
       | Fascinating exercise and nice work!
       | 
       | Adjacent (resilient, low-level, big-vision, auditable) projects
       | include:
       | 
       | http://collapseos.org/ Forth OS, bootstrappable from paper, for
       | z80
       | 
       | https://urbit.org/ standalone, distributed, auditable, provable,
       | minimalist
       | 
       | https://justine.lol/ APE (actually portable executable);
       | cosmopolitan libc
        
       | binarymax wrote:
       | From the GitHub for Onramp: it's "self-bootstrapping and can
       | compile itself from scratch". What does that mean? How can it
       | compile itself if it doesn't exist?
        
         | Koshkin wrote:
         | They do go into some detail of the steps involved. Basically,
         | it seems as though the system unravels itself, going from
         | simple things to more complex.
        
           | binarymax wrote:
           | Thanks, but perhaps I should have been more clear in my
           | question. How can something self-compile if it doesn't have a
           | compiler to start with? Does the onramp source contain some
           | machine code that is a compiler already?
        
             | Koshkin wrote:
             | Indeed it does! It all starts with a hexadecimal code which
             | is converted by a tool into the machine code for some
             | simple VM; there seem to be a few more steps, where one
             | thing is used to build another, etc. It is in this sense
             | that this particular system is said to be able to "compile
             | itself."
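
The first step described here, turning hexadecimal text into raw bytes, is
small enough to sketch in a few lines. This is a minimal sketch, not the
project's own tool: the function name is made up for illustration, and
treating '#' as a comment marker is an assumption about the input format.

```python
def hex_to_bytes(text: str) -> bytes:
    """Convert commented hexadecimal text into raw bytes."""
    out = bytearray()
    for line in text.splitlines():
        # Assumption: '#' starts a comment that runs to the end of the line.
        code = line.split("#", 1)[0]
        # Drop all whitespace so "7F 45" and "7F45" are treated the same.
        digits = "".join(code.split())
        if len(digits) % 2 != 0:
            raise ValueError(f"odd number of hex digits in line: {line!r}")
        # bytes.fromhex raises ValueError on any non-hex character.
        out.extend(bytes.fromhex(digits))
    return bytes(out)
```

For example, `hex_to_bytes("7F 45 4C 46  # ELF magic\n")` yields the four
bytes `b"\x7fELF"`. Because the tool is this simple, it can be rewritten by
hand in whatever language happens to be available, which is the point of
starting the chain from plain hex.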
        
       ___________________________________________________________________
       (page generated 2024-12-31 23:01 UTC)