[HN Gopher] Show HN: Kent Dybvig's Scheme Machine in 400 Lines o...
       ___________________________________________________________________
        
       Show HN: Kent Dybvig's Scheme Machine in 400 Lines of C (Heap-
       Memory Model)
        
       Author : swatson741
       Score  : 174 points
       Date   : 2025-10-06 14:06 UTC (8 hours ago)
        
 (HTM) web link (gist.github.com)
 (TXT) w3m dump (gist.github.com)
        
       | lambdaone wrote:
       | Amazing, and you could see how this could be translated into
       | perhaps a couple of thousand lines of assembly code to boostrap
       | Scheme almost from the bare metal, similar to the early IBM 709
       | LISP.
       | 
       | A thought: I wonder if an LLM would be up to the job of writing
       | the assembly code from this?
        
         | nostrademons wrote:
         | You could just use a C compiler to write the assembly code from
         | this, and it'd be far less buggy.
         | 
         | Before there were LLMs, there were about 65 years of other
         | program-writing-programs to save labor.
        
           | f1shy wrote:
           | As simple as
           | 
           | gcc -S heap-lisp.c
        
           | kazinator wrote:
           | Generally speaking, program-writing-programs (when they are
           | not either buggy, or being updated to a new version)
           | predictably write the same thing from the same specification
           | according to rules.
           | 
           | LLMs do not replace program-writing-programs; they should be
           | used to work with program-writing-programs, not as a
           | replacement.
           | 
           | E.g. I wouldn't use an LLM to convert a grammar to LR parsing
           | tables, but perhaps to get help with the grammar file.
        
         | listeria wrote:
         | > I wonder if an LLM would be up to the job of writing the
         | assembly code from this?
         | 
         | I could see a compiler doing that.
        
           | lambdaone wrote:
           | I'm quite aware of the existence of compilers, having worked
           | on bootstrapping a production LISP compiler in the past. My
           | point being that this would be an interesting experiment to
           | do this "naively", given how close C is to (for example)
           | PDP-11 assembly code.
        
         | tgv wrote:
         | If it's speed you're after, much more analysis (and thus code)
         | is needed. See V8.
        
         | swatson741 wrote:
         | A C compiler can output fairly readable code if you turn off
         | optimizations, and it's definitely not going to take thousands
         | of lines to do this in modern assembly. It may be only just
         | barely a thousand lines to do this in aarch64, and the LLM can
         | probably do it.
         | 
         | From what I've seen the LLM do it can definitely enhance these
         | programs if you know what to ask, and it can explain how any
         | piece this code works. It may even be able to add garbage
         | collection to the evaluator since the root registers are
         | explicit, and the evaluator only acts on static memory.
        
         | dannyobrien wrote:
         | This is the path that the GNU Mes and Guix folks are taking:
         | https://guix.gnu.org/en/blog/2023/the-full-source-bootstrap-...
         | 
         | (sans LLMs -- I believe they have a Scheme (GNU Mes) that can
         | be compiled from their 357 byte bootloader, and that Scheme can
         | run a C compiler that can compile tinycc, and I think there's a
         | path from tinycc to compiling gcc. I'm not sure how far it can
         | get after that -- this blog post[1] implies that you still need
         | a binary of guile to run Gash, which is a POSIX shell written
         | in Scheme. I'm guessing the plan is to either simplify Gash or
         | complexify Mes to be able to remove Guile as a dependency.
         | 
         | [1] https://guix.gnu.org/en/blog/2023/the-full-source-
         | bootstrap-...
        
         | anthomtb wrote:
         | > I wonder if an LLM would be up to the job of writing the
         | assembly code from this
         | 
         | Why ask a cluster of GPU's to do something any single CPU from
         | the last 30 years can accomplish?
        
           | CobrastanJorji wrote:
           | I ask this question every day, and I think the answer is the
           | same as the answer to the question "why are you doing this
           | with blockchain?"
        
       | glouwbug wrote:
       | I believe MIT Scheme compiled directly to C by intermingling
       | switch statements and gotos to emulate its call stack. Problem
       | was, as programs grew, compile times weren't linear. I gave it a
       | shot once, it was a somewhat spiritual experience:
       | 
       | https://github.com/glouw/switch
        
         | kkylin wrote:
         | Thanks! Do you mean MIT Scheme's C backend? I've used MIT
         | Scheme on and off for a long time and have never touched the C
         | backend & have no idea how it works, so this is interesting.
         | 
         | (MIT Scheme also has a native code compiler for Intel CPUs,
         | which seems to be what most users of MIT Scheme (an admittedly
         | small community) actually use.)
        
           | glouwbug wrote:
           | If you'd like to define the backend like this, it's the
           | easiest way to compile to C; we're just repurposing C as a
           | general-purpose assembler:
           | https://glouw.com/2023/11/07/Switch.html
           | 
           | I believe MIT-scheme took it a step further with gnu
           | extensions which allowed you to take the address of a label
           | like &&label, which allowed for function pointers:
           | https://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html
        
         | f1shy wrote:
         | I think a very rudimentary implementation (which is easy to
         | understand) is in the SICP book
        
           | soegaard wrote:
           | Even MIT Scheme was used with SICP, the Scheme implementation
           | in the book is different from MIT Scheme.
           | 
           | MIT Scheme was for a long period one of the leading Scheme
           | implementations. It lacks support for Apple Silicon, so it is
           | not as popular now, as it once was.
           | 
           | https://www.gnu.org/software/mit-scheme/
        
             | f1shy wrote:
             | Yes. In Apple the memory can be writable XOR executable.
             | MIT scheme needs to write on pages that will execute, so it
             | will trigger the MMU..
        
         | shawn_w wrote:
         | I don't see anything in the manual about MIT scheme compiling
         | to C, just to its own native code object files and some
         | portable bytecode...
         | 
         | https://www.gnu.org/software/mit-scheme/documentation/stable...
        
       | fanf2 wrote:
       | Instead of representing atoms as string literals, you can
       | represent them as global variables, eg                   const
       | char conti[] = "conti";
       | 
       | Then you can use pointer comparison instead of strcmp().
        
         | swatson741 wrote:
         | You'll still probably need the `strcmp` because the pointers
         | won't be the same unless you check for them and make them the
         | same.
         | 
         | You may be thinking about how `eq?` (reference equality) works
         | in scheme. That's usually done by hashing the identifier
         | string. Which is the more general solution to this equality
         | problem.
        
           | fanf2 wrote:
           | The atoms strcmp()ed by the interpreter are all created by
           | the compiler so you can ensure the pointers are equal by
           | construction.
        
             | swatson741 wrote:
             | You're right `virtmach` only works on things that are
             | output from `compile` and maintaining the invariant that
             | virtmach lisp uses those pointers isn't difficult to do in
             | with how the evaluator is presented.
             | 
             | It gives virtmach lisp and scheme different ontology, but I
             | can't think of any practical reason why that would matter
             | other than it makes things a little bit more complicated.
             | But, then again if I'm thinking practically scheme should
             | be using hashed identifiers, and then there's no reason for
             | them to have different ontology and conceptually we're
             | right back where we started with virtmach lisp and scheme
             | using identifiers as objects.
        
       | ashton314 wrote:
       | Kent Dybvig also wrote Chez Scheme [1] which on most benchmarks
       | is far-and-away the fastest [2] Scheme implementation.
       | 
       | Racket recently got rewritten to be based off of Chez [3] and I
       | have it from the maintainer of Racket himself that it's paid off
       | well: better performance and a smaller, more maintainable
       | codebase.
       | 
       | 1: https://github.com/cisco/ChezScheme (also, pronounced "Shay
       | Scheme")
       | 
       | 2: https://ecraven.github.io/r7rs-benchmarks/
       | 
       | 3:
       | https://m.youtube.com/watch?v=s3Q3M2wZ7rI&pp=0gcJCRsBo7VqN5t...
        
         | qhwudbebd wrote:
         | Reading and hacking on the Chez Scheme codebase is always a
         | treat and rather inspiring, especially compared with more
         | mainstream compilers and code generators. As well as Kent
         | Dybvig, Andy Keep's contribution (nanopass) is super-
         | impressive. The whole thing is so cleanly designed and
         | beautifully expressed.
        
       | saltcured wrote:
       | Since these seems to be turning into an opportunity to shout out
       | for other Scheme implementations...
       | 
       | I was a happy use of VSCM by Matthias Blume, doing a bunch of my
       | university assignments with it on my Linux PC in the early 90s.
        
       | kfarr wrote:
       | Shout out to Mr. Dybvig who taught our intro to CS course at IU
       | Bloomington 20+ years ago! It was my first intro to recursion (of
       | course the class was all scheme based) and gave me much more
       | confidence in software development coming from a self taught
       | background.
        
         | smrq wrote:
         | I had the great pleasure of studying compilers, and then
         | optimizing compilers, with Dybvig at IU in the late 2000s. He
         | was a phenomenal professor; obviously deeply knowledgeable
         | (that's a given), but also uncommonly gifted at imparting that
         | knowledge.
        
       | miga wrote:
       | How much faster or slower than 400 lines compiler?
       | https://news.ycombinator.com/item?id=26191243
        
       | djmips wrote:
       | Have you tested this? Does it work?
        
       | iberator wrote:
       | Do Forth bootstrapping now
        
       | rogerallen wrote:
       | Can someone give more info on the relevance of this particular
       | code?
        
         | apgwoz wrote:
         | This is an implementation of an interpreter from his
         | dissertation.
         | https://www.cs.unc.edu/xcms/wpfiles/dissertations/dybvig.pdf
        
       | dented42 wrote:
       | What an utterly delightful itty bitty scheme. <3
        
       ___________________________________________________________________
       (page generated 2025-10-06 23:00 UTC)