https://www.humprog.org/~stephen/blog/2024/09/05/

Rambles around computer science

Diverting trains of thought, wasting precious time

Thu, 05 Sep 2024

A tiny self-remaking C program

In my last post I floated the idea of rewriting a slow, edit-prone
shell script as a self-rebuilding C program.

Just for fun, here is the briefest way I've found to make a one-file
C program self-rebuilding. Note that it only works given an env that
implements the -S option. That includes the env in GNU Coreutils 8.30
or above, and (so I'm told) FreeBSD's. So regard it as a total hack
not for serious use! Of course for a more portable version you can
use #!/bin/sh and the self-extracting shell script trick, at a cost
of embedding more lines of shell script.

#!/usr/bin/env -S sh -c '[ "$0".bin -nt "$0" ] || tail -n+2 "$0" | cc -x c - -o "$0".bin || exit 1; exec "$0".bin "$@"'
#include <stdio.h>

int main(void)
{
    printf("Hello, world!\n");
    return 0;
}

What's interesting to me is that this models what I've long been
thinking of as the right conceptual basis for build systems: "build
is the first stage of execution". Even the nicest approaches to build
that I've seen, such as the "a la carte" paper, still see the
exercise as one of building a collection of strings that are to be
executed as opaque commands. But commands themselves are just
computations; they have semantics (e.g. some are pure functions, so
their outputs are cacheable). We should be thinking of build as a
computation, just as execution is a computation. Why not all part of
the same computation?

Obviously, we want to cache intermediate results--as a way to start
execution faster. (This comes at the cost of being specialised to our
prior inputs: whatever stuff we ingested during build.) The first
line above is a bit like a specialised version of "make" just for
this program, including the incrementality (only rebuild if
timestamps dictate). For a larger program with a "proper" build
system, it's not clear what we'd do.

Rebuilding freshly modified source and then running it straight away
is also something you want only on a development machine, not in
"proper" deployment--unless your build system incorporates enough
checks to establish the assurances you want. Should tests be
integrated into build? The recent xz backdoor is an argument
against. But conversely, one could argue that any built binary should
be subjected to the greatest available scrutiny before it is run. The
problem in xz's case was that the scrutiny was itself complex enough
to hide the insertion of a backdoor; maybe we just need better
scrutiny of the scrutiny? I have no strong opinions about that as
yet.
