https://amodernist.com/texts/fun-c.html
Webpresence of Philip Kaludercic Start | Index | About | [?] Feed |
Contact
Consistently Making Wrong Decisions Whilst Writing Recreational C
Monday, 17 June 2024
For the past few days I have had a lot of fun polishing a small
program written in C. I call it "Trip" and the idea is to
automatically intercept specific Libc functions and simulate their
failure.
The background for writing a tool like this, is that I am a TA for a
systems-programming course at my university. All assignments are to
be written in C, and we place emphasis on robustness and portability.
My intention is to make it easier to provoke specific errors that
would usually only occur when the operating system refuses or is
incapable of providing the (finite) resources it virtualises (like
memory, time and access to peripheral devices).
One example, that is interesting when discussion the example of how a
Unix/CLI shell is implemented, is what happens when the fork(2)
system call fails -- and what the appropriate way to handle this error
is. Out of habit, a lot of people would just write
pid_t pid = fork();
if (pid == -1) {
perror("fork");
exit(EXIT_FAILURE);
}
/* ... */
But running into this case is tricky, as fork usually only fails when
the number of (dead or alive) processes on a system is too high,
which is a generally uncomfortable situation.
So the problem I am interested in is a comfortable way to
intentionally but safely provoke fork to fail.
One could use prlimit -u to limit the maximal number of processes or
GDB's return command to modify the process state and return the
"right" value indicating an error had occurred. While feasible, it is
not what I wanted.
Instead, what I came up with looks like this:
icterid$ trip fork:0.5 bash # after this point we are in a new shell:
icterid$ date
bash: fork: Cannot allocate memory
icterid$ date
Sun Jun 16 03:59:21 PM CEST 2024
icterid$ date
bash: fork: retry: Resource temporarily unavailable
bash: fork: retry: Resource temporarily unavailable
Sun Jun 16 03:59:28 PM CEST 2024
This demonstrates the behaviour of Bash, when on average fork(2) were
to fail 50% of the time. As you can see, it might set errno to
ENOMEM, in which case Bash gives up immediately. Sometimes it works,
and if errno is set to EAGAIN then it tries to handle the request a
few times, and in this case eventually succeeding. In the last case,
it waits for a bit between attempts, which makes sense as the number
of processes on a system is time-dependent and will hopefully
decrease.
Additionally, we could also list multiple functions that could fail,
give these different chances of failure and fix specific errno values
that we want to provoke.
I think that this is a simple and neat tool that can help students
think about what robust error handling entails.
The horribly pretty and the pretty horrible
My intention behind writing this text isn't to advertise "Trip". The
core functionality isn't what is fun to work on. What I want to
highlight, are some of the inane decisions I insisted on following
through, that led me to work on the program for over three years (on
and off) before getting it into a functional state.
The core mechanism
As mentioned above, one idea would be to take a debugger-like
approach and use something like ptrace to control and manipulate a
program.
But I did not do that, instead I use the LD_PRELOAD hack. By setting
this environmental variable, we can instruct the dynamic linker/
loader (ld.so) to look up the definition of certain symbols in a
shared object of our own, before it checks the usual places, such as
the actual standard library (libc.so).
The reasonable approach would be to have one executable and one
shared object, and have the executable set LD_PRELOAD to point to the
shared object before executing the intended program (e.g. bash in the
above example).
But no, then I'd have to figure out where to find the shared object
and the user would have to install two files! Just imagine all the
wasted disk space of having a multi kilobyte ELF headers lying
around. Entirely unreasonable!
Instead, I want one file, which is both an executable file, and a
shared object (seeing as both are just ELF files with different
"Types", DYN vs EXEC). An example of this already existing in some
form is the aforementioned shared object for the C standard library:
icterid$ /usr/lib/x86_64-linux-gnu/libc.so.6
GNU C Library (Debian GLIBC 2.36-9+deb12u7) stable release version 2.36.
Copyright (C) 2022 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
Compiled by GNU CC version 12.2.0.
libc ABIs: UNIQUE IFUNC ABSOLUTE
Minimum supported kernel: 3.2.0
For bug reporting instructions, please see:
.
Glibc does this by setting a custom entry point.
Instead Trip compiles a regular position independent executable,
which should be sufficient for ld.so to preload, but it is not. If
you try that, you get the error message
$ LD_PRELOAD=.../preload-this ./execute-this
ERROR: ld.so: object '.../preload-this'' from LD_PRELOAD cannot be preloaded (cannot dynamically load position-independent executable): ignored.
this is the output ouf ./execeute-this.
...
which is not what I want. Luckly, due to a helpful program by Yubo
Xie, I found a trick to circumvent this draconian ordinance. And it
involves editing the ELF file generated by GCC with the intent of
flipping a single bit (DF_1_PIE). With this change, ld.so doesn't
object, while the file remains executable. So that means I can
preload the executable being executed as it executes some other
program. This brings me to the next step,
Preloading onself
This is not that interesting, but worth mentioning nevertheless. The
issue is that LD_PRELOAD requires an absolute path to the shared
object file it should use. But using argv[0] I can only determine
what the string was that invoked Trip.
Seeing as LD_PRELOAD is not portable^1, we can use a Linux-specific
hack to determine what the file of the program currently being
executed. This involves the /proc filesystem, specifically the /proc/
self directory.
As you might know, each Process with some PID has a /proc/[PID]/
directory with process metadata, such as the command line arguments,
the environmental variables, file descriptors, and so on exposed
using the file system interface. The /proc/self/ is a shorthand for
the /proc/[PID] directory of the current process. And in this
directory, there is a symbolic link called exe, that points to the
file we are looking for.
So the absolute path is just a readlink(2) away, that we can use to
construct the environmental assignment that we realise using execvpe
(2), a GNU extension of execvp that allows passing a list of "VAR=
VALUE" strings to specify the environment right after executing the
new file (we use getopt(3) to parse command line options, hence the
optint):
execvpe(argv[optind], argv + optind, (char*[]) {preload, conf, NULL});
What's conf? What we did not discuss up until now, is how we preserve
state between Trip "the executable" and Trip "the library". That is
were we use conf, a special environmental variable encoding the
options specified when invoking Trip.
____TRIP_CONFIGURATION
We are too deep into this rabbit-hole to give up now. There is no
going back. Once again, there will be an obvious and necessary
decision, though it might offend a few of the bluenoses amongst us.
As a reminder, we invoke Trip with a list of functions, optionally
specifying a chance it should "trip" (by default P(X)=1P(X) = 1) and
optionally fixing a specific errno value to set (otherwise this will
also decided on randomly from the list of known errno values for the
function, more on that below).
E.g.,
mkdir:0.25:ELOOP
says that mkdir(2) will fail 25% of the time, but when it does it
will set errno to ELOOP ("Too many levels of symbolic links").
Environmental variables give us a C string to encode the value. So no
NUL-bytes! This means we cannot just encode arbitrary binary data,
such as a struct containing a floating point value, that might
include a NUL-byte on char level.
Instead, we will to pass a list of triples, respectively specifying,
1. The function to trip
2. A chance encoded by a floating-point value serialised in
hexadecimal (%a), to use a radix with a higher information
density - if we disregard the 0x prefix...
3. An errno value, also in hexadecimal, or 0 to indicate an absence.
How do we encode the triple? By using ASCII group separators of
course. ascii(7) tells us these 35 in octal encodes these. And the
list? Isn't it obvious... By using ASCII record separators,
naturellement (36 in octal). As these aren't pretty to write, we
define two macros,
#define GS "\035" /* group separator */
#define RS "\036" /* record separator */
where the octal encoding of arbitrary charachters comes to use (a lot
of people are just familiar with the one special case, '\0' to encode
the NUL-byte, which is next to chmod probably one of the most common
places where octal notation is used - be it that it coincides with
most other bases).
To construct the value itself, I'd like to use printf(3)-style format
strings, to make parsing the value easier, but now much memory will I
need? Just any arbitrary constant won't do it! So I'll allocate it
dynamically, but here again, a mere malloc is so pedestrian. Instead
using open_memstream(3) we write into a FILE* that will construct a
string of sufficient size as needed.
Now we can write,
fprintf(h, "%s" GS "%a" GS "%x" RS,
entries[i].name, entries[i].chance,
entries[i].error)
which also makes use of C's string concatenation of literals, as GS
and RS are just strings that are substituted by the preprocessor.
Now we have encoded the intended state of the library. As discussed,
we set this while execing and later on will decode it when the
library is initialised. But how does that happen? Some brief comments
on the topic follow below.
Break on through to the other side of the moon
As we are relying on LD_PRELOAD, all the functions that we intend to
support, will have a little stump which either will invoke the
function in the usual fashion or simulate a failure. To decide which
of the two we should do, Trip exposes a little function called
____trip_should_fail (the _s indicate that it is very internal).
Basically it checks the configuration and rolls a dice if necessary.
But to do this, we first have to parse the configuration. To this
end, begin ____trip_should_fail by calling the sensibly named
function called init.
But wait, seeing the invocation of this function depends on the
execution of the tripped program, which might involve multiple
threads or otherwise concurrent execution, what happens when two
functions wish to check a yet-uninitialised ____trip_should_fail
contemporaneously? Why, that could mean that the internal state might
be initialised twice over, leaving us in an inconsistent disarray.
We might use a kind of "lock", e.g. pthread_mutex but consider the
costs of depending on yet another library, not to speak of the
enormous overhead that passive locking effectuates!
No, we need something more [fancy] appropriate. Luckly, C11 has just
the thing: atomic_flag! By statically allocating such a flag inside
init,
static atomic_flag done = ATOMIC_FLAG_INIT;
we can use atomic_flag_test_and_set to ensure that only one execution
will set the flag. The others we keep busy using a naive spin-lock,
we release as soon as the "chosen one" (whoever first completed
atomic_flag_test_and_set) parses the state and unlocks the spin-lock.
A nice of example of non-blocking synchronisation. I just think there
is something neat about it.
As an aside, parsing errno
One thing I omitted to mention above is how we convert a symbol
representation of an errno value, such as ELOOP, EAGAIN, ENOENT, ...
into a hexadecimal number. The last part is easy, as soon as I have
the numerical representation we use the %x format string (we have
seen that part already). The more interesting part is how the rest.
Here, some might argue that I could just define an associative array,
of the form
#define ENTRY(e) { .name = #e, .val = e },
struct errno_names {
char *name;
int val;
} names[] = {
ENTRY(ELOOP),
ENTRY(EAGAIN),
ENTRY(ENOENT),
/* ... */
}
but where's the fun in that? Instead, how about this? Each time we
want to look up what ELOOP does, we manually invoke cpp (the
standalone preprocessor) and just write the value we wish to expand
into its standard input? And what better tool for that than popen(3),
fopen(3)'s lesser known relative (with alleged ties to system(3))?
But the issue is, that with popen(3), we can only read or write,
quote the Linux manpage,
Since a pipe is by definition unidirectional, the type argument
may specify only reading or writing, not both; the resulting
stream is correspondingly read-only or write-only.
Pity, the unimaginative and ill-willed might say. No, I retort, there
is a way! Remember, gentle reader that popen takes a shell command.
We can easily include a pipe in there, and with a little echo inject
the input into the command being executed. So in effect, to resolve
ELOOP, we execute
echo "ELOOP" | cpp -include errno.h
and then read the output, until we encounter a lone number, that is
the value we were looking for! We just have to skip over some noise
to get to this point (try executing the above yourself, and be
surprise to see what you see).
There remains a nibble to quibble! Where do you get the memory for
that command, which you pass to popen(2), the irksomely inquisitive
inquire inappropriately. Why, we just allocate it on the stack! After
all, if you invoke snprintf the right way, it will tell you how much
space a format string will require. We reserve the requisive space
with a simple VLA, and bada-bing bada-bum, done.
But do we really want to invoke sprintf ourselves twice? Here's the
shtick: We don't. Instead, we put on our intermediate macrology hats
and summon this construct:
FILE *cpp;
$sprintf(cmd, "echo \"%s\" | cpp -include errno.h", name) {
cpp = popen(cmd, "r");
}
Here, $sprintf is a macro (we needlessly make use of the fact that
dollar-signs are (apparently^2) valid constituents of identifiers),
which will format some data into a local variable called cmd, visible
only in the subsequent block. How done this? Using the old trick,
that expands the above to a for-loop that executes exactly once. In
effect, it would look something like this:
FILE *cpp;
for (char cmd[snprintf(NULL, 0, "...", name) + 1], c = 1;
c == 1 && (snprintf(buf, sizeof buf, "...", name), 1);
c = 0) {
cpp = popen(cmd, "r");
}
Note that the second sprintf is executed only once, due to the
short-circuiting of && and that we disregard the return value of
sprintf even though tisn't necessary.
In Reality, we add some __COUNTER__ magic on top to avoid variable
capture (a problem which we do not encounter), which requires us to
glue together new symbols. Business as usual for the hat-wearers.
One more time
We were adroitly reticent on a crucial matter, up until now. But
behold, as all cards are now exposed, for all to see, for all detest
and abhore!
I never really thought about it, but there is no easy,
machine-readable way to figure out what errno values a function might
set. It is not encoded in the type system (if we might even speak of
such a thing), and parsing it out of the man pages is not something
I'd like to imagine doing reliably. So I must satisfy myself by
manually writing these facts down. And this turns out to be the
bottleneck of the entire operation.
We write these down, in a custom file format. In effect, these just
list blocks of this form:
errno ENOMEM
fail NULL
name strdup
params const char *s
return char *
separated by empty lines. How does this help us? Easy, we use a
little bit of AWK to convert these into C files. But do I really want
to write C code in AWK? Perhaps, but not today (or whenever I wrote
this initially). So instead, I define a little macro in C and convert
the block into an expression that the macro will use. That will look
like this:
DEF(char *, strdup, (const char *s), ( s), (NULL), ENOMEM)
When compiling the definition, this will expand to the kind of
function we described above: This will be our definition of strdup,
which LD_PRELOAD injects into the real executable and calls
____trip_should_fail from!
And while we are at it, the most devilishly delightful stratagem
remains: We need a table of all functions inside of the trip module
itself, among other things to implement the -l flag that lists all
supported functions. And while we could create a temporary file with
the requisite data, and #include that as usual, why bother our
destitute disk and not just inject what we need right into the array?
/* list of known commands */
#define DEF(ret, name, params, args, fail, ...) { #name, { __VA_ARGS__ }},
static struct entry_name {
char *name;
int errs[256/sizeof(int)-sizeof(char*)]; /* adjust if necessary */
} names[] = {
#include "/dev/stdin"
};
#undef DEF
After a moment, I am certain everyone will realise that including
from /dev/sdtdin means that we are reading from the standard input of
the compiler herself! But what will the input be? Here, as we are
knee-deep into GNU-isms we use a target-specific variable assignment,
that will take effect only when compiling trip.c to trip.o:
trip.o: CC := grep -h '^DEF' $(GENSRC) | sort -k2,2d -t, | $(CC) \
-DCOMPILER="\"$(shell $(CC) --version | sed 1q)\""
Where GENSRC are the list of files containing DEF forms. Yes, we grep
through all of these, sort by the function name (as we of course wish
to use bsearch(3) later on when looking up an entry) and then pipe
this into the C compiler^3. As prior to the #include statement, we
have re-defined DEF, we will discard the uninteresting arguments and
only use what we need to construct the table in a single go.
The best part about this is watching the output of make:
$ time make -j8
awk -f gen.awk db/stdio.db > db/stdio.c
awk -f gen.awk db/stdlib.db > db/stdlib.c
awk -f gen.awk db/string.db > db/string.c
awk -f gen.awk db/sys-socket.db > db/sys-socket.c
awk -f gen.awk db/unistd.db > db/unistd.c
cc -std=c11 -Wall -Wextra -Wformat=2 -Wuninitialized -Warray-bounds -Os -pipe -fsanitize=undefined,nonnull-attribute -fhardened -fanalyzer -ldl -fanalyzer -fsanitize=undefined,nonnull-attribute -fhardened fix-pie.c -o fix-pie
cc -std=c11 -Wall -Wextra -Wformat=2 -Wuninitialized -Warray-bounds -Os -pipe -fsanitize=undefined,nonnull-attribute -fhardened -fanalyzer -c -o db/string.o db/string.c
cc -std=c11 -Wall -Wextra -Wformat=2 -Wuninitialized -Warray-bounds -Os -pipe -fsanitize=undefined,nonnull-attribute -fhardened -fanalyzer -c -o db/stdio.o db/stdio.c
cc -std=c11 -Wall -Wextra -Wformat=2 -Wuninitialized -Warray-bounds -Os -pipe -fsanitize=undefined,nonnull-attribute -fhardened -fanalyzer -c -o db/stdlib.o db/stdlib.c
cc -std=c11 -Wall -Wextra -Wformat=2 -Wuninitialized -Warray-bounds -Os -pipe -fsanitize=undefined,nonnull-attribute -fhardened -fanalyzer -c -o db/sys-socket.o db/sys-socket.c
cc -std=c11 -Wall -Wextra -Wformat=2 -Wuninitialized -Warray-bounds -Os -pipe -fsanitize=undefined,nonnull-attribute -fhardened -fanalyzer -c -o db/unistd.o db/unistd.c
grep -h '^DEF' db/stdio.c db/stdlib.c db/string.c db/sys-socket.c db/unistd.c | sort -k2,2d -t, | cc -DCOMPILER="\"cc (GCC) 14.1.1 20240522 (Red Hat 14.1.1-4)\"" -std=c11 -Wall -Wextra -Wformat=2 -Wuninitialized -Warray-bounds -Os -pipe -fsanitize=undefined,nonnull-attribute -fhardened -fanalyzer -c -o trip.o trip.c
cc -o trip db/stdio.o db/stdlib.o db/string.o db/sys-socket.o db/unistd.o trip.o -ldl -fanalyzer -fsanitize=undefined,nonnull-attribute -fhardened
./fix-pie trip
real 0m1.177s
user 0m1.504s
sys 0m0.170s
Another fun thing is that this breaks LSP servers...
And there we have it. These were the most fun highlights of my little
hobby program. There are a few more points of interest, such as
function-internal typedefs, a custom assert implementation, and a
lagged Fibonacci random number generator taken from TAOCP, just to
quote Knuth in a comment, the persistent usage of dprintf and
user-dependent installation destinations, but those interested in
that can take a look on their own. I'm done explaining my code - as
the saying goes, if it was hard to write, it should be hard to read!
---------------------------------------------------------------------
Reflecting on the program, and having reminded myself of the internal
structure over the last few days, I do think that "polishing" is the
right word to describe what I enjoy. Compared to CS homework
assignments, competitive programming or the hacks that keep my
servers alive, the changes I make to this program are increasingly
insignificant and the commit messages are progressively elaborate. I
get to pin-point my focus on as little of a fragment of the code as I
wish, and pointlessly rewrite it to no semantic effect. While at it,
I get to try out ideas I had but no opportunity to apply and to skim
manuals and standards for inspiration on what I could do.
It is obvious even to me that this isn't everyone's idea of a fun
weekend, but for me there is something to it -- and I don't just mean
the unvarying grin on my face as I explain to a visibly disturbed
person how Trip works -- especially if I know that I actually should
be doing other stuff.
---------------------------------------------------------------------
1. E.g. Linux takes a space-delimited list of files, while OpenBSD
takes a colon-delimited list, and macOS/Darwin/XNU has an
entirely different variable and file format-[?]
2. Edit: Turns out that this is not true, GCC accepts $ but ISO C
doesn't require this to be the case.-[?]
3. The macro definition COMPILER makes use of the surprising fact
that most compilers, even TCC support a long --version flag. We
use this, instead of GCC's __VERSION__, as it appears to be more
informative-[?]