Hacker News on Gopher (unofficial)
COMMENT PAGE FOR:
(HTM) Booting Linux in QEMU and Writing PID 1 in Go to Illustrate Kernel as Program
pa7ch wrote 11 hours 6 min ago:
Gokrazy is a minimal linux distro that just boots into a go init
program. You can run it on a Raspberry Pi or a PC. It has a little init
system that takes the paths you would normally pass to `go run`, runs
them, and restarts them as needed. It's been a joy for me to play around
with. It has A/B updates as well.
(HTM) [1]: https://gokrazy.org/
bradfitz wrote 12 hours 13 min ago:
Related, I gave a 6 minute lightning talk about writing tests in Go
that use the test binary itself as the PID 1 under an emulated Linux in
QEMU: [1]
(HTM) [1]: https://docs.google.com/presentation/d/1rAAyOTCsB8GLbMgI0CAbn6...
(HTM) [2]: https://www.youtube.com/watch?v=69Zy77O-BUM
CupricTea wrote 15 hours 7 min ago:
I got close to this realization after learning barely enough U-Boot to
launch my own bare metal program for the JH7110. I could never get into
Linux From Scratch because it was more focused on getting an entire
system working when I really just wanted to see how it spins up to get
going.
Then at some point the other week I realized I could technically have a
working Linux "system" with nothing more than a kernel and a dirt
simple hello world program in /sbin/init.
I haven't had the time or inclination to scratch that itch but it's
nice to see this article confirm it.
checker659 wrote 14 hours 40 min ago:
Pass init=/bin/sh or what have you in GRUB cmdline
opello wrote 10 hours 10 min ago:
I'm sure it's useful elsewhere, but I have used this for years to
debug embedded Linux environments, it's such a handy tool.
tosti wrote 12 hours 27 min ago:
Traditionally,
init=/etc/rc
And have that be a shell script which starts whatever you need.
You'll probably want fsck in there, mount -a, some syslogd, perhaps
dbus, some dhcp client, whatever else you need, and finally the
getty which is probably a good idea to respawn after it exits.
That's usually the job of init so you could well end your rc with
exec /sbin/init
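A minimal sketch of the respawn idea in Go, for readers following the article's Go examples. The `respawn` helper, the run limit, and the fake getty are assumptions made for illustration; a real rc would first run fsck, mount -a and the rest, and would loop forever.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// respawn calls start repeatedly, the way an init respawns a getty.
// It stops after maxRuns invocations so this sketch terminates; a real
// init would loop forever. It returns how many times start was run.
func respawn(start func() error, maxRuns int, backoff time.Duration) int {
	runs := 0
	for runs < maxRuns {
		runs++
		if err := start(); err != nil {
			fmt.Printf("run %d exited with error: %v\n", runs, err)
		} else {
			fmt.Printf("run %d exited cleanly\n", runs)
		}
		// Pause so an instantly failing service does not spin the CPU.
		time.Sleep(backoff)
	}
	return runs
}

func main() {
	// Hypothetical stand-in for running a real getty with os/exec.
	fakeGetty := func() error { return errors.New("session ended") }
	respawn(fakeGetty, 3, 10*time.Millisecond)
}
```

Passing the service as a function keeps the loop independent of how the process is actually started.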
markhahn wrote 15 hours 28 min ago:
isn't this obvious?
maybe the audience is people who've never heard of init or thought
about kernel vs userspace.
teraflop wrote 16 hours 5 min ago:
Nice article! One point of clarification:
> When the kernel starts it does not have all of the parts loaded that
are needed to access the disks in the computer, so it needs a
filesystem loaded into the memory called initramfs (Initial RAM
filesystem).
The kernel might not have all the parts needed to mount the filesystem,
especially on a modern Linux distro that supports a wide variety of
hardware.
Initramfs exists so that parts of the boot logic can be handled in
userspace. Part of this includes deciding which device drivers to load
as kernel modules, using something like udev.
Another part is deciding which root filesystem to mount. The root FS
might be on an LVM volume that needs to be configured with
device-mapper, or unlocked with dm-crypt. Or it might be mounted over a
network, which in turn requires IP configuration and authentication.
You don't want the kernel to have those mechanisms hard-coded, so
initramfs allows handling them in userspace.
But strictly speaking, you don't need any of that for a minimal system.
You can boot without initramfs at all, as long as no special userspace
setup is required. i.e., the root FS is a plain old disk partition
specified on the kernel command line, and the correct drivers (e.g. for
a SCSI/SATA hard drive) are already linked into the kernel.
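To make "specified on the kernel command line" concrete, here is a small Go sketch that reads `/proc/cmdline` and picks out `root=` and `init=`. The `parseCmdline` helper is hypothetical and deliberately simplified: it does not handle quoted values that contain spaces.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// parseCmdline splits a kernel command line into key/value pairs.
// Flags without "=" (for example "ro" or "quiet") map to an empty string.
// Simplification: quoted values containing spaces are not handled.
func parseCmdline(cmdline string) map[string]string {
	params := make(map[string]string)
	for _, field := range strings.Fields(cmdline) {
		// Cut at the first "=" only, so root=UUID=abcd keeps its value intact.
		key, value, _ := strings.Cut(field, "=")
		params[key] = value
	}
	return params
}

func main() {
	raw, err := os.ReadFile("/proc/cmdline")
	if err != nil {
		fmt.Println("cannot read /proc/cmdline:", err)
		return
	}
	params := parseCmdline(string(raw))
	fmt.Println("root =", params["root"])
	fmt.Println("init =", params["init"])
}
```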
ktpsns wrote 8 hours 30 min ago:
When I used Gentoo, where you typically configure and compile the kernel
yourself, I never used initramfs.
This was 20yrs ago. Gentoo was really a great teacher.
spwa4 wrote 8 hours 11 min ago:
Problem with that was that you'd run literally every module
initialization and occasionally there were some that crashed the
kernel.
piperswe wrote 6 hours 32 min ago:
Only if you compiled your kernel with literally every module. If
you compile your kernel with only the modules your system needs,
there's no such issue.
seanw444 wrote 12 hours 29 min ago:
I've used Linux for quite some time, and had always kinda wondered
what purpose initramfs served, since I have to rebuild it so often.
Thanks.
tosti wrote 11 hours 0 min ago:
Linux includes a cpio utility and documentation for building your
own initramfs.
tosti wrote 12 hours 39 min ago:
This. Only CPU microcode can't be loaded without an initramfs unless
you enable late loading, but that's labeled dangerous because it may
cause instability. If needed, you could let the built-in motherboard
uefi do the microcode updates instead.
maxboone wrote 16 hours 17 min ago:
Another cool way to show that 'the Linux kernel as "just a program"' is
that you can also run the kernel as a regular binary without needing
QEMU to emulate a full system:
(HTM) [1]: https://www.kernel.org/doc/html/v5.9/virt/uml/user_mode_linux....
maccard wrote 16 hours 24 min ago:
Stupid question, but what does the default init program do? If I have a
single application (say a game), can I just set up the file system,
statically link my game and bundle it as an iso, rather than say
containerising it?
Purely academic.
markhahn wrote 15 hours 24 min ago:
of course. init is just pid 1. it can be a copy of "Hello, World!"
(suitably linked) or whatever.
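A sketch of what such a "Hello, World!" init could look like in Go. The `greeting` helper is made up for illustration. The one non-obvious detail is that PID 1 must never return, because the kernel panics if init exits.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// greeting builds the message printed at startup.
func greeting(pid int) string {
	return fmt.Sprintf("Hello, World! I am PID %d", pid)
}

func main() {
	pid := os.Getpid()
	fmt.Println(greeting(pid))
	if pid != 1 {
		return // run as a normal program: just print and exit
	}
	// As PID 1 we must never return: if init exits, the kernel panics.
	for {
		time.Sleep(time.Hour)
	}
}
```

Built with `CGO_ENABLED=0 go build -o init .`, the command quoted by the author elsewhere in this thread, the binary is statically linked and can be placed at /sbin/init.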
maxboone wrote 16 hours 13 min ago:
Absolutely, and the init system does not even have to set up the
filesystem and all. If you boot your machine by adding
`init=/bin/bash` to the kernel command line you'll have a fairly
functioning system.
Do anything necessary from there to boot your game, and record those
steps in a script. When that's done you can just point your init
cmdline to that script (doesn't even have to be a binary, a script
with the #!/bin/bash shebang should just work).
Gazoche wrote 16 hours 16 min ago:
In theory yes, though depending on the complexity of your game you
may need to bundle a lot of userspace libraries and other programs
along with your kernel to make it work. Most graphical applications
expect a display server like X11 or Wayland to talk to, at minimum.
maccard wrote 9 hours 35 min ago:
Yeah, that's the hard part (but also the appeal). How minimal can I
go and still have a single-use system. Maybe a holiday project...
Tigike wrote 17 hours 21 min ago:
Wow, what a nice and easily understandable explanation of an
overcomplicated topic. This kind of teaching method is so much needed
in software development.
markhahn wrote 15 hours 25 min ago:
I'm curious why you think it's overcomplicated.
That is: this seemed like the first 3 minutes of the first lecture on
a freshman OS course, or similar in any book on systems. The
complication you refer to - is it just from the clutter of adjacent
words (EFI, grub, kmod maybe?)
mlrtime wrote 1 hour 16 min ago:
Try to read a document/book on the linux boot process and it is
VERY complicated if you actually want to know all the steps from
POST to a tty login. You can strip some of it away by focusing on
one path (UEFI vs BIOS) or by just ignoring the instruction pointer
movements.
I agree, little nuggets like this are valuable even if you know it
already.
LorantToth wrote 17 hours 45 min ago:
Love how simply you explain concepts that are completely foreign to me.
Enjoyed it very much!
zsofia wrote 18 hours 12 min ago:
Nice demo. It's great to see such a clean, beginner-friendly
explanation of kernel vs. init responsibilities.
akpa1 wrote 18 hours 55 min ago:
I love that it's possible to boot a raw Linux kernel this way; I only
learned about it very recently when working on a university project. It
makes me want to fiddle around with it more and really understand the
nuts and bolts of a modern Linux system and work out what actually is
responsible for what and, crucially, when it happens.
jkrejcha wrote 19 hours 4 min ago:
A fun little tidbit, if you don't provide an init to the kernel command
line, it'll try to look for them in a few places in this order:
1. /sbin/init
2. /etc/init
3. /bin/init
4. /bin/sh
It dropping you into a shell is a pretty neat little way to allow
recovery if you somehow really borked your init
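The fallback order above can be modelled in a few lines of Go. This is only an illustration: the `pickInit` and `fileExists` names are made up, and the check is simplified to "does the file exist", whereas the kernel actually tries to execute each candidate in turn.

```go
package main

import (
	"fmt"
	"os"
)

// fallbackInits is the search order quoted in the comment above.
var fallbackInits = []string{"/sbin/init", "/etc/init", "/bin/init", "/bin/sh"}

// pickInit returns the first candidate for which exists reports true.
// Simplification: the kernel tries to exec each path rather than
// checking for existence first.
func pickInit(candidates []string, exists func(string) bool) (string, bool) {
	for _, path := range candidates {
		if exists(path) {
			return path, true
		}
	}
	return "", false
}

// fileExists reports whether path can be stat'ed on this system.
func fileExists(path string) bool {
	_, err := os.Stat(path)
	return err == nil
}

func main() {
	if path, ok := pickInit(fallbackInits, fileExists); ok {
		fmt.Println("would start:", path)
	} else {
		fmt.Println("no candidate init found")
	}
}
```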
wibbily wrote 15 hours 21 min ago:
The kernel even has a special error message for you when it happens:
> Bailing out, you are on your own. Good luck.
(HTM) [1]: https://unix.stackexchange.com/questions/96720
kmm wrote 14 hours 5 min ago:
That's actually a message from the (Arch) initramfs[1], in case it
can't mount the root filesystem or find an init to hand off to.
The kernel has a different error message: "No working init found.
Try passing init= option to kernel."[2]
(HTM) [1]: https://github.com/archlinux/mkinitcpio/blob/2dc9e12814aaf...
(HTM) [2]: https://github.com/torvalds/linux/blob/d358e5254674b70f34c...
gr4vityWall wrote 19 hours 12 min ago:
The writing is really succinct and easy to follow.
One thing that could be improved is that the author could break down
some of the commands, and explain what their arguments mean. For
example:
> mknod rootfs/dev/console c 5 1
Depending on the reader's background, the args 'c', '5', and '1' can
look arbitrary and not mean much. Of course, we can just look those up,
and it doesn't make the article worse.
0xFEE1DEAD wrote 12 hours 42 min ago:
For anyone curious: "c" just means that it's a character device.
There is also "b" for block device (e.g. a disk, a partition, or even
something like a loopback device) and "p" for FIFOs (similar to
mkfifo).
The two numbers are just identifiers to specify the device, so in
case of `5 1` it means the system's console (/dev/console), while `1 8` would
mean "blocking random byte device" (mknod dev/random c 1 8)
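For readers following along in Go, the same call can be sketched with the standard `syscall` package. The `makedev` and `makeCharDevice` helpers are hypothetical names, and `makedev` uses the simple encoding that only holds for small major and minor numbers (major below 4096, minor below 256). That covers the devices mentioned here.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// makedev packs a major/minor pair into a Linux device number.
// Simple encoding, valid only for major < 4096 and minor < 256.
func makedev(major, minor int) int {
	return major<<8 | minor
}

// makeCharDevice is a rough Go equivalent of `mknod path c major minor`.
// It needs root (CAP_MKNOD) to succeed.
func makeCharDevice(path string, major, minor int) error {
	mode := uint32(syscall.S_IFCHR | 0600)
	return syscall.Mknod(path, mode, makedev(major, minor))
}

func main() {
	fmt.Printf("console: %#x, random: %#x\n", makedev(5, 1), makedev(1, 8))
	// Only attempt the real call when a target path is given, e.g.
	//   sudo ./mkdev rootfs/dev/console
	if len(os.Args) == 2 {
		if err := makeCharDevice(os.Args[1], 5, 1); err != nil {
			fmt.Println("mknod failed:", err)
		}
	}
}
```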
zsoltkacsandi wrote 20 hours 3 min ago:
Author here. It was a bit emotional seeing this on the front page.
My goal with this post and the whole (work in progress) series is to
fill the gap between "here are the commands to do X" and "if you want
to contribute to the kernel, you need to learn this" style books and
tutorials.
I want something in between, for developers who just want a solid
mental model of how Linux fits together.
The rough progression I have in mind is:
1. the Linux kernel as "just a program"
2. system calls as the kernel's API
3. files as resources manipulated through system calls, forming a
consistent API
4. the filesystem hierarchy as a namespace system, not a direct map of
disk layout
5. user/group IDs and permissions as the access control mechanism for
resources (files)
6. processes, where all of the above comes together
I deliberately chose Go for the examples instead of C because I want
this to be approachable to a broader audience of developers, while
still being close enough to the OS to show what's really going on.
As a developer, this kind of understanding has been incredibly useful
for me for writing better software, debugging complex issues with tools
like strace and lsof, or the proc fs. I would like to help others to
gain the same knowledge.
kunley wrote 16 hours 20 min ago:
Hi! Great article.
I guess one of the points of using Go was also the fact that it has its
own memory management: to obtain memory pages it interacts only with
the kernel.
I mean, had you used C, it would be better to compile it statically,
otherwise you'd also need to put glibc, ld.so, and whatever else into
the initrd, I guess.
pollux_423 wrote 18 hours 11 min ago:
Really cool post, clear, easy to follow, just the right length and
depth. Looking forward to reading the whole series!
preisschild wrote 19 hours 19 min ago:
Another "interesting" related thing I found is that pid 1 signals are
handled differently in the kernel. Basically, SIGTERM is ignored and
you need to explicitly handle it in your program. Took me quite a
while before I found out why my program in a container didn't quit
gracefully...
(HTM) [1]: https://raby.sh/sigterm-and-pid-1-why-does-a-container-linge...
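A small Go sketch of what "explicitly handle it" can look like. The `selfSignalDemo` function is made up for this example: it installs a SIGTERM handler, sends the signal to its own process, and reports whether the handler saw it. A real init or container entrypoint would block on the channel and shut down its children instead.

```go
package main

import (
	"fmt"
	"os"
	"os/signal"
	"syscall"
	"time"
)

// selfSignalDemo installs a SIGTERM handler, sends SIGTERM to this
// process, and reports whether the handler received it within 2 seconds.
func selfSignalDemo() bool {
	sigs := make(chan os.Signal, 1)
	signal.Notify(sigs, syscall.SIGTERM) // explicit handler for SIGTERM
	defer signal.Stop(sigs)

	if err := syscall.Kill(os.Getpid(), syscall.SIGTERM); err != nil {
		return false
	}
	select {
	case <-sigs:
		return true
	case <-time.After(2 * time.Second):
		return false
	}
}

func main() {
	if selfSignalDemo() {
		fmt.Println("got SIGTERM, shutting down gracefully")
	} else {
		fmt.Println("SIGTERM was not delivered to the handler")
	}
}
```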
potato-peeler wrote 19 hours 22 min ago:
Can you also consider adapting Linux from scratch as a part of this
series? Or maybe after this series, you can expand on what was learnt to
build a minimal Linux distribution. I suppose that might give a good
understanding of how to apply this knowledge and a foundation
in the internals of the OS itself.
zsoltkacsandi wrote 16 hours 23 min ago:
I want to keep this series focused, but LFS-style content is
definitely something I'm considering for later, I think it's a good
idea.
That said, this series will also give you practical, applicable
knowledge as we progress.
WesolyKubeczek wrote 20 hours 18 min ago:
Can anyone explain why CGO_ENABLED needs to be set to 1 here?
zsoltkacsandi wrote 19 hours 19 min ago:
In the post it is set to 0. `CGO_ENABLED=0 go build -o init .`
The only reason is because I like to be explicit, and I could not
know what was set before in the user's environment.
westurner wrote 20 hours 22 min ago:
Systemd service unit and systemd-nspawn support could be written in Go,
too;
From [1] re: "MiniBox, ultra small busybox without uncommon options":
> There's a pypi:SystemdUnitParser.
> docker-systemctl-replacement > systemctl3.py parses and schedules
processes defined in systemd unit files: [2] From a container2wasm
issue about linux-wasm the other day: [3] :
> [ uutils/uucore, uutils/coreutils, uutils/procps, uutils/util-linux,
findutils, diffutils, toybox (C), rustybox, ]
(HTM) [1]: https://news.ycombinator.com/item?id=41270425
(HTM) [2]: https://github.com/gdraheim/docker-systemctl-replacement/blob/...
(HTM) [3]: https://github.com/container2wasm/container2wasm/issues/550#is...
CSDude wrote 21 hours 3 min ago:
I had a similar experiment ~10yr ago, see the relevant discussion [1]. And
the updated domain: [2]
(HTM) [1]: https://news.ycombinator.com/item?id=11064694
(HTM) [2]: https://mustafaakin.dev/posts/2016-02-08-writing-my-own-init-w...
mbana wrote 14 hours 33 min ago:
Interesting ...
Do you still maintain the site?
alexellisuk wrote 21 hours 9 min ago:
Interesting starter post.. I took this one step further a few years ago
to make the init mount various other /proc /sys etc filesystems and
boot up with Firecracker - using a container image as a rootfs.. GitHub
[1] Blog post: [2]
(HTM) [1]: https://github.com/alexellis/firecracker-init-lab
(HTM) [2]: https://actuated.com/blog/firecracker-container-lab
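As a rough idea of what "mount various other /proc /sys etc filesystems" means inside a Go init, here is a sketch. It is not taken from the linked repository: the `mountSpec` type, the `earlyMounts` list, and the injectable `mountFunc` are assumptions made for this example. The real mounts only happen when the program runs as PID 1, and otherwise it does a dry run.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// mountSpec describes one pseudo-filesystem to mount at boot.
type mountSpec struct {
	source, target, fstype string
}

// earlyMounts lists pseudo-filesystems that most userspace expects.
func earlyMounts() []mountSpec {
	return []mountSpec{
		{"proc", "/proc", "proc"},
		{"sysfs", "/sys", "sysfs"},
		{"devtmpfs", "/dev", "devtmpfs"},
	}
}

// mountFunc is the shape of a mount operation, so it can be faked.
type mountFunc func(source, target, fstype string) error

// mountAll applies mount to each spec in order and stops at the first error.
func mountAll(specs []mountSpec, mount mountFunc) error {
	for _, m := range specs {
		if err := mount(m.source, m.target, m.fstype); err != nil {
			return fmt.Errorf("mount %s on %s: %w", m.fstype, m.target, err)
		}
	}
	return nil
}

// realMount creates the mount point and calls mount(2); it needs root.
func realMount(source, target, fstype string) error {
	if err := os.MkdirAll(target, 0755); err != nil {
		return err
	}
	return syscall.Mount(source, target, fstype, 0, "")
}

func main() {
	if os.Getpid() != 1 {
		// Dry run when not started as init.
		mountAll(earlyMounts(), func(s, t, f string) error {
			fmt.Printf("would mount %s on %s\n", f, t)
			return nil
		})
		return
	}
	if err := mountAll(earlyMounts(), realMount); err != nil {
		fmt.Println(err)
	}
}
```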
mrbluecoat wrote 21 hours 12 min ago:
> If you ever wondered what this name means: vmlinuz: vm for virtual
memory, linux, and z indicating compression
Thank you. I have always wondered that.
Tor3 wrote 16 hours 42 min ago:
In the early days when the kernel was small (I used to build kernels
and copy them to floppy disks, and boot Linux from there) the kernel
was called 'vmlinux', and when compression was added after the kernel
started to get bigger it became 'vmlinuz'. It was still possible to
boot from 'vmlinux', and it may be possible today as well, for all I
know.
9front wrote 6 hours 6 min ago:
And 'vmlinux' was inspired by 'vmunix' (Virtual Memory Unix),
the UNIX kernel.
drnick1 wrote 21 hours 50 min ago:
It's a bit unnatural to use Go when C is the "native language" of Linux
and pretty much every operating system.
themafia wrote 19 hours 13 min ago:
Go can speak C. It's fine.
zsoltkacsandi wrote 19 hours 31 min ago:
The goal was to strip away most of the complexities (including C), to
make the topic more approachable for a broader audience.
Go seemed a perfect fit, it is easy to pick up the syntax and see
what is going on, but you can still be close to the OS.
ktpsns wrote 21 hours 37 min ago:
Talos Linux [1], "the Kubernetes Operating System", is written in Go.
That means it works exactly like the little demo here, where the kernel
hands over to statically compiled Go code as the init program.
Talos is really an interesting linux distribution because it has no
classical user space, i.e. there is no such thing as a $PATH
including /bin, /usr/bin, etc. The shell is instead a network API,
following the kubernetes configuration-as-code paradigm. The linux
host (node) is supposed to run containerized applications. If you
really want to, you can use a special container to get access to the
actual user space from the node. [1]
(HTM) [1]: https://www.talos.dev/
(HTM) [2]: https://github.com/siderolabs/talos/releases/tag/v1.11.5
tayo42 wrote 13 hours 22 min ago:
Off-topic i guess. Are there like large scale success stories using
this os?
ktpsns wrote 8 hours 59 min ago:
Yes. I know at least one big cloud provider (actually the
biggest) in Germany who uses Talos for their managed k8s.
preisschild wrote 19 hours 30 min ago:
I also use Talos, but I wonder if just using systemd for the init
process wouldn't have been easier. You can interface with systemd
in go quite easily anyways...
cpach wrote 19 hours 28 min ago:
s6 (perhaps with s6-rc) is another interesting option. One could
say it's less opinionated than systemd. Or perhaps it's more
correct to say it has another set of opinions.
cpach wrote 21 hours 47 min ago:
I mean what you run is still machine code anyway, right?
zoobab wrote 21 hours 54 min ago:
Is there a patch for systemd so that you can start it without PID1
monopoly?
fxbois wrote 21 hours 55 min ago:
Thank you for this quite perfect blog post (short, interesting, well
written). One subject I would be interested in is what are all the
parameters a kernel accepts
pouulet wrote 18 hours 3 min ago:
Something like this?
(HTM) [1]: https://docs.kernel.org/admin-guide/kernel-parameters.html
pastage wrote 21 hours 56 min ago:
This is a really clean write up, but it is absolutely a happy path. I
do feel the kernel is too big to be called a program. It is almost
everything you want from comp sci class, router, scheduler, queue,
memory manager. There are some interesting things that you have to
handle if you do not run an OS and init on hardware, e.g. handling
signals, shutting down, and reaping child processes. I believe you are
always better off with an init process and an OS.
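Reaping child processes, one of the chores mentioned above, can be sketched in Go with `syscall.Wait4`. The `reapChildren` helper is a made-up name. A real init would typically call something like it whenever SIGCHLD arrives, since orphaned processes are re-parented to PID 1.

```go
package main

import (
	"fmt"
	"syscall"
)

// reapChildren collects every child that has already exited, without
// blocking, and returns how many were reaped.
func reapChildren() int {
	reaped := 0
	for {
		var status syscall.WaitStatus
		pid, err := syscall.Wait4(-1, &status, syscall.WNOHANG, nil)
		if err == syscall.EINTR {
			continue // interrupted by a signal, try again
		}
		if err != nil || pid <= 0 {
			// err is ECHILD when there are no children at all;
			// pid is 0 when children exist but none has exited yet.
			return reaped
		}
		reaped++
	}
}

func main() {
	fmt.Println("reaped", reapChildren(), "children")
}
```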
markhahn wrote 15 hours 23 min ago:
yes, it's misleading clickbait.
the author's apparent epiphany is realizing that init is just a
program. the kernel is, of course, software as well, but it does
injustice to both "program" and "kernel" to lump them together.
zsoltkacsandi wrote 19 hours 34 min ago:
> I do feel the kernel is too big to be called a program.
I kind of agree, but the kernel as a program serves a pedagogical
framing here.
The goal of the post is to make it more tangible for developers, they
write programs that are files on the disk, and you can interact with
them. That's where the analogy came from.
geonineties wrote 22 hours 30 min ago:
I would say something a little different. The kernel is a _library_
that has an init routine you can provide the function for. Or put
another way, without the kernel your go program would have to have
drivers statically compiled into it. This was the world of DOS, btw.
sedatk wrote 22 hours 10 min ago:
I agree with your point, but I must correct you on DOS: it had device
drivers too. :) That's how we used to access mouse input, CD drives,
network, extended memory, etc. Yes, it sucked on the graphics and
sound; every app basically had to reimplement its own graphics and
audio layer from scratch, but the rest was quite abstracted away.
1313ed01 wrote 14 hours 11 min ago:
There were generic VESA SVGA drivers towards the end of the MS-DOS
era.
Sound blaster(16) also came close to being standard enough that
games could just support that.
Extrapolating I think MS-DOS was on a nice trajectory to having
complete enough (and reasonably simple and non-bloated!) APIs for
everything important, when it was killed off. Late MS-DOS 32-bit
games were usually trivial to install and run.
charcircuit wrote 22 hours 17 min ago:
More importantly, a kernel is a platform. Conceptually it isn't that
much different than other platforms such as Chrome or Roblox. They
all have to care about the lifecycle of content, expose input events
to content, allow content to render things, make sure bad things
don't happen when running poorly programmed or malicious content, etc.
zsoltkacsandi wrote 19 hours 47 min ago:
> More importantly, a kernel is a platform.
Completely agree with this framing. We will get there by the end of
the series.
tosti wrote 10 hours 36 min ago:
Yeah no. An operating system kernel doesn't just act as a host
for userland processes, it interacts with hardware. Hardware
behaves in weird and unexpected ways, can be quite hard to debug,
can fail, etc.
This is why Linux is excellent. Users of other operating systems
often remind people to update their device drivers. A
non-technical Linux user responds by asking what the heck device drivers
are. To the casual user, device drivers become invisible because
they work exactly as intended.
charcircuit wrote 2 hours 42 min ago:
The kernel talks to the device using an API it exposes.
Similarly Chrome will talk to the OS using an API it exposes.
OS APIs can also behave in weird and unexpected ways, be hard
to debug and fail. Chrome protects the content it hosts from
this complexity. Interacting with the layer underneath you is
part of your job of hosting things on top of you.
peddling-brink wrote 23 hours 3 min ago:
Ahh, this was really cool. I'm not sure I understand the kernel much
better, but init and the concept of an operating system make a lot more
sense.
I'd love a similarly styled part two that dives into making a
slightly useful distro from "scratch" in Go.
tombert wrote 23 hours 7 min ago:
I love blog posts like this. You're not wrong in saying that the
kernel is sort of this magical black box to most engineers (including
me). I know how to use systemd and I know how to use bash and I know a
few other things, but the kernel has always been "the kernel", and it's
something I've never really tried to mess with. But you're right:
ultimately the kernel is just a program. Yes, it's a big and important
program that works at a lower level than I typically work at, but it's
probably not something that is impossible for me to learn some basic
stuff around.
I have had a bit of a dream of building a full desktop operating system
around seL4 [1], with all drivers in user space and the guts fully
verified in Isabelle, but learning about this level of code kind of
feels like drinking from a firehose. I would like to port over
something like xserver and XFCE and go from there, but I've never made
a proper attempt because of how overwhelming it feels.
[1] I know about sculpt and Genode, and while those are interesting,
not quite what I want.
bicolao wrote 13 hours 44 min ago:
> But you're right: ultimately the kernel is just a program.
Play a bit with User Mode Linux [1]: the kernel literally becomes a
Linux program, which I believe you can even debug with gdb (hazy
memory, as I last tried UML maybe a decade ago).
In theory you can also attach gdb to qemu running linux, but that's
more complicated.
(HTM) [1]: https://en.wikipedia.org/wiki/User-mode_Linux
ktpsns wrote 8 hours 27 min ago:
And User Mode Linux was the basic technology for dirt cheap (not
so) virtual machines at some VPS providers 15yrs ago. This had some
disadvantages, for instance you could not load custom kernel
modules in the VM (such as for VPN), actually you could not modify
the kernel at all.
antonkochubey wrote 5 hours 58 min ago:
Another major disadvantage, at least back then, was that it did
not support SMP at all
bitwize wrote 22 hours 14 min ago:
Try working on NetBSD or OpenBSD. You can learn kernel hacking by
literally reading the man pages. Changing, rebuilding, and booting
your own custom kernel is tremendously exciting.
TZubiri wrote 22 hours 56 min ago:
It reminds me of when people speak of money as a product. Sure, maybe
you are right, but I think more of it as something in relation to
products/programs than as a product/program itself.
The fact that it's also a product/program is some brainfucky exercise
that might either be an interesting hobby thought experiment OR it
might be a very relevant nuance that will be useful to the top 0.1%
of professionals who need a 99.9% accuracy, like the difference
between classical and relativistic mechanics.
I mean, sure you are right that kernels are programs and that money
is a product, and that gravity is not a force. But I am a mere mortal
and I will stick to my incorrect and incomplete mental model at a
small expense of accuracy to the great advantage of being
comprehensible.
ronsor wrote 22 hours 58 min ago:
You can actually disable most features of the Linux kernel, including
multi-user support (everything will run as root). The end result is a
stripped down kernel fit for only running your single desired
application.
tosti wrote 12 hours 36 min ago:
gmake tinyconfig all
The result of that probably won't boot your friendly neighbourhood
desktop distro.