https://rwmj.wordpress.com/2023/06/14/i-booted-linux-292612-times/

Richard WM Jones

June 14, 2023 * 9:42 am

I booted Linux 292,612 times

And it only took 21 hours.

Linux 6.4 has a bug where it hangs on boot, but probably only 1 in 1000 boots (and rarer if using Intel hardware for some reason). It's surprising to me that no one has noticed this, but I certainly did because our nbdkit tests which use libguestfs were randomly hanging, always at the same place early in booting the libguestfs qemu appliance:

    [ 0.070120] Freeing SMP alternatives memory: 48K

So to bisect this I had to run guestfish in a loop until it either hangs or doesn't. How many times? I chose 10,000 boots as a good threshold. To make this easier I wrote a test harness which uses up to 8 threads and parses the output to detect the hang.

After a painful bisection between v6.0 and v6.4-rc6 which took many days I found the culprit, a regression in the printk time feature: https://lkml.org/lkml/2023/6/13/733

To prove it I booted Linux 292,612 times before the faulty commit (successfully), and then after it (which failed in under 1,000 boots).

7 responses to "I booted Linux 292,612 times"

1. problemchild68, June 14, 2023 at 10:17 am
   Impressed with the tenacity of the search. I'd ASS-u-ME that at a 1 in 1000 failure rate people attributed it to H/W failure or glitching. Some more detail on your method of homing in on the code segment would be useful ... Cheers

   + rich, June 14, 2023 at 10:21 am
     I guess people would think that, yes. Finding the commit was actually simple (albeit taking days). I just used git bisect with the test linked above.
     The problem was the amount of time it took to run 10,000 boot iterations to prove that the kernel was good (vs bad if it hung). For unclear reasons the bisect only got me down to a merge commit; I then had to manually test each commit within it, which took about another day.

     o Allan, June 14, 2023 at 6:26 pm
       Bisecting with flakiness is tricky. "Noisy binary search" somewhat alleviates the pain: https://github.com/adamcrume/robust-binary-search

2. Jennifer Thompson, June 14, 2023 at 3:42 pm
   This is yet more evidence, not that we really need it at this point, that it's time to rewrite the rest of the Linux kernel in Rust. It can't happen in an instant, of course, but if the kernel devs set a goal of reducing C's usage by, say, 0.5% with each release, we'd soon see real progress being made toward a safer and more reliable Linux kernel. C has served us well, but it's time to move beyond it. The future is Rust, and only Rust.

   + Nobody, June 14, 2023 at 7:59 pm
     How does Rust prevent this bug? The change: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f31dcb152a3d0816e2f1deab4e64572336da197d

     o example, June 14, 2023 at 9:22 pm
       Of course it doesn't, which is why we should re-write the kernel in zig.

3. Ollie Jones, June 14, 2023 at 7:05 pm
   Software Test Engineer walks into a bar, orders a beer, orders 292,612 beers! This is outstanding detective work.
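
The looping test described in the post and in rich's reply can be sketched roughly as follows. This is only an illustration under stated assumptions, not the author's harness: the names `boot_once`, `kernel_is_good` and `miss_probability` are hypothetical, the command to boot the appliance is left as a parameter, and a hang is detected here with a per-boot timeout rather than by parsing the output as the real harness does. The last function estimates why 10,000 boots is a reasonable threshold, assuming each boot hangs independently with probability 1 in 1000.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def miss_probability(hang_rate: float, boots: int) -> float:
    """Chance that `boots` independent boots all succeed on a bad kernel."""
    return (1.0 - hang_rate) ** boots


def boot_once(cmd: list[str], timeout_s: float) -> bool:
    """Run one boot; return True if it completed, False if it hung or failed."""
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # Assumption: no exit within the timeout counts as a hang.
        # subprocess.run kills the child before raising.
        return False
    return result.returncode == 0


def kernel_is_good(cmd: list[str], boots: int = 10_000, workers: int = 8,
                   timeout_s: float = 60.0) -> bool:
    """Return False as soon as one boot fails, True if all `boots` pass."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda _: boot_once(cmd, timeout_s), range(boots))
        for ok in results:
            if not ok:
                # Drop the boots that have not started yet and stop early.
                pool.shutdown(wait=False, cancel_futures=True)
                return False
    return True


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere; a real run would
    # boot the kernel under test instead.
    demo_cmd = [sys.executable, "-c", "pass"]
    print("good" if kernel_is_good(demo_cmd, boots=16) else "bad")
    print(f"miss probability: {miss_probability(0.001, 10_000):.2e}")
```

Under the independence assumption, the chance of 10,000 clean boots on a kernel that hangs 1 time in 1000 is well below 0.01%, so a clean run is strong evidence that the commit is good. A wrapper script that exits 0 when `kernel_is_good` returns True and 1 otherwise could be handed to `git bisect run`, which treats exit code 0 as good and most other codes as bad.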

About the author

I am Richard W.M. Jones, a computer programmer. I have strong opinions on how we write software, about Reason and the scientific method. Consequently I am an atheist [To nutcases: Please stop emailing me about this, I'm not interested in your views on it]. By day I work for Red Hat on all things to do with virtualization. I am a "citizen of the world".

My motto is "often wrong". I don't mind being wrong (I'm often wrong), and I don't mind changing my mind.

This blog is not affiliated or endorsed by Red Hat and all views are entirely my own.