[HN Gopher] Deterministic Replay of QEMU Emulation
       ___________________________________________________________________
        
       Deterministic Replay of QEMU Emulation
        
       Author : Intralexical
       Score  : 153 points
       Date   : 2024-08-28 11:24 UTC (1 days ago)
        
 (HTM) web link (www.qemu.org)
 (TXT) w3m dump (www.qemu.org)
        
       | jester1337 wrote:
       | I remember we were working on this exact topic at my University
       | chair ~8-10 years ago. I think it never fully took off. Several
       | Master students worked on it for a while. I like that it's now in
       | QEMU!
        
       | repelsteeltje wrote:
       | I think this is awesome.
       | 
       | While it might seem like a small feature, it opens a huge door.
       | It's similar to what reproducible build infrastructure has done
       | for finding bugs, attestation that binary matches source,
       | immutability, etc.
       | 
       | Can imagine this is useful for finding bugs in hardware designs
       | too.
        
       | waschl wrote:
       | Tried to use it for debugging my own OS, but couldn't get it
       | running even after several days of trial and error...
       | 
       | https://github.com/jbreu/jos?tab=readme-ov-file#reverse-debu...
        
         | junon wrote:
         | Yeah QEMU's story for this sort of thing is pretty rough around
         | the edges for OS dev. Wishing there was something like Unicorn-
         | but-with-devices for making osdev tooling.
        
           | SoothingSorbet wrote:
           | There's also panda[1], but I never got it working myself. I
           | share your frustration, as it would help greatly with
           | debugging, especially with nondeterministic bugs. I likewise
           | never got QEMU's record/replay to work.
           | 
           | [1] https://github.com/panda-re/panda
        
         | bbarnett wrote:
         | Qemu is great, the DEVs and all who worked on her deserve
         | applause, except for documentation. It's like someone creating
         | a huge Japanese titanium indestructible fighting robot, but
         | then using aluminum in the feet/heels.
         | 
         | So much of my qemu work spent on randomly changing options,
         | with no change documentation, discovering features, with no
         | documentation, options with no reason or indication why,
         | manpages out of date, READMEs not updated, changelog not there,
         | etc.
        
           | cedws wrote:
           | Agreed, the CLI in particular is a complete mess.
        
             | bonzini wrote:
             | If there are specific parts you'd want better
             | documentation for, please let me know here.
             | 
             | Generally we've been moving the command line towards a scheme
             | where each option describes an aspect of either the guest
             | (a device, the board type, the CPU model) or the interface
             | to the host (a file holding the contents of the disk, the
             | network bridge to attach to, how to show graphic contents),
             | with some options providing both as a shortcut (for example
             | -nic, -audio, -serial).
        
           | bonzini wrote:
           | Documentation is not great I admit. The problem is that we
           | don't have anyone who is a capable tech writer in the team.
           | It's not something that you can improvise.
           | 
           | However, all incompatible changes are documented and also
           | announced at least 8 months in advance.
           | 
           | https://www.qemu.org/docs/master/about/removed-features.html
           | 
           | https://www.qemu.org/docs/master/about/deprecated.html
           | 
           | It may seem like there are many, but in practice they are in
           | very old, mostly unused or very badly designed corners. For
           | example configuration of audio was overhauled last year, and
           | is now the same as basically all other backends (e.g. -audio
           | pa,model=sb16; compare with -nic user,model=e1000 for a
           | network card).
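
The "backend,key=value" shape described above can be sketched in a few lines of Python. This is only an illustration of the option layout, not a QEMU tool: the helper name qemu_option is hypothetical, the binary name qemu-system-x86_64 is an assumption, and the only option strings used are the two quoted in the comment.

```python
def qemu_option(flag, backend, **props):
    """Return ['-flag', 'backend,key=value,...'] for one QEMU-style option.

    Hypothetical helper: it only formats strings and knows nothing about
    which flags or properties QEMU actually accepts.
    """
    parts = [backend] + [f"{key}={value}" for key, value in props.items()]
    return [f"-{flag}", ",".join(parts)]


# Assumed binary name; the two options are the ones quoted in the thread.
argv = ["qemu-system-x86_64"]
argv += qemu_option("nic", "user", model="e1000")
argv += qemu_option("audio", "pa", model="sb16")
```

The list form can be handed to subprocess.run as-is, which avoids shell quoting issues with the comma-separated values.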
        
             | justinclift wrote:
             | As a general thought, would it be possible to put out a
             | "call for tech writers" post or similar on the front of the
             | qemu website, or even a prominent blog post in the blog
             | section?
        
               | bonzini wrote:
               | Yes, I guess it would be an idea. We could also
               | participate in Season of Docs.
        
               | justinclift wrote:
               | Sounds like it'd be a useful avenue for closing a major
               | (non-code) problem with the software. :)
        
               | dev-n wrote:
               | bonzini I'm a QEMU fan ... and a techwriter. Is there a
               | way to send you an email? There's no real contact option
               | on qemu, other than IRC.
        
               | bonzini wrote:
               | pbonzini@redhat.com :) thanks very much!
        
       | moondev wrote:
       | Such a casual and low-key introduction of what sounds like an
       | incredible new capability.
       | 
       | 1. Would something like this replace packer for creating machine
       | images?
       | 
       | 2. Curious how quickly the replay log grows and how it compares
       | to a CoW snapshot.
       | 
       | 3. It will be interesting to see what the log looks like and
       | what doors could open up by creating or generating it by other
       | means.
        
         | owyn wrote:
         | It is very cool, but I think some version of this feature has
         | been around for years? This commit is from 7 years ago, and it
         | looks like the code originates back to 2010.
         | 
         | https://github.com/qemu/qemu/blob/v2.9.0/docs/replay.txt
         | 
         | That said, I was not aware of it until I saw this post, and I
         | definitely want to play around with it.
        
           | Intralexical wrote:
           | Well, I guess it was probably new when the doc page was first
           | written to introduce it.
           | 
           | > That said, I was not aware of it until I saw this post, and
           | I definitely want to play around with it.
           | 
           | You could almost say it was too casual and low-key. ;)
        
       | majke wrote:
       | This is a big deal. With some tooling around it, it can be amazing.
       | 
       | I can think of using this for testing, and as a vehicle to change
       | a programming paradigm of existing/legacy software (run a thing,
       | and roll it back aggressively from outside of a vm)
        
         | m000 wrote:
         | Indeed, the tooling is the problem. And I wouldn't hold my
         | breath to see this tooling being implemented, as the feature
         | has been around for quite a bit.
         | 
         | IMHO, PANDA [1] remains a better/more practical choice for
         | whole-system record/replay analysis. It already offers quite a
         | bit of tooling (including a python interface), as well as hooks
         | to build your own. It does have its own shortcomings (speed and
         | not being in-sync with the latest QEMU), but at least you're
         | not limited to gdb-based debugging.
         | 
         | [1] https://panda.re/
        
         | darby_nine wrote:
         | This is the central premise of Antithesis:
         | https://antithesis.com (no affiliation)
        
       | justinclift wrote:
       | Anyone have clear ideas/guidelines for how much ram/disk/etc this
       | is likely to need for a "reasonable" capture?
       | 
       | Say capturing a Qt application as it corrupts its internal state
       | during startup, in order to work out what's corrupting its
       | internal state?
        
         | londons_explore wrote:
         | I'm gonna guess not that much - easily doable on a typical
         | desktop.
         | 
         | If it were a problem, you could skip recording your emulated
         | machine's boot process and simply take a snapshot when you're
         | about to start your Qt application. That snapshot probably only
         | takes about 10% extra RAM, because most of the RAM contents
         | won't change between the snapshot and the live system.
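
The reasoning above (a snapshot costs only as much as the pages that change afterwards) can be illustrated with a toy copy-on-write model. This is a sketch under stated assumptions, not how QEMU stores snapshots: the CowMemory class, its page granularity, and the page counts are all hypothetical.

```python
class CowMemory:
    """Toy page-granular memory with one copy-on-write snapshot (hypothetical)."""

    def __init__(self, num_pages):
        self.pages = [bytes(1)] * num_pages  # live contents, one entry per page
        self.saved = {}                      # page index -> contents at snapshot time
        self.snapshotting = False

    def snapshot(self):
        """Start a snapshot; nothing is copied until a page is written."""
        self.saved = {}
        self.snapshotting = True

    def write(self, index, data):
        """Write a page, preserving its snapshot-time contents on first write."""
        if self.snapshotting and index not in self.saved:
            self.saved[index] = self.pages[index]
        self.pages[index] = data

    def snapshot_view(self):
        """Reconstruct memory as it was when snapshot() was called."""
        return [self.saved.get(i, page) for i, page in enumerate(self.pages)]

    def extra_pages(self):
        """Number of pages the snapshot had to copy so far."""
        return len(self.saved)


mem = CowMemory(100)
mem.snapshot()
for i in range(10):          # the "application" dirties 10 of 100 pages
    mem.write(i, b"x")
```

In this model the snapshot's extra cost grows with the number of distinct pages written after it was taken, not with total memory size.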
        
         | jraph wrote:
         | I don't know, however a key element is:
         | 
         | > Record/replay system is based on saving and replaying non-
         | deterministic events
         | 
         | > The following non-deterministic data from peripheral devices
         | is saved into the log: mouse and keyboard input, network
         | packets, audio controller input, serial port input, and
         | hardware clocks (they are non-deterministic too, because their
         | values are taken from the host machine). Inputs from simulated
         | hardware, memory of VM, software interrupts, and execution of
         | instructions are not saved into the log, because they are
         | deterministic and can be replayed by simulating the behavior of
         | virtual machine starting from initial state.
         | 
         | So, it's probably not much, you can probably comfortably save
         | minutes of qemu sessions.
         | 
         | Also note the existence of the rr debugger [1], which allows
         | you to reverse debug applications with a ~10% performance hit
         | while recording. To achieve this, it records results of
         | syscalls (only). It serializes thread events, which has the
         | effect of running applications as if on a single-core CPU.
         | 
         | [1] https://rr-project.org/
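
The idea in the quoted documentation (log only the non-deterministic inputs, recompute everything else) can be sketched with a toy "guest" in Python. This is a minimal illustration, not QEMU's log format or implementation: run_guest, record, and replay are hypothetical names, and the clock and random source merely stand in for a host clock and keyboard input.

```python
import random
import time


def run_guest(read_clock, read_input, steps=5):
    """Toy 'guest': its state depends only on the values it reads."""
    state = 0
    for _ in range(steps):
        # Deterministic arithmetic mixed with two non-deterministic reads.
        state = (state * 31 + read_clock() + read_input()) % 1_000_003
    return state


def record(steps=5):
    """Run against live sources and log every non-deterministic value."""
    log = []

    def logged(source):
        def read():
            value = source()
            log.append(value)
            return value
        return read

    clock = logged(time.monotonic_ns)             # stands in for a host clock
    keys = logged(lambda: random.randrange(256))  # stands in for keyboard input
    return run_guest(clock, keys, steps), log


def replay(log, steps=5):
    """Re-run the same guest, feeding it the logged values in order."""
    events = iter(log)

    def read():
        return next(events)

    return run_guest(read, read, steps)


final_state, event_log = record()
replayed_state = replay(event_log)
```

The log holds two values per step regardless of how much computation the guest does in between, which is the intuition behind the log staying small.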
        
       | vessenes wrote:
       | Seems to me like one of the highest and best uses of this right
       | now would be adding verifiable builds to ... literally anything.
       | You no longer need a verifiable-build-capable compiler or
       | language -- you can just run the compile and packaging step
       | through a deterministic QEMU session.
       | 
       | Does this sound right? I'm trying to figure out where
       | uncontrollable randomness would come in during a compile phase,
       | and coming up blank.
        
         | 0points wrote:
         | Historically, major causes of non-determinism were embedded
         | timestamps and unsorted file listings created by the build
         | tools.
         | 
         | I have not followed the progress recently, but
         | https://reproducible-builds.org/ is a starting point if you are
         | interested.
         | 
         | There is a sane path forward for reproducibility on bare metal,
         | no custom emulation is needed.
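
The two causes named above (embedded timestamps and unsorted file listings) can be neutralized without emulation, as in this Python sketch of a byte-for-byte reproducible tar archive. It is an illustration under assumptions: build_archive is a hypothetical helper, the file contents are made up, and it reads the SOURCE_DATE_EPOCH environment variable (the convention promoted by reproducible-builds.org), falling back to 0.

```python
import io
import os
import tarfile


def build_archive(files, epoch=None):
    """Pack {name: bytes} into a tar whose bytes do not depend on
    dict order, the current time, or the invoking user."""
    if epoch is None:
        epoch = int(os.environ.get("SOURCE_DATE_EPOCH", "0"))
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name in sorted(files):          # fixed order instead of listing order
            data = files[name]
            info = tarfile.TarInfo(name)
            info.size = len(data)
            info.mtime = epoch              # fixed timestamp instead of "now"
            info.uid = info.gid = 0         # no builder-specific ownership
            info.uname = info.gname = ""
            info.mode = 0o644
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


first = build_archive({"b.txt": b"two", "a.txt": b"one"}, epoch=0)
second = build_archive({"a.txt": b"one", "b.txt": b"two"}, epoch=0)
```

The archive is left uncompressed on purpose, since a gzip wrapper would add its own header timestamp that needs pinning as well.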
        
           | vessenes wrote:
           | Thanks for the link, I'm aware of reproducible-builds.org.
           | 
           | Both your causes seem trivially fixable here - the QEMU
           | builds could have a standard system clock time they start
           | with, and an 'unsorted' file listing made in a deterministic
           | OS environment will keep the same file order, no?
           | 
           | By comparison the rb.org site says you need to start with
           | stripping all that stuff out of your build process, for the
           | reasons you refer to.
        
             | repelsteeltje wrote:
             | > Both your causes seem trivially fixable here []
             | 
             | You'd be amazed at the amount of indeterminism lurking
             | in the guts of dependencies, all the way into libc and OS ...
             | Like locale, fs
        
             | commercialnix wrote:
             | I like your thinking. Deterministic replay with QEMU is
             | supplemental to the larger goal of reproducible builds. The
             | communities concerned with the topic of reproducible
             | software not only expect cohesive human-readable code that
             | runs deterministically to produce binary reproducible
             | results, but their originally stated goals require it.
             | 
             | Deterministic replay with QEMU is a "power tool" in the
             | larger picture of these efforts.
        
         | Intralexical wrote:
         | Sounds to me like that wouldn't be quite as good as true
         | reproducible builds (which can run on anybody's computer)
         | because auditing the entire emulated hardware starting state
         | and events log is a harder problem than auditing only your
         | code. Including a virtual machine image for your build
         | effectively makes the VM part of your codebase in terms of
         | users needing to trust it, so verifying the build result means
         | not just engineering but also forensics.
         | 
         | So it'd be good for cases where you otherwise wouldn't be able
         | to provide _any_ verifiability. But for software, it's still
         | not as good as eliminating non-determinism completely.
        
       | rrdharan wrote:
       | VMware had this many years ago; it was very cool:
       | 
       | http://stackframe.blogspot.com/2007/10/configuring-applicati...
       | 
       | http://www.replaydebugging.com/2008/08/vmware-workstation-65...
        
         | justinclift wrote:
         | Any idea if that works in modern VMware Workstation? It's
         | currently on version 17, whereas that post was for version 6.5.
         | 
         | VMware Workstation has such disjointed development spurts that
         | it wouldn't surprise me if the feature had been ripped out at
         | some point. Other useful features such as machine groups have
         | been. :(
        
           | roca wrote:
           | It was removed in 2011.
        
           | rrdharan wrote:
           | http://www.replaydebugging.com/2011/09/goodbye-replay-debugg...
        
             | justinclift wrote:
             | Thanks. Yeah, I kind of expected that. :(
        
             | m000 wrote:
             | I guess we can say it was _too_ cool.
        
       ___________________________________________________________________
       (page generated 2024-08-29 23:01 UTC)