[HN Gopher] Bringing Record and Replay debugging everywhere on L...
___________________________________________________________________
Bringing Record and Replay debugging everywhere on Linux
Author : sidkshatriya
Score : 190 points
Date : 2025-03-26 12:49 UTC (4 days ago)
(HTM) web link (github.com)
| yjftsjthsd-h wrote:
| Well! That eliminates one of the biggest problems with rr, if
| not the biggest :) Is there some catch or tradeoff? Performance,
| maybe?
| sidkshatriya wrote:
| Author here. Yes there are some tradeoffs:
|
| - Performance: slower than using rr with HW counters. Both
| dynamic and static instrumentation employed by _Software
| Counter mode_ rr slow things down
|
| - Potential fragility: Dynamic and static instrumentation can
| often make record/replay a bit more fragile
|
| - Currently only x86-64 support has been publicly released. I
| have aarch64 support working reasonably well internally, and it
| allows me, for instance, to run rr in a Linux VM running on
| macOS! I have yet to figure out my strategy for the aarch64
| release, so watch https://github.com/sidkshatriya/rr.soft for
| any updates.
|
| - Currently can run only on a few recent Linux distributions
| (e.g. Fedora 40/41, Debian Unstable, Ubuntu 24.10) because it
| relies on robust debuginfod support that is not widespread yet.
| See https://github.com/sidkshatriya/rr.soft/wiki#how-does-
| softwa... for why debuginfod is required. The debuginfod
| requirement may be relaxed in the future with more work
|
| Regardless of the tradeoffs this allows rr to be used in many
| more situations i.e. wherever HW Performance counter access is
| not possible/not reliable/broken.
|
| I would love it if more people tried this out and let me know
| if things worked out well for them (or not) with their
| programs.
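|
| As a rough illustration of why a tick counter matters for
| record/replay, here is a toy Python sketch. It is not rr.soft's
| implementation: the tiny three-instruction "program", the
| `State` fields, and the choice to bump `ticks` on each
| conditional branch are all assumptions made for the example.
| The recorder logs where an asynchronous event landed as a
| (ticks, pc) pair, and the replayer re-delivers it at exactly
| that point.

```python
from dataclasses import dataclass

N = 10  # loop bound of the toy program (hypothetical)


@dataclass
class State:
    pc: int = 0     # program counter of the toy machine
    acc: int = 0
    i: int = 0
    ticks: int = 0  # software counter maintained by "instrumentation"


def step(s: State) -> bool:
    """Execute one toy instruction; return False once the loop exits."""
    if s.pc == 0:        # conditional branch: while i < N
        s.ticks += 1     # instrumentation: count every conditional branch
        if s.i >= N:
            return False
        s.pc = 1
    elif s.pc == 1:      # acc += i
        s.acc += s.i
        s.pc = 2
    else:                # i += 1, then jump back to the branch
        s.i += 1
        s.pc = 0
    return True


def handler(s: State) -> None:
    """Asynchronous event (think: signal handler) that changes state."""
    s.acc *= 2


def record(fire_at_step: int):
    """Run with the event arriving after `fire_at_step` instructions.

    The arrival time stands in for real-world nondeterminism; only the
    (ticks, pc) pair at delivery is written to the "trace".
    """
    s, steps, event = State(), 0, None
    while True:
        if steps == fire_at_step:
            event = (s.ticks, s.pc)
            handler(s)
        if not step(s):
            return s.acc, event
        steps += 1


def replay(event):
    """Re-run and deliver the event when (ticks, pc) matches the trace."""
    s, delivered = State(), False
    while True:
        if not delivered and (s.ticks, s.pc) == event:
            handler(s)
            delivered = True
        if not step(s):
            return s.acc
```

| Calling `acc, event = record(29)` and then `replay(event)`
| yields the same `acc`, while `record(1)` ends with a different
| value, which is why the delivery point has to be pinned down.
| Note that `ticks` alone is not unique here; the sketch pairs it
| with `pc` to identify a single execution point.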
| Agingcoder wrote:
| Does it work with Pernosco ?
| sidkshatriya wrote:
| I've not used Pernosco but I am generally aware of it. My
| guess is that Pernosco would need to be technically
| modified to support recordings that have "soft ticks" i.e.
| use Software Counters.
|
| I don't see any compelling reason why this should _not_ be
| possible from a broad level technical point of view.
| Pernosco engineers of course would be able to give a more
| authoritative reply.
| vlovich123 wrote:
| I'm assuming this changes nothing about the lack of io_uring
| support?
| sidkshatriya wrote:
| Yes, io_uring is still not supported due to fundamental
| issues in the overall rr architecture which my modification
| does not resolve. My modification only addresses the HW
| counter requirement of upstream rr and the other core
| aspects of rr remain the same.
|
| Normal system calls transition to kernel space and return
| back from kernel space. They will change your program's
| memory/process state as soon as they complete. This gives
| rr an easy boundary when it "can do its thing" to record
| memory/process state changes or insert results (during
| replay).
|
| When does an io_uring request/response complete ? That's
| difficult to say. The kernel/userspace when using io_uring
| communicate with each other by checking a queue head or
| tail with memory accesses to see if something got
| added/removed from request/response ring buffer.
|
| Think of io_uring and userspace cooperating via memory.
| (Yes, sometimes "proper" traditional ring crossing system
| calls are made but what makes io_uring so fast is
| communicating via memory and not via system calls most of
| the time). Anyways all this makes it difficult for rr to
| intervene on the boundary between kernel and userspace
| because this boundary is elusive when it comes to io_uring.
| The memory writes cannot be caught by ptrace ! This
| explanation is simplified of course.
|
| The rr maintainers have some plans to deal with io_uring:
| https://github.com/rr-debugger/rr/issues/2613
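|
| The contrast between a syscall boundary and a shared-memory
| ring can be sketched with a toy model in Python. This is not
| the io_uring API and not how rr is built: `CompletionRing`,
| `Tracer`, and `traditional_read` are hypothetical names, and
| the "kernel" is just another method on the same object. The
| point is only that a completion posted into the ring becomes
| visible through plain memory writes, so a recorder that hooks
| syscall entry/exit never gets a chance to observe it.

```python
class Tracer:
    """Stands in for a ptrace-style recorder: it only sees syscalls."""

    def __init__(self):
        self.seen = []

    def on_syscall(self, name, result):
        self.seen.append((name, result))


def traditional_read(tracer, data):
    """A classic syscall: the tracer is notified at the boundary."""
    tracer.on_syscall("read", data)  # recorder can log the result here
    return data


class CompletionRing:
    """Toy single-producer/single-consumer ring shared via memory."""

    def __init__(self, size=8):
        self.size = size
        self.entries = [None] * size
        self.head = 0  # advanced by the consumer ("userspace")
        self.tail = 0  # advanced by the producer (the "kernel")

    def kernel_post(self, result):
        """Producer side: write the entry, then publish it by moving tail."""
        if self.tail - self.head == self.size:
            raise BufferError("ring full")
        self.entries[self.tail % self.size] = result
        self.tail += 1  # a plain memory write is the only notification

    def user_reap(self):
        """Consumer side: poll head/tail, no syscall involved."""
        out = []
        while self.head != self.tail:
            out.append(self.entries[self.head % self.size])
            self.head += 1
        return out
```

| With one `Tracer`, a `traditional_read` shows up in
| `tracer.seen`, whereas `kernel_post` followed by `user_reap`
| hands results to "userspace" without adding anything to it.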
| CMCDragonkai wrote:
| Will it run on latest NixOS?
| IshKebab wrote:
| I had to scroll a _very_ long way to get to the most important
| bit:
|
| > Running rr record/replay without access to CPU HW performance
| counters is accomplished using lightweight dynamic (and static)
| instrumentation. The Software Counters mode rr wiki has more
| details in case you're curious about some more of the internals.
|
| You should move that to the top.
| sidkshatriya wrote:
| Good suggestion. Added to the TL;DR section of the blog post !
| db48x wrote:
| Very cool. It's difficult to praise rr too much, and it just
| keeps getting better. If you're not using it, you're missing out
| on a superpower.
| stuaxo wrote:
| Very nice.
|
| Has anyone got rr working with python?
| sidkshatriya wrote:
| rr has always worked with Python in the sense that it can
| record and replay Python programs.
|
| However, when you try to debug the program you can only debug
| the C code the Python interpreter is written in.
|
| I suppose you want to be able to debug the Python code itself.
| Here is something that could do this
| https://pypy.org/posts/2016/07/reverse-debugging-for-python-...
| . I don't think the project is active nowadays though. Also I
| haven't used it so can't say whether it is good or not.
|
| It should be possible to build a Python reverse debugger on top
| of rr. I know this should be possible because I built something
| for PHP https://github.com/sidkshatriya/dontbug .
|
| There are other fancy (and possibly better) things that are
| possible. Instead of building a Python debugger atop rr, you
| can record the full trace of the Python program and then, for
| example, store the values of important variables at each
| executed line of the Python program in a database. This would
| again use rr as the record/replay substrate but with a slightly
| different approach. This is an area in which I've done some
| work internally, but nothing has been publicly released yet :-)
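|
| To make the "values at each executed line in a database" idea
| concrete, here is a minimal Python sketch. It swaps in
| `sys.settrace` and an in-memory SQLite database in place of an
| rr recording, so it only shows the shape of the stored data,
| not the approach described above. `trace_to_db`, the
| `snapshots` table, and `running_sum` are hypothetical names.

```python
import sqlite3
import sys


def trace_to_db(func, *args):
    """Run func(*args), storing its locals before each executed line."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE snapshots "
        "(step INTEGER, line INTEGER, name TEXT, value TEXT)"
    )
    target = func.__code__
    step = 0

    def tracer(frame, event, arg):
        nonlocal step
        if frame.f_code is not target:
            return None  # do not trace other functions
        if event == "line":
            # A "line" event fires just before the line runs, so these
            # are the values the line is about to see.
            for name, value in frame.f_locals.items():
                db.execute(
                    "INSERT INTO snapshots VALUES (?, ?, ?, ?)",
                    (step, frame.f_lineno, name, repr(value)),
                )
            step += 1
        return tracer

    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(None)  # always switch tracing back off
    return result, db


def running_sum(n):
    total = 0
    for i in range(n):
        total += i
    return total
```

| After `result, db = trace_to_db(running_sum, 4)`, a query such
| as `SELECT step, line, value FROM snapshots WHERE name =
| 'total' ORDER BY step` returns the history of `total`, which
| can be inspected after the run without re-executing anything.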
| senkora wrote:
| Do the gdb commands for printing information about python
| frames work with rr? E.g. py-bt, py-print, py-locals?
| sidkshatriya wrote:
| Any gdb integration scripts for Python to get stack frames
| etc. should work fine in rr and Software Counters Mode rr.
|
| `rr replay` and (the software counters equivalent) `rr
| replay -W` both invoke gdb.
| 29ebJCyy wrote:
| https://retracesoftware.com/ is Python-specific and seems
| promising.
| pabs3 wrote:
| Is this getting merged back into rr?
| sidkshatriya wrote:
| Tough to say right now. Here is a long answer:
| https://github.com/sidkshatriya/rr.soft?tab=readme-ov-file#w...
| georgewsinger wrote:
| Still waiting for rr to work more transparently/easily with
| Haskell.
| sidkshatriya wrote:
| rr and gdb are very focused on DWARF debugging information. As
| long as Haskell has only basic DWARF debugging support, I
| wonder how much rr/gdb can do.
|
| Though I do see a lot of promise in the future. In a debugger,
| many expressions get evaluated earlier than they would in a
| real Haskell program because a user may want to inspect a
| value. rr can make this premature evaluation matter much less,
| because the evaluation can be executed in a diversion session.
| chacham15 wrote:
| I have tried SO hard to get rr to work for me, including buying a
| separate pc just to use it...but it just consistently fails so
| I've basically abandoned it. Something like this would absolutely
| be a godsend. Just getting something consistently working with
| Ubuntu is amazing. Does this approach make working in something
| like WSL viable?
|
| I would love if this were upstreamed. Is there a github issue
| where you discuss the possibility of this with the rr devs? That
| might be something to add to your readme for everyone else who
| wants to follow along. Thanks!
| sidkshatriya wrote:
| Thanks for the encouraging words! Please do try it out and
| report back if it worked well or not for you on the issue
| tracker.
|
| With sufficient usage I think we can make a good case to get
| merged upstream. This patch introduces dynamic/static
| instrumentation for ticks counting which is quite different to
| how things have happened till now on rr. If there are many
| success stories a stronger case for upstream merge can be made.
| The rr maintainers are aware of this project, but it is early
| days yet for an upstream merge PR attempt.
| chacham15 wrote:
| With a big changeset, it's better to have a brief discussion
| about how it works / what it needs before you actually make a
| PR. Just big principles, high level stuff. This way if you
| build a train station, the devs won't be like "ooh, we really
| need an airport." That's why an issue to track it is good: it
| raises visibility for anyone who has an issue with the
| approach etc. long before it's time to make a merge. Also, if
| they're like "we'll never take this" or "we'll take this if
| you build a space station" it's good to know that before
| investing a ton of time into something PR-able.
| maleldil wrote:
| Patiently waiting for the day when someone makes something
| similar for macOS.
| sidkshatriya wrote:
| It's very difficult for broad-based record/replay software
| like rr to exist for macOS, in my opinion. macOS system
| interfaces are quite basic in terms of functionality compared
| to Linux and increasingly locked down.
|
| rr uses many advanced features of Linux `ptrace`. Compare `man
| ptrace` on Linux with that on macOS for example and you will
| notice that Linux gives a lot of features to `ptrace` that
| macOS simply does not.
|
| There are a large number of other features required for
| practical record and replay. I don't think macOS provides
| them either.
|
| It's probably possible to build _some_ record/replay system on
| macOS with constraints, restrictions, workarounds and
| compromises -- never say never as they say. But I don't think
| it can be as capable/generic as rr on Linux.
| transpute wrote:
| Could this help?
| https://developer.apple.com/documentation/xcode-release-
| note...
|
| > Instruments 16.3 includes a new
| Processor Trace Instrument which uses hardware-supported,
| low-overhead CPU execution tracing to accurately reconstruct
| execution of the program. This tool provides metrics like
| duration, number of cycles, and instructions retired for
| every function executed on the CPU. Timeline in Instruments
| presents execution flame graph, while detail views provide
| aggregate-level data like Call Tree or aggregated metrics
| (min, max, count, sum), divided by function. Traces can be
| recorded using the new Processor Trace template on supported
| devices: M4 Mac, M4 iPad, and iPhone 16/16 Pro. Tracing on
| the device requires additional configuration in the System
| Settings.
| vchuravy wrote:
| Very cool. Does the dynamic instrumentation handle JIT emitted
| code?
___________________________________________________________________
(page generated 2025-03-30 23:01 UTC)