[HN Gopher] Logging all C++ destructors, poor mans run-time tracing
       ___________________________________________________________________
        
       Logging all C++ destructors, poor mans run-time tracing
        
       Author : jandeboevrie
       Score  : 82 points
       Date   : 2024-09-21 19:06 UTC (1 days ago)
        
 (HTM) web link (raymii.org)
 (TXT) w3m dump (raymii.org)
        
       | meindnoch wrote:
       | And what was the bug in the end?
        
       | neverartful wrote:
       | I did something similar once but my implementation didn't rely on
       | any compiler features. I made tracing macros for constructors,
       | destructors, and regular c++ methods. If the tracing was turned
       | on in the macros, the information given to the macro (class name,
       | method name, etc.) would be passed to the tracing manager. The
       | tracing manager would serialize to a string and send it through a
       | TCP socket. I also wrote a GUI tracing monitor that would listen
       | on a socket for tracing messages and then display the trace
       | messages received (including counts by class and method). The
       | tracing monitor had filters to tweak. It was a nice tool to have
       | and was very instrumental in finding memory leaks and obscure
       | crashes. This was back in the late 1990s or early 2000s.
        
       | jprete wrote:
       | Address/MemorySanitizer are also meant for this kind of problem.
       | https://github.com/google/sanitizers/wiki/AddressSanitizer
       | https://github.com/google/sanitizers/wiki/MemorySanitizer
       | 
       | Also valgrind, but I'm more familiar with the first two.
        
         | richardwhiuk wrote:
         | The author explicitly name checks valgrind.
         | 
         | I think plain gdb would have been sufficient if it's exiting
         | with a segfault or terminating...
        
       | loeg wrote:
       | But what was the shutdown bug you were trying to identify? Was
       | this destructor logging actually useful? The article teases the
       | problem and provides detailed instructions for reproducing the
       | logging, but doesn't actually describe solving the problem.
        
       | rqtwteye wrote:
       | I did this a long time ago with macros. It helped me to find a
       | ton of leaks in a huge video codec codebase.
       | 
       | I still don't understand the hate for the C preprocessor. It
       | enables doing this like this without any overhead. Set a flag and
       | you get constructor/destructor logging and whatever else you
       | want. Don't set it and you get the regular behavior. Zero
       | overhead.
        
         | jonathrg wrote:
         | The hate might have to do with it being such a primitive and
         | blunt tool; doing anything moderately complex becomes extremely
         | complicated and fragile.
        
           | tialaramex wrote:
           | Yeah, this very primitive tool easily creates the programming
           | equivalent of the "iwizard problem".
           | 
           | [You replace straight forwardly "mage" with "wizard" and
           | oops, now your images are "iwizards" and your "magenta" is
           | "wizardnta"]
        
           | immibis wrote:
           | By contrast something like AspectJ for C++ (if such a thing
           | existed) could express this requirement cleanly.
        
           | akira2501 wrote:
           | Complex arrangements _are_ fragile.
           | 
           | There is no magic environment that can fix this for you. If
           | you feel you've seen one, then you've focused on the parts of
           | it that were important to you, while ignoring the parts of it
           | everyone else actually needs.
        
         | synergy20 wrote:
         | do you have a write-up how you did it? I'm interested, thanks.
        
           | naruhodo wrote:
           | I did a similar thing, in C++, 3 decades ago. I used a macro,
           | FUNC(), that I would put at the start of functions. It took
           | no arguments and declared a local instance, using the
           | __FUNC__ preprocessor builtin to pass the function name to
           | the Trace constructor:                   Trace
           | trace##__LINE__(__FUNC__);
           | 
           | The Trace instance would generate one log on construction and
           | another on destruction. It also kept track of function call
           | nesting (a counter) in a static member that would increment
           | in the constructor and decrement in the destructor. It was
           | inherently single-threaded, because I used a static member,
           | but it could be adapted to multiple threads using thread
           | local storage. I paired it with a LINE("Var x is " << x);
           | macro for arbitrary ostreams-style logging. And building on
           | that, EXPR(x) would do LINE(#x " = " << (x)). The output was
           | along the lines of:                   ,- A::f()         | ,-
           | A::g()         | | ,- B::B()         | | `- B::~B()         |
           | | x = 12         | | About to do a thing...         | | ,-
           | A::doAThing()         | | `- A::doAThing()         | `-
           | A::g()         `- A::f()
           | 
           | The macros could be disabled (defined to do nothing) by a
           | preprocessor symbol.
        
       | meisel wrote:
       | I'd say address sanitizer is a better starting point, and likely
       | to show memory issues faster than this
        
         | tempodox wrote:
         | Yep, ASAN to find use-after-free, and valgrind memcheck to find
         | forgotten-to-free.
        
           | manwe150 wrote:
           | ASAN also checks for memory leaks like valgrind, the main
           | difference with the tools is whether you can recompile all of
           | libraries to get the compiler support for detection or
           | whether binary instrumentation is better (https://github.com/
           | google/sanitizers/wiki/AddressSanitizerLe...)
        
             | tempodox wrote:
             | Thanks, I didn't know the `ASAN_OPTIONS` part for macOS.
        
       | MontagFTB wrote:
       | I consider Tracy the state of the art for profiling C++
       | applications. It's straightforward to integrate, toggle, gather
       | data, analyze, and respond. It's also open source, but rivals any
       | product you'd have to pay for:
       | 
       | https://github.com/wolfpld/tracy
        
         | Veserv wrote:
         | Looks fine, but it does not look like there is a automatic full
         | function entry/exit trace, just sampling. The real benefit is
         | when you do not even need to insert manual instrumentation
         | points, you just hit run and you get a full system trace.
         | 
         | How well does the visualizer handle multi-TB traces? Usually
         | pretty uncommon, but a 10-100 GB is not that hard to produce
         | when doing full tracing.
        
           | jms55 wrote:
           | Of note is that tracy is aimed at games, where sampling is
           | often too expensive and not fine-grained enough. Hence the
           | manual instrumenting.
           | 
           | For the Bevy game engine, we automatically insert tracy spans
           | for each ECS system. In practice, users can just compile with
           | the tracy feature enabled, and get a rough but very usable
           | overview of which part of their game is taking a long time on
           | the CPU.
        
             | Veserv wrote:
             | I was talking about automatic instrumentation of _every
             | single function call_ by default. No manual instrumentation
             | needed because _everything_ is already instrumented.
             | 
             | To be fair, you do still want some manual instrumentation
             | to correlate higher level things, but full trace everywhere
             | answers most questions. You also want to be able to
             | manually suppress calls for small functions since that can
             | be performance relevant or distorting, but the point is
             | "default on, manual off" over "default off, manual on".
        
               | andersa wrote:
               | This would be unbelievably inefficient, game engines will
               | be running hundreds of millions of functions per second.
               | And if the code runs 10x slower with the trace active,
               | then it's no longer sensible.
               | 
               | We use sampling for the cases where this level of detail
               | is needed as it has lower overhead.
               | 
               | What use case did you find this useful for?
        
               | Veserv wrote:
               | No, it is quite reasonable with efficient implementation.
               | 10-50% overhead or so depending on function size
               | distribution (since it is small fixed overhead per call,
               | smaller functions result in a greater fraction of
               | overhead). 10x for just function entry/exit recording
               | would be grotesquely inefficient. You can do inefficient
               | time travel debugging recording for less than that.
               | 
               | You do need to allocate a ton of memory for the recording
               | buffer to record sizable amounts of trace data. GB per
               | core-second of trace or so (ring buffer so you get to see
               | the last N seconds, not you need to run for less than N
               | seconds) but that is fine during development on normal
               | dev machines.
               | 
               | It is useful for everything. Why would you not want full
               | traces for everything? It is amazing. We use it for
               | everything internally where I work. Or rather, it is part
               | of it. We actually prefer full time travel debugging
               | during development and automated testing (again, overhead
               | is low enough) but it is not available for everything. So
               | sometimes we are stuck with just traces.
        
               | donadigo wrote:
               | _Every_ function would be pretty overkill, but you can
               | automatically install instrumentation on functions on
               | demand or with a pre-determined user selection. You can
               | even instrument on a line basis if your instrumentation
               | is cheap enough. I 've experimented with this in a VS
               | extension I'm developing and I could easily browse
               | through a non-trivial game codebase without causing
               | noticeable performance overhead [1]. In the demo, the
               | instrumentation is auto-installed on all functions within
               | the file you opened. Obviously, this is just one project
               | I was testing on but it shows that this type of tracing
               | is feasible.
               | 
               | [1] https://www.youtube.com/watch?v=3PnVG49SFmU
        
               | jpc0 wrote:
               | How would you implement this may I ask? C++ does not have
               | reflection in the language so you at best can do that by
               | hooking into the running application, but C++ also
               | aggressively inlines functions on anything except -O0
               | which mean your function call might never be a function
               | call. Running at -O0 is alsp just generally a bad idea
               | since many many instances of UB will never get caught.
               | 
               | The only way I can see doing this at compile time is with
               | a compiler extension but then you are entirely locked in
               | to 1 compiler.
               | 
               | Maybe if you compile with debug symbols but then well,
               | you are shipping debug symbols...
        
               | TickleSteve wrote:
               | gcc has "-finstrument-functions". This calls your code on
               | every function entry and exit. I've used this previously
               | for tracing as described here and to move memory-
               | protection windows around based on the running code.
        
         | rerdavies wrote:
         | Alas, not for Linux. I've been using the unloved and mostly
         | abandoned (and mostly awful) google perf tools on Linux. :-(
        
           | jchw wrote:
           | Hmm? I haven't used Tracy yet but the demo trace they show at
           | the URL linked on GitHub[1] sure looks like a trace from an
           | application running on Linux. The documentation[2] also seems
           | to reference what you need to run it on Linux, and the NixOS
           | derivation[3] also suggests it runs on at least Linux and
           | macOS, and I was able to run several of the binaries
           | including the UI and capture binary. I still hesitate to
           | doubt you on this because I haven't figured out how one is
           | supposed to actually use it but it surely seems to support
           | Linux. (I will definitely find a use for this, it looks
           | amazing.)
           | 
           | [1]: https://tracy.nereid.pl/
           | 
           | [2]: https://github.com/wolfpld/tracy/releases/latest/downloa
           | d/tr...
           | 
           | [3]: https://github.com/NixOS/nixpkgs/blob/nixos-24.05/pkgs/d
           | evel...
        
           | HellsMaddy wrote:
           | Tracy works fine on Linux! Just used it today.
        
           | dkersten wrote:
           | I've used Tracy on Linux about two months ago. Works fine.
        
       | rerdavies wrote:
       | Just spent three days of debugging hell getting my app to shut
       | down gracefully, so that it gracefully turns off all the things
       | that it asynchronously turned on without performing use-after
       | deletes). I can sympathise with that.
        
       | tempodox wrote:
       | Too bad, it doesn't work like this on macOS.
        
       | kccqzy wrote:
       | One of my favorite hacky workarounds is to simply call _exit(0)
       | for an immediate exit without running destructors. Most of the
       | time, the destructors are just freeing memory that will be
       | reclaimed by the OS anyways so they are not worth running if you
       | know the program is exiting. And even if the destructor does more
       | than just freeing memory, maybe the work it's doing isn't needed
       | if you know the process is ending soon: maybe it's joining
       | threads or releasing mutexes or deleting timers.
       | 
       | You will find that in a typical C++ codebase, the destructors
       | that do useful things (say, flushing useful buffers and closing
       | files etc) are much fewer.
        
         | ignoramous wrote:
         | Sub-optimal if the exiting process must relinquish resources
         | external to the current process or the host.
        
           | gpderetta wrote:
           | Optimal of you are writing crash-only, i.e. reliable,
           | software.
        
             | TeMPOraL wrote:
             | Suboptimal if you're relying on RAII not for "regular"
             | resources, but more abstract ones like "correctly bookended
             | piece of persisted data".
             | 
             | Then again, I guess it's optimal if you're writing fault-
             | resistant software.
        
           | akira2501 wrote:
           | And you guarantee you can release those resources with out
           | error or exception all the time? If this is not the case then
           | a destructor is the wrong place to be doing this work
           | anyways. If an external service requires a client
           | experiencing an error to manually release resources than that
           | external service has significant design or protocol issues.
        
       ___________________________________________________________________
       (page generated 2024-09-22 23:01 UTC)