[HN Gopher] Logging all C++ destructors, poor mans run-time tracing
___________________________________________________________________
Logging all C++ destructors, poor mans run-time tracing
Author : jandeboevrie
Score : 82 points
Date : 2024-09-21 19:06 UTC (1 days ago)
(HTM) web link (raymii.org)
(TXT) w3m dump (raymii.org)
| meindnoch wrote:
| And what was the bug in the end?
| neverartful wrote:
| I did something similar once but my implementation didn't rely on
| any compiler features. I made tracing macros for constructors,
| destructors, and regular c++ methods. If the tracing was turned
| on in the macros, the information given to the macro (class name,
| method name, etc.) would be passed to the tracing manager. The
| tracing manager would serialize to a string and send it through a
| TCP socket. I also wrote a GUI tracing monitor that would listen
| on a socket for tracing messages and then display the trace
| messages received (including counts by class and method). The
| tracing monitor had filters to tweak. It was a nice tool to have
| and was very instrumental in finding memory leaks and obscure
| crashes. This was back in the late 1990s or early 2000s.
| jprete wrote:
| Address/MemorySanitizer are also meant for this kind of problem.
| https://github.com/google/sanitizers/wiki/AddressSanitizer
| https://github.com/google/sanitizers/wiki/MemorySanitizer
|
| Also valgrind, but I'm more familiar with the first two.
| richardwhiuk wrote:
| The author explicitly name checks valgrind.
|
| I think plain gdb would have been sufficient if it's exiting
| with a segfault or terminating...
| loeg wrote:
| But what was the shutdown bug you were trying to identify? Was
| this destructor logging actually useful? The article teases the
| problem and provides detailed instructions for reproducing the
| logging, but doesn't actually describe solving the problem.
| rqtwteye wrote:
| I did this a long time ago with macros. It helped me to find a
| ton of leaks in a huge video codec codebase.
|
| I still don't understand the hate for the C preprocessor. It
| enables doing this like this without any overhead. Set a flag and
| you get constructor/destructor logging and whatever else you
| want. Don't set it and you get the regular behavior. Zero
| overhead.
| jonathrg wrote:
| The hate might have to do with it being such a primitive and
| blunt tool; doing anything moderately complex becomes extremely
| complicated and fragile.
| tialaramex wrote:
| Yeah, this very primitive tool easily creates the programming
| equivalent of the "iwizard problem".
|
| [You replace straight forwardly "mage" with "wizard" and
| oops, now your images are "iwizards" and your "magenta" is
| "wizardnta"]
| immibis wrote:
| By contrast something like AspectJ for C++ (if such a thing
| existed) could express this requirement cleanly.
| akira2501 wrote:
| Complex arrangements _are_ fragile.
|
| There is no magic environment that can fix this for you. If
| you feel you've seen one, then you've focused on the parts of
| it that were important to you, while ignoring the parts of it
| everyone else actually needs.
| synergy20 wrote:
| do you have a write-up how you did it? I'm interested, thanks.
| naruhodo wrote:
| I did a similar thing, in C++, 3 decades ago. I used a macro,
| FUNC(), that I would put at the start of functions. It took
| no arguments and declared a local instance, using the
| __FUNC__ preprocessor builtin to pass the function name to
| the Trace constructor: Trace
| trace##__LINE__(__FUNC__);
|
| The Trace instance would generate one log on construction and
| another on destruction. It also kept track of function call
| nesting (a counter) in a static member that would increment
| in the constructor and decrement in the destructor. It was
| inherently single-threaded, because I used a static member,
| but it could be adapted to multiple threads using thread
| local storage. I paired it with a LINE("Var x is " << x);
| macro for arbitrary ostreams-style logging. And building on
| that, EXPR(x) would do LINE(#x " = " << (x)). The output was
| along the lines of: ,- A::f() | ,-
| A::g() | | ,- B::B() | | `- B::~B() |
| | x = 12 | | About to do a thing... | | ,-
| A::doAThing() | | `- A::doAThing() | `-
| A::g() `- A::f()
|
| The macros could be disabled (defined to do nothing) by a
| preprocessor symbol.
| meisel wrote:
| I'd say address sanitizer is a better starting point, and likely
| to show memory issues faster than this
| tempodox wrote:
| Yep, ASAN to find use-after-free, and valgrind memcheck to find
| forgotten-to-free.
| manwe150 wrote:
| ASAN also checks for memory leaks like valgrind, the main
| difference with the tools is whether you can recompile all of
| libraries to get the compiler support for detection or
| whether binary instrumentation is better (https://github.com/
| google/sanitizers/wiki/AddressSanitizerLe...)
| tempodox wrote:
| Thanks, I didn't know the `ASAN_OPTIONS` part for macOS.
| MontagFTB wrote:
| I consider Tracy the state of the art for profiling C++
| applications. It's straightforward to integrate, toggle, gather
| data, analyze, and respond. It's also open source, but rivals any
| product you'd have to pay for:
|
| https://github.com/wolfpld/tracy
| Veserv wrote:
| Looks fine, but it does not look like there is a automatic full
| function entry/exit trace, just sampling. The real benefit is
| when you do not even need to insert manual instrumentation
| points, you just hit run and you get a full system trace.
|
| How well does the visualizer handle multi-TB traces? Usually
| pretty uncommon, but a 10-100 GB is not that hard to produce
| when doing full tracing.
| jms55 wrote:
| Of note is that tracy is aimed at games, where sampling is
| often too expensive and not fine-grained enough. Hence the
| manual instrumenting.
|
| For the Bevy game engine, we automatically insert tracy spans
| for each ECS system. In practice, users can just compile with
| the tracy feature enabled, and get a rough but very usable
| overview of which part of their game is taking a long time on
| the CPU.
| Veserv wrote:
| I was talking about automatic instrumentation of _every
| single function call_ by default. No manual instrumentation
| needed because _everything_ is already instrumented.
|
| To be fair, you do still want some manual instrumentation
| to correlate higher level things, but full trace everywhere
| answers most questions. You also want to be able to
| manually suppress calls for small functions since that can
| be performance relevant or distorting, but the point is
| "default on, manual off" over "default off, manual on".
| andersa wrote:
| This would be unbelievably inefficient, game engines will
| be running hundreds of millions of functions per second.
| And if the code runs 10x slower with the trace active,
| then it's no longer sensible.
|
| We use sampling for the cases where this level of detail
| is needed as it has lower overhead.
|
| What use case did you find this useful for?
| Veserv wrote:
| No, it is quite reasonable with efficient implementation.
| 10-50% overhead or so depending on function size
| distribution (since it is small fixed overhead per call,
| smaller functions result in a greater fraction of
| overhead). 10x for just function entry/exit recording
| would be grotesquely inefficient. You can do inefficient
| time travel debugging recording for less than that.
|
| You do need to allocate a ton of memory for the recording
| buffer to record sizable amounts of trace data. GB per
| core-second of trace or so (ring buffer so you get to see
| the last N seconds, not you need to run for less than N
| seconds) but that is fine during development on normal
| dev machines.
|
| It is useful for everything. Why would you not want full
| traces for everything? It is amazing. We use it for
| everything internally where I work. Or rather, it is part
| of it. We actually prefer full time travel debugging
| during development and automated testing (again, overhead
| is low enough) but it is not available for everything. So
| sometimes we are stuck with just traces.
| donadigo wrote:
| _Every_ function would be pretty overkill, but you can
| automatically install instrumentation on functions on
| demand or with a pre-determined user selection. You can
| even instrument on a line basis if your instrumentation
| is cheap enough. I 've experimented with this in a VS
| extension I'm developing and I could easily browse
| through a non-trivial game codebase without causing
| noticeable performance overhead [1]. In the demo, the
| instrumentation is auto-installed on all functions within
| the file you opened. Obviously, this is just one project
| I was testing on but it shows that this type of tracing
| is feasible.
|
| [1] https://www.youtube.com/watch?v=3PnVG49SFmU
| jpc0 wrote:
| How would you implement this may I ask? C++ does not have
| reflection in the language so you at best can do that by
| hooking into the running application, but C++ also
| aggressively inlines functions on anything except -O0
| which mean your function call might never be a function
| call. Running at -O0 is alsp just generally a bad idea
| since many many instances of UB will never get caught.
|
| The only way I can see doing this at compile time is with
| a compiler extension but then you are entirely locked in
| to 1 compiler.
|
| Maybe if you compile with debug symbols but then well,
| you are shipping debug symbols...
| TickleSteve wrote:
| gcc has "-finstrument-functions". This calls your code on
| every function entry and exit. I've used this previously
| for tracing as described here and to move memory-
| protection windows around based on the running code.
| rerdavies wrote:
| Alas, not for Linux. I've been using the unloved and mostly
| abandoned (and mostly awful) google perf tools on Linux. :-(
| jchw wrote:
| Hmm? I haven't used Tracy yet but the demo trace they show at
| the URL linked on GitHub[1] sure looks like a trace from an
| application running on Linux. The documentation[2] also seems
| to reference what you need to run it on Linux, and the NixOS
| derivation[3] also suggests it runs on at least Linux and
| macOS, and I was able to run several of the binaries
| including the UI and capture binary. I still hesitate to
| doubt you on this because I haven't figured out how one is
| supposed to actually use it but it surely seems to support
| Linux. (I will definitely find a use for this, it looks
| amazing.)
|
| [1]: https://tracy.nereid.pl/
|
| [2]: https://github.com/wolfpld/tracy/releases/latest/downloa
| d/tr...
|
| [3]: https://github.com/NixOS/nixpkgs/blob/nixos-24.05/pkgs/d
| evel...
| HellsMaddy wrote:
| Tracy works fine on Linux! Just used it today.
| dkersten wrote:
| I've used Tracy on Linux about two months ago. Works fine.
| rerdavies wrote:
| Just spent three days of debugging hell getting my app to shut
| down gracefully, so that it gracefully turns off all the things
| that it asynchronously turned on without performing use-after
| deletes). I can sympathise with that.
| tempodox wrote:
| Too bad, it doesn't work like this on macOS.
| kccqzy wrote:
| One of my favorite hacky workarounds is to simply call _exit(0)
| for an immediate exit without running destructors. Most of the
| time, the destructors are just freeing memory that will be
| reclaimed by the OS anyways so they are not worth running if you
| know the program is exiting. And even if the destructor does more
| than just freeing memory, maybe the work it's doing isn't needed
| if you know the process is ending soon: maybe it's joining
| threads or releasing mutexes or deleting timers.
|
| You will find that in a typical C++ codebase, the destructors
| that do useful things (say, flushing useful buffers and closing
| files etc) are much fewer.
| ignoramous wrote:
| Sub-optimal if the exiting process must relinquish resources
| external to the current process or the host.
| gpderetta wrote:
| Optimal of you are writing crash-only, i.e. reliable,
| software.
| TeMPOraL wrote:
| Suboptimal if you're relying on RAII not for "regular"
| resources, but more abstract ones like "correctly bookended
| piece of persisted data".
|
| Then again, I guess it's optimal if you're writing fault-
| resistant software.
| akira2501 wrote:
| And you guarantee you can release those resources with out
| error or exception all the time? If this is not the case then
| a destructor is the wrong place to be doing this work
| anyways. If an external service requires a client
| experiencing an error to manually release resources than that
| external service has significant design or protocol issues.
___________________________________________________________________
(page generated 2024-09-22 23:01 UTC)