[HN Gopher] GCC Profiler Internals
___________________________________________________________________
GCC Profiler Internals
Author : fcambus
Score : 53 points
Date : 2022-05-20 15:32 UTC (7 hours ago)
(HTM) web link (trofi.github.io)
(TXT) w3m dump (trofi.github.io)
| mhh__ wrote:
| 1. Might want to mention FDO, even if it doesn't use the gcc-
| native profiling data.
|
| 2. Do _not_ use instrumenting profilers for measuring code.
| mistrial9 wrote:
| > Do not use instrumenting profilers for measuring code.
|
| except when that is useful 8-)
| mhh__ wrote:
| Use a sampling profiler. They're faster, more accurate, and
| don't require recompiling.
| gnufx wrote:
| "more accurate"? The rationale for (typically expensive)
| tracing in HPC circles is that sampling can miss things you
| want to pick up. I think that's also covered in the vi-
| hps.org workshop introductions, for instance. (It doesn't
| necessarily involve recompilation.)
| mhh__ wrote:
| "More accurate" is a condensed version of a longer
| (disclaimer-attached) statement about how I think they're
| more useful for typical profiling.
|
| If you want to know the exact ratios of how _hot_ code
| is, then yes, instrumentation can be a godsend. However, in
| my experience most people are simply misled by them
| because a naive instrumenting profiler usually does not
| have enough information to capture the right context
| (i.e. the call stack).
|
| For most applications profiling is basically just an
| exercise in proving your own mental model of execution
| correct, not fine-tuning (initially at least).
|
| https://youtu.be/6TDZa5LUBzY I gave a talk on this in
| November.
| AlotOfReading wrote:
| On the other hand, it's a lot more difficult to implement a
| sampling profiler than an instrumenting profiler on a bare
| metal target. There's definitely a niche for the latter and
| I've made use of the GCC instrumentation hooks for all
| sorts of analysis tools over the years.
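For readers who have not used the GCC instrumentation hooks mentioned above, here is a minimal sketch. It assumes a build with `-finstrument-functions`, under which GCC calls `__cyg_profile_func_enter` and `__cyg_profile_func_exit` around every instrumented function. The counters and `profile_report` are hypothetical names chosen for illustration, not anything from the thread or the linked article.

```c
/* Sketch: count calls and track call depth with GCC's
 * -finstrument-functions hooks.
 * Assumed build line (file name is hypothetical):
 *   gcc -finstrument-functions demo.c -o demo
 */
#include <assert.h>
#include <stdio.h>

/* Hypothetical bookkeeping, for illustration only. Not thread-safe. */
static unsigned long enter_count;
static unsigned long exit_count;
static int depth;
static int max_depth;

/* no_instrument_function keeps the hooks from instrumenting themselves,
 * which would otherwise recurse without bound. */
void __cyg_profile_func_enter(void *this_fn, void *call_site)
    __attribute__((no_instrument_function));
void __cyg_profile_func_exit(void *this_fn, void *call_site)
    __attribute__((no_instrument_function));
void profile_report(void) __attribute__((no_instrument_function));

/* Called on entry to every instrumented function. */
void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
    (void)this_fn;   /* address of the function being entered */
    (void)call_site; /* address it was called from */
    enter_count++;
    if (++depth > max_depth)
        max_depth = depth;
}

/* Called just before every instrumented function returns. */
void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
    (void)this_fn;
    (void)call_site;
    assert(depth > 0); /* an exit should always follow an enter */
    exit_count++;
    depth--;
}

/* Hypothetical helper: print the totals collected so far. */
void profile_report(void)
{
    fprintf(stderr, "enter=%lu exit=%lu max_depth=%d\n",
            enter_count, exit_count, max_depth);
}
```

Because the hooks are ordinary C functions with no dependency on signals or timers, the same idea carries over to bare-metal targets by replacing `fprintf` with whatever output channel is available.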
| mhh__ wrote:
| I suppose you're right but working on a compiler/language
| has made me very wary of "But what about embedded!" as a
| technical point. You can basically argue any point using
| embedded.
| AlotOfReading wrote:
| That's a totally fair objection and one I'm sensitive to
| as a vocal advocate for moving ES teams away from the
| custom crap that we've spent the past 40 years doing.
|
| I still think the weird limitations you get from "what
| about embedded" are often just obvious canaries for
| problems present throughout much of the vast diversity of
| computing. For example, sampling profilers used to have
| serious issues on WSL (where perf must still be manually
| compiled) and WINE. Having instrumentation hooks also
| makes things like AFL and custom sanitizers possible. The
| flexibility isn't always reasonable (e.g. don't assume
| little endian, don't assume 8-bit bytes), but there may
| be valid use cases that weren't considered at the
| toolchain/language level.
| flakiness wrote:
| Is there anyone still using these instrumentation-based profilers
| (vs a sampling profiler like perf)? What is it like today?
|
| I stopped using it a long time ago since it was so slow and not
| thread-safe. But I occasionally miss the comprehensive coverage
| it had. If the situation has changed, I'd love to give it another
| shot.
| gnufx wrote:
| Instrumentation is widely used in HPC performance engineering,
| but you may still do sampling, rather than tracing, with the
| result. Often only a specific set of function calls is
| instrumented (e.g. MPI, i/o), and often using LD_PRELOAD and
| hooks provided for the purpose.
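The LD_PRELOAD approach described above can be sketched roughly as follows. This is an illustration only, assuming Linux with glibc: it interposes `write` as a stand-in for the i/o calls mentioned, and looks up the real function with `dlsym(RTLD_NEXT, ...)`. The counters and `report_io` are hypothetical names; real HPC tools instrument MPI through the hooks the MPI library provides for the purpose, which this sketch does not attempt.

```c
/* Sketch: count write() calls and bytes from a preloaded shared object. */
#define _GNU_SOURCE /* needed for RTLD_NEXT on glibc */
#include <assert.h>
#include <dlfcn.h>
#include <stdatomic.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

typedef ssize_t (*write_fn)(int, const void *, size_t);

/* Hypothetical counters; atomics so concurrent callers do not lose updates. */
static atomic_ulong write_calls;
static atomic_ulong write_bytes;

/* Same signature as libc's write(), so the dynamic linker resolves
 * calls from the application to this definition first. */
ssize_t write(int fd, const void *buf, size_t count)
{
    static write_fn real_write;
    if (!real_write) {
        /* Find the next definition of write() in link order (libc's).
         * A race here is benign: every thread stores the same address. */
        real_write = (write_fn)dlsym(RTLD_NEXT, "write");
        assert(real_write != NULL);
    }
    ssize_t n = real_write(fd, buf, count);
    atomic_fetch_add(&write_calls, 1);
    if (n > 0)
        atomic_fetch_add(&write_bytes, (unsigned long)n);
    return n;
}

/* Hypothetical report, run when the process exits. */
__attribute__((destructor))
static void report_io(void)
{
    fprintf(stderr, "write: %lu calls, %lu bytes\n",
            atomic_load(&write_calls), atomic_load(&write_bytes));
}
```

Assuming the file is saved as `iocount.c` (a made-up name), it could be built with `gcc -shared -fPIC -o libiocount.so iocount.c -ldl` and applied with `LD_PRELOAD=./libiocount.so ./app`, with no recompilation of `app` itself.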
| AlotOfReading wrote:
| Thread safety isn't an issue that I've observed, but it's still
| dog slow and pretty much always will be. Sampling profilers are
| generally better when you can use them.
| gnufx wrote:
| Nothing is generally better when it comes to serious
| performance engineering. The standard introduction to the
| performance engineering workshops under vi-hps.org stresses
| that you need a variety of tools/techniques available.
___________________________________________________________________
(page generated 2022-05-20 23:01 UTC)