[HN Gopher] GCC Profiler Internals
       ___________________________________________________________________
        
       GCC Profiler Internals
        
       Author : fcambus
       Score  : 53 points
       Date   : 2022-05-20 15:32 UTC (7 hours ago)
        
 (HTM) web link (trofi.github.io)
 (TXT) w3m dump (trofi.github.io)
        
       | mhh__ wrote:
       | 1. Might want to mention FDO, even if it doesn't use the gcc-
       | native profiling data.
       | 
       | 2. Do _not_ use instrumenting profilers for measuring code.
        
         | mistrial9 wrote:
         | > Do not use instrumenting profilers for measuring code.
         | 
         | except when that is useful 8-)
        
           | mhh__ wrote:
           | Use a sampling profiler. They're faster, more accurate, and
           | don't require recompiling.
        
             | gnufx wrote:
             | "more accurate"? The rationale for (typically expensive)
             | tracing in HPC circles is that sampling can miss things you
             | want to pick up. I think that's also covered in the vi-
             | hps.org workshop introductions, for instance. (It doesn't
             | necessarily involve recompilation.)
        
               | mhh__ wrote:
               | "More accurate" is a condensed version of a longer
               | (disclaimer-attached) statement about how I think they're
               | more useful for typical profiling.
               | 
               | If you want to know the exact ratios of how _hot_ code
               | is, then yes instrumentation can be a godsend, however in
               | my experience most people are simply misled by them
               | because a naive instrumenting profiler usually does not
               | have enough information to capture the right context
               | (i.e. the call stack).
               | 
               | For most applications profiling is basically just an
               | exercise in proving your own mental model of execution
               | correct, not fine-tuning (initially at least).
               | 
               | https://youtu.be/6TDZa5LUBzY I gave a talk on this in
               | November.
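
A minimal sketch of the call-stack point above, assuming a recorded trace of function enter/exit events. All names here (`build_profile`, `sample_trace`, the function ids) are hypothetical and exist only for illustration. It computes a flat per-function count, which is what a naive instrumenting profiler might report, next to a per-(caller, callee) count that needs a shadow call stack.

```c
#include <stdio.h>

/* Hypothetical function ids for this sketch. */
enum { F_MAIN, F_LOAD_CONFIG, F_HANDLE_REQUEST, F_PARSE, F_COUNT };
enum { MAX_STACK = 64 };

static const char *const fn_names[F_COUNT] = {
    "main", "load_config", "handle_request", "parse"
};

enum event_kind { EV_ENTER, EV_EXIT };

struct event {
    enum event_kind kind;
    int fn;
};

struct profile {
    unsigned flat[F_COUNT];          /* calls per function, no context */
    unsigned edge[F_COUNT][F_COUNT]; /* calls per (caller, callee) pair */
};

/* Made-up trace: parse is called once from load_config and three
 * times from handle_request. */
const struct event sample_trace[] = {
    { EV_ENTER, F_MAIN },
    { EV_ENTER, F_LOAD_CONFIG },
    { EV_ENTER, F_PARSE }, { EV_EXIT, F_PARSE },
    { EV_EXIT, F_LOAD_CONFIG },
    { EV_ENTER, F_HANDLE_REQUEST },
    { EV_ENTER, F_PARSE }, { EV_EXIT, F_PARSE },
    { EV_ENTER, F_PARSE }, { EV_EXIT, F_PARSE },
    { EV_ENTER, F_PARSE }, { EV_EXIT, F_PARSE },
    { EV_EXIT, F_HANDLE_REQUEST },
    { EV_EXIT, F_MAIN },
};
#define SAMPLE_TRACE_LEN (sizeof sample_trace / sizeof sample_trace[0])

/* Replay the trace while keeping a shadow call stack. The caller must
 * zero *p first. Returns 0 on success, -1 if the trace is malformed
 * (bad id, unbalanced enter/exit, or deeper than MAX_STACK). */
int build_profile(const struct event *ev, size_t n, struct profile *p)
{
    int stack[MAX_STACK];
    int depth = 0;

    for (size_t i = 0; i < n; i++) {
        int fn = ev[i].fn;
        if (fn < 0 || fn >= F_COUNT)
            return -1;
        if (ev[i].kind == EV_ENTER) {
            if (depth == MAX_STACK)
                return -1;
            p->flat[fn]++;
            if (depth > 0)
                p->edge[stack[depth - 1]][fn]++; /* attribute to caller */
            stack[depth++] = fn;
        } else {
            if (depth == 0 || stack[depth - 1] != fn)
                return -1;
            depth--;
        }
    }
    return depth == 0 ? 0 : -1;
}

/* Print the flat counts, then every non-zero caller -> callee edge. */
void print_profile(const struct profile *p)
{
    for (int f = 0; f < F_COUNT; f++)
        printf("%-16s %u\n", fn_names[f], p->flat[f]);
    for (int a = 0; a < F_COUNT; a++)
        for (int b = 0; b < F_COUNT; b++)
            if (p->edge[a][b])
                printf("%s -> %s  %u\n", fn_names[a], fn_names[b],
                       p->edge[a][b]);
}
```

For the made-up trace, the flat view only says that `parse` ran 4 times. The edge view splits that into 1 call from `load_config` and 3 from `handle_request`, which is the context needed to decide where to look.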
        
             | AlotOfReading wrote:
             | On the other hand, it's a lot more difficult to implement a
             | sampling profiler than an instrumenting profiler on a bare
             | metal target. There's definitely a niche for the latter and
             | I've made use of the GCC instrumentation hooks for all
             | sorts of analysis tools over the years.
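
A rough sketch of what such hooks can look like, assuming "GCC instrumentation hooks" here means the ones enabled by `-finstrument-functions` (the comment does not say which mechanism was used). With that flag, GCC inserts calls to `__cyg_profile_func_enter` and `__cyg_profile_func_exit` around instrumented functions. The counters, `fib`, and `hook_report` are hypothetical names for this example.

```c
#include <stdio.h>

/* State for this sketch; these names are hypothetical, not part of GCC.
 * Plain globals, so this is NOT thread-safe as written. */
unsigned long hook_enters;
unsigned long hook_exits;
int hook_depth;
int hook_max_depth;

/* Called on entry to each instrumented function when the program is
 * built with -finstrument-functions. The attribute keeps the hook
 * itself from being instrumented, which would recurse forever. */
__attribute__((no_instrument_function))
void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
    (void)this_fn;   /* address of the function being entered */
    (void)call_site; /* address it was called from */
    hook_enters++;
    if (++hook_depth > hook_max_depth)
        hook_max_depth = hook_depth;
}

/* Called just before each instrumented function returns. */
__attribute__((no_instrument_function))
void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
    (void)this_fn;
    (void)call_site;
    hook_exits++;
    hook_depth--;
}

/* Example workload. It triggers the hooks only when this file is
 * compiled with -finstrument-functions. */
unsigned fib(unsigned n)
{
    return n < 2 ? n : fib(n - 1) + fib(n - 2);
}

/* Dump the counters; excluded from instrumentation like the hooks. */
__attribute__((no_instrument_function))
void hook_report(void)
{
    fprintf(stderr, "enters=%lu exits=%lu max_depth=%d\n",
            hook_enters, hook_exits, hook_max_depth);
}
```

To try it, add a `main` that calls `fib` and then `hook_report`, and build with `gcc -finstrument-functions`. Without the flag the program still runs, but the counters stay at zero because nothing calls the hooks. Every instrumented call pays for two extra function calls, which is one source of the slowdown discussed in this thread.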
        
               | mhh__ wrote:
               | I suppose you're right but working on a compiler/language
               | has made me very wary of "But what about embedded!" as a
               | technical point. You can basically argue any point using
               | embedded.
        
               | AlotOfReading wrote:
               | That's a totally fair objection and one I'm sensitive to
               | as a vocal advocate for moving ES teams away from the
               | custom crap that we've spent the past 40 years doing.
               | 
               | I still think the weird limitations you get from "what
               | about embedded" are often just obvious canaries for
               | problems present throughout much of the vast diversity of
               | computing. For example, sampling profilers used to have
               | serious issues on WSL (where perf must still be manually
               | compiled) and WINE. Having instrumentation hooks also
               | makes things like AFL and custom sanitizers possible. The
               | flexibility isn't always reasonable (e.g. don't assume
               | little endian, don't assume 8-bit bytes), but there may
               | be valid use cases that weren't considered at the
               | toolchain/language level.
        
       | flakiness wrote:
       | Is anyone still using these instrumentation-based profilers (vs.
       | a sampling profiler like perf)? What is it like today?
       | 
       | I stopped using it a long time ago since it was so slow and not
       | thread-safe. But I occasionally miss the comprehensive coverage
       | it had. If the situation has changed, I'd love to give it another
       | shot.
        
         | gnufx wrote:
         | Instrumentation is widely used in HPC performance engineering,
         | but you may still do sampling, rather than tracing, with the
         | result. Often only a specific set of function calls is
         | instrumented (e.g. MPI, i/o), and often using LD_PRELOAD and
         | hooks provided for the purpose.
        
         | AlotOfReading wrote:
         | Thread safety isn't an issue that I've observed, but it's still
         | dog slow and pretty much always will be. Sampling profilers are
         | generally better when you can use them.
        
           | gnufx wrote:
           | Nothing is generally better when it comes to serious
           | performance engineering. The standard introduction to the
           | performance engineering workshops under vi-hps.org stresses
           | that you need a variety of tools/techniques available.
        
       ___________________________________________________________________
       (page generated 2022-05-20 23:01 UTC)