[HN Gopher] Fgtrace - The Full Go Tracer
       ___________________________________________________________________
        
       Fgtrace - The Full Go Tracer
        
       Author : felixge
       Score  : 100 points
       Date   : 2022-09-19 13:23 UTC (9 hours ago)
        
       | hknmtt wrote:
       | Looks interesting, added a star to keep track.
        
       | jaffee wrote:
       | this thing is awesome, have used it many times to quickly track
       | down tricky performance issues.
        
         | jaffee wrote:
         | wait I'm thinking of fgprof.... this looks awesome too though
        
           | felixge wrote:
           | Sorry for the confusion :). I'm the author of both tools and
           | was also considering building the new functionality into
           | fgprof since the data capturing approach is very similar.
           | Anyway, if you found fgprof useful, I think fgtrace could be
           | even more useful in similar situations :)
        
       | felixge wrote:
       | I'm the author of fgtrace, happy to answer any questions :).
       | 
       | I've also posted a few more comments in this twitter thread:
       | https://twitter.com/felixge/status/1571850160358965249
        
         | lanstin wrote:
         | I use the builtin pprof flame graphs all the time, and since
         | each of the goroutine pools has different stack traces, I can
         | tell them apart. What does this package improve on? Wall time
         | instead of CPU time? It isn't immediately obvious to me what the
         | extra info is.
        
           | felixge wrote:
           | The main difference is that you get a timeline (flame chart)
           | rather than flame graph. This allows you to understand the
           | order in which operations are taking place. You also get
           | walltime (instead of CPU time), so you can debug Off-CPU
           | performance bottlenecks (e.g. database calls) without the
           | need for additional instrumentation. Last but not least you
           | get everything broken down per-goroutine, so you can
           | understand which operations are executed concurrently vs
           | sequentially.
           | 
           | The Go CPU profiler is great for reducing CPU utilization.
           | But unless you're CPU-bound, it's not very useful for
           | improving latency. fgtrace is trying to help with that.
        
       | kjeetgill wrote:
       | > fgtrace may cause noticeable stop-the-world pauses in your
       | applications.
       | 
       | Huh, I wonder if this is a temporary limitation or an issue with
       | the approach. In my experience, if you're doing profiling you're
       | probably better off getting something lighter weight that you can
       | get more honest numbers from.
       | 
       | Edit: reading closer, it looks like the Go team had similar
       | concerns. I wonder if this can capture how long a goroutine was
       | unmounted for.
        
         | felixge wrote:
         | Capturing a consistent snapshot of all goroutines requires
         | stopping the world. However, this can be very quick as the GC
         | relies on the same mechanism.
         | 
         | The bigger problem is capturing the stack traces for all
         | goroutines. Rhys added a patch to Go 1.19 [1] that mostly moves
         | this work outside of the critical STW section, which greatly
         | reduces the overhead. Unfortunately this improvement only
         | applies to the official goroutine profiling APIs, and those do
         | not provide details such as goroutine ids. This means fgtrace
         | has to use runtime.Stack() which returns the stack traces as
         | text (yikes) and isn't optimized like the other goroutine
         | profiling APIs.
         | 
         | There are various ways the implementation details of fgtrace
         | and the Go runtime could be improved for this use case
         | (wallclock timeline views), and I'm hoping to work on
         | contributions in the coming months.
         | 
         | [1] https://go-review.googlesource.com/c/go/+/387415
        
       | nathias wrote:
       | never go full tracer
        
         | felixge wrote:
         | haha - certainly not in production, at least not with my hacky
         | code here :)
        
       | chrsig wrote:
       | The proposal[0] mentioned in the README has some good insight
       | from rsc.
       | 
       | He notes the performance & scalability issues already noted here
       | by other commenters.
       | 
       | > Probably the right thing to do is figure out more of a trace
       | like the current trace profiles but perhaps less low level.
       | 
       | This is the key take away for me.
       | 
       | I think there's room for tracing support somewhere in-between
       | runtime/trace and full blown distributed traces (e.g.,
       | OpenTelemetry[1]) - so I'm hopeful this effort may evolve into a
       | good solution in that space.
       | 
       | From a usability point of view, my biggest gripe right now with
       | the Go tracer is that its viewer is...painful. It uses the
       | tracer that's built into Chrome, which Chrome itself is moving
       | away from.
       | 
       | I'd hacked around a bit recently to try and get the existing Go
       | traces into perfetto[2], with some success. As I recall, I
       | couldn't get user traces functioning.
       | 
       | The `go tool trace` server has an API to output compatible JSON,
       | but it's limited in what it outputs. Unfortunately, the trace
       | file itself is in some custom binary format. All the tools for
       | manipulating it are in `internal/` folders, making them
       | unavailable for import, so creating new tools for working with
       | the traces is quite burdensome.
       | 
       | I'd debated copying the code out into a new project, and starting
       | to hack on it, but at that point, I'd reached the end of my
       | willingness to invest time. Perhaps I should open an issue or
       | message the mailing list to see what the maintainers think the
       | future of runtime tracing looks like.
       | 
       | [0]
       | https://github.com/golang/go/issues/41324#issuecomment-70379...
       | 
       | [1] https://opentelemetry.io
       | 
       | [2] https://ui.perfetto.dev/
        
         | felixge wrote:
         | > He notes the performance & scalability issues already noted
         | here by other commenters.
         | 
         | Go 1.19 has made some improvements in this regard [1]. But yes,
         | profiling all goroutines does not scale to programs that use
         | more than perhaps 10k goroutines, which isn't entirely uncommon.
         | To overcome this, the goroutine profile API would need to be
         | extended to allow profiling a subset of goroutines. pprof
         | labels could be used to specify which goroutines should be
         | profiled.
         | 
         | > Probably the right thing to do is figure out more of a trace
         | like the current trace profiles but perhaps less low level.
         | 
         | Yeah, in the long run the tracer, perhaps in combination with
         | the CPU profiler [2], also offers a great way of capturing this
         | data. But right now it's too much of a firehose, so it probably
         | needs some way of selecting a subset of goroutines to trace as
         | well. Additionally the unwinding of stack traces is a major
         | bottleneck, so maybe frame pointer unwinding or similar will be
         | needed to make it faster.
         | 
         | I've heard some stuff about future plans for the tracer that
         | would help with the custom binary format problem, so hopefully
         | this will improve in the future.
         | 
         | Anyway, I mostly see fgtrace as a "Do Things that Don't Scale"
         | [3] kind of project. If people like the value it can provide,
         | it will likely motivate myself and others to figure out how to
         | build a version of it that is safe for production usage :).
         | 
         | [1] https://go-review.googlesource.com/c/go/+/387415
         | 
         | [2] https://go-review.googlesource.com/c/go/+/400795
         | 
         | [3] http://paulgraham.com/ds.html
        
       ___________________________________________________________________
       (page generated 2022-09-19 23:01 UTC)