Post B36fhOXub1Cd7CEUM4 by pervognsen@mastodon.social
(DIR) Post #B36fhNCbakZ6wpFxVQ by pervognsen@mastodon.social
2026-02-08T03:54:42Z
0 likes, 0 repeats
There's a funny thing about sampling profilers for games: at typical default sampling rates you're getting so few samples in a particular frame that only the subsystem-level parts of the stack near the main loop are statistically meaningful and almost everything else is noise. But a stack snapshot is walked from the leaf upward, so you're capturing the noisiest parts of the call stack first. In the worst case (e.g. a snapshot truncated at depth 64 of a deep call stack) you might not even capture any stable parts of the stack.
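To put rough numbers on "so few samples", here is a minimal back-of-the-envelope sketch. The 1 kHz sampling rate and 60 FPS frame rate are assumptions chosen for illustration, not figures from the post, and treating sample hits as Poisson counts is a simplification.

```python
import math

def samples_per_frame(sample_hz: float, frame_hz: float) -> float:
    """Expected number of profiler samples that land in a single frame."""
    return sample_hz / frame_hz

def relative_error(expected_hits: float) -> float:
    """Approximate relative standard error of a hit count (Poisson assumption)."""
    return 1.0 / math.sqrt(expected_hits)

# Assumed setup: 1 kHz sampler, 60 FPS game, so roughly 17 samples per frame.
n = samples_per_frame(1000.0, 60.0)

# Compare a subsystem taking half the frame with a leaf function taking 5% of it.
for share in (0.5, 0.05):
    hits = n * share
    print(f"{share:.0%} of frame: ~{hits:.1f} samples, "
          f"~{relative_error(hits):.0%} relative error")
```

Under these assumptions, anything that takes less than a few percent of the frame expects fewer than one sample per frame, so its per-frame estimate is dominated by noise.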
(DIR) Post #B36fhOXub1Cd7CEUM4 by pervognsen@mastodon.social
2026-02-08T04:03:25Z
0 likes, 0 repeats
The low sampling rate on a per-frame basis is something that most people point out, but I haven't seen many people focus on the fact that you're literally prioritizing the statistically noisiest part of the stack. That can be very useful for catching big hitches in unexpected places near the leaves, but most of the non-stationary data there is literal noise. Obviously cranking the sampling rate helps a lot (albeit with trade-offs).
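The leaf-first priority can be sketched with a toy model of a depth-limited snapshot. The function names and the stack contents below are hypothetical, and real unwinders work on return addresses rather than strings; this only shows which end of the stack survives truncation.

```python
def snapshot(stack_root_to_leaf: list[str], max_depth: int) -> list[str]:
    """Walk from the leaf toward the root, keeping at most max_depth frames."""
    leaf_first = list(reversed(stack_root_to_leaf))
    return leaf_first[:max_depth]

# Hypothetical deep stack: 3 stable frames near the main loop,
# followed by 100 frames of recursion.
stable = {"main", "game_loop", "update_physics"}
stack = ["main", "game_loop", "update_physics"] + [
    f"visit_node_{i}" for i in range(100)
]

captured = snapshot(stack, max_depth=64)

# With the limit at 64, none of the stable frames make it into the snapshot.
print(len(captured), stable.intersection(captured))
```

Raising `max_depth` above the stack depth brings the stable frames back, which is one of the trade-offs of a larger snapshot budget.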
(DIR) Post #B36fhPpfoT0L6ZYBg8 by wolf480pl@mstdn.io
2026-02-08T09:04:48Z
0 likes, 0 repeats
@pervognsen why are frames not similar to each other?
(DIR) Post #B36fhRk4iFiN1mwkEK by pervognsen@mastodon.social
2026-02-08T03:56:56Z
0 likes, 0 repeats
So ironically the only statistically relevant parts of the data tend to be the subsystem-level durations of the frame, and those can be trivially marked up with instrumentation by hand since there are so few of them (which also has essentially zero overhead and perturbative probe effect). Of course there are proper, effective ways to use sampling profilers for applications like games, and e.g. Superluminal (and Tracy if you set it up for your app) make it easy.
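Hand-marking the subsystem-level durations could look roughly like the sketch below. The `zone` helper, the zone names, and the stand-in workloads are all hypothetical and are not the API of Tracy or Superluminal. A real engine would use native scoped timers, so the near-zero overhead claim applies there and not to this Python illustration.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Accumulated seconds per named zone for the current frame.
frame_zones: dict[str, float] = defaultdict(float)

@contextmanager
def zone(name: str):
    """Time the enclosed block and add the elapsed seconds to frame_zones[name]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        frame_zones[name] += time.perf_counter() - start

def run_frame() -> dict[str, float]:
    """Run one frame with a handful of hand-placed subsystem zones."""
    frame_zones.clear()
    with zone("physics"):
        sum(i * i for i in range(10_000))  # stand-in workload
    with zone("render"):
        sum(i for i in range(5_000))       # stand-in workload
    return dict(frame_zones)

timings = run_frame()
print({name: f"{seconds * 1e3:.3f} ms" for name, seconds in timings.items()})
```

Because there are only a few zones per frame, every frame gets an exact duration for each subsystem instead of a handful of samples.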
(DIR) Post #B36ftBG7p2K3NiX6cS by pervognsen@mastodon.social
2026-02-08T09:06:57Z
0 likes, 0 repeats
@wolf480pl If they are, of course it works fine (and an even lower rate would probably be fine too). I'm really talking about cases where they aren't, i.e. you're not just assuming stationarity but reconstructing a timeline-based virtual call graph a la Superluminal or ghost stacks in Tracy.