[HN Gopher] Native FlameGraphViewer
___________________________________________________________________
Native FlameGraphViewer
Author : laladrik
Score : 62 points
Date : 2024-12-23 10:43 UTC (2 days ago)
(HTM) web link (laladrik.xyz)
(TXT) w3m dump (laladrik.xyz)
| laladrik wrote:
| Hello, I found it difficult to visualize a flamegraph built from
| the huge amount of data I collected while profiling Rust
| Analyzer. Viewing the flamegraph in a browser (Firefox and
| Chrome) was impossible: the page simply froze. I made this
| visualizer to solve my problem. Maybe it will help someone else.
| I'm leaving the link to my article about it; you can find the
| link to the project right in the first paragraph.
| janice1999 wrote:
| I was surprised to hear that Hotspot isn't fast. I had assumed
| it would be since it's written in C++.
| mandarax8 wrote:
| I've never had hotspot not be fast enough. Even on 20 GB
| traces, everything is instant.
|
| Only thing that ever takes some time is the initial load of
| the perf file and filtering (but still really fast).
| guipsp wrote:
| I think going for xlib is somewhat missing the forest for the
| trees. Does it take less memory? Yeah, but you lose out on any
| gpu assistance you might get for free otherwise. This only
| really matters as you get to bigger resolutions tho, as you
| avoid redrawing.
| atq2119 wrote:
| Props to you for making a cool little project, but as somebody
| who's been involved in Linux graphics a bit: please just let
| Xlib die. It's an outdated API, even if you ignore the
| existence of Wayland. For something like fast visualizations,
| you should really go with something that does offscreen
| rendering and then blits the result. As long as you're just
| drawing a bunch of rectangles, even CPU software rendering may
| be the better solution, though obviously modern tools should
| use GPU rendering.
|
| I see your journey and _how_ you ended up with Xlib. But I
| think that's really more of an indictment of the sorry state
| of GUI in Rust.
|
| I know that's not your job, I just couldn't let this use of
| Xlib stand uncommented because it's really bad for the larger
| ecosystem.
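|
| As a rough illustration of the "render offscreen, then blit" idea
| above, here is a minimal CPU-only sketch in Rust. The `Framebuffer`
| type and its methods are hypothetical names made up for this
| example, not anything from the article or from a real library, and
| the final blit to the window is left out because it depends on the
| windowing layer you pick.

```rust
/// Offscreen pixel buffer held in ordinary memory.
/// Hypothetical type for illustration; not from the article.
pub struct Framebuffer {
    pub width: usize,
    pub height: usize,
    /// Row-major storage, one u32 per pixel.
    pub pixels: Vec<u32>,
}

impl Framebuffer {
    /// Create a buffer of `width` x `height` pixels, all set to 0.
    pub fn new(width: usize, height: usize) -> Self {
        Self { width, height, pixels: vec![0; width * height] }
    }

    /// Fill an axis-aligned rectangle, clipping it to the buffer bounds.
    pub fn fill_rect(&mut self, x: usize, y: usize, w: usize, h: usize, color: u32) {
        let x_end = x.saturating_add(w).min(self.width);
        let y_end = y.saturating_add(h).min(self.height);
        if x >= x_end {
            return; // rectangle lies entirely outside the buffer
        }
        for row in y..y_end {
            let start = row * self.width + x;
            let end = row * self.width + x_end;
            // One contiguous slice fill per row of the rectangle.
            self.pixels[start..end].fill(color);
        }
    }
}
```

| Every flame graph block would become one `fill_rect` call, and the
| finished buffer would be handed to the display in a single copy
| instead of issuing one drawing request per rectangle.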
| IshKebab wrote:
| Damn I hate it when you write a whole project and someone comes
| along and says "this already exists" and you realise how much
| time you wasted (yeah even if some of it counts towards learning
| I'd still rather not needlessly repeat other people's work).
|
| Anyway, pprof has a fantastic interactive Flamegraph viewer that
| lets you narrow down to specific functions. It's really very
| good, I would use that.
|
| https://github.com/google/pprof
|
| Run `pprof -http=:` on a profile and you get a web interface with
| the Flamegraph, call graph, line based profiling etc.
|
| It's demonstrated in this video.
|
| https://youtu.be/v6skRrlXsjY
|
| They only show a very simple example and no zooming, but it works
| very well with huge flamegraphs.
| Gobd wrote:
| > I tried to find something fast and native. Saying "native" I
| > mean something which doesn't require a browser.
|
| pprof uses a browser, which doesn't meet the requirements they set.
| josephg wrote:
| Yep. Personally I love the Firefox profiler for interacting
| with perf - since it can show you flame graphs and let you
| explore a perf trace by dominators and whatnot.
|
| But I applaud the effort to make small, native apps. I agree
| with the author - not everything should live in the browser.
| IshKebab wrote:
| I think they were saying "fast and native" because web things
| usually aren't fast. In this case it is though, so I don't
| see why it would be a problem for it to be web based.
| jasonjmcghee wrote:
| MacOS Instruments is really quite good.
|
| I have a `profile` function I use.
|
|     profile() {
|       xcrun xctrace record --template 'Time Profiler' --launch -- "$@"
|     }
|
| Then I just do:
|
|     $ profile ./my-binary -a -b -c "foo bar"
|
| or w/e and when it completes (can be one-time run or long-
| running / interactive) now I have a great native experience
| exploring the profile.
|
| All the normal bells and whistles are there and I can double
| click on something and see it inline in the source code with
| per-line (cumulative) timings.
| Sesse__ wrote:
| > MacOS Instruments is really quite good.
|
| It really isn't. It's probably the slowest profiler UI I've
| ever used (it loves to beachball...), it hardly has any
| hardware performance counters, and its actual profiling core
| (xctrace) is... just really buggy? Five times now it has told
| me "this function uses 5% CPU", I optimized it away, and
| absolutely nothing happened, because it was just another
| Instruments mirage. Or the time it told me opening a file on
| iOS took 1000+ ms, but that was just because its end
| timestamps were pure fabrications.
|
| Maybe it's better if you have toy examples, but for large
| applications, it's among the worst profilers I've ever seen
| along almost every axis. I'll give you that gprof is worse,
| though...
| lubsch wrote:
| A very enjoyable and inspiring read! I wonder if self-rolling a
| native application similar to this is feasible on Wayland.
| adolph wrote:
| The article linked as "W3C specifications are bigger than POSIX."
| is also worth reading.
|
| _The total word count of the W3C specification catalogue is 114
| million words at the time of writing. If you added the combined
| word counts of the C11, C++17, UEFI, USB 3.2, and POSIX
| specifications, all 8,754 published RFCs, and the combined word
| counts of everything on Wikipedia's list of longest novels, you
| would be 12 million words short of the W3C specifications._
|
| https://drewdevault.com/2020/03/18/Reckless-limitless-scope....
| tdullien wrote:
| For the prodfiler.com flamegraph viewer we ended up building it
| in Pixi.JS, which allowed us to have nice GPU acceleration and
| render massive flamegraphs quickly. Omitting to draw blocks of
| less than half a pixel width is also a good idea, as is the
| monospace font.
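|
| The "skip blocks narrower than half a pixel" trick can be sketched
| roughly like this in Rust. The `Frame` struct and the
| `visible_frames` function are hypothetical names for illustration
| only; this is not the prodfiler.com implementation, just the
| general idea under the assumption that frame widths are measured
| in samples.

```rust
/// One flame graph block, measured in samples.
/// Hypothetical layout for illustration.
#[derive(Debug, Clone, PartialEq)]
pub struct Frame {
    pub name: String,
    /// Offset from the left edge, in samples.
    pub start: u64,
    /// Width of the block, in samples.
    pub samples: u64,
}

/// Return only the frames that are at least `min_px` wide on screen.
pub fn visible_frames(
    frames: &[Frame],
    total_samples: u64,
    view_width_px: f64,
    min_px: f64,
) -> Vec<&Frame> {
    if total_samples == 0 {
        return Vec::new();
    }
    // How many pixels one sample occupies at the current zoom level.
    let px_per_sample = view_width_px / total_samples as f64;
    frames
        .iter()
        .filter(|f| f.samples as f64 * px_per_sample >= min_px)
        .collect()
}
```

| With `min_px` set to 0.5, a block that would cover less than half a
| pixel is never sent to the renderer, which is where most of the
| blocks in a huge profile end up.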
| josephg wrote:
| As someone who's gone down the rust "native pointers vs pin vs
| ..." rabbit hole many times now, I really recommend just using a
| Vec for the data and storing indexes into the vec when you need a
| pointer.
|
| Pin adds a huge amount of weird incidental complexity to your
| code base - since you need to pin-project your struct fields (but
| which ones?). You can't just take an &self or &mut self in
| functions if your value is pinned, and pin is just generally
| confusing, hard to use and hard to reason about.
|
| The article ended up with Vec<Box<T>> - but that's a huge code
| smell in my book. It's much less performant than Vec<T> because
| every object needs to be individually allocated & deallocated. So
| you have orders of magnitude more calls to malloc & free, more
| memory fragmentation and way more cache misses while accessing
| your data. The impact this has on performance is insane.
|
| Vec & indexes is a lovely middle ground. In my experience it's
| often (remarkably) slightly more performant than using raw
| pointers. You don't have to worry about vec reallocations (since
| the indexes don't change). And it's 100% safe rust. It feels
| weird at first - indexes are just pointers with more steps. But I
| find rust's language affordances just work better if you write
| your code like that. Code is simple, safe, ergonomic and obvious.
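|
| A minimal sketch of the "Vec plus indexes" layout described above,
| applied to a call tree. `Tree`, `Node` and `NodeId` are hypothetical
| names for illustration, not the article's actual types.

```rust
/// Index of a node inside `Tree::nodes`, used in place of a pointer.
/// Hypothetical alias for illustration.
pub type NodeId = usize;

pub struct Node {
    pub name: String,
    pub samples: u64,
    pub parent: Option<NodeId>,
    pub children: Vec<NodeId>,
}

/// All nodes live in one contiguous Vec; links are plain indices.
#[derive(Default)]
pub struct Tree {
    pub nodes: Vec<Node>,
}

impl Tree {
    /// Append a node and return its index. The index stays valid even
    /// if the Vec reallocates, because it is an offset, not an address.
    pub fn add(&mut self, name: &str, samples: u64, parent: Option<NodeId>) -> NodeId {
        let id = self.nodes.len();
        self.nodes.push(Node {
            name: name.to_string(),
            samples,
            parent,
            children: Vec::new(),
        });
        if let Some(p) = parent {
            self.nodes[p].children.push(id);
        }
        id
    }

    /// Walk from a node up to the root, collecting names (leaf first).
    pub fn stack(&self, mut id: NodeId) -> Vec<&str> {
        let mut names = vec![self.nodes[id].name.as_str()];
        while let Some(p) = self.nodes[id].parent {
            names.push(self.nodes[p].name.as_str());
            id = p;
        }
        names
    }
}
```

| Nodes are only ever appended here, which matches the "load once,
| free everything at exit" usage and keeps every stored index valid.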
| wging wrote:
| > Code is simple, safe, ergonomic and obvious.
|
| Dunno about 'safe' -- or at least not in the more general sense
| that you seem to intend, rather than the more limited sense of
| rust's safe/unsafe distinction. If you store an index into a
| Vec<T> as a usize, rather than a &T, very little is stopping
| you from invalidating that pseudo-pointer without knowing it.
| (Or from using it as an index into the wrong vector, etc...)
|
| These problems are manageable and I'm not saying 'never do
| this' -- I've done it myself on occasion. It's just that there
| are more pitfalls than you're indicating here, and it is
| actually a meaningful tradeoff of bug potential for ease-of-
| use.
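|
| To make the pitfall concrete, here is a small Rust sketch of a
| saved index going stale. `lookup_after_remove` is a hypothetical
| helper written only for this example.

```rust
/// Remove the element at `removed`, then look up a previously saved index.
/// Hypothetical helper for illustration.
pub fn lookup_after_remove(
    mut names: Vec<&'static str>,
    saved: usize,
    removed: usize,
) -> Option<&'static str> {
    // `remove` shifts every later element one slot to the left.
    names.remove(removed);
    // This compiles and is memory safe, but the saved index may now
    // refer to a different element, or to nothing at all.
    names.get(saved).copied()
}
```

| Starting from `["main", "parse", "lex"]`, an index saved for
| "parse" (1) silently resolves to "lex" once element 0 is removed.
| Wrapping the index in a newtype per collection is one common way
| to at least rule out the "wrong vector" variant of this mistake.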
| josephg wrote:
| I mean safe in the narrow way that rust intends. It's memory
| safe, but as you imply, we're leaving the door open to
| logic bugs if you misuse those array indices.
|
| But honestly, I think danger from that is wildly overstated.
| The author isn't talking about implementing an ECS or b-tree
| here. They're just populating an array from a file when the
| program launches, then freeing the whole thing when the
| program terminates. It's really not rocket science.
|
| The other big advantage of this approach is that you don't
| have to deal with unsafe rust. So, no unsafe {} blocks. No
| wrangling with rust's frankly awful syntax for following raw
| pointers. No stressing about whether or not a future version
| of rust will change some subtle invariant you're accidentally
| depending on, or worrying about if you need to use MaybeUninit
| or something like that. I think the chance of making a
| mistake while interacting with unsafe code is far higher than
| the chance of misusing an array index. And the impact is
| usually worse.
|
| The author details running into exactly that problem while
| coding - since they assumed memory allocated by vec would be
| pinned (it isn't). And the program they ended up with still
| doesn't use pin, even though they depend on the memory being
| pinned. That's cause for far more concern than a simple array
| index.
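|
| A small Rust experiment showing why elements stored directly in a
| Vec can't be assumed to stay put, while boxed values keep their
| heap address as long as the Box itself is alive. Both function
| names are hypothetical and exist only for this sketch; whether a
| plain Vec's buffer actually moves on a given run is up to the
| allocator, so the second function just reports both addresses.

```rust
/// Push `extra` more boxes and report whether the first boxed value
/// kept its heap address. Hypothetical helper for illustration.
pub fn boxed_address_is_stable(extra: usize) -> bool {
    let mut v: Vec<Box<u64>> = vec![Box::new(7)];
    let before: *const u64 = &*v[0];
    for i in 0..extra {
        // May reallocate the Vec's buffer; that moves the Box
        // pointers, not the heap allocations they point to.
        v.push(Box::new(i as u64));
    }
    let after: *const u64 = &*v[0];
    before == after
}

/// Same experiment for a plain Vec<u64>.
/// Returns (buffer address before, buffer address after).
pub fn inline_addresses(extra: usize) -> (usize, usize) {
    let mut v: Vec<u64> = vec![7];
    let before = v.as_ptr() as usize;
    // Growing past the capacity lets the allocator move the buffer.
    v.extend((0..extra).map(|i| i as u64));
    (before, v.as_ptr() as usize)
}
```

| The boxed variant works without Pin only by convention: nothing in
| the types stops later code from replacing or dropping a Box while
| a raw pointer to its contents is still around.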
| javierhonduco wrote:
| This is pretty cool work.
|
| Something that's been on my mind recently is that there's a need
| for a high-performance flame graph library for the web.
| Unfortunately, the most popular flame graph libraries /
| components, basically the React and d3 ones, work fine, but their
| authors don't actively maintain them anymore and their
| performance with large profiles is quite poor.
|
| Most people that care about performance either hard-fork the
| Firefox profiler / speedscope flame graph component or create
| their own.
|
| Would be nice to have a reusable, high performance flame graph
| for web platforms.
| tantalor wrote:
| It's very funny they would call out poor performance of KDAB's
| Hotspot, a performance analysis app.
| Scene_Cast2 wrote:
| I recently went through trying to profile Rust code. I
| realized that the profiling toolchain is underdeveloped across
| the board - "perf", the recommended profiler, isn't cross-
| platform (and there aren't any profilers that I found that "just
| work"); visualizing traces from a multi-threaded app is not fun;
| there isn't an IDE plugin to highlight the problematic lines,
| etc.
| bombela wrote:
| This article reads like it was AI padded generously.
| creatonez wrote:
| Except it very obviously was not. Is this accusation going to
| come up in every single HN thread?
___________________________________________________________________
(page generated 2024-12-25 23:00 UTC)