[HN Gopher] Twenty years of Valgrind
___________________________________________________________________
Twenty years of Valgrind
Author : nnethercote
Score : 584 points
Date : 2022-07-26 22:59 UTC (1 days ago)
(HTM) web link (nnethercote.github.io)
(TXT) w3m dump (nnethercote.github.io)
| [deleted]
| appleflaxen wrote:
| What other great tools are there in the vein of valgrind and AFL?
| tux3 wrote:
| rr, for record and replay
|
| I'm also a fan of systemtap, for when your probing problems
| push into peeking at the kernel
| tialaramex wrote:
| In my obviously biased opinion, very specialised, but sometimes
| exactly what you need (I have used this in anger maybe 2-3
| times in my career since then, which is why I wrote the C
| version):
|
| https://github.com/tialaramex/leakdice (or
| https://github.com/tialaramex/leakdice-rust)
|
| Leakdice implements some of Raymond Chen's "The poor man's way
| of identifying memory leaks" for you. On Linux at least.
|
| https://bytepointer.com/resources/old_new_thing/20050815_224...
|
| All leakdice does is this: you pick a running process you own,
| and leakdice picks a random heap page belonging to that process
| and shows you that page as hex + ASCII.
|
| The Raymond Chen article explains why you might ever want to do
| this.
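|
| If you just want the flavour of it, a rough C sketch of the same
| idea (not the actual leakdice code; the names here are made up
| and error handling is minimal) looks something like this:
|
|       /* heappeek.c: dump one random [heap] page of a process
|        * you own as hex + ASCII (Linux; reading /proc/<pid>/mem
|        * may need ptrace permission). */
|       #include <fcntl.h>
|       #include <stdio.h>
|       #include <stdlib.h>
|       #include <string.h>
|       #include <time.h>
|       #include <unistd.h>
|
|       int main(int argc, char **argv)
|       {
|           char path[64], line[512];
|           unsigned long start = 0, end = 0, s, e;
|
|           if (argc != 2) return 1;
|           snprintf(path, sizeof path, "/proc/%s/maps", argv[1]);
|           FILE *maps = fopen(path, "r");
|           if (!maps) { perror("maps"); return 1; }
|           while (fgets(line, sizeof line, maps))   /* find [heap] */
|               if (strstr(line, "[heap]") &&
|                   sscanf(line, "%lx-%lx", &s, &e) == 2) {
|                   start = s;
|                   end = e;
|               }
|           fclose(maps);
|           if (!start) { fprintf(stderr, "no [heap]\n"); return 1; }
|
|           long psz = sysconf(_SC_PAGESIZE);
|           unsigned long npages = (end - start) / psz;
|           srand(time(NULL));               /* pick a random page */
|           unsigned long page =
|               start + (npages ? rand() % npages : 0) * psz;
|
|           snprintf(path, sizeof path, "/proc/%s/mem", argv[1]);
|           int mem = open(path, O_RDONLY);
|           if (mem < 0) { perror("mem"); return 1; }
|
|           unsigned char buf[16];
|           for (long off = 0; off < psz; off += 16) {
|               if (pread(mem, buf, 16, page + off) != 16) break;
|               printf("%012lx ", page + off);   /* hex bytes */
|               for (int i = 0; i < 16; i++) printf(" %02x", buf[i]);
|               printf("  |");                   /* ASCII column */
|               for (int i = 0; i < 16; i++)
|                   putchar(buf[i] >= 32 && buf[i] < 127 ? buf[i] : '.');
|               puts("|");
|           }
|           close(mem);
|           return 0;
|       }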
| cjbprime wrote:
| Starting to stretch, but would have to pick strace next. Can't
| believe macOS devs don't get to use it (at least without hoops
| like disabling SIP).
| yaantc wrote:
| Seconding `rr` as suggested by @tux3, it's great for debugging.
|
| Also, the sanitizers for GCC and Clang
| (https://github.com/google/sanitizers), and the Clang static
| analyzer (and tidy too) through CodeChecker
| (https://codechecker.readthedocs.io/).
|
| For the Clang static analyzer, make sure your LLVM toolchain
| has the Z3 support enabled (OK in Debian stable for example),
| and enable cross translation units (CTU) analysis too for
| better results.
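|
| A toy illustration of what CTU buys you (hypothetical file and
| function names, not from any real project): without CTU the
| analyzer only sees main.c and can't know that lookup() may
| return NULL, so it only has a chance of flagging the dereference
| below once CTU is enabled.
|
|       /* util.c -- the second translation unit */
|       #include <stddef.h>
|       int *lookup(int key)
|       {
|           static int value = 42;
|           return key ? &value : NULL;    /* may return NULL */
|       }
|
|       /* main.c -- the deref is only provably broken once the
|        * analyzer can see lookup()'s body in the other TU */
|       int *lookup(int key);
|       int main(void)
|       {
|           return *lookup(0);             /* NULL deref when key == 0 */
|       }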
| ahartmetz wrote:
| Valgrind is fantastic.
|
| Memcheck decreases the memory safety problem of C++ by about 80%
| in my experience - it really is a big deal. The compiler-based
| tools that require recompiling every library used are a bit
| impractical for large stacks such as the ones under Qt-based GUI
| applications. Several libraries, several build systems. But I
| hear that they are popular for CI systems in large projects such
| as web browsers, which probably have dedicated CI developers.
| There are also some problems, rare in my experience, that these
| tools can find and Memcheck can't, because the necessary
| information isn't available in compiled code. Still, Memcheck
| has the largest coverage by far.
|
| Callgrind and Cachegrind give very precise, repeatable results,
| complementary to but not replacing perf and AMD / Intel tooling
| which use hardware performance counters. I tend to use all of
| them. They all work without recompiling.
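|
| For anyone who hasn't tried it: no instrumentation or rebuild is
| needed, a plain debug build is enough. A toy example (mine,
| nothing to do with any real codebase) that Memcheck flags right
| away:
|
|       /* toy.c -- gcc -g toy.c -o toy && valgrind ./toy */
|       #include <stdlib.h>
|
|       int main(void)
|       {
|           int *a = malloc(4 * sizeof *a);
|           a[4] = 1;      /* heap overflow: "Invalid write of size 4" */
|           free(a);
|           int x = a[0];  /* use after free: "Invalid read of size 4" */
|           free(a);       /* double free: "Invalid free()" */
|           return x;
|       }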
| Linda703 wrote:
| ssrs wrote:
| I've used valgrind quite extensively. A big thank you to the
| folks behind this!
| cjbprime wrote:
| I wish I hadn't read this article because now I know that I've
| been mispronouncing Valgrind for nearly 20 years but I'm not
| going to stop.
|
| (Kidding. Thanks for Valgrind! I still use it for assessing
| memory corruption vulnerabilities along with ASan.)
| hgs3 wrote:
| It's giving me flashbacks to the hard G vs soft G in gif image
| format.
| stormbrew wrote:
| Fwiw I've literally worked with Nicholas (but not on valgrind)
| and I only learned this today somehow.
| klyrs wrote:
| I learned of the tool from a native German speaker who
| pronounced it wall-grinned, which is apparently half-right.
| Like latex, I can't keep the pronunciation straight from one
| sentence to the next.
| dtgriscom wrote:
| I've been promoting proper pronunciation of Valgrind at work,
| and am making passable progress...
| quickthrower2 wrote:
| Valarie smiled. Is how I will remember it.
|
| That said I sometimes get the "V" tools mixed up (Vagrant,
| Valgrind, Varnish)
| koolba wrote:
| What other ways are there to (mis)pronounce it?
| dahart wrote:
| There are so many amazing ways! ;)
|
| Since it's an old Norse word, try using Google Translate to
| hear what happens in Danish, Dutch, German, Icelandic,
| Norwegian, and Swedish. I don't know if it's a modern word in
| those languages, but Translate is showing translations
| "election gate" for several languages, and "fall gravel" for
| Swedish.
|
| According to the audio pronunciations on Translate...
|
| Danish: "vale grint", long a, hard tapped r, hard d sounds
| like t
|
| Dutch: sounds like "fall hint" but there's a slight throaty r
| in there hard to hear for English speakers, so maybe "hrint"
|
| German: "val grinned", val like value, grinned with the
| normal German r
|
| Icelandic: "vall grint", vall like fall, hard tapped r
|
| Norwegian: "vall grin", hard tapped r, almost "vall g'din",
| silent or nearly silent d/t at the end.
|
| Swedish: "voll grint / g'dint", hard tapped r, hard d
|
| German is the only one that has "Val" like "value", all the
| rest sound more like "fall". The word valgrind is the door to
| Valhalla, which means literally "fall hall", as in hall of
| the fallen. For that reason, I suspect it makes the most
| sense to pronounce valgrind like "fall grinned", but Old
| Norse might have used val like value, I'm not sure.
|
| BTW Valhalla has an equally amusing number of ways to
| pronounce it across Germanic languages, "val" sometimes turns
| into what sound like "fell" instead of "fall", and in
| Icelandic the double ell makes it fall-hat-la.
|
| Languages are cool!
| meowface wrote:
| Pronouncing the "-grind" like the word "grind". I think
| that's probably how most English-speakers first assume it's
| pronounced.
| Buttons840 wrote:
| Safe to assume many pronounce grind as "grind".
| opan wrote:
| How do you pronounce it? I hoped it'd be near the start, but
| several paragraphs in and I'm still not sure.
|
| edit: val as in value + grinned
| dietr1ch wrote:
| Now you are really on track to mispronounce Valgrind for nearly
| 21 years :P
| galangalalgol wrote:
| Our pipelines have asan (and cppcheck, clang-tidy, coverity, and
| coverage stuff) but no valgrind. Is there something it is good
| at that we are missing?
| gkfasdfasdf wrote:
| If your tests can take the performance hit, Valgrind would
| tell you about uninitialized memory reads, which isn't
| covered by those tools you mentioned. If however, you are
| able to add MSAN (i.e. able to rebuild the entire product,
| including dependencies, with -fsanitize=memory) to the
| pipeline, then you would have the same coverage as Valgrind.
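|
| A minimal example of the kind of bug only Memcheck or MSAN will
| catch (toy code, not from anything real): Memcheck reports
| "Conditional jump or move depends on uninitialised value(s)" for
| this with no rebuild at all, while ASAN stays silent.
|
|       #include <stdio.h>
|       #include <stdlib.h>
|
|       int main(void)
|       {
|           int *flags = malloc(8 * sizeof *flags); /* never written */
|           if (flags[3])              /* branch on uninitialised data */
|               puts("flag 3 set");
|           free(flags);
|           return 0;
|       }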
| cjbprime wrote:
| The main reason for Valgrind would be if you're working with
| a binary that you can't recompile to add the ASAN
| instrumentation.
| Jason_Gibson wrote:
| ASAN on its own doesn't detect uninitialized memory. MSAN
| can, though. Valgrind is also more than just the memcheck
| sub-tool - there are others, like Cachegrind, which is a
| cache and branch-prediction profiler.
|
| https://github.com/google/sanitizers/wiki/AddressSanitizerCo...
| https://github.com/google/sanitizers/wiki/MemorySanitizer
| https://valgrind.org/docs/manual/manual.html
| [deleted]
| glouwbug wrote:
| Yeah, valgrind can report L1/L2 cache misses and report the
| percentage of branch mispredictions. It also reports the
| exact number of instructions processed, and how many of those
| instructions cache missed. It's great for improving small
| code that needs to be performant.
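|
| The classic toy case where the numbers jump out (my own
| simplified example): walking a matrix column-wise instead of
| row-wise. Run it under valgrind --tool=cachegrind and feed the
| output to cg_annotate; the two loops have near-identical
| instruction counts but wildly different D1 miss counts.
|
|       /* stride.c -- gcc -g stride.c &&
|        *             valgrind --tool=cachegrind ./a.out */
|       #define N 2048
|       static int m[N][N];
|
|       int main(void)
|       {
|           long sum = 0;
|           for (int i = 0; i < N; i++)   /* row-major: cache friendly */
|               for (int j = 0; j < N; j++)
|                   sum += m[i][j];
|           for (int j = 0; j < N; j++)   /* column-major: thrashes D1 */
|               for (int i = 0; i < N; i++)
|                   sum += m[i][j];
|           return (int)(sum & 1);
|       }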
|
| I'd use asan over valgrind only for memory leaks. It's
| faster.
| Sesse__ wrote:
| If you only want memory leaks, LSan will do that for you.
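|
| (A toy example of what I mean, not from real code: build it with
| -fsanitize=leak and run it, or run the plain binary under
| valgrind --leak-check=full; both point at the same malloc, LSan
| just gets there far faster.)
|
|       #include <stdlib.h>
|       #include <string.h>
|
|       static char *make_greeting(void)
|       {
|           char *p = malloc(32);
|           strcpy(p, "hello");
|           return p;            /* caller is supposed to free this */
|       }
|
|       int main(void)
|       {
|           make_greeting();     /* result dropped: 32 bytes leaked */
|           return 0;
|       }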
|
| In general, I tend to use ASan for nearly everything I used
| Valgrind for back in the day; it's faster and usually more
| precise (Valgrind cannot reliably detect small overflows
| between stack variables). Valgrind if I cannot recompile,
| or if ASan doesn't find the issue. Callgrind and Cachegrind
| never; perf does a much better job, much faster. DHAT
| never; Heaptrack gives me what I want.
|
| Valgrind was and is a fantastic tool; it became part of my
| standard toolkit together with the editor, compiler,
| debugger and build system. But technology has moved on for
| me.
| gpderetta wrote:
| Amen. Between the various sanitizers and perf, I stopped
| needing valgrind a few years ago.
|
| But when it was the only option it was fantastically
| useful.
| themulticaster wrote:
| If I understand correctly valgrind (cachegrind) reports
| L1/L2 cache misses based on a simulated CPU/cache model.
|
| On Linux, you can easily instrument real cache events using
| the very powerful perf suite. There is an overwhelming
| number of events you can instrument (use perf-list(1) to
| show them), but a simple example could look like this:
|     $ perf stat -d -- sh -c 'find ~ -type f -print | wc -l'
|     ^Csh: Interrupt
|
|     Performance counter stats for 'sh -c find ~ -type f -print | wc -l':
|
|          47,91 msec task-clock        #  0,020 CPUs utilized
|            599      context-switches  # 12,502 K/sec
|             81      cpu-migrations    #  1,691 K/sec
|            569      page-faults       # 11,876 K/sec
|    185.814.947      cycles            #  3,878 GHz             (28,71%)
|    105.650.405      instructions      #  0,57 insn per cycle   (46,15%)
|     22.991.322      branches          # 479,863 M/sec          (46,72%)
|        643.767      branch-misses     #  2,80% of all branches (46,14%)
|     26.010.223      L1-dcache-loads   # 542,871 M/sec          (36,80%)
|      2.449.173      L1-dcache-load-misses #  9,42% of all L1-dcache accesses (29,62%)
|        517.052      LLC-loads         # 10,792 M/sec           (22,53%)
|        133.152      LLC-load-misses   # 25,75% of all LL-cache accesses (16,02%)
|
|    2,403975646 seconds time elapsed
|
|    0,005972000 seconds user
|    0,046268000 seconds sys
|
| Ignore the command, it's just a placeholder to get
| meaningful values. The -d flag adds basic cache events, by
| adding another -d you also get load and load miss events
| for the dTLB, iTLB and L1i cache.
|
| But as mentioned, you can instrument any event supported by
| your system. Including very obscure events such as
| uops_executed.cycles_ge_2_uops_exec (Cycles where at least
| 2 uops were executed per-thread) or
| frontend_retired.latency_ge_2_bubbles_ge_2 (Retired
| instructions that are fetched after an interval where the
| front-end had at least 2 bubble-slots for a period of 2
| cycles which was not interrupted by a back-end stall).
|
| You can also record data using perf-record(1) and inspect
| them using perf-report(1) or - my personal favorite - the
| Hotspot tool (https://github.com/KDAB/hotspot).
|
| Sorry for hijacking the discussion a little, but I think
| perf is an awesome little tool and not as widely known as
| it should be. IMO, when using it as a profiler (perf-
| record), it is vastly superior to any language-specific
| built-in profiler. Unfortunately some languages (such as
| Python or Haskell) are not a good fit for profiling using
| perf instrumentation as their stack frame model does not
| quite map to the C model.
| harry8 wrote:
| I was introduced to valgrind by Andrew Tridgell during the main
| content of a vaguely famous lecture he gave that finished with
| the audience collectively writing a shellscript bitkeeper
| client [1] demonstrating beyond doubt that Tridge had not in
| any way acted like a "git" when bitkeeper's licenseholder
| pulled the license for the linux kernel community.
|
| Tridge said words to the effect of "if you program in C and you
| aren't using valgrind you flipping should be!" And went
| on to talk about how some projects like to have a "valgrind
| clean" build the same way they compile without warnings and
| that it's a really useful thing. As ever well expressed with
| examples from samba development.
|
| He was obviously right and I started using valgrind right there
| in the lecture theatre. apt-get install is a beautiful thing.
|
| He pronounced it val grind like the first part of "value" and
| "grind" as in grinding coffee beans. I haven't been able to
| change my pronunciation since then regardless of it being
| "wrong".
|
| [1] https://lwn.net/Articles/132938/
|
| Corbet's account of this is actually wrong in the lwn link
| above, as noted by akumria in the comments below it. Every
| single command and suggestion came from the audience, starting
| with telnetting to Ted Tso's bitkeeper ip & port that he made
| available for the demo. Typing "help" came from the audience,
| as did using netcat and the entire nc command. The audience
| wrote the bitkeeper client in 2 minutes, with tridge doing no
| more than encouraging, typing, and pointing out that the
| "tridge is a wizard reverse engineer who has used his powers
| for evil" line was clearly just some "wrong thinking." Linus
| claimed thereafter that Git was named after himself and not
| Tridge.
| throwawaylinux wrote:
| Tridgell is possibly the most intelligent person I've ever
| met, and I've met Torvalds and a bunch of other Linux
| developers -- not that they aren't intelligent too, among
| them might be a challenger to that title.
|
| Tridge has a way of explaining complicated ideas in a way
| that pares them down to their essence and helps you to
| understand them that just really struck me (a smart person is
| able to talk about a complicated thing in a way that makes
| you feel dumb, a _really_ smart person is able to talk about
| a complicated thing in a way that makes you feel like a
| genius). As well as the ability and intellectual curiosity to
| jump seemingly effortlessly across disciplines.
|
| And he's a fantastic and very entertaining public speaker.
| Highly recommend any talk he gives.
| glandium wrote:
| I've known the right pronunciation for about 10 years. I still
| say it wrong.
| mynegation wrote:
| I am old enough that I started with Purify and I used Valgrind
| starting from the version 1.0, because Purify was commercial and
| Solaris only. It saved my behind multiple, multiple times.
| cpeterso wrote:
| And BoundsChecker was also great!
|
| https://en.m.wikipedia.org/wiki/BoundsChecker
| sumtechguy wrote:
| That tool saved me tons of time tracking down bugs. It also
| taught me to be a better C/C++ programmer. Run-time
| sanitizers like Purify/Valgrind/BoundsChecker do not tolerate
| poor C code. What is kind of cool is you can find whole
| classes of bugs in your code. Because as devs, once we get
| something working we tend to copy and paste that pattern
| everywhere, so if you find a bug in one place you will probably
| find it in a few dozen other places in your codebase.
| hn_go_brrrrr wrote:
| I worked at a company 11 years ago that was still using Purify!
| unmole wrote:
| I used Purify 8 years ago. On Windows. I don't remember the
| specifics but the company kept a few XP machines around just
| so they could continue using Purify.
| pjmlp wrote:
| Purify fanboy over here.
| atgreen wrote:
| Purify was an amazing tool. I recently noticed that one of my
| libraries (libffi) still has an --enable-purify configure
| option, although it probably hasn't been exercised in.. 20
| years? A Purify patent prevented work-alikes for many years,
| but valgrind eventually emerged as a more-than-worthy
| successor.
|
| Fun fact: the creator of Purify went on to found Netflix and is
| still their CEO.
| mynegation wrote:
| Ha! And I thought that the same person having written bzip2 and
| Valgrind was my surprise for the day.
| snovv_crash wrote:
| And Ardupilot
| edsiper2 wrote:
| First of all, congratulations to Valgrind and the team behind
| it! This is an essential tool that has helped me personally over
| the years while developing.
|
| What needs to be done to get Valgrind binaries available for
| MacOS (M1)? From a company perspective we are happy to support
| this work. If you know who's interested and can accomplish this,
| pls drop me an email to eduardo at calyptia dot com.
| RustyRussell wrote:
| I once submitted a bug fix for an obscure issue to valgrind. They
| asked for a test case, which I managed to provide, but I was a
| bit nervous as I couldn't immediately see how to fit in their
| test suite.
|
| The response from Julian Seward was so nice it set a permanently
| high bar for me when random people I don't know report bugs on my
| projects!
|
| We still run our entire testsuite under valgrind in CI. Amazing
| tool!
| sealeck wrote:
| What was the response?
| vlmutolo wrote:
| > I still use Cachegrind, Callgrind, and DHAT all the time. I'm
| amazed that I'm still using Cachegrind today, given that it has
| hardly changed in twenty years. (I only use it for instruction
| counts, though. I wouldn't trust the icache/dcache results at all
| given that they come from a best-guess simulation of an AMD
| Athlon circa 2002.)
|
| I'm pretty sure I've seen people using the icache/dcache miss
| counts from valgrind for profiling. I wonder how unreliable these
| numbers are.
| andrewf wrote:
| https://sqlite.org/cpu.html#microopt -
|
| _Cachegrind is used to measure performance because it gives
| answers that are repeatable to 7 or more significant digits. In
| comparison, actual (wall-clock) run times are scarcely
| repeatable beyond one significant digit [...] The high
| repeatability of cachegrind allows the SQLite developers to
| implement and measure "microoptimizations"._
|
| There's a bunch of ways for caches to behave differently but
| have they changed much over the past 20 years? i.e. is the
| difference between [2022 AMD cache, 2002 AMD cache]
| significantly greater than the difference between [2002 PowerPC
| G4 cache, 2002 AMD cache, 2002 Intel cache] ?
| BeefWellington wrote:
| I would guess yes, just based on the L1/L2 (later L3) use and
| sizing between all those systems. 2002 vs 2022 is K8 vs
| 5800X3D for AMD, so you're looking at having 1 core and
| 64+64KB of L1 cache, 512KB of L2 cache[1] vs 8 cores (+ht)
| and 32+32KB L1 _per core_ , 512KB L2 _per core_ , 96MB L3.
|
| Just managing the cache access between L2 and L3 I think
| would be additional consideration, but then you have to
| consider the actual architectural differences and on server
| chips locality will matter quite a bit.
|
| [1]: https://en.wikipedia.org/wiki/Athlon_64
| tux3 wrote:
| I don't know how sophisticated the streaming/prefetch/access
| pattern prediction of 2002 CPUs was.
|
| I'm speculating, but if that's not modeled, cachegrind may
| pessimize some less simple but still predictable patterns and
| report a lot of misses where the CPU would have been able to
| prefetch.
| andrewf wrote:
| Agreed, I suspect it'd be most accurate to say the SQLite
| folks are minimizing their working set.
|
| I picked a couple of random performance commits out of
| their code repo, and they look like they might keep 1 or 2
| lines out of i-cache:
| https://sqlite.org/src/info/f48bd8f85d86fd93
| https://sqlite.org/src/info/390717e68800af9b
| t43562 wrote:
| I was working on an application for Symbian mobile phones and I
| was able to implement large parts of it as a portable library -
| the bits which compressed results using a dictionary to make them
| tiny enough to fit into an SMS message or a UDP frame. This was
| before the days of flat-rate charges for internet access and we
| were trying to be very economical with data.
|
| I was able to build and debug them on Linux with Valgrind finding
| many stupid mistakes and the library worked flawlessly on
| Symbian.
|
| It's just one of the many times that Valgrind has saved my bacon.
| It's awesome.
| nicoburns wrote:
| Well damn, no wonder he's so good at optimising the Rust
| compiler. He literally has a PhD in profiling tools!
| bayindirh wrote:
| I still use Valgrind memcheck for memory leak verification of a
| large piece of code I have developed, with a long end-to-end
| test.
|
| Also, it has a nice integration with Eclipse which reflects the
| Valgrind memcheck output to the source files directly, enabling
| you to see where problems are rooted.
|
| All in all, Valgrind is a great toolset.
|
| P.S.: I was pronouncing Valgrind correctly! :)
| gkhartman wrote:
| Many thanks for Valgrind. I can honestly say that it helped me
| become a better C++ programmer.
| sharmin123 wrote:
| compiler-guy wrote:
| I sort of owe callgrind a big chunk of my career.
|
| I was working at a company full of PhDs and well seasoned
| veterans, who looked at me as a new kid, kind of underqualified
| to be working in their tools group. I had been at the firm for a
| while, and they were nice enough, but didn't really have me down
| as someone who was going to contribute as anything other than a
| very junior engineer.
|
| We had a severe problem with a program's performance, and no one
| really had any idea why. And as it was clearly not a
| sophisticated project, I got assigned to figure something out.
|
| I used the then very new callgrind and the accompanying
| flamegraph, and discovered that we were passing very large bit
| arrays for register allocation _by value_. Very, very large. They
| had started small enough to fit in registers, but over time had
| grown so large that a function call to manipulate them
| effectively flushed the cache, and the rest of the code assumed
| these operations were cheap.
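|
| (The real code is long gone, but schematically the problem was
| something like the first function below, with the struct having
| quietly grown to kilobytes:)
|
|       #define NBITS (1 << 16)
|       struct bitset { unsigned char bits[NBITS / 8]; }; /* 8 KB now */
|
|       /* by value: copies the whole 8 KB on every call,
|        * evicting whatever the caller had in cache */
|       static int test_v(struct bitset b, int i)
|       {
|           return (b.bits[i / 8] >> (i % 8)) & 1;
|       }
|
|       /* by pointer: no copy, the hot cache lines stay put */
|       static int test_p(const struct bitset *b, int i)
|       {
|           return (b->bits[i / 8] >> (i % 8)) & 1;
|       }
|
|       int main(void)
|       {
|           static struct bitset live;
|           int n = 0;
|           for (int i = 0; i < NBITS; i++)
|               n += test_v(live, i);   /* ~0.5 GB of copying total */
|           for (int i = 0; i < NBITS; i++)
|               n += test_p(&live, i);
|           return n;
|       }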
|
| Profiling tools at the time were quite primitive, and the
| application was a morass of shared libraries, weird dynamic
| allocations and JIT, and a bunch of other crap.
|
| Valgrind was able to get the profiles after failing with
| everything else I could try.
|
| The presentation I made on that discovery, and my proposed fixes
| (which eventually sped everything up greatly), finally earned
| the respect of my colleagues, and having no PhD wasn't a big
| deal after that. Later on, those colleagues who had left the
| company invited me to my next gig. And the one after that.
|
| So thanks!
| LAC-Tech wrote:
| I love this story. I'm becoming an older dev now and I've often
| been blindsided by some insight or finding by juniors - it's
| really great to see & you've always got to make sure they get
| credit!
| intelVISA wrote:
| Always find it weird when people berate C++ tooling, Valgrind
| and adjacent friends are legitimately best in class and
| incredibly useful. Between RAII and a stack of robust static
| analyzers you'd have to deliberately write unsafe code these
| days.
| nicoburns wrote:
| That sounds great until you realise in other languages you
| get that by default without any tooling. And with better
| guarantees too (C++ static analysers aren't foolproof).
|
| Where C++ tooling really lacks is around library management
| and build tooling. The problem is less that any of the
| individual tools don't work and more that there are many of
| them and they don't interoperate nicely.
| bluGill wrote:
| What language has anything like cachegrind, which is
| the topic of this thread? Cache misuse is one of the
| largest causes of bad performance these days, and I can't
| think of any language that has anything built in for that.
|
| Sure other languages have some nice tools to do garbage
| collection (so does C++, but it is optional, and reference
| counting does have drawbacks), but there are a lot more to
| tooling than just garbage collection. Even rust's memory
| model has places where it can't do what C++ can. (you can't
| use atomic to write data from two different threads at the
| same time)
|
| No language has good tools around libraries and builds. So
| long as you stick to exactly one language with the build
| system of that language, things seem nice. However in the
| real world we have a lot of languages, and a lot of
| libraries that already exist. Let me know how I can use
| any build/library tool with this library that builds with
| autotools, this other one with cmake, here is one with
| qmake (though at least qt is switching to cmake, which is
| becoming the de-facto c++ standard), just to name a few
| that handle dependencies in very different ways.
| njs12345 wrote:
| > Even rust's memory model has places where it can't do
| what C++ can. (you can't use atomic to write data from
| two different threads at the same time)
|
| Perhaps not in safe Rust, but can you provide an example
| of something Rust can't do that C++ can? It has the same
| memory model as C++20:
| https://doc.rust-lang.org/nomicon/atomics.html
| seoaeu wrote:
| You totally can in safe rust:
| https://doc.rust-lang.org/std/sync/atomic/struct.AtomicU64.h...
| intelVISA wrote:
| To be fair, as an outsider to both Rust and JS they seem
| to have pretty robust package management between cargo
| and npm, although npm is kinda cheating as collating
| scripts isn't quite as complex as building binaries,
| whereas pip is absolutely unbearable with all the virtual
| env stuff.
|
| I've been quite lucky with CMake, after the initial
| learning period I've found everything "just works" as it
| is quite well supported by modern libs.
| jerf wrote:
| I've mentioned this before on HN as a way for a "newbie" to
| look like a superhero in a job very quickly; nice to hear a
| story of it actually working!
|
| There is _so much_ code in the world that nobody has even so
| much as _glanced_ at a profile of, and any non-trivial,
| unprofiled code base is virtually guaranteed to have some kind
| of massive performance problem that is also almost trivial to
| fix like this.
|
| Put this one in your toolbelt, folks. It's also so fast that
| you can easily try it without having to "schedule" it, and if
| I'm wrong and there aren't any easy profiling wins, hey, nobody
| has to know you even looked. Although in that case, you just
| learned something about the quality of the code base; if there
| aren't any profiling quick wins, that means someone else
| claimed them. As the codebase grows the probability of a quick
| win being available quickly goes to 1.
| nullify88 wrote:
| I have a similar experience with xdebug for a PHP shop I used
| to work at. It feels very similar to being a nerd back at
| school, rescuing people's homework, and being rewarded with
| some respect.
| azurezyq wrote:
| I have a very similar experience, but with a different
| profiling tool. When I first graduated from school and joined a
| big internet company, I wasn't that "different". The serving
| stack was all in C++. My colleagues were really capable but not
| that into "tools", they'd rather depend on themselves (guess,
| tune, measure).
|
| But I, as a fresh member in the team, learned and introduced
| Google perftools to the team and did a presentation of the
| breakdown of the running time of the big binary. I have to say
| that presentation was a life-changing moment in my career.
|
| So together with you, I really want to thank those who devoted
| heavily into building these tools. When I was doing the
| presentation, I really felt standing on the shoulders of giants
| and those giants were helping me.
|
| And over years, I used more and more tools like valgrind,
| pahole, asan, tsan.
|
| Much appreciated!
| dijonman2 wrote:
| I'm surprised to see the attribution to the tools and not your
| proposed fixes. Sure the discovery was the first step in the
| order of operations, but can you elaborate on what enabled you
| to understand the problem statement and subsequent resolution?
|
| There has to be a deeper understanding I think
| imetatroll wrote:
| Sounds like the solution probably had something to do with
| switching to passing by reference + other changes I would
| assume.
| intelVISA wrote:
| A big pain point for using coroutines is having to pass-by-
| value more frequently due to uncertain lifetimes.. it's
| jarring when you come from zero copy programming.
| cbrogrammer wrote:
| That is what many people fail to understand about why we C
| programmers dislike C++
| pjmlp wrote:
| Indeed, because languages with reference parameters
| precede C by about 15 years, and are present in most
| ALGOL derived dialects.
| azurezyq wrote:
| I can share mine. It's an ads retrieval system. Latency is
| very sensitive and it has to be efficient. To avoid memory
| allocations, special hashtables with a fixed number of buckets
| (also open addressing) are used in multiple places in query
| processing. The default is 1000. However, there are cases where
| the number of elements is only a handful, and in those cases it
| fails to utilize the cache, hence it is slower.
|
| The solution is to tune number of buckets from info derived
| from the pprof callgraph.
|
| There were others too, like redundant serialization, etc. But
| this one is the most interesting.
| alexott wrote:
| I also heavily used callgrind/cachegrind to tune critical
| paths in our high performance web proxy, where each
| microsecond/millisecond counts... For example, in media type
| detection that is called multiple times per request
| (minimum twice, for request/response), etc.
| zasdffaa wrote:
| That's surprising. If I was writing this I'd have
| instrumented the code for the buckets to (optionally) log
| the use, and probably add an alert.
|
| (being an armchair expert is easy though)
| mukundesh wrote:
| Using Cachegrind to get hardware independent performance numbers
| (https://pythonspeed.com/articles/consistent-benchmarking-in-...)
|
| Also used by SQLite in their performance measurement
| workflow(https://sqlite.org/cpu.html#performance_measurement)
| [deleted]
| anewpersonality wrote:
| Is Valgrind any use in Rust?
| pjmlp wrote:
| Depends on how many unsafe code blocks you make use of.
| jackosdev wrote:
| I work full-time with Rust, use it all the time to see how much
| memory is being allocated to the heap, make a change and then
| see if there's a difference, and also for cache misses:
|
|     valgrind target/debug/rustbinary
|
|     ==10173== HEAP SUMMARY:
|     ==10173==     in use at exit: 854,740 bytes in 175 blocks
|     ==10173==   total heap usage: 2,046 allocs, 1,871 frees, 3,072,309 bytes allocated
|     ==10173==
|     ==10173== LEAK SUMMARY:
|     ==10173==    definitely lost: 0 bytes in 0 blocks
|     ==10173==    indirectly lost: 0 bytes in 0 blocks
|     ==10173==      possibly lost: 1,175 bytes in 21 blocks
|     ==10173==    still reachable: 853,565 bytes in 154 blocks
|     ==10173==         suppressed: 0 bytes in 0 blocks
|     ==10173== Rerun with --leak-check=full to see details of leaked memory
|
|     valgrind --tool=cachegrind target/debug/rustbinary
|
|     ==146711==
|     ==146711== I   refs:      1,054,791,445
|     ==146711== I1  misses:       11,038,023
|     ==146711== LLi misses:           62,896
|     ==146711== I1  miss rate:          1.05%
|     ==146711== LLi miss rate:          0.01%
|     ==146711==
|     ==146711== D   refs:    793,113,817  (368,907,959 rd + 424,205,858 wr)
|     ==146711== D1  misses:      757,883  (    535,230 rd +     222,653 wr)
|     ==146711== LLd misses:      119,285  (     49,251 rd +      70,034 wr)
|     ==146711== D1  miss rate:       0.1% (        0.1%   +         0.1%  )
|     ==146711== LLd miss rate:       0.0% (        0.0%   +         0.0%  )
|     ==146711==
|     ==146711== LL refs:      11,795,906  ( 11,573,253 rd +     222,653 wr)
|     ==146711== LL misses:       182,181  (    112,147 rd +      70,034 wr)
|     ==146711== LL miss rate:        0.0% (        0.0%   +         0.0%  )
| rwmj wrote:
| Not used it with Rust, but have used it with OCaml, Perl, Ruby,
| Tcl successfully. In managed languages it's mainly useful for
| detecting problems in C bindings rather than the language
| itself. Languages where it doesn't work well: Python and
| Golang.
| pjmlp wrote:
| > Speaking of software quality, I think it's fitting that I now
| work full time on Rust, a systems programming language that
| didn't exist when Valgrind was created, but which basically
| prevents all the problems that Memcheck detects.
|
| Just like Ada has been doing since 1983.
| oconnor663 wrote:
| My understanding is that dynamically freeing memory is an
| unsafe operation in Ada, do I have that right?
| pjmlp wrote:
| Depends on which dynamic memory you are talking about.
|
| Ada can manage dynamic stacks, strings and arrays on its own.
|
| For example, Ada has what one could call type safe VLAs,
| instead of corrupting the stack like C, you get an exception
| and can redo the call with a smaller size, for example.
|
| As for explicit heap types and _Ada.Unchecked_Deallocation_ ,
| yes if we are speaking about Ada 83.
|
| Ada 95 introduced controlled types, which via Initialize,
| Adjust, and Finalize, provide the basis of RAII like features
| in Ada.
|
| Here is an example on how to implement smart pointers with
| controlled types,
|
| https://www.adacore.com/gems/gem-97-reference-counting-in-ad...
|
| There is also the possibility to wrap heap allocation
| primitives with safe interfaces exposed via storage pools,
| like in this tutorial:
| https://blog.adacore.com/header-storage-pools
|
| Finally thanks to SPARK, nowadays integrated into Ada
| 2012[0], you can also have formal proofs that it is safe to
| release heap memory.
|
| On top of all this, Ada is in the process of integrating
| affine types as well.
|
| [0] - Supported in PTC and GNAT, remaining Ada compilers have
| a mix of Ada 95 - 2012 features, see
| https://news.ycombinator.com/item?id=27603292
| touisteur wrote:
| That said, I still use valgrind because we have to
| integrate C libraries sometimes (libpcl is my favorite
| culprit, only because I'm trying), and there's still the
| possibility of blowing the stack (yeah, you can use gnatstack
| to get a good idea of your maximum stack size, but it
| doesn't cover the whole Ada featureset, and stack canaries /
| -fstack-check don't catch everything).
|
| _edit_ Also massif, call/cachegrind and helgrind have
| saved our bacon many, many times.
|
| Even more interesting is writing your own tools with
| valgrind. Here
| https://github.com/AdaCore/gnatcoverage/tree/master/tools/gn...
| is the code of a branch-trace adapter
| for valgrind (outputs all branches taken/not-taken in
| 'qemu' format). Very useful if you can't run a pintool or
| Intel Processor Trace just for that.
|
| And if you keep digging, the angr symbolic execution
| toolkit uses (used?) VEX as an intermediate representation.
| _end of edit_
|
| Ada doesn't catch uninitialized variables by default
| (although warnings are getting better). You can either go
| Spark 'bronze level' (dataflow proof, every variable is
| initialized) or use 'pragma Initialize_Scalars' combined
| with -gnatVa.
|
| Some of these techniques are described in that now old blog
| post full of links,
| https://blog.adacore.com/running-american-fuzzy-lop-on-your-...
| (shameless plug), where one can infer that even proof of
| absence of runtime errors isn't a panacea and fuzzing still
| has its use even on fully-proved SPARK code.
| lma21 wrote:
| When we moved to Linux, Valgrind was THE tool that saved our
| as*s day after day after day. An issue in production? Rollback,
| valgrind, fix, push, repeat. Thank you for all the hard work;
| in fact I don't think I can thank you enough.
| junon wrote:
| Valgrind's maintainers are super pleasant and have been quite
| helpful in a number of cases where I've personally had to reach
| out to them.
|
| Lovely piece of software, to which I owe a lot of gratitude.
| amelius wrote:
| Are people using Valgrind on Python packages?
|
| It seems some packages (even basic ones) are not compatible with
| Valgrind, thereby spoiling the entire debugging experience.
| Olumde wrote:
| Happy birthday Valgrind. Next year you'll be able to drink in the
| US!
|
| Being a UK PhD holder, a sentence that stood out to me was the
| commentary/comparison between UK and US PhDs: "This was a three
| year UK PhD, rather than a brutal six-or-more year US PhD."
|
| My cousin has a US PhD and, judging from what he tells me, it
| is a lot more rigorous than a UK PhD.
| wenc wrote:
| The UK PhD is 3 yrs, after a 1 yr Masters and 3 yr bachelors.
| (7 years)
|
| The US PhD is usually 4-5 years after a 4 year bachelors (8-9
| years). It is a little bit longer with more graduate-level
| coursework.
|
| That said, the US bachelors starts at age 17 while a UK
| bachelors starts after 2 years of A-levels. So in terms of
| length it's a wash.
| pbhjpbhj wrote:
| FWIW, you have to be slightly careful as Scotland has a
| different post-16 education provision.
|
| AIUI you can do Highers (equivalent to GCSE, at 16) and enter
| Uni then with sufficiently high grades (aged 16/17). Or, stay
| on for one more year to do Advanced Higher (most common). Uni
| courses can then be 4 or occasionally 3 years. Don't quote
| me!
| piker wrote:
| US college starts around age 18, which I understand is about
| the time A-levels are completed, so I believe there are 2
| more years of education associated with a US PhD.
| not2b wrote:
| It took me four years for my US PhD, but I had a masters and
| industrial experience which might have helped speed things up.
| nneonneo wrote:
| Hah, I teach my students to use Valgrind, _and_ I've been
| pronouncing it wrong this whole time. Guess I'll have to make
| sure to get that right next semester :)
|
| The magic of Valgrind really lies in its ability to detect errors
| without recompiling the code. Sure, there's a performance hit,
| but sometimes all you have is a binary. It's damn solid on Linux,
| and works even with the custom threading library we use for the
| course; shame the macOS port is barely maintained (last I
| checked, it only worked on OSes from a few years back - anything
| more recent will execute syscalls during process startup that
| Valgrind doesn't handle).
| amelius wrote:
| One problem with Valgrind is that the thing you're debugging
| should have been tested with Valgrind from the start, otherwise
| you're just going to be flooded with false triggers.
|
| Now imagine that you're developing a new application and you want
| to use some library, and it _hasn't_ been tested with valgrind
| and generates tons of false messages. Should you then use it? Or
| look for an alternative library?
| Sesse__ wrote:
| I live not far from Valgrindvegen (Valgrind road); I've always
| wondered whether the developers knew it existed. :-)
| j1elo wrote:
| Valgrind is an amazingly useful tool. The biggest pain point,
| though, has always been to read through and process the huge
| amount of false positives that typically come from 3rd-party
| support libraries, such as GLib. It provides some suppression
| files to be used with Valgrind, but still, GLib has its own
| memory allocator, so things tend to go awry.
|
| Running Helgrind or DRD (for threading issues) with GLib has been
| a bit frustrating, too. If anyone has some advice to share about
| this, I'm all ears!
|
| (EDIT: I had mistakenly left out the phrase about suppression
| files)
| whimsicalism wrote:
| It's unfortunate that so many of these great tools (like `perf`
| and I believe `valgrind`) are basically not available locally on
| the Mac.
|
| And running in a container is not really a solution for most of
| these.
| wyldfire wrote:
| Sanitizers and electric fence are ultra portable, they're
| definitely available on macos. The feature set from valgrind is
| a bit richer but not by much.
| whimsicalism wrote:
| I am not familiar with electric fence but I remember from my
| experience that there are definitely important things that I
| got from `perf` and `valgrind` that the alternative
| sanitizers did not provide. Can't recall what now of course.
| nyanpasu64 wrote:
| asan/ubsan do not detect uninitialized memory reads (though
| ubsan can detect when bools take on invalid bit patterns
| from uninitialized memory), and msan requires rebuilding
| the standard library or something, so I've never used msan.
| Valgrind is slow, but detects uninitialized memory reads
| properly, and doesn't require rebuilding the app (which is
| useful when running a complex or prebuilt app for short
| periods of time).
|
| On the topic of profiling, callgrind can count exact
| function calls and generate accurate call graphs, which I
| find useful for not only profiling, but tracing the
| execution of unfamiliar code. I just wish rr had similarly
| fast tooling (pernosco is close enough to be useful, but I
| think there's value in exploring different workflows than
| what they picked).
| 1over137 wrote:
| >msan requires rebuilding the standard library or
| something
|
| Yes, which is a PITA. But even then, macOS is not
| supported anyway:
|
| https://clang.llvm.org/docs/MemorySanitizer.html#supported-p...
| dwroberts wrote:
| Valgrind does a lot of low level trickery so it hasn't always
| supported the latest macOS releases straight away (or
| sometimes would support them with serious
| gotchas/limitations)
| glandium wrote:
| valgrind is available on mac. From the homepage: "It runs on
| the following platforms: (...) X86/Darwin and AMD64/Darwin (Mac
| OS X 10.12).". There's a notable omission of ARM64/Darwin in
| there, and I don't think it's an oversight.
|
| What Mac is definitely lacking, though, is reverse debugging.
| Linux has rr, Windows has Time Travel Debugging. macOS still
| doesn't have an equivalent.
| saagarjha wrote:
| Valgrind, as I understand it, was essentially maintained by
| one engineer at Apple who has since left the company, so
| nobody has really updated it.
| plorkyeran wrote:
| He's still at Apple, but he works on the Swift runtime
| these days rather than C/C++ tooling.
| 1over137 wrote:
| That's my understanding too, and I believe you're referring to
| Greg Parker:
|
| http://www.sealiesoftware.com/valgrind/
| 1over137 wrote:
| There have been 6 major releases since 10.12 (which was from
| late 2016). In other words, valgrind has basically stopped
| supporting macOS.
| glandium wrote:
| I don't think it means it doesn't work with newer versions.
| 1over137 wrote:
| I'm afraid you're wrong. It does _not_ work with newer
| macOS versions, I've tried.
| syockit wrote:
| There are times when LeakSanitizer (in gcc-8.2) would not give
| me the full backtrace of a leak, while valgrind would, so to me
| valgrind is still an indispensable tool for debugging leaks. One
| caveat is that it's magnitudes slower than LeakSanitizer. Now,
| if only I knew how to make valgrind run as fast as
| LeakSanitizer... (command line options?)
| rigtorp wrote:
| You might need to add -fno-omit-frame-pointer to help ASAN
| unwind the stack.
| abbeyj wrote:
| This is definitely an option you want to be using when using
| ASan or LSan. You may also want to consider additionally
| using -momit-leaf-frame-pointer to skip frame pointers only
| in leaf functions while keeping frame pointers for non-leaf
| functions. This can make small leaf functions significantly
| shorter, limiting some of the negative impact of using -fno-
| omit-frame-pointer alone.
|
| Sometimes even -fno-omit-frame-pointer won't help, like if
| the stack is being unwound through a system library that was
| built without frame pointers. In that case you can switch to
| the slow unwinder. Set the environment variable
| `ASAN_OPTIONS=fast_unwind_on_malloc=0` when running your
| program. But note that this will make most programs run
| significantly slower so you probably want to use it only when
| you really need it and not as the default setting for all
| runs.
| tarasglek wrote:
| Beyond raw technical ability, Nick and Julian were the kindest,
| most reasonable developers I've ever interacted with. I think a
| lot of Valgrind's success stems from the combination of
| sophisticated tech and the approachability of the core team.
___________________________________________________________________
(page generated 2022-07-27 23:02 UTC)