[HN Gopher] Show HN: Rust test harness that measures energy cons...
___________________________________________________________________
Show HN: Rust test harness that measures energy consumption
Author : thijsr
Score : 104 points
Date : 2022-04-05 12:51 UTC (10 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| rictic wrote:
| Many benchmarking systems face measurement issues that make it
| difficult to produce solid results. Any given run might not be
| running on the same hardware, the same OS, built with the same
| compiler, running with the same runtime, with the same versions
| of dependencies, with the same system load, at the same
| temperature, and so on.
|
| One robust solution is to instead do pairwise comparisons, many
| times in a round robin fashion. The results aren't quite as nice
| to plot, as you don't get a single consistent speed value, but
| they are much more reliable and true, and you still get useful
| information, like ">95% chance that this test is at least 20%
| faster at this commit than at the previous one".
|
| A project I contribute to uses this strategy:
| https://github.com/Polymer/tachometer, but I'd love it if more
| benchmarks took this approach.
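A minimal sketch of the pairwise, round-robin idea described above, using only the Rust standard library. This is not how tachometer is implemented; the function names `paired_rounds` and `fraction_faster_by` are hypothetical. It also reports a plain empirical fraction of rounds rather than a confidence interval, and it assumes "20% faster" means the new version takes at most 80% of the old version's time.

```rust
use std::time::{Duration, Instant};

/// Time a single call of `f` with the wall clock.
fn time_once<F: FnMut()>(f: &mut F) -> Duration {
    let start = Instant::now();
    f();
    start.elapsed()
}

/// Run `old` and `new` back to back for `rounds` rounds, alternating which
/// one goes first, and return one (old_secs, new_secs) pair per round.
/// Both measurements in a pair share roughly the same load and temperature.
fn paired_rounds<A: FnMut(), B: FnMut()>(
    mut old: A,
    mut new: B,
    rounds: usize,
) -> Vec<(f64, f64)> {
    let mut pairs = Vec::with_capacity(rounds);
    for round in 0..rounds {
        let (o, n) = if round % 2 == 0 {
            let o = time_once(&mut old);
            let n = time_once(&mut new);
            (o, n)
        } else {
            let n = time_once(&mut new);
            let o = time_once(&mut old);
            (o, n)
        };
        pairs.push((o.as_secs_f64(), n.as_secs_f64()));
    }
    pairs
}

/// Fraction of rounds in which `new` needed at most `(1 - speedup)` times
/// the time of `old`. Assumption: `speedup = 0.2` stands for "20% faster".
fn fraction_faster_by(pairs: &[(f64, f64)], speedup: f64) -> f64 {
    if pairs.is_empty() {
        return 0.0;
    }
    let wins = pairs
        .iter()
        .filter(|pair| pair.1 <= pair.0 * (1.0 - speedup))
        .count();
    wins as f64 / pairs.len() as f64
}
```

Calling `paired_rounds(old_impl, new_impl, 100)` and passing the result to `fraction_faster_by(&pairs, 0.2)` gives the share of rounds in which the new version won by that margin. A real tool would turn the paired differences into a proper statistical estimate.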
| ______-_-______ wrote:
| Counting instructions is very accurate and roughly approximates
| power usage. The CPU's self-reported power usage is comparatively
| pretty noisy. Unit tests will probably be done running before you
| can get meaningful data. I have to wonder if a test runner is the
| best point of integration for this. It might make more sense to
| expose it as a bench harness, like criterion.
|
| EDIT: another benefit of a criterion-like approach is that you
| wouldn't require nightly
| thijsr wrote:
| Yes, the CPU self-reported power usage is indeed fairly noisy.
| We've tried to mitigate this by executing certain tests
| multiple times in a row, and using the average power
| consumption across these executions. However, this data is
| still quite noisy and is influenced by external factors, like
| the operating system, your hardware, power management settings,
| and a lot more. We mention this and possible mitigating actions
| in the README.
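A rough sketch of the "run it several times and average" mitigation, assuming a cumulative energy counter in microjoules such as the one Linux exposes through the Intel RAPL powercap interface. This is not Coppers' code. The names `read_rapl_uj` and `mean_energy_uj` are hypothetical, the sysfs path in the comment is an assumption about the machine, and the sketch averages energy per run rather than power.

```rust
use std::fs;

/// Read a cumulative energy counter (microjoules) from a sysfs file.
/// Assumption: on Linux with Intel RAPL the path is typically
/// "/sys/class/powercap/intel-rapl:0/energy_uj" and may require root.
fn read_rapl_uj(path: &str) -> Option<u64> {
    fs::read_to_string(path).ok()?.trim().parse().ok()
}

/// Run `work` `runs` times, reading the counter before and after each run,
/// and return the mean energy per run in microjoules.
/// Returns None if the counter cannot be read or no run produced a sample.
fn mean_energy_uj<R, W>(mut read_counter: R, mut work: W, runs: u32) -> Option<f64>
where
    R: FnMut() -> Option<u64>,
    W: FnMut(),
{
    let mut total: u64 = 0;
    let mut counted: u32 = 0;
    for _ in 0..runs {
        let before = read_counter()?;
        work();
        let after = read_counter()?;
        // The counter wraps around periodically; skip a run where it did.
        if let Some(delta) = after.checked_sub(before) {
            total += delta;
            counted += 1;
        }
    }
    if counted == 0 {
        None
    } else {
        Some(total as f64 / counted as f64)
    }
}
```

The counter covers the whole CPU package, so anything else running on the machine during `work` is included in the result. Averaging reduces that noise but does not remove it.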
|
| We chose a test harness because one of our goals was to
| make it as easy as possible to run it on existing Rust
| projects. A lot of projects define tests, but benchmarks are
| often not present. But maybe a bench harness would be a
| better and/or cleaner approach, will look into it!
| Shish2k wrote:
| > Counting instructions is very accurate and roughly
| > approximates power usage
|
| I've always assumed this to be true, but I see a lot of
| benchmarking tools / libraries measuring wall-clock time or
| iterations-per-second or something like that. I've never seen a
| benchmark tool which counts CPU instructions. Am I being blind
| or is there some other reason that I'm not seeing them? :S
| ______-_-______ wrote:
| At the end of the day most people care about wall clock time.
| It's a real physical value that's easy to understand and easy
| to compare between systems. Plus, if two functions execute,
| say, 1 billion instructions each, but one spends extra time
| stalled waiting on IO or data fetches from RAM, you
| definitely want to account for that in normal benchmarking.
|
| Instruction counting is more of a specialized tool but I like
| to use it whenever I can because it has low variance and
| makes comparing changes a lot easier. Compare how bumpy these
| graphs are for instruction count (first link) and wall clock
| time (second link):
|
| https://perf.rust-lang.org/
|
| https://perf.rust-lang.org/?start=&end=&kind=raw&stat=wall-t...
| mhh__ wrote:
| Counting instructions properly is hard and also results in a
| good amount of overhead if you don't use a bunch of tricks or
| a kernel module.
|
| You also can't really count instructions in the cloud.
| wooosh wrote:
| Counting (userspace) instructions is relatively easy
| regardless of language with perf stat, though it does
| require the kernel module. Generally speaking it should
| just work if perf is installed through the package manager
| for your distribution.
|
| edit: valgrind's callgrind utility can also produce exact
| instruction execution counts for a given block of code
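A hedged sketch of counting user-space instructions by shelling out to `perf stat` from Rust. It assumes `perf` is installed and permitted to read counters on the machine. The helper names `parse_perf_csv` and `count_user_instructions` are hypothetical, and the parser relies on the CSV layout documented for `perf stat -x` (value, unit, event name, ...).

```rust
use std::process::Command;

/// Extract the count for `event` from `perf stat -x,` CSV output.
/// Lines that do not look like "value,unit,event,..." are ignored, as are
/// non-numeric values such as "<not supported>".
fn parse_perf_csv(output: &str, event: &str) -> Option<u64> {
    output.lines().find_map(|line| {
        let mut fields = line.split(',');
        let value = fields.next()?;
        let _unit = fields.next()?;
        let name = fields.next()?;
        if name == event {
            value.trim().parse().ok()
        } else {
            None
        }
    })
}

/// Run `program` under `perf stat` and return its user-space instruction
/// count, or None if perf is missing or the counter is unavailable.
fn count_user_instructions(program: &str, args: &[&str]) -> Option<u64> {
    let out = Command::new("perf")
        .args(&["stat", "-x", ",", "-e", "instructions:u", "--"])
        .arg(program)
        .args(args)
        .output()
        .ok()?;
    // perf stat writes its counters to stderr by default, mixed with
    // whatever the measured program itself prints there.
    parse_perf_csv(&String::from_utf8_lossy(&out.stderr), "instructions:u")
}
```

Something like `count_user_instructions("./target/release/my_bench", &[])` would then give one count per run, where the binary path is only a placeholder. On some systems the event name is reported differently, in which case this sketch returns None.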
| mhh__ wrote:
| Callgrind can give you instruction counts, yes. It doesn't
| simulate any microarchitecture other than caches, which
| means it's only useful for comparing with itself.
|
| Perf stat is very very high overhead. The perf API is
| available and can be tuned a bit more nicely but it's
| mostly a horrible mess. It uses bitfields too which makes
| it somewhat hard to get to from other languages unless
| you trust the shifts and masks.
| wmf wrote:
| Instructions correlate to energy but not to performance. If
| you're benchmarking performance you should use wall clock
| time.
| wooosh wrote:
| Counting instructions does not give information about time
| spent in syscalls/doing IO, which limits its use to CPU-bound
| software.
| hd4 wrote:
| This reminds me of an idea I wanted to submit to the systemd team
| (or wherever it would be more appropriate) to have a Linux
| service report on the current power usage of the OS and maybe
| even translate it into currency-per-hour to show people how much
| they were spending over time with the aim of reducing power
| wastage. Seems like it would be more relevant than ever given the
| global situation around energy these days.
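The currency-per-hour translation is simple arithmetic. Here is a small sketch with hypothetical function names, where the electricity price is an input the user would have to supply.

```rust
/// Average power in watts from energy used (joules) over an interval (seconds).
fn average_watts(joules: f64, seconds: f64) -> f64 {
    joules / seconds
}

/// Running cost per hour for a given power draw.
/// `price_per_kwh` is in whatever currency the user pays; one kilowatt drawn
/// for one hour is one kilowatt-hour.
fn cost_per_hour(watts: f64, price_per_kwh: f64) -> f64 {
    watts / 1000.0 * price_per_kwh
}
```

For example, a machine drawing 50 W at a price of 0.30 per kWh costs about 0.015 per hour, or roughly 0.36 per day if it runs around the clock.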
| yjftsjthsd-h wrote:
| So... powertop? Possibly run as a daemon logging to the
| journal, which I grant is somewhat different from how it works
| now (ncurses tool or something you run and log to CSV).
| thijsr wrote:
| Hey, I wanted to share a project that we've been working on!
| Coppers is a custom test harness for Rust that allows you to
| measure the energy consumption of your test suite. A use case for
| this could be to identify regressions in energy usage, or to do
| more targeted energy optimizations. Our goal was to make it as
| seamless as possible to integrate it with existing Rust projects.
| To make that work, we had to rely on some unstable and internal
| Rust compiler features that are only available in nightly. But
| the current implementation seems to be able to measure the energy
| consumption of almost every existing Rust crate we tested! (with
| the exception of embedded and system-specific crates, but that is
| a limitation we're looking into)
| teitoklien wrote:
| It's a pretty cool project :D Thank you for making it! I'll
| definitely try it in my next project.
| ducktective wrote:
| Very nice!
|
| Any ideas on how to measure energy consumption of programs in a
| GNU/Linux OS? I know of `powertop` but it measures total power
| consumption (its per-program table is inaccurate)
___________________________________________________________________
(page generated 2022-04-05 23:01 UTC)