[HN Gopher] Are your memory-bound benchmarking timings normally ...
___________________________________________________________________
Are your memory-bound benchmarking timings normally distributed?
Author : mfiguiere
Score : 18 points
Date : 2023-04-06 20:47 UTC (2 hours ago)
(HTM) web link (lemire.me)
(TXT) w3m dump (lemire.me)
| sliken wrote:
| I've been writing micro memory benchmarks and have been rather
| surprised how hard something as simple as quantifying latency and
| bandwidth under multicore loads can be. The memory hierarchy is
| getting ever more complex. Cacheline sizes, prefetch, 3 levels of
| cache, TLB effects, page alignments, cache associativity, etc.
| Also have to be careful that the compiler doesn't optimize away
| parts of your code. It's quite tricky to get a nice clean array
| size vs latency graph, doubly so when multiple cores are
| involved.
|
| Some of my assumptions about latency were wrong. One thing I
| didn't realize is that missing through L1, L2, and L3 takes
| about half the total latency to main memory. Another is that you
| need around 2x the memory references pending to keep the memory
| system busy. It makes sense in retrospect: you want 16 pending
| memory references to keep 8 memory channels busy, otherwise a
| memory channel will return a cache line and there won't be any
| L3 cache misses pending for that channel.
|
| Generally I like to keep a small histogram of cycle counters to
| make sure I'm seeing the distribution I expect, seeing an unusual
| distribution is key for tracking down something you didn't
| account for.
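The histogram-of-cycle-counters idea above can be sketched in plain Python. This is an assumption on my part: the commenter is presumably reading a hardware cycle counter directly in native code, so `time.perf_counter_ns` stands in here, and the measured workload is invented.

```python
import time
from collections import Counter

def timed_samples(fn, n=10_000):
    """Collect n timing samples of fn, in nanoseconds."""
    samples = []
    for _ in range(n):
        t0 = time.perf_counter_ns()
        fn()
        samples.append(time.perf_counter_ns() - t0)
    return samples

def log2_histogram(samples):
    """Bucket samples by power-of-two magnitude; a second mode or a
    fat tail shows up immediately as unexpected buckets."""
    return Counter(s.bit_length() for s in samples)

samples = timed_samples(lambda: sum(range(100)))
hist = log2_histogram(samples)
for bucket in sorted(hist):
    print(f"< 2^{bucket} ns: {hist[bucket]}")
```

Power-of-two bucketing keeps the histogram tiny (a few dozen counters) while still making an unexpected distribution, e.g. a bimodal one, obvious at a glance.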
| darksaints wrote:
| You might want to check out the gamma distribution. It is
| zero-bounded just like the log-normal distribution, but it was
| originally devised to model waiting times in queueing theory,
| which is an excellent parallel to measuring compute latency.
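A minimal sketch of such a gamma fit, using the method of moments (shape k = mean²/variance, scale θ = variance/mean) rather than any particular library's fitter; the latency samples are invented for illustration.

```python
from statistics import mean, variance

def gamma_moments_fit(samples):
    """Method-of-moments estimates for a gamma distribution:
    shape k = mean^2 / variance, scale theta = variance / mean."""
    m, v = mean(samples), variance(samples)
    k = m * m / v
    theta = v / m
    return k, theta

# Hypothetical latency samples (microseconds): right-skewed, zero-bounded.
latencies = [12.1, 12.4, 12.2, 12.9, 13.5, 12.3, 18.7, 12.2, 12.6, 25.0]
k, theta = gamma_moments_fit(latencies)
print(f"shape k = {k:.2f}, scale theta = {theta:.2f}")
```

A real analysis might use a maximum-likelihood fitter instead; the moment estimates are just a quick, dependency-free starting point.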
| pclmulqdq wrote:
| Gamma and delta distributions have been very helpful to me in
| performance work, as well as non-parametric statistical tests.
| However, when you try to tell a lot of other engineers about
| them, they don't really understand why a t-test and a standard
| deviation don't work.
| kelseyfrog wrote:
| I mean it's physically impossible for the generating process of
| positive timings to be normally distributed. The normal
| distribution has support x ∈ ℝ, so it assigns nonzero
| probability to negative timings.
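The point is easy to demonstrate: fit a normal to strictly positive timings and it still puts probability mass below zero. A stdlib-only sketch with invented numbers:

```python
from math import erf, sqrt
from statistics import mean, stdev

def normal_cdf(x, mu, sigma):
    """CDF of a normal distribution via the error function."""
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2))))

# Hypothetical timings (ms): all positive, with one long-tail outlier.
timings = [1.0, 1.1, 1.0, 1.2, 1.1, 9.0]
mu, sigma = mean(timings), stdev(timings)
p_negative = normal_cdf(0.0, mu, sigma)
print(f"P(X < 0) under the fitted normal: {p_negative:.3f}")
```

The long right tail inflates the standard deviation, so the fitted normal predicts a substantial fraction of impossible negative timings.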
| ericpauley wrote:
| Great article!
|
| Statistical fallacies are rampant in performance eval, even in
| academic settings. When designing statistical tests for
| performance, the keyword you want to use here is non-parametric.
| For example, a U-test is a non-parametric analog to the t-test.
| It just looks at the rank statistics of results instead of
| their values, thus eliminating dependence on the underlying
| distribution.
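A bare-bones version of that U statistic in pure Python, just to show that only ranks, never raw values, enter the computation. In practice one would reach for a library routine such as scipy.stats.mannwhitneyu; the timing numbers below are invented.

```python
def mann_whitney_u(a, b):
    """Mann-Whitney U statistic for sample a vs b, using average
    ranks so that ties are handled correctly."""
    combined = sorted((v, i) for i, v in enumerate(a + b))
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j][0] == combined[i][0]:
            j += 1
        avg_rank = (i + j + 1) / 2.0  # average of 1-based ranks i+1 .. j
        for k in range(i, j):
            ranks[combined[k][1]] = avg_rank
        i = j
    r_a = sum(ranks[:len(a)])                 # rank sum of group a
    return r_a - len(a) * (len(a) + 1) / 2.0  # U statistic for group a

# Invented timings (ms) for two hypothetical builds:
old = [10.1, 10.3, 10.2, 10.4]
new = [9.1, 9.0, 9.2, 9.3]
print("U =", mann_whitney_u(new, old))  # 0.0: complete separation
```

U ranges from 0 to len(a)*len(b); values near either extreme indicate the groups are well separated, while len(a)*len(b)/2 is what you expect when there is no difference.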
|
| Another issue that pops up is sample independence. Statistical
| tests are often predicated on each sample being independent and
| identically distributed (i.i.d.), but in reality this is often
| not the case. For instance, running all the tests of one group
| and then all the tests of the other could heat the CPU and cause
| reduced performance in the second trial.
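One common mitigation for that ordering effect (my sketch, not the commenter's setup) is to randomize the trial order, so that slow drift such as thermal throttling is spread across both groups instead of biasing one:

```python
import random
import time

def run_interleaved(workload_a, workload_b, reps=50):
    """Time two workloads with their trials shuffled together, so a
    slow environmental drift affects both groups roughly equally."""
    trials = [("A", workload_a)] * reps + [("B", workload_b)] * reps
    random.shuffle(trials)
    results = {"A": [], "B": []}
    for label, fn in trials:
        t0 = time.perf_counter_ns()
        fn()
        results[label].append(time.perf_counter_ns() - t0)
    return results

results = run_interleaved(lambda: sum(range(1000)), lambda: sum(range(2000)))
print(len(results["A"]), len(results["B"]))  # 50 measurements per group
```

Randomization doesn't make the samples i.i.d., but it turns a systematic between-group bias into noise that a rank-based test can tolerate.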
| zX41ZdbW wrote:
| We use non-parametric statistics for performance testing in
| ClickHouse[1], picked from the article "A Randomized Design
| Used in the Comparison of Standard and Modified Fertilizer
| Mixtures for Tomato Plants".
|
| [1] https://clickhouse.com/blog/testing-the-performance-of-click...
___________________________________________________________________
(page generated 2023-04-06 23:00 UTC)