[HN Gopher] How much does Rust's bounds checking cost?
___________________________________________________________________
How much does Rust's bounds checking cost?
Author : glittershark
Score : 118 points
Date : 2022-11-30 18:38 UTC (4 hours ago)
(HTM) web link (blog.readyset.io)
(TXT) w3m dump (blog.readyset.io)
| killingtime74 wrote:
 | Can someone smarter than me enlighten me as to when you would
 | consider disabling bounds checking for performance, in ways the
 | compiler is not already doing? The article starts with a bug that would
| have been prevented by bounds checking. It's like talking about
| how much faster a car would go if it didn't have to carry the
| extra weight of brakes.
| pornel wrote:
| In tight numeric code that benefits from autovectorization.
|
 | Bounds checks prevent some optimizations, since they're a branch
| with a significant side effect that the compiler must preserve.
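 |
 | As a rough sketch of what that means (made-up functions, not from
 | the article): indexing a second slice inside the loop keeps a check
 | and a possible panic in the hot path, while zipping the slices gives
 | the optimizer a loop it can vectorize more readily.
        // Indexing keeps a bounds check (and a possible panic) per element.
        fn add_indexed(a: &mut [f32], b: &[f32]) {
            for i in 0..a.len() {
                a[i] += b[i]; // b[i] may panic, so the check must stay
            }
        }
        // Zipping bounds the loop by the shorter slice; no per-element checks.
        fn add_zipped(a: &mut [f32], b: &[f32]) {
            for (x, y) in a.iter_mut().zip(b) {
                *x += *y;
            }
        }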
| masklinn wrote:
| > It's like talking about how much faster a car would go if it
| didn't have to carry the extra weight of brakes.
|
| And there's folks who do exactly that.
| returningfory2 wrote:
| I think the point of the article is the other way around: when
 | starting from a language like C that doesn't have bounds
| checking, moving to Rust will involve adding bounds checks and
| then an argument will be made that this will regress
| performance. So to test that hypothesis you start with the safe
| Rust code, and then remove the bounds check to emulate what the
| C code might be like. If, as in the article, you find that
| performance is not really affected, then it makes a C-to-Rust
| migration argument more compelling.
| pitaj wrote:
| Sometimes the programmer can prove that bounds checks are
| unnecessary in a certain situation, but the compiler can't
| prove that itself, and the programmer can't communicate that
| proof to the compiler. Bounds checks _can_ result in lost
| performance in some cases (very tight loops), so unsafe
| facilities exist as a workaround (like `get_unchecked`).
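 |
 | A hedged sketch of that situation (made-up function, not from the
 | article): the even-prefix invariant guarantees the indices are in
 | range, but only the programmer knows it, so `get_unchecked` skips
 | the checks. Violating the invariant would be undefined behavior.
        fn sum_pairs(data: &[u64]) -> u64 {
            let n = data.len() / 2 * 2; // even prefix, so i + 1 stays in bounds
            let mut total = 0;
            let mut i = 0;
            while i < n {
                // SAFETY: i + 1 < n <= data.len(), since n is even and i < n.
                total += unsafe {
                    *data.get_unchecked(i) + *data.get_unchecked(i + 1)
                };
                i += 2;
            }
            total
        }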
| heleninboodler wrote:
| I don't really see this as a person who _wants_ to turn off the
| bounds checking for any real reason, but as someone who just
| wants to have an idea of what the cost of that bounds checking
| is.
| constantcrying wrote:
| >Can someone smarter than me enlighten me when you would
| consider disabling bounds checking for performance?
|
 | Because it is faster. In the worst case a check triggers a branch
 | mispredict, which is quite expensive.
|
| >It's like talking about how much faster a car would go if it
| didn't have to carry the extra weight of brakes.
|
| So? Every wasted CPU cycle costs money and energy. Especially
| for very high performance applications these costs can be very
 | high. Not every car needs brakes: if it doesn't need to stop by
 | itself and crashing hurts nobody, they are just waste.
| jackmott42 wrote:
| In Rust you can usually use iterators to avoid bounds checks.
| This is idiomatic and the fast way to do things, so usually
| when using Rust you don't worry about this at all.
|
| But, occasionally, you have some loop that can't be done with
 | an iterator, AND it's part of a tight loop where removing a
 | single conditional jump matters to you. When that matters, it is
 | really easy to use an unsafe block to index into the
 | array without the check. The good news is that, at least in your
| 1 million line program, the unsafe parts are only a few lines
| that you are responsible for being sure are correct.
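 |
 | As a rough illustration (assumed example, not from the article),
 | the two styles side by side:
        // Idiomatic: the iterator tracks its own end, no per-element checks.
        fn sum_iter(values: &[u64]) -> u64 {
            values.iter().sum()
        }
        // Indexed: each `values[i]` carries a check unless the optimizer
        // can prove `i < values.len()`.
        fn sum_indexed(values: &[u64]) -> u64 {
            let mut total = 0;
            for i in 0..values.len() {
                total += values[i];
            }
            total
        }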
| emn13 wrote:
| Sometimes the extra speed can be relevant. Knowing what the
| upside _can_ be can help inform the choice whether it's worth
| it.
|
| Secondly, even assuming you want runtime bounds checking
| everywhere, then this is still a useful analysis because if you
| learn that bounds-checking has no relevant overhead - great! No
| need to look at that if you need to optimize. But if you learn
| that it _does_ have an overhead, then you have the knowledge to
| guide your next choices - is it enough to be worth spending any
| attention on? If you want the safety, perhaps there's specific
| code paths you can restructure to make it easier for the
| compiler to elide the checks, or the branch predictor to make
 | them cheaper? Perhaps you can do fewer indexing operations
| altogether? Or perhaps there's some very specific small hot-
| path you feel you can make an exception for; use bounds-
| checking 99% of the time, but not in _that_ spot? All of these
 | avenues are only worth even exploring if there's anything to
| gain here in the first place.
|
| And then there's the simple fact that having a good intuition
| for where machines spend their time makes it easier to write
| performant code right off the bat, and it makes it easier to
 | guess where to look first when you're trying to eke out better
| perf.
|
| Even if you like or even need a technique like bounds checking,
| knowing the typical overheads can be useful.
| d265f278 wrote:
| I've seen bounds checks being compiled to a single integer
| comparison followed by a jump (on x86 at least). This should
| have a negligible performance impact for most programs running
| on a modern, parallel CPU. However, for highly optimized
| programs that constantly saturate all processor instruction
| ports, bounds checks might of course become a bottleneck.
|
| I think the most preferable solution (although not always
| possible) would be to use iterators as much as possible. This
| would allow rustc to "know" the entire range of possible
| indexes used at runtime, which makes runtime bounds checking
| redundant.
|
 | Some old benchmarks here:
 | https://parallel-rust-cpp.github.io/v0.html#rustc
| dathinab wrote:
 | In some very hot code (most of the time loops with math-heavy code
 | where the check prevents some optimizations and/or the compiler
 | fails to eliminate it), removing it can lead to relevant
 | performance improvements.
|
 | Because of this you might find some Rust code which opts out of
 | bounds checks for such code using unsafe.
|
| But this code tends to be fairly limited and often encapsulated
| into libraries.
|
| So I agree that for most code doing so is just plain stupid. In
| turn I believe doing it on a program or compilation unit level
| (instead of a case by case basis) is (nearly) always a bad
| idea.
| moloch-hai wrote:
| The instructions generated make a big difference. Modern
| processor specifications commonly quote how many instructions of
| a type can be "retired" in a cycle. They can retire lots of
| conditional branches at once, or branches and other ops, when the
| branches are _not taken_.
|
| So it matters whether the code generator produces dead branches
| that can be retired cheaply. Probably, optimizers take this into
| account for built-in operations, but they know less about the
| happy path in libraries.
|
| This is a motivation for the "likely" annotations compilers
| support. The likely path can then be made the one where the
| branch is not taken. Code on the unhappy path can be stuck off in
| some other cache line, or even another MMU page, never fetched in
| normal operation.
|
| The cost seen here is likely from something else, though. Keeping
| array size in a register costs register pressure, or comparing to
| a stack word uses up cache bandwidth. Doing the comparison burns
| an ALU unit, and propagating the result to a branch instruction
| via the status register constrains instruction order.
|
| Even those might not be at fault, because they might not add any
| extra cycles. Modern processors spend most of their time waiting
| for words from memory: just a few cycles for L1 cache, many more
| for L2 or L3, an eternity for actual RAM. They can get a fair bit
| done when everything fits in registers and L1 cache, and loops
| fit in the micro-op cache. Blow any of those, and performance
| goes to hell. So depending how close your code is to such an
| edge, extra operations might have zero effect, or might tank you.
|
| Results of measurements don't generalize. Change something that
| looks like it ought to make no difference, and your performance
| goes up or down by 25%. In that sense, the 10% seen here is noise
| just because it is hard to know what might earn or cost you 10%.
| dathinab wrote:
 | In Rust there is `#[cold]` for functions, as well as (nightly
 | only) `likely(cond)`/`unlikely(cond)`, and with some tricks you can
 | get something similar in stable Rust.
 |
 | Also, branch paths which are guaranteed to lead to a panic tend to
 | be treated as "unlikely", but I'm not sure how far this is
 | guaranteed.
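 |
 | A small sketch of the `#[cold]` idea (made-up names, stable Rust):
 | the failure path goes into a separate cold function, so the hot
 | path stays compact and the branch to it is laid out as the
 | unlikely one.
        #[cold]
        #[inline(never)]
        fn index_failure(idx: usize, len: usize) -> ! {
            panic!("index {idx} out of range for slice of length {len}");
        }
        fn checked_read(data: &[u8], idx: usize) -> u8 {
            if idx < data.len() {
                data[idx]
            } else {
                index_failure(idx, data.len())
            }
        }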
| pitaj wrote:
| Very interesting. One thing that I'm curious about is adding the
| bounds-check assertion to `get_unchecked` and seeing if that has
| a significant effect.
| masklinn wrote:
| Happens from time to time, I've seen folks going around
 | libraries looking for "perf unsafe" and benchmarking whether
 | removing the unsafe actually lowered performance.
|
| One issue on that front is a question of reliability /
 | consistency: on a small benchmark, chances are the compiler will
 | always optimize to its full potential because there's relatively
 | little code; codegen could be dodgier in a context where the code
 | is more complicated or larger.
|
| Then again the impact of the bounds check would also likely be
| lower on the non-trivial code (on the other hand there are also
| threshold effects, like branch predictor slots, icache sizes,
| ...).
| jackmott42 wrote:
| Occasionally small changes like this will result in bigger than
| expected performance improvements. An example of this happened
| once with C#, when two very tiny changes, each of which were
| borderline measurable, combined they made a big difference.
|
| IIRC it was in the List.Add method, a very commonly used function
| in the C# core libs. First one programmer refactored it to very
| slightly reduce how many instructions were output when compiled.
 | Then a second programmer worked on JIT compiler optimizations
 | which also affected this Add method, making it a little smaller as
 | well.
|
| Alone, each change was hard to even measure, but seemed like they
| should be a net win at least in theory. Combined, the two changes
 | made the Add method small enough to be an inlining candidate!
 | That meant very measurable performance improvements in some real
 | programs.
|
| As others in this post have noted, a removed bounds check might
| also unblock vectorization optimizations in a few cases. One
| might be able to construct a test case where removing the check
 | speeds things up by a factor of 16!
| downvotetruth wrote:
| Too much - write vague question titles & get comments in kind
| tayistay wrote:
| What if a compiler were to only allow an array access when it can
| prove that it's in bounds? Wherever it can't you'd have to wrap
| the array access in an if, or otherwise refactor your code to
| help the compiler. Then you'd have no panicking at least and more
| predictable performance.
| constantcrying wrote:
| >What if a compiler were to only allow an array access when it
| can prove that it's in bounds?
|
| Even very good static analysis tools have a hard time doing
| this. In a language like C++ this would effectively mean that
| very few index operations can be done naively and compile times
| are significantly increased. Performance is likely reduced as
| well over the trivial alternative of using a safe array.
| glittershark wrote:
| there's a cheeky link to idris's vector type in the second
 | paragraph: https://www.idris-lang.org/docs/idris2/current/base_docs/doc...
 | which accomplishes just that
| jackmott wrote:
| tylerhou wrote:
 | With bounds checking by default, even if a compiler can't
 | statically prove that an index is in bounds, the check can still
 | be "free": if the index is in bounds in practice, the compiler
 | inlines the check/branch into the calling function, and you're not
 | starved of branch prediction resources, then the branch will
 | essentially always be predicted correctly.
| est31 wrote:
 | Rust has a tool for that: iterators. They are limited, however.
| dathinab wrote:
 | Also, `slice.get()` will return an Option,
 |
 | and doing a manual range check before an index will normally let
 | the compiler optimize the internal bounds check away.
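 |
 | Something like this sketch (made-up functions) shows both points:
        // `get` returns an Option instead of panicking.
        fn third(data: &[u32]) -> Option<u32> {
            data.get(2).copied()
        }
        // The explicit length check up front normally lets the compiler
        // drop the implicit checks on the three indexing operations below.
        fn sum_first_three(data: &[u32]) -> u32 {
            if data.len() < 3 {
                return 0;
            }
            data[0] + data[1] + data[2]
        }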
| nigeltao wrote:
| That's how WUFFS (Wrangling Untrusted File Formats Safely)
| works:
|
| https://github.com/google/wuffs#what-does-compile-time-check...
| constantcrying wrote:
 | I imagine the reason bounds checks are cheap is the
| branch predictor. If you always predict the in bounds path, the
| check is almost free.
|
| You also do not really care about flushing the pipe on an out of
| bounds index, since very likely normal operations can not go on
| and you move over to handling/reporting the error, which likely
| has no need for significant throughput.
|
| Also I would just like to note that safe arrays aren't a unique
| rust feature. Even writing your own in C++ is not hard.
| jackmott42 wrote:
| Yep, unless your code is wrong, the bounds check will always be
| predictable. Which makes it free in a sense. But sometimes it
| will block other optimizations, and it takes up space in the
| caches.
| pantalaimon wrote:
| That would be bad on Embedded where MCUs usually don't do any
| branch prediction.
| dathinab wrote:
| > If you always predict the in bounds path, the check is almost
| free.
|
| Note that you often only branch in the "bad" case, which means
| even on systems without branch prediction it tends to be not
| very expensive, and compilers can also eliminate a lot of
| bounds checks.
| Jach wrote:
| Always amuses me that it's current year and people think about
| turning off checks, even when they're pretty much free in modern*
| (since 1993 Pentium, which got like 80% accuracy with its
| primitive branch prediction?) CPUs...
|
| "Around Easter 1961, a course on ALGOL 60 was offered ... After
| the ALGOL course in Brighton, Roger Cook was driving me and my
| colleagues back to London when he suddenly asked, "Instead of
| designing a new language, why don't we just implement ALGOL60?"
| We all instantly agreed--in retrospect, a very lucky decision for
| me. But we knew we did not have the skill or experience at that
| time to implement the whole language, so I was commissioned to
| design a modest subset. In that design I adopted certain basic
| principles which I believe to be as valid today as they were
| then.
|
| "(1) The first principle was _security_ : The principle that
| every syntactically incorrect program should be rejected by the
| compiler and that every syntactically correct program should give
| a result or an error message that was predictable and
| comprehensible in terms of the source language program itself.
| Thus no core dumps should ever be necessary. It was logically
| impossible for any source language program to cause the computer
| to run wild, either at compile time or at run time. A consequence
| of this principle is that every occurrence of every subscript of
| every subscripted variable was on every occasion checked at run
| time against both the upper and the lower declared bounds of the
| array. Many years later we asked our customers whether they
| wished us to provide an option to switch off these checks in the
| interests of efficiency on production runs. Unanimously, they
| urged us not to -- they already knew how frequently subscript
| errors occur on production runs where failure to detect them
| could be disastrous. I note with fear and horror that even in
| 1980, language designers and users have not learned this lesson.
| In any respectable branch of engineering, failure to observe such
| elementary precautions would have long been against the law."
|
| -Tony Hoare, 1980 Turing Award Lecture
| (https://www.cs.fsu.edu/~engelen/courses/COP4610/hoare.pdf)
| dataangel wrote:
| They're nowhere near free. Branch prediction table has finite
| entries, instruction cache has finite size, autovectorizing is
| broken by bounds checks, inlining (the most important
| optimization) doesn't trigger if functions are too big because
| of the added bounds checking code, etc. This is just not great
| benchmarking -- no effort to control for noise.
| titzer wrote:
| > I note with fear and horror that even in 1980, language
| designers and users have not learned this lesson. In any
| respectable branch of engineering, failure to observe such
| elementary precautions would have long been against the law.
|
| Here we are, 42 years later, and bounds checks are still not
| the default in some languages. Because performance, or
| something. And our computers are literally 1000x as fast as
| they were in 1980. So instead of paying 2% in bounds checks and
 | getting a _mere_ 980x speedup, we get 2-3x more CVEs, costing
| the economy billions upon billions of dollars a year.
| nine_k wrote:
| Removing bounds checks is a stark example of a premature
| optimization.
|
| You can remove bounds checks when you can _prove_ that the
 | index won't ever get out of bounds; this is possible in many
| cases, such as iteration with known bounds.
| bugfix-66 wrote:
| Similarly, you can turn off bounds-checking in Go like this:
| go build -gcflags=-B
|
| and see if it helps. Generally the assembly looks better, but it
| doesn't really run faster on a modern chip.
|
| Do your own test, and keep the results in mind next time somebody
| on Hacker News dismisses Go because of the "overwhelming cost of
| bounds checking".
| masklinn wrote:
| > next time somebody on Hacker News dismisses Go because of the
| "overwhelming cost of bounds checking".
|
| That's certainly one criticism I don't remember ever seeing.
| viraptor wrote:
| There's a few examples like this
| https://news.ycombinator.com/item?id=32256038 if you search
| comments for "go bounds checking"
| lowbloodsugar wrote:
| I am a Rust fan, but 10% degradation in performance (29ms to
| 33ms) is not "a pretty small change" nor "within noise
| threshold". If the accuracy of the tests are +/- 10% then that
| needs to be proven and then fixed. I didn't see any evidence in
| the article that there is, in fact, a 10% error, and it looks
| like there is a genuine 10% drop in performance.
| glittershark wrote:
 | a 10% drop in performance with bounds checks _removed_, mind
| you - so if anything the bounds checks are improving
| performance.
| pantalaimon wrote:
| The more likely explanation is that the test is bunk.
|
| Or maybe the unsafe access acts like volatile in C and
| disables any optimization/reordering because the compiler
| thinks it's accessing a register.
| hra5th wrote:
| To be clear, _removing_ the bounds checks led to the observed
| performance degradation. Your statement beginning with "I am a
| Rust fan, but..." suggests that you might have interpreted it
| as the other way around
| lowbloodsugar wrote:
| I certainly did. Thank you for pointing that out. That
| suggests then that the problem is the tests are bogus.
| tragomaskhalos wrote:
| I would expect an iterator into a slice to not incur any bounds
| checking, as the compiler can deduce the end pointer one time as
| start + size. So idiomatic looping should be maximally efficient
| you'd hope.
| lmkg wrote:
| The compiler shouldn't have to deduce anything, an Iterator
| shouldn't have a bounds check to begin with. It ought to be
| using unsafe operations under the hood, because it can
| guarantee they will only be called with valid arguments.
| meindnoch wrote:
| Calling 'next' on an iterator involves a bounds check.
| tylerhou wrote:
| Safe iterators have to have bounds checks for dynamically
 | sized arrays; otherwise, how do you prevent iterators from
| walking past the end?
| apendleton wrote:
| In Rust at least, once you instantiate the iterator, the
| array it's iterating over can't be mutated until the
| iterator is dropped, and that can be statically guaranteed
| at compile time. So you don't need to bounds-check at every
| access; you can decide at the outset how many iterations
| there are going to be, and doing that number of iterations
| will be known not to walk past the end.
| vore wrote:
| I don't think that's always possible in practice:
| consider Vec<T>, whose size is only known at runtime. A
| Vec<T>'s iterator can only do runtime bounds checking to
| avoid walking past the end.
|
| That said, this is unavoidable in C/C++ too.
| apendleton wrote:
| I think we're suffering from some fuzziness about what
| bounds checks we're referring to. Even in your example,
| you only need to check the size of the Vec<T> when you
| instantiate the iterator, not each time the iterator
| accesses an element, because at the time the iterator
| over the Vec<T>'s contents is instantiated, the Vec<T>'s
| size is known, and it can't change over the life of the
| iterator (because mutation is disallowed when there's an
| outstanding borrow). With a regular for-loop:
        for i in 0..v.len() {
            println!("{:?}", v[i]);
        }
|
| you check the length at the top (the `v.len()`) and
| _also_ for each `v[i]`. The first is unavoidable, but the
| second can be skipped when using an iterator instead,
| because it can be statically guaranteed that, even if you
 | don't know at compile time what concretely the length
| is, whatever it ends up being, the index will never
| exceed it. Rust specifically differs from C++ in this
| respect, because nothing in that language prevents the
| underlying vector's length from changing while the
| iterator exists, so without per-access bounds checks it's
| still possible for an iterator to walk past the end.
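 |
 | For comparison, a sketch of the iterator form of that same loop,
 | where the length is only consulted when the iterator is created:
        for x in &v {
            println!("{:?}", x);
        }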
| tylerhou wrote:
| When I read "iterator" I think of an object that points
| into the vector and can be advanced. For Rust's vector,
| that is std::slice::Iter (https://doc.rust-
| lang.org/std/slice/struct.Iter.html). When you advance an
| iterator, you must do a bounds check if the vector is
| dynamically sized; otherwise, you don't know when to
 | stop. I.e., if I have:
        let mut it = vec.iter();
        println!("{:?}", it.next());
        println!("{:?}", it.next());
        println!("{:?}", it.next());
|
| This needs to do bounds checking on each call to next()
| to either return Some(a) or None (assuming the length of
 | vec is unknown at compile time).
 | (https://doc.rust-lang.org/beta/src/core/slice/iter/macros.rs....)
|
| You are right that theoretically a range-based for loop
| _that uses iterators_ does _not_ need to do bounds
| checking because a compiler can infer the invariant that
 | the iterator is always valid. In practice I don't know
| enough about LLVM or rustc to know whether this
| optimization is actually happening.
| oever wrote:
| The Rust compiler guarantees that the memory location and
| size of the iterated array do not change during the
| operation. So the iterator can be a pointer that iterates
| until it points to the end of the array. There is no need
| to do bounds checks: the pointer only goes over the valid
| range.
|
| In C/C++, the array _can_ change. It might be moved, de-
 | allocated or resized in the current or another
| thread. So the pointer that iterates until it is equal to
| the end pointer, might point to invalid data if the size,
| location or existence of the vector changes.
| kaoD wrote:
 | What parent means is that you won't have any bounds checks
| on array access, just a len(arr) loop.
| gigatexal wrote:
| TL;DR - in the test bounds checking vs no checks showed no
| noticeable difference. Very good article though. Worth reading.
| carl_dr wrote:
| Not too long, did read :
|
| The benchmark went from 28.5ms to 32.9ms.
|
| That as a percentage is 15% and is huge, it's not noise.
|
| The test is flawed in some way, the article is disappointing in
| that the author didn't investigate further.
| dale_glass wrote:
| MySQL is a huge amount of code doing a variety of things in
| each query -- networking, parsing, IO, locking, etc. Each of
| those can easily have significant and hard to predict
| latencies.
|
| Benchmarking that needs special care, and planning for
| whatever it is you want to measure. A million trivial queries
| and a dozen very heavy queries are going to do significantly
| different things, and have different tradeoffs and
| performance characteristics.
| carl_dr wrote:
| The benchmark was specifically testing the hot path of a
| cached query in their MySQL caching proxy. MySQL wasn't
| involved at all.
|
| I agree completely that benchmarks need care, hence my
| point that the article is disappointing.
|
| The author missed the chance to investigate why removing
| bounds checks seemed to regress performance by 15%, and
| instead wrote it off as "close enough."
|
| It would have been really interesting to find out why, even
| if it did end up being measurement noise.
| bjourne wrote:
| The reason performance decreased when he removed bounds checking
 | is that asserting bounds is very useful to a compiler.
| Essentially, the compiler emits code like this:
        1. if (x >= 0) && (x < arr_len(arr))
        2.     get element from array index x
        3. else
        4.     throw exception
        5. do more stuff
|
 | The compiler deduces that at line 5, 0 <= x < arr_len(arr). From
 | that it can deduce that abs(x) is a no-op, that 2*x won't
 | overflow (because arrays can only have 2^32 elements), etc.
 | Without bounds checking the compiler emits:
        1. get element from array index x
        2. do more stuff
|
| So the compiler doesn't know anything about x, which is bad. The
 | solution, which apparently is not implemented in Rust (or LLVM,
 | idk), is to emit code like the following:
        1. assert that 0 <= x < arr_len(arr)
        2. get element from array index x
        3. do more stuff
| vore wrote:
| I'm not sure I follow: where is abs(x)?
| layer8 wrote:
| It's an example of what could occur within "do more stuff".
| The mentioned 2*x is another example.
| est31 wrote:
| Interesting observation. So one should instead do the
 | comparison with something like:
        1. if (x >= 0) && (x < arr_len(arr))
        2.     get element from array index x
        3. else
        4.     core::hint::unreachable_unchecked
        5. do more stuff
|
| Where unreachable_unchecked transmits precisely such
 | information to the optimizer:
 | https://doc.rust-lang.org/stable/std/hint/fn.unreachable_unc...
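 |
 | A rough sketch of that pattern in Rust (made-up function; the
 | caller must really guarantee the index is in range, or this is
 | undefined behavior):
        fn read(data: &[u8], x: usize) -> u8 {
            if x < data.len() {
                data[x]
            } else {
                // SAFETY: callers promise x < data.len(), so this is never reached.
                unsafe { core::hint::unreachable_unchecked() }
            }
        }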
| [deleted]
| titzer wrote:
 | For Virgil, there is a switch to turn off bounds checking, for
 | the sole purpose of measuring their cost. (It's not expected that
 | anyone ever do this for production code.) Bounds checks do not
| appear to slow down any program that matters (TM) by more than
| 2%. That's partly because so many loops automatically have bounds
| checks removed by analysis. But still. It's negligible.
| dathinab wrote:
 | Anything between nothing and a single "branch if int/pointer >
 | int/pointer" that is almost always correctly predicted to _not_
 | jump.
|
 | This kind of bounds check is normally never violated (in well-
 | formed code), so branch prediction predicts it correctly nearly
 | always.
|
 | It also (normally) only jumps in the bad case, which means that
 | with correct branch prediction it can be really cheap.
|
 | And then CPU "magic" tends to be optimized for this kind of
 | check, as it appears in a lot of languages (e.g. Java).
|
| Then in many cases the compiler can eliminate the checks
| partially.
|
 | For example, in many kinds of for-each element iterations the
 | compiler can infer that the loop-continuation check implies the
 | bounds check. Combine that with loop unrolling, which can reduce
 | the number of continuation checks, and you might end up with even
 | less.
|
| Also bounds checks tend to be an emergency guard, so you tend to
| sometimes do checks yourself before indexing and the compiler can
| often use that to eliminate the bounds check.
|
| And even if you ignore all optimizations it's (assuming in
| bounds) "just" at most one int/pointer cmp (cheap) followed by a
| conditional branch which doesn't branch (cheap).
| api wrote:
| It's a lot cheaper than having an RCE and being completely pwned.
| mastax wrote:
| One technique is to add asserts before a block of code to hoist
| the checks out. The compiler is usually smart enough to know
| which conditions have already been checked. Here's a simple
| example: https://rust.godbolt.org/z/GPMcYd371
|
| This can make a big difference if you can hoist bounds checks out
| of an inner loop. You get the performance without adding any
| unsafe {}.
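 |
 | A minimal sketch of the idea (not the exact godbolt example): the
 | assert up front lets the compiler drop the individual checks on
 | the four indexing operations that follow.
        fn sum4(a: &[f32]) -> f32 {
            assert!(a.len() >= 4);
            a[0] + a[1] + a[2] + a[3]
        }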
| est31 wrote:
| Yeah this is because the error message printed contains the
| location of the error as well as the attempted index. Thus,
 | there are differences between the bounds failures, and the
 | optimizer can't hoist the check out (plus probably some
| concerns due to side effects of opaque functions).
| [deleted]
| tester756 wrote:
 | I've been shocked when I've heard C programmers actually being
 | concerned about the performance penalty of checks.
|
 | Like, why bother? CPUs in the next 2 years will win back that
 | performance anyway,
|
| and your software will be safer
| Gigachad wrote:
| Most online debates are filled with illogical opinions on
| theoretical issues. You get people on this site complaining
| that they have to spend money on a catalytic converter because
| it's not required for the car to run and only prevents other
| people from getting cancer.
| humanrebar wrote:
| _All_ of the CPUs? C runs a lot of places.
| tester756 wrote:
| If you need perf, then consider
|
| better algorithms, better data structures, multi-threading,
| branchless programming (except safety), data-oriented design
|
 | and only then elimination of checks, not first.
| rfoo wrote:
| A consistent 5 ms difference in micro-benchmarks is definitely
| not "measurement noise". Noise averages out way before
 | accumulating to 5ms. There must be a reason, and it most likely
| relates to the change. So you can confidently say that removing
| bounds checking (at least with how you did it) is a regression.
|
| ... that being said, I'd argue that the most beneficial memory-
| safety feature of Rust is about temporal things (i.e. prevents
| UAF etc) instead of spatial ones.
| spullara wrote:
| A benchmarking harness without error bars?
| whatshisface wrote:
 | Well, there is both random and systematic error in any
| experiment, and if 5ms is small relative to anything you'd
| expect (or there is some other reason to discount it) then it
| might be related to a problem in the benchmarking setup that's
| too small to be worth resolving. Any test is good to within
| some level of accuracy and they don't always average out to
| infinitely good if you rerun them enough times.
| joosters wrote:
| The 5ms isn't the key number. It's 5ms extra over a 28ms
| baseline, that's about 18% difference. If your noise
| threshold is 18%, then I think you have to accept that the
| benchmark probably isn't any good for this stated task.
| viraptor wrote:
| https://github.com/bheisler/criterion.rs is good for tests
| like that. It will give you much more than a single number
| and handle things like outliers. This makes identifying
| noisy tests simpler.
| glittershark wrote:
| The benchmarking harness that the post uses is based on
| criterion
| piwi wrote:
| The article mentions measurement noise several times without
| addressing the uncertainty. It would help to add statistical
| tests, otherwise the spread could let us conclude the opposite of
 | what really happened, just because we are out of luck.
___________________________________________________________________
(page generated 2022-11-30 23:00 UTC)