[HN Gopher] C++'s `Noexcept` Can (Sometimes) Help (Or Hurt) Perf...
___________________________________________________________________
C++'s `Noexcept` Can (Sometimes) Help (Or Hurt) Performance
Author : def-pri-pub
Score : 40 points
Date : 2024-08-05 16:55 UTC (6 hours ago)
(HTM) web link (16bpp.net)
(TXT) w3m dump (16bpp.net)
| TillE wrote:
| > I didn't know std::uniform_int_distribution doesn't actually
| produce the same results on different compilers
|
| I think this is genuinely my biggest complaint about the C++
| standard library. There are countless scenarios where you want
| deterministic random numbers (for testing if nothing else), so
| std's distributions are unusable. Fortunately you can just plug
| in Boost's implementation.
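|
| A minimal sketch of one workaround, assuming only that the engine
| is portable: the standard specifies the exact output sequence of
| std::mt19937, while the algorithm inside
| std::uniform_int_distribution is left to each implementation. The
| helper name `portable_uniform` is hypothetical; it uses plain
| rejection sampling so that the same seed yields the same integers
| on every compiler:

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Hypothetical helper: maps engine output to [lo, hi] using only
// fully specified integer arithmetic, so results match across compilers.
inline std::uint32_t portable_uniform(std::mt19937& gen,
                                      std::uint32_t lo, std::uint32_t hi) {
    assert(lo <= hi);
    const std::uint64_t range = std::uint64_t(hi) - lo + 1;  // 1 .. 2^32
    const std::uint64_t span  = std::uint64_t(1) << 32;      // mt19937 yields 32-bit values
    const std::uint64_t limit = span - (span % range);       // largest multiple of range <= span
    std::uint64_t x;
    do {
        x = gen();               // rejection sampling: discard the biased tail
    } while (x >= limit);
    return lo + static_cast<std::uint32_t>(x % range);
}
```

| Two engines seeded identically, e.g. `std::mt19937 a(42), b(42);`,
| then produce identical sequences from `portable_uniform`.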
| chipdart wrote:
| > There are countless scenarios where you want deterministic
| random numbers (for testing if nothing else), so std's
| distributions are unusable. Fortunately you can just plug in
| Boost's implementation.
|
| I don't understand what your complaint is. If you're already
| plugging in alternative implementations, what stops you from
| actually stubbing these random number generators with any
| realization at all?
| akira2501 wrote:
| It's a compromised and goofy implementation with lots of
| warts. What's the point in having a /standard/ library
| then?
| chipdart wrote:
| > It's a compromised and goofy implementation with lots of
| warts.
|
| I don't think this case qualifies as an example. I think
| the only goofy detail in the story is expecting a random
| number generator to be non-random and deterministic, with
| the only conceivable use case being poorly designed and
| implemented test fixtures.
|
| > What's the point in having a /standard/ library then?
|
| The point of standardized components is to provide reusable
| elements that can be used across all platforms and
| implementations, thus saving on the development effort of
| upgrading and porting the code across implementations and
| even platforms. If you cannot design working software,
| that's not a problem you can pin on the tools you don't
| know how to use.
| kevin_thibedeau wrote:
| > with the only conceivable use case being poorly designed
| and implemented test fixtures.
|
| Reproducible pseudo-randomness is a necessity with fuzz
| testing. It is not a poor design approach when it is
| actually useful.
| forrestthewoods wrote:
| > The point of standardized components is to provide
| reusable elements that can be used across all platforms
| and implementations, thus saving on the development
| effort of upgrading and porting the code across
| implementations and even platforms.
|
| It's a shame that C++'s "standardized" components ARE
| COMPLETELY DIFFERENT on different platforms.
|
| Some of the C++ standard _requires_ per-platform
| implementation work. For example std::thread on Linux and
| Windows obviously must have a different implementation.
| However a super majority of the standard API is just
| vanilla C++ code. For example std::vector or
| std::unordered_map. The fact that the standard defines a
| spec which is then implemented numerous times is absurd,
| stupid, and bad. The specs are simultaneously over-constrained
| and under-constrained. It's a disaster.
| gumby wrote:
| I consider the current tradeoff to be a feature.
|
| It permits implementations to take advantage of target-
| specific affordances (your thread case is an example) as
| well as taking different implementation strategies (e.g.
| the small string optimization is different in libc++ and
| libstdc++). Also you may use another, independent
| standard library because you prefer its implementation
| decisions. Meanwhile they remain compatible at the source
| level.
| quotemstr wrote:
| > I think this is genuinely my biggest complaint about the C++
| standard library
|
| What do you think of Abseil hash tables randomizing themselves
| (piggybacking on ASLR) on each start of your program?
| compiler-guy wrote:
| Even a speedup of around 1% (if it is consistent and in a
| carefully controlled experiment) is significant for many
| workloads, if the workload is big enough.
|
| The OP treats this as being in the noise, which it may be for
| that particular workload. But across a giant distributed system
| like YouTube or Google Search, it is a real gain.
| rwmj wrote:
| Shouldn't the compiler deduce noexcept for you?
| Eyas wrote:
| It probably can in a .cc file but if you're importing another
| library and just have access to the header, it wouldn't know
| how to.
| compiler-guy wrote:
| The compiler can tell about the immediate function, but not any
| functions it calls.
|
| If a function marked noexcept calls a function that throws an
| exception, then the program is terminated with an uncaught
| exception. A called function can throw through a non-noexcept
| function to a higher-level exception handler no problem.
|
| So in order to avoid changing the semantics of the function,
| the compiler would have to be able to determine that the
| transitive closure of called functions never throws at run time,
| and that problem is undecidable, even assuming the requirement
| that "the compiler can see the source of all those functions"
| is somehow met, which it won't be.
| lionkor wrote:
| No, noexcept confusingly doesn't mean "does not throw
| exceptions" in that sense. There is no constraint that says you
| can only call noexcept code from noexcept code - quite the
| opposite. Noexcept puts NO constraints on the code.
|
| All noexcept does is catch any exception and immediately
| std::terminate. Confusingly this means that noexcept should
| really be called deathexcept, since any exception thrown within
| kills the program.
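|
| A minimal sketch of that point, with hypothetical function names:
| the compiler accepts a `noexcept` function that calls potentially
| throwing code, and only the run-time behavior differs if an
| exception actually tries to escape.

```cpp
#include <cassert>
#include <stdexcept>

// Not noexcept: may throw depending on its argument.
void may_throw(bool fail) {
    if (fail) throw std::runtime_error("boom");
}

// Compiles fine: noexcept places no constraint on what is called.
// The exception is handled inside, so nothing escapes.
int guarded(bool fail) noexcept {
    try {
        may_throw(fail);
        return 0;
    } catch (...) {
        return -1;
    }
}

// Also compiles. If may_throw(true) is reached here, the exception
// cannot leave the function and std::terminate is called instead.
void unguarded(bool fail) noexcept {
    may_throw(fail);
}
```

| Calling `unguarded(false)` is harmless; calling `unguarded(true)`
| ends the program.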
| nextaccountic wrote:
| Oh, so it's like Rust's panic=abort
| singron wrote:
| Panic=abort is more like -fno-exceptions since it applies
| to all the code being compiled and not just one function.
| Codegen can also take advantage of the fact that it won't
| have to unwind.
|
| I don't think there is a rust equivalent of noexcept.
| weinzierl wrote:
| Does C++ std::terminate unwind the stack and call
| destructors? Then noexcept would be pretty close to
| regular Rust panics, wouldn't it?
|
| Basically an unrecoverable exception?
| favorited wrote:
| The behavior of `-fno-exceptions` isn't standardized,
| because it's a compiler feature, not a part of the C++
| standard. The standard says:
|
| > In the situation where no matching [exception] handler
| is found, it is implementation-defined whether or not the
| stack is unwound before std::terminate is invoked. In the
| situation where the search for a handler encounters the
| outermost block of a function with a non-throwing
| exception specification, it is implementation-defined
| whether the stack is unwound, unwound partially, or not
| unwound at all before the function std::terminate is
| invoked.
|
| So, the whole thing is basically implementation-defined
| (including `-fno-exceptions`, since that is something
| that implementing compilers provide).
| dgrunwald wrote:
| > All noexcept does is catch any exception and immediately
| std::terminate.
|
| While that's a possible implementation, the standard is a bit
| more relaxed: `noexcept` may also call `std::terminate`
| immediately when an exception is thrown, without calling
| destructors in the usual way a catch block would do.
|
| https://godbolt.org/z/YTe84M5vq test1 has a ~S() destructor
| call if maybe_throw() throws; test2 never calls ~S().
|
| MSVC does not appear to support this optimization, so using
| `noexcept` with MSVC involves overhead similar to the catch-
| block.
| gpderetta wrote:
| More than an optimization, it is a different exception handling
| philosophy.
|
| AFAIK Itanium ABI exception handling requires two-phase
| unwinding: first the stack is traversed looking for a valid
| landing pad; if that succeeds, the stack is traversed again,
| calling all destructors. If it fails, it calls std::terminate.
| This is actually slower as it needs to traverse twice, but the
| big advantage is that if the program would abort, the state of
| the program is preserved in the core file. This is easily
| generalized to noexcept functions: no unwind info is generated
| for those, so unwinding always fails.
|
| MSVC used to do one pass unwind, but I thought they changed
| it when they implemented table based unwind for x64.
| chipdart wrote:
| > All noexcept does is catch any exception and immediately
| std::terminate.
|
| I don't think this is a decent interpretation of what noexcept
| does. It misses the whole point of this feature, and
| confuses a failsafe with the point of using it.
|
| The whole point of noexcept is to tell the compiler that the
| function does not throw exceptions. This allows the compiler
| to apply optimizations, such as not needing to track down the
| necessary info to unwind the call stack when an exception is
| thrown. Some containers are also designed to only invoke move
| constructors if they are noexcept and otherwise will copy
| values around.
|
| As the compiler omits the info required to recover from
| exceptions, if one is indeed thrown and bubbles up to the
| noexcept function then it's not possible to do the necessary
| janitorial work. Therefore, std::terminate is called instead.
| Arech wrote:
| No, it can't do that. My speculation is that this is because, in
| the general case, it is an undecidable problem similar to the
| halting problem
| https://en.wikipedia.org/wiki/Halting_problem.
|
| The best it can do is to say whether a given function is
| qualified with `noexcept` (see noexcept() operator
| https://en.cppreference.com/w/cpp/language/noexcept)
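|
| A minimal sketch of how the noexcept() operator is typically used
| to propagate the qualifier by hand in templates. The types
| `Quiet` and `Loud` and the function `call_run` are hypothetical.
| The operator only inspects declarations; it does not analyze
| function bodies.

```cpp
#include <utility>

struct Quiet { void run() noexcept {} };  // promises not to throw
struct Loud  { void run() {} };           // makes no promise

// Conditional noexcept: call_run is noexcept exactly when t.run() is
// declared noexcept for the given T.
template <class T>
void call_run(T& t) noexcept(noexcept(t.run())) {
    t.run();
}

static_assert(noexcept(call_run(std::declval<Quiet&>())),
              "propagated from Quiet::run");
static_assert(!noexcept(call_run(std::declval<Loud&>())),
              "Loud::run is not declared noexcept");
```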
| olliej wrote:
| A lot of people have answered no, which is likely the correct
| answer to what you're asking, but I wanted to be clear about
| exactly what that is?
|
| Take a hypothetical piece of code
|
| some_library_header.h:
|     void library_function();
|
| some_project_header_1.h:
|     void project_function1();
|
| some_project_header_2.h:
|     void project_function2();
|
| some_project_file.cpp:
|     void project_function2() {
|         /* definition */
|         // no exception
|     }
|     void test_function1() {
|         library_function();
|     }
|     void test_function2() {
|         project_function1();
|     }
|     void test_function3() {
|         project_function2();
|     }
|
| For your question, where are you wanting to know if the
| compiler can deduce noexcept?
| hoten wrote:
| I don't feel like this article illuminates anything about how
| noexcept works. The asm diff at the end suggests _there is no
| difference_ in the emitted code. I plugged it into godbolt myself
| and see absolutely no difference. https://godbolt.org/z/jdro5jdnG
|
| It seems the selected example function may not be exercising
| noexcept. I suppose the assumption is that operator[] is
| something that can throw, but ... perhaps the machinery lives
| outside the function (so should really examine function calls),
| or is never emitted without a try/catch, or operator[] (though
| not marked noexcept...) doesn't throw b/c OOB is undefined
| behavior, or ... ?
| quuxplusone wrote:
| > I don't feel like this article illuminates anything about how
| noexcept works. The asm diff at the end suggests _there is no
| difference_ in the emitted code.
|
| You are absolutely correct. The OP is basically testing the
| hypothesis "Wrapping a function in `noexcept` will magically
| make it faster," which is (1) nonsense to anyone who knows how
| C++ works, and also (2) trivially easy to falsify, because all
| you have to do is look at the compiled code. Same codegen? Then
| it's not going to be faster (or slower). You needn't spend all
| those CPU cycles to find out what you already know by looking.
|
| There _has_ been a fair bit of literature written on the
| performance of exceptions and noexcept, but OP isn't
| contributing anything with this particular post.
|
| Here are two of my own blog posts on the subject. The first one
| is just an explanation of the "vector pessimization" which was
| also mentioned (obliquely) in OP's post -- but with an actual
| benchmark where you can see why it matters.
| https://quuxplusone.github.io/blog/2022/08/26/vector-pessimi...
| https://godbolt.org/z/e4jEcdfT9
|
| The second one is much more interesting, because it shows where
| `noexcept` can actually have an effect on codegen _in the core
| language_. TLDR, it can matter on functions that the compiler
| can't inline, such as when crossing ABI boundaries or when (as
| in this case) it's an indirect call through a function pointer.
| https://quuxplusone.github.io/blog/2022/07/30/type-erased-in...
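|
| A minimal sketch of the function-pointer case, assuming C++17 or
| later, where `noexcept` is part of the function type. The alias
| `Callback` and the functions below are hypothetical. Because the
| compiler cannot see through the pointer, the type is the only
| place the no-throw promise can come from:

```cpp
#include <cassert>
#include <type_traits>

// C++17: a pointer type that only accepts non-throwing functions.
using Callback = int (*)(int) noexcept;

int twice(int x) noexcept { return 2 * x; }

int risky(int x) {
    if (x < 0) throw x;
    return x;
}

// The indirect call is known not to throw purely from the pointer type.
int apply(Callback cb, int x) noexcept {
    return cb(x);
}

// A potentially throwing function does not convert to Callback...
static_assert(!std::is_convertible<decltype(&risky), Callback>::value,
              "risky is not noexcept");
// ...but a noexcept pointer converts to the plain pointer type.
static_assert(std::is_convertible<Callback, int (*)(int)>::value,
              "dropping noexcept is allowed");
```

| `apply(twice, 21)` compiles and runs; `apply(risky, 21)` is
| rejected at compile time.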
| hoten wrote:
| That's what I'm talking about! Thanks for sharing, I learned
| quite a few things about noexcept from your articles.
| secondcoming wrote:
| The example is bad. Maybe this illustrates it better:
|
| https://godbolt.org/z/1asa7Tjq9
| Arech wrote:
| That's quite interesting, and a huge amount of work has been done
| here; respect for that.
|
| Here's what jumped out at me: the `noexcept` qualifier is not
| free in some cases, particularly when a function could actually
| throw but is marked `noexcept`. In that case, the compiler still
| must set something up to fulfil the main `noexcept` promise: call
| `std::terminate()` if an exception is thrown. For example,
| `std::vector::push_back()` could throw on reallocation failure,
| so if a `noexcept`-qualified function calls it, the compiler must
| take that into account. That means that putting `noexcept` on
| each and every function blindly, without any regard to whether
| the function could really throw, doesn't actually
| test/benchmark/prove anything, since, as the author correctly
| said, you won't ever do this in a real production project.
|
| It would be really interesting to take a look at the full code of
| the cases that showed very bad performance. However, here we're
| approaching the second issue: if that's the core benchmark code:
| https://github.com/define-private-public/PSRayTracing/blob/a...
| then unfortunately it's totally invalid, since it measures time
| with `std::chrono::system_clock`, which isn't monotonic. Given
| how long the code took to run, it's almost certain that the
| clock has been adjusted several times...
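|
| A minimal sketch of interval timing with a monotonic clock. The
| helper name `time_it` is hypothetical and this is not the
| benchmark's actual code. `std::chrono::steady_clock` is never
| adjusted backwards, so the difference of two readings cannot be
| negative:

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: runs `work` once and returns the elapsed time
// measured on the monotonic steady_clock.
template <class F>
std::chrono::nanoseconds time_it(F&& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start);
}

static_assert(std::chrono::steady_clock::is_steady,
              "steady_clock is monotonic by definition");
```

| Usage: `auto ns = time_it([] { /* code under test */ });`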
| zokier wrote:
| > then unfortunately it's totally invalid since it measures
| time with the `std::chrono::system_clock` which isn't
| monotonic. Given how long the code required to run, it's almost
| certain that the clock has been adjusted several times
|
| monotonic clocks are mostly useful for short measurement
| periods. for long-term timing wall-time clocks (with their
| adjustments) are more accurate because they will drift less.
| bodyfour wrote:
| > in that case, a compiler still must set something up to
| fulfil the main `noexcept` promise - call `std::terminate()`
|
| This is actually something that has been more of a problem in
| clang than gcc due to LLVM IR limitations... but that is being
| fixed (or maybe already is?). There was a presentation about it
| at the 2023 LLVM Developers' Meeting, which was recently
| published on their YouTube channel:
| https://www.youtube.com/watch?v=DMUeTaIe1CU
|
| The short version (as I understand it) is that you don't really
| need to produce any code to call std::terminate; all you need
| is to tell the linker to leave a hole in the table that maps
| %rip to the required unwind actions. If the unwinder doesn't
| know what to do, it will call std::terminate per the standard.
|
| LLVM IR didn't have a way of expressing this "hole", though, so
| instead clang was forced to emit an explicit "handler" to do
| the std::terminate call.
| Night_Thastus wrote:
| I thought I saw this post, or a _very_ similar one, a couple
| years ago. Does anyone else remember that? Yet I don't see it in
| the post history.
| plorkyeran wrote:
| The most common place where noexcept improves performance is on
| move constructors and move assignments when moving is cheaper
| than copying. If your type is not nothrow-movable, std::vector
| will copy it instead of moving when resizing, as the move
| constructor throwing would leave the vector in an invalid state
| (while the copy constructor throwing leaves the vector
| unchanged).
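|
| A minimal sketch that counts copies and moves during a vector
| reallocation. The types `ThrowingMove` and `NothrowMove` and the
| helper `grow_once` are hypothetical. The only difference between
| the two types is the `noexcept` on the move constructor:

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

struct Counters { int copies = 0; int moves = 0; };

struct ThrowingMove {
    Counters* c;
    explicit ThrowingMove(Counters* c_) : c(c_) {}
    ThrowingMove(const ThrowingMove& o) : c(o.c) { ++c->copies; }
    ThrowingMove(ThrowingMove&& o) : c(o.c) { ++c->moves; }   // not noexcept
};

struct NothrowMove {
    Counters* c;
    explicit NothrowMove(Counters* c_) : c(c_) {}
    NothrowMove(const NothrowMove& o) : c(o.c) { ++c->copies; }
    NothrowMove(NothrowMove&& o) noexcept : c(o.c) { ++c->moves; }
};

// Stores one element, then forces exactly one reallocation and reports
// how the existing element was transferred to the new buffer.
template <class T>
Counters grow_once() {
    Counters counters;
    std::vector<T> v;
    v.reserve(1);
    v.emplace_back(&counters);        // constructed in place: no copy, no move
    v.reserve(v.capacity() + 1);      // forces a reallocation
    return counters;
}

static_assert(!std::is_nothrow_move_constructible<ThrowingMove>::value, "");
static_assert(std::is_nothrow_move_constructible<NothrowMove>::value, "");
```

| On the mainstream standard libraries, `grow_once<ThrowingMove>()`
| reports a copy and `grow_once<NothrowMove>()` reports a move.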
|
| Platforms with setjmp-longjmp based exceptions benefit greatly
| from noexcept as there's setup code required before calling
| functions which may throw. Those platforms are now mostly gone,
| though. Modern "zero cost" exceptions don't execute a single
| instruction related to exception handling if no exceptions are
| thrown (hence the name), so there just isn't much room for
| noexcept to be useful to the optimizer.
|
| Outside of those two scenarios there isn't any reason to expect
| noexcept to improve performance.
| jzwinck wrote:
| There is another standard library related scenario: hash
| tables. The std unordered containers will store the hash of
| each key unless your hash function is noexcept. Analogous to
| how vector needs noexcept move for fast reserve and resize,
| unordered containers need noexcept hash to avoid extra memory
| usage. See
| https://gcc.gnu.org/onlinedocs/libstdc++/manual/unordered_as...
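|
| A minimal sketch of a custom hash functor marked `noexcept`. The
| key type `Point` and the functor `PointHash` are hypothetical.
| Whether a hash code is cached per node is an implementation
| detail (the link above describes libstdc++); the code only shows
| how to make the no-throw property visible and checkable:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_set>
#include <utility>

struct Point { int x; int y; };   // hypothetical key type

inline bool operator==(const Point& a, const Point& b) noexcept {
    return a.x == b.x && a.y == b.y;
}

struct PointHash {
    // noexcept lets the container observe that hashing cannot throw.
    std::size_t operator()(const Point& p) const noexcept {
        return std::hash<int>{}(p.x) * 31u + std::hash<int>{}(p.y);
    }
};

// Compile-time check that the promise is really part of the declaration.
static_assert(noexcept(std::declval<const PointHash&>()(
                  std::declval<const Point&>())),
              "PointHash must be noexcept");

using PointSet = std::unordered_set<Point, PointHash>;
```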
| 10tacobytes wrote:
| This is the correct analysis. The article's author could have
| saved themselves (and the reader) a good amount of blind data
| diving by learning more about exception processing beforehand.
| olliej wrote:
| I would like to have seen a comparison that actually includes
| -fno-exceptions, rather than just noexcept. My _assumption_ is
| that to get a consistent gain from noexcept, you would need every
| function called to be explicitly noexcept, because a bunch of the
| cost of exceptions is code size and state required to support
| unwinding. So if that is where the performance cost of exception
| handling comes from, then as long as _anything_ can cause an
| exception (or, more accurately, unless every opaque call is
| explicitly marked as not throwing), that overhead remains.
|
| That said, I'm still confused by the perf results of the article,
| especially the perlin noise vs MSVC one. It's a sufficiently weird
| outlier that it makes me wonder if something in the compiler has
| a noexcept path that adds checks that aren't usually on (i.e.,
| imagine the code has a "debug" mode that did bounds checks or
| something, but the function resolution you hit in the noexcept
| path always does the bounds check - I'm really not sure exactly
| how you'd get that to happen, but "non-default path was not
| benchmarked" is not exactly an uncommon occurrence)
| quotemstr wrote:
| There's a lot of mysticism and superstition surrounding C++
| exceptions. It's instructive to sit down with godbolt and examine
| specific scenarios in which noexcept (or exceptions generally)
| can affect performance. Read the machine code. Understand _why_
| the compiler does what it does. Don't want to invest at that
| level? You probably want to use a higher level language.
___________________________________________________________________
(page generated 2024-08-05 23:00 UTC)