[HN Gopher] Mysterious Memset
___________________________________________________________________
Mysterious Memset
Author : ibobev
Score : 96 points
Date : 2022-05-11 10:43 UTC (12 hours ago)
(HTM) web link (vector-of-bool.github.io)
(TXT) w3m dump (vector-of-bool.github.io)
| awoimbee wrote:
| Couldn't the author pass a reference to a std::string instead of
| switching to a std::u8string? Is that a C++ limitation? And I
| really don't understand why the compiler doesn't assume strict
| aliasing in the first example.
| einpoklum wrote:
| I have one word for us all: restrict ...
|
| https://en.cppreference.com/w/c/language/restrict
|
| Still missing from C++ after all these years. Get it in there,
| people!
|
| See also:
|
| https://stackoverflow.com/q/776283/1593077
| fefe23 wrote:
| I don't get the reference to undefined behavior.
|
| This is an aliasing issue, which is pretty well defined.
|
| Am I missing something?
|
| Other than that: Interesting insight!
| sharikous wrote:
| I think the idea is that if the user had aliased *count and the
| string they would have got undefined behavior.
|
| So the compiler can assume they didn't and is allowed to
| optimize away the loop with memset.
| andi999 wrote:
| Or do more nefarious stuff.
| eklitzke wrote:
| Aliasing in the first example is well defined, so no UB in the
| first example. Aliasing in the second example would be
| undefined in the sense that the strict aliasing rules forbid
| the two pointer types from aliasing in the first place.
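| A minimal sketch of the well-defined case, assuming a 4-byte int.
| zero_prefix is a hypothetical stand-in for the article's first
| example, not its actual code. Because a char* may legally point
| into any object, the loop has to re-read *count after every store:

```cpp
#include <cassert>
#include <cstddef>

static_assert(sizeof(int) == 4, "this sketch assumes a 4-byte int");

// Zeroes data[0 .. *count) and returns how many bytes were written.
// `data` is a char*, so it may legally point at the bytes of *count.
std::size_t zero_prefix(char* data, const int* count) {
    assert(data != nullptr && count != nullptr);
    std::size_t written = 0;
    for (int i = 0; i < *count; ++i) {  // *count is re-read each iteration
        data[i] = 0;                    // may change *count when they alias
        ++written;
    }
    return written;
}
```

| Called as zero_prefix(reinterpret_cast<char*>(&n), &n) with
| n = 0x7f7f7f7f, the loop zeroes its own bound and stops after
| sizeof(int) writes. A memset sized from the initial *count would
| write far past the int, so the compiler cannot substitute one here.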
| masklinn wrote:
| > I don't get the reference to undefined behavior.
|
| > [...]
|
| > Am I missing something?
|
| `int*` and `char8_t*` aliasing is UB under strict aliasing.
|
| You can call it "an aliasing issue, which is pretty well
| defined". As far as the compiler is concerned it's no different
| than every other UB it uses for constraints inference.
| hermitdev wrote:
| To the author: the initial for loop is not equivalent to a
| do/while loop. The condition of the for loop is executed before
| the loop runs. e.g. the loop body will never run if *count == 0.
| codetrotter wrote:
| There's an if statement around the do while loop on that page,
| so that the do while loop also will not run if *count == 0.
|
| The code snippets
|
|     for (int i = 0; i < *count; ++i) {
|         str[i] = 0;
|     }
|
| and
|
|     if (*count > 0) {
|         long idx = 0;
|         do {
|             str->__data[idx] = 0;
|             idx += 1;
|         } while (idx < *count);
|     }
|
| will produce the same changes to the string data given the same
| count.
| andi999 wrote:
| Nice. Simplest fix would be to introduce a local variable int
| c=*count before the loop (which you would also do with a
| non-optimizing compiler so the count could be in a register).
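| A sketch of that fix, with zero_prefix_hoisted as a hypothetical
| name. Copying the bound into a local means no store through the
| string's char buffer can change it. Whether a given compiler then
| emits a memset call is something to check in its output:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Same loop, but the bound is copied into a local first. The address of `c`
// is never taken, so writes to the string's characters cannot modify it.
void zero_prefix_hoisted(std::string& str, const int* count) {
    const int c = *count;  // read once, before any writes
    assert(c >= 0 && static_cast<std::size_t>(c) <= str.size());
    for (int i = 0; i < c; ++i) {
        str[i] = 0;
    }
}
```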
| taneq wrote:
| I've been coding commercially in C and C++ (not full time but
| pretty regularly) for over 20 years and I'm starting to think
| the simplest fix is to not use C or C++.
| ncmncm wrote:
| The point, which they probably should have spelled out better,
| was that std::string contains a char* that aliases everything
| there might be a pointer to. It is often hard to prevent
| pointers being created to things; calling a function that takes
| a "T const&" argument, the compiler is typically obliged to
| assume that a pointer to the argument was taken and retained.
|
| So, the "*count" in the example is a stand-in for a zillion
| other things you would actually have written, and that would
| also be affected by aliasing assumptions.
| LAC-Tech wrote:
| Interesting, yet terrifying, as is all C++ wizardry to me.
| spyremeown wrote:
| I had to write business logic in C++20 for an embedded Linux
| box and o-m-g. It sucked. So much. The developer experience was
| horrendous. All the PRs dragged along because of micro-
| optimizations every step of the way. "Let me see the generated
| assembly in Godbolt" why, why?
|
| I know this is probably not all due to the language, but at
| least one bit of it has to be. It's really cool if you're
| writing bare-metal/RTOS level embedded stuff and you're
| worrying about how many assignments you can put into a loop
| round to optimize the cache lines, but I don't understand why
| anyone would ever try to talk to the web using C++.
| AshamedCaptain wrote:
| > "Let me see the generated assembly in Godbolt" why, why?
|
| This is an embedded thing, not a C or C++ thing.
|
| The other day someone was saying here on HN that writing
| Verilog feels like following a process where "you already
| more or less know which circuit you want, you are just
| trying to figure out which specific Verilog code will
| get your synthesizer to generate that circuit".
|
| On embedded platforms (or generally anywhere where you count
| memory usage in units of KB or less), that's exactly what
| many people do. They already know more or less the assembly
| code, they are just looking for the right higher-level
| program that will translate to that assembly. That's one of
| the reasons they get angry when the language tries to be too
| smart.
|
| The reason you just don't code in assembly directly is
| because it's still a pain and your chances of mistake
| increase (e.g. doing complex arithmetic expressions).
| spyremeown wrote:
| Yea, I know all this jazz, I was, once upon a time, the
| dude unrolling loops in asm to get digital filters faster
| on PIC32s.
|
| It's still a Linux box, with plenty of memory to spare. This
| IS a C++ problem, where people are driven to do these
| insane optimizations.
| vanderZwan wrote:
| Does that also mean that embedded programming involves
| being conservative about compiler updates? Because
| otherwise those choices might become completely invalid one
| upgrade later
| electroly wrote:
| It's not unreasonable to stick with the same compiler
| version for the entire life of a product. I was involved
| with a hardware project where we attempted to upgrade the
| compiler mid-lifecycle; it caused a subtle malfunction
| (almost certainly due to a latent bug or undocumented
| compiler behavior dependency in our code, but we couldn't
| find it) and we simply decided never to upgrade the
| compiler for the life of the product. It wasn't
| considered a big deal to do so.
| userbinator wrote:
| The basic principle is that known unknowns are better
| than unknown unknowns.
| AshamedCaptain wrote:
| It's worse than that; many times the principle is: known
| evils are better than unknown angels.
| ncmncm wrote:
| Which are in turn better than unknown knowns, or things
| you were _just sure_ were true, but just ain't.
|
| Had you not been "sure", you might have done something
| that would work right.
| adrianN wrote:
| For safety critical software for example upgrading the
| compiler means you need to recertify the product. That's
| very expensive.
| jcelerier wrote:
| > "Let me see the generated assembly in Godbolt" why, why
|
| What else do you propose ?
| lpapez wrote:
| > All the PRs dragged along because of micro-optimizations
| every step of the way. "Let me see the generated assembly in
| Godbolt" why, why?
|
| I think a large part of it is due to the language. C++ code
| can be simultaneously both very low-level and highly-
| abstracted and then you will get reviewers complaining about
| needlessly copying 48 bytes while making a network request...
| scoutt wrote:
| It might depend on whether your program does 1 request every now
| and then, or if your program is doing several thousands
| network requests per second (then I might complain too).
| lpapez wrote:
| Of course it depends on what you are building, but my
| point is that the language gives you access to the low-
| level facilities which nudge people to worry/think about
| them even when they are irrelevant and unimportant.
| Because details like copy elision are usually an obvious
| point which can be improved upon, and people generally
| have a need to participate and contribute, small things
| will be mentioned on review and delay the feature even
| when it makes no difference. It's not the fault of the
| language itself but rather the culture around it and the
| easy and obvious answer to this "just don't use C++ if you
| don't need it" stops being easy and obvious when you try
| to actually interop with other languages. </rant>
| saagarjha wrote:
| Better that than having to convince someone your code is fine
| and doesn't need to have variables pulled out of the loop for
| "optimization". Compilers are good at code motion and many
| engineers will micro-optimize things that end up hurting
| readability for no benefit unless you show them Godbolt.
| gumby wrote:
| > All the PRs dragged along because of micro-optimizations
| every step of the way. "Let me see the generated assembly in
| Godbolt" why, why?
|
| > I know this is probably not all due to the language, but at
| least one bit of it has to be
|
| No, there's a cultural problem. C++ gives you the power and
| flexibility to really optimize for space or performance but
| rarely is that worth it early in development. Instead, just
| start by simply writing the code. Good design will give you
| affordances for appropriate optimization later.
|
| And if your "embedded" system is so massive it can run Linux
| then you could end up with better code density by using
| Python source and including the interpreter!
| ncmncm wrote:
| It would be the same in C, of course. Except worse.
| fguerraz wrote:
| Not really mysterious or a surprise. Passing count as a reference
| was a huge red flag from the very beginning.
| bonzini wrote:
| Replace *count with this->count and you can see why this can
| cause pessimization.
| fguerraz wrote:
| What would that change? If this->count was a pointer you'd
| still have to de-reference it explicitly. Or am I missing the
| point of your argument?
| bonzini wrote:
| I agree that in this toy example passing a count argument
| by reference makes little sense, but you'd get the same
| thing if you accessed the count field of
|
|     struct s {
|         int count;
|         char *chars;
|     };
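| A sketch of the loop that struct implies. clear_chars is a
| hypothetical name and the struct is copied from the comment above.
| The store goes through a char*, so it is allowed to alias the
| struct's own fields:

```cpp
#include <cassert>

struct s {
    int count;
    char* chars;
};

// `self.chars[i] = 0` is a store through a char*, which may alias
// `self.count` (and `self.chars` itself), so both may be reloaded each time.
void clear_chars(s& self) {
    assert(self.chars != nullptr || self.count <= 0);
    for (int i = 0; i < self.count; ++i) {
        self.chars[i] = 0;
    }
}
```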
| ReactiveJelly wrote:
| It surprised me, cause I forgot that aliasing is a problem for
| C. Guess I'm one of today's 10,000.
| jhgb wrote:
| I'm sort-of, kind-of aware of the issue but what surprised me
| was how std::string pulls this into picture here. Is this
| because of what str[i] gets compiled into when str is a
| basic_string of chars? That seems non-obvious to me because
| I'm just accustomed to dealing with C++ strings as fairly
| opaque entities. If they somehow expose to the compiler that
| you're really manipulating characters via pointers, I can
| understand how this complicates things.
| Diggsey wrote:
| It would be a surprise to anyone not very familiar with C++'s
| strict-aliasing rule, which I imagine is quite a lot of people
| (and even many C++ programmers...)
|
| Very few other languages have a similar rule (eg. Rust
| explicitly does not have this rule)
| kevincox wrote:
| Although Rust doesn't have this rule it isn't nearly as
| relevant due to the way immutable (shared) references work.
| While `a: &Foo` and `b: &Bar` may alias it doesn't matter
| much because Rust knows that nothing can write to them so it
| can do basically all of the optimization it needs. Rust also
| knows that `c: &mut T` doesn't alias with anything so in this
| case it can make loads of assumptions.
|
| One chink in this rule is interior mutability. This is why it
| is better to think of `&T` as a shared reference than an
| immutable reference. For example in the following code Rust
| can't assume that `a.get() == 5`. (Even if the second
| argument is changed to i8)
|
|     pub fn test(a: &std::cell::Cell<u8>, b: &std::cell::Cell<u8>) -> u8 {
|         a.set(5);
|         b.set(7);
|         a.get()
|     }
|
| https://rust.godbolt.org/z/e1c6Kx4eb
|
| Commenting out the write to b does allow Rust to hardcode the
| return value as 5.
| ahefner wrote:
| The surprising thing to me is that 'char8_t' (a new C++20
| thing I'd never heard of) is not just a typedef to char with
| all the same aliasing implications, but a new and distinct
| type to which the magic 'char' alias rules don't apply (also,
| unsigned).
| Sharlin wrote:
| It basically has to be distinct from `char` because you
| can't portably use `char` to hold a UTF-8 code unit
| (because the guaranteed valid range is only 0x00 to 0x7F).
| Also, this way you can overload based on legacy char vs
| UTF-8 char, and have `std::basic_string<char>` and
| `std::basic_string<char8_t>` (aka `std::string` and
| `std::u8string`) be distinct types as well. So finally in
| C++20 we actually have a portable UTF-8 string type!
| planede wrote:
| > you can't portably use `char` to hold a UTF-8 code
| unit
|
| That's not true, in C++ a byte is guaranteed to be able
| to hold a UTF-8 code unit.
|
| https://timsong-cpp.github.io/cppwp/n4868/intro.memory#1.sen...
| Sharlin wrote:
| Yes, but if `char` is signed, as it usually is, its bit
| patterns correspond to values -0x80 to 0x7F. So yeah, you
| can no-cost encode the >=0x80 code units as their two's
| complement counterparts but it feels suspicious. At least
| to me, after writing some Rust lately which very much
| does not do implicit signed-unsigned conversions. Much
| better for char to always represent the "basic character
| set" (ie. usually ASCII) and have a distinct type for
| UTF-8.
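| A small sketch of the signedness pitfall, assuming 8-bit char.
| count_non_ascii and its self-check are hypothetical names.
| Converting to unsigned char before comparing gives the same result
| whether plain char is signed or not:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Counts code units >= 0x80, i.e. bytes that belong to a multi-byte UTF-8
// sequence. The cast matters: where char is signed and 8 bits wide,
// `c >= 0x80` is never true.
std::size_t count_non_ascii(const std::string& utf8) {
    std::size_t n = 0;
    for (char c : utf8) {
        if (static_cast<unsigned char>(c) >= 0x80) {
            ++n;
        }
    }
    return n;
}

// Self-check: "\xC3\xA9" is U+00E9 encoded as two UTF-8 code units.
void check_count_non_ascii() {
    assert(count_non_ascii("abc") == 0);
    assert(count_non_ascii("a\xC3\xA9") == 2);
}
```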
| ncmncm wrote:
| Anyway, a _UTF-8 sequence transport_ type. Few of
| u8string member operations make sense for UTF-8 as such.
| Usually that doesn't matter. Sometimes it matters a
| great deal, and we will need a whole new API for that.
| Sharlin wrote:
| Yeah, good point.
| ncmncm wrote:
| And also distinct from std::byte, which they are hoping
| will pick up aliasing properties of char, allowing use and
| also abuse of char* to be someday eliminated from new code.
| std::byte does alias everything, but no operations are
| defined on it except copying (and, weirdly, bitwise
| operators).
| MauranKilom wrote:
| Also, char is a distinct type from both signed char and
| unsigned char (even though it has the same size as both and
| the same signedness as one of them).
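| This can be checked at compile time. A sketch, with CharKind and
| kind_of as hypothetical names: the three types are distinct enough
| to overload on, even though they share a size:

```cpp
#include <cassert>
#include <type_traits>

static_assert(!std::is_same<char, signed char>::value, "distinct types");
static_assert(!std::is_same<char, unsigned char>::value, "distinct types");
static_assert(sizeof(char) == sizeof(signed char) &&
              sizeof(char) == sizeof(unsigned char), "same size");

enum class CharKind { Plain, Signed, Unsigned };

// Three separate overloads; this compiles only because the types differ.
CharKind kind_of(char) { return CharKind::Plain; }
CharKind kind_of(signed char) { return CharKind::Signed; }
CharKind kind_of(unsigned char) { return CharKind::Unsigned; }

void check_kind_of() {
    assert(kind_of('a') == CharKind::Plain);
    assert(kind_of(static_cast<signed char>('a')) == CharKind::Signed);
    assert(kind_of(static_cast<unsigned char>('a')) == CharKind::Unsigned);
}
```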
| jhgb wrote:
| Are signed char and unsigned char subject to the same "can
| alias anything" rule? I never bothered to think about
| this in the past. Now I'm not sure. I've known that the
| three types are distinct, but that just means that
| the rule doesn't _have_ to apply to all of them, not that
| it doesn't.
| saagarjha wrote:
| Yes, the rule applies to all three character types.
| jhgb wrote:
| Thanks. This seems mildly ungoogleable, or at least I
| haven't been able to find a good search result for this.
| So I'll keep in mind that these are three distinct types
| with the same aliasing behavior.
| saagarjha wrote:
| Actually, I take that back and should clarify: it's all
| three only in C. In C++ it is just char and unsigned
| char. (If you want to search for this, "signed char
| aliasing" gave me good results.)
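| A sketch of what that permission is typically used for: reading an
| object's bytes through unsigned char*. all_bytes_zero is a
| hypothetical helper, shown only to illustrate the access pattern:

```cpp
#include <cassert>
#include <cstddef>

// Reads the object representation of `value` byte by byte. Accessing any
// object through `const unsigned char*` is permitted by the aliasing rules.
template <typename T>
bool all_bytes_zero(const T& value) {
    const unsigned char* bytes = reinterpret_cast<const unsigned char*>(&value);
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        if (bytes[i] != 0) {
            return false;
        }
    }
    return true;
}

void check_all_bytes_zero() {
    assert(all_bytes_zero(0));
    assert(!all_bytes_zero(1));
    assert(!all_bytes_zero(0x01000000L));
}
```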
| matheusmoreira wrote:
| I agree, it's surprising. Types like uint8_t* are widely
| used to reinterpret other structures as byte arrays which
| implies they are universal aliases just like char*. Not
| sure what makes char8_t different.
| wahern wrote:
| > Types like uint8_t* are widely used to reinterpret
| other structures as byte arrays which implies they are
| universal aliases just like char*
|
| Alternatively, it implies that there's a lot of broken
| code out there. So much broken code that they've
| accidentally found safety in numbers, and compilers are
| unlikely to change a coincidental behavior upon which
| they wrongly relied. See
| https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66110
| rsstack wrote:
| TIL: https://stackoverflow.com/questions/98650/what-is-the-strict...
| rurban wrote:
| __restrict would have also helped.
|
| in the case of memset, it's questionable if that library call is
| actually faster for count < 128
| bonzini wrote:
| Recent Intel processors have "fast rep stos" which basically
| allows the compiler to always inline memset (which they
| couldn't do in the aliasing case).
| kevingadd wrote:
| since it's std::memset, the compiler could theoretically inline
| it or have a small-count path that uses vector intrinsics
| saagarjha wrote:
| It could, but for a case without a known count it doesn't
| really make sense to do that. (I found the concern
| to be kind of strange for that reason, honestly.)
| astrange wrote:
| A memset tail call would be less code size than the loop,
| so it's still worth it even for cold code.
| SuchAnonMuchWow wrote:
| I believe restrict is a C only keyword, so no, it couldn't have
| helped.
| adzm wrote:
| While not standardized, restrict or __restrict are indeed
| supported by most C++ compilers. Visual C++ even supports a
| __declspec(restrict) that can apply to the function rather than
| just individual arguments.
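| A sketch using the `__restrict` spelling, which is a compiler
| extension and not standard C++. zero_prefix_restrict is a
| hypothetical name. The qualifier is a promise from the caller, so
| passing pointers that do alias would be undefined behavior:

```cpp
#include <cassert>

// `__restrict` tells the compiler that, inside this function, the bytes
// reached through `data` are not also reached through `count`.
void zero_prefix_restrict(char* __restrict data, const int* __restrict count) {
    assert(data != nullptr && count != nullptr);
    for (int i = 0; i < *count; ++i) {
        data[i] = 0;
    }
}
```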
___________________________________________________________________
(page generated 2022-05-11 23:01 UTC)