[HN Gopher] Mysterious Memset
       ___________________________________________________________________
        
       Mysterious Memset
        
       Author : ibobev
       Score  : 96 points
       Date   : 2022-05-11 10:43 UTC (12 hours ago)
        
 (HTM) web link (vector-of-bool.github.io)
 (TXT) w3m dump (vector-of-bool.github.io)
        
       | awoimbee wrote:
       | Couldn't the author pass a reference to a std::string instead of
       | switching to an std::u8string, is that a C++ limitation ? And I
       | really don't understand why the compiler doesn't assume strict
       | aliasing in the first example.
        
       | einpoklum wrote:
       | I have one word for us all: restrict ...
       | 
       | https://en.cppreference.com/w/c/language/restrict
       | 
       | Still missing from C++ after all these years. Get it in there,
       | people!
       | 
       | See also:
       | 
       | https://stackoverflow.com/q/776283/1593077
        
       | fefe23 wrote:
       | I don't get the reference to undefined behavior.
       | 
       | This is an aliasing issue, which is pretty well defined.
       | 
       | Am I missing something?
       | 
       | Other than that: Interesting insight!
        
         | sharikous wrote:
         | I think the idea is that if the user had aliased *count and the
         | string they would have got undefined behavior.
         | 
         | So the compiler can assume they didn't and is allowed to
         | optimize away the loop with memset.
        
           | andi999 wrote:
           | Or do more nefarious stuff.
        
         | eklitzke wrote:
         | Aliasing in the first example is well defined, so no UB in the
         | first example. Aliasing in the second example would be
         | undefined in the sense that the strict aliasing rules forbids
         | the two pointer types to be aliased in the first place.
        
         | masklinn wrote:
         | > I don't get the reference to undefined behavior.
         | 
         | > [...]
         | 
         | > Am I missing something?
         | 
         | `int*` and `char8_t*` aliasing is UB under strict aliasing.
         | 
         | You can call it "an aliasing issue, which is pretty well
         | defined". As far as the compiler is concerned it's no different
         | than every other UB it uses for constraints inference.
        
       | hermitdev wrote:
       | To the author: the initial for loop is not equivalent to a
       | do/while loop. The condition of the for loop is executed before
       | the loop runs. e.g. the loop body will never run if *count == 0.
        
         | codetrotter wrote:
         | There's an if statement around the do while loop on that page,
         | so that the do while loop also will not run if *count == 0.
         | 
         | The code snippets                   for (int i = 0; i < *count;
         | ++i) {             str[i] = 0;         }
         | 
         | and                 if (*count > 0) {         long idx = 0;
         | do {           str->__data[idx] = 0;           idx += 1;
         | } while (idx <= *count);       }
         | 
         | Will produce the same changes to the string data given the same
         | count.
        
       | andi999 wrote:
       | Nice. Simplest fix would be to introduce a local variable int
       | c=*count before the loop. (which you also would do with a non
       | optimizing compiler so the count could be in a register.
        
         | taneq wrote:
         | I've been coding commercially in C and C++ (not full time but
         | pretty regularly) for over 20 years and I'm starting to think
         | the simplest fix is to not use C or C++.
        
         | ncmncm wrote:
         | The point, which they probably should have spelled out better,
         | was that std::string contains a char* that aliases everything
         | there might be a pointer to. It is often hard to prevent
         | pointers being created to things; calling a function that takes
         | a "T const&" argument, the compiler is typically obliged to
         | assume that a pointer to the argument was taken and retained.
         | 
         | So, the "*count" in the example is a stand-in for a zillion
         | other things you would actually have written, and that would
         | also be affected by aliasing assumptions.
        
       | LAC-Tech wrote:
       | Interesting, yet terrifying, as is all C++ wizardry to me.
        
         | spyremeown wrote:
         | I had to write business logic in C++20 for an embedded Linux
         | box and o-m-g. It sucked. So much. The developer experience was
         | horrendous. All the PRs dragged along because of micro-
         | optimizations every step of the way. "Let me see the generated
         | assembly in Godbolt" why, why?
         | 
         | I know this is probably not all due to the language, but at
         | least one bit of it has to be. It's really cool if you're
         | writing bare-metal/RTOS level embedded stuff and you're
         | worrying about how many assigments can you put into a loop
         | round to optimize the cache lines, but I don't understand why
         | anyone would ever try to talk to the web using C++.
        
           | AshamedCaptain wrote:
           | > "Let me see the generated assembly in Godbolt" why, why?
           | 
           | This is a embedded thing, not a C or C++ thing.
           | 
           | The other day someone was saying here on HN that writing
           | Verilog feels like following a process where "you already
           | more or less know which circuit do you want, you are just
           | trying to figure which is the specific Verilog code that will
           | get your synthesizer to generate that circuit".
           | 
           | On embedded platforms (or generally anywhere where you count
           | memory usage in units of KB or less), that's exactly what
           | many people do. They already know more or less the assembly
           | code, they are just looking for the right higher-level
           | program that will translate to that assembly. That's one of
           | the reasons they get angry when the language tries to be too
           | smart.
           | 
           | The reason you just don't code in assembly directly is
           | because it's still a pain and your chances of mistake
           | increase (e.g. doing complex arithmetic expressions).
        
             | spyremeown wrote:
             | Yea, I know all this jazz, I was, once upon a time, the
             | dude unrolling loops in asm to get digital filters faster
             | on PIC32s.
             | 
             | It still a Linux box, with plenty of memory to spare. This
             | IS a C++ problem, where people are driven to do these
             | insane optimizations.
        
             | vanderZwan wrote:
             | Does that also mean that embedded programming involves
             | being conservative about compiler updates? Because
             | otherwise those choices might become completely invalid one
             | upgrade later
        
               | electroly wrote:
               | It's not unreasonable to stick with the same compiler
               | version for the entire life of a product. I was involved
               | with a hardware project where we attempted to upgrade the
               | compiler mid-lifecycle; it caused a subtle malfunction
               | (almost certainly due to a latent bug or undocumented
               | compiler behavior dependency in our code, but we couldn't
               | find it) and we simply decided never to upgrade the
               | compiler for the life of the product. It wasn't
               | considered a big deal to do so.
        
               | userbinator wrote:
               | The basic principle is that known unknowns are better
               | than unknown unknowns.
        
               | AshamedCaptain wrote:
               | It's worse than that; many times the principle is: known
               | evils are better than unknown angels.
        
               | ncmncm wrote:
               | Which are in turn better than unknown knowns, or things
               | you were _just sure_ were true, but just ain 't.
               | 
               | Had you not been "sure", you might have done something
               | that would work right.
        
               | adrianN wrote:
               | For safety critical software for example upgrading the
               | compiler means you need to recertify the product. That's
               | very expensive.
        
           | jcelerier wrote:
           | > "Let me see the generated assembly in Godbolt" why, why
           | 
           | What else do you propose ?
        
           | lpapez wrote:
           | > All the PRs dragged along because of micro-optimizations
           | every step of the way. "Let me see the generated assembly in
           | Godbolt" why, why?
           | 
           | I think a large part of it is due to the language. C++ code
           | can be simultaneously both very low-level and highly-
           | abstracted and then you will get reviewers complaining about
           | needlesly copying 48 bytes while making a network request...
        
             | scoutt wrote:
             | If might depend if your program does 1 request every now
             | and then, or if your program is doing several thousands
             | network requests per second (then I might complain too).
        
               | lpapez wrote:
               | Of course it depends on what you are building, but my
               | point is that the language gives you access to the low-
               | level facilities which nudge people to worry/think about
               | them even when they are irrelevant and unimportant.
               | Because details like copy elision are usually an obvious
               | point which can be improved upon, and people generally
               | have a need to participate and contribute, small things
               | will be mentioned on review and delay the feature even
               | when it makes no difference. It's not the fault of the
               | language itself but rather the culture around it and the
               | easy and obvious answer to this "just dont use C++ if you
               | dont need it" stops being easy and obvious when you try
               | to actually interop with other languages. </rant>
        
           | saagarjha wrote:
           | Better that than having to convince someone your code is fine
           | and doesn't need to have variables pulled out of the loop for
           | "optimization". Compilers are good at code motion and many
           | engineers will microptimize things that end up hurting
           | readability for now benefit unless you show them Godbolt.
        
           | gumby wrote:
           | > All the PRs dragged along because of micro-optimizations
           | every step of the way. "Let me see the generated assembly in
           | Godbolt" why, why?
           | 
           | > I know this is probably not all due to the language, but at
           | least one bit of it has to be
           | 
           | No, there's a cultural problem. C++ gives you the power and
           | flexibility to really optimize for space or performance but
           | rarely is that worth it early in development. Instead, just
           | start by simply writing the code. Good design will give you
           | affordances for appropriate optimization later.
           | 
           | And if your "embedded" system is so massive it can run Linux
           | then you could end up with better code density by using
           | Python source and including the interpreter!
        
           | ncmncm wrote:
           | It would be the same in C, of course. Except worse.
        
       | fguerraz wrote:
       | Not really mysterious or a surprise. Passing count as a reference
       | was a huge red flag from the very beginning.
        
         | bonzini wrote:
         | Replace *count with this->count and you can see why this can
         | cause pessimization.
        
           | fguerraz wrote:
           | What would that change? If this->count was a pointer you'd
           | still have to de-reference it explicitly. Or am I missing the
           | point of your argument?
        
             | bonzini wrote:
             | I agree that in this toy example passing a count argument
             | by reference makes little sense, but you'd get the same
             | thing if you accessed the count field field of
             | struct s {             int count;             char *chars;
             | };
        
         | ReactiveJelly wrote:
         | It surprised me, cause I forgot that aliasing is a problem for
         | C. Guess I'm one of today's 10,000.
        
           | jhgb wrote:
           | I'm sort-of, kind-of aware of the issue but what surprised me
           | was how std::string pulls this into picture here. Is this
           | because of what str[i] gets compiled into when str is a
           | basic_string of chars? That seems non-obvious to me because
           | I'm just accustomed to dealing with C++ strings as fairly
           | opaque entities. If they somehow expose to the compiler that
           | you're really manipulating characters via pointers, I can
           | understand how this complicates things.
        
         | Diggsey wrote:
         | It would be a surprise to anyone not very familiar with C++'s
         | strict-aliasing rule, which I imagine is quite a lot of people
         | (and even many C++ programmers...)
         | 
         | Very few other languages have a similar rule (eg. Rust
         | explicitly does not have this rule)
        
           | kevincox wrote:
           | Although Rust doesn't have this rule it isn't nearly as
           | relevant due to the way immutable (shared) references work.
           | While `a: &Foo` and `b: &Bar` may alias it doesn't matter
           | much because Rust knows that nothing can write to them so it
           | can do basically all of the optimization it needs. Rust also
           | knows that `c: &mut T` doesn't alias with anything so in this
           | case it can make loads of assumptions.
           | 
           | One chink in this rule is interior mutability. This is why it
           | is better to think of `&T` as a shared reference than an
           | immutable reference. For example in the following code Rust
           | can't assume that `a.get() == 5`. (Even if the second
           | argument is changed to i8)                   pub fn test(a:
           | &std::cell::Cell<u8>, b: &std::cell::Cell<u8>) -> u8 {
           | a.set(5);             b.set(7);             a.get()         }
           | 
           | https://rust.godbolt.org/z/e1c6Kx4eb
           | 
           | Commenting out the write to b does allow Rust to hardcode the
           | return value as 5.
        
           | ahefner wrote:
           | The surprising thing to me is that 'char8_t' (a new C++ 20
           | thing I'd never heard of) is not just a typedef to char with
           | all the same aliasing implications, but a new and distinct
           | type to which the magic 'char' alias rules don't apply (also,
           | unsigned).
        
             | Sharlin wrote:
             | It basically has to be distinct from `char` because you
             | can't use portably use `char` to hold a UTF-8 code unit
             | (because the guaranteed valid range is only 0x00 to 0x7F)
             | Also, this way you can overload based on legacy char vs
             | UTF-8 char, and have `std::basic_string<char>` and
             | `std::basic_string<char8_t>` (aka `std::string` and
             | std::u8string`) be distinct types as well. So finally in
             | C++20 we actually have a portable UTF-8 string type!
        
               | planede wrote:
               | > you can't use portably use `char` to hold a UTF-8 code
               | unit
               | 
               | That's not true, in C++ a byte is guaranteed to be able
               | to hold a UTF-8 code unit.
               | 
               | https://timsong-
               | cpp.github.io/cppwp/n4868/intro.memory#1.sen...
        
               | Sharlin wrote:
               | Yes, but if `char` is signed, as it usually is, its bit
               | patterns correspond to values -0x80 to 0x7F. So yeah, you
               | can no-cost encode the >=0x80 code units as their two's
               | complement counterparts but it feels suspicious. At least
               | to me, after writing some Rust lately which very much
               | does not do implicit signed-unsigned conversions. Much
               | better for char to always represent the "basic character
               | set" (ie. usually ASCII) and have a distinct type for
               | UTF-8.
        
               | ncmncm wrote:
               | Anyway, a _UTF-8 sequence transport_ type. Few of
               | u8string member operations make sense for UTF-8 as such.
               | Usually that doesn 't matter. Sometimes it matters a
               | great deal, and we will need a whole new API for that.
        
               | Sharlin wrote:
               | Yeah, good point.
        
             | ncmncm wrote:
             | And also distinct from std::byte, which they are hoping
             | will pick up aliasing properties of char, allowing use and
             | also abuse of char* to be someday eliminated from new code.
             | std::byte does alias everything, but no operations are
             | defined on it except copying (and, weirdly, bitwise
             | operators).
        
             | MauranKilom wrote:
             | Also, char is a distinct type from both signed char and
             | unsigned char (even though it has the same size as both and
             | the same signedness as one of them).
        
               | jhgb wrote:
               | Is signed char and unsigned char subject to the same "can
               | alias anything" rule? I never bothered to think about
               | this in the past. Now I'm not sure. I've known that the
               | three types are distinct, but that just means that that
               | the rule doesn't _have_ to apply to all of them, not that
               | it doesn 't.
        
               | saagarjha wrote:
               | Yes, the rule applies to all three character types.
        
               | jhgb wrote:
               | Thanks. This seems mildly ungoogleable, or at least I
               | haven't been able to find a good search result for this.
               | So I'll keep in mind that these are three distinct types
               | with the same aliasing behavior.
        
               | saagarjha wrote:
               | Actually, I take that back and should clarify: it's all
               | three only in C. In C++ it is just char and unsigned
               | char. (If you want to search for this, "signed char
               | aliasing" gave me good results.)
        
             | matheusmoreira wrote:
             | I agree, it's surprising. Types like uint8_t* are widely
             | used to reinterpret other structures as byte arrays which
             | implies they are universal aliases just like char*. Not
             | sure what makes char8_t different.
        
               | wahern wrote:
               | > Types like uint8_t* are widely used to reinterpret
               | other structures as byte arrays which implies they are
               | universal aliases just like char*
               | 
               | Alternatively, it implies that there's alot of broken
               | code out there. So much broken code that they've
               | accidentally found safety in numbers, and compilers are
               | unlikely to change a coincidental behavior upon which
               | they wrongly relied. See
               | https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66110
        
           | rsstack wrote:
           | TIL: https://stackoverflow.com/questions/98650/what-is-the-
           | strict...
        
       | rurban wrote:
       | __restrict would have also helped.
       | 
       | in the case of memset, it's questionable if that library call is
       | actually faster for count < 128
        
         | bonzini wrote:
         | Recent Intel processors have "fast rep stos" which basically
         | allows the compiler to always inline memset (which they
         | couldn't do in the aliasing case).
        
         | kevingadd wrote:
         | since it's std::memset, the compiler could theoretically inline
         | it or have a small-count path that uses vector intrinsics
        
           | saagarjha wrote:
           | It could, but for a case without a known count it doesn't
           | really make sense to do that. (I honestly found the concern
           | to be kind of strange for that reason, honestly.)
        
             | astrange wrote:
             | A memset tail call would be less code size than the loop,
             | so it's still worth it even for cold code.
        
         | SuchAnonMuchWow wrote:
         | I believe restrict is a C only keyword, so no, it couldn't have
         | helped.
        
           | adzm wrote:
           | While not standardized, restrict or __restrict are indeed
           | supported by most C++ compilers. Visual C++ even supports a
           | declspec(restrict) that can apply to the function rather than
           | just individual arguments.
        
       ___________________________________________________________________
       (page generated 2022-05-11 23:01 UTC)