hngopher.com

       [HN Gopher] Catch-23: The New C Standard Sets the World on Fire
       ___________________________________________________________________
        
       Catch-23: The New C Standard Sets the World on Fire
        
       Author : donmcc
       Score  : 215 points
       Date   : 2023-04-01 21:01 UTC (1 days ago)
        
 (HTM) web link (queue.acm.org)
 (TXT) w3m dump (queue.acm.org)
        
       | i-use-nixos-btw wrote:
       | This is written with quite a lot of hyperbole.
       | 
       | The predominant focus is realloc(pre,0) becoming UB instead of
       | what the author misleadingly describes as useful, consistent
       | behaviour. It is far from that, and that's the entire reason that
       | it was declared UB in the first place:
       | https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2464.pdf. Note
       | that this wasn't
       | a proposal to change something, it's a defect report: the
       | original wording was never suitable.
       | 
       | The second part is the misconception about the impact of UB.
       | Making something UB does not dictate that its usage will initiate
       | the rise of zombie velociraptors. It grants the implementation
       | the power to decide the best course of action. That is, after
       | all, what they've been doing all this time anyway.
       | 
       | Note that this deviates from implementation-defined behaviour,
       | because an implementation-defined behaviour has to be consistent.
       | Where implementations choose to let realloc(ptr,0) summon the
       | zombie raptors, they are free to do so. Don't like it? Don't
       | target their implementation. Again, this isn't a change from the
       | POV of implementers - it's a defect in the existing wording.
       | 
       | In this case, the course of action that any implementation will
       | choose is to stick with the status quo. It is clearly not a
       | deciding factor in whether or not you embrace the new standard,
       | and to suggest otherwise is dishonest, sensationalist nonsense.
       | The feature was broken, and it's just being named as such.
        
         | Asooka wrote:
         | > The second part is the misconception about the impact of UB.
         | Making something UB does not dictate that its usage will
         | initiate the rise of zombie velociraptors. It grants the
         | implementation the power to decide the best course of action.
         | That is, after all, what they've been doing all this time
         | anyway.
         | 
         | Wrong. UB never happens. That is the promise the program writer
         | makes to the compiler. UB never happens. A correct C program
         | never executes UB. This allows the compiler to assume that
         | anything that is UB never happens. Does some branch of your
         | program unconditionally execute realloc(..., 0) after constant
         | propagation? That branch never happens and can just be deleted.
         | 
         | Reading the defect report, they state "Classifying a call to
         | realloc with a size of 0 as undefined behavior would allow
         | POSIX to define the otherwise undefined behavior however they
         | please." which is wrong. UB cannot be defined, if you define
         | it, you are no longer writing standard C. It should instead
         | have been classified as "implementation-defined behaviour".
         | 
         | In any case it's not that hard to just write a sane wrapper.
         | This one is placed in the Public Domain:
         | 
         |     void *sane_realloc(void *ptr, size_t sz)
         |     {
         |         if (sz == 0) {
         |             free(ptr); /*free(NULL) is no-op*/
         |             return NULL;
         |         }
         |         if (ptr == NULL) {
         |             return malloc(sz);
         |         }
         |         return realloc(ptr, sz);
         |     }
         | 
         | I am calling it sane and not safe, because it is not safe. You
         | still have the confusion of what happens when the function
         | returns NULL (was it allocation failure or did we free the
         | object?) - check errno. However, it has the same fully defined
         | semantics on most all implementations and acts like people
         | would expect.
         | 
         | You may be tempted to make the function return the value of
         | errno, mark it [[nodiscard]] and take a pointer-to-pointer-to-
         | void, so that the value of the pointer will only be changed if
         | the reallocation was successful. I am not sure if that is
         | safer. You are trading one possible bug - null pointer on
         | allocation failure, which then will cause a segmentation fault
         | for another - stale pointer on allocation failure, but with
         | updated size. The latter is more likely to be used in buffer
         | overflow attacks than the former.
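
The pointer-to-pointer variant described in the last paragraph above could look roughly like this. It is only a sketch: the name `try_realloc` and the choice of `ENOMEM` as the failure code are assumptions made for illustration, not part of the comment or of any standard API, and `[[nodiscard]]` is left out so the code also compiles with pre-C23 compilers.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper (name and error convention are assumptions).
 * Resizes *pp to sz bytes. Returns 0 on success, ENOMEM on failure.
 * On failure *pp is left untouched, so the caller still owns the old
 * block and must not update its recorded size. */
int try_realloc(void **pp, size_t sz)
{
    if (sz == 0) {
        free(*pp);                /* free(NULL) is a no-op */
        *pp = NULL;
        return 0;
    }
    void *q = realloc(*pp, sz);   /* realloc(NULL, sz) acts like malloc(sz) */
    if (q == NULL) {
        return ENOMEM;            /* old block is still valid */
    }
    *pp = q;
    return 0;
}
```

The trade-off raised in the comment still applies: this only helps if the caller updates its stored size when, and only when, the return value is 0.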
        
         | Arch-TK wrote:
         | I agree that realloc was poorly defined for the 0 size case, I
         | think UB or IDB both would have worked in this case to really
         | drive that point home, the WG chose UB.
         | 
         | That being said, you're completely wrong about what UB means.
         | Making use of UB may as well initiate the rise of zombie
         | velociraptors. Except for the situation where your
         | implementation explicitly specifies that it provides a
         | predictable behaviour for a specific case of UB, there's
         | literally no guarantee of what will happen. Assuming that the
         | implementation will stick with some status quo and your code
         | won't exhibit absolutely unusual behaviour is just naive.
         | 
         | Please don't mislead people into thinking that it's ever a good
         | idea to assume that undefined behaviour will be handled
         | sensibly; this kind of misguided assumption is one of the major
         | sources of bugs in C code.
        
           | cryptonector wrote:
           | Right, this should have been left to the implementor if they
           | didn't want to standardize one behavior. Making it UB is the
           | worst possible outcome. Yes, people who write portable code
           | will still want to not rely on `realloc()`'s freeing
           | behavior, but if you do and your realloc() implementation
           | doesn't, then you suffer a leak, while if you do and
           | realloc() decides to wipe your drive and make your power
           | supply explode...
        
           | astrange wrote:
           | > Except for the situation where your implementation
           | explicitly specifies that it provides a predictable behaviour
           | for a specific case of UB, there's literally no guarantee of
           | what will happen.
           | 
           | That situation is "when you have UBSan turned on".
        
             | G_z9 wrote:
             | [flagged]
        
               | G_z9 wrote:
               | How this gets downvoted is beyond me
               | 
               | Yeah, I'm being downvoted by a bot or something.
        
               | SAI_Peregrinus wrote:
               | It's not directly related to the topic at hand. It's
               | meta-commentary about the discussion, not about the
               | actual topic. That, and your post I'm replying to, and
               | this reply I'm making should all be downvoted as they're
               | all off-topic.
        
               | Dylan16807 wrote:
               | > How this gets downvoted is beyond me
               | 
               | Primarily because you're bringing in an argument from a
               | different story entirely rather than figuring out a
               | better option.
               | 
               | But also you're being very rude in that other thread, and
               | calling twitter "high bandwidth" for a discussion is...
               | weird.
               | 
               | > Yeah, I'm being downvoted by a bot or something.
               | 
               | Uh huh.
        
               | G_z9 wrote:
               | Ok, that's not unreasonable. But I think that making an
               | unrelated comment like that is really only a bad thing if
               | it's in bad faith. He made that comment expecting a
               | response, I'm not like hounding him. And yes, I said some
               | rude things. Are you going to downvote every comment that
               | you see from me because some other comments I made were
               | rude? Doesn't really add up. And why are you policing the
               | threads? That's more weird than asking for a twitter
               | space.
               | 
               | I don't think asking for a twitter space is all that
               | weird. I am constantly frustrated talking with people on
               | HN because what could take 2 seconds takes 20 minutes. I
               | often find that a debate never even has a chance to be
               | resolved because everyone just gets worn out trying to
               | talk through a digital straw. Plus, asking for a twitter
               | space doesn't involve exchange of personal information or
               | anything concerning. It's definitely not done and off the
               | wall but I don't think it's problematic.
               | 
               | Edit: the more I think about it, the more sense it makes.
               | HN has a problem with being flooded with vitriol and
               | lots of other negative behavior, long chains that are
               | just useless. It would make a lot of sense to offload
               | most of that to another platform since HN as a platform
               | is not well suited to debating. Instead of initiating a
               | huge chain of vitriol, a twitter space could be initiated
               | when people want to debate something. Instead of tons of
               | noise and garbage, HN would host a link to the space. And
               | it would be better because the nature of a space lends
               | itself to people coming to a conclusion, covering the
               | issue more thoroughly and people letting loose less hate,
               | all because of the high bandwidth, intimate nature of
               | real-time audio. It also helps filter out people who are
               | bots or aren't serious or who don't really care about the
               | topic being debated. I am legitimately going to email HN
               | admin about this.
        
               | astrange wrote:
               | Hey, I think getting into arguments for a day then
               | randomly giving up and wandering off is what the site's
               | all about. Actually, I think the guy who stops replying
               | first kind of wins - it's similar to why you shouldn't
               | double-text when dating.
               | 
               | > I don't think asking for a twitter space is all that
               | weird.
               | 
               | My issue is that I don't think I have anything to
               | contribute as I'm not making original conclusions but
               | kind of just quoting a typical labor economist.
               | (Different from quoting the average person, they're
               | usually worried about different things.)
               | 
               | Example being
               | https://www.apricitas.io/p/chatgpt-please-take-my-job.
        
               | Dylan16807 wrote:
               | I didn't downvote you, by the way.
               | 
               | But sure posting in the wrong topic will get a downvote
               | to turn your comment gray. What is wrong with that
               | policing?
               | 
               | > Are you going to downvote every comment that you see
               | from me because some other comments I made were rude?
               | Doesn't really add up.
               | 
               | ...what? You continued the thread here. Nobody is
               | downvoting random comments of yours. Your comments on
               | that story and this story are part of the same
               | conversation.
               | 
               | And if you can't figure out how to reply in a deep thread
               | you can just wait a couple minutes for the link to be
               | there.
               | 
               | > a twitter space
               | 
               | Oh, the chat thing. I thought you meant _tweets_. Sure,
               | that's a reasonable idea for some conversations.
        
               | G_z9 wrote:
               | Yes I've figured out the timer. As a 2010 account, I kind
               | of have to just yield to you.
        
           | coliveira wrote:
           | > this kind of misguided assumption is one of the major sources
           | of bugs in C code.
           | 
           | This is not even close to being true. Most bugs in C code are
           | from programmer mistakes, not from UB behavior. The
           | exaggeration that is spread by some people regarding UB is
           | close to absurd. If something is UB, it may generate
           | different results in different situations, even with the same
           | compiler. The standard is just clarifying this problem. A
           | good compiler will do something sensible, or at least issue a
           | warning when this situation is detected. If you have a bad
           | compiler that does strange things with your code, it's not a
           | defect of UB but the compiler instead.
        
             | wruza wrote:
             | Optimizing compilers don't work like that. They can either
             | deviate from the standard and leave it as defined behavior,
             | or mark it UB and go with it as usual.
             | 
             | To get some insight by analogy, consider this set of
             | constraints (unrelated to C):
             | 
             |     x <= 7
             |     2x >= 5
             |     ...(more with x, y, z but not more constraining x)...
             | 
             | When you feed this to a linear constraint solver, you may
             | get anything from 2.5 to 7 as x. E.g. 3.1415926. Not
             | because a solver wanted to draw some circles, but because
             | it transformed your geometric problem into an abstract
             | representation for its own algorithm, performed some fast
             | operations over it and returned the result. Nobody knows
             | how exactly a specific solving method will behave wrt
             | (underconstrained) x given that the description above is
             | all you have.
             | 
             | When you feed UB into an optimizer, you feed a bit of lava
             | into a plastic pipe, figuratively. You'll get anything from
             | program #2500...0000 to program #6999...9999, where "..."
             | is few more thousands/millions of digits. Run some numbers
             | from there as an .exe to see if something absurd happens.
             | 
             | The nature of UB and optimizers is that you either relax
             | UBs into DBs and get worse efficiency, or you specify more
             | UBs and get worse programming safety. What happens in
             | between can be perceived as completely random. And the
             | better/faster the optimizer is, the more random the outcome
             | will likely be.
             | 
             |  _The exaggeration that is spread by some people regarding
             | UB is close to absurd_
             | 
             | UB-in-code is absurd by definition, no exaggeration here.
        
             | Arch-TK wrote:
             | > Most bugs in C code are from programmer mistakes
             | 
             | These most often lead to the triggering of UB. The reason
             | why programmer mistakes lead to confusing bugs instead of
             | simple and straightforward bugs which are easy to catch in
             | the development process is mainly because UB imposes no
             | restrictions on what the compiler should do. In the vast
             | majority of UB cases the compilers simply don't do
             | anything, and assume it can't happen. This is why
             | dereferencing a pointer and then checking if it's null ends
             | up eliding the null check (because if you've dereferenced
             | it, it can't be null, that would be UB). Accessing past the
             | end of an array is UB so it can't happen, therefore your
             | compiler won't check for it. Accessing past the end of an
             | array and accidentally reading from/writing to another
             | variable - likewise.
             | 
             | UB encompasses ALL behavior for which the standard does not
             | provide an explicit definition. The reason why the C
             | standard provides explicit instances of UB usually boils
             | down to clarifying situations where people were confused
             | about whether something was UB or not. But if the behaviour
             | is not defined in the standard, then it is by definition
             | UB.
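
The null-check elision mentioned above is easiest to see next to the ordering that avoids it. The sketch below is illustrative only; `value_or` is a hypothetical function, and the problematic ordering appears solely as a comment because executing it with a null pointer would be undefined.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example function.
 *
 * Problematic ordering (not compiled, shown for contrast):
 *
 *     int v = *p;                      // dereference first
 *     if (p == NULL) return fallback;  // optimizer may treat this as dead,
 *                                      // since *p already implied p != NULL
 *
 * Well-defined ordering: test the pointer before any dereference. */
int value_or(const int *p, int fallback)
{
    if (p == NULL) {
        return fallback;
    }
    return *p;
}
```

With the check placed first, there is no execution in which a null pointer is dereferenced, so the compiler has nothing to assume away.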
        
             | SCLeo wrote:
             | If I am not wrong, one major security bug that C programs
             | usually face is buffer overflow, which is an undefined
             | behavior.
        
         | omoikane wrote:
         | > This is written with quite a lot of hyperbole
         | 
         | The first sight of "catch fire" might not have caught my
         | attention, but by the time it got to "instrument of arson" and
         | "Molotov cocktails", the style was sufficiently distracting
         | that I was convinced I wasn't the intended audience.
        
         | c4mpute wrote:
         | > The second part is the misconception about the impact of UB.
         | [...] It grants the implementation the power to decide the best
         | course of action. That is, after all, what they've been doing
         | all this time anyway.
         | 
         | Wrong, Wrong, Wrong.
         | 
         | UB allows the implementation to take any arbitrary course of
         | action, without informing anyone, without documentation,
         | without any conscious decision, without weighing anything to be
         | better/worse. Nondeterministically catching fire and launching
         | nuclear rockets is a completely compliant reaction to UB.
         | 
         | What you are describing is "implementation defined" behavior.
         | That has to be deterministic, documented, and conforming to
         | some definition of sanity. Examples are the binary
         | representation of NULL, sizes of integer types or stuff like
         | the maximum filename length. Sadly, too many things in C have
         | "undefined behavior", too few have "implementation defined"
         | behavior.
         | 
         | And UB has always been an excuse for compilers to screw over
         | programmers in hideous ways. Programmers are rightfully afraid
         | of any kind of new UB being introduced, because it will mean
         | that whole new classes of bugs will arise because the compiler
         | optimized out that realloc(..., a) where a might be 0, because
         | thats UB, so screw you and your code... And this change is
         | especially dangerous because it makes a lot of existing code
         | UB.
        
           | jcranmer wrote:
           | The case of realloc being declared UB (as opposed to impl-
           | defined) was not driven by the compiler writers but by the
           | people who write the C libraries.
           | 
           | This isn't a case of compilers screwing over the programmers,
           | because the people who are responsible for those
           | optimizations are the people who are scratching their heads
           | as to why it's UB and not impl-defined behavior.
        
           | AlotOfReading wrote:
           | I wish UB were only as nasty as "nondeterministic behavior".
           | In fact, if there's UB in _anything_ the compiler sees,
           | nothing at all can be assumed, including whether you even get
           | an output. What you've given the compiler isn't C, so it
           | doesn't have any obligations to do anything with it. The
           | codepath with UB doesn't have to run for the nuclear rockets
           | to launch and the nasal demons to appear.
           | 
           | Since approximately every nontrivial program ever written has
           | UB, in actual practice we're only saved by the fact that
           | compilers aren't entirely maliciously compliant.
        
             | Dylan16807 wrote:
             | That's not true. If the program's execution path from start
             | to finish avoids UB then you're safe. (Also the source code
             | itself has to avoid UB, but that part isn't hard.)
             | 
             | It's true that code with UB does not have to be reached,
             | per se, but it does have to be something your program
             | _will_ reach before it can hurt you.
        
               | AlotOfReading wrote:
               | You're correct in practical terms, but I'm making a very
               | pedantic point about what the standard requires happen,
               | mainly because this pedantry has important implications
               | for e.g. safety critical C. Note 1 to the definition in
               | 3.4.3 provides some clarification about the extent of UB
               | and states that UB can manifest at translation time. It
               | also says that the translator should behave in a
               | documented manner when encountering UB, but does not
               | _require_ that it do so.
        
               | AnimalMuppet wrote:
               | Fine. HN is, after all, a place where you can be
               | pedantic.
               | 
               | But those of us who are actually writing programs mostly
               | care about "in practical terms", and in practical terms,
               | this doesn't happen, so _we don't care_. We've got
               | enough trouble worrying about what _does_ happen; we
               | don't have time and energy to worry about what _doesn't_
               | and _won't_ happen.
        
               | still_grokking wrote:
               | That's like saying: "I don't care what the standard
               | says!"
               | 
               | Sure, this is perfectly fine.
               | 
               | Only that you're not writing any C/C++ then, but
               | something in the "gcc 12 language with some switches", or
               | maybe the "LLVM 15 language with some switches", or
               | something like that.
        
               | AnimalMuppet wrote:
               | Well, if Visual Studio (or whatever Microsoft calls their
               | compiler these days), and all known versions of gcc, and
               | all known versions of LLVM all do something sane, then
               | I'm not sure I care all that much about the theoretical
               | possibility that some compiler someday might do something
               | insane.
        
               | AlotOfReading wrote:
               | To provide some more context/motivation for why you might
               | care, I write safety-critical code. I'm often advising
               | people what they need to do for certification, etc. If
               | all you need to do is ensure that you never execute
               | undefined operations and knock out the list of specified
               | UB, that's totally, 100% manageable. Throw some
               | sanitizers on, provide realistic input, and test the hell
               | out of it. Normal stuff.
               | 
               | If the reality is that any UB can invalidate the entire
               | program (as is the interpretation taken by other
               | standards re: C), then that's not remotely sufficient.
               | You have to ensure the complete absence of UB.
        
               | LegionMammal978 wrote:
               | C has both translation-time UB and runtime UB. (C++
               | explicitly separates the two concepts into "ill-formed,
               | no diagnostic required" and "undefined behavior".) You
               | can tell them apart from the condition for UB to occur:
               | if it's a translation-time condition, then it's
               | translation-time UB, and if it's a runtime condition,
               | then it's runtime UB. (Same with implicit UB: is it a
               | translation-time or a runtime assumption being violated?)
               | 
               | Usually when we talk about UB, we're implicitly talking
               | about runtime UB, since translation-time UB is generally
               | far less subtle. If a program contains only conditional
               | runtime UB, the compiler is not permitted to break the
               | entire program from the very beginning, since all
               | possible executions that do not trigger runtime UB must
               | execute correctly as per 5.1.2.3.
        
               | AlotOfReading wrote:
               | 5.1.2.3 only binds _conforming_ programs. Programs
               | containing UB are by definition non-conforming.
               | 
               | I hadn't considered the C++ standard here, but 1.9 is
               | much more clear than corresponding C verbiage. 1.9.5 is
               | exactly what's described upthread, where any "execution
               | [that] contains an undefined operation" has no prescribed
               | behavior. But the note to the requirement immediately
               | before that (1.9.4) doesn't use that language and instead
               | "imposes no requirements on programs that contain UB". If
               | they had intended only to avoid specifying semantics for
               | programs that hit UB during some possible execution, they
               | would have used the same language as 1.9.5.
        
               | Kranar wrote:
               | Your claim is actually false. C differentiates between a
               | conforming program and a strictly conforming program.
               | 5.1.2.3 binds conforming programs, which are permitted
               | to produce output dependent on undefined behavior.
               | 
               | Only strictly conforming programs may not produce output
               | dependent on undefined behavior.
        
               | AlotOfReading wrote:
               | No? Conformance allows unspecified and implementation
               | defined. Strict conformance is the absence of that (i.e.
               | same output in every conforming environment). Neither
               | includes UB, as UB is "outside the standard" in some
               | sense and doesn't have defined semantics.
        
               | Kranar wrote:
               | It's a common misconception that a conforming program may
               | not engender undefined behavior. In fact this very
               | article touches on how realloc has introduced new (and
               | backwards incompatible) undefined behavior precisely to
               | accommodate the POSIX standard (so that POSIX compliant
               | implementations of C can redefine the otherwise undefined
               | behavior however they please).
        
               | AlotOfReading wrote:
               | Can you cite that? It runs against a plain reading of the
               | standards (both C and C++) and would be insane for the
               | standard to allow "correct" programs to include those
               | with undefined behavior. There was even an unadopted
               | proposal (n853 [1]) attempting to clarify this.
               | 
               | While I was making sure I wasn't missing something
               | obvious, I took a look through the rest of the WG14
               | proposals to see if I was somehow off in my understanding
               | regarding translators being allowed to barf over UB
               | anywhere in the program. There was a proposal clarifying
               | the situation to the possible-execution understanding
               | from upthread submitted by Victor Yodaiken (n2278 [2]),
               | but unfortunately it was also never adopted.
               | 
               | [1] https://www.open-std.org/jtc1/sc22/wg14/www/docs/n853.htm
               | 
               | [2] https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2278.pdf
        
               | [deleted]
        
             | coliveira wrote:
             | > approximately every nontrivial program ever written has
             | UB
             | 
               | You can replace "UB" with "bugs" and the result is the same.
             | UB is a bug on the part of the programmer, from the point
             | of view of C, similar to dereferencing a null pointer. When
             | the standard says that something is UB, it is just
             | clarifying what these situations are.
        
               | pmarin wrote:
               | If they are bugs they should be reported to the user and
               | end the compilation with an error.
        
               | the_why_of_y wrote:
               | Compilers actually have some options to enable that.
               | 
               | The problem is, it only works well in the simplest cases
               | when the code will 100% exhibit UB within a single
               | function.
               | 
               | In most cases, the UB would only manifest on particular
               | input values - if you want your compiler to warn about
               | that then it will report one "potential UB" for every 10
               | lines of C code, and nobody wants to use such a compiler.
        
               | mjevans wrote:
               | That's exactly why a compiler shouldn't be able to
               | 'optimize' in the face of UB, it should be an ERROR and
               | the section of undefined behavior highlighted in the
               | error message.
        
               | gpderetta wrote:
               | We rehash this argument every few weeks. Please search
               | the comment history why it is nonsensical.
        
               | circuit10 wrote:
               | Doing that at compile time would require being able to
               | perfectly predict everything the program can do, which is
               | equivalent to solving the halting problem (make the
               | program do something undefined after it finishes, then if
               | you get an error at compile time then it halts) and is
               | mathematically impossible. Doing it at runtime would have
               | a massive performance impact
        
               | chongli wrote:
               | This would mean you'd have to insert a check every time
               | you add two signed integers together, because signed
               | overflow is UB. You'd also have to wrap every memory
               | access with bounds checks, because OOB memory access is
               | UB.
               | 
               | There are also tons and tons of loop optimizations
               | compilers do for side-effect free loops which would have
               | to be removed completely. This is because infinite loops
               | without side effects are UB. So if you wanted these
               | optimizations you'd have to prove to the compiler -- at
               | compile time -- that your loop is guaranteed to terminate
               | since it is not allowed to assume that it will. Without
               | these loop optimizations, numerical C code (such as
               | numpy) would be back in the stone ages of performance.
               | 
               | _Edit_: I just wanted to point out that one of the new
               | features in C23 is a standard library header called
               | _<stdckdint.h>_ that includes functions for checked integer
               | arithmetic. This allows you to safely write code for
               | adding, subtracting, and multiplying two unknown signed
               | integers and getting an error code which indicates
               | success or failure. This will be the standard preferred
               | way of doing overflow-safe math.
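
A minimal sketch of how the checked-arithmetic macros mentioned above might be used. `ckd_add` stores the (possibly wrapped) result through its first argument and returns `true` when the mathematical result did not fit. The helper name `checked_sum` is hypothetical, and the fallback to `__builtin_add_overflow` is an assumption for GCC/Clang toolchains that do not yet ship `<stdckdint.h>`; it is not part of C23.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Use the C23 header when the toolchain ships it; otherwise fall back to the
 * GCC/Clang builtin (an assumption about the compiler, not part of C23). */
#if defined(__has_include)
#  if __has_include(<stdckdint.h>)
#    include <stdckdint.h>
#  endif
#endif
#ifndef ckd_add
#  define ckd_add(r, a, b) __builtin_add_overflow((a), (b), (r))
#endif

/* Hypothetical helper: sums n ints into *out.
 * Returns true on success, false if any partial sum overflowed. */
bool checked_sum(const int *values, size_t n, int *out)
{
    assert(out != NULL);
    int total = 0;
    for (size_t i = 0; i < n; i++) {
        /* ckd_add returns true when the result does NOT fit in total. */
        if (ckd_add(&total, total, values[i]))
            return false;
    }
    *out = total;
    return true;
}
```

Summing `{1, 2, 3}` succeeds with 6, while summing `{INT_MAX, 1}` reports failure instead of invoking undefined behaviour.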
        
               | heywhatupboys wrote:
               | > because signed overflow is UB
               | 
               | no longer
        
               | mafuy wrote:
               | > you'd have to insert a check every time you add two
               | signed integers together,
               | 
               | This is exactly what is done in serious code. It is
               | typically combined with contracts and static analysis
               | (often human), e.g. "it is guaranteed that this input is
               | in range 10-20, so adding it with this other 16 bit int
               | can be assumed to be below sint32_max".
        
               | pclmulqdq wrote:
               | Great, those checks can stay in "serious" code, and those
               | of us who don't want them can take the UB. C++ 20
               | actually ended up specifying that all ints are two's
               | complement, removing this from the category of "UB," but
               | a lot more weird stuff is programmed in C.
        
               | gpderetta wrote:
               | Note that signed overflow is still UB in C++ even with
               | two's complement being guaranteed for signed types.
        
               | DangitBobby wrote:
               | Another option would be to define behaviors for integer
               | overflow and out of bounds memory access. Presumably they
               | happen fairly often and it might be a good idea to nail
               | down what should happen in those cases.
        
               | gpderetta wrote:
               | Good luck defining the behaviour of use after free or of
               | accessing out-of-bounds stack memory without bounds
               | checking and GC.
        
               | bluecalm wrote:
               | UB is a better option though. When your signed integer
               | overflows it's a bug nevertheless. Why force the compiler
               | to generate code for a pointless case instead of letting
               | it optimize the intended one?
               | 
               | If you value never having bugs over performance then just
               | insert a check or run your program with a sanitizer that
               | does that for you. It's a solved problem for a case where
               | performance doesn't matter. The thing is that it does.
        
               | skitter wrote:
               | That would be great if it was possible, but how do you
               | specify & implement sensible behavior for this:
               | void foo(int *a, int b) { a[b] = 1; }
               | 
               | At runtime there is no information about whether that
               | write is in bounds and no way to prevent this from
               | corrupting arbitrary data unless you compile for
               | something like CHERI.
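
To make the missing information concrete, here is a rough sketch of what the callee would need in order to check the write itself: the length has to travel with the pointer. The `int_slice` type and `slice_set` function are hypothetical names for illustration; nothing like them exists in the C standard library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical "fat pointer": the base address travels with its length. */
struct int_slice {
    int   *data;
    size_t len;
};

/* Bounds-checked store: returns false instead of writing out of range. */
bool slice_set(struct int_slice s, size_t index, int value)
{
    /* A non-empty slice must point at real storage. */
    assert(s.data != NULL || s.len == 0);
    if (index >= s.len)
        return false;
    s.data[index] = value;
    return true;
}
```

The cost is an extra word per pointer and a compare per access, which is exactly the trade-off being argued about in this subthread.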
        
               | chongli wrote:
               | Those things aren't up to the language, they're up to
               | hardware. C is a portable language that runs on many
               | different platforms. Some platforms might have protected
               | memory and trap on out of bounds memory access. Other
               | platforms have a single, flat address space where out of
               | bounds memory access is not an error, it just reads
               | whatever is there since your program has full access to
               | all memory.
               | 
               | The same goes for integer overflow. Some platforms use
               | 1's complement signed integers, some platforms use 2's
               | complement. Signed overflow would simply give different
               | answers on these platforms. The standards committee long
               | ago decided that there's no sensible answer to give which
               | covers all cases, so they declared it undefined behaviour
               | which allows compilers to assume it'll never happen in
               | practice and make lots of optimizations.
               | 
               | Forcing signed overflow to have a defined behaviour means
               | forcing every single signed arithmetic operation through
               | this path, removing the ability for compilers to combine,
               | reorder, or elide operations. This makes a lot of
               | optimizations impossible.
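
A small illustration of the kind of assumption described above. This is a sketch, not a statement about what any particular compiler version will emit; the function names are made up for the example.

```c
#include <assert.h>
#include <limits.h>

/* Signed overflow is undefined, so the optimizer may treat x + 1 > x as
 * always true and reduce this function to "return 1". The assert documents
 * the precondition the caller has to uphold (it vanishes under NDEBUG). */
int next_is_greater(int x)
{
    assert(x < INT_MAX);
    return x + 1 > x;
}

/* The same question asked without arithmetic: defined for every int. */
int has_successor(int x)
{
    return x < INT_MAX;
}
```

If signed overflow were defined to wrap, the first function would have to return 0 for `INT_MAX`, so the comparison could no longer be folded away.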
        
               | johnny22 wrote:
               | doesn't C force 2s complement now? If so, one less thing
               | to worry about.
        
               | adrian_b wrote:
               | The problem is that there is a vicious circle.
               | 
               | Most old computer architectures had a much more complete
               | set of hardware exceptions, including cases like integer
               | overflow or out-of-bounds access.
               | 
               | In modern superscalar pipelined CPUs, implementing all
               | the desirable hardware exceptions without reducing the
               | performance remains possible (through speculative
               | execution), but it is more expensive than in simple CPUs.
               | 
               | Because of that, the hardware designers have taken
               | advantage of the popularity gained by languages like C
               | and C++ and almost all modern programming languages,
               | which no longer specify the behavior for various errors,
               | and they omit the required hardware means, to reduce the
               | CPU cost, justifying their decision by the existing
               | programming language standards.
               | 
               | The correct way to solve this would have been to include
               | in all programming language standards well-defined and
               | uniform behaviors for all erroneous conditions, which
               | would have forced the CPU designers to provide efficient
               | means to detect such conditions, like they are forced to
               | implement the IEEE standard for floating-point
               | arithmetic, despite their desire to provide unreliable
               | arithmetic, which is cheaper and which could win
               | benchmarks by cheating.
        
               | chongli wrote:
               | CPU designers don't like having their hand forced like
               | that. If you create a new standard forcing them to add
               | extra hardware to their designs, they'll skip your
               | standard and target the older one (which has way more
               | software marketshare anyway). They will absolutely bend
               | over backwards to save a few cycles here and a few
               | transistors there, just so they can cram in an extra
               | feature or claim a better score on some microbenchmark.
               | They absolutely do not care at all about making life
               | easier for low-level programmers, hardware testers, or
               | compiler writers.
        
               | rini17 wrote:
               | I don't believe adding simple checks against data already
               | present in L1 caches and marked as "unlikely to fail"
               | should be so onerous.
        
               | cryptonector wrote:
               | Bugs are UB-like in a sense (what's the code going to do?
               | well, you'll have to think about it, or try it and see),
               | but UB is strictly worse than bugs (different compilers,
               | even different versions of the same compiler, can do
               | radically different things way beyond the scope of the
               | bug).
        
               | AlotOfReading wrote:
               | What the standard explicitly calls out as UB is only a
               | small subset of actual UB.
               | 
               | While you can certainly classify all UB as "bugs", doing
               | so misses the critical differences between UB and other
               | categories of bugs. If you have a logic bug for example,
               | your program will correctly and consistently do the wrong
               | thing. It will continue doing that wrong thing with a
               | different compiler, on a different platform today and 10
               | years from now. Implementation defined behavior is a bit
               | looser, but will still be consistent with any particular
               | implementation (which will document the behavior) and
               | will only manifest in the code that depends on it. A PR
               | inserting one of these "normal" bugs doesn't invalidate
               | the entire rest of the program.
               | 
               | UB is different. You can't make assumptions about UB
               | because from the point of view of the standard, UB is
               | "not C". There are no assumptions to be made, it's just
               | all the stuff that doesn't have assigned semantics. And
               | since the input is meaningless, so is the entirety of
               | whatever the compiler gives you back.
        
               | coliveira wrote:
               | > If you have a logic bug for example, your program will
               | correctly and consistently do the wrong thing.
               | 
               | Not correct. Bugs can occur differently in different
               | architectures, even in high level languages. UB is just a
               | kind of bug whose effect depends on how the compiler
               | behaves, so you have to be careful to test your code on
               | different compiler settings. This is nothing new in
               | programming languages; it is only made explicit in the C
               | standard. Suddenly people started to believe that
               | pointing out the obvious source of bugs (UB) in the
               | standard is equivalent to letting programs misbehave.
        
               | AlotOfReading wrote:
               | I'm not sure if you're making a point about "unspecified
               | behavior" (where the compiler can choose between multiple
               | valid behaviors), but no, a strictly conforming program
               | will have the same semantics on different architectures.
               | Strictly conforming programs can still have bugs, but
               | their nature is completely different than UB because
               | that's the point of the standard.
        
               | adgjlsfhk1 wrote:
               | > you have to be careful to test your code on different
               | compiler settings.
               | 
               | The problem is you have to test your code on compilers
               | that don't exist yet with compiler settings that do
               | different things from any compiler that ever might exist.
        
               | coliveira wrote:
               | This has always been the case. If you write code that has
               | UB, new compilers can do something yet undefined, by
               | definition.
        
           | chongli wrote:
           | _And UB has always been an excuse for compilers to screw over
           | programmers in hideous ways_
           | 
           | Your reply was great up until this. Compiler writers aren't
           | looking to screw over programmers, they're looking to make
           | code faster. UB gives them the ability to make assumptions
           | about what is and is not true, at a particular moment in
           | time, in order to skip doing unnecessary work at runtime.
           | 
           | By assuming that code is always on the happy path, you can
           | cut a lot of corners and skip checks that would otherwise
           | greatly slow down the code. Furthermore, these benefits can
           | cascade into more and more optimizations. Sometimes you can
           | have these large, complicated functions and call graphs get
           | optimized down to a handful of inlined instructions.
           | Sometimes the speedup can be so dramatic that the entire
           | application is unusable without it!
           | 
           | Many of these optimizations would be impossible if compilers
           | were forced to assume the opposite: that UB will occur
           | whenever possible.
           | 
           | The tool programmers have available to them is compiler
           | flags. You can use flags to turn off these assumptions, at
           | the cost of losing out on optimizations, if your code needs
           | it and you're unable to fix it. But it's better to turn on
           | all possible warnings and treat warnings as errors, rather
           | than ignoring them, to push yourself to fix the code.
        
             | adgjlsfhk1 wrote:
             | the thing that makes UB almost malicious is that it
             | propagates inter-procedurally. This makes reasoning about
             | code with UB basically impossible which means that you
             | should always assume that the compiler is going to screw
             | you over if you use it because there is no way to know
             | whether it will.
        
               | chongli wrote:
               | You should consider a program with undefined behaviour to
               | be the equivalent of a mathematical proof that contains
               | an unstated contradiction. _Ex falso quodlibet_ : from a
               | falsehood anything follows. Also called the principle of
               | explosion.
               | 
               | Undefined behaviour renders your entire program
               | meaningless. It must be avoided at all costs. Using
               | undefined behaviour on purpose is like sticking a fork in
               | an electrical socket.
        
               | Kranar wrote:
               | It's funny that your original post was an objection to
               | how undefined behavior gives license to screw developers
               | over, but here you are talking about how undefined
               | behavior is like sticking a fork in an electrical socket.
        
               | chongli wrote:
               | My original post was an objection to the implied intent
               | on the part of compiler writers. An electrical socket
               | does not have intent, it's just a hazard that also
               | happens to provide enormous benefits to our lifestyles.
               | 
               | I think it's a perfect analogy to undefined behaviour in
               | C: enormous benefits but also a hazard to be wary of. A
               | lot of people don't understand the benefits, they just
               | see the hazard. Throughout this discussion I've been
               | trying to clarify that, with perhaps limited success.
        
               | tsegratis wrote:
               | But just to be clear @chongli is logical
               | 
               | Think of UB as a probabilistic error. I.e. it is always
               | stupid to rely on it
               | 
               | 1. Write code without errors -- sensible
               | 2. Allow compilers to assume the absence of errors --
               |    occasionally sensible, since it speeds up your program
               | 
               | In defence of UB, for the most part they are things that
               | should break your program anyway: stack overflow is never
               | correct. So your choice is mostly to fail badly quickly,
               | or to fail slowly well
               | 
               | Thanks to google making the UB sanitizers you are free to
               | make that choice even in C
        
               | Kranar wrote:
               | I'd argue that it's stupid to think that it's stupid to
               | rely on UB.
               | 
               | Almost any non-trivial software explicitly relies on
               | undefined behavior, including safety critical libraries
               | such as cryptographic libraries, the Linux operating
               | system has rampant undefined behavior that it makes a
               | conscious decision to use. POSIX makes use of undefined
               | behavior for shared libraries (it treats functions loaded
               | from shared libraries as void*, which is undefined
               | behavior).
        
               | Gibbon1 wrote:
               | That's not an argument to keep live grenades lying
               | around, it's an argument to remove them from the spec.
               | 
               | Like signed int overflow being UB. Define it to have two's
               | complement semantics. Problem solved. I'm sure the nutters trying to
               | extend C++ with templates will howl but this is C not
               | C++. And seriously C++ is dead man walking at this point.
        
               | pjmlp wrote:
               | Until LLVM, GCC, key game engines and GPGPU SDK get
               | rewritten into something else, it is going to be Resident
               | Evil day for a looong time.
        
               | chongli wrote:
               | C23 does make two's complement standard. It also adds
               | checked arithmetic so you can safely avoid signed
               | overflow.
               | 
               | It does not make signed overflow defined behaviour. This
               | would prevent integer operation reordering as an
               | optimization, leading to slower code.
        
               | Gibbon1 wrote:
               | Yeah, but it's reversed: signed overflow shouldn't be UB by
               | default. You should have to explicitly opt in for that.
               | 
               | The reason, of course, why they refuse to do that is
               | because if that were the case most shops would up and
               | ban unsafe signed.
        
               | properparity wrote:
               | >This would prevent integer operation reordering as an
               | optimization, leading to slower code.
               | 
               | The sane way to address that is to add explicit opt-in
               | annotations like 'restrict'.
               |     #push_optimize(assume_no_integer_overflow)
               |     int x = a + b;
               |     // more performance orientated code
               |     #pop_optimize
               |     // back to sane C
               | 
               |     #push_optimize(assume_no_alias(a, b), assume_stride(a, 16), assume_stride(b, 16))
               |     void compute(float *a, float *b, int index)
               |     {
               |         // here the compiler can assume a and b do not alias
               |         // and it can assume it can always load 16 bytes at a time
               |         // the programmer has made sure it's aligned and padded to so with any index
               |         // there's always 16 bytes to load
               |         // so go on, use any vectorized simd instruction you want
               |     }
               |     #pop_optimize
               |     // back to sane C
        
               | chongli wrote:
               | That's a lot uglier and clunkier than just using the
               | ckd_add, ckd_mul etc. safe checked arithmetic. Plus if an
               | overflow occurs you still get an incorrect result which
               | you probably don't want.
               | 
               | Or maybe I'm wrong? Do people actually want overflows to
               | occur and incorrect results? If they're willing to
               | tolerate incorrect results, why would they also want
               | optimizations disabled?
        
               | Gibbon1 wrote:
               | The thing is it's ugly in the rare case that absolute
               | performance is worth fighting for. And not ugly in the
               | majority case where it isn't in the top three important
               | things.
        
               | chongli wrote:
               | No, GP's proposal is ugly in the majority case. If you're
               | going to make signed overflow defined behaviour then
               | every time you write:
               | 
               |     int c = a + b;
               | 
               | You have to assume it will overflow and give an incorrect
               | result. So now you need to check everything, everywhere,
               | and you don't get any optimizations unless you explicitly
               | ask for them with those ugly #push_optimize annotations.
               | I completely fail to see how this is an advantage.
               | 
               | The way C works right now, the assumption is that you
               | want optimization by default and safety is opt-in. The
               | GP's proposal takes away the optimization by default. It
               | then makes incorrect results the default, but it does not
               | make safety the default. To make safety the default you
               | would have to force people to write conditionals all over
               | the place to check for the overflows with ckd_add,
               | ckd_mul etc. Merely writing:
               | 
               |     int c = a + b;
               | 
               | Does not give you any assurances that your answer will be
               | correct.
        
               | pclmulqdq wrote:
               | C++ 20 did that too.
        
               | Joker_vD wrote:
               | > Undefined behaviour renders your entire program
               | meaningless
               | 
               | That's exactly the complaint. Consider that the
               | implementations of the standard library sometimes have
               | exposed UB: that renders behaviour of _all_ of the
               | running code on the system undefined.
               | 
               | Many programmers believe that the fallout of the UB
               | could, and therefore should, be limited in scope.
        
               | coliveira wrote:
               | To achieve your goal, compilers would have to disable any
               | sufficiently powerful optimization. If you write bugs
               | (UB), a powerful compiler will eventually catch them and
               | generate code that you didn't intend at the beginning.
               | However, this is not the fault of the compiler or the
               | language.
        
               | chongli wrote:
               | Compiler writers have already done this. With flags you
               | can disable any optimization you like, with all of the
               | performance loss that entails. But then people complain
               | that their programs are slow.
               | 
               | What people really want is an AI that ignores the code
               | they write and just "does what they really meant." But of
               | course that's not foolproof either. Every day people ask
               | each other to do things and miscommunications occur, with
               | the wrong thing being done. I don't really know what to
               | say other than "people should be more careful and also
               | more forgiving."
        
               | coliveira wrote:
               | Exactly. All the hoopla about UB is complaining about how
               | compiler optimizations work and the fact that the
               | standard committee makes clear (with each new meeting)
               | what is considered undefined behavior or not. They should
               | instead thank the committee for clarifying this.
        
               | adgjlsfhk1 wrote:
               | It is the fault of the language to the extent that the
               | purpose of the language is to make it easy to write
               | correct programs, and UB makes it really hard (and in
               | some cases impossible) to write correct programs.
        
               | coliveira wrote:
               | It is just the opposite. UB is a clarification to tell
               | programmers what the language considers to be undesired
               | behavior. If they didn't say anything, it would always be
               | a mystery whether a certain construct was allowed or not,
               | effectively making it compiler dependent. Compilers would
               | also have less avenue for creating optimizations. In the
               | next iterations of the C standard we may see more
               | constructs classified as UB.
        
               | adgjlsfhk1 wrote:
               | That sounds good in theory, but many things that are UB
               | in C/C++ are UB because they are really hard to verify at
               | compile time which makes them almost impossible to
               | program around. Any signed addition in C is potential UB
               | unless you have a proof that all numbers that will ever
               | be input to the addition won't cause overflow (which is
               | made harder because C doesn't define the size of the
               | default integer types). Furthermore, no progress is UB
               | which means that as a programmer, you have to solve the
               | halting problem for your program before knowing whether
               | it has a bug.
        
               | jcranmer wrote:
               | > many things that are UB in C/C++ are UB because they
               | are really hard to verify at compile time which makes
               | them almost impossible to program around
               | 
               | The second half of the sentence doesn't follow from the
               | first. Take everyone's favorite example, signed integer
               | overflow: all you have to do to avoid UB on signed
               | integer overflow is check for overflow before doing the
               | operation (and C23 _finally_ adds features to do that for
               | you).
               | 
               | Taking a step back, the fundamental thing about UB is
               | that it is very nearly always a bug in your code (and
               | this includes especially integer overflow!). Even if you
               | gave well-defined semantics to UB, the semantics you'd
               | give would very rarely make the program not buggy.
               | Complaining that we can't prove programs free of UB is
               | tantamount to complaining that we can't prove programs
               | free of bugs.
               | 
               | It turns out that UB is actually extremely
               | helpful for tools that try to help programmers find bugs
               | in their code. Since UB is automatically a bug, any tool
               | that finds UB knows that it found a bug; if you give it
               | well-defined semantics instead, it's a lot trickier to
               | assert that it's a bug. In a real-world example, the
               | infamous buffer overflow vulnerability Heartbleed stymied
               | most (all?) static analyzers for the simple reason that,
               | due to how OpenSSL did memory management, _it wasn't
               | actually undefined behavior by C's definition_. Unsigned
               | integer overflow also falls into this bucket--it's very
               | hard to distinguish intentional cases of unsigned
               | integer overflow (e.g., hashing algorithms) from
               | unintentional cases (e.g., calculating buffer sizes).
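
As an illustration of the buffer-size case: unsigned multiplication wraps silently, so no tool can flag it as UB and the check has to be written by hand. The helper name `buffer_bytes` is hypothetical; the sketch only assumes standard `size_t` and `SIZE_MAX`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: computes count * elem_size for an allocation.
 * Unsigned multiplication wraps silently (it is defined behaviour), so the
 * check has to be explicit. Returns false if the product would wrap. */
bool buffer_bytes(size_t count, size_t elem_size, size_t *out)
{
    assert(out != NULL);
    /* Divide instead of multiply so the test itself cannot wrap. */
    if (elem_size != 0 && count > SIZE_MAX / elem_size)
        return false;
    *out = count * elem_size;
    return true;
}
```

A caller would pass the result to `malloc` only when the function returns `true`.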
        
               | xigoi wrote:
               | > all you have to do to avoid UB on signed integer
               | overflow is check for overflow before doing the operation
               | (and C23 finally adds features to do that for you).
               | 
               | ...making your code practically unreadable, since you
               | have to write
               | 
               |     ckd_add(ckd_add(ckd_mul(a,a),ckd_mul(ckd_mul(2,a),b)),ckd_mul(b,b))
               | 
               | instead of a * a + 2 * a * b + b * b.
        
               | chongli wrote:
               | That's not the correct syntax for the ckd_ operations.
               | They take 3 operands, the first being a pointer to an
               | integer where the result should be stored. And they
               | return a bool, which you need to check in a conditional.
               | If you're just going to throw out the bool and ignore the
               | overflows, why bother with checked operations in the
               | first place?
        
               | xigoi wrote:
               | Yeah, I realize that now. That's even worse. So you'll
               | have to write something like
               | 
               |     int aa,twoa,twoab,bb,aaplustwoab,aaplustwoabplusbb;
               |     if (ckd_mul(&aa,a,a)) { return error; }
               |     if (ckd_mul(&twoa,2,a)) { return error; }
               |     // ...
               |     if (ckd_add(&aaplustwoabplusbb,aaplustwoab,bb)) { return error; }
               |     return aaplustwoabplusbb;
               | 
               | So ergonomic!
               | 
               | > If you're just going to throw out the bool and ignore
               | the overflows, why bother with checked operations in the
               | first place?
               | 
               | I'd expect the functions to return the result on success
               | and crash on failure. Or better, raise an exception, but
               | C doesn't have exceptions...
        
               | chongli wrote:
               | Why not just write:
               | 
               |     bool aplusb_sqr(int* c, int a, int b) {
               |         // ckd_* return true on overflow, so negate them
               |         return c && !ckd_add(c, a, b) && !ckd_mul(c, *c, *c);
               |     }
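
For formulas that do not factor so neatly, one possible pattern is to accumulate a sticky overflow flag and test it once at the end. This is only a sketch: `square_of_sum_expanded` is a hypothetical name, and the fallback to the GCC/Clang `__builtin_*_overflow` functions is an assumption for toolchains without `<stdckdint.h>`.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Use the C23 header when available; otherwise fall back to the GCC/Clang
 * builtins (an assumption about the compiler, not part of C23). */
#if defined(__has_include)
#  if __has_include(<stdckdint.h>)
#    include <stdckdint.h>
#  endif
#endif
#ifndef ckd_add
#  define ckd_add(r, a, b) __builtin_add_overflow((a), (b), (r))
#  define ckd_mul(r, a, b) __builtin_mul_overflow((a), (b), (r))
#endif

/* Hypothetical helper: computes a*a + 2*a*b + b*b into *out.
 * Returns true on success, false if any intermediate step overflowed. */
bool square_of_sum_expanded(int a, int b, int *out)
{
    assert(out != NULL);
    int aa, ab, two_ab, bb, sum;
    bool overflow = false;
    /* Each ckd_* call yields true on overflow; the flag stays set once hit. */
    overflow |= ckd_mul(&aa, a, a);
    overflow |= ckd_mul(&ab, a, b);
    overflow |= ckd_mul(&two_ab, 2, ab);
    overflow |= ckd_mul(&bb, b, b);
    overflow |= ckd_add(&sum, aa, two_ab);
    overflow |= ckd_add(&sum, sum, bb);
    if (overflow)
        return false;
    *out = sum;
    return true;
}
```

Note that this reports failure whenever an intermediate term overflows, even if the final value would have fit in an `int`.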
        
               | xigoi wrote:
               | Obviously you could do that in this case, I just wanted
               | to come up with a complicated formula.
        
               | c4mpute wrote:
               | > all you have to do to avoid UB on signed integer
               | overflow is check for overflow before doing the operation
               | 
               | All you have to do is add a check for overflow _that the
               | compiler will not throw away because "UB won't happen"_.
               | The very thing you want to avoid makes avoiding it very
               | hard, and lots of bugs have resulted from compilers
               | "optimizing" away such overflow checks.
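
A sketch of the difference being described. The broken pattern tests the result after the overflow has already happened; the alternative decides from the operands alone, so there is no undefined behaviour for the compiler to reason from. The function name `add_if_representable` is hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Broken pattern, shown only as a comment: the test runs AFTER the overflow
 * has already happened, so a compiler may assume it is always false and
 * delete it.
 *
 *     int sum = a + b;
 *     if (b > 0 && sum < a) { handle overflow }
 */

/* Well-defined alternative: decide from the operands alone, before adding.
 * Returns true and stores a + b in *out only when the sum is representable. */
bool add_if_representable(int a, int b, int *out)
{
    assert(out != NULL);
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *out = a + b;
    return true;
}
```

Neither `INT_MAX - b` (for positive `b`) nor `INT_MIN - b` (for negative `b`) can overflow, which is what keeps the check itself safe.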
        
               | chongli wrote:
               | This is covered in the article and numerous replies in
               | this thread. Use <stdckdint.h>.
        
               | the_why_of_y wrote:
               | My complaint here is that it took C more than 30 years
               | between defining signed integer overflow as UB and
               | providing programmers with standard library facilities to
               | check if a signed integer operation would result in
               | overflow.
               | 
               | I much prefer Rust's approach to arithmetic, where
               | overflow with plain arithmetic operators is defined as a
               | bug, and panics on debug-enabled builds, plus special
               | operations in the standard library like _wrapping_add_
               | and _saturating_add_ for the special cases where overflow
               | is expected.
        
               | chongli wrote:
               | _My complaint here is that it took C more than 30 years
               | ... I much prefer Rust's approach_
               | 
               | That's an odd complaint. Rust didn't spring forth fully
               | formed from the ether, it stands on the shoulders of C
               | (and other giants of PL history). 30 years ago you
               | couldn't use Rust at all because it didn't exist.
               | 
               | The reason the committee doesn't just radically change C
               | in all these nice ways to catch up to Rust is because it
               | would be incompatible. Then you wouldn't have fixed C,
               | you'd just have two languages: "old C", which all of the
               | existing C code in the world is written in, and "new C",
               | which nothing is written in. At that point why not just
               | start over from scratch, like they did with Rust?
        
               | the_why_of_y wrote:
               | Interestingly, the first Ada standard in 1983 defined
               | signed integer overflow to raise a CONSTRAINT_ERROR
               | exception.
               | 
               | But apparently it lacked unsigned integers with modular
               | arithmetic?
               | 
               | http://archive.adaic.com/standards/83lrm/html/lrm-11-01.h
               | tml... http://archive.adaic.com/standards/83lrm/html/lrm-
               | 03-05.html
               | 
               | The 2012 version is a bit more readable, and has unsigned
               | integers:
               | 
               |  _For a signed integer type, the exception
               | Constraint_Error is raised by the execution of an
               | operation that cannot deliver the correct result because
               | it is outside the base range of the type. For any integer
               | type, Constraint_Error is raised by the operators "/",
               | "rem", and "mod" if the right operand is zero._
               | 
               |  _For a modular type, if the result of the execution of a
               | predefined operator (see 4.5) is outside the base range
               | of the type, the result is reduced modulo the modulus of
               | the type to a value that is within the base range of the
               | type._
               | 
               | http://www.ada-
               | auth.org/standards/rm12_w_tc1/html/RM-3-5-4.h...
        
               | chongli wrote:
               | See my other comment [1] which addresses the exact things
               | you brought up here. Safe checked arithmetic is a new
               | standard feature in C23. If no progress were not UB, then
               | tons of loop optimizations would be impossible and then
               | we couldn't have nice things, like numpy.
               | 
               | [1] https://news.ycombinator.com/item?id=35406554
        
               | coliveira wrote:
               | > Any signed addition in C is potential UB unless you
               | have a proof that all numbers that will ever be input to
               | the addition won't cause overflow
               | 
               | This has always been the case. Standard C has always
               | operated with the possibility that addition can overflow.
               | The programmer or library writer is responsible to check
               | if the used types are large enough. If you want to be
               | perfectly sure you need to check for overflow. Making
               | this UB has not changed the nature of the issue.
               | 
               | > is made harder because C doesn't define the size of the
               | default integer types
               | 
               | They correctly made this implementation defined. But C
               | now has different byte sized integer types if you want to
               | be sure.
        
               | CJefferson wrote:
               | Is the improved performance of C over say Java, or Rust
               | (which both have much less undefined behaviour -- Java
               | almost none) worth the pain and bugs which have been
               | caused by UB?
               | 
               | Honestly, I don't think so, and as computers get more
               | powerful and the amount of the world which relies on
               | their correct functioning grows, I feel the arguments for
               | UB become increasingly difficult to justify.
        
               | chongli wrote:
               | I went to look up undefined behaviour in Rust and I got
               | this scary warning:
               | 
               |  _Warning: The following list is not exhaustive. There is
               | no formal model of Rust's semantics for what is and is
               | not allowed in unsafe code, so there may be more behavior
               | considered unsafe. The following list is just what we
               | know for sure is undefined behavior. Please read the
               | Rustonomicon before writing unsafe code._
               | 
               | After the warning was a list of many of the same types of
               | things that are undefined behaviour in C. In addition,
               | there's a bunch more undefined behaviour related to
               | improper usage of the unsafe keyword.
               | 
               | So I don't think you get a free lunch with Rust here.
               | What you get is a "safe" playground if you stay within
               | the guard rails and avoid using the unsafe keyword. But
               | then you are limited to writing programs which can be
               | expressed in safe Rust, a proper subset of all programs
               | you might want to write.
               | 
               | Furthermore, the lack of a formal specification for Rust
               | is one area where it lags behind C, a standardized
               | language. All of the undefined behaviour in C is decreed
               | and documented by the standard, having been decided by
               | the committee. Rust, on the other hand, may have weird
               | and unpredictable behaviour that you just have to debug
               | yourself, which may or may not be compiler bugs.
        
               | Kranar wrote:
               | C does not have a formal specification either. It has a
               | standards document that is written using formal English,
               | but it does not provide a formal spec of C's semantics. A
               | formal spec of a programming language's semantics would
               | entail using a formal semantic model such as operational
               | or denotational semantics. Some programming languages do
               | specify the formal semantics for the entire language or
               | some subset of the language but C is not one of them.
               | 
               | Your claim that the C Standard lists all undefined
               | behavior is actually false. The C Standard only lists out
               | the explicit list of undefined behavior, but it does not
               | list out the implicit list of undefined behavior. There
               | have been efforts to make just such a list but it's an
               | incredibly difficult task.
        
               | CJefferson wrote:
               | I agree rust isn't perfect, but I think you underestimate
               | the value of "safe" code.
               | 
               | I often write programs that have unsafe code. However,
               | the unsafe code is never more than 100 lines, which means
               | I have a very small amount of code to reason about --
               | Rust users expect (of course, you as a programmer have to
               | enforce) that it should not be possible to cause UB from
               | safe code, so my "safe interface" to my unsafe code ensures
               | my code can't cause UB, no matter what I call.
               | 
               | One problem with Rust is generally when you mess up it
               | panics -- I think that's better than buffer overflows and
               | the like, but still not a good user experience.
               | 
               | This means there is a very small amount of code I have to
               | really think about, while in C or C++ I have to think
               | about basically any place x[i] appears (regardless of
               | whether x is a pointer or a std::vector).
               | 
               | You can of course write safe C code, people do, but it's
               | hard, and it only takes one slip up anywhere in your
               | program to blow it.
        
               | chongli wrote:
               | In one sense, C is the unsafe code block for myriad other
               | languages, like Python. Python users don't want to deal
               | with undefined behaviour either. They want to write their
               | high level code in NumPy or PyTorch and just have
               | everything work very fast.
               | 
               | Little do they know: they rely on C for those libraries
               | and for things like ATLAS and LAPACK, which implement the
               | underlying numerical linear algebra code. Well, it turns
               | out that ATLAS relies pretty heavily on optimizing C
               | compilers to generate optimal code on many different
               | platforms. At the bottom of all this are the many loop
               | optimizations included in compilers which, thanks to
               | undefined behaviour in the C spec, are able to assume
               | that code is always on the happy path.
               | 
               | It also turns out that Rust includes bindings to ATLAS
               | and LAPACK. I would imagine at some point people might
               | want to write a new linear algebra package in pure Rust.
               | I think it'll be quite difficult to match the performance
               | of those two in safe Rust, but we'll see.
        
               | jamincan wrote:
               | Isn't LAPACK written in Fortran?
        
               | chongli wrote:
               | You're right, and ATLAS is as well, but Fortran has
               | undefined behaviour [1] for all the same reasons that C
               | does.
               | 
               | [1] https://stackoverflow.com/a/57558908
        
         | benj111 wrote:
         | My understanding was that they're changing realloc() because
         | they previously allowed zero length arrays and because you
         | can't tell if this is a zero length array you need to either
         | get rid of zero length arrays or change realloc().
         | 
         | So the feature wasn't broken to begin with, it was broken by
         | another feature.
        
         | [deleted]
        
         | GuB-42 wrote:
         | UB can initiate the rise of zombie velociraptors.
         | 
         |     int n;
         |     printf("type 0 to stop the rise of zombie velociraptors");
         |     scanf("%d", &n);
         |     realloc(pre, n);
         |     if (n != 0) rise_zombie_velociraptors();
         | 
         | May result in velociraptors rising even if the user enters
         | "0".
         | 
         | The reason is that because realloc(pre, 0) is UB, for the
         | compiler, it cannot happen, so n can't be 0, so the n != 0 test
         | can be optimized out, so, velociraptors.
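         | 
         | One way to stay clear of this, sketched with a hypothetical
         | wrapper name: handle the zero-size case explicitly so that
         | realloc() is never called with size 0.

```c
#include <stdlib.h>

/* Hypothetical wrapper: never passes size 0 to realloc().
   A zero-size request frees the block and returns NULL, so the caller
   must treat "NULL with size 0" as success rather than as failure. */
void *resize_buffer(void *ptr, size_t size)
{
    if (size == 0) {
        free(ptr);      /* free(NULL) is a no-op */
        return NULL;
    }
    return realloc(ptr, size);
}
```

         | With this, the zero case is ordinary defined behaviour and the
         | compiler has nothing to infer about n.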
        
       | a-bit-of-code wrote:
       | Is it just me that thinks that the article is a [skilfully
       | drafted] joke (or parody or whatever the correct word is)? The
       | fact that it has been published close to April 1st raises more
       | suspicions.
        
         | still_grokking wrote:
         | My interpretation would be rather that the C language is a
         | carefully drafted joke or parody.
        
         | brxaf wrote:
         | I thought the same initially, but the realloc() parts are
         | definitely true.
        
       | PointyFluff wrote:
       | [dead]
        
       | Dylan16807 wrote:
       | > C23 furthermore gives the compiler license to use an
       | unreachable annotation on one code path to justify removing,
       | without notice or warning, an entirely different code path that
       | is not marked unreachable: see the discussion of puts() in
       | Example 1 on page 316 of N3054.9
       | 
       | I don't agree with that description at all. Here's the code:
       | 
       |     1 if (argc <= 2)
       |     2   unreachable();
       |     3 else
       |     4   return printf("%s: we see %s", argv[0], argv[1]);
       |     5 return puts("this should never be reached");
       | 
       | The only code path that's "entirely different" is lines 1,4,5 and
       | in that case of course you remove a return that's after a return.
       | 
       | And the other valid code path is 1,2,5, which has `puts` after
       | `unreachable`.
       | 
       | To need `puts` you have to imagine a code path that gets past the
       | "if" without taking either branch?
       | 
       | Maybe the author means something by "code path" that's very
       | different from how I interpret it?
       | 
       | I would be pretty surprised if the above code means something
       | different from:
       | 
       |     if (argc <= 2) {
       |       unreachable();
       |       return puts("this should never be reached");
       |     } else {
       |       return printf("%s: we see %s", argv[0], argv[1]);
       |       return puts("this should never be reached");
       |     }
        
         | cryptonector wrote:
         | There's no problem with this feature. I don't understand TFA's
         | problem with it. As a programmer I get to not use
         | `unreachable()` if I don't want to, and if I do I'm happy that
         | the compiler takes my word for it and does the right thing.
         | This is not at all like code elision in UB cases.
         | 
         | The `realloc()` change though...
        
         | wahern wrote:
         | I think the point is that if the `argc <= 2` path is
         | unreachable, then that means argc is _always_ greater than 2,
         | permitting the compiler to optimize the entire block to just:
         | 
         |     return printf("%s: we see %s", argv[0], argv[1]);
         | 
         | IOW, the conditional has been elided. But you're right in that
         | the wording of the complaint doesn't match the example. The
         | author presumably had in mind some of the more infamous NULL
         | pointer-related optimizations, without spending the time to put
         | together a properly analogous example.
        
           | dtolnay wrote:
           | I interpreted the author's characterization to be about
           | something like:
           | 
           |     1  if (argc <= 2)
           |     2    puts("A");
           |     3  puts("B");
           |     4  if (argc <= 2)
           |     5    unreachable();
           |     6  else
           |     7    return puts("C");
           |     8  return puts("D");
           | 
           | in which not just lines 4-6,8 go away (as you said) but also
           | lines 1-2.
           | 
           | It makes sense to me but I can see why the author would
           | characterize this situation as _" license to use an
           | unreachable annotation on one code path to justify removing
           | an entirely different code path that is not marked
           | unreachable"_. In a different world one might expect A to be
           | printed "before the UB happens".
        
             | masklinn wrote:
             | On the other hand, that has been the behaviour of
             | optimising compilers in the face of UBs for years at this
             | point, decades maybe. The linux kernel was hit by a deref'
             | constraint propagation back in 2009 or so.
             | 
             | This is a behaviour I would absolutely expect from the
             | construct, I would even qualify it as "the point".
        
         | alwaysbeconsing wrote:
         | One way to look at it (and I am not sure if this is correct,
         | but it may be what the essay author meant) is to not treat the
         | `unreachable` as affecting the presence of the decision, but
         | only the result of the decision. If `unreachable` was replaced
         | by a normal statement, we'd have:
         | 
         |     if (argc <= 2)
         |         do_something();
         |     else
         |         return printf("%s: we see %s", argv[0], argv[1]);
         | 
         | So the `return printf` is executed when `argc` is greater than
         | 2. If we remove _just the body_ of the first branch:
         | 
         |     if (argc <= 2)
         |         ;
         |     else
         |         return printf("%s: we see %s", argv[0], argv[1]);
         | 
         | the same thing holds. And additionally when `argc <= 2`,
         | control _will_ move past the `if`.
         | 
         | Under this view, if the `unreachable` won't cause the entire
         | removal of the `if`, the compiler will produce the equivalent
         | of:
         | 
         |     if (argc > 2)
         |         return printf("%s: we see %s", argv[0], argv[1]);
         | 
         |     return puts("this should never be reached");
         | 
         | Again, I don't say this is the correct interpretation, but it
         | is one possibility, that would have to be ruled out by other
         | parts of the standard.
        
           | Dylan16807 wrote:
           | I understand that interpretation, but that's what the end of
           | my comment is about. If we treat unreachable as affecting the
           | block it's in, but pretend it's not there for control flow,
           | then the two versions of the code do different things. That's
           | confusing and hard to preserve.
        
         | badrabbit wrote:
         | Shouldn't the compiler warn or error on unreachable code?
        
           | codeflo wrote:
           | This is not about code that's found to be unreachable through
           | static analysis (where compilers might warn), but about a
           | manual programmer annotation that claims the code is
           | dynamically unreachable even though statically it might look
           | otherwise.
        
             | benj111 wrote:
             | Why would you want that?
             | 
             | Is it to aid building for multiple targets? For debug
             | builds?
        
               | masklinn wrote:
               | > Why would you want that?
               | 
               | To aid with optimisation, it basically lets you ask the
               | compiler to remove branches, and provide constraints to
               | the same.
               | 
               | An implementation might trap in debug code, but given no
               | context would be provided you'd likely avoid this and
               | would instead use your own wrapper macro to output a
               | message of some sort in that case.
        
               | properparity wrote:
               | But why put in unreachable? Doesn't make any sense to me.
               | 
               | If a branch is truly not supposed to ever happen, why
               | have a branch at all? Just remove that code from the
               | source entirely- that helps the optimizer even more,
               | because the most optimal code is of course no code at
               | all.
        
               | masklinn wrote:
               | > But why put in unreachable? Doesn't make any sense to
               | me.
               | 
               | Because sometimes you don't have a choice e.g. say you
               | have a switch/case, if you don't do anything and none of
               | the cases match, then it's equivalent to having an empty
               | `default`. But you may want a `default: unreachable()`
               | instead, to tell the compiler that it needs no fallback.
               | 
               | > If a branch is truly not supposed to ever happen, why
               | have a branch at all? Just remove that code from the
               | source entirely- that helps the optimizer even more,
               | because the most optimal code is of course no code at
               | all.
               | 
               | Except the compiler may compile code with the assumption
               | that it needs to handle edge cases you "know" are not
               | valid. By providing these branches-which-are-not, you're
               | giving the compiler more data to work with. That extra
               | data might turn out to be useless, but it might not.
        
               | benj111 wrote:
               | But this example isn't adding a constraint. The if
               | statement is getting optimised away???
        
               | masklinn wrote:
               | It is adding a constraint. The constraint is that argc
               | can't be 2 or smaller. This is a literal "can't", as
               | far as the compiler is concerned it's a logical
               | impossibility.
               | 
               | The branch containing the unreachable() obviously gets
               | removed but the compiler then propagates the constraint
               | (the condition for that illegal branch), and can prune
               | any other path where `argc <= 2` upstream and downstream,
               | as they are dead code per the constraint.
        
               | ufo wrote:
               | It helps optimization. One example is if you have code
               | like this:
               | 
               |     if (condition) {
               |         error_stuff();
               |         abort();
               |     }
               |     normal_stuff();
               | 
               | If the compiler doesn't know that abort exits the
               | program, they have to compile the normal_stuff path under
               | the assumption that the error path might have run before
               | it. This might result in suboptimal code.
               | 
               | Currently, many compilers support annotations such as
               | __attribute__((noreturn)) and __builtin_unreachable() to
               | manually indicate that a code path is unreachable. C23 is
               | now standardizing these features (with a slight tweak to
               | the syntax).
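               | 
               | A minimal sketch of the noreturn case, using the C11
               | spelling _Noreturn (C23 adds the [[noreturn]] attribute).
               | The function names fail and checked_div are hypothetical:

```c
#include <stdio.h>
#include <stdlib.h>

/* Tells the compiler that control never comes back from this call. */
_Noreturn void fail(const char *msg)
{
    fprintf(stderr, "fatal: %s\n", msg);
    abort();
}

int checked_div(int num, int den)
{
    if (den == 0)
        fail("division by zero");
    /* Because fail() never returns, the compiler can compile this
       path under the assumption that den != 0. */
    return num / den;
}
```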
        
               | _0ffh wrote:
               | You can for example use it to give hints to the compiler
               | that allows for optimisations, that it couldn't do
               | otherwise.
               | 
               | Described e.g. here https://web.archive.org/web/201605080
               | 51118/http://blog.regeh...
               | 
               | Github https://github.com/preames/llvm-assume-hack
        
               | flohofwoe wrote:
               | Unreachable is mainly used as an optimization hint. For
               | instance if you put an unreachable into the default
               | branch of a continuous and non-exhaustive (from the pov
               | of the compiler) switch-case statement, the compiler will
               | not emit a range check for the jump table lookup.
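               | 
               | A sketch of that switch case, assuming C23's
               | unreachable() macro from <stddef.h> and falling back to
               | the GCC/Clang builtin on older toolchains (the fallback
               | and the enum are illustrative assumptions):

```c
#include <stddef.h>

/* C23 defines unreachable() in <stddef.h>. The fallback below is an
   assumption about the compiler, not something the standard provides. */
#ifndef unreachable
#  define unreachable() __builtin_unreachable()
#endif

enum color { RED, GREEN, BLUE };

/* Hypothetical example: the caller promises c is one of the three
   enumerators. Passing anything else is undefined behaviour. */
int color_channel_shift(enum color c)
{
    switch (c) {
    case RED:   return 16;
    case GREEN: return 8;
    case BLUE:  return 0;
    default:    unreachable();  /* no fallback code needs to be emitted */
    }
}
```

               | Whether the range check actually disappears depends on
               | the compiler and optimization level.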
        
         | Asooka wrote:
         | This just shows that "unreachable" is almost impossible to use
         | safely. The only safe use of unreachable is if it is
         | immediately after an instruction that makes the program stop
         | running. It is _not_ for  "this cannot happen", because things
         | that "cannot happen" happen all the time. If you use
         | "unreachable", you're just asking for trouble and it seems the
         | compiler authors are happy to oblige.
        
           | josephcsible wrote:
           | This couldn't be more wrong. What you say to never use
           | unreachable for is one of the most important use cases of
           | unreachable. The whole point is to give the optimizer an
           | assumption that it can't figure out on its own.
        
         | ternaryoperator wrote:
         | This reminds me of a point made by the late Stan Kelly-Bootle,
         | who for years wrote the Devil's Advocate column in UNIX Review
         | magazine. In the early 1990s, he was discussing Microsoft's new
         | C compiler and noted that in the promo material for the new
         | compiler, it showed a benchmark for a loop that counted from 1
         | to 10,000 then printed "Hello". MS claimed that without
         | optimization it took a few milliseconds, after optimization: 0
         | ms. A small asterisk explained the optimizer simply removed the
         | loop. Kelly-Bootle pointed out, that the only reason a
         | developer would write such a loop was to introduce a needed
         | delay. Therefore, deleting the loop was not optimizing, but in
         | fact pessimizing. And so, it was in fact Microsoft's
         | Pessimizing C compiler.
        
           | kzrdude wrote:
           | I think it's a practical example of how the C language has
           | made a journey to being more high abstraction than it used to
           | be, in practice. And how that unsettles those used to the old
           | behaviour.
        
           | viraptor wrote:
           | Those delay loops are common on microcontrollers and the
           | usual solution is to either make the counter volatile or
           | insert something opaque to the compiler in the loop body.
           | 
           | It would be of course nice if a warning was produced for that
           | specific case: This whole loop was removed - is it really
           | what you wanted, or is it a broken delay loop?
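           | 
           | A minimal sketch of the volatile variant, with a hypothetical
           | function name. Accesses to a volatile object count as
           | observable behaviour, so the loop has to stay; how long it
           | takes still depends on the CPU and clock speed.

```c
/* Hypothetical busy-wait. Every read and write of i is a volatile
   access, so the compiler cannot delete the loop. */
unsigned long spin_delay(unsigned long iterations)
{
    volatile unsigned long i;
    for (i = 0; i < iterations; i++) {
        /* intentionally empty */
    }
    return i;   /* equals iterations once the loop has finished */
}
```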
        
           | hyperhopper wrote:
           | This is not true at all:
           | 
           | I've seen many loops that turn into no-ops because all the
           | functionality has been refactored out but this fact is hidden
           | in function calls.
           | 
           | Sure, this should ideally be surfaced as a lint error, not a
           | compiler optimization, but you cannot say that intentional
           | delays are the "only" reason.
           | 
           | Also since processing time is variable, using that as a
           | method should be extremely heavily
           | discouraged/warned/require-opt-in
        
           | codeflo wrote:
           | Of course, that's technically incorrect. The way the
           | standards are written, the compiler is free to replace the
           | program with any other program that has the same (in a
           | precisely defined sense) observable behavior (these are the
           | famous "as if" formulations in language specs). Heating up
           | the CPU is not considered observable behavior.
           | 
           | If someone really just wants a delay, it's easy to either
           | (for programs running on normal OSs) call a sleep function,
           | or (on tiny embedded systems) add an empty inline assembler
           | statement that the compiler can't see through.
        
             | carlmr wrote:
             | >Heating up the CPU is not considered observable behavior.
             | 
             | Neither is measuring delays of cached versus non-cached
             | instructions. Yet it turns out to be very observable.
        
               | codeflo wrote:
               | Of course these things are "observable" in the literal
               | sense. And yet, they aren't considered to be observable
               | by the memory model of any language spec that I know of.
               | Same as CPU power draw, which has been used as a side-
               | channel to extract bits of crypto keys, and is very much
               | influenced by common optimizations.
               | 
               | Practically, if you need to execute a specific sequence
               | of machine instructions in order to prevent side-channel
               | attacks, then you have to rely on assembler, compiler
               | intrinsics and/or OS support. But that was true way
               | before Spectre.
        
         | [deleted]
        
       | JonChesterfield wrote:
       | Author is angry but not wrong. Lifting the most damning quote
       | from the article as I haven't seen it for a while.
       | 
       | C inventor Dennis Ritchie pointed to several flaws in [ANSI C]
       | ... which he said is a licence for the compiler to undertake
       | aggressive optimisations that are completely legal by the
       | committee's rules, but make hash of apparently safe programs; the
       | confused attempt to improve optimisation ... spoils the language.
       | 
       | --Dennis Ritchie on the first C standard
        
       | kgbcia wrote:
       | I just need built-in string handling
        
       | MatmaRex wrote:
       | > As C89 was taking shape, the neurodivergent notion of a "zero-
       | length object" was making the rounds
       | 
       | I'm surprised that the authors decided to, and were able to, slip
       | in this little euphemism.
        
         | nimish wrote:
         | It's still apt, even as someone ostensibly in that category.
         | 
         | It does require some abstract thinking to comprehend sets of
         | zero measure, negative measure or complex measure in
         | mathematics. A "zero length object" is also encountered pretty
         | often in practice: http://docs.autodesk.com/CIV3D/2013/ENU/index
         | .html?url=files... and zero-length files come to mind.
         | 
         | The euphemism ends up working out fine, though likely not the
         | author's intent.
        
         | blahedo wrote:
         | Thanks for pointing this out---when I read the article I
         | tripped on that word, thought it odd and not sure what the
         | author was trying to say, and moved on, but now that you call
         | it out it seems very obviously to be used in just the same way
         | that a lot of people used to use the r-slur (and some still
         | do).
        
         | [deleted]
        
         | bee_rider wrote:
         | I wonder if somewhere along the chain there was an automated
         | tool to convert frequently abused mental-health related terms
         | like "insane" into something less hurtful, or something along
         | those lines?
         | 
         | I haven't seen widespread use of the word "neurodivergent" as a
         | kind of... whatever this is, weirdly euphemistic slur, almost?
        
           | wizzwizz4 wrote:
           | > _I wonder if somewhere along the chain there was an
           | automated tool to convert frequently abused mental-health
           | related terms like "insane" into something less hurtful, or
           | something along those lines?_
           | 
           | And yet again, these Lincoln systems mess up. While giving
           | advice to the author can avoid Great Grimsby mistakes, making
           | the replacements _automatically_ is an utterly Scunthorpe
           | decision, with failures as Slough as they are foreseeable.
           | 
           | If the intent is to be unhurtful, it should need to choose a
           | _different word_ to if the intent is to be hurtful! Even our
           | most sophisticated automated systems are Milton Keynes at
           | determining that kind of thing.
        
             | shagie wrote:
             | As an aside, this is one of the spots where GPT does a
             | really good job of fixing things.
             | 
             |     Reword the following passage. Change euphemisms to
             |     wording that has similar meaning though no negative
             |     connotations. Indicate changed words by putting them
             |     in "{{word}}"
             |     ###
             |     Standards are supposed to lead
             |     ...
             |     Why are such requests made? Often because of
             |     arithmetic bugs. And what is a non-null pointer from
             |     malloc(0) good for? Absolutely nothing, except
             |     shooting yourself in the foot.
             |     ###
             | 
             | To which it returned:
             | 
             |     Standards are meant to provide the groundwork for a
             |     better world by enabling code portability. Real
             |     standardization necessitates {{managing}} a variety of
             |     compiler and library implementations while maintaining
             |     sensible rules. The saga of the realloc story
             |     demonstrates that C standardization does not operate in
             |     this fashion any longer.
             | 
             |     As C89 was being crafted, there was a {{unconventional}}
             |     thought of a "zero-length object": Supporters proposed
             |     that a non-null pointer to such an object should be
             |     given for requests for zero-byte allocation.
             | 
             |     Why do such requests come about? Frequently due to
             |     miscalculations. And what is a non-null pointer from
             |     malloc(0) useful for? Nothing, apart from {{setting
             |     oneself up for failure}}.
             | 
             | It changed (along with some wording around to make the
             | changes fit):
             | 
             | herding cats -> managing
             | 
             | neurodivergent -> unconventional
             | 
             | shooting yourself in the foot -> setting oneself up for
             | failure
        
               | bee_rider wrote:
               | "Unconventional" seems like a bad pick to me, too
               | neutral. Clearly the author intends to say something
               | negative about zero-length objects. And of course it is
               | fine to dislike things, it is just a matter of not using
               | hurtful language.
        
               | liquidify wrote:
               | There is no possible way to have style without the
               | potential to bother someone. Just write how you feel. If
               | the readers are so offended, they can stop reading. Life
               | will go on.
        
               | wizzwizz4 wrote:
               | There's no way to say "this thing is rubbish" without the
               | potential to bother people who like it. But it's entirely
               | possible to say it without pissing off those who don't
               | speak, or have motor disabilities, or like Justin Bieber.
        
               | bee_rider wrote:
               | There are so many less hurtful words, I can't accept the
               | idea that style requires these particular words. I mean
               | the sentence is clunky with "neurodivergent" anyway, and
               | this unusual use of the word sticks out and is
               | distracting. The style is not improved by this pick.
               | 
               | How about "awful" "asinine" or "shit-tastic" instead?
        
               | smsm42 wrote:
               | I guess that'd be the way to detect if the text has been
               | written by the AI - it'd be completely devoid of
               | metaphors and cleansed of anything that could possibly
               | offend somebody. I wouldn't ever call it a "good job" but
               | I guess it's useful.
        
               | xigoi wrote:
               | Excuse me, but I'm still offended by the word
               | "miscalculations". It implies that calculations can be
               | wrong, which dehumanizes people with dyscalculia.
        
               | smsm42 wrote:
               | File a report to OpenAI, I'm sure they'll teach it to say
               | "calculations that do not exceed certain high standards
               | of accuracy" very soon. That's the beauty of it - you can
               | run the treadmill at computer speed now.
        
               | bongobingo1 wrote:
               | Sorry, `unconventional` is also offensive.
               | 
               | > Reword the following passage. Change euphemisms to
               | wording that has similar meaning though no negative
               | connotations. Indicate changed words by putting them in
               | "{{word}}"
               | 
               | >
               | 
               | > The couples were of unconventional make up, including
               | male and female pairings, male and male pairings as well
               | as female and female.
               | 
               | >> The couples had non-traditional compositions, with
               | pairings consisting of men and women, men and men, and
               | women and women.
               | 
               | So's male and female apparently.
        
           | chongli wrote:
           | _I haven't seen widespread use of the word "neurodivergent"
           | as a kind of... whatever this is, weirdly euphemistic slur,
           | almost?_
           | 
           | It's a continuation of the euphemism treadmill [1]. It won't
           | be long before "neurodivergent" is considered politically
           | incorrect and a new term is invented to replace it.
           | 
           | [1] https://www.urbandictionary.com/define.php?term=Euphemism%20...
        
           | peterashford wrote:
           | Yeah, that's pretty gross, tbh
        
       | Dwedit wrote:
       | Did we ever legalize type punning?
        
         | JonChesterfield wrote:
         | We have "pointer provenance", which gives compilers license to
         | track type punning across more of your program than ever
         | before, in order to delete more parts of it with no diagnostic
         | required.
         | 
         | For bonus marks, int and atomic_int are unrelated types, and
         | simd vector types aren't a thing, so enjoy the unfixable
         | performance cost of choosing C.
        
         | kzrdude wrote:
         | Through union yes, I think
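
A minimal sketch of the two usual ways to reinterpret the bits of a float in C, assuming a 32-bit float (enforced by the static assertion below). The first reads a union member other than the one last written, which is the union route mentioned above; the second uses memcpy, which does not rely on any aliasing rule at all. The function names are hypothetical and only serve the illustration.

```c
#include <stdint.h>
#include <string.h>

/* Assumption for this sketch: float and uint32_t have the same size. */
_Static_assert(sizeof(float) == sizeof(uint32_t), "sketch assumes 32-bit float");

/* Punning through a union: write one member, read the other. */
static uint32_t float_bits_union(float f)
{
    union { float f; uint32_t u; } pun;
    pun.f = f;
    return pun.u;
}

/* Punning through memcpy: copy the object representation byte by byte. */
static uint32_t float_bits_memcpy(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}
```

Both functions should return the same value for any input; compilers typically reduce either form to a single register move.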
        
         | cryptonector wrote:
         | Asking the real questions. Without looking I'm willing to bet
         | the answer is "no, and stop asking".
        
       | ChancyChance wrote:
       | Is the world finally realizing that "a + b" actually returns two
       | values: pass/fail and the value if pass?
       | 
       | "a + b = c;" is a fundamentally flawed operation from a computer
       | architecture perspective.
        
         | notfed wrote:
         | It's a flaw that has a pretty good tradeoff: unparalleled
         | readability.
        
           | ChancyChance wrote:
           | It depends. If you want to study maths, yes. If you want to
           | be a programmer:
           | 
           | [status, value] = add(a, b);
           | 
           | Is much more unparalleled-ly (?) readable from the
           | perspective of how a computer actually operates. In reality,
           | this:
           | 
           | uint c = (uint)a + (uint)b; // (to make that other guy happy)
           | 
           | is really:
           | 
           | c = (a + b) % (sizeof(uint));
           | 
           | in "C", which is less readable but far more accurate.
        
             | ChancyChance wrote:
             | That's 2^sizeof(uint)
        
         | Arch-TK wrote:
         | There is actually another option.
         | 
         | A more sophisticated type system.
         | 
         | Let's say you had some pseudocode like this:
         | 
         |     let a = 5
         |     let b = 12
         |     let c = a + b
         | 
         | The type of a would be Integer[5..5], the type of b would be
         | Integer[12..12], the type of c would therefore be
         | Integer[17..17]. In a more complex example:
         | 
         |     def foo(a: Integer[0..10], b: Integer[0..10]):
         |         return a + b
         | 
         | The return type of this function would be Integer[0..20].
         | 
         | This kind of type system can solve a number of issues, all but
         | division by zero (which would probably still have to be solved
         | with some kind of optional type).
         | 
         | If type inference dictates that the upper range of an integer
         | would be too large to physically store in a machine data type,
         | then you either resort to bignums or you make it a compilation
         | error. By adding modular and saturating integer types you can
         | handle situations where you want special integer behaviours. By
         | explicitly casting (with the operation returning an optional)
         | you can handle situations where you want to bound the range.
         | This drastically simplifies a lot of code by removing explicit
         | bounds checks in all places except where they are absolutely
         | necessary. If for some reason you care about the space or
         | computational efficiency of the underlying machine type, you
         | can have additional annotations (like C's
         | u?int_(least|fast)[0-9]+_t). If you absolutely must map to a
         | machine type (this is usually misguided, unless you are dealing
         | with existing C interfaces, for which such a language can
         | provide special types) you can have more annotations.
         | 
         | Ada has something resembling this. I believe there are some
         | other languages that implement similar features. I believe this
         | sort of thing has a name, but I am not great with remembering
         | the names of things.
         | 
         | Hopefully this is some food for thought.
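
C cannot track ranges in its type system, so the following is only a runtime sketch of the bookkeeping described above: a value carries its bounds, addition widens them, and narrowing is the single place where an explicit check is needed. The type `ranged_int` and both function names are hypothetical, and the sketch assumes the bounds are small enough that adding them does not itself overflow `int64_t`.

```c
#include <stdint.h>

/* Hypothetical illustration: an integer that carries its bounds with it. */
typedef struct {
    int64_t lo;   /* smallest value the "type" admits */
    int64_t hi;   /* largest value the "type" admits */
    int64_t val;  /* current value, lo <= val <= hi */
} ranged_int;

/* Adding two ranged values yields the range [a.lo + b.lo .. a.hi + b.hi]. */
static ranged_int ranged_add(ranged_int a, ranged_int b)
{
    ranged_int r = { a.lo + b.lo, a.hi + b.hi, a.val + b.val };
    return r;
}

/* Explicit narrowing cast: returns 1 and fills *out when the value fits
   in [lo .. hi], returns 0 (and leaves *out untouched) otherwise. */
static int ranged_narrow(ranged_int x, int64_t lo, int64_t hi, ranged_int *out)
{
    if (x.val < lo || x.val > hi)
        return 0;
    out->lo = lo;
    out->hi = hi;
    out->val = x.val;
    return 1;
}
```

With two inputs of range [0..10], `ranged_add` produces a value of range [0..20], matching the `foo` example; only `ranged_narrow` can fail.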
        
           | still_grokking wrote:
           | > I believe this sort of thing has a name, [...]
           | 
           | https://en.wikipedia.org/wiki/Refinement_type
           | 
           | But the concept is just a little bit over 30 years old. So
           | don't expect it to show up in most mainstream languages before
           | the end of the next 20 years, and don't expect it to come to
           | the C languages ever.
           | 
           | Meanwhile in mainstream ML-land:
           | 
           | https://github.com/Iltotore/iron
           | 
           | (Or for the older version of the language:
           | https://github.com/fthomas/refined)
           | 
           | (Please also note that for this feature both versions don't
           | need language support at all but are "just" libraries, as the
           | language is powerful enough to express all kinds of type
           | level / compile time computations in general.)
        
           | [deleted]
        
           | JonChesterfield wrote:
           | Compilers do this sort of range tracking anyway. At least
           | within a function. It's useful for loop optimisations.
        
           | im3w1l wrote:
           | I think the issue with this is that the worst-case bounds
           | normally grow much faster than the actual values. And it can
           | be easy to see for the programmer that the values can't
           | actually grow that much because a is only big when b is small
           | or some property like that, but then you have to convince the
           | compiler of the same. I might be misremembering though.
        
             | codethief wrote:
             | > because a is only big when b is small or some property
             | like that
             | 
             | Exactly, the expressiveness of the type system then
             | (typically) becomes the obstacle: How do you express that a
             | and b could each reach INT_MAX but their sum never exceeds
             | INT_MAX?
        
               | Arch-TK wrote:
               | Those kinds of assumptions are where you explicitly cast
               | to a smaller ranged type with the option of an error if
               | the sum does exceed a limit. The point of this type
               | system is not to be able to fully encode every possible
               | interaction between numbers in a system, but rather to
               | remove unnecessary bounds checking in a bunch of cases
               | and make it explicit in the few cases where you ARE
               | actually making an assumption.
        
             | wizzwizz4 wrote:
             | > _but then you have to convince the compiler of the same._
             | 
             | In conventional parlance, this is known as "handling
             | overflow".
        
         | [deleted]
        
         | c4mpute wrote:
         | First, you might have meant c = a+b;
         | 
         | The other way isn't really definable as an assignment
         | mathematically.
         | 
         | And there is a lot more to it than just pass/fail. First, an
         | addition doesn't fail, from a computer architecture
         | perspective, the addition will always succeed, the only thing
         | that could fail (in all the usual architectures) are possible
         | memory fetch and store operations when not strictly dealing in
         | register or immediate operands. Second, there is no fail flag.
         | There is an overflow flag, an underflow flag, a zero flag, a
         | sign flag and a few more that are irrelevant here. Any of
         | overflow, underflow, zero or sign might mean that the operation
         | "failed" depending on the types of your operands. The processor
         | doesn't know anything about the types, so there won't be a
         | straightforward 'fail' flag in any case. Only the library or
         | compiler can use type information such as (un)signedness,
         | bignum-ness, nonzeroness, desired wraparound (for modular
         | types) and other possible types together with aforementioned
         | flags to decide if that addition might have failed.
         | 
         | So nothing is fundamentally flawed, what you are describing is
         | just insufficiently complex (because there is no fail flag,
         | just a ton of other flags) or overly complex (because uint32_t
         | c = a + b is arithmetic modulo 2^32 and cannot fail).
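
A small sketch of the last point, under the assumption that the operands are `uint32_t`: the addition itself is always defined and wraps modulo 2^32, and whether that wrap counts as a "failure" is something the program decides by looking at the result. The function names are hypothetical.

```c
#include <stdint.h>

/* Unsigned addition wraps modulo 2^32; there is nothing to "fail". */
static uint32_t add_wrapping(uint32_t a, uint32_t b)
{
    return (uint32_t)(a + b);
}

/* What a carry flag would report can be recovered portably:
   the sum wrapped if and only if it is smaller than an operand. */
static int add_carried(uint32_t a, uint32_t b)
{
    return (uint32_t)(a + b) < a;
}
```

For example, `add_wrapping(UINT32_MAX, 1)` is 0 and `add_carried(UINT32_MAX, 1)` is 1, on hardware with or without a carry flag.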
        
           | khazhoux wrote:
           | > First, you might have meant c = a+b;
           | 
           | > The other way isn't really definable as an assignment
           | mathematically.
           | 
           | This correction is condescending and unnecessary. Unless the
           | person had never written a single line of code in their life,
           | then they would obviously know "a+b" is not a modifiable
           | lvalue.
           | 
           | And the point about pass/fail was also obviously not meant to
           | capture the full complexity of the flags set by a CPU
           | operation. It was very clearly a statement about how basic
           | addition does not behave in computers the way it does on
           | paper -- as simple as that.
           | 
           | From HN guidelines: "Please respond to the strongest
           | plausible interpretation of what someone says, not a weaker
           | one that's easier to criticize."
        
             | c4mpute wrote:
             | You might be right on the first point. Edit: actually, you
             | might not be. There are languages with compound lvalues and
             | CPU architectures with multiple result registers (x86 being
             | the best-known example). E.g. you can do "(result, flags,
             | err) = do_stuff(a, b, c)" in Go, and x86 DIV storing
             | different parts of the division result in different
             | registers:
             | https://c9x.me/x86/html/file_module_x86_id_72.html And
             | generally with common CPU architectures, flags are another
             | such result register that is always written, such that any
             | operation like c := a+b is actually something like (c,
             | flags) := a+b. And for stuff like multiplication, there is
             | actually the notion of two result registers being the
             | higher and lower part of the resulting operation, like (a *
             | 2^32 + b) = c * d (see x86 MUL). Therefore some precision
             | in language is necessary for the discussion (and yes, the
             | different meanings of ==, =, := in various languages and
             | mathematics are also confusing, even to me ;).
             | 
             | I do strongly disagree on the second one about pass/fail.
             | This kind of nitpicking is necessary here, because the
             | discussion is about a standard intended to precisely
             | describe such operations, and how the underlying hardware
             | might be utilized to execute them. Being imprecise in this
             | context is dangerous, wrong, problematic and leads to the
             | whole point of the discussion being lost in a sea of
             | handwaving.
        
           | JonChesterfield wrote:
           | > The other way isn't really definable as an assignment
           | mathematically.
           | 
           | It's an equality sign. See also, := and unification.
        
       | antiquark wrote:
       | C reached its zenith in C90, and saw a few good ideas in C99.
       | Everything since has been wankery from people who either are
       | bored, or have a severe case of C++-envy.
        
         | pjmlp wrote:
         | Even then it was already outdated compared with languages
         | like Modula-2 and Object Pascal; it got lucky riding the wave
         | of UNIX adoption.
        
       | GianFabien wrote:
       | Maybe I'm being dense. To me it appears that the standards are
       | telling compiler writers what should be done. In doing so the
       | compilers will become ever more complex and thus bug-prone.
       | 
       | I learnt C back when K&R (first edition) was the reference. Ok,
       | it was hardly much more than a universal assembler to make every
       | computer look like a PDP-11. In my experience C is the language
       | to use when you want to be close to the metal. For the rest I use
       | whichever high-level language/environment is best suited.
       | Admittedly some FFI are a pain to use, but once you get the
       | boilerplate bedded down your much higher level language gets the
       | coordination done.
        
         | RobotToaster wrote:
         | >To me it appears that the standards are telling compiler
         | writers what should be done.
         | 
         | Isn't that what standards are supposed to do?
        
           | JonChesterfield wrote:
           | Traditionally they recorded existing practice and gently
           | encouraged diverging implementations to converge.
           | 
           | The alternative approach is to invent things by committee,
           | hopefully with some implementers watching, and hope for the
           | best.
        
       | juunpp wrote:
       | > The ckd_* macros steer a refreshingly sane path around
       | arithmetic pitfalls including C's "usual arithmetic conversions."
       | 
       | A 7-letter function that adds two numbers and returns a
       | boolean... not entirely sure I'd call that 'sane'.
        
         | ludocode wrote:
         | I'd prefer if it were more letters. It bothers me when API
         | designers omit random letters just to save a few keystrokes.
         | These are particularly egregious because I keep forgetting
         | which letters they kept. Is it "chk"? or "ckd"? or "chd"? or
         | something else?
         | 
         | I wrote a portability library that wraps these with compiler
         | intrinsic and standard C fallbacks. I chose to spell out the
         | full word in addition to making the type explicit. It's a lot
         | more verbose of course but a lot clearer to read:
         | 
         | https://github.com/ludocode/ghost/blob/develop/include/ghost...
        
           | goatlover wrote:
           | A saner language would handle the conversion for you so it
           | would work with just the normal math operators.
        
             | masklinn wrote:
             | How would that work for the largest type supported by the
             | platform?
        
               | pjmlp wrote:
               | A panic would be thrown, like in memory-safe systems
               | programming languages, those that were in use outside
               | Bell Labs and unfortunately lost to UNIX.
        
       | pjmlp wrote:
       | And zero focus on improving the root causes of memory corruption
       | due to strings and array indexing errors.
       | 
       | The security world will keep burning it seems.
        
         | heywhatupboys wrote:
         | > The security world will keep burning it seems.
         | 
         | For network protocols and IPC there is no alternative to the
         | string types C has. You get a length and a byte array. If you
         | trust the user, you can assume length is correct. Otherwise no.
        
           | pjmlp wrote:
           | Sure there are, as proven by distributed networking stacks
           | not written in C.
           | 
           | In fact, Ethernet's early days go back to Mesa, not C.
           | 
           | UNIX did not invent networking; networking predates UNIX by
           | at least a decade.
        
       | RustyRussell wrote:
       | Frankly, the C standards ctte went off the deep end when they
       | effectively banned NULL to memset etc (obv with zero length).
       | 
       | Not because these functions couldn't handle it, but because this
       | assertion simplifies optimizations _elsewhere_.
       | 
       | This has required adding extra checks in my code, found mainly by
       | trial and error, and has made it less readable _and_ less
       | optimal.
       | 
       | Finally, the checked arithmetic operations returning _false_ on
       | success is a horror show. Fortunately it will be found on the
       | first time the code is run, but that's a damnably low bar :(
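
The kind of extra check described above might look like the following sketch. The wrapper name `memset_nullable` is hypothetical; the only assumption is that callers may pass a null pointer together with a length of zero, in which case the standard function is never reached.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper: tolerates (NULL, 0) by skipping the call entirely.
   A null pointer with a non-zero length is still the caller's bug. */
static void *memset_nullable(void *dst, int c, size_t n)
{
    if (n == 0)
        return dst;
    return memset(dst, c, n);
}
```

The branch is the cost being complained about: it exists only to keep a null pointer away from `memset`, not because filling zero bytes needs any work.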
        
         | Kamq wrote:
         | > Finally, the checked arithmetic operations returning false on
         | success
         | 
         | That's what got you? C functions returning error flags (with
         | zero meaning no error) isn't exactly new.
        
         | Dwedit wrote:
         | Replace memset with a macro, that's the C way.
        
         | notfed wrote:
         | Isn't the return value just a carry bit?
        
           | spc476 wrote:
           | Not every CPU C runs on has a carry bit. MIPS, SPARC, RISC-V,
           | all don't have the concept of a "carry bit."
        
         | ericpauley wrote:
         | > Finally, the checked arithmetic operations returning false on
         | success is a horror show.
         | 
         | This seems in line with C conventions? Generally a 0 return
         | code means success.
        
           | wruza wrote:
           | With int statuses, not with bools. It's just a twisted logic
           | in return value you have to deal with in your head.
           | 
           | "If checked operation has a status, then it failed." - ok
           | 
           | "If checked operation [is true], then it failed." - wat
        
             | SAI_Peregrinus wrote:
             | The checked operations ask "did an error occur?". If it's
             | false, then the check passed and no error occurred. If it's
             | true, then the check indicated an error.
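
A short sketch of that polarity, using `ckd_add` from C23's `<stdckdint.h>`. Where the header is not available, the sketch falls back to `__builtin_add_overflow`, which is an assumption about the toolchain (GCC and Clang provide it with the same "true means overflow" convention). The helper name `add_overflows` is hypothetical.

```c
#include <limits.h>
#include <stdbool.h>

#if defined(__has_include)
#  if __has_include(<stdckdint.h>)
#    include <stdckdint.h>
#  endif
#endif

#ifndef ckd_add
/* Assumption: pre-C23 GCC/Clang fallback with the same return convention. */
#  define ckd_add(result, a, b) __builtin_add_overflow((a), (b), (result))
#endif

/* Returns true when a + b does NOT fit in int, i.e. "an error occurred".
   *sum always receives the result, wrapped to int if it did not fit. */
static bool add_overflows(int a, int b, int *sum)
{
    return ckd_add(sum, a, b);
}
```

Call sites therefore read `if (add_overflows(x, y, &s)) { /* handle error */ }`, the same shape as the usual "non-zero means failure" idiom.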
        
             | masklinn wrote:
             | > With int statuses, not with bools
             | 
             | Which C historically did not have, so int played that role.
             | The function is the same, and the existing idioms remain.
        
               | wruza wrote:
               | I find it strange to introduce real bools (which these
               | macros return according to their official signatures) and
               | then to assign them a meaning of a still-nonexistent but
               | widely used C type. At least my C intuition stumbles upon
               | that immediately, no matter how long I think about it.
               | 
               | Ah, anyway, standard C/libc is basically a lost cause. It
               | can't get any worse, since you have to refer to a manual
               | at every call to not step on a landmine.
        
       | blippage wrote:
       | #embed is what I really want. And separators.
       | 
       | > Standard C advances slowly
       | 
       | They're not joking, either. C is conservative to a fault, I
       | think.
        
         | AlbertoGP wrote:
         | > #embed is what I really want. And separators.
         | 
         | If you want to try out those features now, I made a pre-
         | processor that translates that into standard C99:
         | 
         | https://sentido-labs.com/en/library/cedro/202106171400/use-e...
         | 
         | https://sentido-labs.com/en/library/cedro/202106171400/#numb...
         | 
         | It includes a cc wrapper called cedrocc that you can use as a
         | drop-in replacement:
         | 
         | https://sentido-labs.com/en/library/cedro/202106171400/#cedr...
        
       | solidsnack9000 wrote:
       | "Looking forward, marijuana legalization will surely beget
       | notions such as fractional-, imaginary-, and negative-length
       | objects, each with as much potential for mayhem as zero-length
       | objects."
       | 
       | It's a funny thing to say.
        
         | garbagecoder wrote:
         | >negative-length
         | 
         |  _nervous Minkowski laughter_
        
         | firstlink wrote:
         | Rust seems to do fine with ZSTs somehow.
        
           | kibwen wrote:
           | ZSTs work splendidly in Safe Rust, but you do need to
           | consider them if you're writing unsafe generic code. Here's
           | the relevant section of the Rustonomicon:
           | https://doc.rust-lang.org/nomicon/exotic-sizes.html#zero-siz...
        
       | andrepd wrote:
       | > All C standards from C89 onward have permitted compilers to
       | delete code paths containing undefined operations--which
       | compilers merrily do, much to the surprise and outrage of
       | coders.[16] C23 introduces a new mechanism for astonishing elision:
       | By marking a code path with the new unreachable annotation,[12] the
       | programmer assures the compiler that control will never reach it
       | and thereby explicitly invites the compiler to elide the marked
       | path.
       | 
       | I don't agree with this in the slightest. I'm not "outraged" by
       | undefined behaviour, it's a _fundamental tool_ for writing
       | performant code. Ensuring that dereferencing a null pointer or
       | accessing outside the bounds of an array is undefined behaviour
       | is what lets the compiler not emit a branch on every array access
       | and pointer dereference.
       | 
       | Furthermore, I really don't understand the outrage that there is
       | another _explicit_ tool to achieve behaviour the author may or
       | may not consider harmful. If it's an explicit macro, it's not a
       | tarpit!
        
       | GuB-42 wrote:
       | I actually like unreachable() a lot. What it does is that it
       | invokes undefined behavior, that's all.
       | 
       | It does nothing trickier than any other kind of UB. In fact, I
       | could implement unreachable() like this: void unreachable() {
       | *(char *)0 = 1; }.
       | 
       | Standardizing it however gives interesting options for compilers
       | and tool writers. The best use I can find is to bound the values
       | of the argument of a function. For example, if we have "void
       | foo(int a) { if (a <= 0) unreachable(); }", it tells the compiler
       | that a will always be >0 and it will optimize accordingly, but it
       | can also be used in debug builds to trigger a crash, and static
       | analyzers can use that to issue warnings if, for example, we call
       | foo(0). The advantage of using unreachable() instead of any other
       | UB is that the intention is clear.
        
         | lprib wrote:
         | Using `unreachable()` instead of `assert()` for your
         | preconditions without profiling first is just pre-loading the
         | gun to shoot yourself in the foot in the future. When those
         | preconditions are inevitably violated at some point, you will
         | get random UB corruption rather than simply aborting as is the
         | case for assert.
        
         | lionkor wrote:
         | Respectfully, you would already be doing this in any C
         | codebase, with `assert()`, right? We are all checking our
         | preconditions with assert... right?
        
           | GuB-42 wrote:
           | AFAIK, assert() is not undefined behavior, so it can't be
           | used for optimization. It is either implementation-defined in
           | debug mode, or does nothing in release mode.
           | 
           | For example:
           | 
           |     assert(a >= 0);
           |     if (a < 0) printf("a is negative");
           | 
           | In release mode, assert() will be gone, so the if/printf()
           | will stay. If we used "if (a < 0) unreachable();" instead of
           | assert(), it would optimize away both lines.
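
One way to combine the two positions in this subthread is a macro that checks the precondition in debug builds and only becomes an optimizer hint when `NDEBUG` is defined. This is a sketch under assumptions: `ASSUME` and `hundred_over` are hypothetical names, and the `__builtin_unreachable()` fallback is assumed for GCC/Clang toolchains whose `<stddef.h>` does not yet provide C23's `unreachable()`.

```c
#include <assert.h>
#include <stddef.h>   /* C23 provides unreachable() here */

#ifndef unreachable
/* Assumption: fallback for pre-C23 GCC/Clang. */
#  define unreachable() __builtin_unreachable()
#endif

#ifdef NDEBUG
/* Release: violating the condition is undefined behavior. */
#  define ASSUME(cond) do { if (!(cond)) unreachable(); } while (0)
#else
/* Debug: violating the condition aborts with a diagnostic. */
#  define ASSUME(cond) assert(cond)
#endif

/* The precondition a > 0 also rules out division by zero. */
static int hundred_over(int a)
{
    ASSUME(a > 0);
    return 100 / a;
}
```

The trade-off raised above still applies: in a release build a violated `ASSUME` is undefined behavior rather than a clean abort, so this only makes sense for conditions that debug runs and static analysis have exercised.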
        
           | pornel wrote:
           | NDEBUG makes these checks disappear, so that's not an option
           | for checks that are supposed to stay in the program.
        
         | ptx wrote:
         | > _What it does is that it invokes undefined behavior, that's
         | all. [...] it can also be used in debug builds to trigger a
         | crash_
         | 
         | How can it be used to trigger a crash (a specific behavior) if
         | the behavior it invokes is undefined? Are you saying it would
         | be defined differently for debug builds so that it doesn't
         | invoke undefined behavior?
        
       | quintussss wrote:
       | I always wonder how much use these new C standards get, as C is now
       | mostly used in areas where one is severely limited when it comes
       | to compiler choice. Where I work, we use GCC 6.2 and iso9899:1990
       | (C90). If we were able to use a modern compiler, we would
       | probably just use C++.
        
       | eternalban wrote:
       | C is a very large language masquerading as a small language.
        
         | MichaelZuo wrote:
         | What does that make C++?
        
           | eternalban wrote:
           | https://upload.wikimedia.org/wikipedia/commons/a/a7/Frankens...
           | 
           | (don't get me wrong. love C. but in an innocent sort of way,
           | like a teenager quite unaware of betrayals, heartbreak, love
           | triangles, or UB, UsB, and IDB..)
        
         | pjmlp wrote:
         | Only because many keep worshiping K&R C, ignoring what is the
         | actual C that modern compilers support.
        
       | GuB-42 wrote:
       | > C17 purports to be a bug-fix revision of C11. Does the word
       | "toto" on page 1 indicate (a) the editor's musical tastes; (b)
       | that nobody bothered to spell-check the document; (c) that we're
       | not in Kansas anymore; or (d) none of the above?
       | 
       | As a french guy I'd go with (d).
       | 
       | I've often seen "toto" used as a placeholder name, sometimes
       | followed by "titi", "tata", "tutu", I have even used it myself.
       | It is similar to "foo", "bar", "baz". I don't know if it is
       | specific to France, or French-speaking countries, but it is
       | definitely a thing here.
        
         | rahen wrote:
         | Most likely toto as the French for foobar.
         | 
         | Jens Gustedt is part of the C committee and participated in C23.
         | He also works for INRIA in France:
         | https://en.wikipedia.org/wiki/French_Institute_for_Research_...
        
       | layer8 wrote:
       | While the situation with realloc() is unfortunate, it is also not
       | difficult to write a wrapper that does what the author wants.
       | I've done that before, because it has long been known that not
       | all realloc() implementations conform to the (prior) C standard.
       | One can furthermore assume that existing implementations won't
       | change their behavior just because C23 made it UB.
        
         | p0nce wrote:
         | Honestly I'm happy the C standard now addresses how realloc
         | behaves in detail. It was already hard before, and now it's
         | documented.
        
       | __s wrote:
       | tl;dr `realloc(p, 0)` is slated to be undefined behavior in C23,
       | whereas it's been somewhat implementation defined until now, with
       | recommendation being realloc(p, 0) is equivalent to free(p)
       | 
       | Seems a bit tone deaf to create new undefined behavior in memory
       | handling, especially when a sane default behavior seems to be de
       | facto
       | 
       | I've used that free-on-0 behavior myself. Unfortunately the code
       | that uses this will often have 0 be a length variable, so hard to
       | grep for this. Ideally musl/glibc will both stick to that
       | undefined behavior being free & gcc/clang won't go about making
       | this something to point their optimizations at
       | 
       | Lest we have to stop using realloc outside of a safe_realloc
       | wrapper
       | 
       |     static void *safe_realloc(void *p, size_t newlen)
       |     {
       |         if (newlen == 0) { free(p); return NULL; }
       |         return realloc(p, newlen);
       |     }
       | 
       | What got this whole thing weird is that C doesn't like zero sized
       | objects, but implementations were allowed to return a unique
       | pointer for a zero sized allocation. Which then raises the matter
       | that portable code must free that reserved chunk on
       | implementations where realloc(p, 0) does not free. In theory this
       | reservation code could be more efficient when code frequently
       | reallocates between 0 &
       | some small value. & there was uncertainty because NULL is a way
       | to say allocation failure, but then if one did a NULL check on
       | realloc's return value they also had to check that the size was
       | non-zero
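       | 
       | A sketch of what that caller-side check can look like when the
       | zero-size case is routed through free() so realloc never sees
       | size 0. The helper name resize_buffer is hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: resize *buf to newlen bytes.
   Returns true on success; on failure *buf is left untouched. */
bool resize_buffer(void **buf, size_t newlen)
{
    assert(buf != NULL);
    if (newlen == 0) {
        /* Never hand size 0 to realloc: free explicitly instead. */
        free(*buf);
        *buf = NULL;
        return true;
    }
    void *tmp = realloc(*buf, newlen);
    if (tmp == NULL)
        return false;   /* genuine failure; the old block is still valid */
    *buf = tmp;
    return true;
}
```

       | With this shape a NULL buffer after a successful call can only
       | mean "size was zero", never "allocation failed".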
        
         | wahern wrote:
         | > Seems a bit tone deaf to create new undefined behavior in
         | memory handling,
         | 
         | It's only tone deaf to people who understand "undefined
         | behavior" as an epithet or as synonymous with giving a license
         | to compilers to screw you over. The term doesn't have either of
         | those meanings to those on the C committee. In fact, one of the
         | explicit rationales for the proposal is that, "Classifying a
         | call to realloc with a size of 0 as undefined behavior would
         | allow POSIX to define the otherwise undefined behavior however
         | they please."
         | https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2464.pdf
         | 
         | > especially when a sane default behavior seems to be de facto
         | 
         | The above proposal, N2464, gives the behavior for AIX, zOS, BSD
         | (unspecified), MSVC (crt unspecified), and glibc. They _each_
         | have different behaviors.
         | 
         | Why they chose to finally make it undefined (it was marked as
         | obsolescent for a long time) rather than keep it as
         | implementation-defined, I don't know. Perhaps because it 1)
         | simplifies the standard, and 2) by making it undefined it
         | suggests compilers should start warning about it--despite all
         | this time neither has there arisen a consensus among
         | implementations about the best behavior, nor are programmers
         | aware that the behavior actually varies widely.
         | 
         | EDIT: The draft SUSv5/POSIX-202x standard has indeed directly
         | addressed this issue. See, e.g.,
         | https://www.austingroupbugs.net/view.php?id=374 The most recent
         | draft included the following addition to RETURN VALUE:
         | 
         |     OB     If size is 0,
         |     OB CX  or either nelem or elsize is 0,
         |     OB     either:
         | 
         |     OB     * A null pointer shall be returned
         |     OB CX    and, if ptr is not a null pointer, errno shall be
         |              set to [EINVAL].
         | 
         |     OB     * A pointer to the allocated space shall be returned,
         |              and the memory object pointed to by ptr shall be
         |              freed. The application shall ensure that the pointer
         |              is not used to access an object.
         | 
         | CX marks points of divergence with C17. The first CX is because
         | of the addition of reallocarray, absent from C17. The second is
         | because POSIX will mandate the setting of EINVAL if NULL is
         | returned.
        
           | peppermint_gum wrote:
           | >It's only tone deaf to people who understand "undefined
           | behavior" as an epithet or as synonymous with giving a
           | license to compilers to screw you over. The term doesn't have
           | either of those meanings to those on the C committee.
           | 
           | It's unfortunate but not surprising that the C committee
           | isn't aware of the problems with the undefined behavior.
           | 
           | In fact, after I started reading WG14 meetings minutes, I
           | completely lost faith that any of the serious problems with
           | the standard will ever get fixed.
        
             | coliveira wrote:
             | This is not a problem with the committee and is not a
             | problem with compiler writers. The committee is only
             | marking certain behaviors as UB. Compilers can do what they
             | think is more sensible in these situations. And compiler
             | writers are not forcing you to accept these extreme
             | optimizations. You always have the option of disabling
               | optimizations and accepting that your code has bugs (UB). You
             | just need to test the code you write under different
             | compiler settings, similarly to how you test code in
             | different environments.
        
               | __s wrote:
               | "just disable optimizations" is not a solution unless the
               | compiler allows enough fine grained control where that
               | solution is `-ffree-zero-sized-realloc`
        
           | adgjlsfhk1 wrote:
           | > It's only tone deaf to people who understand "undefined
           | behavior" as an epithet or as synonymous with giving a
           | license to compilers to screw you over.
           | 
           | Unfortunately, this is the correct understanding of UB.
        
         | JoshTriplett wrote:
         | realloc to 0 size being free is useful in particular because it
         | means a function pointer to realloc is a complete memory
         | allocator: call realloc with pointer NULL to get malloc, and
         | call realloc with size 0 to get free.
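         | 
         | A minimal sketch of that single-entry-point style. The
         | alloc_fn type, default_alloc, and the int_vec consumer are all
         | hypothetical names, and the size-0 case is handled explicitly
         | instead of relying on what realloc(p, 0) does:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical single-entry-point allocator interface:
     fn(NULL, n)  allocates n bytes
     fn(p, n)     resizes p to n bytes
     fn(p, 0)     frees p and returns NULL */
typedef void *(*alloc_fn)(void *ptr, size_t size);

/* Default implementation on top of the C library. */
void *default_alloc(void *ptr, size_t size)
{
    if (size == 0) {
        free(ptr);
        return NULL;
    }
    return realloc(ptr, size);
}

/* Example consumer: a growable int array that only knows the alloc_fn. */
struct int_vec {
    alloc_fn alloc;
    int *data;
    size_t len;
};

int int_vec_push(struct int_vec *v, int value)
{
    assert(v != NULL && v->alloc != NULL);
    /* Grows by one element per push to keep the sketch short. */
    int *grown = v->alloc(v->data, (v->len + 1) * sizeof *grown);
    if (grown == NULL)
        return -1;              /* allocation failed; v is unchanged */
    v->data = grown;
    v->data[v->len++] = value;
    return 0;
}

void int_vec_free(struct int_vec *v)
{
    v->data = v->alloc(v->data, 0);   /* size 0 means free */
    v->len = 0;
}
```

         | A caller would initialize the struct with
         | { default_alloc, NULL, 0 } or swap in its own allocator.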
        
         | moremetadata wrote:
         | > What got this whole thing weird is that C doesn't like zero
         | sized objects, but implementations were allowed to return a
         | unique pointer for a zero sized allocation.
         | 
         | Some of the Windows APIs work like this, so how much is
         | pressure from MS?
         | 
         | Same discussion from 7 months ago.
         | 
         | https://news.ycombinator.com/item?id=32352965
         | 
         | https://thephd.dev/c23-is-coming-here-is-what-is-on-the-menu...
         | 
         | https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2897.htm
         | 
         | Pattern matching ram for variables/objects whilst they exist
         | even if zero'ed or prefilled with a value doesn't give perfect
         | security. Random values would make it harder to work out the
         | variable/object.
        
       | tzs wrote:
       | > Pointers to free'd memory are akin to uninitialized pointers,
       | so free(p) followed by if (p==q) is an instrument of arson
       | 
       | What's the reason for this?
        
         | coliveira wrote:
         | Using a freed pointer is incorrect behavior, a bug in shorter
         | terms. If you do anything with a freed pointer (other than
         | assigning new memory), you're inviting all kinds of bugs
         | (independent of what the compiler might be doing with your
         | code).
        
           | xigoi wrote:
           | Obviously _dereferencing_ a freed pointer is incorrect
           | behavior, but what harm is there in using its numerical
           | value?
        
         | tedunangst wrote:
         | 900 years ago there was a CPU which stored pointers in special
         | registers and trapped if you loaded a pointer with an invalid
         | segment. And so loading the pointer into a register to compare
         | it would crash.
        
         | Dylan16807 wrote:
         | I can't tell you exactly why but it's consistent with just
         | about everything else involving p being undefined, and the
         | result of the comparison would be useless anyway.
        
           | tzs wrote:
           | Why would the comparison be useless?
           | 
           | I can imagine situations where a pointer q might sometimes be
           | a copy of pointer p and sometimes might point to something
           | else, and the code wants to free q if and only if it is not a
           | copy of p (because p has been free'd earlier).
        
             | Dylan16807 wrote:
             | Because a new object can have the same address as p, so
             | comparing to p isn't enough to tell you if you have a copy
             | of p or a live pointer to something else.
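             | 
             | One way to get the "free q only if it is not a copy of p"
             | logic without ever inspecting a freed pointer is to do the
             | comparison while both are still live. A sketch, with the
             | hypothetical helper free_pair:

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: free p, then free q only if it is a distinct
   block. The comparison happens before any free(), so both pointer
   values are still valid when they are compared.
   Returns the number of free() calls made. */
int free_pair(void *p, void *q)
{
    bool q_is_copy_of_p = (q == p);   /* decide while p is still live */
    free(p);
    if (q_is_copy_of_p)
        return 1;
    free(q);
    return 2;
}
```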
        
         | jcranmer wrote:
         | Given the following code:
         | 
         |     void *p = malloc(N);
         |     do_random_stuff(p);
         |     void *q = malloc(N);
         | 
         | With this rule, the compiler can conclude that p and q cannot
         | alias, even if it doesn't have the body of do_random_stuff.
         | Without it, it would first have to prove that p is never freed
         | before the second malloc call, which is basically impossible
         | (moving the body of intervening code into a different file, for
         | example, would do the trick).
        
       | firstlink wrote:
       | > and that such changes may impose themselves on old code without
       | recompilation when dynamically linked libraries are upgraded.
       | 
       | All I can do is laugh. This is what the dynamic linker fanatics
       | wanted. This is what they explicitly advocate for to this day.
       | Share and enjoy!!
        
         | bayindirh wrote:
         | I'd rather have small binaries and memory efficient systems
         | instead of huge blobs having their own complete disconnected
         | environments with non-coherent behavior on the same situation.
         | Also, wasting tons of memory while at it.
         | 
         | If I have something that critical, I can always statically
         | compile.
        
         | coliveira wrote:
         | Exactly! Shared libraries mean that new code with modified
         | behavior can and will be called when made available,
         | independent of how the original code was compiled. It is
         | interesting that people come out to complain about this obvious
         | behavior.
        
           | hermitdev wrote:
           | The problem isn't changing implementation. This is expected
           | with shared libs. The problem is changing the contract of the
           | function and then expecting it to be drop in compatible. It's
           | not. It _should_ be treated as a breaking ABI change, because
           | the old behavior and new behavior are not compatible, yet
           | it's being masqueraded as such. It's quite literally the same
           | behavior/attitude behind the "w" vs "wt" change that led to
           | aCropalypse.
        
         | AshamedCaptain wrote:
         | I really don't think anyone could possibly want the _specified
         | behavior_ of a function changing below their feet.
         | 
         | However, the author is unlikely to be correct here. E.g., to
         | this day, glibc contains _multiple implementations of memcpy_
         | just to satisfy those executables that depend on the older,
         | memmove-like behavior that was once part of the unspecified
         | behavior of glibc. The only way to get the dynamic linker to
         | choose one of the newer versions is to, well, rebuild the
         | executable. It is inconceivable that glibc would not use symbol
         | versioning with an actual specification change.
         | 
         | The behavior is practically the same as with static linking,
         | and you still get the benefits of dynamic linking.
        
           | throwaway892238 wrote:
           | People who don't understand dynamic linking are doomed to re-
           | implement it, poorly.
        
         | tedunangst wrote:
         | It's a really weird complaint. The standard specifies that it's
         | now undefined behavior. That imposes zero requirements to
         | change the library. Whatever it is the library was doing, it's
         | one possible undefined behavior.
        
       | cryptonector wrote:
       | The `realloc()` change calls for pitchforks.
        
       | otabdeveloper4 wrote:
       | Hopefully not literally. (But C23 is exactly the kind of
       | programming language you expect to do that.)
        
       ___________________________________________________________________
       (page generated 2023-04-02 23:02 UTC)