hngopher.com

       [HN Gopher] ISO C became unusable for operating systems development
       ___________________________________________________________________
        
       ISO C became unusable for operating systems development
        
       Author : pcw888
       Score  : 173 points
       Date   : 2022-01-21 11:20 UTC (11 hours ago)
        
 (HTM) web link (arxiv.org)
 (TXT) w3m dump (arxiv.org)
        
       | throwawayvibes wrote:
        
       | foxfluff wrote:
       | I haven't got the time to read the paper yet but I believe I'd
       | emerge with the more or less the same opinion that I've had
       | before: nobody's forcing you to pass -O2 or -O3. It's stupid to
       | ask the compiler to optimize and then whine that it optimizes. I
       | usually am OK with the compiler optimizing, hence I ask it to do
       | so. I'm glad that others who disagree can selectively enable or
       | disable only the optimizations they're concerned about. Most of
       | the optimizations that people whine about seem quite sane to me.
       | Of course, sometimes you find real bugs in the optimizer
       | (yesterday someone on libera/#c posted a snippet where gcc with
       | -O2 (-fdelete-null-pointer-checks) removes completely legit
       | checks)
        
         | marcosdumay wrote:
         | Optimizations aren't supposed to change the meaning of your
         | code. And specifically for C, unsafe optimizations are supposed
         | to only apply with -O3. Level 2 is supposed to be safe.
        
           | dooglius wrote:
           | I recall reading recently that gcc moved to making overflow
           | undefined at -O0
        
           | foxfluff wrote:
           | > And specifically for C, unsafe optimizations are supposed
           | to only apply with -O3. Level 2 is supposed to be safe.
           | 
           | Citation needed? AIUI optimization levels are entirely up to
           | the compiler. GCC's man page says nothing about the safety of
           | these levels.
           | 
           | > Optimizations aren't supposed to change the meaning of your
           | code.
           | 
           | And they generally won't. The trouble is when the meaning of
           | your code wasn't well defined to begin with (or you thought
           | it meant something it didn't).
        
             | marcosdumay wrote:
             | > GCC's man page says nothing about the safety of these
             | levels.
             | 
             | Well, it used to.
        
           | klodolph wrote:
           | This is definitely untrue. The difference between -O1, -O2,
           | -O3, and -Os is as follows:
           | 
           | - O1 includes optimizations that only require a small amount
           | of additional compilation time. (Fast compilation, code is
           | much faster than -O0.)
           | 
           | - O2 includes O1, and additional optimizations that use a
           | medium amount of compilation time. (Medium compilation,
           | reasonably fast code.)
           | 
           | - O3 includes O2, and additional optimizations that take more
           | compilation time. (Slower compilation, faster, larger code.
           | Unrolled loops and the like.) It often makes sense to compile
           | hot inner loops with O3 because the compiler can do some
           | fairly aggressive vectorization, if you know what you are
           | doing... and the code remains correct in either case, if it
           | was correct in the first place. It's basically "free"
           | vectorization, when it works.
           | 
           | - Os includes O2, except flags that increase code size.
           | (Medium compilation, code is slightly slower and smaller than
           | O2.)
           | 
           | The only optimization level that actually changes the meaning
           | of correct code is -Ofast, which is _rarely_ used. It 's
           | described in the manual with the phrase "Disregard strict
           | standards compliance".
           | 
           | ALL levels of optimization will change the meaning of your
           | code if your code relies on undefined behavior. For example,
           | 
           | https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
           | 
           | > -faggressive-loop-optimizations
           | 
           | > This option tells the loop optimizer to use language
           | constraints to derive bounds for the number of iterations of
           | a loop. This assumes that loop code does not invoke undefined
           | behavior by for example causing signed integer overflows or
           | out-of-bound array accesses. The bounds for the number of
           | iterations of a loop are used to guide loop unrolling and
           | peeling and loop exit test optimizations. This option is
           | enabled by default.
           | 
           | It's always enabled.
        
           | josephcsible wrote:
           | Imagine you and the compiler are parties to a contract. Your
           | obligation is to not let UB happen. The compiler's obligation
           | is to not change the meaning of your code. Insisting that the
           | compiler should uphold its end of the deal when you don't
           | uphold yours is like saying if you sign a contract to buy
           | widgets, you should have to pay even if the vendor fails to
           | deliver them.
        
             | marcosdumay wrote:
             | > Your obligation is to not let UB happen.
             | 
             | Sorry, no can do. If that's the contract than C is a
             | completely useless language.
        
               | Filligree wrote:
               | > Sorry, no can do. If that's the contract than C is a
               | completely useless language.
               | 
               | That _is_ the contract. Your inference may be correct,
               | but it does no good to insist that C should behave
               | differently than it does. There are better options.
        
               | mpweiher wrote:
               | Nope, that's not the contract.
        
               | 3836293648 wrote:
               | That's absolutely the contract compiler implementors are
               | following. If you don't want to be a part of that
               | contract you need to implement your own compiler from the
               | ground up with optimisations based on your interpretation
               | of UB.
        
               | Filligree wrote:
               | I have always been told that, if a program contains UB,
               | then the computer is justified in doing anything at all.
               | This would naturally imply I am required to avoid any UB.
               | 
               | Do you disagree with this?
        
               | marcosdumay wrote:
               | There's no contract. Nobody ever shared a piece of paper
               | saying "your software must not use UB".
               | 
               | The closest thing to a "contract" a language has is
               | training material and reference books. And guess what,
               | nearly all of C material mostly ignore UB, because
               | enumerating it all is completely impractical even for
               | reference books.
               | 
               | Most of the C software in use on the wild was created
               | before the hyper-optimization trend of C compilers, when
               | your program did mostly what you said, and if what you
               | said is undefined, it would still behave in some way that
               | is close to a literal reading of the code. For most of
               | the C software that exists, the modern treatment of UB is
               | a huge post-facto unilateral contract change.
        
               | plorkyeran wrote:
               | The C specification is the contract between you and the
               | compiler.
        
               | marcosdumay wrote:
               | It's not a contract if you can't expect one of the
               | parties to have read it.
               | 
               | Compiler developers do not need a contract with the
               | users. No compiler team works with that impression. No
               | compiler user expects one either.
               | 
               | But if you want to use it to justify some impractical
               | behavior, well, both your justification is a complete
               | fabrication, and justifying it is completely meaningless.
               | It doesn't become practical because you have some
               | sociological theory.
        
               | Filligree wrote:
               | You seem to be using the legal definition of 'contract'.
               | 
               | We're using the computer science definition, which is
               | simply "assuming it's correctly implemented, if you feed
               | this program Foo, it'll do Bar".
        
               | int_19h wrote:
               | Which compiler team doesn't take the C standard
               | seriously?
               | 
               | And the C standard is _by definition_ a contract between
               | the implementation and its users.
        
             | dooglius wrote:
             | I didn't sign a contract though, the compilers just do dumb
             | things and I am at their mercy. If I were negotiating a
             | contract then I wouldn't allow all the cases of UB.
        
               | [deleted]
        
               | gpderetta wrote:
               | Writing a C compiler that does a trivial translation to
               | ASM is not very hard, in fact there are already quite a
               | few. I don't understand why people don't use that if
               | that's what they want.
        
               | josephcsible wrote:
               | Ironically, it's because the code they generate is too
               | slow.
        
         | phicoh wrote:
         | It seems to me a flaw in a language and in a compiler if the
         | average programmer has to avoid higher levels of optimizating
         | because it cannot be predicted what they do.
        
           | foxfluff wrote:
           | I don't see it that way. Isn't that precisely the point of
           | higher levels? Don't go there if you don't know what you're
           | doing. Since we can't magically install knowledge into
           | average programmer's heads, the only other choice would be to
           | have those higher levels not exist because the average
           | programmer will never understand them, and also the compiler
           | can't mind-read the average programmer to understand that
           | what they want isn't exactly what they said they want.
        
             | fennecfoxen wrote:
             | I think the general problem is that if so much is going to
             | hinge on this, it should be much easier to tell whether or
             | not you know what you're doing. Undefined behavior really
             | wouldn't be a problem if you could spot it reliably
             | (possibly using tools, like the compiler toolchain itself).
             | 
             | This loose mapping between four optimization levels and a
             | variety of obscure language situations is invisible and
             | situated only in a handful of top engineers' memories.
             | That's not a great way to run a language if you actually
             | care about the correctness of programs written in that
             | language.
        
             | immibis wrote:
             | I always thought the main point of the levels was simply
             | compilation speed. That and debuggability.
        
             | wongarsu wrote:
             | > Isn't that precisely the point of higher levels?
             | 
             | I used to think that higher levels are for giving the
             | compiler more time to think so it can come up with a better
             | result (and for somewhat specifying the desired space-
             | performance tradeoff). In my mind the "don't do it if you
             | don't know what you are doing"-things were seperate flags,
             | like -ffast-math.
             | 
             | Obviously I eventually learned that bad things can happen,
             | but "just build prod with -O2" still seems to be what
             | almost everyone does.
        
             | mpweiher wrote:
             | Nope.
             | 
             | That optimisations must not change the visible behavior
             | (except for running time) of a program was _the_ cardinal
             | rule of optimisation.
             | 
             | Then it got flushed down the toilet. (With the trick of
             | retroactively redefining the behavior to be undefined in
             | the first place and thus all bets being off).
        
           | Wowfunhappy wrote:
           | What if the only languages/compilers without that "flaw" had
           | similar performance to lower optimization levels?
           | 
           | (I don't know if that's the case, but it's what I thought the
           | GP was implying.)
        
             | phicoh wrote:
             | There is a saying that you can make any program run fast if
             | you don't care about correctness.
             | 
             | We have reached the point where C programmers cannot
             | understand the language/compiler anymore.
             | 
             | Given that this has been going on for a long time, my hope
             | is that Rust will be the next systems programming language.
        
               | foxfluff wrote:
               | Do you think people understand Rust?
        
               | pornel wrote:
               | The safe subset of Rust does not permit any UB. This
               | greatly simplifies understanding of UB in the majority of
               | the codebase -- because there isn't any. Only when
               | there's an `unsafe` keyword in Rust, you put your "C" hat
               | on.
               | 
               | (it was and continues to be a challenge to uphold this
               | guarantee, because UB in LLVM is so ubiquitous, e.g. Rust
               | had to patch up float to int conversions and eliminate
               | no-op loops)
        
               | duped wrote:
               | It's easier to understand production Rust than production
               | C
        
               | phicoh wrote:
               | Rust can be hard to write, but by and large is not hard
               | to read.
               | 
               | Rust does have more abstraction mechanisms than C. So it
               | is possible to go overboard in that area. But that is
               | more a coding style issue. C programs that nest multiple
               | levels of functions pointers can also be hard to
               | understand.
        
               | cma wrote:
               | Rust code in unsafe blocks has lots of similar issues
               | with fuzziness around aliasing rules:
               | 
               | https://github.com/rust-lang/unsafe-code-
               | guidelines/blob/mas...
        
           | couchand wrote:
           | It doesn't seem like anyone is taking the position here that
           | C is flawless. Rather, the question is, given both its flaws
           | and its ubitquity, is it acceptable to trade off developer
           | head-wall interactions for the possibility of high
           | performance.
           | 
           | The only winning move is not to play. (with C)
        
         | maxlybbert wrote:
         | I'm amused by the people who ask for optimization and then
         | complain about it. Or the "it did what I said, not what I
         | meant" crowd.
         | 
         | But, officially, undefined behavior is always undefined, not
         | just at higher optimization levels.
        
           | vgatherps wrote:
           | This is a pretty dismissive response to something that's a
           | real problem.
           | 
           | Sure, the "Why are you deleting my checks for if *this is
           | null" is a little silly - but there are definitely sharp
           | edges where UB conflicts with actually useful things you
           | might want to do. Did you know seqlocks are undefined (benign
           | race conditions)? Ever ran into padding concerns playing
           | poorly with trying to do atomic CAS?
           | 
           | It's not unreasonable for the standard to say 'padding is
           | undefined', 'data-races are undefined' - but having no way to
           | say "hey, trust me, please un-poison this thing you don't
           | like" is pretty unfortunate.
        
             | klodolph wrote:
             | I know this is not relevant to your point... but padding in
             | C is not "undefined". The amount of padding is
             | implementation-defined, but it can only occur in specific
             | places. The value of padding bits is unspecified.
             | "Unspecified" in the standard means that it can be various
             | different things, but it has to actually _be_ one of those
             | things. As opposed to undefined behavior, which is more of
             | a  "poison value" that trims paths from your CFG.
        
             | jcranmer wrote:
             | > Did you know seqlocks are undefined (benign race
             | conditions)?
             | 
             | It wouldn't be undefined behavior if you used atomic
             | variables--data races involving atomics _aren 't_ undefined
             | behavior.
        
           | int_19h wrote:
           | UB is only UB when it comes to ISO C++ guarantees. The
           | implementations can always make their own guarantees above
           | and beyond that; they don't have to restrict themselves to
           | those parts that are explicitly "unspecified" or
           | "implementation-defined" in the Standard.
        
           | zaphar wrote:
           | Alternatively if my compiler has a "run faster" flag I would
           | not expect it to change the "semantics" of my code in the
           | process.
           | 
           | Additionally in C UB is often not intentional nor trivial to
           | detect in your codebase since it may be the interaction of
           | two pieces of code that are not anywhere obviously close to
           | each other. There comes a point where faster but broken code
           | isn't better it's _just_ broken.
        
             | maxlybbert wrote:
             | The standard committee's position is that if undefined
             | behavior isn't easy for the programmer to detect, why would
             | it be easy for the compiler to detect?
             | 
             | I'm a little more familiar with the C++ committee than the
             | C committee. The C++ committee prefers to declare something
             | an error than to declare it undefined behavior. They only
             | declare something undefined when they believe it would be
             | unreasonably difficult for the compiler to detect the
             | problem (e.g., using multiple, incompatible, definitions
             | for an inline function; which can happen if you have a
             | macro expanding differently in different parts of the code,
             | or you have multiple definitions for the function, but only
             | one definition is ever visible to the compiler at any
             | moment in time).
             | 
             | I'm pretty sure the "signed overflow is undefined" rule is
             | something of a special case: it should be easy to detect
             | when source code doesn't have a hard coded upper bound, but
             | giving an error or warning in all cases will create too
             | many false positives, and declaring that it wraps on
             | overflow has been deemed unacceptable by the committee.
        
               | [deleted]
        
               | GoblinSlayer wrote:
               | UB problems happen only when compilers detect it,
               | understand it is an opportunity for optimization, because
               | they are allowed to do anything, then do that
               | optimization. When compiler can't detect UB, it actually
               | works as expected: when integers overflow they do exactly
               | that and nothing else, when null pointers are
               | dereferenced they do exactly that and nothing else, when
               | uninitialized memory is read it does exactly that and
               | nothing else.
        
               | klodolph wrote:
               | Reading uninitialized memory is not UB... but anyway.
               | Here's an example of how compilers treat null
               | dereferencing as UB:                   void func(struct x
               | *aptr, struct x *bptr) {             x->a = 5;
               | y->a = 10;             x->a = 10;         }
               | 
               | This is a contrived example but you can imagine writing
               | code that makes a sequence of assignments like this. An
               | optimizing compiler can remove the "dead store" which
               | assigns 5 to x->a. However, this optimization is valid
               | because writing to a null pointer is undefined... if
               | writing to a null pointer trapped (SIGFAULT), which is
               | what it _actually does_ at runtime, then the optimization
               | would be incorrect... because you could observe the
               | intermediate value of x- >a = 5.
               | 
               | The compiler is not really "detecting that your program
               | has undefined behavior". Instead, it is assuming that
               | undefined behavior doesn't happen, and optimizes your
               | program using that assumption. It's unclear what
               | "detecting undefined behavior" would mean here... what
               | kind of diagnostic the compiler should emit, or how to
               | suppress the diagnostic. It's clear that this simple code
               | is pervasive, and the preferred approach for dealing with
               | it is somewhat outside the scope of a normal C
               | compiler... something like static analysis, runtime
               | instrumentation, or some kind of formal methods.
        
               | foxfluff wrote:
               | > However, this optimization is valid because writing to
               | a null pointer is undefined... if writing to a null
               | pointer trapped (SIGFAULT), which is what it actually
               | does at runtime, then the optimization would be
               | incorrect... because you could observe the intermediate
               | value of x->a = 5.
               | 
               | I'm not sure this is a good example, because there's no
               | way for you to read them from the same thread (and
               | reading from another thread would be a race). Reading
               | them from a signal handler would yield unspecified values
               | (unless you're using lock-free atomic types, in which
               | case we're probably not worried about optimizing these
               | assignments).
               | 
               | > When the processing of the abstract machine is
               | interrupted by receipt of a signal, the values of objects
               | that are neither lock-free atomic objects nor of type
               | volatile sig_atomic_t are unspecified, as is the state of
               | the floating-point environment.
        
               | klodolph wrote:
               | That's a good point, however I think it's not hard to
               | come up with a different example that shows how compilers
               | assume that pointers aren't NULL. Like,
               | void function(struct x *ptr) {             ptr->x = 1;
               | if (ptr == NULL) {
               | some_big_chunk_of_code();             }         }
               | 
               | The idea is that this could be the result of inlining,
               | where the inlined code does a null check, but in the
               | context it's being inlined into, we already know ptr is
               | not null.
        
               | mjevans wrote:
               | The compiler should help the programmer optimize instead.
               | 
               | "Warning: dead store detected (file):line clobbered by
               | (file):line. Use keyword volatile to always store or
               | remove (file):line."
               | 
               | This way the optimization is promoted from the resulting
               | binary to the source code, and bugs / typos / etc can be
               | corrected.
        
               | klodolph wrote:
               | That's a completely unworkable solution.
               | 
               | The above code isn't necessarily code _as the programmer
               | wrote it,_ but the code after it appears after
               | macroexpansion and various optimization passes. If you
               | were writing C++, the code may be templated.
        
               | gpderetta wrote:
               | You would get literally millions of such warnings on your
               | average C codebase.
        
               | zaphar wrote:
               | So basically you are saying that in the average C
               | codebase there are millions of landmines waiting to
               | silently fail on you. This is not a ringing endorsement
               | of the language it's more like a ringing indictment
               | instead.
        
               | gpderetta wrote:
               | I certainly do not think that each dead store which has
               | been eliminated is a landmine, this is such an extreme
               | position that it isn't even worth considering. In the
               | very rare cases were I needed a store to actually happen
               | I will use inline asm or some form of volatile.
        
               | foxfluff wrote:
               | Optimizations often deal with inlined code. The code in
               | the inlined function might be fine in the general sense,
               | but once taken in a particular context, there's dead or
               | redundant code because the compiler can deduce values
               | from context. It might even all fold into a constant.
               | 
               | Telling the developers to optimize these would be
               | equivalent to telling them to start inlining their code
               | (or creating variants of every function that might be
               | inlined, to optimize it for the specific context where it
               | will be inlined).
        
               | foxfluff wrote:
               | That's not true at all. Compilers can detect some
               | instances of UB, and they will happily warn when they do.
               | 
               | Most instances of UB are ones that the compiler could not
               | detect. Your compiler isn't going to detect that your
               | integer overflows, it assumes that it won't. Shit blows
               | up if that assumption was wrong.
               | 
               | If you want to detect UB, you should run ubsan and the
               | like. The compiler is not running it.
        
               | jerf wrote:
               | In some sense, the problem isn't that the compiler
               | "can't" detect that an integer addition will overflow.
               | The problem is that with some exceptions (a loop that is
               | clearly indexing along an array, for example), the
               | compiler will have to flag _every_ integer addition as
               | UB. UB notifications aren 't that useful if you're
               | getting a dozen for every five line function in your
               | entire codebase.
               | 
               | Programmers do not, in general, want to do the amount of
               | work it takes to _safely_ add two numbers together. You
               | need something  "weird". Like an error path for every
               | addition. Or a type system for ranges, where you actually
               | have to specify ranges (because adding two bytes together
               | that you casually left as 0-255 will create a new ranged
               | value 0-510; multiplication is even more fun). Or some
               | other exotic idea. So we just let them overflow and the
               | chips fall where they may.
        
               | foxfluff wrote:
               | > In some sense, the problem isn't that the compiler
               | "can't" detect that an integer addition will overflow.
               | 
               | It absolutely can't if it doesn't know what the values
               | are going to be.
               | 
               | > the compiler will have to flag every integer addition
               | as UB
               | 
               | No, I disagree. If it has no knowledge of my runtime
               | values, it can't flag addition as UB _because it isn 't_.
               | It's UB only if I'm using values that would overflow, and
               | the compiler in general can't know that. If I've done my
               | program right, it will never overflow. There is no UB,
               | any flag would be just absolutely wrong. The compiler
               | couldn't detect UB. There's nothing to flag.
               | 
               | If it knows the values, then of course it can detect it
               | and flag it:                   x.c:3:18: warning: integer
               | overflow in expression of type 'int' results in '-2'
               | [-Woverflow]             3 |  int
               | a=2147483647+2147483647;
               | 
               | > Programmers do not, in general, want to do the amount
               | of work it takes to safely add two numbers together.
               | 
               | I agree that overflow-checking is needlessly sucky in C
               | but that's not "the problem." I usually make sure my
               | arithmetic is safe but the compiler won't know it.
        
               | MauranKilom wrote:
               | > UB problems happen only when compilers detect it,
               | understand it is an opportunity for optimization, because
               | they are allowed to do anything, then do that
               | optimization.
               | 
               | This is a deep misconception, and not at all how most
               | undefined behavior is related to optimization.
               | 
               | Trivial example: Compilers assume that your variables
               | aren't written to randomly from other threads. Without
               | this assumption, virtually no optimization would be
               | possible. _Therefore_ , data races are UB - they violate
               | a hard assumption of the compiler. But at no point did
               | the compiler say "oh, you wrote to this variable without
               | synchronization! Now I'll show you, hehehe!".
               | 
               | This is the same for e.g. removed overflow checks.
               | Compilers can optimize many loops only _if_ they assume
               | signed integer overflow never happens. So, because
               | compilers should be able to assume this, it is UB if it
               | happens in your code. The same logic that deduces  "this
               | loop cannot wrap around" deduces "this if condition [an
               | overflow check] cannot ever be true".
               | 
               | But it's easier for a programmer who got their UB-reliant
               | checks optimized out to attribute malice to the compiler
               | than to understand optimization fundamentals, and thus we
               | get people complaining to high heaven and back.
        
               | dooglius wrote:
               | You are strawmanning the parent, he did not claim that
               | compilers will go "oh, you wrote to this variable without
               | synchronization! Now I'll show you, hehehe!". Your
               | description of the compiler's behavior matches his
               | description; the compilers detect a case where the
               | program might exhibit behavior defined ISO C spec (signed
               | integer overflow), see it as a potential for optimization
               | because they are allowed to do anything (certain passes
               | can be made if overflow is assumed not to happen), then
               | do that optimization.
        
               | MauranKilom wrote:
               | You can cast literally any optimization into this shape.
               | For starters, virtually any optimization pass requires
               | sane assumptions about (absence of) data races and/or out
               | of bounds writes. The point you are arguing (or claiming
               | OP argued) boils down to "compilers behave as I want when
               | I turn all optimization passes off".
               | 
               | No, the OP wrote specifically that "the compiler detects
               | [UB]" and then does optimizations exploiting that. Not
               | "detects _potential_ UB ". The former is a common
               | misconception, the latter is basically a truism.
        
               | dooglius wrote:
               | > You can cast literally any optimization into this shape
               | 
               | Tell me how these use UB: integer constant folding,
               | unused load removal, optimal use of GP registers to
               | minimize stack spilling
               | 
               | >sane assumptions
               | 
               | "sane" here is doing a lot of work. Assuming overflow
               | won't happen is not a sane assumption, assuming some
               | other thread won't randomly write into your stack memory
               | is.
        
               | jcranmer wrote:
               | > unused load removal
               | 
               | Oh come on, this is like the most UB-centric thing in the
               | entire compiler. The compiler uses UB to know that it has
               | enumerated all uses of a memory location, as any pointer
               | that didn't get its value from the data dependency tree
               | that compiler used had to use UB to compute the address.
               | (Be it from buffer overflow, use after end-of-lifetime,
               | random integer converted to pointer, etc.)
        
               | dooglius wrote:
               | I might be using the wrong terminology, what I'm talking
               | about is removing e.g. `int x = *ptr` where x is then not
               | used in scope (more realistically,
               | `(*struct_inst).member` shouldn't bother to load the
               | other members of the struct). What you're doing sounds
               | like removing a global it detects is never used, which I
               | agree relies too much on UB and should not be done.
        
               | puffoflogic wrote:
               | Aside from the possible trap ("segfault") mention in the
               | sibling comment, the first example relies on absence of
               | UB because with UB you could enumerate the contents of
               | the stack to find where `x` is living, and discover it
               | isn't there, contra the spec. Even allocating variables
               | directly into registers and never onto the stack relies
               | on UB.
        
               | dooglius wrote:
               | Why is it UB to use registers to store variables? This
               | seems like an implementation detail. The ISO C spec
               | doesn't even have the word "stack" as far as ctrl-f can
               | show me.
        
               | gpderetta wrote:
               | If any object can potentially be reached by forging
               | pointers or enumerating all memory locations, you can't
               | even keep local variables in registers around potential
               | stores to unknown pointers (or ever, in a multithreaded
               | program).
        
               | puffoflogic wrote:
               | > Why is it UB to use registers to store variables?
               | 
               | It isn't; that sentence is even a type error. UB is not a
               | property of compiler output. It's a property of compiler
               | input.
               | 
               | The use of registers to store variables between uses
               | (when those variables never have their address taken)
               | _relies_ on the lack of any defined behavior which would
               | allow the program to determine this is being done. The
               | fact you can 't find this in the spec is precisely
               | because it is not defined.
        
               | gpderetta wrote:
               | Removing a load is visible behaviour if the load would
               | have trapped for example.
               | 
               | Everybody agrees that compilers should only do sane
               | optimizations. Nobody agrees on which subset is sane.
        
               | dooglius wrote:
               | That's fair, I'll concede that I make some implicit
               | assumptions. But the magnitude of people who hit problems
               | and accidentally hit UB should give a strong indication
               | that a lot of things done now are not close to sane.
        
               | gpderetta wrote:
               | Lobbying the standard body to reduce the amount of UB in
               | the C standard is a perfectly reasonable position.
               | Requiring compilers to DWIM is not.
        
               | MauranKilom wrote:
               | > integer constant folding                   int a = 10;
               | int var1;         int b = 20;              foo(&var1);
               | int c = a + b;
               | 
               | You'd like to constant fold c? Better assume no UB:
               | void foo(int* x) { *(x-1) = 0; }
               | 
               | > unused load removal
               | 
               | Same idea as above.
               | 
               | > Assuming overflow won't happen is not a sane assumption
               | 
               | If the alternative is loops being much slower _when you
               | request optimizations_ , then maybe it is. Consult your
               | compiler manuals if this default irks you.
        
               | dooglius wrote:
               | Okay, yes, you are technically right. What I mean is,
               | making an assumption _other than_ that the code won't
               | alter stack/heap/instruction memory that the
               | implementation is using (e.g. aliasing, overflow, use of
               | NULL)
        
               | gpderetta wrote:
               | Isn't TBAA exactly the assumption that the program won't
               | alter the stack/head/instruction memory in an illegal
               | way?
        
               | maxlybbert wrote:
               | The parent said
               | 
               | > UB problems happen only when compilers detect it,
               | understand it is an opportunity for optimization, because
               | they are allowed to do anything, then do that
               | optimization.
               | 
               | Perhaps I'm misunderstanding the comment, but I took
               | "detect it, understand it is an opportunity for
               | optimization, because they are allowed to do anything,
               | ..." to mean "detect that undefined behavior occurs and
               | then ..."
               | 
               | Instead, it should be "detect that an optimization can be
               | applied and that the optimized code will do the right
               | thing in all cases that undefined behavior does not
               | occur, and apply the optimization."
               | 
               | I've met programmers who seem to believe that compilers
               | intentionally apply weird transformations when they
               | detect undefined behavior. But really, the compilers are
               | applying sensible transformations for well-defined code
               | and ignoring whether those transformations are sensible
               | for code that isn't well-defined.
        
               | jcranmer wrote:
               | Here is an actual example of taking advantage of UB
               | behavior in a compiler.
               | 
               | The compiler sees a variable x that's allocated on the
               | stack. It looks at all the uses of the address of x, and
               | sees that they are all loads and stores. It therefore
               | decides to promote x into a register variable.
               | 
               | Where's the UB, you ask. Well, since the address of x was
               | never leaked, it is UB to compute another pointer to that
               | address (say, via an integer-to-pointer conversion). The
               | compiler _never_ checked for that possibility to affirm
               | that it was UB; it knew that any code that did that would
               | be UB and simply ignored the possibility.
               | 
               | This makes arithmetic overflow a very poor vehicle for UB
               | because it's _unusual_ in that you can 't really take
               | advantage of the UB without pointing to the specific
               | operation that caused the UB to occur. This is why I
               | believe that arithmetic overflow UB is gratuitous, and
               | people objecting to UB _because_ of what happens
               | specifically with arithmetic overflow go on to make
               | suggestions that are completely untenable because they
               | don 't have familiarity with how most UB works.
        
               | steveklabnik wrote:
               | UB is closer to "we assume this can never happen and make
               | decisions accordingly" than "Ha ha, I caught you making a
               | mistake, let's actively try to mess up your life now."
        
               | chc wrote:
               | The problem is that this doesn't _mean_ anything, at
               | least in terms of the C programming language. You say
               | "When integers overflow, they do that and nothing else,"
               | but there is no definition of signed integer overflow, so
               | there is no "that" for them to do. That is what it means
               | for something to be undefined behavior. You have some
               | definition in your mind, which might happen to match what
               | some versions of some compilers do at some optimization
               | levels -- but that isn't actually the definition, because
               | there isn't one.
        
             | sanxiyn wrote:
             | "Run faster" involves things like register allocation. How
             | can you justify register allocation doesn't change
             | semantics of code? It absolutely does! If you smash the
             | stack, variables on stack will be smashed, and variables on
             | register will not. This is change of semantics.
             | 
             | Compilers justify register allocation by assuming stack
             | can't be smashed, contrary to hardware reality. Because
             | smashing the stack is UB. That's what C UB is for.
        
               | puffoflogic wrote:
               | No no no; you see, in these kinds of debates "UB" is only
               | the things I disagree with!
        
               | derefr wrote:
               | C exposes a (fairly low-level) abstract machine with its
               | own semantics, encapsulating the operational semantics of
               | the CPU. Optimization flags are expected to affect the
               | uarch-operational semantics, but not the semantics of the
               | C abstract machine.
               | 
               | Think of it like trading one cycle-accurate emulator for
               | a better, more tightly-coded cycle-accurate emulator. Or
               | a plain loop-and-switch bytecode interpreter for a
               | threaded-code bytecode interpreter. The _way_ your code
               | is running _on the underlying machine_ changes, but the
               | semantics _of your code_ relative to the abstract machine
               | the code itself interacts with should not change.
               | 
               | > Compilers justify register allocation by assuming stack
               | can't be smashed, contrary to hardware reality.
               | 
               | ...which _should_ be entirely fine, as the existence of a
               | stack isn 't part of the exposed semantics of the C
               | abstract machine. Values can be "on the stack" -- and you
               | can get pointers to them -- but nothing in the C abstract
               | machine says that you should be _able_ to smash the
               | stack. There 's nothing in the C standard itself
               | describing a physically-laid-out stack in memory. (There
               | are ABI calling conventions defined in the standard, but
               | these are annotations for target-uarch codegen, not facts
               | about the C abstract machine.) There could just as well
               | be a uarch with only the ability to allocate things "on
               | the stack" by putting them in a heap -- in fact, this is
               | basically how you'd have to do things if you wrote a C
               | compiler to target the JVM as a uarch -- and a C compiler
               | written for such a uarch would still be perfectly
               | compliant with the C standard.
               | 
               | If C were an interpreted language, _the fact that the
               | stack can be smashed_ would be referred to as a  "flaw in
               | the implementation of the runtime" -- the runtime
               | allowing the semantics of the underlying uarch to leak
               | through to the abstract-machine abstraction -- rather
               | than "a flaw in the interpreter" per se; and you'd then
               | expect a _better_ runtime implementation to fix the
               | problem, by e.g. making all pointers to stack values
               | actually be hardware capabilities, or making each stack
               | frame into its own memory segment, or something crazy
               | like that.
               | 
               | As a compiled language -- and especially one that has
               | encapsulation-breaking holes in its abstract machine,
               | like the C inline-assembly syntax -- you can't exactly
               | expect a runtime shim to fix up the underlying uarch to
               | conform to the C abstract machine. But you _could_ at
               | least expect the C compiler to not _let_ you do anything
               | that _invokes_ those cases where it 's exposing
               | potentially-divergent uarch semantics rather than a
               | normalized C-abstract-machine semantics. As if, where
               | there _should_ be a shim  "fixing up" a badly-behaved
               | uarch, you'd instead hit a NotImplementedException in the
               | compiler, resulting in a compilation abort.
        
               | gpderetta wrote:
               | Even comparing two pointers can expose potentially
               | divergent uarch semantics. Should that be disallowed at
               | compile time?
        
               | zaphar wrote:
               | Exactly, If I wanted to be operating of the level where I
               | am worrying about registers and stack smashing and such I
               | would just drop down assembly directly. You use C because
               | it has an in-theory abstract machine that I can target
               | and avoid needing to use assembly.
        
             | gpderetta wrote:
             | Compilers should have a -do-what-i-mean-not-what-i-say
             | flags. The required mind reading is implementation defined.
        
           | jvanderbot wrote:
           | Nothing in this article convinces me the problem is contained
           | to _optimized_ C. The strict type aliasing is a general
           | problem, right?
        
         | bcrl wrote:
         | At least in compilers like gcc, optimization needs to be
         | enabled to get sane warnings emitted by the compiler, so some
         | people are indeed being forced to pass in -O2 to get sane build
         | warnings for projects.
         | 
         | I would really like it for the C standard to clean up Undefined
         | Behaviour. Back in the 1980s when ANSI C was first specified, a
         | lot of the optimizations that modern compiler writers try to
         | justify via Undefined Behaviour simply weren't part of many
         | compiler's repertoires, so most systems developers didn't need
         | to worry about UD and there was no push for the standard to do
         | so as a result.
         | 
         | If people really want the optimizations afforded by things like
         | assuming an int can't overflow to a negative number in a for
         | loop, my personal position is that the code should be annotated
         | such that the optimization is enabled. At the very least, the
         | compiler should warn that it is making assumptions about
         | potentially UB when applying such optimizations.
         | 
         | There is this false belief that all legacy code should be able
         | to compiled with a new compiler with no changes and expect
         | improved performance. Anyone who works on real world large
         | systems _knows_ that you can 't migrate to newer compilers or
         | updated OSes with zero effort (especially if there's any C++
         | involved). I understand that compiler writers want to improve
         | their performance on SPEC, but the real world suffers from the
         | distortions caused by viewing optimizations through the narrow
         | scope of benchmarks like SPEC.
        
           | foxfluff wrote:
           | > If people really want the optimizations afforded by things
           | like assuming an int can't overflow to a negative number in a
           | for loop, my personal position is that the code should be
           | annotated such that the optimization is enabled.
           | 
           | That's not an entirely absurd idea, especially if you wanted
           | to enable optimization piece-wise for legacy code. But in the
           | end does it matter whether you specify the optimizations you
           | want in the build system or in the code?
           | 
           | > At the very least, the compiler should warn that it is
           | making assumptions about potentially UB when applying such
           | optimizations.
           | 
           | The assumptions are usually "assume no UB." I don't know what
           | kind of warnings you expect but this idea has been brought up
           | before and it's not good because it would just make the
           | compiler flood warnings on any code that does any arithmetic
           | on variables (as opposed to compile time constants).
        
           | foxfluff wrote:
           | > At least in compilers like gcc, optimization needs to be
           | enabled to get sane warnings emitted by the compiler, so some
           | people are indeed being forced to pass in -O2 to get sane
           | build warnings for projects.
           | 
           | That's a fair point and it irritates me too. But you're not
           | forced to run -O2 in production if you don't want to. I've
           | had projects where we compile with multiple compilers and
           | different settings -- including different libcs -- just to
           | get all the warnings, while we ship non-optimized or
           | minimally optimized debug builds to the customer..
        
           | zajio1am wrote:
           | > If people really want the optimizations afforded by things
           | like assuming an int can't overflow to a negative number in
           | 
           | Perhaps people wanted int overflow to be UD because that is
           | semantically clean. Code where possible integer overflow is
           | intended and not bug is IMHO significant minority of all
           | integer usage, in vast majority it is a bug, similar to
           | invalid memory access. So it would be reasonable for some
           | compilers/platforms to handle integer overflows by cpu
           | exception translated to signal killing the process (similar
           | to SIGSEGV for invalid memory access), and only integer
           | operations explicitly marked for overflow (i.e. unsigned in
           | C) be allowed.
           | 
           | It is just our experience that makes 'integer overflow should
           | wrap' and 'invalid memory access should crash' default
           | positions, while 'integer overflow should crash' and 'invalid
           | memory access should be ignored/return 0' are alternative
           | positions. Conceptually, both are just stricter or looser
           | ways to handle code errors and it makes sense that C does not
           | prescribe neither one.
        
       | h2odragon wrote:
       | Torvalds was a strong advocate of GCC 2.95 (iirc), early on in
       | Linux history, because he knew the kind of code it would emit and
       | didn't trust the newer compilers to produce code that was correct
       | in those circumstances.
       | 
       | The workarounds and effort required to tell a compiler today that
       | no, you really did want to do the thing you said might well be
       | insupportable. I figure they started going astray about the time
       | self modifying code became frowned upon.
        
         | mananaysiempre wrote:
         | To be fair, the backend in the early GCC 3.x series was just
         | kind of stupid sometimes. Even now I find strange if cheap and
         | harmless heisengremlins in GCC-produced x86 code (like MOV R,
         | S; MOV S, R; MOV R, S) from time to time, while the Clang
         | output, even if not always _good_ , is at least reasonable.
         | This is not to diss the GCC team--the effort required to port a
         | whole compiler to a completely new IR with a completely
         | different organizing principle while keeping it working most of
         | that time boggles the mind, frankly. But the result does
         | occasionally behave in weird ways.
        
           | usbqk wrote:
           | Can Linux compile under clang nowadays?
        
             | eminence32 wrote:
             | Yes, see:
             | https://www.kernel.org/doc/html/latest/kbuild/llvm.html
        
             | pjmlp wrote:
             | Yes, when that Linux variant is called Android.
             | 
             | Since almost 5 years by now.
        
             | yjftsjthsd-h wrote:
             | While sibling comment is correct that the simple answer is
             | "yes", I'd like to add that Google uses Clang as the
             | preferred compiler for Android. There's a _little_ bit of
             | drift between Android 's version of Linux and mainline, but
             | the effect is still that building Linux with Cland is being
             | extensively used and tested.
        
       | RustyRussell wrote:
       | I still retch when told memcpy and memset cannot take NULL and 0
       | length.
       | 
       | Try asserting that they're not NULL in glibc and try to boot your
       | machine! Oops... bad compiler people, bad!
        
       | qualudeheart wrote:
       | So we can switch to rust now?
        
         | yjftsjthsd-h wrote:
         | Sure; we look forward to your patches (keep in mind that they
         | must preserve compatibility and portability to all currently
         | supported architectures).
        
           | flykespice wrote:
        
       | nn3 wrote:
       | Clickbait on arxiv? Et tu, Brute?
       | 
       | No, most operating systems that people are actually use are
       | written in ISO C, so the headline is by definition wrong.
        
         | immibis wrote:
         | They are written in non-ISO C
        
       | jcranmer wrote:
       | I think undefined behavior (as a general concept) gets an unfair
       | share of the blame here. It's notable that almost all criticism
       | of undefined behavior in C tends to focus on just two sources of
       | UB: signed integer overflow and strict aliasing; other sources of
       | UB just don't generate anywhere near the same vitriol [1].
       | Furthermore, it's notable that people _don 't_ complain about UB
       | in Rust... which arguably has a worse issue with UB in that a)
       | there's not even a proper documentation of what _is_ UB in Rust,
       | and b) the requirement that  &mut x be the sole reference to x
       | (and therefore is UB if it is not) is far more stringent than
       | anything in C (bar maybe restrict), and I'm sure that most Rust
       | programmers, especially newbies starting out with unsafe, don't
       | realize that that's actually a requirement.
       | 
       | There is a _necessity_ for some form of UB in a C-like language,
       | and that has to deal with pointer provenance. You see, in C,
       | everything lives in memory, but on real hardware, you want as
       | much to live in a register as possible. So a compiler _needs_ to
       | be able to have reasonable guarantees that, say, any value whose
       | address is never taken can never be accessed with a pointer, and
       | so can be promoted to a register. As a corollary, this requires
       | that things like out-of-bound memory accesses, or worse,
       | converting integers to pointers (implicating pointer provenance
       | here) need to have UB in at least some cases, since these could
       | in principle  "accidentally" compute an address which is the same
       | as a memory location whose address was never taken.
       | 
       | That suggests that the problem isn't UB per se. If we look at the
       | two canonical examples, arithmetic overflow and strict aliasing,
       | we can see that one of the features of these things is that they
       | have a pretty obvious well-defined semantics [2] that can be
       | given for them, and furthermore, there's no way to _access_ these
       | well-defined semantics even avoiding this feature altogether. And
       | I think it 's the lack of this ability to work around UB that is
       | the real issue with C, not UB itself.
       | 
       | [1] For example, it is UB to pass in rand as the comparison
       | function to qsort. I'm sure many people will not realize that
       | before I wrote this, and even parsing the C specification to find
       | out that this is UB is not trivial. For an interesting challenge,
       | try giving a definition of what the behavior should be were it
       | not UB--and no, you can't just say it's impl-defined, since that
       | still requires you to document what the behavior is.
       | 
       | [2] I will point out that, for arithmetic overflow, this
       | semantics is _usually wrong_. There are very few times where you
       | want  <large positive number> + <large positive number> =
       | <negative number>, and so you're mostly just swapping out an
       | unpredictably wrong program for a predictably wrong program,
       | which isn't really any better. However, the most common time you
       | do want the wrapping semantics is when you want to _check_ if the
       | overflow happened, and this is where C 's lack of any overflow-
       | checked arithmetic option is really, really painful.
        
         | dooglius wrote:
         | For [1] I would say that when the compiler does not control the
         | standard library (e.g. most linux systems) then the compiler
         | should treat qsort like any normal non-libc function call, it
         | shouldn't be able to detect that I might be relying on
         | implementation-specific guarantees. From a libc perspective, I
         | would probably have a safe version of qsort used either on
         | request or when NDEBUG is undefined, in which the
         | implementation must either detect the inconsistency and abort
         | in an implementation-defined matter, or select an ordering such
         | that a<b in the ordering implies the existence of a finite
         | chain a<x_1<x_2<...<x_n<b of comparison results in which each
         | x_i is the equivalence class where the compare function said
         | two entries were equal. I think that should be consistent with
         | most sorting algorithms. (Whether the normal qsort should be
         | the same as the safe version would only be needed if there were
         | a significant-enough performance improvement.)
         | 
         | For [2] I disagree for two reasons. For one, merely having an
         | answer at all, rather than entering an undefined behavior is
         | valuable; having x+y=pseudorand(x,y) would still be better than
         | complete UB because the damage that can be done is limited. The
         | other reason is that addition/subtraction/multiplication does
         | actually work modulo 2^n, so if I write e.g. (a+b-c) where b
         | can be large but b-c is known to be small, it works even if the
         | language is technically left-associative and having the a+b
         | first is UB.
         | 
         | I pretty much agree with your point that the main problem is
         | there is no way to work around UB; if there were a workaround
         | there would be much less reason to complain. I would go a bit
         | further and say such a workaround needs to be in the code
         | itself to be effective, not merely a command line arg like
         | fwrapv, so that it gets kept around when doing header inlining,
         | LTO, copy-pasting code to other projects, etc.
        
           | jcranmer wrote:
           | > For [1] I would say that when the compiler does not control
           | the standard library (e.g. most linux systems) then the
           | compiler should treat qsort like any normal non-libc function
           | call, it shouldn't be able to detect that I might be relying
           | on implementation-specific guarantees.
           | 
           | I am guessing, for example, that you are not aware that every
           | compiler will optimize `printf("Hello, world!\n");` to
           | `puts("Hello, world!\n");`. memcpy and memset are even more
           | heavily-optimized function calls by the compiler: for
           | example, they don't prevent memory-to-register promotion.
        
             | dooglius wrote:
             | I am aware of these things, the rant by Ulrich Drepper on
             | this breaking glibc guarantees [0] explicitly what I had in
             | mind when writing that.
             | 
             | [0] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=25609
        
       | mcguire wrote:
       | From the paper:
       | 
       | " _For example, a well-known security issue in the Linux kernel
       | was produced by a compiler incorrectly assuming a pointer null
       | check was unnecessary ([40] fig. 6) and deleting it as an
       | optimization. Or consider this (simplified) patch report for
       | Linux [25]: The test [for termination] in this loop: [...] was
       | getting completely compiled out by my gcc, 7.0.0 20160520. The
       | result was that the loop was going beyond the end of the [...]
       | array and giving me a page fault [...] I strongly suspect it's
       | because __start_fw and __end_fw are both declared as (separate)
       | arrays, and so gcc concludes that __start_fw can never point to
       | __end_fw. By changing these variables from arrays to pointers,
       | gcc can no longer assume that these are separate arrays._ "
       | 
       | Both of those sound like simple bugs in the compiler optimizer
       | implementation, in which case they could hardly be use as
       | examples of how the current C standard was bad.
        
         | TazeTSchnitzel wrote:
         | They aren't bugs. The C standard considers those cases to be
         | undefined.
        
         | foxfluff wrote:
         | No, they're definitely features. The removal of null pointer
         | checks in gcc is enabled by -fdelete-null-pointer-checks and is
         | often enabled by default.
        
       | flykespice wrote:
       | Wow, this was an eye-opening read on my trust with ISO C.
       | 
       | It makes much more understandable why linux codebase is riddled
       | with compiler extensions, ISO C is simply not reliable anymore.
       | 
       | The issue is bigger than what trembles on the surface, just like
       | Dennis Ritchie said, it is a timebomb, soon enough these nuances
       | will burst into a big issue in linux kernel, or worse yet, some
       | essential system like avionics.
        
       | foxfluff wrote:
       | This was discussed somewhat recently:
       | https://news.ycombinator.com/item?id=28779036
        
       | zokier wrote:
       | ISO C is mostly concerned about making sure that stuff is
       | portable; operating systems on the other hand are intrinsically
       | platform-specific to a degree. So it is not really surprising
       | that pure ISO C is not enough for OS development
        
         | yjftsjthsd-h wrote:
         | You can still be portable between compilers; Linux builds with
         | GCC and Clang, and used to build with tcc (although that
         | probably required patches)
        
           | zokier wrote:
           | Linux builds on clang after a decade of dedicated effort to
           | make it happen, and that is with clang overall being
           | comparatively similar to gcc (e.g clang implements many gcc
           | extensions):
           | https://github.com/ClangBuiltLinux/linux/wiki/Project-
           | histor...
        
       | WalterBright wrote:
       | D got around the overflow-is-UB problem by declaring that
       | 2s-complement arithmetic will be used with wraparound semantics.
       | 
       | Is there any reason for modern C to still support anything else?
        
         | zajio1am wrote:
         | Integer overflow is usually logic error, so a reasonable
         | default behavior would be a trap instead of silent overflow
         | (regardless of how signed integrers are stored in memory). Some
         | architectures support that (e.g. MIPS).
        
           | WalterBright wrote:
           | The article mentions cases where integer overflow is expected
           | behavior.
        
         | mhh__ wrote:
         | There are still processors that don't use two's complement,
         | although I'm not sure that should really stop them if they
         | wanted to declare all C implementations must be t-c.
        
       | aw1621107 wrote:
       | Ralf Jung has a blog post looking at some of the claims in this
       | paper [0]. Some hopefully representative quotes:
       | 
       | > The paper makes many good points, but I think the author is
       | throwing out the baby with the bathwater by concluding that we
       | should entirely get rid of this kind of Undefined Behavior. The
       | point of this blog post is to argue that we do need UB by showing
       | that even some of the most basic optimizations that all compilers
       | perform require this far-reaching notion of Undefined Behavior.
       | 
       | <snip>
       | 
       | > I honestly think trying to write a highly optimizing compiler
       | based on a different interpretation of UB would be a worthwhile
       | experiment. We sorely lack data on how big the performance gain
       | of exploiting UB actually is. However, I strongly doubt that the
       | result would even come close to the most widely used compilers
       | today--and programmers that can accept such a big performance hit
       | would probably not use C to begin with. Certainly, any proposal
       | for requiring compilers to curtail their exploitation of UB must
       | come with evidence that this would even be possible while keeping
       | C a viable language for performance-sensitive code.
       | 
       | > To conclude, I fully agree with Yodaiken that C has a problem,
       | and that reliably writing C has become incredibly hard since
       | undefined behavior is so difficult to avoid. It is certainly
       | worth reducing the amount of things that can cause UB in C, and
       | developing practical tools to detect more advanced kinds of UB
       | such as strict aliasing violations.
       | 
       | <snip>
       | 
       | > However, I do not think this problem can be solved with a
       | platform-specific interpretation of UB. That would declare all
       | but the most basic C compilers as non-compliant. We need to find
       | some middle ground that actually permits compilers to
       | meaningfully optimize the code, while also enabling programmers
       | to actually write standards-compliant programs.
       | 
       | [0]: https://www.ralfj.de/blog/2021/11/24/ub-necessary.html
        
         | GoblinSlayer wrote:
         | I'm afraid the author as usual confuses the cases when UB
         | produces strange results and when the compiler actively and
         | deliberately exacerbates the results. He also claims gcc
         | interpretation of UB provides a lot of performance. Ironically
         | gcc or clang already do experiments on different interpretation
         | of UB like signed integer overflow and implicit initialization.
        
         | scott_s wrote:
         | HN thread: https://news.ycombinator.com/item?id=29435263
        
         | ghoward wrote:
         | >> We sorely lack data on how big the performance gain of
         | exploiting UB actually is. However, I strongly doubt that the
         | result would even come close to the most widely used compilers
         | today--and programmers that can accept such a big performance
         | hit would probably not use C to begin with. Certainly, any
         | proposal for requiring compilers to curtail their exploitation
         | of UB must come with evidence that this would even be possible
         | while keeping C a viable language for performance-sensitive
         | code.
         | 
         | There actually is data, [1] and it goes against intuition
         | because it shows that UB makes code _slower_.
         | 
         | It turns out that when programmers are faced with the
         | possibility of UB, they (quite rightly) try to work around it.
         | (In my case, I implemented signed two's-complement arithmetic
         | entirely with unsigned types. I also implemented my own array
         | types, including bounds checks when indexing the arrays.) These
         | workarounds make their code slower, in general.
         | 
         | When you think about it that way, it makes sense because
         | programmers almost never want UB in their software, so on
         | average, the software that people care about will work around
         | UB and become slower.
         | 
         | UB in C was defensible when platforms were not homogenous and
         | when compiler writers did not abuse the spirit of it. But
         | nowadays, most of it _is_ indefensible because platforms are
         | mostly homogenous, and compiler writers have been abusing UB
         | for  "performance."
         | 
         | We saw the same thing happen with speculative execution in
         | chips. For years, chip manufacturers got away with lots of
         | tricks to increase performance. Then Spectre and Meltdown were
         | discovered. As a result, a lot of software had to be changed,
         | which resulted in the software slowing down.
         | 
         | (Yes, there are now mitigations in chips, but I think those
         | mitigations will be successfully attacked too. I think that it
         | will continue that way until most or all speculative execution
         | is removed.)
         | 
         | Likewise, with compilers exploiting UB against the original
         | spirit of UB, which was just to make C easy to port to any
         | architecture, we are probably going to have a reckoning about
         | it in the same way we did Spectre and Meltdown. In a way, we
         | already do, but it's spread out like a thousand paper cuts.
         | Maybe that means that it will stay invisible enough that
         | programmers never wake up; I hope not.
         | 
         | tl;dr: compilers exploiting UB actually slows down good code in
         | much the same way that Spectre and Meltdown do.
         | 
         | [1]:
         | https://www.complang.tuwien.ac.at/kps2015/proceedings/KPS_20...
        
           | bigbillheck wrote:
           | > the original spirit of UB, which was just to make C easy to
           | port to any architecture
           | 
           | I thought the original spirit of undefined behavior was 'we
           | have competing compiler implementations and don't have the
           | political will to pick one'.
        
             | ghoward wrote:
             | I think those are one and the same; I was just more polite
             | about it, I guess. I think yours is closer to the truth
             | than mine.
        
           | sanxiyn wrote:
           | That data is bullshit. It measures -fno-strict-overflow -fno-
           | strict-aliasing. I in fact agree flipping defaults on these
           | UBs is reasonable, but that is very far from "portable
           | assembler" or UB-free C.
           | 
           | THE primary UB of C is out-of-bounds write. Any proposal for
           | UB-free C should address how to compile out-of-bounds write.
           | 
           | If you insist out-of-bounds write should compile in "portable
           | assembler" way and reliably corrupt memory instead of
           | allowing compilers to assume it doesn't happen, compilers
           | can't allocate local variables in register. If it's on stack,
           | out-of-bounds write may reliably corrupt stack, so how can it
           | be on register?
           | 
           | "Portable assembler" people invariably say "that's not what I
           | meant!" when compiler people raise this issue, but the
           | question is "then what do you mean?" And I am yet to see any
           | good answer.
        
             | mpweiher wrote:
             | > and reliably corrupt memory
             | 
             | Who said memory corruption had to occur "reliably"??
             | 
             | "Permissible undefined behavior ranges from ignoring the
             | situation completely with unpredictable results"
             | 
             | > compilers can't allocate local variables in register.
             | 
             | Why not?
             | 
             | > If it's on stack, out-of-bounds write may reliably
             | corrupt stack, so how can it be on register?
             | 
             | "Permissible undefined behavior ranges from ignoring the
             | situation completely with unpredictable results"
             | 
             | > And I am yet to see any good answer.
             | 
             | What would qualify an answer as "good" IYHO?
        
               | indymike wrote:
               | >> and reliably corrupt memory > Who said memory
               | corruption had to occur "reliably"??
               | 
               | Usually when I write code that corrupts memory, it
               | reliably corrupts memory an unreliable way.
        
             | nkurz wrote:
             | Could you you explain your position here more fully? I'm a
             | C programmer mostly in the "portable assembler" camp, and I
             | don't see why insisting on out-of-bound writes to be
             | performed prevents compilers from keeping local variables
             | in registers. I think the difference might be your use of
             | the word "reliably".
             | 
             | I don't think anyone in my camp is insisting that the
             | results be reliable in the sense you seem to take for
             | granted. What we want is for the compiler to attempt to do
             | what it's instructed to do, without eliding code based on
             | counterfactual reasoning that assumes the absence of
             | undefined behavior. In the absence of an explicit read that
             | is being ignored, we're generally fine with cached values
             | of local variables being out of date.
             | 
             | Maybe you could give an example that makes your point more
             | clearly, and we can see if we actually disagree?
        
               | sanxiyn wrote:
               | See https://news.ycombinator.com/item?id=30024534 (by
               | jcranmer) and
               | https://news.ycombinator.com/item?id=30026095 (by me).
               | Does it help?
        
               | nkurz wrote:
               | Unfortunately, these don't help much.
               | 
               | I thought about responding to jcranmer's post instead of
               | yours, but wasn't sure how to approach it. It's a good
               | comment, and I appreciated his attempt, but I feel like
               | he's thoroughly missing the point of the complaints about
               | UB. The complaint isn't that too much UB exists in C, but
               | that compiler writers seem to use the presence of UB as
               | an excuse for being allowed to break code that has worked
               | historically. And the complaint isn't just that the code
               | is broken, but that no apparent care is taken to avoid
               | doing so despite the small amount of gain. It's a
               | worldview mismatch, and I don't know how to bridge it.
               | 
               | Your comment seems about the same as the one I responded
               | to. You seem to assume that people who complain about UB
               | in C would have a problem with keeping local variables in
               | registers, but I've never seen anyone actually make this
               | complaint. Take for example the Arxiv paper we are
               | discussing: he doesn't bring this up. This makes me
               | suspect that your mental model of the people who are
               | complaining about UB in C is probably flawed. I
               | understand the technical issue you are gesturing toward,
               | I just don't see it as being something that refutes any
               | of the issues brought up in the Arxiv paper.
               | 
               | My hope was that a concrete example might help to
               | clarify, but I do realize this might not be the right
               | forum for that.
        
               | jcranmer wrote:
               | My experience with a lot of the UB-is-bad crowd is that
               | they don't have much of an appreciation for semantics in
               | general. That is to say, they tend to react to particular
               | compiler behavior, and they don't have any suggestions
               | for how to rectify the situation in a way that _preserves
               | consistent semantics_. And when you try to pin down the
               | semantics, you usually end up with a compiler that has to
               | "do what I mean, not what I said."
               | 
               | A salient example that people often try on me, that I
               | don't find persuasive, is the null pointer check:
               | void foo(int *x) {         int val = *x;         if (!x)
               | return;         /* do real stuff */       }
               | 
               | "The compiler is stupid for eliminating that null check!"
               | "So you want the code to crash because of a null pointer
               | dereference then." "No, no, no! Do the check first, and
               | dereference only when it's not null!" "That's not what
               | you said..."
        
               | nkurz wrote:
               | > "No, no, no! Do the check first, and dereference only
               | when it's not null!"
               | 
               | I don't think I've heard anyone express this preference.
               | If they did, I'd agree with you that they are being
               | unreasonable. My impression is that practically everyone
               | on the "portable assembler" team would think it's
               | perfectly reasonable to attempt the read, take the
               | SIGSEGV if x is NULL, and crash if it's not handled. Most
               | would also be happy skipping the read if the value can be
               | shown to be unused. None would be happy with skipping the
               | conditional return.
               | 
               | Where it gets thornier is when there is "time travel"
               | involved. What if the unprotected read occurs later, and
               | instead of just returning on NULL, we want to log the
               | error and fall through:                 void foo(int *x)
               | {         if (!x) /* log error */;         int val = *x;
               | return;       }
               | 
               | Is it reasonable to have the later unconditional read of
               | x cause the compiler to skip the earlier check of whether
               | x is NULL? In the absence of an exit() or return in the
               | error handler, I think it would be legal for the compiler
               | to skip both the error logging and the unused load and
               | silently return, but I don't want the compiler to do
               | this. I want it to log the error I asked it to (and then
               | optionally crash) so I can identify the problem.
               | Alternatively, I'd probably be happy with some compile
               | time warning that tells me what I did wrong.
        
               | mzs wrote:
               | On a lot of systems load from address zero doesn't
               | segfault, I'm fine with the cpu loading it. I'm
               | disappointed that the new compiler removed the check
               | someone wrote a dozen years ago to prevent the cpu from
               | overwriting something important though.
        
               | gpderetta wrote:
               | The null pointer need not map to address zero exactly to
               | cater to this sort of corner cases.
        
               | jcranmer wrote:
               | In the case where loads from address zero are legal, the
               | dereference-means-non-null-pointer assumption is
               | disabled.
        
               | mzs wrote:
               | Seems it wasn't disabled in 2009
               | 
               | https://github.com/torvalds/linux/commit/a3ca86aea5079041
               | 488...
        
               | jcranmer wrote:
               | That case wasn't one of the cases where null pointers are
               | valid memory, judging from the commit message. (There
               | _was_ a case where gcc had a bug where it didn 't disable
               | that check properly, but this doesn't appear to be that
               | case.)
        
               | bewaretheirs wrote:
               | Actually, yes, I do want that code to crash if x is NULL.
        
               | pcwalton wrote:
               | Then you should logically be OK with deleting the null
               | pointer check, as that check is unreachable.
        
               | nkurz wrote:
               | Are you familiar with this classic example?
               | http://blog.llvm.org/2011/05/what-every-c-programmer-
               | should-...
               | 
               | The problem (if I understand it right) is that "Redundant
               | Null Check Elimination" might be run first, and will get
               | rid of the safety return. But then the "Dead Code
               | Elimination" can be run, which gets rid of the unused
               | read, and thus removes the crash. Which means that rather
               | than being logically equivalent, you can end up with a
               | situation where /* do real stuff */ (aka /* launch
               | missiles */) can be run despite the programmer's clear
               | intentions.
        
               | jwatte wrote:
               | I would expect the code to crash when passed a null
               | pointer, absolutely! And the if statement is there to let
               | me recover after using "continue" in the debugger. And if
               | I run on an ISA without protected memory, that load will
               | not actually crash, and I'm OK with that. That's a level
               | of UB differing-behavior (more like ID, really) that's
               | reasonable.
               | 
               | I know of no experienced C programmer who would expect
               | the compiler to re-order the statements. That sounds very
               | much like a strawman argument to me!
               | 
               | I'd also be OK with a compiler that emitted a warning:
               | "This comparison looks dumb because the rules say it
               | should do nothing." Emitting that warning is helpful to
               | me, the programmer, to understand where I may have made a
               | mistake. Silently hiding the mistake and cutting out
               | parts of code I wrote, is _not_ generally helpful, even
               | if there exist some benchmark where some inner loop will
               | gain 0.01% of performance if you do so.
               | 
               | After all: The end goal is to produce and operate
               | programs that run reliably and robustly with good
               | performance, with minimum programmer cost. Saying that
               | the possible performance improvements of code that
               | _nobody will run because it 's buggy_ absolutely trump
               | every other concern in software development, would be a
               | statement I don't think reflects the needs of the
               | software development profession.
        
               | sanxiyn wrote:
               | No, I don't assume people are against register
               | allocation, but any concrete proposal I have seen kind of
               | _implies_ such conclusion. I am trying to understand what
               | people actually want, since they seem clearly different
               | from what people _say_ they want.
               | 
               | Okay let's discuss a concrete example.
               | *x = 12345678;         f();         return *x; // can you
               | copy propagate 12345678 to here?
               | 
               | f() does this:                   for (int *p = 0; p++; p
               | < MEMSIZE)             if (*p == 12345678)
               | *p = 12345679.
               | 
               | That is, f scans the memory for 12345678 and replace all
               | instances with 12345679. There is no doubt this actually
               | works that way in assembly. Things like cheat engines do
               | this! C compilers assume this doesn't happen, because it
               | is UB.
               | 
               | Hence, portable assembly C compiler can't omit any load.
               | Now I understand there are minority of people who will
               | answer "that's what I want!", but like register
               | allocation, I think people generally want this to
               | optimize. But that necessarily implies memory search-and-
               | replace can't compile in portable assembly manner.
        
               | nkurz wrote:
               | Thanks for making an example!
               | 
               | I'm against compiling this particular example to return a
               | constant. I presumably wrote the awkward and unnatural
               | construction return *x because I want to to force a load
               | from x at return. If I wanted to return a constant, I'd
               | have written it differently! I'm odd though in that I
               | occasionally do optimizations to the level where I
               | intentionally need to force a reload to get the assembly
               | that I want.
               | 
               | Philosophically, I think our difference might be that you
               | want to conclude that one answer to this question
               | directly implies that the "compiler can't omit any load",
               | while I'd probably argue that it's actually OK for the
               | compiler to treat cases differently based on the apparent
               | intent of the programmer. Or maybe it's OK to treat
               | things differently if f() can be analyzed by the compiler
               | than if it's in a shared library.
               | 
               | It would be interesting to see whether your prediction
               | holds: do a majority actually want to return a constant
               | here? My instinct is that C programmers who complain
               | about the compiler treatment of UB behavior will agree
               | with me, but that C++ programmers dependent on
               | optimizations of third party templates might be more
               | likely to agree with you.
        
               | sanxiyn wrote:
               | Oh, so you are on "that's what I want!" camp. But I am
               | pretty sure you are in minority, or at the very least
               | economic minority. Slowdown implied by this semantics is
               | large, and easily costs millions of dollars.
               | 
               | > while I'd probably argue that it's actually OK for the
               | compiler to treat cases differently based on the apparent
               | intent of the programmer.
               | 
               | This is actually what I am looking for, i.e. answer to
               | "then what do you mean?". Standard should define how to
               | divine the apparent intent of the programmer, so that
               | compilers can do divination consistently. So far,
               | proposals have been lacking in detailed instruction of
               | how to do this divination.
        
               | bvrmn wrote:
               | > and easily costs millions of dollars
               | 
               | Looks like UB bugs can cost more. It's a new age of UB
               | sanitizers as a reaction to a clear problem with UB.
        
               | pcwalton wrote:
               | Bugs based on _optimizations_ that compilers make based
               | on assumptions enabled by undefined behavior (like the
               | null check issue from 2009 in the Linux kernel) actually
               | don 't cost very much. They get a disproportionate amount
               | of scrutiny relative to their importance.
        
               | mjevans wrote:
               | A programmer wrote things in a specific order for a
               | specific reason.
               | 
               | Lets instead assume that the variable assignments above
               | are to some global configuration variables and then f()
               | also references those and the behavior of f() changes
               | based on the previously written code.
               | 
               | The objections from the 'C as portable assembler' camp
               | are:
               | 
               | * Re-ordering the order of operations across context
               | switch bounds (curly braces and function calls). -- re-
               | ordering non-volatile store / loads within a context is
               | fine, and shouldn't generate warnings.
               | 
               | * Eliminating written instructions (not calling f())
               | based on optimizations. -- Modification to computed work
               | should always generate a warning so the optimization can
               | be applied to the source code, or bugs corrected.
        
               | wbl wrote:
               | Globals and locals are different. All compilers will give
               | a global a specific memory location and load and store
               | from it. Locals by contrast can be escape analyzed.
        
               | mjevans wrote:
               | The example didn't show where X was defined; it could be
               | anything.
        
               | comp54321 wrote:
               | There's not a bright line between optimization passes
               | being aware of UB in unreasonable vs "exploitative" ways.
               | The principled way to think about optimization is that an
               | optimization pass can assume certain invariants about the
               | code but must preserve other invariants through the
               | transformation as long as the assumptions hold. I think
               | basically any invariant you assume in a memory-unsafe
               | language C could be invalidated by real code.
               | 
               | A lot of the things people complain about are cases where
               | the compiler got better at reasoning through valid
               | transformations to the code, _not_ cases where the
               | compiler . E.g. I 'm sure that even very old compilers
               | have had optimization passes that remove redundant NULL
               | checks that logically depend on dereferencing NULL
               | pointers being undefined. But these probably were more
               | ad-hoc and could only reason about relatively simple
               | cases and constrained scopes. Another example is integer
               | overflow. I'd bet a lot of older compilers have loop
               | optimizations that somehow depend on the loop counter not
               | overflowing, or pointer addresses not wrapping around.
               | 
               | I think it's perfectly reasonable to say that C's
               | definition of undefined behaviour is too expansive and
               | too difficult to avoid in real code.
               | 
               | But I think that a lot of the complaints end up being
               | equivalent to "don't depend on UB in ways that I notice
               | in my code" or "don't depend on UB in ways that compilers
               | from the 90s didn't". That's not something you can
               | practically apply or verify when building a compiler
               | because there isn't a bright line between assuming an
               | invariant in reasonable vs unreasonable ways, unless you
               | can define a narrow invariant that distinguishes them.
        
             | ghoward wrote:
             | I want a safe C. If that means I have to deal with fewer
             | available optimizations, I'm all for that! That's why I
             | wrote two's-complement arithmetic with unsigned types and
             | why I actually have bounds checks on my array accesses in
             | my code.
             | 
             | That means I would gladly take a compiler that could not
             | allocate local variables in registers.
             | 
             | Or even better, just do a bounds check. The standard does
             | not preclude storing a raw pointer and a length together.
             | Why couldn't the compiler do something like that and do a
             | bounds check?
             | 
             | (Yes, the compiler would have to use its own version of
             | libc to make that happen, but it could.)
             | 
             | In other words, I want correct code first, performant code
             | second. I'll take the performance hit for correct code.
             | 
             | People claim that exploiting UB gives 1-2% speedup and that
             | that's important. What they conveniently leave out is that
             | you only get that speedup _when your code has UB_ , which
             | means that you only get that speedup when your code has a
             | bug in it.
             | 
             | Fast wrong code is bad. Very bad. Correct code, even if
             | slower, is an absolute must.
        
               | foxfluff wrote:
               | > People claim that exploiting UB gives 1-2% speedup and
               | that that's important. What they conveniently leave out
               | is that you only get that speedup when your code has UB,
               | which means that you only get that speedup when your code
               | has a bug in it.
               | 
               |  _Assuming there is no UB_ and optimizing under that
               | assumption gives a speedup. You get the speedup whether
               | that assumption is wrong or not. If that assumption is
               | wrong, your program might also blow up in hilarious ways.
               | But in no way does making the assumption that your code
               | has no undefined behavior rely on your code having
               | undefined behavior.
        
               | ghoward wrote:
               | > Assuming there is no UB and optimizing under that
               | assumption gives a speedup.
               | 
               | Sure, but to make sure you do not have any UB, at least
               | for things like signed arithmetic, you basically have to
               | either give up signed arithmetic, which is what I've done
               | by using only unsigned, or ensure your arithmetic is
               | always within range, which requires formal methods.
               | Otherwise, you have UB, and that 1-2% speedup is because
               | of that UB. If you do what I do, you get no speedup. If
               | you use formal methods to ensure ranges, you do, but at
               | an enormous cost.
               | 
               | > But in no way does making the assumption that your code
               | has no undefined behavior rely on your code having
               | undefined behavior.
               | 
               | But it does. Because the optimizations that assume UB
               | only kick in when UB could be present. And if UB _could_
               | be present, it most likely is.
               | 
               | Once again, those optimizations will _never_ kick in on
               | my code that uses purely unsigned arithmetic. So yes,
               | those optimizations _do_ require UB.
        
               | foxfluff wrote:
               | Is it formal methods to conclude that the vast majority
               | of arithmetic, which for the software I typically write
               | consists of iterators and buffer offsets dealing with
               | fixed size buffers, can never overflow an int? Is it
               | formal methods when I conclude that the 12-bit output
               | from an ADC cannot overflow a 32-bit int when squared? Is
               | it formal methods to conclude that a system with 256
               | kilobytes of RAM is not going to have more than 2^31
               | objects in memory? I guess I do formal methods then, at
               | an enormous cost. Sometimes I have runtime checks with
               | early bail outs that allow code later down the stream to
               | be optimized.
        
               | MaxBarraclough wrote:
               | > I would gladly take a compiler that could not allocate
               | local variables in registers
               | 
               | That sounds rather extreme. I imagine it would be ruinous
               | to performance. To my knowledge the various safe
               | alternatives to C have no trouble with register
               | allocation. Safe Rust and Ada SPARK, for instance, or
               | even Java.
               | 
               | > The standard does not preclude storing a raw pointer
               | and a length together. Why couldn't the compiler do
               | something like that and do a bounds check?
               | 
               | That technique is called _fat pointers_. It 's been well
               | researched. Walter Bright, who just posted a comment
               | elsewhere in this thread, has a blog post on it. [0] I
               | imagine ABI incompatibility is one of the main reasons it
               | hasn't caught on.
               | 
               | C++ compilers do something similar, storing array lengths
               | in an address like _myArray[-1]_. They need to store
               | array lengths at runtime because of the _delete[]_
               | operator (unless they can optimise it away of course).
               | C++ still allows all sorts of pointer arithmetic though,
               | so it wouldn 't be easy for a compiler to offer Java-like
               | guarantees against out-of-bounds access. Doing so takes a
               | sophisticated tool like Valgrind rather than just another
               | flag to gcc.
               | 
               | [0] https://digitalmars.com/articles/C-biggest-
               | mistake.html
        
               | pcwalton wrote:
               | Why not compile at -O0? That gives you everything that
               | you want today, including no register allocation.
        
               | sanxiyn wrote:
               | Yes, this is a consistent position, but bounds checking
               | all indexing and making pointers larger by including
               | length will cause slowdown much larger than 1-2%. My
               | estimate is at least 10% slowdown. So citing 1-2%
               | slowdown is misleading.
        
               | ghoward wrote:
               | I agree 1-2% might be misleading, but Dr. John Regehr
               | puts overflow checking overhead at 5%. [1]
               | 
               | I'll take 5% or even 10% for correctness because, again,
               | fast and wrong is very bad.
               | 
               | [1]: https://blog.regehr.org/archives/1154
        
             | jwatte wrote:
             | The concept of "friendly C" is not new, and this is one of
             | the straw men that it knocked down, because it is actually
             | possible to define "what I mean."
             | 
             | It's totally OK to say that an out-of-bounds write may or
             | may not be detectable by reading any particular other local
             | or global variable or heap value. That doesn't in turn
             | translate to compilers being allowed to turn a bug in a
             | SPECint kernel into an empty infinite loop (a case that
             | happened!)
        
         | mlindner wrote:
         | I pretty strongly disagree with these quotes you have here. You
         | don't need UB to do optimization, in fact because of UB in C
         | many forms of optimization are impossible in C because of
         | aliasing uncertainty. If things are defined to not alias then
         | even though things are fully defined there is a lot more
         | optimization that can be performed by not reloading variables
         | from memory and instead using local cached-in-register copies
         | of variables. This is why naive Rust can in some cases be
         | faster than naive C (without using of restrict keyword) and
         | this will only improve with time as compilers better support
         | this.
        
           | aw1621107 wrote:
           | > in fact because of UB in C many forms of optimization are
           | impossible in C because of aliasing uncertainty.
           | 
           | This doesn't seem right? Specific types of aliasing make
           | optimization more difficult/impossible because they are
           | specifically _not_ UB. Rust 's aliasing optimizations can
           | (theoretically?) occur specifically because same-type
           | aliasing _is_ UB in Rust. I 'm not sure how "UB prevents
           | optimization" falls out of that?
        
           | bee_rider wrote:
           | See also decades of Fortran performance dominance, only
           | overcome by throwing a gazillion C programmers at the
           | problem.
        
             | gpderetta wrote:
             | but doesn't Fortran get away with it exactly because
             | argument aliasing is UB? You can use restrict to get
             | Fortran semantics.
        
               | bee_rider wrote:
               | Actually yeah, you are right. I somehow misread it as a
               | defense of UB because of the last sentence. Total a
               | brain-fart on my part, sorry.
        
           | puffoflogic wrote:
           | > If things are defined to not alias
           | 
           | ... What do you call it then when you put aliasing pointers
           | into pointer variables "defined not to alias"? Can you
           | explain how this _reduces_ UB?
        
             | foxfluff wrote:
             | I think they're trying to say that if your language
             | _prevents_ aliasing (i.e. pointers by definition do not
             | alias and it is impossible to make them alias), then the
             | compiler doesn 't need to worry about it. C doesn't prevent
             | aliasing, therefore the compiler has to worry about every
             | case of aliasing that isn't undefined.
             | 
             | But it's not the UB here that prevents optimization, it's
             | precisely the fact that aliasing is (sometimes) defined. If
             | it were always UB, the compiler wouldn't ever have to worry
             | about it... but that'd require some interesting changes to
             | the language to keep it usable. Definitely not backwards
             | compatible changes.
        
               | puffoflogic wrote:
        
               | andrewla wrote:
               | > pointers by definition do not alias and it is
               | impossible to make them alias
               | 
               | This is equivalent to saying that pointer arithmetic is
               | disallowed. Pointers are by their nature in the C virtual
               | machine offsets into a linear memory space, so for any
               | two pointers, x and y, there exists a c such that
               | (ptr_t)x+c == (ptr_t)y, and thus there can always be
               | aliasing.
        
               | gpderetta wrote:
               | Actually no, the C abstract machine is not contiguous. If
               | x and y point inside distinct objects (recursively) not
               | part of the same object, there is no valid c that you can
               | add to x to reach y.
               | 
               | edit: allowing that would prevent even basic
               | optimizations like register allocation.
               | 
               | edit: s/virtual/abstract/
        
               | andrewla wrote:
               | The C abstract machine requires reversible
               | transformations from pointer to integral types (7.18.1.4
               | in C99, 3.3.4 in C90, the first ISO standard).
               | 
               | Practically speaking modern devices are in fact mostly
               | flat, so you can of course do this, although you do brush
               | up against undefined behavior to get there.
        
               | int_19h wrote:
               | Reversible transformations don't imply a flat address
               | space. All it means is that there's an integral type with
               | enough bits to record any address in them.
        
               | wbl wrote:
               | Pointer provenance is more complex and subtle.
        
               | Shikadi wrote:
               | What is the c virtual machine? I thought there wasn't one
        
               | foxfluff wrote:
               | They mean the abstract machine, in terms of which the
               | semantics are defined.
        
               | skissane wrote:
               | > Pointers are by their nature in the C virtual machine
               | offsets into a linear memory space
               | 
               | Historically, many platforms - such as Multics or Windows
               | 3.x - didn't have a linear memory space, they had some
               | kind of memory segmentation. The industry has largely
               | moved away from that towards the flat address space
               | model. Go back to the 1980s, it was still a much bigger
               | thing, and people used C on those platforms, and the
               | standard was written to support them. The actual detailed
               | control of memory segmentation is inherently non-portable
               | so cannot be addressed by the standard, but the standard
               | defines pointer arithmetic in such a way to support those
               | platforms - pointer arithmetic on unrelated pointers is
               | undefined, because if the pointers belong to different
               | memory segments the results can be meaningless and
               | useless.
        
               | andrewla wrote:
               | Even in cases of segmented memory architectures there is
               | still a requirement that a void pointer be able to cast
               | (reversably) into an integral type (7.18.1.4 in C99,
               | 3.3.4 in C90, the first ISO standard).
        
               | Sprocklem wrote:
               | This is not true in either of those C versions.
               | C99SS7.18.1.4 describes (u)intptr_t as optional, which
               | (per SS7.18) means that <stdint.h> need not provide those
               | typedefs if the (compiler) implementation doesn't provide
               | an integer type that allows reversible casting from a
               | void pointer. Similarly, it's not clear in C90SS3.3.4
               | that the implementation has to implement these casts in a
               | reversible manner, although that is the obvious way of
               | implementing it.
               | 
               | That being said, I can't think of an implementation that
               | didn't support this, even if they did have to pack
               | segment and offset information into a larger integer.
        
         | mzs wrote:
         | I really wish we got
         | 
         | cc: warning: UB line: 123 file: foo.c
        
           | foxfluff wrote:
           | Yeah, it'd be nice if we could solve the halting problem.
        
           | tlb wrote:
           | You can't detect most UB at compile time. LLVM has a system
           | to detect it a runtime. There is a significant performance
           | penalty, but it can be useful during testing.
           | 
           | See
           | https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html
        
           | jcranmer wrote:
           | How many warnings do you want for this small function?
           | void oh_the_humanity(int *ptr, int val) {         *ptr = val
           | + 1;       }
           | 
           | Off the top of my head:
           | 
           | * UB: ptr may be pointing a float variable. (It's not illegal
           | to assign a float* to an int*, it's only UB when you actually
           | dereference it with the wrong type.)
           | 
           | * UB: val + 1 may overflow.
           | 
           | * UB: potential data race on writing *ptr.
           | 
           | * UB: ptr may be a one-past-the-end-of-the-array pointer,
           | which can be validly constructed, but may not be
           | dereferenced.
           | 
           | * UB: ptr may be pointing to an object whose lifetime has
           | expired.
           | 
           | * UB: ptr may be uninitialized.
           | 
           | * UB: val may be uninitialized.
           | 
           | As you can see, UB is intensely specific to the actual data
           | values; it's not really possible to catch even a large
           | fraction of UB statically without severe false positive
           | rates.
        
             | mzs wrote:
             | Yeah I know I get it, it's me more being wishful, but I
             | more seriously wish at least compilers could emit a warning
             | when they optimize something after UB:                 %
             | cat foo.c       #include <stdlib.h>               int
             | foo(int *bar)       {        int baz = *bar;
             | if (bar == NULL) exit(2);               return (baz);
             | }       % cc -O3 -Wall -Wextra -c foo.c       % objdump -dr
             | foo.o                          foo.o: file format Mach-O
             | 64-bit x86-64                     Disassembly of section
             | __TEXT,__text:              0000000000000000 _foo:
             | 0: 55                            pushq %rbp              1:
             | 48 89 e5                      movq %rsp, %rbp
             | 4: 8b 07                         movl (%rdi), %eax
             | 6: 5d                            popq %rbp              7:
             | c3                            retq       % cc -O0 -Wall
             | -Wextra -c foo.c       % objdump -dr foo.o
             | foo.o: file format Mach-O 64-bit x86-64
             | Disassembly of section __TEXT,__text:
             | 0000000000000000 _foo:              0: 55
             | pushq %rbp              1: 48 89 e5
             | movq %rsp, %rbp              4: 48 83 ec 10
             | subq $16, %rsp              8: 48 89 7d f8
             | movq %rdi, -8(%rbp)              c: 48 8b 45 f8
             | movq -8(%rbp), %rax             10: 8b 08
             | movl (%rax), %ecx             12: 89 4d f4
             | movl %ecx, -12(%rbp)             15: 48 83 7d f8 00
             | cmpq $0, -8(%rbp)             1a: 0f 85 0a 00 00 00
             | jne 10 <_foo+0x2a>             20: bf 02 00 00 00
             | movl $2, %edi             25: e8 00 00 00 00
             | callq 0 <_foo+0x2a>         0000000000000026:
             | X86_64_RELOC_BRANCH _exit             2a: 8b 45 f4
             | movl -12(%rbp), %eax             2d: 48 83 c4 10
             | addq $16, %rsp             31: 5d
             | popq %rbp             32: c3
             | retq       %
             | 
             | That's very similar to something that bit me in embedded
             | except it was with pointer to structure. Compiler realizes
             | I've derefed NULL and that's UB anyway so no need to do the
             | NULL check later and merrily scribble exc vectors or
             | whatever.
        
               | foxfluff wrote:
               | That's a nice example. It'd definitely be nice to have a
               | warning for this one.
               | 
               | Fwiw GCC does have a related warning flag (-Wnull-
               | dereference) but I'm not sure it's made exactly for this.
               | I believe it works based on functions being annotated for
               | possibly returning NULL, e.g. malloc. It's also not
               | enabled by -Wall or -W because apparently there were too
               | many false positives:
               | https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96554
               | 
               | I imagine patches would be welcome. I'm guessing there
               | are more people who like to wish for more compiler
               | features than there are people who like to develop
               | compilers :)
        
               | mzs wrote:
               | edit: Thanks, I just checked and the warning doesn't work
               | 
               | https://godbolt.org/z/4TP1hfx4j
               | 
               | But your hint found -fno-delete-null-pointer-checks which
               | does the trick
               | 
               | https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
               | 
               | And -fno-delete-null-pointer-checks made it into LLVM
               | too. It's a good to know but a little late for when I
               | needed it, cheers :)                 % cc -O3 -fno-
               | delete-null-pointer-checks -Wall -Wextra -c foo.c       %
               | objdump -dr foo.o
               | foo.o: file format Mach-O 64-bit x86-64
               | Disassembly of section __TEXT,__text:
               | 0000000000000000 _foo:              0: 55
               | pushq %rbp              1: 48 89 e5
               | movq %rsp, %rbp              4: 48 85 ff
               | testq %rdi, %rdi              7: 74 04
               | je 4 <_foo+0xd>              9: 8b 07
               | movl (%rdi), %eax              b: 5d
               | popq %rbp              c: c3
               | retq              d: bf 02 00 00 00                movl
               | $2, %edi             12: e8 00 00 00 00
               | callq 0 <_foo+0x17>         0000000000000013:
               | X86_64_RELOC_BRANCH _exit
        
               | zajio1am wrote:
               | But such code might be generated by macros (or some code
               | generator), in which case silent elimination of
               | unnecessary code is expected and wanted behavior.
        
           | [deleted]
        
           | gpderetta wrote:
           | Compilers will routinely warn if they can detect UB at
           | compile time.
           | 
           | The problem is that except in a few trivial cases it is
           | impossible to detect UB at compile time. Even whole program
           | static analysis can only catch a small subset.
        
         | mpweiher wrote:
         | He doesn't really address any of the claims made, and makes
         | some startling claims himself. For example:
         | 
         | > This example demonstrates that even ICC with -O1 already
         | requires unrestricted UB.
         | 
         | The example demonstrates nothing of the sort. It demonstrates
         | that ICC _uses_ unrestricted undefined behaviour, not that this
         | is required in any way shape or form. (The only way the word
         | "requires" is reasonable here is that the behavior seen
         | "requires" use of this kind of UB to be present. But that's
         | something very different, and doesn't match with the rest of
         | his use).
         | 
         | > writing C has become incredibly hard since undefined behavior
         | is so difficult to avoid
         | 
         | No, it has become difficult because compilers exploit UB in
         | insane ways. The platform specific UB that he claims is "not an
         | option" is, incidentally, exactly how UB is defined in the
         | standard:
         | 
         |  _Permissible undefined behavior ranges from ignoring the
         | situation completely with unpredictable results, to behaving
         | during translation or program execution in a documented manner
         | characteristic of the environment (with or without the issuance
         | of a diagnostic message), to terminating a translation or
         | execution (with the issuance of a diagnostic message)._
         | 
         | This was made non-binding in later versions of the standard, so
         | optimiser engineers act as if these words don't exist.
         | 
         | And of course there is no reason given for this interpretation
         | being "not an option". Except for "but I wanna". Yes, it's not
         | an option if you really want to exploit UB in the insane ways
         | that compilers these days want to exploit it.
         | 
         | But that's not _necessary_ in any way shape or form.
         | 
         | > That would declare all but the most basic C compilers as non-
         | compliant.
         | 
         | Yes, because by any sane interpretation of the standard they
         | _are_ non-compliant.
         | 
         | On a more general note, I find this idea that things are
         | _necessary_ because I _want_ to do them really bizarre.
        
           | aw1621107 wrote:
           | > This was made non-binding in later versions of the
           | standard, so optimiser engineers act as if these words don't
           | exist.
           | 
           | This is reminiscent of the argument in "How One Word Broke C"
           | [0, HN discussion at 1]. I'm not particularly convinced this
           | argument is correct. In my opinion, it's "ignoring the
           | situation completely" that's the phrase of interest; after
           | all, what is assuming that UB cannot occur but "ignoring the
           | situation completely"?
           | 
           | I'm a nobody, though, so take that with an appropriate grain
           | of salt.
           | 
           | [0]: https://news.quelsolaar.com/2020/03/16/how-one-word-
           | broke-c/ (currently broken; archive at https://web.archive.or
           | g/web/20210307213745/https://news.quel...
           | 
           | [1]: https://news.ycombinator.com/item?id=22589657
        
             | mpweiher wrote:
             | Apologies, but "ignoring the situation completely" is the
             | exact opposite of "assuming the situation cannot occur and
             | _acting_ on that assumption ".
        
               | aw1621107 wrote:
               | I'm not sure how "acting on an assumption that the
               | situation cannot occur" is distinguishable from "choosing
               | to ignore situations completely whenever they come up".
               | The former is a blanket description of how you treat
               | individual instances.
               | 
               | For example:                   int f(int* i) {
               | int val = *i;             if (i == NULL) { return 0; }
               | return val;         }
               | 
               | I submit that there are two situations here:
               | 
               | 1. i is NULL. Program flow will be caught by the NULL
               | check and the return value is 0.
               | 
               | 2. i is not NULL. The NULL check cannot be hit, and the
               | return value is val.
               | 
               | As allowed by the standard, I'll just ignore the
               | situation with UB, leaving
               | 
               | > i is not NULL. The NULL check cannot be hit, and the
               | return value is val.
               | 
               | Alternatively, I can assume that UB cannot occur, which
               | eliminates option 1, leaving
               | 
               | > i is not NULL. The NULL check cannot be hit, and the
               | return value is val.
               | 
               | You get the same result either way.
               | 
               | And that gets to the root of the problem: What _exactly_
               | does  "ignoring the situation completely" mean? In
               | particular, what is the scope of a "situation"?
        
           | gpderetta wrote:
           | sorry, I don't understand, it seems to me that current
           | optimizing compilers are fully compliant with the option of
           | "ignoring the situation completely with unpredictable
           | results". That's how UB exploiting optimizations work: the
           | compiler ignores the possibility that the erroneous case can
           | ever happen and takes into account only all valid states.
        
             | mpweiher wrote:
             | Apologies, but "ignoring the situation completely with
             | undpredictable results" is the exact opposite of "assuming
             | the situation cannot occur and _acting_ on that assumption
             | ".
             | 
             | If my program attempts to write outside the bounds of
             | declared array, ignoring the situation (that this is UB) is
             | letting that write happen, and letting the chips fall where
             | they might.
             | 
             | How is assuming it cannot/must not happen, and then
             | optimising it away because it did happen "ignoring the
             | situation"??
        
               | gpderetta wrote:
               | So a compiler that is allowed to ignore a situation would
               | still be required to generate code for that situation? I
               | don't even know how that would be possible.
        
         | commandlinefan wrote:
         | I had the same basic thought while reading this: the only
         | alternative here is to interpret this sort of undefined
         | behavior in a "different" way (such as the way the K&R
         | originally interpreted it) and leave it up the programmer to
         | intuit what _other_ assumptions the compiler is making on their
         | behalf. Like... I can assume that the compiler will interpret:
         | if (i++ < i)
         | 
         | as                   mov ax, i         inc ax         cmp ax, i
         | jge else
         | 
         | ... but that's still me making a (reasonable?) assumption. The
         | C spec essentially just says "don't program that way".
        
           | Shikadi wrote:
           | Odd intuition, since you never store i back after
           | incrementing. Also, I believe i++ might increment after the
           | comparison rather than before, vs. ++i incrementing first
        
             | commandlinefan wrote:
             | You know what I mean.
        
               | Shikadi wrote:
               | I actually don't, but maybe I just lack enough background
               | knowledge on the discussion to understand your intent
        
               | saghm wrote:
               | I think they're saying that the assembly translation
               | doesn't store back `i`, whereas the C version does, so
               | it's a not straightforward to assume that the assembly
               | compiled from the C won't do that.
        
               | commandlinefan wrote:
               | The article talks about a similar code snippet being
               | optimized out by the compiler because i++ can never be
               | less than i originally was (unless you take into account
               | the actual behavior of computers).
        
             | int_19h wrote:
             | ++ is itself a store.
        
       | kortex wrote:
       | Is it even possible to have zero undefined behavior in languages
       | that allow user-defined pointers? It seems like allowing even
       | just one degree of memory indirection creates a singularity
       | beyond which any kind of formal guarantees become impossible.
       | Seems like you'd have to allow only structures which hide memory
       | implementation details if you truly want to avoid all UB. Same
       | goes for any arithmetic which could overflow.
       | 
       | That would require kernel devs to radically rethink how they
       | interact with I/O, which would probably require specific
       | architectures.
       | 
       | In other words, writing a kernel portable on any of the existing
       | ISAs that is also performant is basically impossible, barring
       | some humongous breakthrough in compiler technology.
       | 
       | Seems to me that when it comes to brass tacks, UB is kind of the
       | "we are all adults here" engineering tradeoff that enables
       | shipping fast and useful software, but is _technically_ not
       | strictly defined and thus usually does what you want, but can
       | result in bugs.
        
         | immibis wrote:
         | Even kernels only interact with memory in reasonably
         | predictable ways. I think they could all be hidden behind such
         | abstractions, BUT it will make the language a lot more complex.
        
         | kps wrote:
         | The original C committee wrote, as part of its guiding
         | principle to "Keep the spirit of C", that "To help ensure that
         | no code explosion occurs for what appears to be a very simple
         | operation, many operations are defined to be _how the target
         | machine's hardware does it_ rather than by a general abstract
         | rule."1 That is, if you write `a = b + c` you expect the
         | compiler to generate an `add a, b, c` instruction, and if that
         | happens to trap and burn your house down, well, that 's not C's
         | problem.
         | 
         | I'm convinced that the original UB rule was intended to capture
         | this, and the wording was an error later seized by compiler
         | developers. As evidence, consider Dennis Ritchie's rejection of
         | `noalias` as "a license for the compiler to undertake
         | aggressive optimizations that are completely legal by the
         | committee's rules, but make hash of apparently safe programs"2.
         | If _anyone_ at the time had realized that this is what the
         | definition of UB implied, it would have been called out and
         | rejected as well.
         | 
         | 1 https://www.lysator.liu.se/c/rat/a.html#1-1
         | 
         | 2 https://www.lysator.liu.se/c/dmr-on-noalias.html
        
           | unwind wrote:
           | I think expecting the compiler to generate code that makes
           | 'a' hold the sum of 'b' and 'c', assuming there is code that
           | "observes" the value in some way, is a more useful model.
        
           | gpderetta wrote:
           | > if you write `a = b + c` you expect the compiler to
           | generate an `add a, b, c`
           | 
           | this could be violated even by simple optimizing compilers
           | that do constant propagation or strength reductions.
           | 
           | Again, if you do not want optimizations -O0 is always
           | available.
        
             | ErikCorry wrote:
             | Nowhere does anyone guarantee that you won't be hit with
             | nasal demons if you use -O0
        
               | gpderetta wrote:
               | Well, it is hard to guarantee anything about a bit of
               | code with no defined semantics, you'll have to settle for
               | what O0 gives you.
        
             | kps wrote:
             | The statement, in context, is a simple illustration of the
             | principle that C operator semantics were defined by the
             | target hardware. Not a treatise on compiler construction.
        
             | naniwaduni wrote:
             | It would be nice if optimizing compilers' -O0 weren't
             | completely insane on the assumption that the insanity would
             | get optimized out.
        
         | remram wrote:
         | Undefined behavior means something specific in the standard,
         | it's not just an operation that might do different things on
         | different compilers and machines. It means that if it happens,
         | the program is allowed to do anything and everything, even
         | before the UB is reached. Undefined behavior is breaking an
         | assumption that the compiler is allowed to make.
         | 
         | It is probably impossible to make a low-level language with no
         | _implementation-defined_ behavior, but it is certainly possible
         | to make one with no _undefined behavior_. For example, you can
         | put in your spec that overflowing an unsigned integer can give
         | any value; that is different that putting in your spec that it
         | doesn 't happen and if you write it the variable might have no
         | value, multiple values, or burn your socks off.
         | 
         | https://en.wikipedia.org/wiki/Undefined_behavior
        
           | comp54321 wrote:
           | It's actually impossible as far as I can see as long as you
           | have unsafe memory access via pointers and memory is used to
           | hold information about control flow and local variables,
           | because overwriting memory can then do weird and wonderful
           | things to the state of the program. E.g. local variables can
           | magically change values, but worse, the control flow of your
           | program can be hijacked in pretty arbitrary ways.
           | 
           | It would be great if this wasn't possible, because then you
           | wouldn't have a whole class of security vulnerabilities.
           | 
           | On older systems without memory protection, the situation is
           | even worse - you could scribble all over the OS memory as
           | well and who knows what happens then.
        
         | foxfluff wrote:
         | > Is it even possible to have zero undefined behavior in
         | languages that allow user-defined pointers?
         | 
         | It kinda boils down to what exactly you mean by defined
         | behavior. A C programmer's take might be that you can run a
         | conforming program in an emulated abstract machine and get
         | defined results out of it. And then you can run the same thing
         | on real hardware and expect to get the same result (modulo
         | implementation defined behavior). This definition leaves some
         | things out (e.g. performance, observable effects in the "real
         | world") but it captures the computational semantics.
         | 
         | Another programmer's take might be more akin to a portable
         | assembler. In that case, you certainly could define reads and
         | writes for arbitrary pointers, in the sense that they must
         | cause corresponding (attempted) loads and stores at the machine
         | level. However, the definition wouldn't be complete since it
         | inevitably leaves much to the underlying implementation. Thus
         | you could have "defined" C programs that show completely
         | different behaviors depending on which implementation and
         | hardware you used. It would be impossible to say what the
         | program's output must be "in the abstract." For someone who
         | just wants to output assembly, maybe that's fine. I'm not sure
         | other people would be too satisfied with it. An out of bounds
         | write could still blow up your program and be remotely
         | exploitable; practically the same thing as undefined behavior,
         | except that now your compiler is also barred from optimizing.
         | 
         | There's quite a bit of tension between these two camps.
         | 
         | Alternatively, you could fully define it at a great runtime
         | cost and potential exclusion of real hardware implementations.
        
         | pornel wrote:
         | If by user-defined pointers you mean arbitrary integer-to-
         | pointer casts, then this is a kryptonite for static analysis,
         | and I don't think you can have a language that is both fast and
         | fully predictable (UB-free) in their presence. It breaks
         | pointer provenance and aliasing analysis, and existing
         | compilers already struggle with such casts in C.
         | 
         | But apart from that, you can have pointers, with many levels of
         | indirection, as long as there are rules that prevent use-after-
         | free, unsynchronized concurrent access, and other UB-worthy
         | problems. Rust's borrow checker with rules for no mutable
         | aliasing and Send/Sync markers for concurrent access comes
         | close, but it has to give up on generality for safety (e.g. it
         | can't reason about circular data structures).
        
         | ErikCorry wrote:
         | Without concurrency you don't have to have UB to have user-
         | defined pointers. x86 assembly language has no UB and has user-
         | defined pointers.
        
           | ahefner wrote:
           | CPUs can and do have UB in the instruction specifications.
           | x86 definitely does.
        
             | dooglius wrote:
             | In ring zero there are some, but no instructions that
             | userspace can use (or else security would be impossible)
        
             | secondcoming wrote:
             | Indeed, for example the BSR instruction:
             | 
             | > If the content source operand is 0, the content of the
             | destination operand is undefined.
             | 
             | [0] https://www.felixcloutier.com/x86/bsr
        
               | dooglius wrote:
               | No, this is not the same thing as "undefined behavior" in
               | the sense the ISO C spec defines it. The content of the
               | output being unspecified would not, for example, allow
               | the processor to write to some part of memory if the
               | source is zero, or change unrelated registers. x86 is
               | doing "undefined" the right way, unlike the ISO C spec.
        
               | kllrnohj wrote:
               | It's exactly the same as "undefined behavior" in C. The
               | only difference is in the consequences that follow from
               | it when optimizers get ahold of it.
               | 
               | If it's undefined behavior if BSR is given 0, then
               | therefore the compiler can assume the value passed to BSR
               | isn't 0 (because it's defined that a well-formed program
               | doesn't encounter undefined behavior). Therefore a check
               | if the value is == 0 can be skipped, because it can't be
               | zero since BSR didn't allow it. And since the code inside
               | the if isn't reachable, that probably means now other
               | things are no longer reachable and can similarly be
               | removed. And etc...
               | 
               | That's all the "undefined behavior allows the compiler to
               | murder my cat!" really is - it's just the chain of
               | consequences that result from the compiler following the
               | rules it was told. It was given a rule that a parameter
               | couldn't be null, and so it listened & did what it was
               | told including deleting redundant null checks (since
               | after all, the value was already specified to be not
               | null!).
               | 
               | To demonstrate the problem in a different language, let's
               | say I have this code:                   float
               | asFloat(@Nullable Integer i) {             return i ==
               | null ? Float.NAN : i.floatValue();         }
               | 
               | You'd expect the null check, right? But let's say I call
               | it from this function:                   void
               | sayHi(@NonNull Integer i) {             println("Hello "
               | + asFloat(i));         }
               | 
               | The optimizer comes along and inlines the two together
               | and now sees:                   void sayHi(@NonNull
               | Integer i) {             float temp = i == null ?
               | Float.NAN : i.floatValue();             println("Hello "
               | + temp);         }
               | 
               | Does it still need to keep the null check? After all, I
               | defined that 'i' isn't null, so surely the null check is
               | just unreachable code & can be eliminated, right? If it
               | doesn't, it's obviously wasting performance. If it does,
               | the internet gets angry about the compiler "abusing
               | undefined behavior." Even though as far as the compiler
               | is concerned it _isn 't_ undefined behavior! 'i' is
               | defined to be not-null!
        
               | dooglius wrote:
               | What "compiler"? We're talking about the x86 instruction
               | specification. The microcode translator certainly isn't
               | allowed make the sort of optimization you're describing,
               | that would be a violation of the spec that GP linked.
        
           | gpderetta wrote:
           | What would be, for example, the expected behaviour of
           | accessing an out of scope variable?
        
             | ErikCorry wrote:
             | You will get whatever bits have been written at that
             | location
        
               | gpderetta wrote:
               | What location? The variable might only have been
               | allocated in a register.
        
               | ErikCorry wrote:
               | There's no register allocator in X86 assembly
        
       | WalterBright wrote:
       | I had an online discussion some years back where I suggested that
       | C nail the size of char to 8 bits. He responded that there was a
       | CPU that had chars be 32 bits, and wasn't that great that a C
       | compiler for it would be Standard compliant?
       | 
       | I replied by pointing out that nearly every non-trivial C program
       | would have to be recoded to work on that architecture. So what
       | purpose did the Standard allowing that actually achieve?
       | 
       | I also see no problem for a vendor of a C compiler for that
       | architecture making a reasonable _dialect_ of C for it. After
       | all, to accommodate the memory architecture of the x86, nearly
       | all C compilers in the 80 's adopted near/far pointers, and while
       | not Standard compliant, it really didn't matter, and was
       | tremendously successful.
       | 
       | D made some decisions early on that worked out very well:
       | 
       | 1. 2's complement wraparound arithmetic
       | 
       | 2. sizes of basic integer types are fixed at 1 byte for chars, 2
       | for shorts, 4 for integers, 8 for longs. This is worked out very
       | well
       | 
       | 3. floating point is IEEE
       | 
       | 4. char's are UTF-8 code points
       | 
       | 5. chars are unsigned
       | 
       | These 5 points make for tremendous simplicity gains for D
       | programmers, and ironically _increase_ portability of D code.
       | 
       | After reading the paper, I'm inclined to change the definition of
       | UB in D to _not_ mean it can be assumed to not happen and not be
       | unintended.
        
       | abfan1127 wrote:
       | what's the history of having undefined behavior? why would a
       | language designer not specify?
        
         | rocqua wrote:
         | Because the code should be somewhat portable, and not all
         | processors act the same. This is why left and right shift are
         | weird with overflow. Not consistent between x86 and some other
         | instruction set. Similar for accessing the null address.
         | 
         | Besides that, how would you specify the behavior of reading
         | from an address the user randomly generated himself? What is
         | stored at an address not generated by the compiler is out of
         | scope for the compiler. If you try to define something like
         | 'always trap' or 'always return 0', you incur massive overhead.
         | 
         | Besides that. The fact that signed integer over/underflow is
         | undefined behavior actually unlocks a lot of reasonable
         | optimizations that treat integers like actual integers. Things
         | like (a+1 > a) always being true.
        
           | [deleted]
        
         | gpderetta wrote:
         | Sanitizers attempt to do define the behaviour of undefined
         | constructs, but expect an integer multiple slowdown compared
         | even with unoptimized builds.
         | 
         | It is very hard to specify the behaviour of, for example, use-
         | after-free, or accessing an automatic variable after it goes
         | out of scope, or data races, in a C-like language without
         | significant runtime cost.
        
       | speedcoder wrote:
       | Can you still compile gcc code with -O0 (gcc option to turn
       | optimization off) to get completely defined behavior? When doing
       | so does it actually still turn off all optimizations? Also does
       | -Os (optimization for size) still produce defined behavior?
        
         | foxfluff wrote:
         | > Can you still compile gcc code with -O0 (gcc option to turn
         | optimization off) to get completely defined behavior?
         | 
         | No. The standard specifies what's undefined, optimization
         | levels don't change it (though there are compiler flags such as
         | -fwrapv which make undefined things defined).
         | 
         | However, turning off optimizations will make behavior easier to
         | predict.
        
           | lanstin wrote:
           | it used to be passed around never to use -O0 because it was
           | much less used/tested and would generate wrong code more
           | often. Not sure id that was folklore or true.
        
       | RcouF1uZ4gsC wrote:
       | I think the undefined behavior in C and C++ are even less
       | defensible today than when they started because of the
       | convergence of architectures.
       | 
       | Pretty much every non-legacy architecture does IEEE floating
       | point. Pretty much all of them do a flat address space. The word
       | size is some power of 2(32 bit, 64 bit, maybe 128 bit in the
       | future). They are almost always little endian. The memory models
       | are converging towards the C++ memory model.
       | 
       | Given that, I think simplifying the language and getting rid of
       | foot guns could be done without losing any significant
       | performance or actual flexibility/portability.
        
         | pjmorris wrote:
         | It could be called 'C--' unless that's been taken.
        
           | RcouF1uZ4gsC wrote:
           | There is a language C-- created by Microsoft Research (Simon
           | Peyton Jones of Haskel and GHC fame)
           | 
           | https://www.microsoft.com/en-us/research/wp-
           | content/uploads/...
        
         | jandrese wrote:
         | You could also add that they all do 2s complement math.
         | 
         | This is what the OpenBSD team did to OpenSSL. If the code has
         | some complexity that is only necessary because it might have
         | been run on a VAX or AIX or early Cray architecture then it is
         | time to excise that complexity. They deleted thousands and
         | thousands of lines of support for architectures that are only
         | seen in museums and landfills today.
        
       | maxlybbert wrote:
       | It may be unusable for "modern" operating systems (you could
       | write an early UNIX clone with just ISO C because early UNIX
       | didn't support much).
       | 
       | But that's basically always been the case. I doubt you could stay
       | within the first ISO C standard and write a modern operating
       | system.
        
         | phicoh wrote:
         | Assuming that an early Unix clone would be written in C plus
         | some amount of assembler. Then what constructs of modern Unix
         | systems make it impossible to in write in the first ISO C
         | standard plus some assembler?
         | 
         | For example, 4.4BSD, early Solaris have just about everything
         | you would expect in a modern operating system. And give the age
         | of those systems, they were written in early versions of the C
         | standard.
        
           | immibis wrote:
           | This is true for any language. An operating system can be
           | written in Java plus some amount of assembler.
        
           | maxlybbert wrote:
           | (Duplicating an answer to a similar question):
           | 
           | The C standard has notes about a freestanding implementation,
           | mainly for developing kernels. The requirements C has for
           | freestanding environments are very limited ( http://www.open-
           | std.org/jtc1/sc22/wg14/www/docs/n1570.pdf , you get float.h,
           | iso646.h, limits.h, stdalign.h, stdarg.h, stdbool.h,
           | stddef.h, stdint.h, and stdnoreturn.h) and the starting point
           | for the program is implementation defined; you might get
           | _Atomic, but that's optional). For the record, C++ goes a
           | little overboard and promises all sorts of things in its
           | freestanding environment.
           | 
           | https://groups.google.com/g/comp.std.c/c/zzyii-
           | DlMiU/m/RnH5i...
        
           | ErikCorry wrote:
           | Before 1985 there was no C standard, and until 1989 it was
           | only a draft. So I doubt anything was written in standard C
           | in the 1970s or early 1980s.
        
             | phicoh wrote:
             | C89 is mostly a superset of K&R C. So if you can write an
             | OS in K&R you can write it is C89.
             | 
             | The issue was, do modern operating systems require newer
             | versions of the C standard? Or can you write everything in
             | C89.
        
               | ErikCorry wrote:
               | K&R wasn't standardized either though.
        
               | pjmlp wrote:
               | Inline assembly is not part of C89, so a pure C89
               | compiler needs to rely on an external Assembler for
               | systems programming for example.
               | 
               | Same applies to using C in hardware that lacked memory
               | mapped IO.
        
         | andyjohnson0 wrote:
         | > I doubt you could stay within the first ISO C standard and
         | write a modern operating system.
         | 
         | Do you take that view because of some intrinsic deficiency in
         | the language, or the difficulty of memory safety, or something
         | else?
        
           | maxlybbert wrote:
           | The C standard has notes about a freestanding implementation,
           | mainly for developing kernels. The requirements C has for
           | freestanding environments are very limited ( http://www.open-
           | std.org/jtc1/sc22/wg14/www/docs/n1570.pdf , you get float.h,
           | iso646.h, limits.h, stdalign.h, stdarg.h, stdbool.h,
           | stddef.h, stdint.h, and stdnoreturn.h) and the starting point
           | for the program is implementation defined; you might get
           | _Atomic, but that's optional). For the record, C++ goes a
           | little overboard and promises all sorts of things in its
           | freestanding environment.
           | 
           | https://groups.google.com/g/comp.std.c/c/zzyii-
           | DlMiU/m/RnH5i...
        
       | phicoh wrote:
       | One thing that is not mentioned in the article, is that next to
       | undefined behavior, there is also implementation defined
       | behavior.
       | 
       | For example, if signed integer overflow would be implementation
       | defined behavior, then any weirdness would be limited to just the
       | integer operation that overflows.
       | 
       | Lots of other stuff can be expressed as implementation defined
       | behavior. That would probably kill some optimizations.
       | 
       | So the question is more, do we want a portable assembler? In that
       | case as many C constructs as possible need have defined behavior.
       | Either defined by the standard or as part of the compiler
       | documentation.
       | 
       | Another possibily is to have standards for C on x86, amd64, arm,
       | etc. Then we can strictly define signed integer overflow, etc.
       | And say that on x86, pointers don't have alignment, so a pointer
       | that points to storage of suitable size can be used to stored an
       | object of different type, etc.
       | 
       | If the goal is to run SPEC as fast as possible, then making sure
       | every program trigger undefined behavior is the way to go.
        
         | wruza wrote:
         | I have a dumb question. Why can "we" write pretty good apps in
         | languages other than C, but can't write operating systems? Is
         | talking to hardware so much different than talking to APIs?
         | 
         | Another point of view on the same question: Looking at software
         | and hardware, the latter evolved insanely, but the former
         | didn't get seemingly faster, at least in userlands. Why bother
         | with UB-related optimizations at all for a wide spectrum of
         | software? Is there even software which benefits from -O3 _and_
         | doesn't use vectorization intrinsics? Why can't "we" just
         | hardcode jpeg, etc for few platforms? Is that really easier to
         | maintain opposed to maintaining never ending sources of UB?
         | 
         | Iow, why e.g. my serial port or ata or network driver has to be
         | implemented in C, if data mostly ends up in stream.on('data',
         | callback) anyway?
        
           | _3u10 wrote:
           | Yes, C provides a way to talk to ABIs, in addition to APIs.
           | It's not just "talking to hardware" it's talking to other
           | software in a reliable way, such that you can upgrade your C
           | compiler and have code written in C89 talk to code written in
           | C11 which is unheard of in most of the other languages that
           | don't support an ABI. (Think Python2 being incompatible with
           | Python3)
           | 
           | Software has gotten much faster. Yes, almost all software
           | benefits from -O3. What do you mean "hardcode"? as far as I
           | know libjpeg can be linked statically...
           | 
           | UB is easy to maintain, lets take integer addition &
           | overflow, you just issue an ADD instruction and however that
           | CPU executes the ADD instruction is how integer overflow
           | works on that platform and then in the C standard you write
           | "integer overflow is undefined behavior".
        
           | gpderetta wrote:
           | A lot of C++ codebases benefit greatly by O3 due to the more
           | aggressive inlining and interprocedural optimizations.
           | 
           | Also may UB exploiting things like strict aliasing are
           | enabled by default at all optimization levels in GCC.
        
           | eatbitseveryday wrote:
           | > Why can "we" write pretty good apps in languages other than
           | C, but can't write operating systems? Is talking to hardware
           | so much different than talking to APIs?
           | 
           | Operating systems are written in other languages, such as C++
           | and Rust.
           | 
           | One requirement is that a language must be compiled and thus
           | cannot rely on a runtime. That excludes Go and Java.
           | 
           | The language needs to support direct memory manipulation.
           | 
           | The compiled binary cannot be emitting system calls since
           | that binary IS the kernel. Thus the compiler must be told to
           | not link or include standard libraries.
           | 
           | You need to disable certain optimizations like advanced
           | vectorization extensions and red zone use on the stack.
           | 
           | There are others. Lots of specific control needed.
        
             | ArbixMenix wrote:
             | >One requirement is that a language must be compiled and
             | thus cannot rely on a runtime. That excludes Go and Java.
             | 
             | Maybe I'm wrong, but I know that there exist CPUs made
             | specifically to natively execute Java bytecode, so in
             | reality if the hardware has a baked-in language
             | interpretation it would be actually possible to write an OS
             | completely in Java
        
               | gpderetta wrote:
               | running java bytecode natively is neither necessary nor
               | sufficient as you can compile java to any other native
               | ISA, but you do still a relatively heavy runtime for GC.
               | 
               | Having said that, there have been OSs written in
               | languages with heavy runtimes, even GC.
        
               | pjc50 wrote:
               | ARM "Jazelle" was capable of this, but it required a C
               | implementation of a JVM. Any GC-dependant language has
               | this problem.
        
               | pjmlp wrote:
               | Please find the C code here,
               | 
               | https://people.inf.ethz.ch/wirth/ProjectOberon/Sources/Ke
               | rne...
        
               | bluGill wrote:
               | True, you can design a CPU for anything. However a OS
               | that depends on such a CPU is not portable to anything
               | else, and can't easily run most programs that people
               | depend on (emulators are possible, but not easy). Also
               | most CPU advances haven't gone into such a thing and it
               | is tricky to apply those advances while also providing
               | what the language needs. None of this is impossible, but
               | it makes such CPUs in todays world of questionable value.
               | 
               | Note that you can port any OS written in C to such a CPU
               | with "just" a new compiler backend and a few drivers.
               | Your OS won't take advantage of the features the CPU
               | provides, but it will work.
        
               | 3836293648 wrote:
               | Eh, can you really properly implement a CPU without
               | interrupts? I wouldn't categorise anything in that space
               | as a driver
        
               | bluGill wrote:
               | Good point. I assumed there was some form of interrupt
               | system, but not all CPUs need to have it, and lacking
               | that your OS choices will be limited.
        
             | pjmlp wrote:
             | > That excludes Go and Java.
             | 
             | Only for those that cargo cult against using them.
             | 
             | https://www.f-secure.com/en/consulting/foundry/usb-armory
             | 
             | https://developer.arm.com/solutions/internet-of-
             | things/langu...
             | 
             | https://www.ptc.com/en/products/developer-tools/perc
             | 
             | https://www.aicas.com/wp/products-services/jamaicavm/
        
               | eatbitseveryday wrote:
               | I was answering the question in a general sense for the
               | more prolific operating systems and on generic commonly
               | available general-purpose processors.
               | 
               | Yes one can implement a CPU that natively executes a
               | runtime for a high-level language, make your own ASIC, or
               | FPGA, etc. that does this. That is a more advanced
               | response to the general question.
               | 
               | Knowing the detailed points I mentioned will help
               | understand why specialization of processors is needed to
               | support other higher-level languages that do not meet the
               | requirements I laid out.
        
               | pjmlp wrote:
               | Which just proves your lack of knowledge that those
               | runtimes target generic commonly available general-
               | purpose processors.
               | 
               | None of those products use FPGAs or custom ASICs.
        
               | eatbitseveryday wrote:
               | > just proves your lack of knowledge
               | 
               | Tone is not needed.
               | 
               | For TamaGo, it seems to allow developers run their
               | application, not build an OS on the hardware. But I have
               | not played with it, you are right.
               | 
               | > TamaGo is a framework that enables compilation and
               | execution of unencumbered Go applications on bare metal
               | 
               | The environment does not seem to allow building a generic
               | operating system [1]. F-Secure ported the runtime itself
               | to boot natively. But please correct me.
               | 
               | > There is no thread support
               | 
               | The environment you run in is specifically curated for Go
               | applications, such as the memory layout. I'd call this an
               | "appliance" rather than enabling Go to be used for full-
               | fledged generic operating system implementations.
               | 
               | [1] https://github.com/f-secure-
               | foundry/tamago/wiki/Internals
        
               | pjmlp wrote:
               | Tone follows the last paragraph, returning ball.
               | 
               | An OS is an OS, regardless of the userspace.
        
               | MauranKilom wrote:
               | https://xkcd.com/801/
        
             | matthewaveryusa wrote:
             | This is why rust is so exciting: it's the first new
             | language that's graduated from toy-language space we've
             | seen in a while without a runtime. (python, ruby, go and
             | typescript-nodejs are the other graduates I'm thinking
             | about.)
        
           | phicoh wrote:
           | Let's assume you want write most of the operating system in
           | the high level language and as little as possible in
           | assembler.
           | 
           | For most languages, writing hardware trap handler becomes
           | quite a bit of an issue. In trap handler you cannot rely on
           | an extensive runtime system. Anything that does garbage
           | collection is probably out. Anything that does dynamic memory
           | allocation is out as well.
           | 
           | Beyond that, how easy is it to create pointers, create
           | datastructures that match a specific memory layout, etc. Low
           | level device drivers need to talk to hardware in very
           | specific ways. If it is hard to talk to the hardware, most
           | people are probably not going to bother using that language
           | for an operating system.
           | 
           | In theory you could mix and match languages in a kernel. For
           | example, a filesystem could be written in a language that has
           | an extensive runtime system.
        
             | nine_k wrote:
             | I'd say that Rust (and, to a smaller extent, Zig, and,
             | AFAICT, Ada) allow to write code that is guaranteed to not
             | allocate, and define the exact memory layout of certain
             | structures, all while offering much tighter protections
             | than C.
             | 
             | Of course, there are things that cannot be expressed in
             | safe code in either language. But marking fragments of code
             | as unsafe, where C-like unconstrained access is allowed,
             | helps a lot to minimize such areas and make them explicit.
             | 
             | There is definitely room for a more expressive and safe
             | languages in the kernel-level space. We can look at Google
             | Fuchsia or maybe at Redox OS, both are very real operating
             | systems trying to use safer languages, with some success.
        
               | int_19h wrote:
               | Ada still has plenty of UB; it's just that the cases
               | where it arises are usually more explicit.
        
               | zdragnar wrote:
               | I think stability plays a big role in C continuing to
               | remain dominant. Rust and Zig arent there yet, and wrt
               | Rust in particular the ownership model doesn't play
               | nearly as nicely in non-deterministic environments
               | (taking far more code to deal with hardware such as an
               | external display or printer that might, for example, get
               | randomly unplugged at any point in time works against the
               | grain of a static memory ownership analysis).
        
               | nine_k wrote:
               | I'd say it's a good example why more static checks like
               | lifetimes are useful.
               | 
               | With them, you can at least tell apart data structures
               | that are fleeting and can disappear when a cable is
               | ejected, and those which should stay put. This likely
               | might help avoid another double-free or another freed
               | pointer dereference.
        
             | bell-cot wrote:
             | If anything in the kernel is written in a language that has
             | an extensive runtime system... Well, extensive runtime
             | systems are pretty reliably resource hungry. And _when_
             | they might suddenly need _which sorts_ of resources tends
             | to be unpredictable.
             | 
             | Vs. the kernel must keep working reliably when resources
             | are running low.
        
               | wruza wrote:
               | But, today linux simply kills any process to free memory.
               | What could prevent a gc (which also serves allocations,
               | not only collects them back) to just do that on an
               | emergency cycle? Destroy a record in a process array and
               | reclaim its pages (of course without doing any
               | allocations on the way, or by using an emergency pool).
               | Or even just reclaim its pages and wait for it to crash
               | on a page fault if you feel lazy.
               | 
               |  _which sorts_
               | 
               | Dynamic languages only do memory-related ops
               | unpredictably, or is it more than that?
        
           | tyingq wrote:
           | I suppose because there aren't many languages that allow you
           | to manipulate arbitrary memory locations and cast portions of
           | that to arbitrary types, and also allow relatively easy
           | inline ASM. Which maybe isn't 100% necessary, but seems to be
           | helpful at an OS level.
        
           | rowls66 wrote:
           | I think that the statement should be refined to say that it
           | is not possible to develop a monolithic kernel based OS in
           | ISO standard C. A monolithic kernel generally relies on
           | passing pointers to memory between components with few
           | restrictions. This can be problematic for some
           | languages/compilers. However a microkernel OS that provides
           | more structure around how data is shared between OS
           | components can support development of many OS components in
           | multiple languages. Even languages requiring significant
           | runtimes like Java or C# could be used for many OS components
           | like file systems, network stacks or device drivers.
           | 
           | Historically, it has been difficult to beat monolithic
           | kernels for performance and efficiency, and through
           | significant effort, monolithic kernel based OS's exist that
           | are reliable enough to be useful. However, the monolithic
           | kernel is not the only OS architecture.
        
             | adamc wrote:
             | While maybe theoretically of interest, it is far afield of
             | the pragmatic considerations that underlie the paper:
             | worked on operating systems in common use TODAY.
        
           | user-the-name wrote:
           | > I have a dumb question. Why can "we" write pretty good apps
           | in languages other than C, but can't write operating systems?
           | Is talking to hardware so much different than talking to
           | APIs?
           | 
           | To some small extent, yes. But I don't think that is the main
           | issue here.
           | 
           | The real issue is that the stakes are much, much higher when
           | implementing an operating system than when writing, say, an
           | image editor. You can live with an occasional crash in a
           | userland app. But the same crash in an operating system may
           | open the door to taking over the entire computer, possibly
           | even remotely.
        
           | pjc50 wrote:
           | There are some other candidates, as expressed below, but
           | really the main problem is how difficult it is to write _and
           | deploy_ an operating system that 's of usable capability.
           | Even just hitting enough of POSIX to get a GUI and a browser
           | up is a pretty huge amount of work.
           | 
           | How many operating systems do we use that are less than 20
           | years old?
        
           | immibis wrote:
           | It boils down to abstractions papering over ABI details.
           | 
           | How do you write "put 0x12345678 to register 0x04000001" in
           | assembler? mov eax, 0x04000001 / mov [eax], 0x12345678
           | 
           | How do you write it in C-as-portable-assembler? You write
           | _(u32_ )0x04000001 = 0x12345678;
           | 
           | How do you write it in Java? You can't, the language has no
           | such ability and if you try it's a syntax error. You have to
           | call into a routine written in a lower-level language.
           | 
           | How do you write it in C-as-abstract-machine? You can't, the
           | language has no such ability and if you try it's undefined
           | behaviour. You have to call into a routine written in a
           | lower-level language.
           | 
           | By the way, you can't write an operating system in C-as-
           | portable-assembler either. No access to I/O port space, no
           | way to define the headers for the bootloader, no way to
           | execute instructions like LGDT and LIDT, no way to get the
           | right function prologues and epilogues for interrupt handlers
           | and system calls, no way to _invoke_ system calls. All those
           | things are usually written in assembler. Writing operating
           | systems in C has always been a lie. Conversely, you can
           | extend the compiler to add support for those things and then
           | you can write an operating system in extended-C!
        
             | wruza wrote:
             | This addresses a part of my question, which I didn't make
             | clear, thanks! I mean after all this assembler stuff one
             | could just use BASIC or similar. Yes, Java has no concept
             | of PEEK/POKE, IN/OUT, but it just wasn't designed for that.
             | Meanwhile, 1980s small systems were all assembly +
             | basic/fortran. Of course they had no kernel in a modern
             | sense, but all the devices were there: a speaker (SOUND), a
             | serial line to a streamer/recorder (BLOAD), a graphics
             | controller, no dma though, but it's just a controller with
             | the same "ports" as well, which can r/w memory and generate
             | interrupts. I don't get it why we don't just skip C to
             | something high-level after wrapping all this
             | pio/dma/irq/gdt/cr3 stuff into __cdecl/__stdcall format and
             | then use ffi of a language which would decide to support
             | that. I also don't understand GC arguments down the thread,
             | because GC over malloc seems to be just a synthetic detail.
             | You definitely can implement GC over a linear address
             | space, just bump alloc it until the limit, or page-table
             | however you want for dma. Malloc is not hardware, it isn't
             | even a kernel thing. Apps run on mmap and brk, which are
             | similar to what kernels have hardware-wise. Mmap is
             | basically a thin layer over paging and/or dma.
             | 
             | It was so easy and then blasted into something unbelievably
             | complex in just few years. Maybe 80386 wasn't a good place
             | to run typescript-over-asm kernel, but do we still have
             | this limitation today?
        
           | tenebrisalietum wrote:
           | > Is talking to hardware so much different than talking to
           | APIs?
           | 
           | It depends. If your hardware is behind a bus or controller
           | device that's serviced by a separate driver, then you're
           | using APIs of that bus/controller driver.
           | 
           | But think of having to talk to the TCP/IP stack using system
           | calls - you are using an API but you'll still need to have
           | some structure just beyond moving data back and forth over a
           | bus. A USB mass storage driver is going to need different
           | data moving over the USB interface than a USB network
           | interface driver.
           | 
           | Different buses work differently as well - USB device
           | addressing is different than SATA or PCI-E device addressing.
           | 
           | If you are really talking directly to a device, you're
           | manipulating registers, bits, ports, etc. You may have to
           | involve IRQs, etc. Your serial port, for example, can hold 16
           | bytes before it generates an IRQ to tell the CPU it has data
           | if it's a 16550 I think. Your SATA interface doesn't work
           | like that, it can actually DMA data directly to memory. But
           | both of these could be streamable devices to an operating
           | system.
        
           | rocqua wrote:
           | I would guess that the big difference between an app and an
           | OS is that the OS needs to do more complicated things with
           | memory addresses.
           | 
           | An app that runs has its own nicely mapped address space. And
           | it interfaces with devices through system calls. An operating
           | system has to keep the actual addresses of everything in
           | mind, and it usually has to talk to devices through virtual
           | addresses.
           | 
           | As an example of what I think might be the problem. If the OS
           | wants to read data from a device, it might allocate a buffer,
           | wait for the device to write into that buffer, and then later
           | read it. For the compiler, that is essentially "reading
           | uninitialized memory" and thus undefined behavior.
        
             | nine_k wrote:
             | The example works because the compiler has no way to know
             | that the programmer _intends_ the memory to be filled by
             | e.g. a DMA transfer from a device.
             | 
             | If a programmer could communicate this idea to the
             | compiler, it would be somehow safer to write such code.
             | There is a big difference between _intentionally_ reading
             | what looks like initialized memory, and doing so by an
             | oversight.
        
               | rocqua wrote:
               | It's not so much about 'intent'. The spec simply says
               | this operation is undefined behavior. You could have a
               | compiler that you could somehow inform "please just
               | define this behavior as reading whatever is in memory
               | there". But that supports the original point of the
               | article, that plain ISO C is not suitable for OS
               | programming.
        
           | GoblinSlayer wrote:
           | There are operating systems written in other languages than
           | C.
           | 
           | A driver doesn't need to be implemented in C, but the kernel
           | API is likely written in C, your code needs to talk to it
           | somehow. If your driver is written in C, it's as simple as
           | #include.
        
             | wruza wrote:
             | Isn't this sort of circular? If very low-level, I mean
             | in/out, lgdt, etc, were exposed as v8-bare-metal modules,
             | it would be as simple as require() then.                 t
             | = require("awesome-8253")       p = require("pio-node")
             | // cast your usual 0x42, 0x43, 0x61 spells
        
         | ErikCorry wrote:
         | In theory the difference between undefined behaviour and
         | implementation defined behaviour is that ID behaviour must be
         | documented. In practice good luck finding that documentation
         | for each CPU and compiler combination. In fact good luck just
         | finding it for LLVM and x64.
        
           | josephcsible wrote:
           | No, that's the difference between _unspecified_ and
           | implementation defined behavior.
        
             | ErikCorry wrote:
             | If you want to be pedantic about it then I guess Clang and
             | GCC don't implement the standards since they treat
             | implementation defined as unspecified.
        
             | astrobe_ wrote:
             | This _unspecifed behavior_ is perhaps a bit lesser-known
             | than UB, here is a (hopefully non-null ;-) pointer:
             | 
             | https://stackoverflow.com/questions/18420753/unspecified-
             | und...
        
           | palotasb wrote:
           | Undefined behavior entails _" there are no restrictions on
           | the behavior of the program",_ meaning anything can happen,
           | including executing the opposite of the program text.
           | Implementation defined behavior is saner in the sense that
           | program behavior is still defined. Examples of the latter are
           | the exact type of std::size_t or the number of bits in a
           | byte: https://en.cppreference.com/w/cpp/language/ub
        
             | immibis wrote:
             | The linked reference page does not say that implementation-
             | defined behaviour must be sensible, only that it must be
             | defined. Contrast with _unspecified behaviour_ where  "
             | _Each unspecified behavior results in one of a set of valid
             | results._ "
             | 
             | I expect that most instances of implementation-defined
             | behaviour come with additional rules which state that the
             | implementation has to define something sensible.
        
         | bluecalm wrote:
         | I don't think making it defined would help much. Overflowing a
         | signed integer is a bug in logic. It would be ideal to have a
         | crash on that. Continuing is going to be bad one way or another
         | unless you luck out with your buggy code so the way the
         | implementation works saves you. It can't be relied upon in
         | general case though.
         | 
         | Imo the way is to develop more tools that detect (either by
         | analysis or at runtime) those bugs and run the code with those
         | attached as often as you can afford it (to take the performance
         | penalty).
        
           | GoblinSlayer wrote:
           | "Can't be relied upon in general case" is implementation
           | defined behavior, not undefined. UB is much worse than what
           | you think, it's not merely can't be relied, but the program
           | can launch nuclear rockets when it happens. Preventing such
           | interpretations is very helpful.
        
             | bluecalm wrote:
             | I meant that you can't rely on platform/implementation
             | specific behavior to save you. It's the worst of both
             | worlds: you don't get performance benefits of UB and you
             | introduce a disaster waiting to happen once your code runs
             | on another platform or is compiled with another compiler.
             | 
             | I know what UB is. I think the idea is brilliant and saves
             | millions of dollars of burnt coal every day. Sometimes
             | security matters more and then you compile your code with
             | every flag/sanitizer you can find to exchange performance
             | for security.
        
           | cle wrote:
           | Overflowing a signed integer is not always a bug in logic, if
           | you know the underlying representation then purposefully
           | overflowing can be pretty useful.
        
             | gpderetta wrote:
             | The problem is that the vast majority of overflows are
             | indeed logic errors. If the behaviour were defined, I
             | wouldn't be able to use ubsan to catch them.
        
               | MaxBarraclough wrote:
               | > the vast majority of overflows are indeed logic errors
               | 
               | This topic has turned up before. [0]
               | 
               | I think C# gets this right. Ordinarily it handles integer
               | overflow by throwing an exception, but it has an
               | _unchecked_ keyword which gives you wrapping. [1] If you
               | 're writing code that is expected to wrap, you use the
               | _unchecked_ keyword, and you carry on using the usual
               | arithmetic operators. (I believe you can also instruct
               | the C# compiler to default to unchecked behaviour, so
               | there 's also a _checked_ keyword. This strikes me as a
               | mistake.)
               | 
               | Strictly speaking Java gives you the option of checked vs
               | unchecked integer arithmetic, but with terrible
               | ergonomics: the '+' operator always silently wraps, if
               | you want throw-on-overflow behaviour you have to call a
               | method. [2] This is of course so unsightly that Java
               | programmers tend to stick with the infix arithmetic
               | operators regardless of the wrapping behaviour.
               | 
               | C++ has templates and operator overloading, so you can
               | use a library to get signed arithmetic to wrap, or throw,
               | without undefined behaviour. [3] Such libraries are very
               | rarely used, though.
               | 
               | See also this very good blog post by John Regehr, who
               | specialises in this kind of thing. [4] To quote the post:
               | 
               | > _Java-style wrapping integers should never be the
               | default, this is arguably even worse than C and C++'s UB-
               | on-overflow which at least permits an implementation to
               | trap._
               | 
               | [0] https://news.ycombinator.com/item?id=26538606
               | 
               | [1] https://docs.microsoft.com/en-
               | us/dotnet/csharp/language-refe...
               | 
               | [2] https://docs.oracle.com/en/java/javase/17/docs/api/ja
               | va.base...
               | 
               | [3] https://www.boost.org/doc/libs/1_78_0/libs/safe_numer
               | ics/doc...
               | 
               | [4] https://blog.regehr.org/archives/1401
        
               | int_19h wrote:
               | "unchecked" is the default in C#, and "checked" is opt-
               | in, except for compile-time expressions (and
               | System.Decimal, which always throws on overflow).
               | 
               | You can tell the compiler to use "checked" by default for
               | a given project, but it's fairly rare to see that in
               | practice.
               | 
               | I wish it was the other way around, but I guess they
               | didn't consider the overhead of "checked" acceptable as a
               | default back in 1999; and now it's a back-compat issue.
        
               | pjmlp wrote:
               | It is being discussed to change the defaults on .NET 7.
        
           | phicoh wrote:
           | That's the thing. When C is portable assembler, you expect
           | signed integer overflow to be the same as in assembler. On
           | x86 you don't expect a trap.
           | 
           | There are quite a few idioms where overflow is used
           | intentionally. There is no reason to turn that into undefined
           | behavior if it works fine on the underlying platform.
        
             | bluecalm wrote:
             | I don't expect C to be portable assembler. I expect it to
             | be a simple fast language with clear rules. I am not sure
             | what you mean by an idiom here. I suppose you mean
             | assembler idiom as in C it was always a simple bug in
             | logic. Obviously you can't just use idioms from one
             | language in another without checking what the rules in the
             | other language are.
             | 
             | Platform specific behavior should be as rare as possible.
             | It's a recipe for bugs. The rule is simple enough and major
             | compilers have flags to prevent the overflow. Obviously you
             | pay the performance penalty for using them. It's a choice
             | you're free to make as it should be.
        
           | rini17 wrote:
           | So you're fine with throwing your hands up "its unreliable,
           | its a bug that should never happen" when standard lacks a
           | clear recipe how to check beforehand whether signed integer
           | arithmetic will overflow? Everyone rolls their own,
           | introducing even more bugs.
        
             | bluecalm wrote:
             | Well, I prefer to have standard tools to check or a way to
             | compile so it traps on the overflow. Majors compilers
             | provide that if you value security over performance and are
             | not sure about correctness of your logic.
        
       | ErikCorry wrote:
       | Similarly, standard C++ is not usable to implement virtual
       | machines like Hotspot, and yet all virtual machines are
       | implemented using C++ compilers.
        
       ___________________________________________________________________
       (page generated 2022-01-21 23:01 UTC)