[HN Gopher] Pointers Are Complicated III, or: Pointer-integer ca...
       ___________________________________________________________________
        
       Pointers Are Complicated III, or: Pointer-integer casts exposed
        
       Author : lukastyrychtr
       Score  : 115 points
       Date   : 2022-04-15 10:01 UTC (13 hours ago)
        
 (HTM) web link (www.ralfj.de)
 (TXT) w3m dump (www.ralfj.de)
        
       | rocqua wrote:
       | I am misunderstanding something about `restrict`.
       | 
       | > the original program already had Undefined Behavior, or (at
       | least) one of the optimizations is wrong.
       | 
       | As far as I know, since the uwu function was declared with x and
       | y as restrict, the way uwu is called in main is undefined
       | behavior. Because they are both pointing into the same array, and
       | are both 'derived' from the same array.
       | 
       | I guess if I am wrong its because `restrict` does not care if
       | both are derived from the same pointer. Instead it might only
       | care if x was derived directly from y or y was derived directly
       | from x. Is there a good reason to only care about this 'directly
       | derived from' rather than 'shared derivation'? It would sorta
       | suck if you can't use a function with to restrict arguments
       | pointing into the same array, but this example seems to suggest
       | that might be a reasonable requirement.
        
         | taintegral wrote:
         | `restrict` pointers have nothing to do with the underlying
         | "object" they point into (an array in this case). `restrict`
         | lets the compiler assume that reads through a restricted
         | pointer or any pointer expressions derived from it are only
         | affected by writes through that pointer or expressions derived
         | from it. There are only two writes in this example: `*x = 0`,
         | which is trivially correct; and `*ptr = 1`, where `ptr` is
         | derived from `x` (via `ptr = (int *)(uintptr_t)x`) so this is
         | also correct. However, it's now easy to see that the
         | optimization replacing `xaddr` with `y2addr` causes undefined
         | behavior since it changes `ptr` to be derived from `y`. The
         | post addresses this in "The blame game" and mentions that
         | integers could carry provenance but that it's infeasible to
         | actually do this.
         | 
         | The weak provenance solution is to ban optimizing the pointer
         | to integer casts since they have a (conceptual) side-effect.
         | The strict provenance proposal points out that the side effect
         | is only observable if the integer is cast back to a pointer, so
         | we can keep optimizing pointer to integer casts and instead ban
         | integer to pointer casts. For example, an operation like `(int
         | *)xaddr` is banned under strict provenance. Instead, we provide
         | a casting operation that includes the pointer to assume the
         | provenance of; something like `(int * with x)xaddr`. With this
         | new provenance-preserving cast, we can see that the final
         | optimization of replacing `*x` with its previously assigned
         | value of `0` is no longer possible because the code in between
         | involves `x`.
        
           | camgunz wrote:
           | > However, it's now easy to see that the optimization
           | replacing `xaddr` with `y2addr` causes undefined behavior
           | since it changes `ptr` to be derived from `y`.
           | 
           | Yeah this article is great and the framing is pretty perfect.
           | It really shows that optimization passes can't remove
           | information, else they run the risk of tricking later passes.
           | I definitely agree with OP that "the incorrect optimization
           | is the one that removed xaddr"; that optimization seems
           | _wild_ to me. You only know y is x + 1 because of the way it
           | 's constructed in the calling function (main). So the
           | compiler... inlines an optimized version of the function that
           | removes most use of x? Isn't that optimizer fundamentally
           | broken? Especially in a language with `volatile` and
           | `restrict`?
        
         | dthul wrote:
         | Indeed I don't think C cares about how the pointers you mark as
         | "restrict" were constructed, it just cares about how you
         | actually use the pointers and in the example code they are
         | never used in an "overlapping" way.
         | 
         | I just checked cppreference and it says: "if some object that
         | is accessible through P [marked restrict] (directly or
         | indirectly) is modified, by any means, then all accesses to
         | that object (both reads and writes) in that block must occur
         | through P (directly or indirectly), otherwise the behavior is
         | undefined"
         | 
         | The only accesses I can see are "*x = 0;", "*ptr = 1;" and
         | "return *x;". "*ptr" is an indirect access through "x" by way
         | of an integer round trip. If the orignal program is indeed UB
         | that would mean that pointers built from an integer round trip
         | are not considered (indirect) accesses through the original
         | pointer.
         | 
         | The author assumes that it _is_ an indirect access:  "However,
         | the only possibly suspicious part of the original program is a
         | pointer-integer-pointer round-trip - and if casting integers to
         | pointers is allowed, surely that must work. I will, for the
         | rest of this post, assume that replacing x by
         | (int*)(uintptr_t)x is always allowed."
        
         | cryptonector wrote:
         | > As far as I know, since the uwu function was declared with x
         | and y as restrict, the way uwu is called in main is undefined
         | behavior.
         | 
         | In the first example the integer-cast derivative of the second
         | pointer does alias the first but isn't used to access either
         | object, therefore there is (or should be) no UB.
         | 
         | > I guess if I am wrong its because `restrict` ...
         | 
         | The `restrict` keyword was -IIUC- specifically intended to make
         | it possible to write functions like `memcpy()` w/ optimizations
         | that are made possible by knowing that the pointed-to objects
         | do not overlap. The semantics of _that_ are crystal clear in a
         | hand-wavy way, but... very difficult to nail down _exactly_.
         | 
         | With a hand-wavy definition of `restrict` it's pretty clear
         | that the first `uwu()` is perfectly fine and has no UB.
        
       | celeritascelery wrote:
       | There are a few parts of this I still struggle to understand. I
       | don't get why ptr2int casts are a problem if you never try and
       | cast the integer back to a ptr. It seems like int2ptr is the real
       | issue.
       | 
       | Also it's said that casts to int are better then transmutes
       | because cast have side effects. But ptr2int casts don't actually
       | have side effects, they are a No-op on the hardware.
        
         | kevin_thibedeau wrote:
         | > they are a No-op on the hardware.
         | 
         | That is not guaranteed. The only guarantee is that you can
         | round trip conversions via (u)intptr_t. The integer
         | representation of a converted pointer can be completely
         | different to accommodate hardware like the Symbolics lisp
         | machine.
        
           | celeritascelery wrote:
           | Are we losing optimization on x86/arm due to mere existence
           | of other hardware (like symbolics or CHERI) that handles
           | things differently?
        
         | GolDDranks wrote:
         | They are problem in the sense that the address of the pointer
         | gets exposed. After you lose track of who has it and who might
         | do what with it, you can't track the aliasing information of
         | the pointer, so you have to suppress some optimizations. But
         | you are correct in the sense that if int2ptr never happens,
         | it's all good.
         | 
         | About side effects: we are not talking about having side
         | effects on hardware level, we are talking about side effects
         | from the compiler's viewpoint. Again, the compiler might track
         | aliasing information for optimizations, and casting has the
         | side effect of "exposing" the pointer.
        
         | mlindner wrote:
         | I'd assume because it has something to do with the idea is that
         | an optimizing compiler can't completely delete the address
         | entirely anymore if it's being used for "something". For
         | example optimizing it away to a register only variable.
        
         | kmonsen wrote:
         | I mean you are correct, but why else would a pointer be
         | converted to it if not to cast it back at some point? I guess
         | you can print it for debugging, but most other uses means it
         | will be used as pointer at some point.
        
           | rsaxvc wrote:
           | Some hand-vectorized code will do this to compute the number
           | of non-vectorized elements that may exist before and after a
           | suitably aligned region.
        
           | pcwalton wrote:
           | Another reason for converting pointers to integers without
           | ever doing the reverse operation is to hash them.
        
         | kibwen wrote:
         | _> I don't get why ptr2int casts are a problem if you never try
         | and cast the integer back to a ptr._
         | 
         | AFAIK, you do understand. ptr2int casts are totally fine and
         | defined behavior, as long as the program contains no int2ptr
         | casts. Is there a passage from the OP that contradicts this?
        
           | celeritascelery wrote:
           | From the section "Casts have a side-effect":
           | 
           | > But in this case, the operation in question is
           | (uintptr_t)x, which has no side-effect - right? Wrong. This
           | is exactly the key lesson that this example teaches us:
           | casting a pointer to an integer has a side-effect, and that
           | side-effect has to be preserved even if we don't care about
           | the result of the cast. ... We have to lose some
           | optimization, as the example shows. However, the crucial
           | difference to the previous section is that only code which
           | casts pointers to integers is affected.
           | 
           | So even if we never even use the result, casting a pointer to
           | an integer is problematic.
           | 
           | But in the explanation he only talks about the problems of
           | int2ptr cast, which I do undestand.
        
             | comex wrote:
             | The problem is that, if we assume that integers don't have
             | provenance, some far distant part of the code could guess
             | the integer and do an int2ptr. If you can prove that
             | nothing in the entire program could possibly do this for
             | the entire lifetime of the original object, then sure, you
             | could remove the ptr2int. But compiler optimizations
             | usually work one function at a time. In some cases it might
             | be feasible to prove this anyway, like if (a) you have a
             | function that doesn't call any other functions and (b) the
             | object in question is a local variable that will go out of
             | scope at the end of the function, making any further
             | accesses UB regardless. But in most cases it's not
             | feasible.
        
       | flohofwoe wrote:
       | Are there any experimental systems programming languages
       | exploring the opposite of "pointer provenance" (and the resulting
       | compiler complexity)?
       | 
       | E.g. treat memory strictly as some sort of separate storage
       | device, and do all computations in local register-like variables
       | (which could be backed by a special invisible scratch space to
       | allow variables to spill from registers to scratch space.
       | 
       | Basically a language which would require explicit load/stores
       | from and to regular memory, instead of trying to provide the
       | illusion of an infinitely large register bank.
        
         | saagarjha wrote:
         | Are you looking for a language that does function local,
         | bounded optimization, and dumps state directly to memory at
         | various well-defined boundaries?
        
           | flohofwoe wrote:
           | I think more like a medium level language somewhere between
           | assembly and C, with a clear distinction between a "virtual
           | register bank" and "mass storage memory". Moving data between
           | the two would be an explicit operation, not something the
           | compiler would ever do automatically.
        
             | saagarjha wrote:
             | So, one way you can do this is to mark all your "memory"
             | pointers as volatile, then load them into register
             | variables and store them back manually. This would actually
             | allow for very aggressive optimizations in the region that
             | you've fenced off with "register" since the compiler can
             | assume there is no aliasing, while letting you define the
             | boundary where you'd like to writes to "go to the
             | hardware". In C this might be a bit of boilerplate but in
             | C++ once could assume you could RAII the boilerplate
             | away...might be worth exploring.
        
         | dmytrish wrote:
         | Assembly languages sound like what you describe, and they do
         | not make the burden of memory management go away, just shift
         | it.
         | 
         | You still have to manage a separate storage device somehow
         | (most often via file systems drivers). That's why GCs are so
         | popular: you have a smart "driver" between your code and a RAM
         | storage that makes life so much easier.
        
           | flohofwoe wrote:
           | It's not so much about removing the burden or making life
           | easier, but instead making manual optimization more
           | straightforward and predictable. Currently, manual
           | optimization often means appeasing or even fighting the
           | compiler's optimizer passes, with small innocent changes
           | completely changing the output of the compiler.
        
             | dmytrish wrote:
             | Now that I think about exposing cache hierarchy in a
             | programming language, that makes much more sense. One can
             | imagine a programming language with explicit control over
             | CPU caches and their hierarchy. Also, this could make
             | GPU/CUDA programming more explicit, safe and efficient.
             | 
             | Still, this requires the hardware to cooperate with
             | software instead of pretending that random access memory is
             | truly random access. This functionality just isn't there at
             | this point.
             | 
             | Edit: this would require programmable caching hardware to
             | make caching analysis and algorithms introspectable and
             | changeable. For now, fixed caching algorithms implemented
             | in hardware do provide lots of benefits.
        
               | pclmulqdq wrote:
               | DSPs and other application-specific processors expose the
               | cache hierarchy as a set of scratchpads. This works very
               | well for them, but not for any CPU that is shared between
               | applications, like a server.
        
             | lmm wrote:
             | I do wonder if the long-term future of programming looks
             | like https://www.microsoft.com/en-
             | us/research/publication/coq-wor... , and providing
             | libraries to help the programmer optimize themselves rather
             | than black boxes that do it for them.
        
         | birktj wrote:
         | I think LLVM IR (and other IRs) works a bit like this. There
         | you have an infinite set of registers while memory is separate
         | and you need to explicitly load and store between them.
         | Unfortunately it is not very suitable for humans to write.
        
       | Findecanor wrote:
       | IMHO, casts from integer to pointer should have been deprecated
       | in C (and derivative languages) a long time ago.
       | 
       | Not only do they encourage unsafe practices, being forbidden in
       | most C/C++ style guides.
       | 
       | They can also wreak havoc with any hardware or software-hardening
       | that puts non-address bits in pointers. For instance, ARM PAC has
       | been in Apple's CPUs for years now and support is mandatory in
       | ARM v8.3 and up. Intel and AMD have also had experimental
       | extensions that use those bits, and new ones are on the way.
       | 
       | When an integer is cast to pointer in those schemes, the pointer
       | can become invalid. Therefore, an integer cast to a pointer type
       | should _always_ result in an invalid pointer, IMHO.
       | 
       | Integers are sometimes used to set alignment. I'd suggest that
       | the macros in CHERI C be added to <stdalign.h>: _align_up_ ,
       | _align_down_ and _is_aligned_. Those can be expressed in normal C
       | without having to cast an integer to pointer, or use faster
       | expressions depending on the platform.
        
         | saagarjha wrote:
         | On the contrary, code that needs to do integer-to-pointer casts
         | is almost exclusively written in C and C++. Hardware features
         | such as PAC and CHERI were designed to work _with_ those
         | languages assuming that integer-to-pointer casts are possible.
         | Any extension that prevents pointer tagging, extracting
         | signatures, and the like is dead on arrival. Most code doesn 't
         | need to touch this kind of stuff, but the whole point of the
         | languages is that it lets you do this when you need to.
        
           | kibwen wrote:
           | The Rust APIs mentioned in the OP support things like pointer
           | tagging without exposing a raw integer address to the user.
           | AIUI CHERI had to bend over backwards to support these
           | operations in C, not because it wanted to but because it had
           | to out of pragmatism. I wager that the CHERI authors would be
           | thrilled if the grandparent's proposal to disallow int2ptr
           | casts were possible to implement without ruling out every
           | significant C codebase in history.
        
         | im3w1l wrote:
         | Well one reason I think is the following pattern in some
         | library.                 struct SomeStruct {         ...
         | void (*callback)(A);         A user_data;       }
         | 
         | Here A will be either int or void*. But regardless of stated
         | type, the expectation is that the user puts whatever they find
         | the most convenient there, whether thats a pointer or integer
         | (fd maybe?).
        
         | flohofwoe wrote:
         | TBH I feel the opposite. "Pointers" should be simple memory
         | addresses again without any compiler magic attached, and memory
         | load/stores should be explicit, with the same level of control
         | as assembly. At the least this would make manual performance
         | optimization more predictable. There are plenty of languages
         | "above" C, no need to move C into that direction even more, but
         | I think the unexplored area "below" C (and above assembly) is
         | much more interesting.
        
           | [deleted]
        
           | kibwen wrote:
           | The problem is that C developers want their code to work as
           | they expect, which is reasonable, and at the same time they
           | want their code to be as fast as possible, which is also
           | reasonable. But in practice these two desires are often at
           | odds. So when C compilers start exploiting more UB in order
           | to make code faster programmers turn around and complain, and
           | then when C turns out to be slower than Fortran due to a lack
           | of UB-exploiting optimizations a different segment of
           | programmers will complain. It's a difficult balancing act for
           | compiler and language developers.
        
             | xscott wrote:
             | I think "restrict" is a really nice compromise in C: Your
             | code declares that you're willing to follow additional
             | rules, and the compiler can make additional optimizations.
             | 
             | The article shows a piece of code that makes that promise
             | and then doesn't hold up its part of the agreement. I can't
             | even follow the rest of the argument from there because
             | it's all based on a faulty foundation.
        
               | cstrahan wrote:
               | > The article shows a piece of code that makes that
               | promise and then doesn't hold up its part of the
               | agreement.
               | 
               | Edit: disregard; I now see your other comments.
               | 
               | Incorrect. The article shows a piece of code that makes
               | and _fulfills_ that promise, and then the optimization
               | passes break the code. If you disagree, I would be
               | interested in hearing how /where you think the code
               | itself is faulty.
        
               | xscott wrote:
               | No, I was sloppy when I read it the first time. The
               | second snippet of code is broken, and I (incorrectly)
               | assumed it was a valid translation from the first
               | snippet. My bad.
               | 
               | The first snippet is subtle, and I'm not a language
               | lawyer, but I can't see anything that screams "contract
               | violation".
        
             | Asooka wrote:
             | I'm starting to think we need a way to allow UB-based
             | optimisations on a per-function or per-block basis, with
             | the rest of the code being compiled with the simplest
             | mental model. It's getting a bit hard to reason about
             | whether your code is actually safe or not, it would be
             | better to compile most code as if all pointers can alias,
             | integers wrap around, accessing nullptr is allowed, etc.
             | (subject to per-platform quirks) Then we can mark the hot
             | functions as candidates for better optimisation. Maybe
             | other than an attribute we could add assertions to the
             | code, e.g. assert_no_overlap(ptr1, size1, ptr2, size2) --
             | assert that the range [ptr1, ptr1+size1) does not overlap
             | [ptr2, ptr2+size2). That could then optionally be compiled
             | into an actual runtime assertion to help catch UB at
             | runtime.
             | 
             | Ultimately the C (and C++) memory model is meant to help us
             | write fast, safe software. Developers must be able to
             | reason about the semantics of their code. Optimising short
             | parts of the program that can be reasonably checked by a
             | person for not invoking UB is probably the right way and
             | will result in fast software with fewer bugs due to
             | programmer error.
             | 
             | Edit: Not sure why people are downvoting this, it's like
             | there's some UB religion out there. Do you people _want_
             | your programs to crash due to the compiler doing surprising
             | things?
        
               | Hello71 wrote:
               | This already exists. gcc has optimization pragmas and
               | function attributes that allow optimization to be set on
               | a per-function level. However, they don't really work as
               | expected, mainly because inlining exists. Example:
               | int foo(void) { return 42; }       int bar(void) { return
               | foo() + foo(); }
               | 
               | If the whole file is compiled as -O0, foo and bar must be
               | compiled as-is. If the whole file is compiled as -O2, bar
               | will be optimized to "return 84;". But what if foo is
               | compiled as -O0 and bar is compiled as -O2? Can bar still
               | be optimized to "return 84;"? What about "return 42+42;"?
               | Can it do "int x = foo(); return x+x;"? What about "int x
               | = foo(); if (x == 42) return 84; else return x+x;"? All
               | of these are permitted in -O2 if gcc thinks it's a good
               | idea, but in our "mixed mode" the semantics are unclear.
               | It might be theoretically possible to come up with a
               | consistent definition; it might start with "functions
               | specified as -O0 cannot be inlined or have other
               | functions inlined into them", but there are still more
               | things to consider, like the behavior of global memory,
               | thread-local storage, floating-point environment, and so
               | on (what if one or more functions is static inline or
               | __attribute__((always_inline))?).
               | 
               | The real solution with current gcc is to simply compile
               | different files ("translation units") with different
               | optimization flags if you want. This has always been
               | supported. Unfortunately, it comes with a potentially
               | significant performance penalty; the whole point of LTO
               | is to avoid this performance penalty, but this once again
               | returns the issue of the semantics of mixing functions
               | with different optimization levels.
        
               | Rusky wrote:
               | > Do you people _want_ your programs to crash due to the
               | compiler doing surprising things?
               | 
               | I obviously can't speak for everyone who defends this
               | sort of compiler behavior, but my preferred solution is
               | for the compiler to continue performing these kinds of
               | optimizations while also providing better static analysis
               | and sanitizer tools to catch problems earlier, even if
               | that involves extending or replacing the language to make
               | that possible.
               | 
               | Expecting everyone who writes C or C++ to do this kind of
               | reasoning in their head and perform these optimizations
               | manually (or drop their performance back toward -O0) just
               | sounds like it would make things worse.
        
               | pjmlp wrote:
               | > Ultimately the C (and C++) memory model is meant to
               | help us write fast, safe software
               | 
               | Fast, maybe. Safe, not really.
        
         | andi999 wrote:
         | How do you do embedded then? I mean you need to specify the
         | addrese of lets say the spi interface. Usually numbers are
         | written as integers. So we would need a syntax for pointer
         | literals?
        
           | addaon wrote:
           | I don't particularly agree with the grandparent that getting
           | rid of pointer-to-integer casts makes sense, but at least the
           | specific question you're asking doesn't seem like a non-
           | starter. Two options that come to mind immediately would be
           | making a union { intptr_t; foo * } well-defined; or adding a
           | 'p' or 'ptr' literal suffix so 0x80004000p has type void *.
        
             | andi999 wrote:
             | Yes type punning is still possiblem, but doesnt this
             | violate the 'spirit' of the original idea (to get rid of
             | conversions between int and *)
        
               | addaon wrote:
               | Type punning in general is undefined (unless one of the
               | types is char * or equivalent); but defining it between
               | intptr_t and pointer types in a few restricted contexts
               | (e.g. for const variables of static storage duration
               | only) would be sufficient to address memory mapped
               | registers and restrictive enough that it should be
               | possible to define clearly.
        
           | [deleted]
        
           | kmonsen wrote:
           | Or language interop? At least in jni it's normal to store a
           | c/c++ pointer in Java as long, and then send it back to c
           | where it has to be converted to a pointer.
        
       | xscott wrote:
       | > [...] restrict, a C keyword to promise that a given pointer x
       | does not alias any other pointer not derived from x
       | 
       | The author clearly states that restrict is a promise, and then
       | immediately violates that promise. If you lie to the compiler,
       | what do you expect to happen?.
       | 
       | Declaring your pointers "restrict" is a promise from your code to
       | the compiler, _not_ from the compiler to your code.
       | 
       | It seems like the entire rest of the article is based on this
       | misunderstanding, and there are a whole bunch of strong
       | statements and questionable conclusions drawn from it:
       | 
       | > This is bad news. Our optimizations changed program behavior.
       | That must not happen!
       | 
       | > the only possibly suspicious part of the original program is a
       | pointer-integer-pointer round-trip
       | 
       | > while the details of defining "derived from" are nasty, it
       | should be clear that doing a bunch of operations that literally
       | don't involve x at all cannot by any stretch of the imagination
       | produce a result that is "derived from" x
       | 
       | It's hard for me to make sense of the rest of the business about
       | "provenance" when the problem it's supposed to solve are based on
       | faulty reasoning. The part about casting to integer having side
       | effects doesn't follow either.
       | 
       | I've always liked C, and I'm coming to really like Rust. I hope
       | mistaken arguments like the ones in this article aren't
       | persuasive to anyone in the compiler camps.
        
         | Cu3PO42 wrote:
         | I don't believe the argument is mistaken at all. Consider the
         | interface of `uwu`. It takes two pointers with the extra
         | promise that they do not alias each other.
         | 
         | When you call that function with two pointers that happen to be
         | adjacent but non-overlapping, that promise is absolutely
         | satisfied.
         | 
         | The problems start with the notion of 'derived from' and
         | pointer arithmetic. Obviously you can derive something from `y`
         | that aliases `x` by subtracting the offset from `y`. But that's
         | always true, and that - I believe - is the author's point. If
         | you view this as 'derived from', you can never satisfy any
         | restrict guarantee on a function whose implementation you have
         | not carefully reviewed.
         | 
         | As for having influence: Ralf Jung has written a PhD thesis on
         | the correctness of Rust (including formal proof that the model
         | is sound), so I'm inclined to believe people are willing to
         | listen to him, and rightly so.
        
           | xscott wrote:
           | > It takes two pointers with the extra promise that they do
           | not alias each other.
           | 
           | ... and then promptly uses pointer arithmetic to create an
           | alias. It violated the promise.
           | 
           | > If you view this as 'derived from', you can never satisfy
           | any restrict guarantee on a function whose implementation you
           | have not carefully reviewed.
           | 
           | This is C, not Rust. You aren't getting proofs without
           | careful review. You're using "restrict" in order to ask the
           | compiler to create a faster function on the promise that it
           | will never be used in a way that the pointers alias. It's a
           | promise that needs to be upheld within the function and for
           | everyone who calls the function.
           | 
           | Think of something like `memcpy` (overlaps not allowed) vs
           | `memmove` (overlaps allowed). The compiler can make the first
           | one faster, and it is your job to uphold the contract.
           | 
           | > As for having influence: Ralf Jung has written a PhD thesis
           | on the correctness of Rust
           | 
           | Appeals to authority aren't motivating to me. I've known and
           | worked with a whole lot of PhDs, and they aren't always
           | right. In fact, frequently the ones I've known are overly
           | confident when they step outside of their area of expertise.
        
             | Cu3PO42 wrote:
             | > ... and then promptly uses pointer arithmetic to create
             | an alias. It violated the promise.
             | 
             | I think this is where a lot can go wrong in terms of
             | understanding.
             | 
             | It was my understanding that the restrict keyword requires
             | a promise of the caller. Specifically, the caller promises
             | that the arguments it passes do not alias. It was my point
             | that the `uwu` function could not violate that promise,
             | because it wasn't the one who made it.
             | 
             | After review of the standard, I'm no longer certain this is
             | the case.
             | 
             | Disregarding this line of reasoning entirely, I still
             | believe the author is correct.
             | 
             | The following is an excerpt from the Standard (or rather
             | the latest available draft):
             | 
             | > An object that is accessed through a restrict-qualified
             | pointer has a special association with that pointer. This
             | association, defined in 6.7.3.1 below, requires that all
             | accesses to that object use, directly or indirectly, the
             | value of that particular pointer
             | 
             | The `uwu` function complies with that requirement. While it
             | does create an aliasing pointer derived from `y`, it never
             | uses it to actually access memory. Hence, it does not
             | violate the requirement imposed by restrict.
             | 
             | The aliasing access derived from `y` (which may well be
             | undefined behavior after all) is only introduced after one
             | of the optimization passes. Thus, this optimization is
             | incorrect and may not be applied.
             | 
             | Edit: of course, there is an aliasing pointer in the
             | original version of `uwu`. This pointer, however, is
             | derived from `x`, thus not violating any guarantees implied
             | by restrict.
        
               | xscott wrote:
               | > Thus, this optimization is incorrect and may not be
               | applied.
               | 
               | You're right, and I was wrong in my initial take on
               | things.
               | 
               | I currently believe the first version of the code is
               | fine, but that the second version is clearly incorrect.
               | There, you'll see it's using an address derived from `y`
               | to modify memory modified through `x`. The translation to
               | the second version should either _not_ be allowed, or it
               | should not continue to declare the arguments as `strict`.
               | 
               | As for the promise that `strict` makes, I interpret to be
               | a promise both about the function and a promise from all
               | callers of the function. In other words, it's a promise
               | from the programmer(s) to the compiler, and nothing more.
               | 
               | For instance, the arguments to `memcpy` were declared as
               | `restrict`. The documentation doesn't allow overlapping
               | ranges, so the implementation would be fine. However,
               | `uwu` could call it with overlapping ranges. It doesn't
               | matter if `uwu` didn't make the promise, it has to uphold
               | it for correct behaviour.
        
               | Cu3PO42 wrote:
               | I believe we're on the same page now. (I assume in the
               | following that where you wrote `strict`, you meant
               | `restrict`.)
               | 
               | I now understand the standard such that it implies that
               | promises made by `restrict` must be upheld in the whole
               | program and distinctions between caller and callee are
               | irrelevant. Both must cooperate such that the promises
               | remain true.
               | 
               | As for the given code for `uwu`, I also believe the first
               | version is allowed and the second is not. Since the
               | second is the result of (hypothetical proposed)
               | optimizations, the latter may not be applied even though
               | they seem fine on the surface.
               | 
               | So we must have a way to determine when optimizations may
               | be applied and imbuing pointer-to-integer casts with side
               | effects certainly fixes this situation. It seems like a
               | sound approach, but I do hope for a formal analysis, of
               | course.
        
               | xscott wrote:
               | > (I assume in the following that where you wrote
               | `strict`, you meant `restrict`.)
               | 
               | Yup, silly typos. Sorry about that.
        
             | dthul wrote:
             | > and then promptly uses pointer arithmetic to create an
             | alias. It violated the promise.
             | 
             | afaik not creating aliased pointers is not a promise you
             | make when marking a pointer as restrict. Only when using
             | that aliased pointer for accesses (in the presence of
             | modifications) can you actually break the promise. To quote
             | cppreference:
             | 
             | "During each execution of a block in which a restricted
             | pointer P is declared (typically each execution of a
             | function body in which P is a function parameter), if some
             | object that is accessible through P (directly or
             | indirectly) is modified, by any means, then all accesses to
             | that object (both reads and writes) in that block must
             | occur through P (directly or indirectly), otherwise the
             | behavior is undefined"
        
               | xscott wrote:
               | > Only in the presence of accesses and modifications can
               | you actually break the promise.
               | 
               | Ok, creating aliases you don't use is probably fine. I
               | should've worded that more carefully.
               | 
               | However, the example code certainly modifies the same
               | piece of memory from two different pointers, and so it
               | broke the promise.
        
               | dthul wrote:
               | Yes, I think it boils down to whether creating a derived
               | pointer from a restrict pointer and then using both is
               | allowed. I just had a look at the C standard and I think
               | it is allowed if the derived pointer is (what the
               | standard calls) "based" on the restrict pointer (page 89,
               | 6.7.3.1 paragraph 3: http://www.open-
               | std.org/jtc1/sc22/wg14/www/docs/n2310.pdf). The standard
               | language is very dense but it makes it sound like the
               | program presented in the article would not uphold this
               | "based" property and so might be UB as you suspected.
        
           | [deleted]
        
         | jynelson wrote:
         | The original pointers `x` and `y` are not aliased. Only `x` and
         | `y2` are aliases. Does the standard not allow you to pass any
         | pointers which _could_ be aliased based on arbitrary pointer
         | arithmetic in the function body? That seems basically
         | impossible to fulfill ...
         | 
         | Note that the use case here is for llvm IR that the Rust
         | compiler generates. It seems very hard in general to know if
         | two mutable references point to items in the same allocation.
        
           | masklinn wrote:
           | > That seems basically impossible to fulfill ...
           | 
           | Any sort of pointer arithmetics in the presence of restrict
           | pointers seems like a giant footgun.
           | 
           | As far as I can read the standard, it does not forbid
           | aliasing `restrict` pointer _as long as these pointers are
           | not used to access anything_. Here y2 aliases with x, but at
           | no point is anything accessed through y2 (whether reading or
           | writing). So it I feel like it respects the letter of the
           | law.
        
           | xscott wrote:
           | The code intentionally uses `y` to access stuff that it also
           | accesses through `x`. Declaring `x` and `y` to be `restrict`
           | was a promise that it wouldn't do that.
           | 
           | > Does the standard not allow you to pass any pointers which
           | could be aliased based on arbitrary pointer arithmetic
           | 
           | With the right hackish arithmetic, _all_ pointers could be
           | aliased with pointer arithmetic in C. Using `restrict` says
           | _your code_ won 't do that so the compiler can optimize more.
           | 
           | > It seems very hard in general to know if two mutable
           | references point to items in the same allocation.
           | 
           | Yeah, and in the limit it's impossible for C. It's not the
           | compiler's job to figure it out, it's the code's job to
           | uphold the promise.
           | 
           | It's different with Rust because you've got mutable/exclusive
           | or immutable/shared promises from the borrow checker. And
           | Rust doesn't let you dereference raw pointers outside of an
           | unsafe block.
           | 
           | It seems to me that mislabeling your pointers `restrict` in C
           | is akin to doing reckless operations inside of `unsafe`
           | blocks in Rust. In both cases, you told the compiler "trust
           | me, I know what I'm doing".
        
             | hvdijk wrote:
             | > With the right hackish arithmetic, all pointers could be
             | aliased with pointer arithmetic in C.
             | 
             | This is not true. The behaviour is undefined in C if
             | pointer arithmetic results in crossing the beginning or end
             | of an object. If we have int a, b;, then the behaviour is
             | undefined if we do &b - &a even if we never use the result
             | to try to reconstruct &b from &a.
             | 
             | > Using `restrict` says your code won't do that so the
             | compiler can optimize more.
             | 
             | `restrict` covers objects modified through one pointer and
             | accessed through another. It does not cover pointers that
             | point to the same object in general; if only one is used to
             | access the object, or if both are used only for reading,
             | all is fine. (It's actually a little bit more complicated
             | than I'm presenting here, but not in a way that's relevant
             | right now.)
        
               | xscott wrote:
               | Seems like we need a C compiler flag like "--acknowledge-
               | reality", because the compilers for nearly 100% of the
               | server, desktop, and mobile computers in the real world
               | have flat address spaces and do allow arbitrary pointer
               | arithmetic.
               | 
               | Yes, it's technically UB. But which operation could _any_
               | C compiler disallow in practice: converting from pointer
               | to integer, arithmetic on integers, or converting from
               | integer to pointer?
        
               | spc476 wrote:
               | I used to program on a platform where you could have two
               | pointers (say, integer pointers) that were not equal to
               | each other, yet a store to one would change the location
               | pointed to by the other pointer. Hint: it was a very
               | popular architecture in the 80s and 90s. Spoiler: The
               | 8088. I'm not saying I want to go back to that era (yes,
               | I do like flat address spaces) but even on modern systems
               | today it's possible to map memory such that two different
               | pointers point to the same physical memory (but yes, you
               | have to go out of your way to do that these days).
               | 
               | If I could, I would disallow integer to pointer---there
               | are (in my opinion) better ways to address a hardware
               | register than to do                   unsigned char *uart
               | = (unsigned char *)0xdeadbeef; /* [1] */
               | 
               | But I don't think it will go away any time soon in C
               | (Over 30+ years later, and we're still a few years out
               | from 1's compliment and signed magnitude integers being
               | deprecated).
               | 
               | [1] Here's one way using NASM:
               | global   uart              absolute 0xdeadbeef
               | uart resb 1
               | 
               | Then in C code:                   extern unsigned char
               | uart;
        
               | xscott wrote:
               | > I used to program on a platform [...]
               | 
               | Yeah, and I know that's why the C standard has so much
               | slop in it. They want to keep the window of conformance
               | open for past or future architectures.
               | 
               | > If I could, I would disallow integer to pointer [...]
               | 
               | It's not just about 0xDeadBeef hardware addresses, and I
               | don't think anyone is talking about breaking assembly
               | language.
               | 
               | However, C has unions and Rust has enums. You want those
               | to share the same piece of memory in the fewest bytes you
               | can get away with. Some interpreters, written in C, even
               | use nan-tagging to store the important 48 bits of a
               | pointer in the 52 bits of payload with double precision
               | floats.
               | 
               | You don't have to like it, and you can discourage your
               | children or coworkers from doing it, but there are
               | legitimate reasons for all of this. The examples look
               | ugly because they're short and stupid.
               | 
               | And nobody has even mentioned that `A[i] == _(A + i) ==_
               | (i + A) == i[A]` is going to continue to be valid in C.
               | https://stackoverflow.com/a/381549
        
               | hvdijk wrote:
               | Sorry to disappoint, but the compilers for nearly 100% of
               | the server, desktop, and mobile computers in the real
               | world do not allow arbitrary pointer arithmetic. They
               | cannot issue an error message for it because that is
               | provably impossible to detect in the general case, but
               | they do optimise on the basis that it does not happen,
               | even if it breaks programs that try to make use of it.
               | Consider this example for GCC:                   int
               | a[2], b[2];         int offset(void) { return b - a; }
               | int check(void) { return &a[1] + offset() == &b[1]; }
               | 
               | The function check() is optimised by GCC to return zero
               | at -O1 optimisation level or higher, because it reasons
               | that no matter how offset() is implemented, either the
               | addition is undefined, or the comparison results in
               | false.
               | 
               | (Note that GCC does this even in a few cases where it is
               | unclear whether the optimisation is valid. The example I
               | provided is slightly more complicated than I would have
               | liked to avoid that issue; the optimisation is definitely
               | valid in this example.)
        
               | xscott wrote:
               | Same compiler and optimization level provides different
               | behavior when you ask it to compile as C vs C++. Gross.
        
               | xscott wrote:
               | I originally responded to your code, and you're right
               | about it. That's not what I expected or would want to
               | happen.
               | 
               | However, your code doesn't actually address my comment
               | about conversions and arithmetic. Where does this code
               | code fail?                   int a[2], b[2];
               | uintptr_t x = (uintptr_t)a;         uintptr_t y =
               | (uintptr_t)b;         uintptr_t offset = y - x;
               | int* p = (int*)(x + offset);         printf("b == p:
               | %d\n", b==p);
               | 
               | I didn't try _every_ compiler on Godbolt, but I didn't
               | see it misbehave anywhere I did try.
               | 
               | edit: changed to uintptr_t
        
               | spc476 wrote:
               | There's nothing in the C standard that I could find that
               | dictates that b[] has to follow directly after a[] in
               | memory. That it works is just happenstance (or an
               | implementation detail). The only place were order is
               | maintained are fields in a structure definition.
        
               | xscott wrote:
               | There's nothing in that example that assumes they are
               | adjacent. It only assumes they are in the same (flat)
               | address space. Put a gigabyte between them if you want.
               | 
               | If I had used `uintptr_t` instead of `ssize_t`, I think
               | it's even compliant as far as wraparound goes.
               | 
               | edit: Note I changed the code to use uintptr_t
        
               | hvdijk wrote:
               | That is integer arithmetic, not pointer arithmetic. My
               | understanding of the C memory model that is mentioned in
               | the article (look for PNVI-ae-udi) is that what you are
               | doing is or will be well-defined. I suspect there may
               | still be a few implementations around where the offset
               | you get from integer subtractions has no obvious relation
               | to the offset you would get from pointer subtractions
               | (for two pointers where subtraction is well-defined), but
               | for your example that makes no difference.
        
               | [deleted]
        
         | dthul wrote:
         | > The author clearly states that restrict is a promise, and
         | then immediately violates that promise
         | 
         | Which part of the program do you believe violates the restrict
         | promise?
         | 
         | I think it's not obvious that it is violated. The question is
         | whether a pointer obtained through an integer round trip is
         | considered to be derived from the original pointer. The author
         | assumes that it is.
        
           | [deleted]
        
           | [deleted]
        
           | xscott wrote:
           | > Which part of the program do you believe violates the
           | restrict promise?
           | 
           | The part where they modify a piece of memory using `x` and
           | then modify the same piece using `y`.
           | 
           | That's a promise the programmer made inside that function and
           | for anyone else who calls the function. This is C, so it's
           | not the compiler's job to prove the implementer or caller got
           | it right.
        
             | dthul wrote:
             | But it's never modified using y?
        
               | xscott wrote:
               | I see your point, but this is really niddly. It
               | explicitly asks if the address in `x` is equal to the
               | address in `y-1`. And if so, it modifies the contents at
               | that address which is equal to both.
               | 
               | If the conclusion was that the C standard could/should
               | tighten up its verbiage for cases like this, I'd agree.
               | But the conclusion is something about adding provenance
               | to integers...
        
         | jcranmer wrote:
         | The one who is mistaken here is you, although I shouldn't be
         | surprised that someone can dismiss what is relatively deep
         | semantics work in compilers on the basis of misunderstanding
         | what's actually going on.
         | 
         | The first fundamental issue is that pointer provenance is an
         | area where people have (historically) dealt with the issue as
         | essentially a requirement of data-dependency: for a dereference
         | ` _p` to access an object x, then p must be at the end of a
         | chain of arithmetic computations where &x is one of the inputs.
         | It turns out, however, that compilers _do not preserve data
         | dependency _, and there is pretty unanimous agreement among
         | compiler writers and specification writers that_ they should
         | not* do so in general.
         | 
         | What this means is that the challenge we have is how to specify
         | something that generally follows the data-dependent rules while
         | still allowing compilers to do their non-data-dependency-
         | preserving optimizations. There is a working group in the C
         | committee that is doing this. The current leading candidate for
         | a specification is based on the idea that you track provenance
         | (i.e., data dependency) via pointers, but _not_ via integers.
         | 
         | This means that now we need to pin down what the actual
         | semantics are where you convert a pointer to an integer or vice
         | versa, and this blog post is part of the effort of trying to
         | pin down sane semantics here.
         | 
         | So, yes, restrict is a promise from the programmer to the
         | compiler... but _what_ is the programmer actually promising?
         | That 's what this blog post is exploring.
        
           | xscott wrote:
           | > The one who is mistaken here is you, although I shouldn't
           | be surprised that someone can dismiss [...]
           | 
           | Yes, I was mistaken in my first read through. I'm taking your
           | indignation with some amusement, but I hope you can be civil
           | going forward.
           | 
           | If you look at the second snippet of code, you can clearly
           | see how it declares the arguments `restrict` and then uses
           | them to access the same piece of memory. That's *wrong* - it
           | violated the promise made by `restrict`.
           | 
           | What I didn't see at first was that the first snippet of code
           | appears to be fine. As far as I can tell, it doesn't break
           | any rules. *So the optimization pass from the first version
           | to the second version is incorrect.* After that, everything
           | that follows is still questionable.
           | 
           | The following passes use the `restrict` promise from the
           | second version to eliminate from the load from `x` and so
           | on...
        
       | xscott wrote:
       | I was wrong elsewhere in this discussion. I now think the problem
       | isn't that the original code breaks any rules/promises, but that
       | the transformation to the second version of the code is not
       | valid.
       | 
       | > So, which of the optimizations is the wrong one?
       | 
       | In the original function, the code compares addresses but only
       | modifies through `x`. But since it's a static function, and since
       | it can see all the call sites, it eliminates a test it can prove
       | is always true. So far so good.
       | 
       | However, it's _not_ valid to keep the `restrict` declarations
       | after that rewrite. It changed a modification through `x` to a
       | modification through `y`, so that clearly violates the promise.
        
         | taintegral wrote:
         | This analysis is correct but the solution is not feasible.
         | Changing a modification through `x` to a modification through
         | `y` does indeed violate the semantics of `restrict`. The
         | problem is that in order to detect this situation, we'd have to
         | track the provenance of integers. In this specific example,
         | we'd have to know that replacing `xaddr` with `y2addr` affects
         | `x` and `y`. There is general consensus that tracking
         | provenance for integers causes many more problems than it
         | solves, so although this would solve the problem it is not
         | feasible. This is why weak and strict provenance are being
         | pursued instead.
        
           | xscott wrote:
           | With regards to optimization passes, does it matter if the
           | analysis is infeasible? We agree the
           | optimization/transformation from the first version to the
           | second isn't valid, so I think it shouldn't have been done.
           | 
           | The original version with two stores and one load doesn't
           | seem to have a problem. Having the optimizer punt when it
           | gets confused by integer to pointer casts seems acceptable.
        
             | taintegral wrote:
             | If you can't track provenance to un-restrict the pointers
             | because it's infeasible, then you have to give up on at
             | least one of the optimization passes. In this case, the
             | optimizations used are very fundamental and giving up on
             | any one of them unilaterally would be catastrophic for
             | performance. The provenance models being suggested add more
             | nuance to the model (pointer provenance) so that we can
             | keep all of the optimization passes while preventing these
             | cases from being optimized incorrectly. Weak provenance
             | says we can't optimize away the pointer to integer cast,
             | strict provenance says we must provide provenance for
             | integer to pointer casts. Weak provenance is broadly
             | compatible with existing code (compiler semantics change)
             | whereas strict provenance is not (language semantics
             | change). The tradeoff is that strict provenance leads to
             | better optimization in general.
        
               | xscott wrote:
               | Catastrophic sounds strong. As far as `restrict` goes, C
               | was never that far behind Fortran in performance.
               | 
               | And if maintaining `restrict` for other passes is really
               | important, maybe the order of the passes should be
               | changed. I'm not pretending compiler optimization is
               | simple, but I can't see any situation where having an
               | incorrect optimization pass run first is the right thing
               | to do. The broken pass needs to be fixed, and it
               | shouldn't have emitted code with incorrect `restrict` on
               | it.
        
               | [deleted]
        
             | jcranmer wrote:
             | It's not sufficient to say "there's an int2ptr cast, so
             | stop optimization." You can break the code up across
             | several function calls, which means the code doing the
             | optimization can't tell that there's an int2ptr cast around
             | to shut down the optimization. (Most optimizations are
             | intraprocedural, and even if they could look
             | interprocedurally, there's no guarantee that the function
             | body is even available for inspection--they could be in
             | different C source files).
             | 
             | Instead, you'd have to instead prove that there's no
             | possible int2ptr cast around everywhere, which is generally
             | an impossible analysis.
        
               | xscott wrote:
               | > It's not sufficient to say "there's an int2ptr cast, so
               | stop optimization."
               | 
               | Complicated or not, it's necessary that optimizations do
               | not break correct code.
               | 
               | There doesn't seem to be a problem (UB or otherwise) in
               | the first function at the top of the article, but the
               | second one has a clear aliasing problem that violates the
               | promise the `strict` makes. That translation was invalid.
        
       | benibela wrote:
       | It is much simpler in Pascal
       | 
       | When you write to a pointer, it writes into memory. And it does
       | not optimize it away. As if all pointers are volatile.
        
         | cryptonector wrote:
         | But then you have to constantly work around that in your code,
         | making it read more like assembly in a way.
        
         | pjmlp wrote:
         | Which Pascal?
        
           | benibela wrote:
           | FreePascal:                   procedure x;         var a:
           | integer;            b: pinteger;         begin           b :=
           | @a;           b^ := 1;           b^ := 2;         end;
           | 
           | becomes                   project1.lpr:15
           | begin         0000000000401090 50                       push
           | %rax         project1.lpr:17                           b^ :=
           | 1;         0000000000401091 c7042401000000           movl
           | $0x1,(%rsp)         project1.lpr:18
           | b^ := 2;         0000000000401098 c7042402000000
           | movl   $0x2,(%rsp)         project1.lpr:19
           | end;         000000000040109F 59                       pop
           | %rcx         00000000004010A0 c3                       ret
           | 
           | Perhaps Delphi, too
           | 
           | Although I do not know how much is by design and how much
           | comes from bugs in the optimizer. But they said that one
           | reason why they maintain their own optimizer rather than
           | switching to LLVM. LLVM would "break" this function
        
       ___________________________________________________________________
       (page generated 2022-04-15 23:01 UTC)