[HN Gopher] ABI Mistakes
___________________________________________________________________
ABI Mistakes
Author : yagizdegirmenci
Score : 118 points
Date : 2021-05-08 17:34 UTC (5 hours ago)
(HTM) web link (elronnd.net)
(TXT) w3m dump (elronnd.net)
| gumby wrote:
| > Unfortunately, that doesn't work anymore; compilers are smart
| now, and they don't like it when objects alias.
|
| Let's be specific: compilers for languages that support aliasing.
| For example, FORTRAN does not permit aliasing and therefore has
| all sorts of optimizations that languages that do permit aliasing
| cannot have.
|
| It's a tradeoff like any other, and isn't specific to a compiler
| beyond the fact that a given compiler X can compile language Y.
| varajelle wrote:
| But Fortran has its own ABI, while I believe this article is
| only about the C ABI.
| cokernel_hacker wrote:
| My reading of the C++ standard is that this behavior is
| effectively mandated and that one can write a program which can
| tell if an ABI observed the proposed optimization.
|
| [expr.call]: "The lvalue-to-rvalue, array-to-pointer, and
| function-to-pointer standard conversions are performed on the
| argument expression."
|
| [conv.lval]: "... if T has a class type, the conversion copy-
| initializes the result object from the glvalue."
|
| The way a program can tell if a compiler is compliant with the
| standard is like so:
|
|     struct S { int large[100]; };
|     int compliant(struct S a, const struct S *b);
|     int escape(const void *x);
|     int bad() {
|         struct S s;
|         escape(&s);
|         return compliant(s, &s);
|     }
|     int compliant(struct S a, const struct S *b) {
|         int r = &a != b;
|         escape(&a);
|         escape(b);
|         return r;
|     }
|
| There are three calls to 'escape'. A programmer may assume that
| the first and third calls to escape observe a different object
| than the second call to escape and they may assume that
| 'compliant' returns '1'.
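|
| A runnable variant of the snippet above, as a minimal sketch: the
| thread leaves 'escape' opaque, so the recorder below (the 'seen'
| array and 'n_seen' counter) is an assumption added purely for
| illustration. Addresses are stored as integers so they can still
| be compared after the objects are gone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct S { int large[100]; };

/* Hypothetical recorder standing in for the thread's opaque escape(). */
uintptr_t seen[3];
size_t n_seen;

int escape(const void *x) {
    if (n_seen < 3)
        seen[n_seen++] = (uintptr_t)x;  /* remember the address we were shown */
    return 0;
}

int compliant(struct S a, const struct S *b) {
    int r = (&a != b);   /* by-value parameter must be a distinct object */
    escape(&a);          /* second recorded address: the parameter */
    escape(b);           /* third recorded address: the caller's object */
    return r;
}

int bad(void) {
    struct S s = {{0}};
    escape(&s);          /* first recorded address: the caller's object */
    return compliant(s, &s);
}
```

| With by-value semantics as they stand today, bad() should return
| 1, and the first and third recorded addresses should match each
| other but not the second.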
| moonchild wrote:
| The compiler would be forced to create copies in that case. In
| general (using my proposed ABI), taking the address of an
| object will cause this, because it is possible to mutate an
| object through its address.
|
| It's still a win because 1) you can avoid making copies in many
| places, and 2) code size decreases because the copy happens one
| time in the callee rather than many times for every caller.
| gpderetta wrote:
| So the caller might have to copy if the pointer escapes
| and the callee might have to copy if it needs to mutate the
| value. In practice in many cases you might end up with more
| copies than the "bad" ABI.
| MauranKilom wrote:
| But similar things can already happen where copy elision is
| optional. It's one of the niches the C++ standard carves out
| regarding the as-if rule, and adding one for this new purpose
| is conceivable.
| gpderetta wrote:
| Copy elision is very different as it only ever avoids
| copying objects that are just about to be destroyed.
| stephc_int13 wrote:
| Am I the only one reading this article as disguised PR for a new
| trendy language?
| mort96 wrote:
| Yes. It's an article talking about a better calling convention
| for C, the opposite of a new trendy language.
| mssundaram wrote:
| Speaking just on the design of the article, Firefox Reader mode
| is very useful here
| SixDouble5321 wrote:
| That's funny. I read this in a bright environment with
| sunglasses, and I had to use Firefox reader mode to even see it.
| haneefmubarak wrote:
| Out of curiosity: wouldn't doing this make semantics between C
| and C++ a lot nastier since in C++ passing a copy of an object by
| value implicitly calls the copy constructor from the caller
| function?
|
| Still, definitely an interesting optimization. I can definitely
| see a place to use this optimization myself in the near future
| (custom new ABI) either way.
| zabzonk wrote:
| You can't pass a C++ object with a user-defined copy
| constructor or a non-trivial default to a C function. If you
| want to pass an object from C++ to C it has to be POD (plain
| old data) i.e. effectively a C struct.
| haneefmubarak wrote:
| Sure, but I was thinking more of how C++ calling conventions
| typically tend to follow C calling conventions on a given
| architecture, so AIUI you'd either have to have a
| differentish calling convention or inefficiently require
| callees to make copies of already copied objects.
|
| Not that there's anything intrinsically wrong with having a
| differentish calling convention for C and C++ - it'd just add
| a smidgen more complexity is all.
| nneonneo wrote:
| Hmm, I don't agree. If you pass a structure by value, that is
| _supposed_ to make a copy. The callee is free to modify the copy
| any way it likes (unless you use const, and that's not much
| protection in C). Doing as the author suggests, and passing in an
| immutable parameter, would introduce copy-on-write semantics.
|
| If the concern is the overhead of copying large structures
| around, the obvious solution is to not do that and simply pass
| structure pointers (like most APIs do). This also gives the API
| implementer (i.e. callee) full control over whether copies get
| made or not.
|
| In fact, I think I'd argue that if you're passing large
| structures by value, Your API Is Probably Wrong and this is
| absolutely not the ABI's problem in the first place.
| derefr wrote:
| > Doing as the author suggests, and passing in an immutable
| parameter, would introduce copy-on-write semantics.
|
| Yes, and what's wrong with that, if it's transparently done by
| the compiler?
|
| In this scheme, the caller would pass non-register-sized data
| "by value" by actually passing a pointer to it. (Importantly,
| this pointer _is_ register-sized, and so wouldn't necessarily
| need to be spilled to the stack--unlike caller memory copy,
| which is always to the stack.)
|
| A callee that can be statically determined to never modify the
| value, would, under this calling convention, be compiled to
| code that simply works with the data "through" the pointer
| that's been placed in its register / on the stack.
|
| A callee that _can't_ be statically determined to never modify
| the value, would, under this calling convention, generate a
| memcpy _from_ the pointer _onto_ the stack; where the pointer
| then goes dead at that point (and so, if the pointer was
| spilled to the stack by the caller, then the memcpy could be
| targeted so as to overwrite the pointer on the stack.)
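|
| The two callee shapes described above can be sketched by hand in
| portable C. This is only a model of what a compiler might emit
| under the proposed convention, not output from any real compiler;
| 'struct big' and both function names are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical value that is too large for a single register. */
struct big { long v[8]; };

/* Callee that never writes its argument: it works directly
 * through the pointer the caller handed over, with no copy. */
long sum_readonly(const struct big *arg) {
    long total = 0;
    for (int i = 0; i < 8; i++)
        total += arg->v[i];
    return total;
}

/* Callee that does write its argument: it first copies from the
 * pointer onto its own stack, then mutates only that private copy. */
long bump_and_sum(const struct big *arg) {
    struct big local;
    memcpy(&local, arg, sizeof local);
    for (int i = 0; i < 8; i++)
        local.v[i] += 1;
    return sum_readonly(&local);
}
```

| In both cases the caller's object is left untouched, which is what
| by-value semantics require; only the second shape pays for a copy.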
|
| This would be much more efficient under most conditions. Its
| only inefficiency would come from the situation where the
| pointer being passed would necessarily be spilled to the stack;
| and then the callee would be statically guaranteed to make a
| copy. In that case, you're doing an extra push (for the
| pointer) that current ABIs avoid by just having the caller do
| an eager copy.
|
| But this would be quite rare in practice, since this ABI (and
| the ABIs it replaces) are only for _external symbols_ -- the
| kind whose linkage is fundamentally dynamic (i.e. where the
| linker-loader could in theory substitute anything it likes for
| the external symbol with an LD_PRELOAD shim.)
|
| Internal symbols -- private static functions -- get bespoke
| compiler-specific ABIs that skip all this enforced
| caller/callee predetermined role business and just codegen
| whatever's most efficient on a case-by-case basis, with
| different callsites getting different monomorphizations or
| partial inlinings of the callee that distribute the
| responsibilities differently.
|
| And since these ABIs only matter for external linkage, you have
| to remember that symbols with external linkage get no "type
| attribute enforcement" at link-load time, and so your symbol
| for a function that was originally guaranteed to be a callee-
| that-always-writes, _might_ be substituted at link-load time
| for a callee-that-never-writes. In which case, having the
| pointer on the callee's stack once again becomes handy, rather
| than useless.
|
| > In fact, I think I'd argue that if you're passing large
| structures by value
|
| Not necessarily _large_ structures. More often things like
| UUIDs--values _just_ large-enough to not fit into a (64-bit)
| machine register. It's these _small_ memory spills, done in
| hot loops, that add up. There are huge wins for the cleanliness
| of e.g. RDBMS engine code, if their various 128-bit to 512-bit
| column types can be passed by value without necessitating an
| eager copy.
| shawnz wrote:
| If you can effectively do that static analysis, why does it
| matter whether you pass by value or by reference? Can't you
| do the analysis either way? So why not keep telling people to
| pass by reference, just in case their compiler is not so
| advanced, and when you do have access to advanced static
| analysis, then just use it to improve the pass-by-reference
| case.
| andi999 wrote:
| A problem would be that performance becomes brittle. So you
| change the struct in your function and suddenly the program
| is slower.
| Negitivefrags wrote:
| That war was lost decades ago.
| temac wrote:
| Other parts of the program can modify the object after the
| callee has started; if the callee contains some synchro, there
| would be no race. At this point maybe you can think of
| additional tricks like CO"W" before any potential synchro
| point if some exist, but I'm not sure big advantages would
| remain except if you very carefully craft your program
| knowing all of that for the opti to happen, and it seems a
| lot of complexity in the compiler for limited gains,
| especially since if you have to massage the code you could as
| well transform to a reference yourself at source level (and
| that would better resist to e.g. calling unknown functions
| from the callee)
| Negitivefrags wrote:
| In order for another part of the program to do that, you
| would already have had to generate a _mutable_ pointer to
| the object somewhere. As the article actually already goes
| into, this is something the compiler already tracks in
| order to know if a value needs to be loaded from memory
| when using it again.
|
| If the ABI specified that objects passed this way were
| passed by immutable pointer, then just passing the object
| using this new method doesn't generate a mutable pointer,
| so the compiler is free to assume nobody else can modify
| it.
| temac wrote:
| But that means the caller also has some proofs to do, if
| a counter example is found or nothing can be proven the
| caller _also has to do a copy_ , and that's with no
| guarantee the callee won't do another one...
|
| So not really an optimization anymore.
| Negitivefrags wrote:
| You say "proofs" but, as I said, this is already
| something the compiler keeps track of.
|
| The caller is required to do a copy as the ABI is right
| now so if you can't do this optimisation it's only as bad
| as it was before.
|
| Yes, there is a case where this creates one extra copy.
| The case where you a) give away a mutable pointer to a
| variable before calling another function, and b) the
| called function modifies its own argument and then c)
| that function then doesn't use that object as the
| argument for another function it calls.
|
| That's the only case where it generates an extra copy,
| and I'm going to go out on a limb and say that I'm almost
| certain that this will reduce significantly more copies
| than it creates.
| CyberRabbi wrote:
| > and passing in an immutable parameter, would introduce copy-
| on-write semantics.
|
| Yes that's the point the author is making. His scheme incurs
| strictly less overhead than the scheme currently used.
|
| Some people may want to program in a "value oriented" way
| without pointers at the C level for various reasons but for
| efficiency reasons they cannot.
| Negitivefrags wrote:
| The explanation by derefr is correct.
|
| But there is a larger assumption that underlies your post that
| I think is important to correct.
|
| You seem to be caught up in the idea that the code you write,
| and the code the compiler generates must have some kind of one-
| to-one correspondence, but this isn't the case at all.
|
| The code the compiler generates must have the same final
| result, but it doesn't have to actually work the same way.
|
| So if you pass a structure by value, and the compiler can avoid
| a copy somehow because it's more efficient, then of course it
| should do so.
|
| Making an ABI that is amenable to allowing the compiler to
| make better optimisations is of course also a good idea. And
| the proposed ABI here achieves that.
| saagarjha wrote:
| Dropping one-to-one correspondence is fine for most code, but
| there are certain places where it is unacceptable. An ABI is
| one of these places, because the interface is designed to be
| exposed in a way that is observable.
| __jem wrote:
| > The code the compiler generates must have the same final
| result, but it doesn't have to actually work the same way.
|
| You're obviously correct, but I think this ignores downstream
| costs of sufficiently magical compilers. If my compiler
| tunnels into a parallel universe, executes the code there,
| and returns the result, that's fine, but it's going to be
| enormously painful to debug.
|
| Obviously very few developers (including myself) have an
| appropriate mental model of highly complex, speculative
| processor architectures that end up executing their code, but
| I do think there is at least _some_ benefit to having a
| correspondence between the code you write and the code the
| compiler generates.
|
| Maybe it's not a problem with the compiler, but a problem
| with languages. Maybe we just need better languages?
| msbarnett wrote:
| > You're obviously correct, but I think this ignores
| downstream costs of sufficiently magical compilers. If my
| compiler tunnels into a parallel universe, executes the
| code there, and returns the result, that's fine, but it's
| going to be enormously painful to debug.
|
| > Obviously very few developers (including myself) have an
| appropriate mental model of highly complex, speculative
| processor architectures that end up executing their code,
| but I do think there is at least some benefit to having a
| correspondence between the code you write and the code the
| compiler generates.
|
| It's not really clear to me why you think this? Or at
| least, it's not clear to me how you can write this without
| levelling the same complaint against essentially any C
| compiler written after the mid 1980s?
|
| Because for all the hand-wringing in the comments here
| about compilers generating code that doesn't do what your C
| said it would do in the way you wrote it, that ship sailed
| _decades_ ago.
|
| That quaint little integer addition loop you write in C
| will, as like as not, be exploded by the compiler into an
| unrolled stream of SIMD instructions and packed registers
| that doesn't resemble what you wrote in the least, which
| the CPU may then further scramble and interleave, resulting
| in nothing remotely like what a naive person might imagine
| is a reasonable translation of what you wrote - and yet,
| debuggers remain broadly useful tools.
|
| And people have for decades now shown a strong desire to
| enthusiastically and with great speed drop compilers in
| favour of newer ones that performed better optimizations,
| which is to say, people have shown a strong preference for
| compilers that are good at generating code that looks
| nothing like what you wrote, provided the results are the
| same and the performance is as fast as possible.
|
| Turning a copy into passing a pointer that may later copy
| if and only if the compiler sees the possibility of a write
| coming is, in terms of modern optimizations, an absolutely
| conservative optimization that doesn't routinely happen
| only because platform ABIs unnecessarily rule it out. Far
| from being a bridge too far, in terms of modern compiler
| behaviour it's simple and low-hanging fruit which isn't
| exploited even though far more exotic tricks are already
| the rule of the day.
| cma wrote:
| Doesn't this hurt forward compatibility to new
| architectures where there are more registers and it would
| be more efficient doing it through registers? Seems like
| automatic choice by the compiler should be default and if
| explicit control is needed some kind of attribute or
| something could be used to override the compiler's choice.
| zorgmonkey wrote:
| Not really because ABIs are already architecture
| specific so they already do stuff different for other
| CPUs. Similarly, there are already a variety of
| attributes that can be used to explicitly control the ABI;
| off the top of my head things like packed structs and
| function calling convention are common and in several
| languages (C, C++, Rust and Zig).
| gpderetta wrote:
|     struct X { int i; };
|     X* P;
|     void bar(X x) {
|         P->i = 1;
|         assert(x.i == 0); // fails with transparent pass by ref
|     }
|     void foo() {
|         X x{0};
|         P = &x;
|         bar(x);
|     }
| Negitivefrags wrote:
| The compiler knows that there is a mutable pointer to x
| floating around as soon as you did `P=&x`. Therefore it knows
| it needs to make a new copy of x and then pass an immutable
| pointer to that value when calling `bar`.
|
| This is something the compiler already keeps track of anyway in
| order to be able to know if it needs to re-load variables from
| memory when they are used again after a function call.
|
| Basically the optimisation degrades to exactly what happened
| before in the case where you do this.
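|
| The exchange above can be modelled by hand in C, as a rough
| sketch. The callee receives a pointer, the way the proposed ABI
| would pass the argument; the two callers differ only in whether
| they copy after the address has escaped. All function names here
| are hypothetical.

```c
#include <assert.h>

struct X { int i; };
struct X *P;   /* mutable pointer that has escaped to global scope */

/* Callee as the proposed ABI would see it: a pointer it treats as
 * immutable. It writes through P, then reports what its "by-value"
 * argument looks like afterwards. */
int bar_via_ref(const struct X *x) {
    P->i = 1;
    return x->i;
}

/* Naive caller: passes its own object even though &x has escaped. */
int foo_naive(void) {
    struct X x = {0};
    P = &x;
    return bar_via_ref(&x);    /* argument aliases *P */
}

/* Careful caller: &x has escaped, so it hands over a fresh copy,
 * which is what existing ABIs do unconditionally. */
int foo_with_copy(void) {
    struct X x = {0};
    P = &x;
    struct X tmp = x;
    return bar_via_ref(&tmp);  /* argument is a distinct object */
}
```

| The naive caller lets the callee observe the write through P, so
| by-value semantics are broken; the careful caller preserves them
| at the cost of the same copy made today.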
| gpderetta wrote:
| In many cases you might need to make two copies. Also
| remember, the author proposes to change the ABI to encourage
| pass by value instead of pass by restricted reference, so
| in the original code there might be zero copies.
|
| In practice this optimization would be very fragile (like
| most optimizations relying on escape analysis) and people
| would keep passing by reference in fear that a small change
| like taking the address of an object might force a copy of a
| large struct.
| mort96 wrote:
| Well, no, that's just one of the situations where the compiler
| would have to make a copy in the callee; a write to a pointer
| which might alias the argument.
| temac wrote:
| For a copy on write const ref ABI trick to work, the referenced
| area has to be stable while the callee is running. Good luck with
| guaranteeing that in C or C++ (and no, it would not necessarily be
| a race for the refed object to be concurrently unstable, the
| called function could have internal synchro...)
|
| So I think this idea is virtually impossible. Your ABI is
| probably not wrong, but your blog post probably is :P
| mort96 wrote:
| Well, no. The callee would just have to make a copy in cases
| where the memory might change.
|
| The compiler would have to produce a copy in the case of this
| function:
|
|     void foo1(struct large l) {
|         bar();
|         baz(l.x + l.y);
|     }
|
| Because the call to `bar()` might change the caller's copy, so
| the callee `foo1` has to make a local copy before calling
| `bar()`.
|
| But it would not have to produce a copy in this example:
|
|     void foo2(struct large l) {
|         baz(l.x + l.y);
|     }
|
| Because nothing can change the caller's copy before the callee
| `foo2` is done using it.
| Negitivefrags wrote:
| This isn't correct.
|
| The compiler would not have to produce a copy in either case.
|
| The call to bar() can not change the value of l because the
| person who called foo1 gave you an immutable pointer to it.
|
| How did foo1's caller know it was immutable?
|
| If the value passed to foo1 was a local variable, or one of
| its arguments passed to it by immutable pointer, then all
| the compiler has to do is check if, in that function, it
| created a mutable pointer to the value and put it somewhere
| accessible to someone else. If it did not, then it can just
| pass the pointer right through.
|
| If it did create a mutable pointer, or if the value came from
| some other place the compiler has no knowledge about, then it
| would make a copy of the value before passing a pointer to
| foo1, just like it would do for the old ABI.
| foobiekr wrote:
| "A correctly-specified ABI should pass large structures by
| immutable reference"
|
| There is no such thing at the level that the ABI works,
| especially at a kernel-userland boundary (where the choice is to
| fully marshal the arguments or accept that you have a TOCTOU
| issue).
| user-the-name wrote:
| An ABI is a contract. If it wants to say that a pointer is an
| immutable reference, all it needs to do is to say "this pointer
| is an immutable reference", and specify what that means. It is
| up to those who follow that contract to follow those rules.
|
| Your ABI could say a pointer may only be accessed on Wednesdays
| if it really wanted to.
| omnicognate wrote:
| Wednesdays in what timezone and by what clock?
| varajelle wrote:
| Even if we assume the computer's own internal clock, we
| would have a race condition if the computer goes to sleep
| on Wednesday afternoon just after a call to that function
| is done, and wakes up the next day. Sounds like a really
| impractical ABI.
| legulere wrote:
| Callee saved registers kind of work like that. You either don't
| touch those registers, or you restore them before returning.
| leni536 wrote:
| I think clang's noescape attribute for pointer function
| parameters is somewhat related. There are several compiler
| extensions that allow refining calling conventions, so clearly
| even compiler vendors think that there are things to explore
| here.
|
| Having said that, I think the suggestion would observably break
| both the current C and C++ standards, so something like this
| shouldn't be done without an explicit attribute.
| hctaw wrote:
| I'm reminded of Chesterton's Fence in this.
|
| Every major ABI is listed here as containing the same mistakes.
| I'm inclined to think the people who designed these ABIs were
| smart enough to understand the consequences of their design
| decisions.
|
| I don't know whether this author is correct or not, but my gut is
| there is something missing here with respect to non local control
| flow (like exception handling, setjmp/longjmp, and fibers).
| lostcolony wrote:
| I love seeing others bring up Chesterton's fence; it's been a
| reference that comes to mind with quite a lot of the WTFery
| I've encountered in my career (usually it remains WTFery even
| when looking for underlying reasons, but it at least helps
| remind me to question my instincts).
|
| I don't really know enough to weigh in on this, but I can say
| that having pursued a lot of WTFish things in my career so far,
| 90% of the times I've encountered bad decisions, the
| explanation for it was either "it was done that way because
| legacy reasons" (i.e., it had to be done that way then, the
| reason it had to be has changed, and now it would break things
| to do it 'correctly') or "it was easier" (i.e., at the time the
| badness wasn't really going to affect anyone, or not
| measurably, or was very intentional tech debt, and it's only
| 'now' that anyone is noticing/caring).
| david422 wrote:
| I've seen people make bad architectural decisions that now
| the company is stuck with. And it comes down to just the fact
| that it was a bad decision, no second guessing needed.
|
| I've also seen "bad" decisions made due to outside
| constraints. These decisions look like bad decisions, except
| that if you try to "fix" those decisions, it becomes a lot
| harder than it looks.
| rsj_hn wrote:
| Yup, there's also time dependence. Perhaps someone wrote some
| software in COBOL that is hard to maintain now. But rewriting
| it may not be worth the opportunity cost now, especially for
| well-tested systems that have been around for a long time and
| which have critical failure modes. Sometimes it's better to
| leave things alone and work around them, even if it results
| in an uglier design.
| derefr wrote:
| In this case, "it was done that way because legacy reasons"
| is close, but the real answer is "it was done that way
| because we hadn't yet invented the parts of compiler theory
| required to create compilers that enforce this constraint at
| the type level."
| jcelerier wrote:
| > A correctly-specified ABI should pass large structures by
| immutable reference
|
| is just not possible. CPUs don't know about `const`. So you
| have to work with the assumption that functions that you call
| can do anything to their arguments. Thus copies cannot be
| avoided.
| [deleted]
| mhh__ wrote:
| The CPU also doesn't know what an ABI is
| wizzwizz4 wrote:
| CPUs actually _do_ know about const; it's called a read-only
| page.
|
| Besides, that's irrelevant. There's nothing stopping my
| function from following every pointer on the stack and
| smashing up its contents; are you going to defend against
| that, too? If not, how is this any different?
| jacoblambda wrote:
| An ABI also has a concept of defined and undefined behaviour.
| You can design an ABI that is fully protected against abuse
| but often the performance penalty for that will be huge.
|
| Instead what you'll do is specify the constrained inputs and
| expected output behaviour. From there you can rule out anything
| that violates those constraints as non-conformant. As long as
| you maintain those constraints between versions, there's no
| ABI breakage.
|
| Also you can absolutely have constant references in an ABI.
| There may be ways of ignoring the const depending on how you
| design the ABI but they will be obvious abuse.
| gpderetta wrote:
| Exactly, see my example elsethread. Also in C and derivatives
| distinct objects are guaranteed to have distinct addresses.
| Implicit sharing would break this.
| mort96 wrote:
| It wouldn't. The compiler would just have to generate the
| copy when the standard demands it (such as if the function
| body takes the address of the object).
| gpderetta wrote:
| Yes but then in many cases either (or both!) the caller and
| the callee might need to make a copy defeating the point of
| the optimization or even being worse than the original.
| mort96 wrote:
| In many cases the callee would have to make a copy, yes.
| However:
|
| 1. In many cases, no copy would have to be made. There
| are lots of small non-complex functions out there where
| the compiler can prove that it's safe to not make a copy.
|
| 2. In many other cases, a copy has to be made. But the
| copy is made by the callee, not by the caller. That means
| that all the instructions necessary to copy the argument
| ends up in the binary once in the callee, rather than
| once for every function call, leading to less code bloat
| (which has its own performance advantages).
|
| In fact, a stupid compiler could just always make a copy
| without analyzing the function body. This would result in
| a compiler which generates code that's about as fast as
| it would be with current ABIs, but with a smaller size.
| gpderetta wrote:
| You have to make a copy on the caller or the callee if
| the address of the object escapes, so you might end up
| with two extra copies even if nothing in the program
| mutates the object.
| legulere wrote:
| How about those explanations:
|
| It didn't matter before, as compilers were not optimizing as
| much, code had a much closer 1:1 correspondence to assembly (if
| you are passing it by pointer and not register, you would want
| to make that clear in code).
|
| It's much easier to implement in simple compilers. On the side
| of the callee you don't have to check if you manipulate your
| arguments, which is generally hard. Being able to manipulate
| your arguments is another shortcut for keeping the compiler
| simple. On the side of the caller you don't have to check if
| you hand out a mutable pointer.
|
| Also finally and most importantly: memory access was much
| cheaper in terms of cpu cycles. Just look at cdecl: all
| parameters are passed on the stack instead of registers. Our
| current calling conventions stem from performance hacks like
| fastcall that were only optimizing for existing code (you pass
| big structs by pointer by convention).
| brigade wrote:
| Well for one, the language says a copy is made _at the time of
| the function call_ , and it's perfectly valid to modify the
| original before the copy is finished being used. So pretty much
| any potentially aliasing write or function call in the callee
| would force a copy, and as he notes C's aliasing rules are lax
| enough that that's most of them.
|
| Then if you care about the possibility of signal handlers
| modifying the original... you pretty much have to make a copy
| every time anyway.
| temac wrote:
| Plus any _potential_ concurrency synchro point existing would
| force a copy, plus using any unknown function, etc.
|
| Using rust and propagating the single writer xor multiple
| readers requirement in an ABI, this might be interesting. But
| with C/C++, I'm afraid copies would be forced "all" the time.
| mort96 wrote:
| There's still a lot of functions which don't call unknown
| functions before accessing an argument passed by value,
| don't take the argument's address, etc. There are many
| simple functions such as this one:
|
|     void print_foo(FILE *outf, struct foo foo) {
|         fprintf(outf, "foo '%s': %i, %i\n", foo.name, foo.x,
|                 foo.y);
|     }
|
| That one would gain a speed-up and code-bloat reduction
| from the proposed ABI, and there are many like it.
|
| But even if every single function had to fall back to
| making a copy, the argument is that there's still a
| significant code bloat saving by putting the copy in the
| callee rather than in the caller. After all, the
| instructions necessary to make a copy takes some space, and
| with the proposed ABI, those instructions are put in the
| called function, rather than in every function call. Most
| functions are called more than once, and all functions are
| called at least once (hopefully), so anything which can be
| changed from O(number of function calls) to O(number of
| functions) is an improvement.
| jbverschoor wrote:
| Sometimes a mistake is a decision under the assumption that the
| people intended to use this are smarter / more careful than
| they are.
| moonchild wrote:
| > my gut is there is something missing here with respect to non
| local control flow (like exception handling, setjmp/longjmp,
| and fibers)
|
| (Post author.)
|
| Mechanically, what happens is essentially the same as what
| ms/arm/riscv do: the caller creates a reference and passes it
| to the callee. The only difference is that the callee is _more_
| restricted than it would otherwise have been in what it can do
| with the memory pointed to by that reference. So I don't think
| that there can possibly be any implications for non-local
| control flow.
| hctaw wrote:
| Doesn't the referenced data have to be guaranteed to outlive
| the callee, which would only be true if the callee is
| guaranteed to return to the calling scope?
|
| You can get around the immutability of the reference if your
| compiler implements the ABI with copy on write semantics,
| which I think is a reasonable compromise. But I'm still not
| certain how you would handle arbitrary control flow that the
| compiler may not be able to reason about.
|
| If for example your arguments may be behind const references,
| how would you implement getcontext/swapcontext for your ABI?
| If everything is an integral value in registers or on the
| stack then it's really easy, but I would think it would have
| to be a compiler intrinsic if it depends on the function
| signature of the calling context, in order to perform the
| required copies.
| chjj wrote:
| DMR on `restrict` (originally proposed for C89 as `noalias`):
| http://port70.net/~nsz/c/c89/dmr-on-noalias.html
| legulere wrote:
| Why would that optimization not be possible with a simple const
| *?
|
| The problem I see with that proposal is that it introduces a new
| type of pointer, the immutable pointer. That seems like equal to
| a const pointer, but it's not. With a const pointer the callee
| can not mutate the pointee. But it can still be mutated from the
| outside. That means that any time you handed out a mutable
| pointer to something you have to make a copy for handing out an
| immutable pointer to it. An ABI like this would probably be much
| more complex to implement for a small gain.
|
| You would end up with unpredictable behaviour as to whether the
| struct gets copied or not. C# structs suffer a lot from this,
| because methods are mutating by default
| (https://codeblog.jonskeet.uk/2014/07/16/micro-
| optimization-t...). The biggest problem is simply that this is
| not explicit.
|
| Also there is a case where a calling convention like this can
| make things worse, as you will have to make two copies:
|     void fn1(A*);
|     void fn2(A a) {
|         // Have to make copy here because mutating
|         fn1(&a);
|     }
|     void fn3() {
|         A a;
|         fn1(&a);
|         // Have to make copy here because reference to a might have
|         // escaped.
|         fn2(a);
|     }
| moonchild wrote:
| > With a const pointer the callee can not mutate the pointee.
| But it can still be mutated from the outside
|
| That's incorrect. In C semantics, at least, it is legal to take
| a pointer to non-const, cast it to a pointer to const, pass it
| back, and mutate the object pointed to. It is only illegal to
| mutate if the pointer actually points to const memory.
|
| Which means that, in a _general_ sense, if you're given a
| const pointer, since you can't know if it actually points at
| const memory, you shouldn't mutate through it. But if you're
| handing out const pointers to non-const memory, you shouldn't
| count on the memory not being changed through those pointers.
|
| IOW const is all but useless in C (except as a declaration of
| intent).
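|
| A small sketch of the point about const, assuming the pointed-to
| object was not itself defined const. The function names are made
| up for illustration.

```c
#include <assert.h>

/* Takes a pointer to const, yet writes through it after a cast. This is
   defined behaviour only when the pointed-to object is not itself const. */
void sneaky_increment(const int *p)
{
    *(int *)p += 1;
}

/* Because the call above may legally change *p, the compiler cannot
   assume the second read returns the same value as the first. */
int read_twice(const int *p)
{
    int before = *p;
    sneaky_increment(p);
    return *p - before;
}

void demo_const(void)
{
    int x = 5;               /* non-const object, so the write is allowed */
    assert(read_twice(&x) == 1);
    assert(x == 6);
}
```

| Calling sneaky_increment on an object defined as const int would
| be undefined behaviour, which is why the sketch only uses a
| non-const x.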
|
| > An ABI like this would probably be much more complex to
| implement for a small gain
|
| Why? Mechanically, it's almost the same as the ms/arm/riscv
| abi, except that a copy is made by the callee rather than the
| caller.
| gpderetta wrote:
| Const matters to the caller: if it calls a function that takes a
| pointer to const, it can generally reasonably expect that the
| object won't be changed.
|
| It is not about enabling compiler optimisations but about
| preserving invariants.
| vvanders wrote:
| One of the things I'm really excited about for Rust is that restrict
| semantics are baked right into the language with &mut/&. I think
| there were still a few LLVM bugs to flush out (because very little
| C/C++ code uses restrict, by nature of how it subtly explodes when
| you get it wrong).
|
| In theory when they sort that out Rust should be able to turn on
| restrict where it applies globally "for free".
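|
| For readers who have not met restrict in C, here is a minimal
| sketch of the kind of promise being discussed. The function
| add_arrays and its demo are hypothetical examples, not code from
| the article or from Rust.

```c
#include <assert.h>
#include <stddef.h>

/* restrict is a promise from the programmer: during the call, the memory
   written through out is not accessed through a or b. The compiler may
   then keep values in registers or vectorise the loop without checking
   for overlap. */
void add_arrays(float *restrict out, const float *restrict a,
                const float *restrict b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = a[i] + b[i];
}

void demo_add_arrays(void)
{
    float a[3] = { 1.0f, 2.0f, 3.0f };
    float b[3] = { 10.0f, 20.0f, 30.0f };
    float out[3];            /* distinct from a and b, as promised */
    add_arrays(out, a, b, 3);
    assert(out[0] == 11.0f && out[1] == 22.0f && out[2] == 33.0f);
}
```

| Breaking the promise (for example passing the same array as out
| and a) is undefined behaviour in C, and the C compiler does not
| check it.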
| nathankleyn wrote:
| Propagating "noalias" metadata for LLVM has actually finally
| been enabled again recently in nightly [0]. However it has
| already caused some regressions so it is not clear whether we
| may go through another revert / fix in LLVM / re-enable cycle [1].
| This has happened several times already sadly [2] as, exactly
| as you say, basically nobody else has forged through these
| paths in LLVM before.
|
| [0]: https://github.com/rust-lang/rust/pull/82834
|
| [1]: https://github.com/rust-lang/rust/issues/84958
|
| [2]: https://stackoverflow.com/a/57259339
| vvanders wrote:
| Ah damn, that's a real shame. It says something about how
| rarely restrict is used (I usually only touched it for
| particle systems or inner loops of components where I knew I
| was the only iterator).
| galangalalgol wrote:
| Seriously, this has been _years_ now. Is this understandable,
| or does this tell us something bad about LLVM?
| wnoise wrote:
| Or something bad about noalias?
| swsieber wrote:
| This is understandable. Rust really uses restrict semantics
| in anger compared to any other language I know of. Have you
| seen restrict used in a C codebase? The LLVM support for
| restrict just doesn't get exercised much outside of Rust.
| galangalalgol wrote:
| Also there are a lot of Fortran compilers using LLVM now.
| Fortran has the information for noalias as well.
| galangalalgol wrote:
| I use it in C++ on most signal processing stuff. I think
| Eigen will use it if you don't let it use intrinsics for
| simd. I also use g++ so I wouldn't have encountered it
| anyway.
| khuey wrote:
| This is what happens when you're trying to add a hairy
| feature to a big legacy codebase.
| rrdharan wrote:
| Referring to the LLVM codebase as legacy makes me feel
| old...
| zozbot234 wrote:
| > Is this understandable, or does this tell us something
| bad about llvm?
|
| LLVM is a large project that's mostly written in pre-modern
| C++, and "noalias" is a highly non-trivial feature that
| affects many parts of the compiler in 'cross-cutting' ways.
| It would be surprising if it did _not_ turn up some initial
| bugs.
| galangalalgol wrote:
| Initial, yes, but this was first uncovered in Oct 2015.
| That seems like long enough to fix it.
| saagarjha wrote:
| Aliasing analysis is a complicated part of the compiler,
| and it underpins a lot of optimization passes. It's not
| an easy thing to bolt on.
| masklinn wrote:
| It's not a single bug, it's a bunch of different bugs in
| the interactions between noalias and various analysis and
| optimisation passes.
| Teknoman117 wrote:
| The other neat thing about Rust is that if you turn on LTO and
| all of the involved code is Rust, there is no hard ABI. The
| linker will do all sorts of interesting things with register
| usage and how to pass data to functions.
| mhh__ wrote:
| Why is that not the case with any other language and LTO?
| comex wrote:
| It is. As far as I know, rustc just exposes LLVM's LTO
| functionality, without performing any Rust-specific
| optimizations at link time. So you get the same
| optimizations as you would with LTO in Clang (C compiler),
| and indeed you can do cross-language LTO between the two.
|
| Also, even without LTO, LLVM can perform the same ABI
| optimizations on functions that are local to a single
| translation unit and not exported to other ones. In Rust
| that means a function not exported across crate boundaries.
| In C that means a `static` function.
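|
| A minimal C sketch of the distinction drawn above. The names
| (struct vec3, dot, length_squared) are made up, and whether a
| given compiler actually changes how dot receives its arguments is
| an assumption that depends on the compiler and optimisation level.

```c
#include <assert.h>

struct vec3 { double x, y, z; };

/* static: not visible outside this translation unit, so the compiler
   sees every caller and is free to inline the function or pass its
   arguments however it likes. */
static double dot(struct vec3 a, struct vec3 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

/* Externally visible: other translation units may call it, so its own
   interface has to follow the platform calling convention. */
double length_squared(struct vec3 v)
{
    return dot(v, v);
}

void demo_static(void)
{
    struct vec3 v = { 1.0, 2.0, 2.0 };
    assert(length_squared(v) == 9.0);
}
```

| The difference is only visible in the generated assembly; the
| result of the program is the same either way.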
| donkarma wrote:
| I think MSVC does something odd when you call a function
| this way; I don't remember the details, but IIRC the generated
| assembly used an unusual calling convention.
| colour wrote:
| "Well, what's wrong with that? Surely we can just do what we did
| in the days of dumb compilers and pass structures by pointer.
| Unfortunately, that doesn't work anymore; compilers are smart
| now, and they don't like it when objects alias." What does that
| even mean? Does gcc cry after I make it translate code with
| pointers? I simply don't get it.
| re wrote:
| "Doesn't work anymore" is hyperbole. Aliasing can prevent
| optimizations, like in the code example immediately after that
| paragraph where the value of `x` needs to be loaded from memory
| to be returned.
|
| See also
| https://en.wikipedia.org/wiki/Aliasing_(computing)#Conflicts...
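|
| A minimal sketch of the kind of situation meant here. It is written
| for this thread and is not the article's own example; store_both is
| a made-up name.

```c
#include <assert.h>

/* x and y may point to the same int, so after the store through y the
   compiler has to reload *x from memory instead of returning the
   constant 1. */
int store_both(int *x, int *y)
{
    *x = 1;
    *y = 2;
    return *x;
}

void demo_aliasing(void)
{
    int a = 0, b = 0;
    assert(store_both(&a, &b) == 1);  /* distinct objects */
    assert(store_both(&a, &a) == 2);  /* same object: the second store wins */
}
```

| If the compiler could prove that x and y never alias, it could
| skip the reload and return 1 directly.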
___________________________________________________________________
(page generated 2021-05-08 23:00 UTC)