[HN Gopher] Carbon's most exciting feature is its calling conven...
___________________________________________________________________
Carbon's most exciting feature is its calling convention
Author : obl
Score : 103 points
Date : 2022-07-30 18:43 UTC (4 hours ago)
(HTM) web link (www.foonathan.net)
(TXT) w3m dump (www.foonathan.net)
| Denvercoder9 wrote:
| Maybe I'm missing something, but I don't see what's special about
| Carbon here. In C++ the compiler can also optimize pass-by-const-
| reference to pass-by-value, and they do. It just can't do it
| across an ABI boundary, but that should only be an issue with
| dynamic libraries, and Carbon has to follow the standard ABI
| there as well. Just make sure the compiler knows it doesn't have
| to follow the standard ABI for every symbol in your program, e.g.
| by enabling LTO and setting -fvisibility=hidden. See what happens
| when `add` is made static: https://godbolt.org/z/KsqxP5Pxh (I also had
| to change foo() to avoid the compiler hardcoding the result of
| the single call in `add`).
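A minimal sketch of the situation described above, assuming a hypothetical `add`/`foo` pair (this is not the code behind the Godbolt link). Because `add` has internal linkage, no other translation unit can call it, so the compiler may pass the integers in registers even though the source takes them by const reference.

```cpp
#include <cstdint>

namespace {

// Internal linkage (same effect as `static`): nothing outside this
// translation unit can call add(), so its machine-level signature
// does not have to follow the platform ABI.
int add(const std::int32_t& a, const std::int32_t& b) {
    return a + b;
}

}  // namespace

// Externally visible entry point. The arguments are run-time values,
// so the call to add() cannot simply be folded into a constant.
int foo(std::int32_t x, std::int32_t y) {
    return add(x, y);
}
```

Whether the reference is actually turned into a by-value pass depends on the compiler and optimization level; the sketch only shows the shape of code where the rewrite is permitted.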
| olliej wrote:
| I think the alias analysis "pro" is overblown, it's UB to take
| the address of a parameter via any mechanism other than
| explicitly taking the address - which a compiler obviously
| sees, and because it's UB the compiler is free to
| assume no one is taking the address. Then for any parameters
| that are passed by reference the compiler has to assume there
| are other references so there's no gain.
|
| Honestly the only thing that really stood out as nice is the
| default pass by word-sized value, in the context of templates -
| it's a thing that is achievable in C++, but requires a bunch of
| obnoxious additional templates that aren't even part of the
| standard library so everyone ends up reimplementing the same
| cruft. Happily I believe there's a proposal to add this exact
| functionality.
|
| I also loathe their desire to listen to the BNF maximalists
| insistence on not having any "ambiguity" from <>s. I'm sorry
| it's clearly parseable, and <>s are the standard token for
| decades. Switching to []s doesn't make it less conceptually
| ambiguous, if anything it makes it more ambiguous to a human
| reader. The only people who don't want <>s are PLT academics
| obsessed with forcing their dragon book idea of what a grammar
| should be. You can't argue you're doing it because the
| ambiguity in a grammar or lexer is bad, because then you would
| also drop infix operators.
|
| Then in carbon the more reasonable adoption of pascal's :
| notation for typing a variable or parameter removes the most
| common case of the supposedly terrible ambiguity anyway.
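The "obnoxious additional templates" mentioned above can be sketched roughly as follows. `param_t` and `Box` are hypothetical names for illustration, not standard library facilities; Boost's `call_traits<T>::param_type` is an existing library take on the same idea. The size threshold here (one pointer) is an assumption, not a rule taken from Carbon.

```cpp
#include <cstdint>
#include <string>
#include <type_traits>

// Hypothetical helper: pass small, trivially copyable types by value
// and everything else by const reference.
template <typename T>
using param_t = std::conditional_t<
    std::is_trivially_copyable_v<T> && sizeof(T) <= sizeof(void*),
    T,
    const T&>;

template <typename T>
struct Box {
    T value;
    // The parameter type adapts to T; callers write the same code either way.
    void set(param_t<T> v) { value = v; }
};

static_assert(std::is_same_v<param_t<std::int32_t>, std::int32_t>);
static_assert(std::is_same_v<param_t<std::string>, const std::string&>);
```

One limitation of this approach: `param_t<T>` is a non-deduced context, so it works for members of a class template like `Box<T>` but not directly as the parameter of a function template whose `T` must be deduced from the argument.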
| temac wrote:
| > it's UB to take the address of a parameter via any
| mechanism other than explicitly taking the address - which a
| compiler obviously sees, and because it's UB the compiler
| is free to assume no one is taking the address.
|
| I don't get that: can you express in C++ a code that "take
| the address of a parameter via any mechanism other than
| explicitly taking the address"?
| josephcsible wrote:
| Consider this C code (also "works" if compiled as C++):
|     int main(void) {
|         int x = 0;
|         int arr[1];
|         int *p = arr + 1;
|         *p = 42;
|         return x;
|     }
|
| On a lot of systems (e.g.,
| https://godbolt.org/z/jYqM8TT3Y), it just so happens that
| `x` is right above `arr` on the stack, so that code will
| return 42. But that code is absolutely UB.
|
| The more general name for this concept is "pointer
| provenance". Basically, you can't pull pointer values out
| of thin air; you have to derive them from operations rooted
| at taking the address of something within the same
| allocation.
| hedora wrote:
| That's a buffer overflow. The optimizer doesn't need to
| reason about changing the behavior of such things.
| josephcsible wrote:
| The point is that on systems where that code returns 42,
| `p` has the exact same value it would if I did `int *p =
| &x;` instead, but not the same provenance.
| [deleted]
| zasdffaa wrote:
| > I also loathe their desire to listen to the BNF maximalists
| insistence on not having any "ambiguity" from <>s
|
| Odd that you put ambiguity in quotes. I guess it's not real
| ambiguity then.
|
| Ivory tower academics, as you see them, actually care about
| stuff because it affects you. I've heard too many times the
| knee-jerk response of "it's just theory" as if that actually
| meant anything. Try implementing it yourself and you'd start
| to understand that little bits of crap typically don't add,
| they multiply (Edit: and I'm hurting from exactly this right
| now), so don't diss theory until you've (ironically) had some
| practice with it.
| comex wrote:
| That optimization is quite fragile.
|
| For example, try putting `puts("hello");` at the beginning of
| `add`. Now neither GCC nor Clang performs the optimization.
| Why? Because `puts` could theoretically modify the value behind
| the reference, so the value loaded is not necessarily the same
| as the value at the beginning of the function, which makes
| things more complicated, so both compilers give up.
|
| As another example, GCC and Clang both perform the optimization
| within a translation unit, but if the function definition and
| the call are in _different_ translation units, GCC doesn't
| perform the optimization even with LTO, and Clang doesn't
| perform it with ThinLTO (but does perform it with full LTO).
| Meanwhile, many projects don't compile with any form of LTO,
| which is a reasonable decision to improve compilation speed and
| predictability.
|
| Neither compiler is smart enough to perform the optimization
| for virtual calls in almost any situation.
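A small sketch of why an opaque call gets in the way. The names here are hypothetical, and `opaque_call` stands in for something like `puts` whose body the optimizer cannot see. With a const reference, the callee must observe any change made to the referenced object during the call; with a by-value parameter it holds a private copy.

```cpp
#include <cstdint>

std::int32_t g_counter = 10;  // hypothetical global, for illustration only

// Stand-in for an opaque call such as puts(): from the caller's point of
// view it might modify any object reachable through a global.
void opaque_call() { ++g_counter; }

// By const reference: the read after the call must see the modification
// if `a` happens to refer to g_counter.
std::int32_t read_after_call_ref(const std::int32_t& a) {
    opaque_call();
    return a;
}

// By value: the callee owns a copy, so the call cannot affect it.
std::int32_t read_after_call_val(std::int32_t a) {
    opaque_call();
    return a;
}
```

Calling both with `g_counter` as the argument gives different results (the incremented value versus the original one), which is exactly the observable difference the compiler has to rule out before it can replace the reference with a copy.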
| moralestapia wrote:
| >Maybe I'm missing something, but I don't see what's special
| about Carbon here.
|
| It's nothing really out of the ordinary ... also that
| particular optimization is meager in practice (on the order of
| 1%?). I know every improvement adds up, but the author
| talks about it as if it was a game changer. I like the
| enthusiasm, though :D.
| phire wrote:
| I have been burned many times in the past by "the compiler is
| allowed to optimise something away".
|
| You write your code assuming such an optimisation will happen,
| and for some reason, the compiler decides not to apply the
| optimisation. Perhaps the wind was blowing in the wrong
| direction, or it was in a bad mood, or you forgot to specify
| -fvisibility=hidden.
|
| The exciting part here is that it happens by default, and the
| compiler is required to do it. I don't have to think about it,
| and the ABI automatically does the thing best for performance.
| jcelerier wrote:
| I don't think that fvisibility=hidden on its own is
| sufficient, it does not allow the compiler to break the call
| abi as the function could still be called from another .o
| (which will only know the mangled name of the original
| function). You need fvisibility=internal (or maybe fno-
| semantic-interposition but I'm not sure if it's enough).
| fourthark wrote:
| This is mostly a Google project?
|
| Unclear from the repo who is sponsoring it.
| bardworx wrote:
| Microsoft.
|
| - don't listen to me, it's Google.
| moralestapia wrote:
| It's Google but yes, another instance of _Embrace, Extend and
| Extinguish_.
|
| I hope this doesn't take off, with my sincere apologies to
| the ones who have been working hard on this. The last thing
| the C/C++ ecosystem needs is becoming _de facto_ owned by a
| private company.
| labrador wrote:
| I prefer a fast evolution when necessary to a slow
| committee process. I don't see any drawbacks, unless you're
| the type that prefers to build consensus before moving
| forward
| olliej wrote:
| In fairness this is as opposed to Go? Swift? Rust? Java?
| C#?
|
| Then C and C++ are de facto controlled by private companies:
| Google+Apple do pretty much all the clang development, MS
| does MSVC - if they choose not to implement a feature
| that's approved, or implement one that isn't, that is the
| de facto standard.
|
| You can argue it would require all three to agree on
| something, but that's still essentially making WG21
| somewhat irrelevant.
| dimitrios1 wrote:
| It hasn't been an issue at all for Go. As long as all the
| source is public, I don't see an issue using a large,
| overvalued tech monopoly for advancing technology.
| bardworx wrote:
| Sorry, not in the ecosystem (clearly).
|
| Why not? Just curious.
| moralestapia wrote:
| No problem, here's some context:
| https://en.wikipedia.org/wiki/Embrace,_extend,_and_extinguis...
| Taywee wrote:
| Yeah. Basically think a C++ alternative by C++ developers who
| are dissatisfied with the standards committee.
| Animats wrote:
| That's been done before. The original Modula 1 compiler had that.
| In a language where the default parameter mode is a read-only
| reference, it's an obvious optimization.
|
| The reverse is true. Anything passed by value can be treated as
| const reference by the compiler if the compiler knows enough
| about access and lifetime. The compiler must be able to determine
| that the parameter is neither deallocated nor modified while the
| function is using it. Rust compilers should be able to do that.
| mhh__ wrote:
| D and I think Fortran can also do this
| dfawcus wrote:
| The MIPS NUBI ABI from 2005 also proposed this for C:
| ftp://ftp.linux-mips.org//pub/linux/mips/doc/NUBI/MD00438-2C-NUBIDESC-SPC-00.20.pdf
|
| Section 3.4, page 21:
|
| "Arguments: [...]
|
| Derived types (structures etc) and non-standard scalar types
| are passed in a register if and only if their memory-stored
| image is register-size aligned and fits into a register. [...]
|
| All other arguments are passed by reference. The callee must
| copy the argument if it writes it or takes its address."
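The "callee must copy" rule quoted above can be imitated by hand in C++. This is only an analogy with hypothetical names (`Big`, `sum`, `sum_after_bump`): in the NUBI proposal the compiler would insert the copy, whereas here it is written out explicitly.

```cpp
#include <cstdint>

// Three 64-bit fields: too large to fit in a single register.
struct Big {
    std::int64_t a, b, c;
};

// Read-only callee: working through the reference is enough, no copy needed.
std::int64_t sum(const Big& arg) {
    return arg.a + arg.b + arg.c;
}

// Callee that wants to modify its argument: it makes the copy itself,
// so the caller's object is never touched.
std::int64_t sum_after_bump(const Big& arg) {
    Big local = arg;  // callee-side copy, made only because we write below
    local.a += 1;
    return local.a + local.b + local.c;
}
```

The design point is that the cost of the copy is paid only by callees that write to the argument or need its address; read-only callees get the cheap reference for free.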
| ArrayBoundCheck wrote:
| Calling convention? That's the most exciting feature? It doesn't
| boost productivity, reduce compile times, offer a larger standard
| library or create a boost a reasonable language would provide?
|
| No thanks google. I've been saying that for years now. You jumped
| the shark
| oneplane wrote:
| At first I thought this was about Mac OS around the 9 and 10
| times... but I suppose not everyone is going to make that link
| with this name.
| hinkley wrote:
| I got as far as "surely they can't be talking about OS 9,
| someone must have recycled the name"
| rgovostes wrote:
| Apple's Cocoa framework itself reused the name of a 90's visual
| programming language for kids developed by its own Advanced
| Technology Group. It was certainly a play on "java for kids."
|
| https://en.wikipedia.org/wiki/Stagecast_Creator
| lioeters wrote:
| That's the first time I heard of Cocoa as "Java for kids".
| How cute!
| mathgeek wrote:
| You're certainly not the only one. It's a confusing choice of
| naming.
| Someone wrote:
| I hope they picked that name for the same reason Apple picked
| it: because all life is built on carbon.
| oneplane wrote:
| Or because Carbon as an element is "C" and "C" is also the
| language the framework was built for.
| Maursault wrote:
| Well, Apple was a little clever with Carbon for this
| reason, because it was a _C_ based API. I'm not sure how
| you get Carbon from _C++._
| tambourine_man wrote:
| Also, the UI was called Aqua. So Aqua + Carbon is an
| interesting metaphor for an OS ecosystem.
| cflewis wrote:
| I think the statute of limitations on "confusing name" when
| Mac OS Carbon was removed from OS X 10 years ago has passed.
| morelisp wrote:
| Carbon was _removed_ two years ago.
| olliej wrote:
| Yeah, I was confused by the name carbon, but am able to
| recognize that what I think of by default is not relevant
| anymore.
| oneplane wrote:
| Since those old operating systems are now vintage or retro,
| there have been a few people porting things like Rust, Go
| and Swift to older Mac OS versions (the ones that use the
| Toolbox ROM) which was also in the news here.
|
| There is of course no 'registry of allowed names that have
| aged-out' but there are probably other creative names one
| could come up with.
| sedatk wrote:
| I mean, it's at least a bit less ambiguous than Go.
| booleandilemma wrote:
| I get the impression this guy just wanted something to blog
| about.
| tialaramex wrote:
| C++ also pays a price for insisting not only that objects have
| addresses, but those addresses are distinct.
|
| If you've got 1.6 billion empty tuples in variable A, 1.4
| billion in variable B and 1.8 billion in variable C, C++ can't
| see a way to do that on a 32-bit operating system. It needs to
| give each empty tuple an address, so it must think of 4.8 billion
| integers between 0 and 2^32 and it can't do that, so your program
| won't work.
|
| Carbon is still far from finished, but if objects needn't have
| addresses it can do the same as Rust here, and cheerfully handle
| A, B and C as merely counters, _counting_ 1.6 billion, 1.4
| billion and 1.8 billion respectively is fine. Empty tuples are
| indistinguishable, so I needn't worry about giving you back the
| "wrong" empty tuple when you remove one, I can just give you a
| fresh one each time and decrement the counter.
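A rough C++ illustration of both halves of this argument, using hypothetical names (`Empty`, `EmptyBag`). The first part shows that objects of an empty type still occupy storage and get distinct addresses; the second sketches the counter idea, which a C++ library could implement by hand but which a standard container of `Empty` elements does not do for you.

```cpp
#include <cstddef>

struct Empty {};

// Every complete object occupies at least one byte in C++.
static_assert(sizeof(Empty) >= 1);

// Two array elements of an empty type still have distinct addresses.
bool distinct_addresses() {
    Empty arr[2];
    return &arr[0] != &arr[1];
}

// Hypothetical container: all Empty values are interchangeable, so
// storing a count is enough; no per-element storage is needed.
class EmptyBag {
public:
    void push(Empty) { ++count_; }

    // Hands back a fresh Empty and decrements the count, if any are left.
    bool pop(Empty& out) {
        if (count_ == 0) return false;
        --count_;
        out = Empty{};
        return true;
    }

    std::size_t size() const { return count_; }

private:
    std::size_t count_ = 0;
};
```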
| shakow wrote:
| Is that really a problem now that the overwhelming majority of
| C++ programs run on 64bits platforms?
| comex wrote:
| A better example of address uniqueness being a problem is
| with code like:
|     struct Marker {};
|     struct Foo {
|         Marker marker;
|         int64_t number;
|     };
|
| If you write the equivalent in C with GCC extensions, or
| Rust, sizeof(Foo) would be 8, the same as sizeof(int64_t);
| `marker` doesn't take up any extra space. In C++, however,
| sizeof(Foo) is 16, because `marker` must take up at least 1
| byte to have a unique address, which gets expanded to 8 bytes
| due to alignment.
|
| Now, as of C++20, you _can_ reduce sizeof(Foo) to 8 by
| tagging `marker` as [[no_unique_address]]. However, this has
| drawbacks. First of all, it's easy to get situations like
| this in highly generic code, so it's hard to predict where
| [[no_unique_address]] needs to be applied (and applying it
| everywhere would be verbose).
|
| Second of all, [[no_unique_address]] is _dangerous_, because
| it doesn't just allow empty fields to be omitted, it also
| allows nonempty fields to have trailing padding bytes reused
| for other fields. Normally that's okay, but if you have any
| code that performs memcpy or memset or similar based on the
| size of a type, such as:
|     struct Foo {
|         Foo(const Foo &other) {
|             memcpy(this, &other, sizeof(Foo));
|         }
|         // ...some fields here...
|     };
|
| ...then if that code writes to a [[no_unique_address]] field,
| it can overwrite adjacent fields, since sizeof(Foo) includes
| any trailing padding bytes!
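The size difference described above can be checked with a short sketch. `Plain` and `Compact` are hypothetical names; the exact sizes depend on the platform ABI, and a compiler is allowed to ignore the attribute, so the assertions below only state what holds in general.

```cpp
#include <cstdint>

struct Marker {};

// Without the attribute: `marker` needs its own address, so it takes at
// least one byte and `number` is pushed out to its alignment boundary.
struct Plain {
    Marker marker;
    std::int64_t number;
};

// With the C++20 attribute: the compiler may let `marker` share storage
// with `number`, since an empty member has no state to protect.
struct Compact {
    [[no_unique_address]] Marker marker;
    std::int64_t number;
};

static_assert(sizeof(Plain) > sizeof(std::int64_t));
static_assert(sizeof(Compact) <= sizeof(Plain));
```

On a typical 64-bit GCC or Clang target one would expect `sizeof(Plain)` to be 16 and `sizeof(Compact)` to be 8, matching the numbers in the comment, but that is an expectation about common ABIs rather than something the standard guarantees.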
| Someone wrote:
| I may be overlooking something, but I don't see a realistic use
| case for having multiple empty tuples without an address.
|
| If you have those, I don't see any way to discriminate between
| them. If so, why would you ever want to have more than one of a
| given type? Is there some template code that might accidentally
| try to create them?
| zeusk wrote:
| Maybe the right question to ask is, why you have 4.8 billion
| empty tuples? And why you're still on a 32-bit system?
| borschtplease wrote:
| > And why you're still on a 32-bit system
|
| There are a lot of 32-bit microcontrollers, most notably STM32
| WJW wrote:
| Sure, but how many of those have to handle billions of
| empty tuples?
| TylerGlaiel wrote:
| since C++20 you can get around this
|
| https://en.cppreference.com/w/cpp/language/attributes/no_uni...
| axegon_ wrote:
| Dunno about the most exciting but the most irritating so far is
| setting it up. Admittedly I didn't spend a lot of time trying (I
| didn't have all that much time today) but in the 40 minutes or
| so, I was unable to get it going. I'll go back to attempting
| tomorrow and make sense of the errors.
| pyjarrett wrote:
| It's a bit early in its lifecycle to get too excited for features
| which already exist in other languages. Carbon is exciting
| because of C++ interoperability, but I already get this behavior
| today in Ada.
|     type Point is record
|         x, y, z : Interfaces.Integer_64;
|     end record;
|     procedure Print(p : Point);
|
| Is `p` passed by reference or value? The compiler chooses what it
| thinks is best--all parameters are considered `const` unless
| they're `out` parameters. There's some rules for classes (tagged
| types), uncopyable (limited) objects, and `aliased` parameters
| which are always passed by reference.
|
| I can't get a pointer type (access) out of the parameter to the
| function, since the accessibility rules prevent it:
|     -- "constant" since we don't know if it is writable
|     type Point_Access is access constant Point;
|     Last_Printed : Point_Access := null;
|     procedure Print(P : Point) is
|     begin
|         -- P'Access is sort of like C++ std::addressof(P) to get a "pointer"
|         -- There's also P'Address to get the actual address, but then
|         -- requires conversion to a pointer-like "access" type to be used.
|         --
|         -- Compiler Error: "non-local pointer cannot point to local object"
|         -- since Point_Access type is declared at a higher level
|         Last_Printed := P'Access;
|         -- If we really, really, want to do it, "I'm smarter than the
|         -- compiler", you can force it...
|         -- Think of "Unrestricted" and "Unchecked" as grep-able warning
|         -- signs of "this is potentially very dangerous"
|         Last_Printed := P'Unrestricted_Access;
|         -- ...
|     end Print;
|
| What about making and then trying to use a local pointer-like
| type? This doesn't work because you can only create pointer-like
| accesses to types which have been marked as `aliased`, since you
| don't know if there's a location you can point to which has the
| value.
|     procedure Print (P : Point) is
|         type Local_Access is access constant Point;
|         -- Compiler Error: prefix of "Access" attribute must be aliased
|         Ptr_Like : Local_Access := P'Access;
|         -- Similar, "I am smarter than compiler" trick works here too...
|         Ptr_Like : Local_Access := P'Unrestricted_Access;
|
| You can allow passing any arbitrary pointer into a function by
| providing `access`, but you're not allowed to store it, since you
| don't know which flavor of the pointer type it could be, e.g. if
| it points to something on the stack, or on the heap:
|     type Point_Access is access constant Point;
|     Last_Printed : Point_Access := null;
|     -- Allow printing any pointer-like (access) to a point.
|     procedure Print (P : access constant Point) is
|     begin
|         -- Compile Error: implicit conversion of anonymous access
|         -- parameter not allowed
|         Last_Printed := P;
|         -- If we really, really want to do this, we can force it with
|         -- a cast...
|         Last_Printed := Point_Access (P);
|         -- ...
|     end Print;
| rgovostes wrote:
| I was very confused by what the author is pointing out in the
| opening Point / Print example. ("[T]he compiler is allowed to
| convert that to a T" -> wait, why did the compiler change a
| struct to an int32?)
|
| I think this boils down to: Carbon defaults to passing parameters
| that fit in a single register by value, and all others by const
| reference. This affects a few things you might take for granted
| in C++, like whether you can take a reference to a parameter.
|
| The opening example is showing two samples of Carbon and the
| equivalent C++ code, noting that the "undecorated" parameter `p :
| Point` is equivalent to `const Point& p` (pass by const reference
| to a struct) and `x : i32` is equivalent to `std::int32_t x`
| (pass by value).
| NonNefarious wrote:
| An old Apple UI kit has an "exciting" feature?
| olliej wrote:
| Someone didn't read the article :)
| zascrash wrote:
| It's about Carbon Language (from google) not the framework.
| https://github.com/carbon-language/carbon-lang
| glouwbug wrote:
| Aren't we just reinventing the wheel over and over at this point?
| Sure could use another Go, C wasn't doing functions well at all /s
| ISL wrote:
| I'm sure the ship has sailed on Carbon's naming convention, but
| darn if that isn't a confusing article-title.
|
| It is interesting to contemplate the most-ambiguous and least-
| comprehensible/googleable name one might be able to give to a
| piece of software. "the"? "Biden"? "Russia"? "water"? "air"?
| "dog"? "person"? "!"? "?"? " "?
| labrador wrote:
| Google engineers: "Let's make our languages hard to Google!"
|
| Google managers: "Whatever floats your boat. How about Go, Dart
| and Carbon?"
| serial_dev wrote:
| I've been a Dart developer for three years and had zero
| issues finding anything Dart related.
|
| I also had to write some Go, that was actually pretty bad in
| terms of searchability.
| Cyberdog wrote:
| Swift was annoying in the early days since searches would
| bring up info on the international payment system.
|
| Lua and PHP have nice searchability. At least, it's fairly
| easy to weed out results on the Portuguese moon and the
| Philippine peso.
| bbkane wrote:
| I really think the most exciting feature of Carbon (indeed, its
| justification for existence) is its backwards compatibility with
| C++, followed closely by its more modern and flexible governance
| structure. Even the docs in the repo say you should avoid Carbon
| unless you have lots of C++ code you need to interact with.
| [deleted]
| usrnm wrote:
| Some of the biggest problems with C++ come from its backwards
| compatibility with C. Yes, it wins you users in the short term,
| but it's a pain to support as both languages evolve
| olliej wrote:
| The binary compatibility is the big deal. The other safe
| languages have explicitly taken the view the interop with C++
| is bad, and so we should instead do interop with the
| significantly less safe C instead.
|
| The real killer is that the lack of any interaction with C++
| means that you can't do any real incremental adoption of one
| of those safe languages in any big security critical
| projects. Saying that the solution to C++ is to just not use
| it ignores the reality that C++ exists, and large projects in
| C++ exist. It doesn't matter if you don't like C++.
|
| The final problem with the safe languages - with the
| exception of swift - is that they are all hell bent on not
| providing even just basic ABI stability. It doesn't matter
| how "safe" your language is if the first thing you have to do
| is make a pure C (even losing potential for automatic
| lifetime management the C++ would allow) interface.
|
| So I can have two libraries both written in rust, and the
| entire safety of it is contingent on each library talking to
| the other through a C API.
| usrnm wrote:
| > The other safe languages have explicitly taken the view
| the interop with C++ is bad
|
| It's not that it's inherently bad, it's just insanely
| difficult to do and, probably, isn't worth the pain.
| Carbon's approach to solving this problem includes
| embedding a custom C++ compiler as part of its toolchain,
| and at this point it's just the idea, who knows if they
| will be able to actually do it.
|
| > they are all hell bent on not providing even just basic
| ABI stability
|
| Right, the famous stable C++ ABI
| staticassertion wrote:
| > The other safe languages have explicitly taken the view
| the interop with C++ is bad
|
| I think it's more that interop with C++ is _hard_ and
| ultimately a lot less valuable than interop with C, which
| is what the vast majority of languages use for FFI.
|
| > It doesn't matter how "safe" your language is if the
| first thing you have to do is make a pure C (even losing
| potential for automatic lifetime management the C++ would
| allow) interface.
|
| I don't see how that leads to "it doesn't matter"
| __float wrote:
| If you consider the context of Carbon (Google having a
| _lot_ of C++), solving for "interop with C++ is hard"
| might be considered worth the price.
|
| Whether that's true in practice...? I guess we'll need a
| few more years to tell.
___________________________________________________________________
(page generated 2022-07-30 23:00 UTC)