[HN Gopher] Carbon's most exciting feature is its calling conven...
       ___________________________________________________________________
        
       Carbon's most exciting feature is its calling convention
        
       Author : obl
       Score  : 103 points
       Date   : 2022-07-30 18:43 UTC (4 hours ago)
        
 (HTM) web link (www.foonathan.net)
 (TXT) w3m dump (www.foonathan.net)
        
       | Denvercoder9 wrote:
       | Maybe I'm missing something, but I don't see what's special about
       | Carbon here. In C++ the compiler can also optimize pass-by-const-
       | reference to pass-by-value, and they do. It just can't do it
       | across an ABI boundary, but that should only be an issue with
       | dynamic libraries, and Carbon has to follow the standard ABI
       | there as well. Just make sure the compiler knows it doesn't have
       | to follow the standard ABI for every symbol in your program, e.g.
       | by enabling LTO and setting -fvisibility=hidden. See what happens
       | when `add` is made static: https://godbolt.org/z/KsqxP5Pxh (I also had
       | to change foo() to avoid the compiler hardcoding the result of
       | the single call in `add`).
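
A minimal sketch of the situation described above, assuming a helper named `add` that takes both operands by const reference. The exact code behind the Godbolt link is not reproduced here, and `sum_pair` is a made-up caller. Giving `add` internal linkage means no other translation unit can call it, so the compiler is not bound to the platform ABI for it and may pass the ints in registers. Whether it actually does so depends on the compiler and flags.

```cpp
// Hypothetical reconstruction: `add` and `sum_pair` are illustrative names.
// `static` gives `add` internal linkage, so its calling convention is an
// implementation detail the optimizer is free to change.
static int add(const int& a, const int& b) {
    return a + b;
}

// Externally visible caller. It takes runtime values so the compiler cannot
// simply fold the call into a constant.
int sum_pair(int x, int y) {
    return add(x, y);
}
```

Comparing the generated assembly with and without `static` is the easiest way to see whether the references were turned into plain values.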
        
         | olliej wrote:
         | I think the alias analysis "pro" is overblown, it's UB to take
         | the address of a parameter via any mechanism other than
         | explicitly taking the address - which a compiler obviously
         | sees, and because it's UB the compiler is free to
         | assume no one is taking the address. Then for any parameters
         | that are passed by reference the compiler has to assume there
         | are other references so there's no gain.
         | 
         | Honestly the only thing that really stood out as nice is the
         | default pass by word-sized value, in the context of templates -
         | it's a thing that is achievable in C++, but requires a bunch of
         | obnoxious additional templates that aren't even part of the
         | standard library so everyone ends up reimplementing the same
         | cruft. Happily I believe there's a proposal to add this exact
         | functionality.
         | 
         | I also loathe their desire to listen to the BNF maximalists
         | insistence on not having any "ambiguity" from <>s. I'm sorry
         | it's clearly parseable, and <>s are the standard token for
         | decades. Switching to []s doesn't make it less conceptually
         | ambiguous, if anything it makes it more ambiguous to a human
         | reader. The only people who don't want <>s are PLT academics
         | obsessed with forcing their dragon book idea of what a grammar
         | should be. You can't argue you're doing it because the
         | ambiguity in a grammar or lexer is bad, because then you would
         | also drop infix operators.
         | 
         | Then in carbon the more reasonable adoption of pascal's :
         | notation for typing a variable or parameter removes the most
         | common case of the supposedly terrible ambiguity anyway.
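
The "obnoxious additional templates" mentioned above might look roughly like the following sketch. `param_t` and `byte_size` are hypothetical names, not part of the standard library, and the size threshold (one pointer) is an assumption chosen for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <type_traits>

// Hypothetical helper: pass small trivially copyable types by value and
// everything else by const reference.
template <typename T>
using param_t = std::conditional_t<
    std::is_trivially_copyable_v<T> && sizeof(T) <= sizeof(void*),
    T,
    const T&>;

// Generic function whose parameter passing mode depends on T.
template <typename T>
std::size_t byte_size(param_t<T> value) {
    return sizeof(value);  // sizeof on a reference yields the referred type's size
}

static_assert(std::is_same_v<param_t<std::int32_t>, std::int32_t>);
static_assert(std::is_same_v<param_t<std::string>, const std::string&>);
```

One cost of this approach is that `T` can no longer be deduced from the argument, so callers have to write `byte_size<std::int32_t>(7)` explicitly.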
        
           | temac wrote:
           | > it's UB to take the address of a parameter via any
           | mechanism other than explicitly taking the address - which a
           | compiler obviously sees, and because it's UB the compiler
           | is free to assume no one is taking the address.
           | 
           | I don't get that: can you express in C++ a code that "take
           | the address of a parameter via any mechanism other than
           | explicitly taking the address"?
        
             | josephcsible wrote:
             | Consider this C code (also "works" if compiled as C++):
             | 
             |     int main(void) {
             |         int x = 0;
             |         int arr[1];
             |         int *p = arr + 1;
             |         *p = 42;
             |         return x;
             |     }
             | 
             | On a lot of systems (e.g.,
             | https://godbolt.org/z/jYqM8TT3Y), it just so happens that
             | `x` is right above `arr` on the stack, so that code will
             | return 42. But that code is absolutely UB.
             | 
             | The more general name for this concept is "pointer
             | provenance". Basically, you can't pull pointer values out
             | of thin air; you have to derive them from operations rooted
             | at taking the address of something within the same
             | allocation.
        
               | hedora wrote:
               | That's a buffer overflow. The optimizer doesn't need to
               | reason about changing the behavior of such things.
        
               | josephcsible wrote:
               | The point is that on systems where that code returns 42,
               | `p` has the exact same value it would if I did `int *p =
               | &x;` instead, but not the same provenance.
        
           | [deleted]
        
           | zasdffaa wrote:
           | > I also loathe their desire to listen to the BNF maximalists
           | insistence on not having any "ambiguity" from <>s.
           | 
           | Odd that you put ambiguity in quotes. I guess it's not real
           | ambiguity then.
           | 
           | Ivory tower academics, as you see them, actually care about
           | stuff because it affects you. I've heard too many times the
           | knee-jerk response of "it's just theory" as if that actually
           | meant anything. Try implementing it yourself and you'd start
           | to understand that little bits of crap typically don't add,
           | they multiply (Edit: and I'm hurting from exactly this right
           | now), so don't diss theory until you've (ironically) had some
           | practice with it.
        
         | comex wrote:
         | That optimization is quite fragile.
         | 
         | For example, try putting `puts("hello");` at the beginning of
         | `add`. Now neither GCC nor Clang performs the optimization.
         | Why? Because `puts` could theoretically modify the value behind
         | the reference, so the value loaded is not necessarily the same
         | as the value at the beginning of the function, which makes
         | things more complicated, so both compilers give up.
         | 
         | As another example, GCC and Clang both perform the optimization
         | within a translation unit, but if the function definition and
         | the call are in _different_ translation units, GCC doesn't
         | perform the optimization even with LTO, and Clang doesn't
         | perform it with ThinLTO (but does perform it with full LTO).
         | Meanwhile, many projects don't compile with any form of LTO,
         | which is a reasonable decision to improve compilation speed and
         | predictability.
         | 
         | Neither compiler is smart enough to perform the optimization
         | for virtual calls in almost any situation.
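
A sketch of why an opaque call in the body gets in the way of the rewrite. Here a callback stands in for `puts`, and all names are made up for illustration. If the call can modify the object behind the reference, the by-reference and by-value versions are observably different, so the compiler can only switch between them when it can prove the call does not touch that object.

```cpp
// Illustrative names only. `counter` plays the role of the caller's object.
int counter = 1;

void bump() { ++counter; }

// By reference: the read happens after the callback, so it sees the update.
int read_by_reference(const int& a, void (*callback)()) {
    callback();
    return a;
}

// By value: the argument was copied before the callback ran.
int read_by_value(int a, void (*callback)()) {
    callback();
    return a;
}
```

With `counter` starting at 1, `read_by_reference(counter, bump)` yields 2 while `read_by_value(counter, bump)` yields 1.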
        
         | moralestapia wrote:
         | >Maybe I'm missing something, but I don't see what's special
         | about Carbon here.
         | 
         | It's nothing really out of the ordinary ... also that
         | particular optimization is meager in practice (on the order of
         | 1%?). I know every improvement adds up, but the author
         | talks about it as if it was a game changer. I like the
         | enthusiasm, though :D.
        
         | phire wrote:
         | I have been burned many times in the past by "the compiler is
         | allowed to optimise something away".
         | 
         | You write your code assuming such an optimisation will happen,
         | and for some reason, the compiler decides not to apply the
         | optimisation. Perhaps the wind was blowing in the wrong
         | direction, or it was in a bad mood, or you forgot to specify
         | -fvisibility=hidden.
         | 
         | The exciting part here is that it happens by default, and the
         | compiler is required to do it. I don't have to think about it,
         | and the ABI automatically does the thing best for performance.
        
           | jcelerier wrote:
           | I don't think that fvisibility=hidden on its own is
           | sufficient, it does not allow the compiler to break the call
           | abi as the function could still be called from another .o
           | (which will only know the mangled name of the original
           | function). You need fvisibility=internal (or maybe fno-
           | semantic-interposition but I'm not sure if it's enough).
        
       | fourthark wrote:
       | This is mostly a Google project?
       | 
       | Unclear from the repo who is sponsoring it.
        
         | bardworx wrote:
         | Microsoft.
         | 
         | - don't listen to me, it's Google.
        
           | moralestapia wrote:
           | It's Google but yes, another instance of _Embrace, Extend and
           | Extinguish_.
           | 
           | I hope this doesn't take off, with my sincere apologies to
           | the ones who have been working hard on this. The last thing
           | the C/C++ ecosystem needs is becoming _de facto_ owned by a
           | private company.
        
             | labrador wrote:
             | I prefer a fast evolution when necessary to a slow
             | committee process. I don't see any drawbacks, unless you're
             | the type that prefers to build consensus before moving
             | forward
        
             | olliej wrote:
             | In fairness this is as opposed to Go? Swift? Rust? Java?
             | C#?
             | 
             | Then C and C++ are de facto controlled by private companies:
             | Google+Apple do pretty much all the clang development, MS
             | does MSVC - if they choose not to implement a feature
             | that's approved, or implement one that isn't, that is the
             | de facto standard.
             | 
             | You can argue it would require all three to agree on
             | something, but that's still essentially making WG21
             | somewhat irrelevant.
        
             | dimitrios1 wrote:
             | It hasn't been an issue at all for Go. As long as all the
             | source is public, I don't see an issue using a large,
             | overvalued tech monopoly for advancing technology.
        
             | bardworx wrote:
             | Sorry, not in the ecosystem (clearly).
             | 
             | Why not? Just curious.
        
               | moralestapia wrote:
               | No problem, here's some context, https://en.wikipedia.org
               | /wiki/Embrace,_extend,_and_extinguis...
        
         | Taywee wrote:
         | Yeah. Basically think a C++ alternative by C++ developers who
         | are dissatisfied with the standards committee.
        
       | Animats wrote:
       | That's been done before. The original Modula 1 compiler had that.
       | In a language where the default parameter mode is a read-only
       | reference, it's an obvious optimization.
       | 
       | The reverse is true. Anything passed by value can be treated as
       | const reference by the compiler if the compiler knows enough
       | about access and lifetime. The compiler must be able to determine
       | that the parameter is neither deallocated nor modified while the
       | function is using it. Rust compilers should be able to do that.
        
         | mhh__ wrote:
         | D and I think Fortran can also do this
        
         | dfawcus wrote:
         | The MIPS NUBI ABI from 2005 also proposed this for C:
         | ftp://ftp.linux-
         | mips.org//pub/linux/mips/doc/NUBI/MD00438-2C-NUBIDESC-
         | SPC-00.20.pdf
         | 
         | Section 3.4, page 21:
         | 
         | "Arguments: [...]
         | 
         | Derived types (structures etc) and non-standard scalar types
         | are passed in a register if and only if their memory- stored
         | image is register-size aligned and fits into a register. [...]
         | 
         | All other arguments are passed by reference. The callee must
         | copy the argument if it writes it or takes its address."
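
The callee-copies rule quoted above can be imitated by hand in C++. This is only a sketch: `Big` and `sum_with_first_doubled` are hypothetical names, and a real ABI would do this below the source level rather than in user code.

```cpp
// Hypothetical aggregate that is too large for a single register.
struct Big {
    long values[4];
};

// The argument arrives by reference. Because the callee wants to write to it,
// the callee makes its own copy first and leaves the caller's object alone.
long sum_with_first_doubled(const Big& arg) {
    Big local = arg;       // callee-side copy, made only because we write
    local.values[0] *= 2;
    long total = 0;
    for (long v : local.values) {
        total += v;
    }
    return total;
}
```

A callee that only reads `arg` would skip the copy entirely, which is where the saving comes from.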
        
       | ArrayBoundCheck wrote:
       | Calling convention? That's the most exciting feature? It doesn't
       | boost productivity, reduce compile times, offer a larger standard
       | library or create a boost a reasonable language would provide?
       | 
       | No thanks google. I've been saying that for years now. You jumped
       | the shark
        
       | oneplane wrote:
       | At first I thought this was about Mac OS around the 9 and 10
       | times... but I suppose not everyone is going to make that link
       | with this name.
        
         | hinkley wrote:
         | I got as far as "surely they can't be talking about OS 9,
         | someone must have recycled the name"
        
         | rgovostes wrote:
         | Apple's Cocoa framework itself reused the name of a 90's visual
         | programming language for kids developed by its own Advanced
         | Technology Group. It was certainly a play on "java for kids."
         | 
         | https://en.wikipedia.org/wiki/Stagecast_Creator
        
           | lioeters wrote:
           | That's the first time I heard of Cocoa as "Java for kids".
           | How cute!
        
         | mathgeek wrote:
         | You're certainly not the only one. It's a confusing choice of
         | naming.
        
           | Someone wrote:
           | I hope they picked that name for the same reason Apple picked
           | it: because all life is built on carbon.
        
             | oneplane wrote:
             | Or because Carbon as an element is "C" and "C" is also the
             | language the framework was built for.
        
               | Maursault wrote:
               | Well, Apple was a little clever with Carbon for this
               | reason, because it was a _C_ based API. I'm not sure how
               | you get Carbon from _C++._
        
               | tambourine_man wrote:
               | Also, the UI was called Aqua. So Aqua + Carbon is an
               | interesting metaphor for an OS ecosystem.
        
           | cflewis wrote:
           | I think the statute of limitations on "confusing name" when
           | Mac OS Carbon was removed from OS X 10 years ago has passed.
        
             | morelisp wrote:
             | Carbon was _removed_ two years ago.
        
             | olliej wrote:
             | Yeah, I was confused by the name carbon, but am able to
             | recognize that what I think of by default is not relevant
             | anymore.
        
             | oneplane wrote:
             | Since those old operating systems are now vintage or retro,
             | there have been a few people porting things like Rust, Go
             | and Swift to older Mac OS versions (the ones that use the
             | Toolbox ROM) which was also in the news here.
             | 
             | There is of course no 'registry of allowed names that have
             | aged-out' but there are probably other creative names one
             | could come up with.
        
         | sedatk wrote:
         | I mean, it's at least a bit less ambiguous than Go.
        
       | booleandilemma wrote:
       | I get the impression this guy just wanted something to blog
       | about.
        
       | tialaramex wrote:
       | C++ also pays a price for insisting not only that objects have
       | addresses, but those addresses are distinct.
       | 
       | If you've got 1.6 billion empty tuples in variable A, 1.4
       | billion in variable B and 1.8 billion in variable C, C++ can't
       | see a way to do that on a 32-bit operating system. It needs to
       | give each empty tuple an address, so it must think of 4.8 billion
       | integers between 0 and 2^32 and it can't do that, so your program
       | won't work.
       | 
       | Carbon is still far from finished, but if objects needn't have
       | addresses it can do the same as Rust here, and cheerfully handle
       | A, B and C as merely counters, _counting_ 1.6 billion, 1.4
       | billion and 1.8 billion respectively is fine. Empty tuples are
       | indistinguishable, so I needn't worry about giving you back the
       | "wrong" empty tuple when you remove one, I can just give you a
       | fresh one each time and decrement the counter.
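
The "merely counters" idea can be sketched by hand in C++. `Unit` and `UnitBag` are hypothetical names, and this is not how any standard container behaves; it only illustrates that a collection of indistinguishable empty values needs nothing more than a count.

```cpp
#include <cstdint>
#include <optional>

struct Unit {};  // stand-in for an empty tuple

// A "container" of Units that stores no elements, only how many there are.
class UnitBag {
public:
    void push(Unit) { ++count_; }

    // Hands back a fresh Unit; since all Units are indistinguishable,
    // there is no "wrong" one to return.
    std::optional<Unit> pop() {
        if (count_ == 0) {
            return std::nullopt;
        }
        --count_;
        return Unit{};
    }

    std::uint64_t size() const { return count_; }

private:
    std::uint64_t count_ = 0;  // 64-bit, so billions fit even on a 32-bit target
};
```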
        
         | shakow wrote:
         | Is that really a problem now that the overwhelming majority of
         | C++ programs run on 64bits platforms?
        
           | comex wrote:
           | A better example of address uniqueness being a problem is
           | with code like:
           | 
           |     struct Marker {};
           |     struct Foo {
           |         Marker marker;
           |         int64_t number;
           |     };
           | 
           | If you write the equivalent in C with GCC extensions, or
           | Rust, sizeof(Foo) would be 8, the same as sizeof(int64_t);
           | `marker` doesn't take up any extra space. In C++, however,
           | sizeof(Foo) is 16, because `marker` must take up at least 1
           | byte to have a unique address, which gets expanded to 8 bytes
           | due to alignment.
           | 
           | Now, as of C++20, you _can_ reduce sizeof(Foo) to 8 by
           | tagging `marker` as [[no_unique_address]]. However, this has
           | drawbacks. First of all, it's easy to get situations like
           | this in highly generic code, so it's hard to predict where
           | [[no_unique_address]] needs to be applied (and applying it
           | everywhere would be verbose).
           | 
           | Second of all, [[no_unique_address]] is _dangerous_, because
           | it doesn't just allow empty fields to be omitted, it also
           | allows nonempty fields to have trailing padding bytes reused
           | for other fields. Normally that's okay, but if you have any
           | code that performs memcpy or memset or similar based on the
           | size of a type, such as:
           | 
           |     struct Foo {
           |         Foo(const Foo &other) {
           |             memcpy(this, &other, sizeof(Foo));
           |         }
           |         // ...some fields here...
           |     };
           | 
           | ...then if that code writes to a [[no_unique_address]] field,
           | it can overwrite adjacent fields, since sizeof(Foo) includes
           | any trailing padding bytes!
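
The size difference described above can be checked with a small sketch. `CompactFoo` is a made-up name for the annotated variant. The exact numbers assume a typical 64-bit target using the Itanium C++ ABI (GCC or Clang); MSVC ignores the standard attribute spelling, so results there can differ.

```cpp
#include <cstdint>

struct Marker {};

// Plain member: `marker` needs its own address, so it occupies storage
// and alignment padding follows it.
struct Foo {
    Marker marker;
    std::int64_t number;
};

// C++20 attribute: `marker` is allowed to overlap with `number`.
struct CompactFoo {
    [[no_unique_address]] Marker marker;
    std::int64_t number;
};
```

Under those assumptions `sizeof(Foo)` comes out larger than `sizeof(std::int64_t)`, while `sizeof(CompactFoo)` equals it.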
        
         | Someone wrote:
         | I may be overlooking something, but I don't see a realistic use
         | case for having multiple empty tuples without an address.
         | 
         | If you have those, I don't see any way to discriminate between
         | them. If so, why would you ever want to have more than one of a
         | given type? Is there some template code that might accidentally
         | try to create them?
        
         | zeusk wrote:
         | Maybe the right question to ask is, why you have 4.8 billion
         | empty tuples? And why you're still on a 32-bit system?
        
           | borschtplease wrote:
           | > And why you're still on a 32-bit system
           | 
           | There are a lot of 32-bit microcontrollers, most notably STM32.
        
             | WJW wrote:
             | Sure, but how many of those have to handle billions of
             | empty tuples?
        
         | TylerGlaiel wrote:
         | since C++20 you can get around this
         | 
         | https://en.cppreference.com/w/cpp/language/attributes/no_uni...
        
       | axegon_ wrote:
       | Dunno about the most exciting but the most irritating so far is
       | setting it up. Admittedly I didn't spend a lot of time trying (I
       | didn't have all that much time today) but in the 40 minutes or
       | so, I was unable to get it going. I'll go back to attempting
       | tomorrow and make sense of the errors.
        
       | pyjarrett wrote:
       | It's a bit early in its lifecycle to get too excited for features
       | which already exist in other languages. Carbon is exciting
       | because of C++ interoperability, but I already get this behavior
       | today in Ada.
       | 
       |     type Point is record
       |        x, y, z : Interfaces.Integer_64;
       |     end record;
       | 
       |     procedure Print(p : Point);
       | 
       | Is `p` passed by reference or value? The compiler chooses what it
       | thinks is best--all parameters are considered `const` unless
       | they're `out` parameters. There's some rules for classes (tagged
       | types), uncopyable (limited) objects, and `aliased` parameters
       | which are always passed by reference.
       | 
       | I can't get a pointer type (access) out of the parameter to the
       | function, since the accessibility rules prevent it:
       | 
       |     -- "constant" since we don't know if it is writable
       |     type Point_Access is access constant Point;
       | 
       |     Last_Printed : Point_Access := null;
       | 
       |     procedure Print(P : Point) is
       |     begin
       |        -- P'Access is sort of like C++ std::addressof(P) to get a "pointer"
       |        -- There's also P'Address to get the actual address, but then it
       |        -- requires conversion to a pointer-like "access" type to be used.
       |        --
       |        -- Compiler Error: "non-local pointer cannot point to local object"
       |        -- since Point_Access type is declared at a higher level
       |        Last_Printed := P'Access;
       | 
       |        -- If we really, really want to do it ("I'm smarter than the
       |        -- compiler"), you can force it...
       |        -- Think of "Unrestricted" and "Unchecked" as grep-able warning
       |        -- signs of "this is potentially very dangerous"
       |        Last_Printed := P'Unrestricted_Access;
       | 
       |        -- ...
       |     end Print;
       | 
       | What about making and then trying to use a local pointer-like
       | type? This doesn't work because you can only create pointer-like
       | accesses to types which have been marked as `aliased`, since you
       | don't know if there's a location you can point to which has the
       | value.
       | 
       |     procedure Print (P : Point) is
       |        type Local_Access is access constant Point;
       | 
       |        -- Compiler Error: prefix of "Access" attribute must be aliased
       |        Ptr_Like : Local_Access := P'Access;
       | 
       |        -- Similar, "I am smarter than compiler" trick works here too...
       |        Ptr_Like : Local_Access := P'Unrestricted_Access;
       | 
       | You can allow passing any arbitrary pointer into a function by
       | providing `access`, but you're not allowed to store it, since you
       | don't know which flavor of the pointer type it could be, e.g. if
       | it points to something on the stack, or on the heap:
       | 
       |     type Point_Access is access constant Point;
       |     Last_Printed : Point_Access := null;
       | 
       |     -- Allow printing any pointer-like (access) to a point.
       |     procedure Print (P : access constant Point) is
       |     begin
       |        -- Compile Error: implicit conversion of anonymous access
       |        -- parameter not allowed
       |        Last_Printed := P;
       | 
       |        -- If we really, really want to do this, we can force it with
       |        -- a cast...
       |        Last_Printed := Point_Access (P);
       |        -- ...
       |     end Print;
        
       | rgovostes wrote:
       | I was very confused by what the author is pointing out in the
       | opening Point / Print example. ("[T]he compiler is allowed to
       | convert that to a T" -> wait, why did the compiler change a
       | struct to an int32?)
       | 
       | I think this boils down to: Carbon defaults to passing parameters
       | that fit in a single register by value, and all others by const
       | reference. This affects a few things you might take for granted
       | in C++, like whether you can take a reference to a parameter.
       | 
       | The opening example is showing two samples of Carbon and the
       | equivalent C++ code, noting that the "undecorated" parameter `p :
       | Point` is equivalent to `const Point& p` (pass by const reference
       | to a struct) and `x : i32` is equivalent to `std::int32_t x`
       | (pass by value).
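
The C++ side of that equivalence might be sketched as follows. The field layout of `Point` is an assumption (three 64-bit integers, borrowed from the Ada example elsewhere in this thread), and `Sum` and `Twice` are made-up functions used only so there is something to call.

```cpp
#include <cstdint>

// Assumed layout; the article's actual Point may differ.
struct Point {
    std::int64_t x, y, z;
};

// Roughly what an undecorated Carbon parameter `p: Point` corresponds to:
// the struct does not fit in one register, so it is passed by const reference.
std::int64_t Sum(const Point& p) {
    return p.x + p.y + p.z;
}

// Roughly what `x: i32` corresponds to: small enough to be passed by value.
std::int32_t Twice(std::int32_t x) {
    return x * 2;
}
```

In C++ the author picks between these two spellings by hand; the point of the article is that Carbon makes the choice from the type.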
        
       | NonNefarious wrote:
       | An old Apple UI kit has an "exciting" feature?
        
         | olliej wrote:
         | Someone didn't read the article :)
        
         | zascrash wrote:
         | It's about Carbon Language (from google) not the framework.
         | https://github.com/carbon-language/carbon-lang
        
       | glouwbug wrote:
       | Aren't we just reinventing the wheel over and over at this point?
       | Sure could use another Go, C wasn't doing functions well at all /s
        
       | ISL wrote:
       | I'm sure the ship has sailed on Carbon's naming convention, but
       | darn if that isn't a confusing article-title.
       | 
       | It is interesting to contemplate the most-ambiguous and least-
       | comprehensible/googleable name one might be able to give to a
       | piece of software. "the"? "Biden"? "Russia"? "water"? "air"?
       | "dog"? "person"? "!"? "?"? " "?
        
         | labrador wrote:
         | Google engineers: "Let's make our languages hard to Google!"
         | 
         | Google managers: "Whatever floats your boat. How about Go, Dart
         | and Carbon?"
        
           | serial_dev wrote:
           | I've been a Dart developer for three years and had zero
           | issues finding anything Dart related.
           | 
           | I also had to write some Go, that was actually pretty bad in
           | terms of searchability.
        
           | Cyberdog wrote:
           | Swift was annoying in the early days since searches would
           | bring up info on the international payment system.
           | 
           | Lua and PHP have nice searchability. At least, it's fairly
           | easy to weed out results on the Portuguese moon and the
           | Philippine peso.
        
       | bbkane wrote:
       | I really think the most exciting feature of Carbon (indeed, its
       | justification for existence) is its backwards compatibility with
       | C++, followed closely by its more modern and flexible governance
       | structure. Even the docs in the repo say you should avoid Carbon
       | unless you have lots of C++ code you need to interact with.
        
         | [deleted]
        
         | usrnm wrote:
         | Some of the biggest problems with C++ come from its backwards
         | compatibility with C. Yes, it wins you users in the short term,
         | but it's a pain to support as both languages evolve
        
           | olliej wrote:
           | The binary compatibility is the big deal. The other safe
           | languages have explicitly taken the view the interop with C++
           | is bad, and so we should instead do interop with the
           | significantly less safe C instead.
           | 
           | The real killer is that the lack of any interaction with C++
           | means that you can't do any real incremental adoption of one
           | of those safe languages in any big security critical
           | projects. Saying that the solution to C++ is to just not use
           | it ignores the reality that C++ exists, and large projects in
           | C++ exist. It doesn't matter if you don't like C++.
           | 
           | The final problem with the safe languages - with the
           | exception of swift - is that they are all hell bent on not
           | providing even just basic ABI stability. It doesn't matter
           | how "safe" your language is if the first thing you have to do
           | is make a pure C (even losing potential for automatic
           | lifetime management the C++ would allow) interface.
           | 
           | So I can have two libraries both written in rust, and the
           | entire safety of it is contingent on each library talking to
           | the other through a C API.
        
             | usrnm wrote:
             | > The other safe languages have explicitly taken the view
             | the interop with C++ is bad
             | 
             | It's not that it's inherently bad, it's just insanely
             | difficult to do and, probably, isn't worth the pain.
             | Carbon's approach to solving this problem includes
             | embedding a custom C++ compiler as part of its toolchain,
             | and at this point it's just the idea, who knows if they
             | will be able to actually do it.
             | 
             | > they are all hell bent on not providing even just basic
             | ABI stability
             | 
             | Right, the famous stable C++ ABI
        
             | staticassertion wrote:
             | > The other safe languages have explicitly taken the view
             | the interop with C++ is bad
             | 
             | I think it's more that interop with C++ is _hard_ and
             | ultimately a lot less valuable than interop with C, which
             | is what the vast majority of languages use for FFI.
             | 
             | > It doesn't matter how "safe" your language is if the
             | first thing you have to do is make a pure C (even losing
             | potential for automatic lifetime management the C++ would
             | allow) interface.
             | 
             | I don't see how that leads to "it doesn't matter"
        
               | __float wrote:
               | If you consider the context of Carbon (Google having a
               | _lot_ of C++), solving for "interop with C++ is hard"
               | might be considered worth the price.
               | 
               | Whether that's true in practice...? I guess we'll need a
               | few more years to tell.
        
       ___________________________________________________________________
       (page generated 2022-07-30 23:00 UTC)