[HN Gopher] GCC 15.1
       ___________________________________________________________________
        
       GCC 15.1
        
       Author : jrepinc
       Score  : 174 points
       Date   : 2025-04-25 10:53 UTC (12 hours ago)
        
 (HTM) web link (gcc.gnu.org)
 (TXT) w3m dump (gcc.gnu.org)
        
       | Calavar wrote:
       | > {0} initializer in C or C++ for unions no longer guarantees
       | clearing of the whole union (except for static storage duration
       | initialization), it just initializes the first union member to
       | zero. If initialization of the whole union including padding bits
       | is desirable, use {} (valid in C23 or C++) or use -fzero-init-
       | padding-bits=unions option to restore old GCC behavior.
       | 
       | This is going to silently break so much existing code, especially
       | union based type punning in C code. {0} used to guarantee full
       | zeroing and {} did not, and step by step we've flipped the
       | situation to the reverse. The only sensible thing, in terms of
       | not breaking old code, would be to have _both_ {0} and {} zero
       | initialize the whole union.
       | 
       | I'm sure this change was discussed in depth on the mailing list,
       | but it's absolutely mind boggling to me
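A minimal sketch of the situation the release note describes, assuming a hypothetical union `sample` whose first member is smaller than the whole object. The helper names are made up for illustration. Clearing with memset does not depend on how a given compiler treats {0}, {} or -fzero-init-padding-bits=unions, so it is one way to stay out of the question entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical union: the first member is smaller than the whole object. */
union sample {
    unsigned char small;    /* the member that = {0} initializes */
    unsigned char big[16];  /* the other 15 bytes are what the change affects */
};

/* Zero every byte of the object, independent of initializer semantics. */
void sample_clear(union sample *s) {
    assert(s != NULL);
    memset(s, 0, sizeof *s);
}

/* Return 1 if every byte of the object representation is zero, else 0.
   Inspecting an object through unsigned char is always permitted. */
int sample_all_zero(const union sample *s) {
    const unsigned char *p = (const unsigned char *)s;
    for (size_t i = 0; i < sizeof *s; i++) {
        if (p[i] != 0) {
            return 0;
        }
    }
    return 1;
}
```

Calling sample_clear(&s) before first use, then checking sample_all_zero(&s) in a test, would catch code that silently relied on the old {0} behavior.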
        
         | VyseofArcadia wrote:
         | I feel like once a language is standardized (or reaches 1.0),
         | that's it. You're done. No more changes. You wanna make
         | improvements? Try out some new ideas? Fine, do that in a new
         | language.
         | 
         | I can deal with the footguns if they aren't cheekily mutating
         | over the years. I feel like in C++ especially we barely have
         | the time to come to terms with the unintended consequences of
         | the previous language revision before the next one drops a
         | whole new load of them on us.
        
           | ryao wrote:
           | I suspect this change was motivated by standards conformance.
        
             | fuhsnn wrote:
             | The wording of the GCC maintainer was "the standard doesn't
             | require it." when they informed the Linux kernel mailing list.
             | 
             | https://lore.kernel.org/linux-toolchains/Z0hRrrNU3Q+ro2T7@tu...
        
               | matheusmoreira wrote:
               | Reminds me of strict aliasing. Same attitude...
               | 
               | https://www.yodaiken.com/2018/06/07/torvalds-on-aliasing/
        
           | seritools wrote:
           | > If the size of the new type is larger than the size of the
           | last-written type, the contents of the excess bytes are
           | unspecified (and may be a trap representation). Before C99
           | TC3 (DR 283) this behavior was undefined, but commonly
           | implemented this way.
           | 
           | https://en.cppreference.com/w/c/language/union
           | 
           | > When initializing a union, the initializer list must have
           | only one member, which initializes the first member of the
           | union unless a designated initializer is used(since C99).
           | 
           | https://en.cppreference.com/w/c/language/struct_initializati...
           | 
           | So = {0} initializes the first union variant, and bytes
           | outside of that first variant are unspecified. Seems like GCC
           | 15.1 follows the 26 year old standard correctly. (not sure
           | how much has changed from C89 here)
        
           | hulitu wrote:
           | It's careless development. Why think something through in
           | advance when you can fix it later? It works so well for
           | Microsoft, Google and lately Apple. /s
           | 
           | The release cycle of a piece of software says a lot about its
           | quality. "Move fast, break things" has become the new
           | development process.
        
           | pjmlp wrote:
           | Programming languages are products; that is like saying you
           | want to keep using vi 1.0.
           | 
           | Maybe C should have stopped at K&R C from UNIX V6; at least
           | that would have spared the world from having it adopted
           | outside UNIX.
        
             | rgoulter wrote:
             | I liked the idea I heard: internet audiences demand
             | progress, but internet audiences hate change.
        
             | ryao wrote:
             | If C++ had never been invented, that might have been the
             | case.
        
               | pjmlp wrote:
               | C++ was invented exactly because Bjarne Stroustrup
               | vowed never again to repeat the downgrade of his
               | development experience from Simula to BCPL.
               | 
               | When faced with writing a distributed systems application
               | at Bell Labs, and having to deal with C, the very first
               | step was to create C with Classes.
               | 
               | Also, had C++ not been invented, or had C become a
               | footnote in history, so what? There would be other
               | programming languages to choose from.
               | 
               | Let's not put programming languages into some kind of
               | worshiping sanctuary.
        
           | _joel wrote:
           | Perl 6 and Python 3 joined the chat
        
           | Ragnarork wrote:
           | > I feel like once a language is standardized (or reaches
           | 1.0), that's it. You're done. No more changes. You wanna make
           | improvements? Try out some new ideas? Fine, do that in a new
           | language.
           | 
           | Thank goodness this is not how the software world works
           | overall. I'm not sure you understand the implications of what
           | you ask for.
           | 
           | > if they aren't cheekily mutating over the years
           | 
           | You're complaining about languages mutating, then mention C++
           | which has added stuff but maintained backwards compatibility
           | over the course of many standards (aside from a few hiccups
           | like auto_ptr, which was also short lived), with a high
           | aversion to modifying existing stuff.
        
         | ryao wrote:
         | > This is going to silently break so much existing code
         | 
         | How much code actually uses unions this way?
         | 
         | > especially union based type punning in C code
         | 
         | I have never done type punning via the GNU C compiler extension
         | in a way that would break because of this. I always assign a
         | value to it and then get out the value from a new type. Do you
         | know of any code that does things differently to be affected by
         | this?
        
           | Calavar wrote:
           | I would guess a lot. People aren't intimately familiar with
           | the standard, and people are lazy when it comes to writing
           | boilerplate like initialization code. And up until now, it
           | just worked, so even a good test suite wouldn't catch it.
           | 
           | EDIT: I initially mentioned type punning for arithmetic, but
           | this compiler change wouldn't affect that
        
             | ryao wrote:
             | How would that be broken by this? The union will be zero
             | initialized regardless because this change only affects
             | situations where the union members are of different
             | lengths, but for integer to float, the union members should
             | always be the same length or bad things will happen.
        
               | Calavar wrote:
               | I realized my mistake and I think I edited my comment a
               | split second before you replied, but you're right. That
               | particular type punning scenario wouldn't be affected by
               | this change because 1) the members are the same size, so
               | there's no padding bits 2) the specific union member is
               | going to be initialized to the input parameter, not with
               | the syntax sugar for aggregate zero initialization.
        
               | ryao wrote:
               | Well, under your original version, I could see someone
               | filling in bit fields in the float like the exponent and
               | sign while leaving the mantissa zeroed, but given that
               | the integer and float would be the same length, there is
               | no section that would be left uninitialized by this
               | change.
               | 
               | In order for this change to leave something
               | uninitialized, you would need to have a member of the
               | union after the first member that is longer than the
               | first member. Code that does that and relies on {0} to
               | zero the union seems incredibly rare to me.
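A rough sketch of the condition described above: the change only matters when the union has bytes that its first member does not cover. The union names and the BYTES_BEYOND macro are hypothetical, introduced here only for illustration, and the code assumes C11 or later plus a 32-bit float.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical unions, named here only for illustration. */
union pun32   { float f; uint32_t u; };  /* members of equal size */
union widened { float f; double d;   };  /* a later member is larger */

/* Assumption: float is 32 bits wide, as on mainstream platforms. */
static_assert(sizeof(float) == sizeof(uint32_t), "float assumed to be 32-bit");

/* Number of bytes in `type` that are not covered by its member `first`.
   sizeof does not evaluate its operand, so the null pointer is never read. */
#define BYTES_BEYOND(type, first) \
    (sizeof(type) - sizeof(((type *)0)->first))
```

With these definitions, BYTES_BEYOND(union pun32, f) is 0, so {0} covers the whole object, while BYTES_BEYOND(union widened, f) is nonzero wherever double is wider than float, which is the case the GCC 15 change touches.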
        
           | ndiddy wrote:
           | > How much code actually uses unions this way?
           | 
           | I see this change caused Mbed-TLS to start failing its test
           | suite when compiled with GCC 15:
           | https://github.com/Mbed-TLS/mbedtls/issues/9814 (kinda scary
           | since it's a security library). Hopefully other projects with
           | less rigorous test suites aren't using {0} in that way. The
           | GitHub issue mentions that Clang tried a similar optimization
           | a while ago and backed it out after user complaints, so maybe
           | the same thing will happen with GCC.
        
             | ryao wrote:
             | GCC's developers have a strong insistence on standards
             | conformance (minus situations where they explicitly choose
             | to deviate, like type punning in unions) over the status
             | quo. We already went through a much more severe shift with
             | strict aliasing enforcement by GCC and they never changed
             | course. I do not expect this to be any different.
        
         | ogoffart wrote:
         | > This is going to silently break so much existing code
         | 
         | The code was already broken. It was undefined behavior.
         | 
         | That's a problem with C and its undefined behavior minefields.
        
           | ryao wrote:
           | GCC has long been known to define undefined behavior in C
           | unions. In particular, type punning in unions is undefined
           | behavior under the C and C++ standards, but GCC (and Clang)
           | define it.
        
             | mtklein wrote:
             | I have always thought that punning through a union was
             | legal in C but UB in C++, and that punning through
             | incompatible pointer casting was UB in both.
             | 
             | I am basing this entirely on memory and the wikipedia
             | article on type punning. I welcome extremely pedantic
             | feedback.
        
               | ryao wrote:
               | There has been plenty of misinformation spread on that.
               | One of the GCC developers told me explicitly that type
               | punning through a union was UB in C, but defined by GCC
               | when I asked (after I had a bug report closed due to UB).
               | I could find the bug report if I look for it, but I would
               | rather not do the search.
        
               | trealira wrote:
               | From a draft of the C23 standard, this is what it has to
               | say about union type punning:
               | 
               | > If the member used to read the contents of a union
               | object is not the same as the member last used to store a
               | value in the object the appropriate part of the object
               | representation of the value is reinterpreted as an object
               | representation in the new type as described in 6.2.6 (a
               | process sometimes called type punning). This might be a
               | non-value representation.
               | 
               | In past standards, it said "trap representation" rather
               | than "non-value representation," but in none of them did
               | it say that union type punning was undefined behavior. If
               | you have a PDF of any standard or draft standard, just
               | doing a search for "type punning" should direct you to
               | this footnote quickly.
               | 
               | So I'm going to say that if the GCC developer explicitly
               | said that union type punning was undefined behavior in C,
               | then they were wrong, because that's not what the C
               | standard says.
        
               | amboar wrote:
               | Section J.1 _Unspecified_ behavior says
               | 
               | > (11) The values of bytes that correspond to union
               | members other than the one last stored into (6.2.6.1).
               | 
               | So it's a little more constrained in the ramifications,
               | but the outcomes may still be surprising. It's a bit
               | unfortunate that "UB" aliases to both "Undefined
               | behavior" and "Unspecified behavior" given they have
               | subtly different definitions.
               | 
               | From section 4 we have:
               | 
               | > A program that is correct in all other aspects,
               | operating on correct data, containing unspecified
               | behavior shall be a correct program and act in accordance
               | with 5.1.2.4.
        
               | ryao wrote:
               | Here is what was said:
               | 
               | > Type punning via unions is undefined behavior in both c
               | and c++.
               | 
               | https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118141#c13
               | 
               | Feel free to start a discussion on the GCC mailing list.
        
               | trealira wrote:
               | I actually might, although not now. Thanks for the link.
               | I'm surprised he directly contradicted the C standard,
               | rather than it just being a misunderstanding.
        
               | ryao wrote:
               | According to another comment, the C standard contradicts
               | the C standard on this:
               | 
               | https://news.ycombinator.com/item?id=43794268
               | 
               | Taking snippets of the C standard out of context of the
               | whole seems to result in misunderstandings on this.
        
               | trealira wrote:
               | It doesn't. That commenter is saying that in C99, it was
               | unspecified behavior. Since C11 onward, it's been removed
               | from the unspecified behavior annex and type punning is
               | allowed, though it may generate a trap/non-value
               | representation. It was never undefined behavior, which is
               | different.
               | 
               | Edit: no, it's still in the unspecified behavior annex,
               | that's my mistake. It's still not undefined, though.
        
               | ryao wrote:
               | Most of the C code I write is C99 code, so it is
               | undefined behavior either way for me (if I care about
               | compilers other than GCC and Clang).
               | 
               | That said, I am going to defer to the GCC developers on
               | this since I do not have time to make sense of all
               | versions of the C standard.
        
               | trealira wrote:
               | That's fair. In the end, what matters is how C is
               | implemented in practice on the platforms your code
               | targets, not what the C standard says.
        
               | jotux wrote:
               | https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#Typ...
        
               | ryao wrote:
               | What is your point? I already said that GCC defines it
               | even though the C standard does not. As per the GCC
               | developers:
               | 
               | > Type punning via unions is undefined behavior in both c
               | and c++.
               | 
               | https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118141#c13
        
               | jotux wrote:
               | > One of the GCC developers told me explicitly that type
               | punning through a union was UB in C, but defined by GCC
               | when I asked
               | 
               | I just was citing the source of this for reference.
        
               | ryao wrote:
               | I see. Carry on then. :)
        
               | uecker wrote:
               | Union type punning is allowed and supported by GCC:
               | https://godbolt.org/z/vd7h6vf5q
        
               | ryao wrote:
               | I said that GCC defines type punning via unions. It is an
               | extension to the C standard that GCC did.
               | 
               | That said, using "the code compiles in godbolt" as proof
               | that it is not relying on what the standard specifies to
               | be UB is fallacious.
        
               | uecker wrote:
               | I am a member of the standards committee and a GCC
               | maintainer. The C standard supports union punning. (You
               | are right though that relying on godbolt examples can be
               | misleading.)
        
               | jotux wrote:
               | Saw this recently and thought it was good:
               | https://www.youtube.com/watch?v=NRV_bgN92DI
        
               | jcranmer wrote:
               | > punning through a union was legal in C
               | 
               | In C89, it was implementation-defined. In C99, it was
               | made expressly legal, but it was erroneously included in
               | the list of undefined behavior annex. From C11 on, the
               | annex was fixed.
               | 
               | > but UB in C++
               | 
               | C++11 adopted "unrestricted unions", which added a
               | concept of active members: it is UB to access other
               | members unless you make them active. Except active
               | members rely on constructors and destructors, which
               | primitive types don't have, so the standard isn't
               | particularly clear on what happens here. The current
               | consensus is that it's UB.
               | 
               | C++20 added std::bit_cast which is a much safer interface
               | to type punning than unions.
               | 
               | > punning through incompatible pointer casting was UB in
               | both
               | 
               | There is a general rule that accessing an object through
               | an 'incompatible' lvalue is illegal in both languages. In
               | general, changing the const or volatile qualifier on the
               | object is legal, as is reading via a different signed or
               | unsigned variant, and char pointers can read anything.
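To make the two approaches discussed above concrete, here is a small sketch that reads the bits of a float once through a union and once through memcpy. The function names are hypothetical, and the code assumes C11 or later and a 32-bit float; the 0x3F800000 value mentioned afterwards additionally assumes IEEE 754 binary32.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: float is 32 bits wide (IEEE 754 binary32 on mainstream targets). */
static_assert(sizeof(float) == sizeof(uint32_t), "float assumed to be 32-bit");

/* Read the bits of a float through a union. Both members have the same
   size, so there are no bytes beyond the member that was stored. */
uint32_t bits_via_union(float x) {
    union { float f; uint32_t u; } pun;
    pun.f = x;      /* store through one member ... */
    return pun.u;   /* ... read through the other (type punning) */
}

/* Copy the object representation with memcpy. This form is accepted in
   both C and C++, and compilers typically optimize the call away. */
uint32_t bits_via_memcpy(float x) {
    uint32_t u;
    memcpy(&u, &x, sizeof u);
    return u;
}
```

On an IEEE 754 target both functions return 0x3F800000 for 1.0f. The memcpy form is the one to reach for when the same source has to compile as C++ before C++20's std::bit_cast is available.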
        
               | trealira wrote:
               | > In C99, it was made expressly legal, but it was
               | erroneously included in the list of undefined behavior
               | annex.
               | 
               | In C99, union type punning was put under Annex J.1, which
               | is unspecified behavior, not undefined behavior.
               | Unspecified behavior is basically implementation-defined
               | behavior, except that the implementor is not required to
               | document the behavior.
        
               | ryao wrote:
               | We can use UB to refer to both. :)
        
               | trealira wrote:
               | Maybe, but we were talking about "undefined behavior,"
               | not "UB," so the point is moot.
        
               | hermitdev wrote:
               | > We can use UB to refer to both. :)
               | 
               | You can, but in the context of the standard, you'd be
               | wrong to do so. Undefined behavior and unspecified
               | behavior have specific, different, meanings in context of
               | the C and C++ standards.
               | 
               | Conflate them at your own peril.
        
               | ryao wrote:
               | The GCC developers disagree as of last December:
               | 
               | > Type punning via unions is undefined behavior in both c
               | and c++.
               | 
               | https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118141#c13
        
             | mat_epice wrote:
             | EDIT: This comment is wrong, see fsmv's comment below.
             | Leaving for posterity because I'm no coward!
             | 
             | - - -
             | 
             | Undefined behavior only means that the spec leaves a
             | particular situation undefined and that the compiler
             | implementor can do whatever they want. Every compiler
             | defines undefined behavior, whether it's documented (or
             | easy to qualify, or deterministic) or not.
             | 
             | It is in poor taste that gcc has had widely used,
             | documented behaviors that are changing, especially in a
             | point release.
        
               | fsmv wrote:
               | I think you're confusing unspecified and undefined
               | behavior. UB could do something randomly different every
               | time and unspecified behavior must choose an option.
               | 
               | In a lot of cases in optimizing compilers they just
               | assume UB doesn't exist. Yes technically the compiler
               | does do something but there's still a big difference
               | between the two.
        
               | mat_epice wrote:
               | Thanks, you're right, I was mistaken.
        
             | flohofwoe wrote:
             | > type punning in unions is undefined behavior under the C
             | and C++ standards
             | 
             | Union type punning is entirely valid in C, but UB in C++
             | (one of the surprisingly many subtle but still fundamental
             | differences between C and C++). There's specifically a
             | (somewhat obscure) footnote about this in the C standard,
             | which has also been clarified further in one of the recent C
             | standards.
        
               | ryao wrote:
               | There is no footnote about it in the C standard. Someone
               | proposed adding one to standardize the behavior, but it
               | was never accepted. Ever since then, people keep quoting
               | it even though it is a rejected amendment.
        
               | jcranmer wrote:
               | Footnote 107 in C23, on page 75 in §6.5.2.3:
               | 
               | > If the member used to read the contents of a union
               | object is not the same as the member last used to store a
               | value in the object the appropriate part of the object
               | representation of the value is reinterpreted as an object
               | representation in the new type as described in 6.2.6 (a
               | process sometimes called type punning). This might be a
               | non-value representation.
               | 
               | (though this footnote has been present as far back as
               | C99, albeit with different numbers as the standard has
               | added more text in the intervening 24 years).
        
               | ryao wrote:
               | The GCC developers disagree with your interpretation:
               | 
               | > Type punning via unions is undefined behavior in both c
               | and c++.
               | 
               | https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118141#c13
        
               | flohofwoe wrote:
               | I'm not sure tbh what's there to 'interpret' or how a
               | compiler developer could misread that, the wording is
               | quite clear.
        
               | ryao wrote:
               | It is an excerpt being taken out of context. Of course it
               | is quite clear. Taking it out of context ignores
               | everything else that the standard says. That
               | interpretation is wrong as far as compiler authors are
               | concerned.
        
               | trealira wrote:
               | The context is that it's a footnote. The footnote is
               | referenced in this paragraph:
               | 
               |  _A postfix expression followed by the . operator and an
               | identifier designates a member of a structure or union
               | object. The value is that of the named member (106), and
               | is an lvalue if the first expression is an lvalue. If the
               | first expression has qualified type, the result has the
               | so-qualified version of the type of the designated
               | member._
               | 
               |  _106) If the member used to read the contents of a union
               | object is not the same as the member last used to store a
               | value in the object the appropriate part of the object
               | representation of the value is reinterpreted as an object
               | representation in the new type as described in 6.2.6 (a
               | process sometimes called type punning). This might be a
               | non-value representation._
               | 
               | In that same document, union type punning is explicitly
               | listed under Annex J.1, Unspecified Behavior:
               | 
               |  _(11) The values of bytes that correspond to union
               | members other than the one last stored into (6.2.6.1)._
               | 
               | The standard is extremely clear and explicit that it's
               | not undefined behavior.
        
               | ryao wrote:
               | This is not considering the document as a whole. I will
               | defer to the GCC developers on what the document means on
               | this.
        
               | trealira wrote:
               | I'm interested in hearing how considering the document as
               | a whole leads to a different conclusion.
        
               | jcranmer wrote:
               | I am a member of the C standards committee, and I'm
               | telling you you're wrong here. Martin Uecker is also a
               | member of the C standards committee, and has just
               | responded to that bug saying that the comment you linked
               | is wrong. I, and others here, have quoted literal
               | standards text to you explaining why type punning through
               | unions is well-defined behavior in C.
               | 
               | I don't know who Andrew Pinski is, but they're factually
               | incorrect regarding the legality of type punning via
               | unions in C.
        
               | uecker wrote:
               | Andrew is a GCC developer who is very competent (much
               | more than myself regarding GCC), but I think he was
               | mistakenly assuming the C++ rules apply to C here as
               | well.
        
           | grandempire wrote:
           | When you have a big system many people rely on you generally
           | try to look for ways to keep their code working - not look
           | for the changes you're contractually allowed to make.
           | 
           | GCC probably has a better justification than "we are allowed
           | to".
        
             | arp242 wrote:
             | > GCC probably has a better justification than "we are
             | allowed to".
             | 
             | Maybe, but I've seen GCC people justify such changes with
             | little more than "it's UB, we can change it, end of story",
             | so I wouldn't assume it.
        
           | mwkaufma wrote:
           | Undefined in the standard doesn't mean undefined in GCC.
           | Type-punning through unions has always been a special case
           | that GCC has taken care with beyond the standard.
        
         | mistrial9 wrote:
         | using UNION was always considered sketchy IMHO. This is trivia
         | for security exploiters?
        
           | grandempire wrote:
           | No. This is how sum types are implemented.
           | 
           | And from a runtime perspective it's going to be a struct with
           | perhaps more padding. You'll need more details about your
           | specific threat model to explain why that's bad.
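As an illustration of the sum-type point, here is a sketch of the usual tagged-union pattern in C: a struct holding a tag plus a union, where the tag records which member is live. The `shape` type and its helpers are hypothetical names invented for this example.

```c
#include <assert.h>

/* Tag that records which union member currently holds a value. */
enum shape_kind { SHAPE_CIRCLE, SHAPE_RECT };

/* Hypothetical sum type: a shape is either a circle or a rectangle. */
struct shape {
    enum shape_kind kind;
    union {
        struct { double radius; } circle;
        struct { double width, height; } rect;
    } as;
};

struct shape shape_circle(double radius) {
    struct shape s = { .kind = SHAPE_CIRCLE };
    s.as.circle.radius = radius;
    return s;
}

struct shape shape_rect(double width, double height) {
    struct shape s = { .kind = SHAPE_RECT };
    s.as.rect.width = width;
    s.as.rect.height = height;
    return s;
}

/* Only the member selected by the tag is ever read. */
double shape_area(const struct shape *s) {
    switch (s->kind) {
    case SHAPE_CIRCLE:
        return 3.14159265358979323846 * s->as.circle.radius * s->as.circle.radius;
    case SHAPE_RECT:
        return s->as.rect.width * s->as.rect.height;
    }
    assert(!"unknown shape kind");
    return 0.0;
}
```

Because every read goes through the member named by the tag, this pattern never depends on the bytes that a {0} initializer may or may not clear.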
        
             | mistrial9 wrote:
             | a quick search says that std::variant is the modern
             | replacement to implement your niche feature "sum types"
        
               | grandempire wrote:
               | That's for C++. And how is std::variant implemented?
        
               | LowLevelMahn wrote:
               | not using a union:
               | https://ojdip.net/2013/10/implementing-a-variant-type-in-cpp...
               | because the union can't be extended with variadic
               | template types
        
               | grandempire wrote:
               | So instead it has a buffer large enough to hold all the
               | types? That's what union does.
               | 
               | Still waiting to hear the security concerns.
        
               | LegionMammal978 wrote:
               | Actually, it does use a union, in both libstdc++ [0] and
               | libc++ [1]. (Underneath a lengthy stack of base classes,
               | since it wouldn't be C++ if it weren't painful to match
               | the specified semantics.)
               | 
               | [0] https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=libstdc%2B%2B-v3...
               | 
               | [1] https://github.com/llvm/llvm-project/blob/llvmorg-20.1.3/lib...
        
               | jlouis wrote:
               | Not a niche feature. Fundamental for any decent language
               | with a type system.
        
               | mistrial9 wrote:
               | ok, but C99 and C++11 and others, all have ways to
               | implement types. "Fundamental" as you say.. using UNION
               | in C++ is not a good choice to implement types.. in old
               | C99, you can use UNION that way but why? footguns all
               | around.
        
               | soraminazuki wrote:
               | Whoa, that's a core building block of programming and
               | computer science that you're dismissing as "niche"
               | without explanation.
        
               | mistrial9 wrote:
               | yes types are a core building block of programming and
               | computer science, but not using UNION ? this casual
               | dismissal of "criticisms of UNION" here seems superficial
               | and unwise to me.
        
         | mtklein wrote:
         | This was my instinct too, until I got this little tickle in the
         | back of my head that maybe I remembered that Clang was already
         | acting like this, so maybe it won't be so bad. Notice 32-bit
         | wzr vs 64-bit xzr:
         | 
         |     $ cat union.c && clang -O1 -c union.c -o union.o && objdump -d union.o
         |     union foo {
         |         float  f;
         |         double d;
         |     };
         | 
         |     void create_f(union foo *u) {
         |         *u = (union foo){0};
         |     }
         | 
         |     void create_d(union foo *u) {
         |         *u = (union foo){.d=0};
         |     }
         | 
         |     union.o: file format mach-o arm64
         | 
         |     Disassembly of section __TEXT,__text:
         | 
         |     0000000000000000 <ltmp0>:
         |            0: b900001f      str wzr, [x0]
         |            4: d65f03c0      ret
         | 
         |     0000000000000008 <_create_d>:
         |            8: f900001f      str xzr, [x0]
         |            c: d65f03c0      ret
        
           | mtklein wrote:
           | Ah, I can confirm what I see elsewhere in the thread, this is
           | no longer true in Clang. That first clang was Apple Clang 17
           | (who knows what version that actually is), and here is
           | Clang 20:
           | 
           |     $ /opt/homebrew/opt/llvm/bin/clang-20 -O1 -c union.c -o union.o && objdump -d union.o
           | 
           |     union.o: file format mach-o arm64
           | 
           |     Disassembly of section __TEXT,__text:
           | 
           |     0000000000000000 <ltmp0>:
           |            0: f900001f      str xzr, [x0]
           |            4: d65f03c0      ret
           | 
           |     0000000000000008 <_create_d>:
           |            8: f900001f      str xzr, [x0]
           |            c: d65f03c0      ret
        
             | dzaima wrote:
             | Looks like that change is clang <=19 to clang 20:
             | https://godbolt.org/z/7zrocxGaq
        
         | myrmidon wrote:
         | I honestly feel that "uninitialized by default" is strictly a
         | mistake, a relic from the days when C was basically cross-
         | platform assembly language.
         | 
         | Zero-initialized-by-default for everything would be an
         | extremely beneficial tradeoff IMO.
         | 
         | Maybe with a __noinit attribute or somesuch for the few cases
         | where you don't _need_ a variable to be initialized AND the
         | compiler is too stupid to optimize the zero-initialization away
         | on its own.
         | 
         | This would not even break existing code, just lead to a few
         | easily fixed performance regressions, but it would make it
         | significantly harder to introduce undefined and difficult to
         | spot behavior by accident (because very often code _assumes_
         | zero-initialization _and_ gets it purely by chance, and this is
         | also most likely to happen in the edge cases that might not be
         | covered by tests under memory sanitizer if you even have
         | those).
        
           | elromulous wrote:
           | Devil's advocate: this would be unacceptable for os kernels
           | and super performance critical code (e.g. hft).
        
             | sidkshatriya wrote:
             | Would you rather have a HFT trade go correctly and a few
             | nanoseconds slower or a few nanoseconds faster but with
             | some edge case bugs related to variable initialisation ?
             | 
             | You might claim that you can have both but bugs are
             | more inevitable in the uninitialised by default scenario. I
             | doubt that variable initialisation is the thing that would
             | slow down HFT. I would posit it is things like network
             | latency that would dominate.
        
               | hermitdev wrote:
               | > Would you rather have a HFT trade go correctly and a
               | few nanoseconds slower or a few nanoseconds faster but
               | with some edge case bugs related to variable
               | initialisation ?
               | 
               | As someone who works in the HFT space: it depends. How
               | frequently and how bad are the bad-trade cases? Some slop
               | happens. We make trade decisions with hardware _without
               | even seeing an entire packet coming in on the network_.
               | Mistakes/bad trades happen. Sometimes it results in
               | trades that don't go our way or missed opportunities.
               | 
               | Just as important as "_can_ we do better?" is
               | "_should_ we do better?". Queue priority at the exchange
               | matters. Shaving nanoseconds is how you get a competitive
               | edge.
               | 
               | > I would posit is it things like network latency that
               | would dominate.
               | 
               | Everything matters. Everything is measured.
               | 
               | edit to add: I'm not saying we write software that either
               | has or relies upon uninitialized values. I'm just saying in
               | such a hypothetical, it's not a cut-and-dried "do the right
               | thing (correct according to the language spec)" decision.
        
               | Imustaskforhelp wrote:
               | > We make trade decisions with hardware _without even
               | seeing an entire packet coming in on the network_
               | 
               | Wait what????
               | 
               | Can you please educate me on high frequency trading? I
               | don't understand what the point of it is. And let's say
               | one person has created an HFT bot: why is there a need for
               | other bots, other than different trading strategies? I
               | don't think these are profitable, and how do they compare
               | in the long run with the Boglehead strategy??
        
               | hermitdev wrote:
               | This is a vast, _vast_ over-simplification: The primary
               | "feature" of HFT is providing liquidity to market.
               | 
               | HFT firms are (almost) always willing to buy or sell at
               | or near the current market price. HFT firms basically
               | race each other for trade volume from "retail" traders
               | (and sometimes each other). HFTs make money off the
               | spread - the difference between the bid & offer -
               | typically only a cent. You don't make a lot of money on
               | any individual trade (and some trades are losers), but
               | you make money on doing a lot of volume. If done
               | properly, it doesn't matter which direction the market
               | moves for an HFT, they'll make money either way as long
               | as there's sufficient trading volume to be had.
               | 
               | But honestly, if you want to learn about HFT, best do
               | some actual research on it - I'm not a great source as
               | I'm just the guy that keeps the stuff up and running; I'm
               | not too involved in the business side of things. There's
               | a lot of negative press about HFTs, some positive.
        
             | myrmidon wrote:
             | No, just throw the __noinit attribute at every place where
             | it's needed.
             | 
             | You probably would not even need it in a lot of instances
             | because the compiler would elide lots of dead stores
             | (zeroing) even without hinting.
        
             | pjmlp wrote:
             | It is acceptable enough for Windows, Android and macOS,
             | which have been doing it for at least the last five years.
             | 
             | That is the usual fearmongering when security improvements
             | are done to C and C++.
        
             | TuxSH wrote:
             | > this would be unacceptable for os kernels
             | 
             | Depends on the boundary. I can give a non-Linux,
             | microkernel example (but that was/is shipped on dozens of
             | millions of devices):
             | 
             | - prior to 11.0, Nintendo 3DS kernel SVC (syscall)
             | implementations did not clear output parameters, leading to
             | extremely trivial leaks. Unprivileged processes could
             | retrieve kernel-mode stack addresses easily, making
             | exploit code much easier to write, example here:
             | https://github.com/TuxSH/universal-
             | otherapp/blob/master/sour...
             | 
             | - Nintendo started clearing all temporary registers on the
             | Switch kernel at some point (iirc x0-x7 and some more); on
             | the 3DS they never did that, and you can leak kernel object
             | addresses quite easily (iirc by reading r2). This made an
             | entire class of use-after-free and arbwrite bugs easier to
             | exploit (call SvcCreateSemaphore 3 times, get sema kernel
             | object address, use one of the now-patched exploits that can
             | cause a double-decref on the KSemaphore, call
             | SvcWaitSynchronization, profit)
             | 
             | more generally:
             | 
             | - uncleared padding in structures + copy to user =
             | infoleak
             | 
             | so one at least ought to be careful when crossing
             | privilege boundaries
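
The last point above (uncleared padding plus a copy to user space) can be sketched in C. This is a minimal illustration, not code from any of the kernels mentioned: `struct reply` and `fill_reply` are hypothetical names, and the `memcpy` into `out` merely stands in for a real copy across a privilege boundary. Strictly speaking, the C standard leaves padding bytes unspecified after a member store, so the `memset` is the common in-practice mitigation rather than a formal guarantee.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical reply structure: on typical ABIs there are padding
   bytes between `status` and `value`. */
struct reply {
    unsigned char status;
    unsigned int  value;
};

/* Hypothetical helper: builds a reply and copies its raw bytes to `out`,
   which stands in for a buffer on the other side of a privilege boundary.
   `out` must hold at least sizeof(struct reply) bytes. */
void fill_reply(unsigned char *out, unsigned char status, unsigned int value) {
    struct reply r;
    memset(&r, 0, sizeof r);   /* clear everything, padding included */
    r.status = status;
    r.value  = value;
    memcpy(out, &r, sizeof r); /* without the memset, stale stack bytes
                                  in the padding would be copied too */
}
```

Dropping the `memset` keeps the function "correct" from the caller's point of view, which is why this kind of leak tends to go unnoticed.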
        
           | bjourne wrote:
           | There are many low-level devices where initialization is very
           | expensive. It may mean that you need two passes through
           | memory instead of one, making whatever code you are running
           | twice as slow.
        
             | myrmidon wrote:
             | I would argue that these cases are pretty rare, and you
             | could always get nominal performance with the __noinit
             | hint, but I think this would seldom even be needed.
             | 
             | If you have instances of zero-initialized structs where you
             | set individual fields after the initialization, all modern
             | compilers will elide the dead stores in the typical
             | cases already anyway, and data of relevant size that is
             | supposed to stay uninitialized for long is rare and a bit
             | of an anti-pattern in my opinion anyway.
        
             | modeless wrote:
             | Ok, those developers can use a compiler flag. We need
             | defaults that work better for the vast majority.
        
               | bjourne wrote:
               | Then why are you using C? :P
        
               | 01HNNWZ0MV43FF wrote:
               | I'm not, looks like a bad language with worse
               | implementations
        
           | rwmj wrote:
           | GCC now supports -ftrivial-auto-var-
           | init=[zero|uninitialized|pattern] for stack variables
           | https://gcc.gnu.org/onlinedocs/gcc/Optimize-
           | Options.html#ind...
           | 
           | For malloc, you could use a custom allocator, or replace all
           | the calls with calloc.
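
A rough sketch of both suggestions, under stated assumptions: the code is meant to be built with `-ftrivial-auto-var-init=zero`, and it uses the `uninitialized` variable attribute that GCC and Clang document as the per-variable opt-out for that flag. `NOINIT`, `struct packet`, `packet_new` and `checksum` are hypothetical names made up for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Use the opt-out attribute only where the compiler knows it. */
#if defined(__has_attribute)
#  if __has_attribute(uninitialized)
#    define NOINIT __attribute__((uninitialized))
#  endif
#endif
#ifndef NOINIT
#  define NOINIT /* attribute unavailable: nothing to opt out of */
#endif

struct packet {
    unsigned char tag;
    unsigned int  length;
    unsigned char payload[32];
};

/* Heap side: calloc returns zero-filled storage, unlike malloc. */
struct packet *packet_new(void) {
    return calloc(1, sizeof(struct packet));
}

/* Stack side: `scratch` is exempted from automatic zeroing because
   every byte that is read has been written by memcpy first. */
unsigned checksum(const unsigned char *src, size_t n) {
    unsigned char scratch[256] NOINIT;
    if (n > sizeof scratch)
        n = sizeof scratch;
    memcpy(scratch, src, n);
    unsigned sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += scratch[i];
    return sum;
}
```

Without the flag the attribute has no effect, so the same source builds either way.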
        
             | myrmidon wrote:
             | Very nice, did not know about this!
             | 
             | The only problem with vendor extensions like this is that
             | you can't really rely on it, so you're still kinda forced
             | to keep all the (redundant) zero initialization; solving it
             | at the language level is much nicer. Maybe with C2030...
        
           | bluGill wrote:
           | C++26 has everything initialized by default. The value is not
           | specified though. Implementations are encouraged to use
           | something weird to detect use before explicit
           | initialization.
        
           | nullc wrote:
           | Zero initializing often hides real and serious bugs, however.
           | Say you have a function with an internal variable LEN that
           | ought to get set to some dynamic length that internal
           | operations will run over. Changes to the code introduce a
           | path which skips the setting of LEN. Current compilers will
           | (very likely) warn you about the potentially uninitialized
           | use, valgrind will warn you (assuming the case gets
           | triggered), and failing all that the program will potentially
           | crash when some large value ends up in LEN-- alerting you to
           | the issue.
           | 
           | Compare with default zero init: The compiler won't warn you,
           | valgrind won't warn you, and the program won't crash. It will
           | just be silently wrong in many cases (particularly for
           | length/count variables).
           | 
           | Generally the attention to exploit safety can sometimes push
           | us in directions that are bad for program correctness. There
           | are many places where exploit safety is important, but also
           | many cases where it's irrelevant. For security it's generally
           | 'safe' if a program erroneously shuts down or does less than
           | it should but that is far from true for software generally.
           | 
           | I prefer this behavior: Use of an uninitialized variable is
           | an error which the compiler will warn about, however, in code
           | where the compiler cannot prove that it is not used the
           | compiler's behavior is implementation defined and can include
           | trapping on use, initializing to zero, or initializing to ~0
           | (the complement of zero). The developer may annotate with
           | _noinit which makes any use UB and avoids the cost of
           | inserting a trap or ~0 initialization. ~0 init will usually
           | fail but seldom in a silent way, so hopefully at least any
           | user reports will be reproducible.
           | 
           | Similar to RESTRICT, _noinit is a potential footgun, but its
           | usage would presumably be quite rare and only in carefully
           | maintained performance critical code. Code using _noinit, like
           | RESTRICT, is at least still more maintainable than assembly.
           | 
           | This approach preserves the compiler's ability to detect
           | programmer error, and lets the implementation pick the
           | preferred way to handle the remaining error. In some contexts
           | it's preferable to trap cleanly or crash reliably (init to ~0
           | or explicit trap), in others it's better to be silently wrong
           | (init 0).
           | 
           | Since C99 lets you declare variables wherever, it is often
           | easy to just declare a variable where it is first set, and
           | that's probably best, of course... when you can.
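
The LEN scenario described above can be modelled without relying on undefined behavior. In this sketch, `sum_prefix` is a hypothetical function and its `init_value` parameter is an assumption of the example: it stands in for whatever default the implementation would pick for an uninitialized variable (0, or ~0 via `SIZE_MAX`). `mode == 2` plays the role of the newly introduced path that forgets to set `len`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: sums the first `len` entries of `data`.
   Returns -1 when `len` is impossible for a buffer of `cap` entries. */
long sum_prefix(const int *data, size_t cap, int mode, size_t init_value) {
    size_t len = init_value;   /* models the implementation's default */

    if (mode == 1)
        len = cap;             /* the original path sets len */
    /* mode == 2: the later-added path that skips setting len */

    if (len > cap)
        return -1;             /* an all-ones default is caught here */

    long sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += data[i];
    return sum;                /* a zero default lands here with sum == 0 */
}
```

With `init_value` of 0 the buggy path returns a plausible-looking 0, while `SIZE_MAX` trips the bounds check, which is the "fail loudly" property argued for in the comment.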
        
         | mastax wrote:
         | Do distros have tooling to deal with this type of change?
         | 
         | I imagine it would be very useful to be able to search through
         | all the C/C++ source files for all the packages in the distro
         | in a semantic manner, so that it understands typedefs and
         | preprocessor macros etc. The search query for this change would
         | be something like "find all union types whose first member is
         | not its largest member, then find all lines of code where that
         | type is initialized with `{0}`".
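
Short of distro-wide tooling, a single project can approximate the first half of that query at compile time. This is only a sketch: `FIRST_MEMBER_COVERS_UNION` is a hypothetical macro, the first member's name has to be spelled out by hand, and equal sizes say nothing about padding inside the first member itself.

```c
#include <stddef.h>

/* Hypothetical check: does the named (first) member span the whole union?
   sizeof does not evaluate its operand, so the null pointer is never
   dereferenced. */
#define FIRST_MEMBER_COVERS_UNION(type, first) \
    (sizeof(((type *)0)->first) == sizeof(type))

union foo { float f; double d; };   /* first member smaller than union */
union bar { double d; float f; };   /* first member spans the union */

/* Fails the build if someone reorders `bar` so that {0} no longer
   covers every byte under the new GCC 15 behavior. */
_Static_assert(FIRST_MEMBER_COVERS_UNION(union bar, d),
               "union bar: {0} would not cover the whole union");
```

It does nothing for the second half of the query (finding the `{0}` initializers), which still needs a real parser.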
        
           | ryao wrote:
           | As a retired Gentoo developer, I can say not really as far as
           | I know. There could be static analysis tools that can find
           | this, but I am not aware of anyone who runs them on the
           | entire distribution.
        
             | mastax wrote:
             | In theory it's just an extension of IDE tooling. A CLI with
             | a little query language wrapping libclang. In practice I'm
             | sure it's a nightmare just to get 20,000 packages' build
             | systems wrangled such that the right source files get
             | indexed by libclang, and all the endless plumbing for
             | downloading packages and reporting results, and on and on.
        
               | ryao wrote:
               | Distribution build systems typically operate outside of
               | an IDE. I suspect that it would be a nightmare to get
               | 20,000 packages to compile in an IDE.
               | 
               | It is possible in theory to write a compiler plugin to
               | generate an error when code that does this is found and
               | it would make it easy to find all of the instances in all
               | packages by building with `make -k`, provided that the
               | code is not hidden behind an unused package flag.
        
         | anon-3988 wrote:
         | lol this is exactly the kind of stuff I expect from C or C++
         | haha it's kinda insane people just decide to do this amidst all
         | the talk about correctness/safety.
        
         | nikic wrote:
         | Fun fact: GCC decided to adopt Clang's (old) behavior at the
         | same time Clang decided to adopt GCC's (old) behavior.
         | 
         | So now you have this matrix of behaviors:
         | 
         | * Old GCC: Initializes whole union.
         | * New GCC: Initializes first member only.
         | * Old Clang: Initializes first member only.
         | * New Clang: Initializes whole union.
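
Given that matrix, code that must behave the same on all four compilers can avoid depending on `{0}` at all. A minimal sketch, reusing the `union foo` from the example earlier in the thread; `foo_clear` is a hypothetical helper name.

```c
#include <string.h>

union foo {
    float  f;
    double d;
};

/* Zero every byte of the union explicitly, independent of how the
   compiler interprets a {0} initializer. */
void foo_clear(union foo *u) {
    memset(u, 0, sizeof *u);
}
```

The release notes quoted at the top of the thread name two other routes: `{}` (valid in C23 or C++) and the `-fzero-init-padding-bits=unions` option.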
        
         | Blikkentrekker wrote:
         | I have to say, I've read the discussion this generated and it's
         | a bit scary how no one seems to know whether type punning
         | through unions is undefined or not in C. Or rather, my
         | conclusion from reading it all is that many people are wrong
         | and that it is defined behavior, but some of the people who are
         | wrong about it are actual GCC compiler developers, so it can't
         | be too easy to be right.
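
For readers who want to sidestep the argument, here are the two idioms side by side. This is a sketch that assumes a 32-bit `float`; `bits_via_union` and `bits_via_memcpy` are hypothetical names. As far as I can tell, reading a union member other than the one last written reinterprets the stored bytes in C (C99 TC3 and later), while in C++ it is undefined; the `memcpy` form is accepted in both languages.

```c
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(float) == sizeof(uint32_t),
               "example assumes a 32-bit float");

/* Union punning: write one member, read the other. */
uint32_t bits_via_union(float f) {
    union { float f; uint32_t u; } pun = { .f = f };
    return pun.u;
}

/* memcpy punning: copy the object representation into an integer. */
uint32_t bits_via_memcpy(float f) {
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}
```

Optimizing compilers typically reduce both functions to the same single move instruction, so the `memcpy` version usually costs nothing extra.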
        
       | elvircrn wrote:
       | "C++ Modules have been greatly improved."
       | 
       | It would be nice to know what these great improvements actually
       | are.
        
         | artemonster wrote:
         | those were the greatest improvements of all time. all of them.
         | :D
        
         | canucker2016 wrote:
         | Later in the article, it mentions:
         | 
         |     Improved experimental support for C++23, including:
         | 
         |     std and std.compat modules (also supported for C++20).
         | 
         | From https://developers.redhat.com/articles/2025/04/24/new-c-
         | feat...:
         | 
         |     The next major version of the GNU Compiler Collection
         |     (GCC), 15.1, is expected to be released in April or May
         |     2025.
         | 
         |     GCC 15 greatly improved the modules code. For instance,
         |     module std is now supported (even in C++20 mode).
        
         | boris wrote:
         | In GCC 14, C++ modules were unusable (incomplete, full of bugs,
         | no std modules, etc). I haven't tried 15 yet but if that
         | changed, then it definitely qualifies for a "great
         | improvement".
        
           | bluGill wrote:
           | Still no std modules but otherwise likely usable. Modules
           | are ready for early adopters to use and start writing the
           | books on what you should do. (Not how to do it, those books
           | are mostly written though not in print. What you should do,
           | as in: was import std a good idea, or should containers and
           | algorithms have been split, or maybe something I haven't
           | thought of.)
        
       | omoikane wrote:
       | Really excited about #embed support:
       | 
       | > C: #embed preprocessing directive support.
       | 
       | > C++: P1967R14, #embed (PR119065)
       | 
       | See also:
       | 
       | https://news.ycombinator.com/item?id=32201951 - Embed is in C23
       | (2022-07-23)
        
         | NekkoDroid wrote:
         | I'd really wish for an `std::embed<...>` that would be a
         | consteval function (IIRC there is a proposal for this, but I
         | don't know its status). The less pre-processor stuff going on
         | the less there is to worry about, the syntax would end up much
         | cleaner and you can create your own wrapper functions.
        
       | codr7 wrote:
       | Finally, musttail, can't wait to try that out.
        
       | fithisux wrote:
       | Any hope for HaikuOS + WinLibs? GDC would be greatly appreciated.
        
       | pjmlp wrote:
       | Interesting to see some improvements being done to Modula-2
       | frontend as well.
        
       ___________________________________________________________________
       (page generated 2025-04-25 23:00 UTC)