[HN Gopher] GCC 15.1
___________________________________________________________________
GCC 15.1
Author : jrepinc
Score : 174 points
Date : 2025-04-25 10:53 UTC (12 hours ago)
(HTM) web link (gcc.gnu.org)
(TXT) w3m dump (gcc.gnu.org)
| Calavar wrote:
| > {0} initializer in C or C++ for unions no longer guarantees
| clearing of the whole union (except for static storage duration
| initialization), it just initializes the first union member to
| zero. If initialization of the whole union including padding bits
| is desirable, use {} (valid in C23 or C++) or use -fzero-init-
| padding-bits=unions option to restore old GCC behavior.
|
| This is going to silently break so much existing code, especially
| union based type punning in C code. {0} used to guarantee full
| zeroing and {} did not, and step by step we've flipped the
| situation to the reverse. The only sensible thing, in terms of
| not breaking old code, would be to have _both_ {0} and {} zero
| initialize the whole union.
|
| I'm sure this change was discussed in depth on the mailing list,
| but it's absolutely mind boggling to me
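|
| A minimal sketch of a compiler-independent way to clear a union,
| assuming a hypothetical union message whose first member is shorter
| than the whole object. The names message_clear and
| message_is_all_zero are illustrative only. With GCC 15, writing
| union message m = {0}; for an automatic variable would only be
| guaranteed to zero the first member; an explicit memset does not
| depend on that rule:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical union whose first member is shorter than the whole object. */
union message {
    unsigned char kind;       /* first member: 1 byte */
    unsigned char raw[16];    /* longer member */
};

/* Clears every byte of the object, regardless of initializer semantics. */
void message_clear(union message *m)
{
    assert(m != NULL);
    memset(m, 0, sizeof *m);
}

/* Returns 1 when every byte of the object representation is zero. */
int message_is_all_zero(const union message *m)
{
    const unsigned char *bytes = (const unsigned char *)m;
    for (size_t i = 0; i < sizeof *m; i++) {
        if (bytes[i] != 0)
            return 0;
    }
    return 1;
}
```

| Per the release note quoted above, {} (C23 or C++) and
| -fzero-init-padding-bits=unions are the other ways to get the whole
| object cleared.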
| VyseofArcadia wrote:
| I feel like once a language is standardized (or reaches 1.0),
| that's it. You're done. No more changes. You wanna make
| improvements? Try out some new ideas? Fine, do that in a new
| language.
|
| I can deal with the footguns if they aren't cheekily mutating
| over the years. I feel like in C++ especially we barely have
| the time to come to terms with the unintended consequences of
| the previous language revision before the next one drops a
| whole new load of them on us.
| ryao wrote:
| I suspect this change was motivated by standards conformance.
| fuhsnn wrote:
| The wording of the GCC maintainer was "the standard doesn't
| require it" when they informed the Linux kernel mailing list.
|
| https://lore.kernel.org/linux-
| toolchains/Z0hRrrNU3Q+ro2T7@tu...
| matheusmoreira wrote:
| Reminds me of strict aliasing. Same attitude...
|
| https://www.yodaiken.com/2018/06/07/torvalds-on-aliasing/
| seritools wrote:
| > If the size of the new type is larger than the size of the
| last-written type, the contents of the excess bytes are
| unspecified (and may be a trap representation). Before C99
| TC3 (DR 283) this behavior was undefined, but commonly
| implemented this way.
|
| https://en.cppreference.com/w/c/language/union
|
| > When initializing a union, the initializer list must have
| only one member, which initializes the first member of the
| union unless a designated initializer is used(since C99).
|
| https://en.cppreference.com/w/c/language/struct_initializati...
|
| - = {0} initializes the first union variant, and bytes
| outside of that first variant are unspecified. Seems like GCC
| 15.1 follows the 26 year old standard correctly. (not sure
| how much has changed from C89 here)
| hulitu wrote:
| It's careless development. Why think something in advance
| when you can fix it later. It works so well for Microsoft,
| Google and lately Apple. /s
|
| The release cycle of a software speaks a lot about its
| quality. Move fast, break things has become the new
| development process.
| pjmlp wrote:
| Programming languages are products, that is like saying you
| want to keep using vi 1.0.
|
| Maybe C should have stopped at K&R C from UNIX V6; at least that
| would have spared the world from having it adopted
| outside UNIX.
| rgoulter wrote:
| I liked the idea I heard: internet audiences demand
| progress, but internet audiences hate change.
| ryao wrote:
| If C++ had never been invented, that might have been the
| case.
| pjmlp wrote:
| C++ was invented exactly because Bjarne Stroustrup
| vowed never again to repeat the downgrade of his
| development experience from Simula to BCPL.
|
| When faced with writing a distributed systems application
| at Bell Labs, and having to deal with C, the very first
| step was to create C with Classes.
|
| Also had C++ not been invented, or C gone into a history
| footnote, so what, there would be other programming
| languages to choose from.
|
| Let's not put programming languages into some kind of
| worshiping sanctuary.
| _joel wrote:
| Perl 6 and Python 3 joined the chat
| Ragnarork wrote:
| > I feel like once a language is standardized (or reaches
| 1.0), that's it. You're done. No more changes. You wanna make
| improvements? Try out some new ideas? Fine, do that in a new
| language.
|
| Thank goodness this is not how the software world works
| overall. I'm not sure you understand the implications of what
| you ask for.
|
| > if they aren't cheekily mutating over the years
|
| You're complaining about languages mutating, then mention C++
| which has added stuff but maintained backwards compatibility
| over the course of many standards (aside from a few hiccups
| like auto_ptr, which was also short lived), with a high
| aversion to modifying existing stuff.
| ryao wrote:
| > This is going to silently break so much existing code
|
| How much code actually uses unions this way?
|
| > especially union based type punning in C code
|
| I have never done type punning via the GNU C compiler extension
| in a way that would break because of this. I always assign a
| value to it and then get out the value from a new type. Do you
| know of any code that does things differently to be affected by
| this?
| Calavar wrote:
| I would guess a lot. People aren't intimately familiar with
| the standard, and people are lazy when it comes to writing
| boilerplate like initialization code. And up until now, it
| just worked, so even a good test suite wouldn't catch it.
|
| EDIT: I initially mentioned type punning for arithmetic, but
| this compiler change wouldn't affect that
| ryao wrote:
| How would that be broken by this? The union will be zero
| initialized regardless because this change only affects
| situations where the union members are of different
| lengths, but for integer to float, the union members should
| always be the same length or bad things will happen.
| Calavar wrote:
| I realized my mistake and I think I edited my comment a
| split second before you replied, but you're right. That
| particular type punning scenario wouldn't be affected by
| this change because 1) the members are the same size, so
| there's no padding bits 2) the specific union member is
| going to be initialized to the input parameter, not with
| the syntax sugar for aggregate zero initialization.
| ryao wrote:
| Well, under your original version, I could see someone
| filling in bit fields in the float like the exponent and
| sign while leaving the mantissa zeroed, but given that
| the integer and float would be the same length, there is
| no section that would be left uninitialized by this
| change.
|
| In order for this change to leave something
| uninitialized, you would need to have a member of the
| union after the first member that is longer than the
| first member. Code that does that and relies on {0} to
| zero the union seems incredibly rare to me.
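|
| The condition described above can be checked with sizeof. A small
| sketch, assuming two hypothetical unions (union pun, union mixed)
| and a hypothetical TAIL_BYTES macro; it counts the bytes that lie
| beyond the first member, which are the bytes the GCC 15 change can
| leave unspecified after a {0} initializer:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same-size members: zeroing the first member covers every byte. */
union pun { uint32_t u; float f; };

/* Hypothetical layout where a later member is longer than the first. */
union mixed { uint8_t tag; uint64_t wide; };

/* Bytes of the union that lie beyond its first member (sizeof is unevaluated). */
#define TAIL_BYTES(union_type, first_member) \
    (sizeof(union_type) - sizeof(((union_type *)0)->first_member))

size_t tail_bytes_pun(void)
{
    /* This layout only has no tail when float and uint32_t share a size. */
    assert(sizeof(float) == sizeof(uint32_t));
    return TAIL_BYTES(union pun, u);
}

size_t tail_bytes_mixed(void)
{
    return TAIL_BYTES(union mixed, tag);
}
```

| On common platforms the first function returns 0 and the second a
| non-zero count, so only the second layout is exposed to the change.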
| ndiddy wrote:
| > How much code actually uses unions this way?
|
| I see this change caused Mbed-TLS to start failing its test
| suite when compiled with GCC 15: https://github.com/Mbed-
| TLS/mbedtls/issues/9814 (kinda scary since it's a security
| library). Hopefully other projects with less rigorous test
| suites aren't using {0} in that way. The Github issue
| mentions that Clang tried a similar optimization a while ago
| and backed it out after user complaints, so maybe the same
| thing will happen with GCC.
| ryao wrote:
| GCC's developers have a strong insistence on standards
| conformance (minus situations where they explicitly choose
| to deviate, like type punning in unions) over the status
| quo. We already went through a much more severe shift with
| strict aliasing enforcement by GCC and they never changed
| course. I do not expect this to be any different.
| ogoffart wrote:
| > This is going to silently break so much existing code
|
| The code was already broken. It was an undefined behavior.
|
| That's a problem with C and its undefined behavior minefields.
| ryao wrote:
| GCC has long been known to define undefined behavior in C
| unions. In particular, type punning in unions is undefined
| behavior under the C and C++ standards, but GCC (and Clang)
| define it.
| mtklein wrote:
| I have always thought that punning through a union was
| legal in C but UB in C++, and that punning through
| incompatible pointer casting was UB in both.
|
| I am basing this entirely on memory and the wikipedia
| article on type punning. I welcome extremely pedantic
| feedback.
| ryao wrote:
| There has been plenty of misinformation spread on that.
| One of the GCC developers told me explicitly that type
| punning through a union was UB in C, but defined by GCC
| when I asked (after I had a bug report closed due to UB).
| I could find the bug report if I look for it, but I would
| rather not do the search.
| trealira wrote:
| From a draft of the C23 standard, this is what it has to
| say about union type punning:
|
| > If the member used to read the contents of a union
| object is not the same as the member last used to store a
| value in the object the appropriate part of the object
| representation of the value is reinterpreted as an object
| representation in the new type as described in 6.2.6 (a
| process sometimes called type punning). This might be a
| non-value representation.
|
| In past standards, it said "trap representation" rather
| than "non-value representation," but in none of them did
| it say that union type punning was undefined behavior. If
| you have a PDF of any standard or draft standard, just
| doing a search for "type punning" should direct you to
| this footnote quickly.
|
| So I'm going to say that if the GCC developer explicitly
| said that union type punning was undefined behavior in C,
| then they were wrong, because that's not what the C
| standard says.
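|
| A short sketch of the store-one-member, read-another pattern that
| the quoted footnote describes. It assumes float is a 32-bit
| IEEE 754 type, which the C standard does not require; union
| float_bits and float_to_bits are hypothetical names:

```c
#include <assert.h>
#include <stdint.h>

union float_bits {
    float    f;
    uint32_t u;
};

/* Stores through one member and reads the same bytes through the other. */
uint32_t float_to_bits(float value)
{
    union float_bits pun;

    /* Assumption of this sketch: both members occupy the same bytes. */
    assert(sizeof(float) == sizeof(uint32_t));

    pun.f = value;
    return pun.u;   /* object representation reinterpreted as uint32_t */
}
```

| Under the IEEE 754 assumption, float_to_bits(1.0f) yields
| 0x3F800000. Because both members have the same size, no bytes are
| left over for the read to pick up.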
| amboar wrote:
| Section J.1 _Unspecified_ behavior says
|
| > (11) The values of bytes that correspond to union
| members other than the one last stored into (6.2.6.1).
|
| So it's a little more constrained in the ramifications,
| but the outcomes may still be surprising. It's a bit
| unfortunate that "UB" aliases to both "Undefined
| behavior" and "Unspecified behavior" given they have
| subtly different definitions.
|
| From section 4 we have:
|
| > A program that is correct in all other aspects,
| operating on correct data, containing unspecified
| behavior shall be a correct program and act in accordance
| with 5.1.2.4.
| ryao wrote:
| Here is what was said:
|
| > Type punning via unions is undefined behavior in both c
| and c++.
|
| https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118141#c13
|
| Feel free to start a discussion on the GCC mailing list.
| trealira wrote:
| I actually might, although not now. Thanks for the link.
| I'm surprised he directly contradicted the C standard,
| rather than it just being a misunderstanding.
| ryao wrote:
| According to another comment, the C standard contradicts
| the C standard on this:
|
| https://news.ycombinator.com/item?id=43794268
|
| Taking snippets of the C standard out of context of the
| whole seems to result in misunderstandings on this.
| trealira wrote:
| It doesn't. That commenter is saying that in C99, it was
| unspecified behavior. Since C11 onward, it's been removed
| from the unspecified behavior annex and type punning is
| allowed, though it may generate a trap/non-value
| representation. It was never undefined behavior, which is
| different.
|
| Edit: no, it's still in the unspecified behavior annex,
| that's my mistake. It's still not undefined, though.
| ryao wrote:
| Most of the C code I write is C99 code, so it is
| undefined behavior either way for me (if I care about
| compilers other than GCC and Clang).
|
| That said, I am going to defer to the GCC developers on
| this since I do not have time to make sense of all
| versions of the C standard.
| trealira wrote:
| That's fair. In the end, what matters is how C is
| implemented in practice on the platforms your code
| targets, not what the C standard says.
| jotux wrote:
| https://gcc.gnu.org/onlinedocs/gcc/Optimize-
| Options.html#Typ...
| ryao wrote:
| What is your point? I already said that GCC defines it
| even though the C standard does not. As per the GCC
| developers:
|
| > Type punning via unions is undefined behavior in both c
| and c++.
|
| https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118141#c13
| jotux wrote:
| > One of the GCC developers told me explicitly that type
| punning through a union was UB in C, but defined by GCC
| when I asked
|
| I just was citing the source of this for reference.
| ryao wrote:
| I see. Carry on then. :)
| uecker wrote:
| Union type punning is allowed and supported by GCC:
| https://godbolt.org/z/vd7h6vf5q
| ryao wrote:
| I said that GCC defines type punning via unions. It is an
| extension to the C standard that GCC did.
|
| That said, using "the code compiles in godbolt" as proof
| that it is not relying on what the standard specifies to
| be UB is fallacious.
| uecker wrote:
| I am a member of the standards committee and a GCC
| maintainer. The C standard supports union punning. (You
| are right though that relying on godbolt examples can be
| misleading.)
| jotux wrote:
| Saw this recently and thought it was good:
| https://www.youtube.com/watch?v=NRV_bgN92DI
| jcranmer wrote:
| > punning through a union was legal in C
|
| In C89, it was implementation-defined. In C99, it was
| made expressly legal, but it was erroneously included in
| the list of undefined behavior annex. From C11 on, the
| annex was fixed.
|
| > but UB in C++
|
| C++11 adopted "unrestricted unions", which added a
| concept of active members: it is UB to access other
| members unless you make them active. Except active
| members rely on constructors and destructors, which
| primitive types don't have, so the standard isn't
| particularly clear on what happens here. The current
| consensus is that it's UB.
|
| C++20 added std::bit_cast which is a much safer interface
| to type punning than unions.
|
| > punning through incompatible pointer casting was UB in
| both
|
| There is a general rule that accessing an object through
| an 'incompatible' lvalue is illegal in both languages. In
| general, changing the const or volatile qualifier on the
| object is legal, as is reading via a different signed or
| unsigned variant, and char pointers can read anything.
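|
| For code that must build as both C and C++, copying the bytes with
| memcpy is a commonly used alternative that relies on neither union
| punning nor pointer casts. A sketch, again assuming a 32-bit
| IEEE 754 float; bits_via_memcpy and float_via_memcpy are
| hypothetical names:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copies the object representation of a float into an integer. */
uint32_t bits_via_memcpy(float value)
{
    uint32_t bits;
    assert(sizeof bits == sizeof value);   /* assumption of this sketch */
    memcpy(&bits, &value, sizeof bits);
    return bits;
}

/* The reverse direction: build a float from a bit pattern. */
float float_via_memcpy(uint32_t bits)
{
    float value;
    assert(sizeof value == sizeof bits);
    memcpy(&value, &bits, sizeof value);
    return value;
}
```

| Optimizing compilers typically turn these fixed-size copies into a
| plain register move, so the call is usually free in practice.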
| trealira wrote:
| > In C99, it was made expressly legal, but it was
| erroneously included in the list of undefined behavior
| annex.
|
| In C99, union type punning was put under Annex J.1, which
| is unspecified behavior, not undefined behavior.
| Unspecified behavior is basically implementation-defined
| behavior, except that the implementor is not required to
| document the behavior.
| ryao wrote:
| We can use UB to refer to both. :)
| trealira wrote:
| Maybe, but we were talking about "undefined behavior,"
| not "UB," so the point is moot.
| hermitdev wrote:
| > We can use UB to refer to both. :)
|
| You can, but in the context of the standard, you'd be
| wrong to do so. Undefined behavior and unspecified
| behavior have specific, different, meanings in context of
| the C and C++ standards.
|
| Conflate them at your own peril.
| ryao wrote:
| The GCC developers disagree as of last December:
|
| > Type punning via unions is undefined behavior in both c
| and c++.
|
| https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118141#c13
| mat_epice wrote:
| EDIT: This comment is wrong, see fsmv's comment below.
| Leaving for posterity because I'm no coward!
|
| - - -
|
| Undefined behavior only means that the spec leaves a
| particular situation undefined and that the compiler
| implementor can do whatever they want. Every compiler
| defines undefined behavior, whether it's documented (or
| easy to qualify, or deterministic) or not.
|
| It is in poor taste that gcc has had widely used,
| documented behaviors that are changing, especially in a
| point release.
| fsmv wrote:
| I think you're confusing unspecified and undefined
| behavior. UB could do something randomly different every
| time and unspecified must choose an option.
|
| In a lot of cases in optimizing compilers they just
| assume UB doesn't exist. Yes technically the compiler
| does do something but there's still a big difference
| between the two.
| mat_epice wrote:
| Thanks, you're right, I was mistaken.
| flohofwoe wrote:
| > type punning in unions is undefined behavior under the C
| and C++ standards
|
| Union type punning is entirely valid in C, but UB in C++
| (one of the surprisingly many subtle but still fundamental
| differences between C and C++). There's specifically a
| (somewhat obscure) footnote about this in the C standard,
| which has also been further clarified in one of the recent C
| standards.
| ryao wrote:
| There is no footnote about it in the C standard. Someone
| proposed adding one to standardize the behavior, but it
| was never accepted. Ever since then, people keep quoting
| it even though it is a rejected amendment.
| jcranmer wrote:
| Footnote 107 in C23, on page 75 in SS6.5.2.3:
|
| > If the member used to read the contents of a union
| object is not the same as the member last used to store a
| value in the object the appropriate part of the object
| representation of the value is reinterpreted as an object
| representation in the new type as described in 6.2.6 (a
| process sometimes called type punning). This might be a
| non-value representation.
|
| (though this footnote has been present as far back as
| C99, albeit with different numbers as the standard has
| added more text in the intervening 24 years).
| ryao wrote:
| The GCC developers disagree with your interpretation:
|
| > Type punning via unions is undefined behavior in both c
| and c++.
|
| https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118141#c13
| flohofwoe wrote:
| I'm not sure tbh what's there to 'interpret' or how a
| compiler developer could misread that, the wording is
| quite clear.
| ryao wrote:
| It is an excerpt being taken out of context. Of course it
| is quite clear. Taking it out of context ignores
| everything else that the standard says. That
| interpretation is wrong as far as compiler authors are
| concerned.
| trealira wrote:
| The context is that it's a footnote. The footnote is
| referenced in this paragraph:
|
| _A postfix expression followed by the . operator and an
| identifier designates a member of a structure or union
| object. The value is that of the named member (106), and
| is an lvalue if the first expression is an lvalue. If the
| first expression has qualified type, the result has the
| so-qualified version of the type of the designated
| member._
|
| _106) If the member used to read the contents of a union
| object is not the same as the member last used to store a
| value in the object the appropriate part of the object
| representation of the value is reinterpreted as an object
| representation in the new type as described in 6.2.6 (a
| process sometimes called type punning). This might be a
| non-value representation._
|
| In that same document, union type punning is explicitly
| listed under Annex J.1, Unspecified Behavior:
|
| _(11) The values of bytes that correspond to union
| members other than the one last stored into (6.2.6.1)._
|
| The standard is extremely clear and explicit that it's
| not undefined behavior.
| ryao wrote:
| This is not considering the document as a whole. I will
| defer to the GCC developers on what the document means on
| this.
| trealira wrote:
| I'm interested in hearing how considering the document as
| a whole leads to a different conclusion.
| jcranmer wrote:
| I am a member of the C standards committee, and I'm
| telling you you're wrong here. Martin Uecker is also a
| member of the C standards committee, and has just
| responded to that bug saying that the comment you linked
| is wrong. I, and others here, have quoted literal
| standards text to you explaining why type punning through
| unions is well-defined behavior in C.
|
| I don't know who Andrew Pinski is, but they're factually
| incorrect regarding the legality of type punning via
| unions in C.
| uecker wrote:
| Andrew is a GCC developer who is very competent (much
| more than myself regarding GCC), but I think he was
| mistakenly assuming the C++ rules apply to C here as
| well.
| grandempire wrote:
| When you have a big system many people rely on you generally
| try to look for ways to keep their code working - not look
| for the changes you're contractually allowed to make.
|
| GCC probably has a better justification than "we are allowed
| to".
| arp242 wrote:
| > GCC probably has a better justification than "we are
| allowed to".
|
| Maybe, but I've seen GCC people justify such changes with
| little more than "it's UB, we can change it, end of story",
| so I wouldn't assume it.
| mwkaufma wrote:
| Undefined in the standard doesn't mean undefined in GCC.
| Type-punning through unions has always been a special case
| that GCC has taken care with beyond the standard.
| mistrial9 wrote:
| using UNION was always considered sketchy IMHO. This is trivia
| for security exploiters?
| grandempire wrote:
| No. This is how sum types are implemented.
|
| And from a runtime perspective it's going to be a struct with
| perhaps more padding. You'll need more details about your
| specific threat model to explain why that's bad.
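|
| A minimal sketch of a sum type built from a union plus a tag, as
| described above. The shape example and all names in it (struct
| shape, shape_circle, shape_rect, shape_area) are hypothetical:

```c
#include <assert.h>

enum shape_kind { SHAPE_CIRCLE, SHAPE_RECT };

struct shape {
    enum shape_kind kind;                 /* tag: which union member is live */
    union {
        struct { double radius; } circle;
        struct { double width, height; } rect;
    } as;
};

struct shape shape_circle(double radius)
{
    struct shape s = { .kind = SHAPE_CIRCLE, .as.circle = { radius } };
    return s;
}

struct shape shape_rect(double width, double height)
{
    struct shape s = { .kind = SHAPE_RECT, .as.rect = { width, height } };
    return s;
}

/* Only reads the member that the tag says was stored last. */
double shape_area(struct shape s)
{
    switch (s.kind) {
    case SHAPE_CIRCLE:
        return 3.14159265358979323846 * s.as.circle.radius * s.as.circle.radius;
    case SHAPE_RECT:
        return s.as.rect.width * s.as.rect.height;
    }
    assert(!"unknown shape kind");
    return 0.0;
}
```

| Because the tag is always checked before a member is read, this
| pattern never reads a member other than the one last stored, so
| neither type punning nor the {0} change comes into play.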
| mistrial9 wrote:
| a quick search says that std::variant is the modern
| replacement to implement your niche feature "sum types"
| grandempire wrote:
| That's for C++. And how is std::variant implemented?
| LowLevelMahn wrote:
| not using a union:
| https://ojdip.net/2013/10/implementing-a-variant-type-in-
| cpp... because the union can't be extended with variadic
| template types
| grandempire wrote:
| So instead it has a buffer large enough to hold all the
| types? That's what union does.
|
| Still waiting to hear the security concerns.
| LegionMammal978 wrote:
| Actually, it does use a union, in both libstdc++ [0] and
| libc++ [1]. (Underneath a lengthy stack of base classes,
| since it wouldn't be C++ if it weren't painful to match
| the specified semantics.)
|
| [0] https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=libstdc%2
| B%2B-v3...
|
| [1] https://github.com/llvm/llvm-
| project/blob/llvmorg-20.1.3/lib...
| jlouis wrote:
| Not a niche feature. Fundamental for any decent language
| with a type system.
| mistrial9 wrote:
| ok, but C99 and C++11 and others, all have ways to
| implement types. "Fundamental" as you say.. using UNION
| in C++ is not a good choice to implement types.. in old
| C99, you can use UNION that way but why? footguns all
| around.
| soraminazuki wrote:
| Whoa, that's a core building block of programming and
| computer science that you're dismissing as "niche"
| without explanation.
| mistrial9 wrote:
| yes types are a core building block of programming and
| computer science, but not using UNION ? this casual
| dismissal of "criticisms of UNION" here seems superficial
| and un-wise to me.
| mtklein wrote:
| This was my instinct too, until I got this little tickle in the
| back of my head that maybe I remembered that Clang was already
| acting like this, so maybe it won't be so bad. Notice 32-bit
| wzr vs 64-bit xzr:
|
|     $ cat union.c && clang -O1 -c union.c -o union.o && objdump -d union.o
|     union foo {
|         float f;
|         double d;
|     };
|
|     void create_f(union foo *u) {
|         *u = (union foo){0};
|     }
|
|     void create_d(union foo *u) {
|         *u = (union foo){.d=0};
|     }
|
|     union.o: file format mach-o arm64
|
|     Disassembly of section __TEXT,__text:
|
|     0000000000000000 <ltmp0>:
|            0: b900001f     str wzr, [x0]
|            4: d65f03c0     ret
|
|     0000000000000008 <_create_d>:
|            8: f900001f     str xzr, [x0]
|            c: d65f03c0     ret
| mtklein wrote:
| Ah, I can confirm what I see elsewhere in the thread, this is
| no longer true in Clang. That first clang was Apple Clang 17
| ---who knows what version that actually is---and here is
| Clang 20:
|
|     $ /opt/homebrew/opt/llvm/bin/clang-20 -O1 -c union.c -o union.o && objdump -d union.o
|
|     union.o: file format mach-o arm64
|
|     Disassembly of section __TEXT,__text:
|
|     0000000000000000 <ltmp0>:
|            0: f900001f     str xzr, [x0]
|            4: d65f03c0     ret
|
|     0000000000000008 <_create_d>:
|            8: f900001f     str xzr, [x0]
|            c: d65f03c0     ret
| dzaima wrote:
| Looks like that change is clang <=19 to clang 20:
| https://godbolt.org/z/7zrocxGaq
| myrmidon wrote:
| I honestly feel that "uninitialized by default" is strictly a
| mistake, a relic from the days when C was basically cross-
| platform assembly language.
|
| Zero-initialized-by-default for everything would be an
| extremely beneficial tradeoff IMO.
|
| Maybe with a __noinit attribute or somesuch for the few cases
| where you don't _need_ a variable to be initialized AND the
| compiler is too stupid to optimize the zero-initialization away
| on its own.
|
| This would not even break existing code, just lead to a few
| easily fixed performance regressions, but it would make it
| significantly harder to introduce undefined and difficult to
| spot behavior by accident (because very often code _assumes_
| zero-initialization _and_ gets it purely by chance, and this is
| also most likely to happen in the edge cases that might not be
| covered by tests under memory sanitizer if you even have
| those).
| elromulous wrote:
| Devil's advocate: this would be unacceptable for os kernels
| and super performance critical code (e.g. hft).
| sidkshatriya wrote:
| Would you rather have a HFT trade go correctly and a few
| nanoseconds slower or a few nanoseconds faster but with
| some edge case bugs related to variable initialisation ?
|
| You might claim that you can have both, but bugs are
| more inevitable in the uninitialised by default scenario. I
| doubt that variable initialisation is the thing that would
| slow down HFT. I would posit is it things like network
| latency that would dominate.
| hermitdev wrote:
| > Would you rather have a HFT trade go correctly and a
| few nanoseconds slower or a few nanoseconds faster but
| with some edge case bugs related to variable
| initialisation ?
|
| As someone who works in the HFT space: it depends. How
| frequently and how bad are the bad-trade cases? Some slop
| happens. We make trade decisions with hardware _without
| even seeing an entire packet coming in on the network_.
| Mistakes/bad trades happen. Sometimes it results in
| trades that don't go our way or missed opportunities.
|
| Just as important as " _can_ we do better? " is "
| _should_ we do better? ". Queue priority at the exchange
| matters. Shaving nanoseconds is how you get a competitive
| edge.
|
| > I would posit is it things like network latency that
| would dominate.
|
| Everything matters. Everything is measured.
|
| edit to add: I'm not saying we write software that either
| has or relies upon uninitialized values. I'm just saying in
| such a hypothetical, it's not a cut and dry "do the right
| thing (correct according to the language spec)" decision.
| Imustaskforhelp wrote:
| We make trade decisions with hardware _without even
| seeing an entire packet coming in on the network_
|
| Wait what????
|
| Can you please educate me on high frequency trading... ,
| like I don't understand what's the point of it & lets say
| one person has created a hft bot then why the need of
| other bot other than the fact of different trading strats
| and I don't think these are profitable / how they compare
| in the long run with the boglehead strategy??
| hermitdev wrote:
| This is a vast, _vast_ over-simplification: The primary
| "feature" of HFT is providing liquidity to market.
|
| HFT firms are (almost) always willing to buy or sell at
| or near the current market price. HFT firms basically
| race each other for trade volume from "retail" traders
| (and sometimes each other). HFTs make money off the
| spread - the difference between the bid & offer -
| typically only a cent. You don't make a lot of money on
| any individual trade (and some trades are losers), but
| you make money on doing a lot of volume. If done
| properly, it doesn't matter which direction the market
| moves for an HFT, they'll make money either way as long
| as there's sufficient trading volume to be had.
|
| But honestly, if you want to learn about HFT, best do
| some actual research on it - I'm not a great source as
| I'm just the guy that keeps the stuff up and running; I'm
| not too involved in the business side of things. There's
| a lot of negative press about HFTs, some positive.
| myrmidon wrote:
| No, just throw the __noinit attribute at every place where
| it's needed.
|
| You probably would not even need it in a lot of instances
| because the compiler would elide lots of dead stores
| (zeroing) even without hinting.
| pjmlp wrote:
| It is acceptable enough for Windows, Android and macOS,
| which have been doing it for at least the last five years.
|
| That is the usual fearmongering when security improvements
| are done to C and C++.
| TuxSH wrote:
| > this would be unacceptable for os kernels
|
| Depends on the boundary. I can give a non-Linux,
| microkernel example (but one that was/is shipped on tens of
| millions of devices):
|
| - prior to 11.0, Nintendo 3DS kernel SVC (syscall)
| implementations did not clear output parameters, leading to
| extremely trivial leaks. Unprivileged processes could
| retrieve kernel-mode stack addresses easily, making
| exploit code much easier to write, example here:
| https://github.com/TuxSH/universal-
| otherapp/blob/master/sour...
|
| - Nintendo started clearing all temporary registers on the
| Switch kernel at some point (iirc x0-x7 and some more); on
| the 3DS they never did that, and you can leak kernel object
| addresses quite easily (iirc by reading r2), this made an
| entire class of use-after-free and arbwrite bugs easier to
| exploit (call SvcCreateSemaphore 3 times, get sema kernel
| object address, use one of the now-patched exploits that can
| cause a double-decref on the KSemaphore, call
| SvcWaitSynchronization, profit)
|
| more generally:
|
| - uncleared padding in structures + copy to user =
| infoleak
|
| so one at least ought to be careful when crossing
| privilege boundaries
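|
| A sketch of the usual defence against the padding leak described
| above: clear the whole structure before filling in its fields.
| struct reply, reply_fill and reply_dirty_padding are hypothetical
| names, and the layout comment assumes a typical ABI where uint32_t
| is 4-byte aligned. Strictly speaking the C standard lets padding
| bytes take unspecified values when a member is stored, so this
| reflects what common compilers do in practice rather than a
| language guarantee:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical reply that gets copied across a privilege boundary. */
struct reply {
    uint8_t  status;   /* typically followed by 3 padding bytes */
    uint32_t value;
};

/* Clears the whole object first so padding never carries stale data. */
void reply_fill(struct reply *out, uint8_t status, uint32_t value)
{
    assert(out != NULL);
    memset(out, 0, sizeof *out);
    out->status = status;
    out->value  = value;
}

/* Counts non-zero bytes in the gap between `status` and `value`. */
size_t reply_dirty_padding(const struct reply *r)
{
    const unsigned char *bytes = (const unsigned char *)r;
    size_t dirty = 0;

    for (size_t i = offsetof(struct reply, status) + sizeof r->status;
         i < offsetof(struct reply, value); i++) {
        if (bytes[i] != 0)
            dirty++;
    }
    return dirty;
}
```

| Without the memset, whatever was previously on the stack in those
| padding bytes would be copied out along with the fields.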
| bjourne wrote:
| There are many low-level devices where initialization is very
| expensive. It may mean that you need two passes through
| memory instead of one, making whatever code you are running
| twice as slow.
| myrmidon wrote:
| I would argue that these cases are pretty rare, and you
| could always get nominal performance with the __noinit
| hint, but I think this would seldom even be needed.
|
| If you have instances of zero-initialized structs where you
| set individual fields after the initialization, all modern
| compilers will elide the dead stores in the typical
| cases already anyway, and data of relevant size that is
| supposed to stay uninitialized for long is rare and a bit
| of an anti-pattern in my opinion anyway.
| modeless wrote:
| Ok, those developers can use a compiler flag. We need
| defaults that work better for the vast majority.
| bjourne wrote:
| Then why are you using C? :P
| 01HNNWZ0MV43FF wrote:
| I'm not, looks like a bad language with worse
| implementations
| rwmj wrote:
| GCC now supports -ftrivial-auto-var-
| init=[zero|uninitialized|pattern] for stack variables
| https://gcc.gnu.org/onlinedocs/gcc/Optimize-
| Options.html#ind...
|
| For malloc, you could use a custom allocator, or replace all
| the calls with calloc.
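|
| A minimal sketch of the calloc suggestion, assuming a hypothetical
| wrapper named zalloc_array that call sites would use in place of
| malloc(count * size):

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical replacement for malloc(count * size) that returns zeroed memory. */
void *zalloc_array(size_t count, size_t size)
{
    assert(size != 0);            /* zero-sized elements are treated as a caller bug */
    return calloc(count, size);   /* all bytes of the allocation start out as zero */
}
```

| A call such as int *v = zalloc_array(8, sizeof *v); then yields
| eight zero-valued ints, released with free as usual. Most calloc
| implementations also fail cleanly when count * size would
| overflow, which a hand-written multiplication does not.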
| myrmidon wrote:
| Very nice, did not know about this!
|
| The only problem with vendor extensions like this is that
| you can't really rely on it, so you're still kinda forced
| to keep all the (redundant) zero initialization; solving it
| at the language level is much nicer. Maybe with C2030...
| bluGill wrote:
| C++26 has everything initialized by default. The value is not
| specified though. Implementations are encouraged to use
| something weird to detect use before explicit
| initialization.
| nullc wrote:
| Zero initializing often hides real and serious bugs, however.
| Say you have a function with an internal variable LEN that
| ought to get set to some dynamic length that internal
| operations will run over. Changes to the code introduce a
| path which skips the setting of LEN. Current compilers will
| (very likely) warn you about the potentially uninitialized
| use, valgrind will warn you (assuming the case gets
| triggered), and failing all that the program will potentially
| crash when some large value ends up in LEN-- alerting you to
| the issue.
|
| Compare with default zero init: The compiler won't warn you,
| valgrind won't warn you, and the program won't crash. It will
| just be silently wrong in many cases (particularly for
| length/count variables).
|
| Generally the attention to exploit safety can sometimes push
| us in directions that are bad for program correctness. There
| are many places where exploit safety is important, but also
| many cases where it's irrelevant. For security it's generally
| 'safe' if a program erroneously shuts down or does less than
| it should, but that is far from true for software generally.
|
| I prefer this behavior: Use of an uninitialized variable is
| an error which the compiler will warn about, however, in code
| where the compiler cannot prove that it is not used the
| compiler's behavior is implementation defined and can include
| trapping on use, initializing to zero, or initializing to ~0
| (the complement of zero). The developer may annotate with
| _noinit which makes any use UB and avoids the cost of
| inserting a trap or ~0 initialization. ~0 init will usually
| fail but seldom in a silent way, so hopefully at least any
| user reports will be reproducible.
|
| Similar to RESTRICT _noinit is a potential footgun, but its
| usage would presumably be quite rare and only in carefully
| maintained performance critical code. Code using _noinit like
| RESTRICT is at least still more maintainable than assembly.
|
| This approach preserves the compiler's ability to detect
| programmer error, and lets the implementation pick the
| preferred way to handle the remaining error. In some contexts
| it's preferable to trap cleanly or crash reliably (init to ~0
| or explicit trap), in others it's better to be silently wrong
| (init 0).
|
| C99 lets you declare variables wherever, so it is often
| easy to just declare a variable where it is first set, and
| that's probably best, of course... when you can.
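
The LEN scenario above can be sketched in C without actually reading an uninitialized variable. Everything here is hypothetical: the `fill` parameter stands in for whatever value the implementation would have placed in the variable (0 or ~0), and `have_len` models the code path that forgets the assignment.

```c
#include <assert.h>
#include <stddef.h>

#define ERR_BAD_LEN (-1L)

/* Sum the first `len` elements of `data`, where `len` is only
   assigned on the non-buggy path. `fill` simulates the default
   initialization chosen by the implementation. */
long sum_prefix(const int *data, size_t cap, int have_len,
                size_t real_len, size_t fill)
{
    assert(data != NULL);
    size_t len = fill;          /* simulated default initialization */
    if (have_len)
        len = real_len;         /* the assignment the buggy path skips */

    if (len > cap)
        return ERR_BAD_LEN;     /* a ~0 fill lands here: loud failure */

    long sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += data[i];
    return sum;                 /* a 0 fill returns 0: quietly wrong */
}
```

With a zero fill the buggy path returns a plausible-looking 0, while a ~0 fill trips the bounds check, which is the contrast the comment describes.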
| mastax wrote:
| Do distros have tooling to deal with this type of change?
|
| I imagine it would be very useful to be able to search through
| all the C/C++ source files for all the packages in the distro
| in a semantic manner, so that it understands typedefs and
| preprocessor macros etc. The search query for this change would
| be something like "find all union types whose first member is
| not its largest member, then find all lines of code where that
| type is initialized with `{0}`".
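
Short of distro-wide tooling, a single project could flag the risky unions itself. The following C sketch uses a hypothetical macro, FIRST_MEMBER_COVERS, that compares the size of a named member with the size of the whole union at compile time. It is deliberately conservative: a union with trailing padding is flagged even when its first member is the largest one.

```c
#include <assert.h>   /* static_assert (C11 and later) */

/* Hypothetical check: true only if `member` spans the whole union,
   so that initializing it alone covers every byte. */
#define FIRST_MEMBER_COVERS(type, member) \
    (sizeof(((type *)0)->member) == sizeof(type))

/* First member is a single byte: `= {0}` may leave the rest alone. */
union small_first {
    char          tag;
    unsigned char raw[16];
};

/* First member is the largest: `= {0}` covers all bytes. */
union big_first {
    unsigned char raw[16];
    char          tag;
};

/* Compile-time guard for a union that is expected to be safe. */
static_assert(FIRST_MEMBER_COVERS(union big_first, raw),
              "first member must span the whole union");
```

This only covers unions one already knows about; it does not find the `{0}` initializers themselves.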
| ryao wrote:
| As a retired Gentoo developer, I can say not really as far as
| I know. There could be static analysis tools that can find
| this, but I am not aware of anyone who runs them on the
| entire distribution.
| mastax wrote:
| In theory it's just an extension of IDE tooling. A CLI with
| a little query language wrapping libclang. In practice I'm
| sure it's a nightmare just to get 20,000 packages' build
| systems wrangled such that the right source files get
| indexed by libclang, and all the endless plumbing for
| downloading packages and reporting results, and on and on.
| ryao wrote:
| Distribution build systems typically operate outside of
| an IDE. I suspect that it would be a nightmare to get
| 20,000 packages to compile in an IDE.
|
| It is possible in theory to write a compiler plugin to
| generate an error when code that does this is found and
| it would make it easy to find all of the instances in all
| packages by building with `make -k`, provided that the
| code is not hidden behind an unused package flag.
| anon-3988 wrote:
| lol this is exactly the kind of stuff I expect from C or C++
| haha it's kinda insane people just decide to do this amidst all
| the talk about correctness/safety.
| nikic wrote:
| Fun fact: GCC decided to adopt Clang's (old) behavior at the
| same time Clang decided to adopt GCC's (old) behavior.
|
| So now you have this matrix of behaviors:
|
| * Old GCC: Initializes whole union.
| * New GCC: Initializes first member only.
| * Old Clang: Initializes first member only.
| * New Clang: Initializes whole union.
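
Given that matrix, code that needs every byte cleared on all four compilers can avoid relying on `{0}` at all. A minimal C sketch, with a hypothetical union and helper name:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical union whose first member is smaller than the whole. */
union value {
    char          tag;
    unsigned char raw[16];
};

/* Zero every byte of the union explicitly, independent of how a
   given compiler version treats `= {0}`. */
void value_clear(union value *v)
{
    assert(v != NULL);
    memset(v, 0, sizeof *v);
}
```

The release note quoted at the top of the thread also lists `{}` (C23 or C++) and -fzero-init-padding-bits=unions as alternatives on the GCC side.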
| Blikkentrekker wrote:
| I have to say, I've read the discussion this generated and it's
| a bit scary how no one seems to know whether type punning
| through unions is undefined or not in C. Or rather, my
| conclusion from reading it all is that many people are wrong
| and that it is defined behavior, but some of the people who are
| wrong about it are actual GCC compiler developers, so it can't
| be too easy to be right.
| elvircrn wrote:
| "C++ Modules have been greatly improved."
|
| It would be nice to know what these great improvements actually
| are.
| artemonster wrote:
| those were the greatest improvements of all time. all of them.
| :D
| canucker2016 wrote:
| Later in the article, it mentions:
|
| > Improved experimental support for C++23, including: std and
| > std.compat modules (also supported for C++20).
|
| From https://developers.redhat.com/articles/2025/04/24/new-c-
| feat...:
|
| > The next major version of the GNU Compiler Collection (GCC),
| > 15.1, is expected to be released in April or May 2025.
|
| > GCC 15 greatly improved the modules code. For instance, module
| > std is now supported (even in C++20 mode).
| boris wrote:
| In GCC 14, C++ modules were unusable (incomplete, full of bugs,
| no std modules, etc). I haven't tried 15 yet but if that
| changed, then it definitely qualifies for a "great
| improvement".
| bluGill wrote:
| Still no std modules, but otherwise likely usable. Modules
| are ready for early adopters to use and start writing the
| books on what you should do. (Not how to do it; those books
| are mostly written, though not in print. What you should do,
| as in: was import std a good idea, or should containers and
| algorithms have been split, or maybe something I haven't
| thought of.)
| omoikane wrote:
| Really excited about #embed support:
|
| > C: #embed preprocessing directive support.
|
| > C++: P1967R14, #embed (PR119065)
|
| See also:
|
| https://news.ycombinator.com/item?id=32201951 - Embed is in C23
| (2022-07-23)
| NekkoDroid wrote:
| I'd really wish for an `std::embed<...>` that would be a
| consteval function (IIRC there is a proposal for this, but I
| don't know its status). The less pre-processor stuff going on
| the less there is to worry about, the syntax would end up much
| cleaner and you can create your own wrapper functions.
| codr7 wrote:
| Finally, musttail, can't wait to try that out.
| fithisux wrote:
| Any hope for HaikuOS + Winlibs? GDC would be greatly appreciated.
| pjmlp wrote:
| Interesting to see some improvements being made to the Modula-2
| frontend as well.
___________________________________________________________________
(page generated 2025-04-25 23:00 UTC)