Some __nonstring__ turbulence

By Jonathan Corbet
April 24, 2025

New compiler releases often bring with them new warnings; those warnings are usually welcome, since they help developers find problems before they turn into nasty bugs. Adapting to new warnings can also create disruption in the development process, though, especially when an important developer upgrades to a new compiler at an unfortunate time. This is just the scenario that played out with the 6.15-rc3 kernel release and the implementation of -Wunterminated-string-initialization in GCC 15.

Consider a C declaration like:

    char foo[8] = "bar";

The array will be initialized with the given string, including the normal trailing NUL byte indicating the end of the string. Now consider this variant:

    char foo[8] = "NUL-free";

This is a legal declaration, even though the declared array now lacks the room for the NUL byte. That byte will simply be omitted, creating an unterminated string. That is often not what the developer who wrote that code wants, and it can lead to unpleasant bugs that are not discovered until some later time. The -Wunterminated-string-initialization option emits a warning for this kind of initialization, with the result that, hopefully, the problem -- if there is a problem -- is fixed quickly. The kernel community has worked to make use of this warning and, hopefully, eliminate a source of bugs.

There is only one little problem with the new warning, though: sometimes the no-NUL initialization is exactly what is wanted and intended. See, for example, this declaration from fs/cachefiles/key.c:

    static const char cachefiles_charmap[64] =
        "0123456789"                    /* 0 - 9 */
        "abcdefghijklmnopqrstuvwxyz"    /* 10 - 35 */
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ"    /* 36 - 61 */
        "_-"                            /* 62 - 63 */
        ;

This char array is used as a lookup table, not as a string, so there is no need for a trailing NUL byte. GCC 15, being unaware of that usage, will emit a false-positive warning for this declaration. There are many places in the kernel with declarations like this; the ACPI code, for example, uses a lot of four-byte string arrays to handle the equally large set of four-letter ACPI acronyms.

Naturally, there is a way to suppress the warning when it does not apply by adding an attribute to the declaration indicating that the char array is not actually holding a string:

    __attribute__((__nonstring__))

Within the kernel, the macro __nonstring is used to shorten that attribute syntax. Work has been ongoing, primarily by Kees Cook, to fix all of the warnings added by GCC 15. Many patches have been circulated; quite a few of them are in linux-next. Cook has also been working with the GCC developers to improve how this annotation works and to fix a problem that the kernel project ran into.
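As a sketch of how this looks in practice (the declarations below are illustrative, not taken from the kernel), GCC 15 warns about the first form, while the kernel's __nonstring macro quiets the warning on the second:

    /* "ACPI" exactly fills the four bytes, so the terminating NUL is
       dropped; GCC 15's -Wunterminated-string-initialization warns here. */
    static const char tag[4] = "ACPI";

    /* Marking the array as holding bytes rather than a string suppresses
       the warning; in the kernel, __nonstring expands to
       __attribute__((__nonstring__)). */
    static const char tag_annotated[4] __nonstring = "ACPI";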
There was some time left to get this job done, though, since GCC 15 had not actually been released -- or so Cook thought. Fedora 42 has been released, though, and the Fedora developers, for better or worse, decided to include a pre-release version of GCC 15 with it as the default compiler. The Fedora project, it seems, has decided to follow a venerable Red Hat tradition with this release.

Linus Torvalds, for better or worse, decided to update his development systems to Fedora 42 the day before tagging and releasing 6.15-rc3. Once he tried building the kernel with the new compiler, though, things started to go wrong, since the relevant patches were not yet in his repository. Torvalds responded with a series of changes of his own, applied directly to the mainline about two hours before the release, to fix the problems that he had encountered. They included this patch fixing warnings in the ACPI subsystem, and this one fixing several others, including the example shown above. He then tagged and pushed out 6.15-rc3 with those changes.

Unfortunately, his last-minute changes broke the build on any version of GCC prior to the GCC 15 pre-release -- a problem that was likely to create a certain amount of inconvenience for any developers who were not running Fedora 42. So, shortly after the 6.15-rc3 release, Torvalds tacked on one more patch backing out the breaking change and disabling the new warning altogether.

This drew a somewhat grumpy note from Cook, who said that he had already sent patches fixing all of the problems, including the build-breaking one that Torvalds ran into. He asked Torvalds to revert the changes and use the planned fixes, adding: "It is, once again, really frustrating when you update to unreleased compiler versions". Torvalds disagreed, saying that he needed to make the changes because the kernel failed to build otherwise. He also asserted that GCC 15 was released by virtue of its presence in Fedora 42. Cook was unimpressed:

    Yes, I understand that, but you didn't coordinate with anyone. You didn't search lore for the warning strings, you didn't even check -next where you've now created merge conflicts. You put insufficiently tested patches into the tree at the last minute and cut an rc release that broke for everyone using GCC <15. You mercilessly flame maintainers for much much less.

Torvalds stood his ground, though, blaming Cook for not having gotten the fixes into the mainline quickly enough.

That is where the situation stands, as of this writing. Others will undoubtedly take the time to fix the problems properly, adding the changes that were intended all along. But this course of events has created some bad feelings all around, feelings that could maybe have been avoided with a better understanding of just when a future version of GCC is expected to be able to build the kernel.

As a sort of coda, it is worth saying that Torvalds also has a fundamental disagreement with how this attribute is implemented. The __nonstring__ attribute applies to variables, not types, so it must be used in every place where a char array is used without trailing NUL bytes. He would rather annotate the type, indicating that every instance of that type holds bytes rather than a character string, and avoid the need to mark rather large numbers of variable declarations. But that is not how the attribute works, so the kernel will have to include __nonstring markers for every char array that is used in that way.
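To make the coda concrete, here is a brief sketch of the distinction; the per-variable form is what GCC actually supports, while the type-level form is hypothetical, shown only to illustrate what Torvalds would prefer:

    /* What GCC provides today: the attribute is per-variable, so every
       unterminated char array must be marked individually. */
    static const char sig_dsdt[4] __nonstring = "DSDT";
    static const char sig_facp[4] __nonstring = "FACP";

    /* A type-level annotation -- hypothetical syntax, NOT supported by
       GCC -- would mark the type once, so that every variable of that
       type would be known to hold bytes rather than a string:
       typedef char acpi_sig_t[4] __nonstring;  */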
Index entries for this article: Kernel, GCC

No safeguards?
Posted Apr 24, 2025 16:27 UTC (Thu) by estansvik (subscriber, #127963):

Seems like a broken development process if patches can be added without going through review and CI. I thought the kernel had that, even for patches from Linus?

No safeguards?
Posted Apr 24, 2025 18:53 UTC (Thu) by koverstreet (subscriber, #4296):

Automated testing/CI? No. There's automated /build/ testing that would've caught this, which Linus skipped.

But beyond that, it's "every subsystem for itself", which means filesystem folks are regularly stuck doing QA for the rest of the kernel. I've had to triage, debug, and chase people down for bugs in mm, sched, 9p, block, dm - and that's really just counting the major ones that blew up my CI, never mind the stuff that comes across my desk in the course of supporting my users.

6.14 was the first clean rc1 in ages, and I was hoping that things were improving - but in 6.15 I've lost multiple days due to bugs in other subsystems, again. And this is basic stuff that we have automated tests for, but people don't want to fund testing, let alone be bothered with looking at a dashboard.

I have fully automated test infrastructure, with a dashboard, that I could start running on other subsystem trees today, and wrapping other test suites is trivial (right now it mainly runs fstests and my own test suite for bcachefs). People just don't care, they're happy to do things the same old way they've always done as long as they still get to act like cowboys.

No safeguards?
Posted Apr 25, 2025 5:37 UTC (Fri) by marcH (subscriber, #57642):

> I have fully automated test infrastructure, with a dashboard, that I could start running on other subsystem trees today, and wrapping other test suites is trivial.

If you have already paid for enough resources, just go and do it. At least for all the suites and coverage that don't melt down your infrastructure.

> People just don't care, they're happy to do things the same old way they've always done as long as they still get to act like cowboys.

_Some_ people don't care. But there are these wonderful things called "name and shame", "peer pressure", etc. It's not nice, so people don't say it out loud, but the massive success of CI is largely based on those. There are some memes; search "you broke the build" for instance.

Don't get me wrong: these are neither an exact science nor a silver bullet. So it may make a large difference in some areas and very little in others. But for sure it _will_ make some difference and be worth it. If you can do it for a reasonable effort, then stop thinking and discussing about it; just go and do it.

No safeguards?
Posted Apr 25, 2025 6:00 UTC (Fri) by koverstreet (subscriber, #4296):

> If you have already paid for enough resources, just go and do it. At least for all the suites and coverage that don't melt down your infrastructure.

I've got enough hardware for my own needs, but it's not cheap, and I run those machines hard, so I'm not terribly inclined to subsidize the big tech companies that don't want to pay for testing or support the community. Already been down that road.
And I don't want to get suckered into being the guy who watches everyone else's test dashboards, either :)

It really does need to be a community effort, or at least a few people helping out a bit with porting more tests, and maintainers have to want to make use of it. I've got zero time left over right now for that sort of thing, since I'm trying to lift the experimental label on bcachefs by the end of the year.

No safeguards?
Posted Apr 25, 2025 18:47 UTC (Fri) by marcH (subscriber, #57642):

> I have fully automated test infrastructure, with a dashboard, that I could start running on other subsystem trees today, and wrapping other test suites is trivial

Two comments later:

> ... but it's not cheap, and I run those machines hard, so I'm not terribly inclined to subsidize the big tech companies that don't want to pay for testing or support the community. Already been down that road.

Understood, but please don't give everyone false hopes again :-)

A last, somewhat desperate attempt to change your mind: please don't underestimate the awesome power of "role-modeling"[*]. For instance, you could pick a test suite that regularly finds regressions in only a couple of the worst subsystems, and run a small test subset to keep usage to a minimum. Based on what you wrote, this could be enough to continuously highlight regressions in those subsystems and continuously crank up the pressure on them to take over what you continuously demonstrate. If you keep the test workload small, this should cost you nothing but the initial time to set it up, which you wrote would be small.

Who knows, other subsystems might even fear you'll come after them next? :-) Sorry, I meant: be continuously impressed by that demo and desire to copy it?

BTW, wouldn't that also help your _own_ workload, at least a bit? I mean, you have to keep debugging this incoming flow of regressions anyway, don't you?

[*] A personal, very recent example: I finally figured out the security model of https://github.com/actions/cache , combined it with ccache, and cut kernel compilation in paltry GitHub runners from 10 minutes down to 30 seconds. If I had a "role model" like you said you could be, I would have done this months earlier!

No safeguards?
Posted Apr 25, 2025 20:46 UTC (Fri) by koverstreet (subscriber, #4296):

> Understood, but please don't give everyone false hopes again :-)

I'm trying to motivate other people to step up and help out, either by getting it funded or contributing tests, by talking about what's already there and what we do have. I am _not_ going to get this done on my own. But I'll certainly help out and lead the effort if other people are interested.

No safeguards?
Posted Apr 25, 2025 7:42 UTC (Fri) by josh (subscriber, #17465):

> But, there are these wonderful things called "name and shame", "peer pressure", etc. It's not nice so people don't say it out loud but the massive success of CI is largely based on those.

No, one of the many massive successes of CI is that it gets *rid* of those. The right answer to "you broke the build" is not "shame on you", it's "shame on our lack of tooling, that that wasn't caught before it was merged".

No safeguards?
Posted Apr 25, 2025 14:20 UTC (Fri) by marcH (subscriber, #57642):

There are developers who have the discipline and desire to try to break their own code before sharing it.
There are others who do not and prefer to wing it in order to "save time" and effort. They don't push all the test buttons they have and wait for bug reports instead. The world is not that binary and the same people can be either at different times, but you get the idea.

Whether the latter people actually save time themselves is debatable (bug fixing can be extremely time-consuming), but for sure they massively disrupt and waste the time of the former people and of the project as a whole.

The "name and blame" comes from version control and "git blame"; CI does not change that. But automation acts as a dispassionate referee by removing most of the personal and subjective elements:

- It removes debates about workspace-specific configurations: the automated configurations are "gold".
- The first messenger is a robot.
- Choice of test coverage and priorities: you still need to discuss what gets tested, how often, etc., but these discussions happen when configuring CI, _before_ regressions and tensions happen.

It's not a silver bullet and you still have projects that ignore their own CI results, don't have enough test coverage, have enraged CI debates,... but in every case it exposes core issues at the very least, which is still major progress.

No safeguards?
Posted Apr 25, 2025 18:13 UTC (Fri) by marcH (subscriber, #57642):

> ... lack of tooling, that that wasn't caught before it was merged

_Pre-merge_ checks are critical and they do indeed make a massive difference with respect to avoiding disrupting others and reducing tensions, good point.

Automation is not just "pre-merge" though. Longer, post-merge daily/weekly test runs are still required in the many projects that limit pre-merge checks to less than ~1h for prompt feedback; those projects will still have some regressions merged. Much fewer and narrower regressions merged, but still some.

There is also the funny issue of A and B passing separately but not together. Rare, but it happens. This is solved by GitLab "merge trains", GitHub "merge queues" and similar, but these require more infrastructure.

Last but not least: the issue of flaky tests, infra failures and other false positives that degrade the signal-to-noise ratio and confuse "aggressive" maintainers who merge regressions. And who'd want to fix the flaky tests or the infra? It gets little credit and few promotions.

As often, the main issue is not technical. It's cultural and/or a business decision.

No safeguards?
Posted Apr 25, 2025 6:53 UTC (Fri) by estansvik (subscriber, #127963):

Okay, I feel your pain Kent. I'm not doing kernel stuff, but thought that patches were at least gated by some build testing. Pretty amazing for such a high-profile project to not at least have that. So it's all done on scout's honor?

No safeguards?
Posted Apr 24, 2025 19:17 UTC (Thu) by ballombe (subscriber, #9523):

"Regression testing"? What's that? If it compiles, it is good; if it boots up, it is perfect.

No safeguards?
Posted Apr 25, 2025 9:09 UTC (Fri) by Avamander (guest, #152359):

You seem to expect too-modern development practices from the kernel. Keep in mind they're still lugging around patches over email. Can't wait for them to switch to actually usable tools where patches don't get stuck in spam filters or between different mailing lists. Maybe then I'll bother to fix a few small bugs that have been in the kernel for years.

No safeguards?
Posted Apr 25, 2025 12:02 UTC (Fri) by pizza (subscriber, #46):

> Maybe then I'll bother to fix a few small bugs that have been in the kernel for years.

You'll just come up with some other excuse.

No safeguards?
Posted Apr 25, 2025 13:31 UTC (Fri) by Avamander (guest, #152359):

> You'll just come up with some other excuse.

I've actually submitted patches, but will not do so any longer because of how truly terrible the workflow is. It's not worth the hassle, especially not for small changes. Have you contributed anything though? I doubt it for some reason.

That's enough
Posted Apr 25, 2025 13:37 UTC (Fri) by corbet (editor, #1):

We really do not need to be flinging mudballs at each other; stop, please?

No safeguards?
Posted Apr 25, 2025 14:43 UTC (Fri) by pizza (subscriber, #46):

> Have you contributed anything though? I doubt it for some reason.

I'm in the MAINTAINERS file.

Bad Practice
Posted Apr 24, 2025 16:42 UTC (Thu) by JanSoundhouse (subscriber, #112627):

IMO it's pretty hard to side with Linus on this one. Making last-minute changes without even checking if it's actually working is pretty bad practice. Does Linus not have access to some kind of infrastructure that does the absolute minimal checks in the form of "does-it-compile"? Works on my machine, let's ship it!

Maybe someone can help and set up some basic actions for him on his GitHub repo?

Bad Practice
Posted Apr 25, 2025 6:49 UTC (Fri) by mathstuf (subscriber, #69389):

Agreed. The number of times "oh, this is trivial" and skipped CI has bitten me is a significant percentage of the times I've done it. *Sometimes* it works, but that "this is trivial" judgement really hampers the self-review process for spotting actual issues. But eating crow when you screw up is not a fatal thing... accept it, learn from it, and move on.

Bad Practice
Posted Apr 25, 2025 8:14 UTC (Fri) by jlargentaye (subscriber, #75206):

I agree that, as presented, this is a particularly poor showing from Torvalds.

> Maybe someone can help and set up some basic actions for him on his GitHub repo?

Surely you're joking. Or are you unaware of his poor opinion of GitHub's every design decision? To make it clear, the GitHub Linux repo is offered only as a read-only mirror.

Bad Practice
Posted Apr 25, 2025 15:38 UTC (Fri) by rweikusat2 (subscriber, #117920):

This wasn't 'shipped'. It was a release candidate published so that others could test it, precisely to catch bugs which might only occur in environments different from the one the person who put together the release candidate used. Considering that the behavior of GCC 15 and GCC 14 differs with regard to this particular use of an attribute, it's also not inappropriate to refer to the actual issue as a GCC 14 bug, or at least a property of GCC 14 the GCC developers no longer consider useful.

Bad Practice
Posted Apr 25, 2025 16:19 UTC (Fri) by pbonzini (subscriber, #60935):

No, not at all. There is a set of supported GCC versions which is much larger than "whatever Linus has on his machine". If he didn't want to check what was in linux-next or on the mailing list, he totally could have worked around the issue on his machine, but he shouldn't have pushed untested crap to the -rc.
Bad Practice
Posted Apr 25, 2025 17:21 UTC (Fri) by rweikusat2 (subscriber, #117920):

Then, what's your theory about the reason for this change of behaviour from GCC 14 to GCC 15, if it was neither a bugfix nor something the GCC developers considered a necessary improvement? Random mutation perhaps?

The very point of having a release candidate is to enable testing by others. It's not a release and bugs are expected.

Bad Practice
Posted Apr 25, 2025 18:21 UTC (Fri) by marcH (subscriber, #57642):

> The very point of having a release candidate is to enable testing by others. It's not a release and bugs are expected.

Come on, failure to compile (!) with everything except a pre-release GCC is absolutely not expected from a _release candidate_. Yes, people are expected to test release candidates and find bugs, but they are not expected to waste everyone's time with that sort of issue in 2025. This steals from actual test time.

The Error of -Werror
Posted Apr 25, 2025 18:45 UTC (Fri) by mussell (subscriber, #170320):

So right now, if you try to build the 6.15-rc3 tag with GCC 14 and CONFIG_WERROR, the error you will get is:

    /media/net/src/linux/drivers/acpi/tables.c:399:1: error: 'nonstring' attribute ignored on objects of type 'const char[][4]' [-Werror=attributes]
      399 | static const char table_sigs[][ACPI_NAMESEG_SIZE] __initconst __nonstring = {

And the GCC documentation for nonstring says:

    The nonstring variable attribute specifies that an object or member declaration with type array of char, signed char, or unsigned char, or pointer to such a type is intended to store character arrays that do not necessarily contain a terminating NUL.

According to the C standard (and contrary to popular belief), char* and char[] are two distinct types, as the latter has storage associated with it (i.e. in .rodata) while the former is a single machine word (assuming pointers are represented by machine words). What seems to have changed in GCC 15 is that you can now declare an array of char arrays as nonstring. On older compilers, trying to use an attribute where it can't be used gives a warning from -Wattributes, which is upgraded to an error by -Werror.

From my perspective, GCC did the right thing by allowing nonstring to be applied to char[][], since it aligns with programmers' expectations that char*[] and char[][] are basically the same. In fact, as I recall, Dennis Ritchie specifically tried to get fat pointers into C later, and that proposal was rejected. Well, that's disappointing. Sometimes the standardization committees are their own worst enemy :( . (FD: I'm a member of the C++ committee, but not C.)
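A minimal reproducer of the difference mussell describes, as a sketch distilled from the error quoted above (the table contents here are illustrative):

    #define __nonstring __attribute__((__nonstring__))

    /* GCC 15 accepts the attribute on an array of char arrays; GCC 14
       ignores it and emits a -Wattributes warning, which CONFIG_WERROR
       turns into the build-breaking error shown in the comment. */
    static const char table_sigs[][4] __nonstring = {
            "DSDT", "FACP",     /* four bytes each, no terminating NUL */
    };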
Imagine: the future
Posted Apr 25, 2025 11:52 UTC (Fri) by uecker (guest, #157556):

I think parent is right. Having a fat pointer means you embed some hidden information, and this is what is not "low-level." You can have length-prefixed string pointers in C just fine, and I use them in some projects. You certainly do not need Rust for this. As some other commenter pointed out, the APIs are all just built around traditional C strings, and we would need to agree on a single string pointer type to really get people to switch (which we should do in WG14).

Imagine: the future
Posted Apr 25, 2025 13:28 UTC (Fri) by tialaramex (subscriber, #21167):

How is the fat pointer "hidden information"? What's hidden about it?

The thing for WG14 to have done was land the fat pointer types for, say, C99. I think that would have been one of those "controversial at the time but rapidly proved correct" choices, like removing gets in C11. If you (or WG14 as a whole) make that happen in C2b that'd be welcome, but it's very late; I do not expect I would make any use of this, the decades of C programming are behind me.

The Pascal-style length-prefixed string is completely orthogonal to the fat pointer string slice. That's why, if I look inside a Rust binary, it has text like "EPERMENOENTESRCHEINTREIOENXIOE2BIGENOEXECEBADF" baked inside it; there aren't any zero terminators _or_ length prefixes, neither is needed to refer to EPERM or E2BIG in that text with a slice.

Imagine: the future
Posted Apr 25, 2025 16:54 UTC (Fri) by khim (subscriber, #9252):

> I think parent is right. Having a fat pointer means you embed some hidden information and this is what is not "low-level."

That's just nonsense. By that reasoning we shouldn't have structs, or opaque pointers, or any other ways of "hiding" information... yet that's literally the bread-and-butter of any large project, including the Linux kernel.

> You certainly do not need Rust for this.

You need something-else-not-C for that, though. We have all that stupidity because C literals behave like they do, and without changing the language one couldn't fix that problem.

> As some other commenter pointed out, the APIs are all just built around traditional C strings and we would need to agree on a single string pointer type to really get people to switch (which we should do in WG14).

Yeah, that, too. Rust got saner strings than C because it started from scratch, while C++ is a mess because it didn't. The question of "how often a new language has to be introduced" is a good one, but it feels as if the good answer is somewhere between "every 10 years" and "every 20 years"... with all languages being supported for about 3-5x as long. Simply because certain things do need a full rewrite on new foundations with certain critical fixes embedded... and yet the only way to efficiently do that is via the well-known "one funeral at a time" way... which means languages have to change with generations... and these happen once per approximately 15 years.
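As a sketch of the idea being debated in this subthread (the names are illustrative): a "fat pointer" in C is just a pointer paired with a length, which is enough to refer to an unterminated slice like the error-name blob tialaramex describes:

    #include <stdio.h>
    #include <stddef.h>

    /* A fat pointer: the length travels with the data, so no terminator
       (and no length prefix in the data itself) is needed. */
    struct strview {
            const char *data;
            size_t len;
    };

    /* Refer to a substring without copying or terminating it. */
    static struct strview slice(const char *s, size_t off, size_t len)
    {
            return (struct strview){ s + off, len };
    }

    int main(void)
    {
            static const char names[] = "EPERMENOENTESRCH";
            struct strview eperm = slice(names, 0, 5);      /* "EPERM" */
            printf("%.*s\n", (int)eperm.len, eperm.data);   /* prints EPERM */
            return 0;
    }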
Imagine: the future
Posted Apr 25, 2025 15:11 UTC (Fri) by excors (subscriber, #95769):

I've recently been looking at FORTRAN 77, and it effectively takes the fat pointer approach too. You declare a character variable (i.e. a string) with a compile-time-constant size, like `CHARACTER*(72) MSG`. Subroutines can declare a dummy argument (parameter) as `CHARACTER*(*)`, meaning the caller determines the size of the string. Assignment will automatically truncate to the string's actual size, so buffer overflows are impossible (in this case); and assignment of a shorter string will pad the result with space characters. (Most built-in operations ignore trailing spaces, e.g. 'A' .EQ. 'A ' will return .TRUE., so you can have variable-length strings in statically-allocated storage without an explicit terminator.) That also works when you pass a substring into a call, much like string slices in Rust, meaning the size may be dynamic and the compiler has to pass it as a hidden argument alongside the pointer to the bytes. (Subroutines can also declare the dummy argument with an explicit size, in which case the size passed by the caller is ignored. If it's declared as larger than the actual size, you get undefined behaviour.)

It seems to get a bit tricky when you have functions that return `CHARACTER*(*)` (meaning the caller has to provide the storage and pass it as another hidden argument) and use concatenation (`MSG = "HELLO " // WORLD()` etc., meaning an efficient compiler has to figure out where the result of the concatenation is going to be assigned to, and how much has already been written, before it calls the function, so WORLD can write its output directly into a substring of MSG; and if there are zero bytes left then the compiler is explicitly allowed to not call the function at all). And character arrays are weird. But in general, considering it was half a century ago, it doesn't seem an entirely terrible system.

Earlier versions of FORTRAN sound actually terrible, since there was no CHARACTER type; apparently you'd just store multiple characters (maybe up to 10, depending on platform) in an INTEGER or DOUBLE PRECISION variable and manipulate them arithmetically. CHARACTER wasn't standardised until FORTRAN 77, long after the origins of C, so I guess it wouldn't have served as inspiration for C, but at least those ideas were around in the 70s.
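A rough C sketch of those FORTRAN 77 assignment and comparison semantics (illustrative only, not how any actual Fortran compiler implements them): assignment truncates or blank-pads to the field's fixed size, and comparison ignores trailing blanks:

    #include <stdio.h>
    #include <string.h>

    /* FORTRAN-77-style assignment into a fixed-size field: truncate if
       too long, pad with spaces if too short; no NUL terminator. */
    static void f77_assign(char *dst, size_t dstlen,
                           const char *src, size_t srclen)
    {
            size_t n = srclen < dstlen ? srclen : dstlen;
            memcpy(dst, src, n);
            memset(dst + n, ' ', dstlen - n);
    }

    /* Comparison that ignores trailing blanks, like 'A' .EQ. 'A '. */
    static int f77_eq(const char *a, size_t alen,
                      const char *b, size_t blen)
    {
            while (alen && a[alen - 1] == ' ') alen--;
            while (blen && b[blen - 1] == ' ') blen--;
            return alen == blen && memcmp(a, b, alen) == 0;
    }

    int main(void)
    {
            char msg[8];
            f77_assign(msg, sizeof(msg), "HELLO", 5);             /* "HELLO   " */
            printf("%d\n", f77_eq(msg, sizeof(msg), "HELLO", 5)); /* prints 1 */
            return 0;
    }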
Imagine: the future
Posted Apr 25, 2025 16:20 UTC (Fri) by eru (subscriber, #2753):

FORTRAN was (and I guess still is) used for number crunching, and strings were needed mainly for labelling data in output to fanfold paper. So you can:

          WRITE 5,100
      100 FORMAT(41HGET AWAY WITH NONEXISTENT STRING FEATURES)

Imagine: the future
Posted Apr 25, 2025 19:08 UTC (Fri) by Wol (subscriber, #4433):

(As I understand it, the "official" spelling is FORTRAN IV and earlier, Fortran 77 and later ...)

So Fortran actually has strings as part of the language. I'd have thought it had a double size - the space available, and the space used. And from using FORTRAN, I know we had a string library, and we just shoved them into an integer array, or used Hollerith variables, as in 16HThis is a string. I don't personally remember using Hollerith, though.

Cheers,
Wol

Imagine: the future
Posted Apr 25, 2025 16:28 UTC (Fri) by eru (subscriber, #2753):

I don't know much about Rust, but I am guessing it then also needs raw pointers and conversions between them and fat pointers, otherwise some low-level operations would be impossible to program. I believe there is absolutely no point in extending C with such things now. C occupies its particular niche as a "just do what I say" language fine, and someone who is not happy with its feature set can and should use some more modern language, like Rust.

Imagine: the future
Posted Apr 25, 2025 17:14 UTC (Fri) by khim (subscriber, #9252):

> I don't know much about Rust, but I am guessing it then also needs raw pointers and conversions between them and fat pointers, otherwise some low-level operations would be impossible to program.

Sure. from_raw_parts is there for cases where it's needed (like in FFI).

> C occupies its particular niche as a "just do what I say" language fine

I would say that C occupies the niche of an "it should just go away, eventually" language. Just like we no longer use PL/I (like Multics) or Pascal (like MacOS)... C should, eventually, go away, too. It's just a bit sad that, with all the fascination about managed code, people stopped thinking about a replacement for low-level languages... we have got one, sure, but that was actually an accident; the plan was to produce yet-another-high-level-language.

P.S. And, of course, it would be stupid to start the grand rewrite from the OS kernel: OS kernels are extremely stubborn and it takes a lot of time to do them right... but apparently we are slowly going in that direction, anyway. Studies show that green, just-written Rust code is comparable in quality to C code that has had 3-5 years of bugfixing... which means a Rust rewrite only makes sense when the code in question is being rewritten for some other reason, not just because "it's Rust, it's shiny, let's rewrite everything"... even if, somehow, people are still doing it, anyway.

Imagine: the future
Posted Apr 25, 2025 17:16 UTC (Fri) by intelfx (subscriber, #130118):

> C occupies its particular niche as a "just do what I say" language fine

As repeatedly demonstrated by optimizing compilers, the problems start when C does *not* "just do what I say".

Imagine: the future
Posted Apr 25, 2025 18:53 UTC (Fri) by marcH (subscriber, #57642):

Mandatory reference... https://queue.acm.org/detail.cfm?id=3212479 "C Is Not a Low-level Language"

Rust thin and fat pointers
Posted Apr 25, 2025 17:17 UTC (Fri) by farnz (subscriber, #17727):

In Rust, whether a raw pointer is "thin" or "fat" is determined by the type of the pointee; a *const u32 is always a thin pointer to a u32, while a *const (dyn Debug) is always a fat pointer. Note that wherever I say *const, there's an equivalent *mut for raw pointers to mutable memory, and you can only get mutable (exclusive) references from *mut pointers.

There's also an API (marked experimental, because there's a chunk of details that everyone wants to get right before it stabilises and cannot be changed if it's bad) that lets you "split" any pointer into a *const () (which is always thin) and an opaque representation of the other part (either a zero-sized type for a thin pointer, or the other half of the fat pointer), and rebuild a pointer from a thin pointer and the opaque representation of the other half.

Additionally, there are casts that let you do any conversions between raw pointer types that you want to do, and a set of APIs (the "strict provenance" APIs) which exist to allow you to write code that will fail to compile if it does things that aren't guaranteed to work on all reasonable CPUs and compilers - at the expense of also failing to compile some things that are guaranteed to work on all reasonable compilers for your choice of CPU.

It ends up being a much bigger API surface for raw pointers than C has, but that's largely because where C's API surface says "make a mistake in a reasonable use of pointers, and you have UB", Rust is aiming for "there is a function in the API surface that will either work, or fail to compile, if your use of raw pointers is reasonable".

Cataloguing errors and fixes
Posted Apr 24, 2025 17:59 UTC (Thu) by geofft (subscriber, #59789):

A couple years back, at my day job, we set up a Slack bot to auto-respond to particular error messages (with regex matching) and suggest what you should do about them. Usually we'd run into this situation with something that had been fixed on trunk, but would still affect people with out-of-date clones, people intentionally working on older branches, production systems that hadn't been redeployed, etc.
A few of us noticed we were spending a lot of time manually replying on Slack, which also meant the person asking had to wait until someone saw their message and replied; the bot was a way to write a high-quality answer once and get it to people as soon as they asked. It got to the point where we would proactively set up regex matches and replies for migrations that we knew would have backwards incompatibilities, even before shipping the migration.

In this case, you could imagine Kees setting up a responder for these particular GCC errors and a reply saying that the patches to fix them are over here and are waiting to be pulled into mainline (assuming, hypothetically, that there were some system like Slack or even email that Linus would have posted a message to saying, "anyone run into these errors before?").

I've always thought that it would be a neat next step to integrate this directly into command-line output, somehow, so that if your terminal ran a command that produced an error, it could look it up against some trusted/curated source of known errors applicable to your work and display an alert if it recognized one. The key trick here is that the catalog of errors has to be online in some fashion: this isn't just about improving error messages or behavior in the code you have on disk, because you want to be able to intelligently respond to problems that were discovered after whatever git commit you're on. ("Online" could of course mean that you regularly download a database of errors and query it locally; it doesn't have to involve sending your build logs somewhere in real time, but the database has to be updated frequently.)

make -j allmodconfig
Posted Apr 25, 2025 9:45 UTC (Fri) by adobriyan (subscriber, #30858):

It took 30 minutes to compile an amd64 allmodconfig on my previous potato (8c/16t) and it takes 13.5 minutes on the current one (16c/32t). So it is ~4 hours per allmodconfig per 1c/2t core per 1 compiler. Allyesconfig is even worse. Good luck with that, dear volunteers.
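Checking the arithmetic in that comment: 30 minutes across 8 cores is 30 × 8 = 240 core-minutes, and 13.5 minutes across 16 cores is 13.5 × 16 = 216 core-minutes, so one allmodconfig build does indeed cost roughly four hours of work per 1c/2t core, per compiler tested.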
Do not underestimate the impact of fedora breakage.
Posted Apr 25, 2025 11:29 UTC (Fri) by ballombe (subscriber, #9523):

GCC 15 will introduce a change of the default C standard level which causes some software to fail to build. For a list of examples, see <https://bugs.debian.org/cgi-bin/pkgreport.cgi?tag=ftbfs-g...>

However, GCC has a reliable and well-documented release timeline, so software projects know when they need to publish a new version to support GCC 15. By releasing Fedora with an unreleased version of GCC, Fedora causes all such software projects to have to rush to fix GCC-15-related breakage before that date or appear broken to Fedora users. This is not respectful of the GCC team nor of the other projects.

This is gcc 2.96 all over again.

Do not underestimate the impact of fedora breakage.
Posted Apr 25, 2025 12:38 UTC (Fri) by jwakely (subscriber, #60262):

No, this is overblown rubbish. Fedora does this *every year*, and the release of Fedora 42 with a GCC 15.0.1 snapshot happened 10 days before the release of the final GCC 15.1 version. Do you really think everybody was waiting for those last 10 days to fix their code, and were just caught unawares by Fedora 42 being a few days sooner? Don't be silly.

As for not being respectful of the GCC team, the people who made the Fedora changes are many of the same people releasing GCC 15. The maintainer for Fedora's gcc package was also the release manager of GCC 15.1.

GCC 2.96 was a completely different situation, with a version that didn't exist upstream, containing unreleased changes unique to the "2.96" release. The snapshot in Fedora 42 is almost identical to the final GCC 15.1, and will be updated to a newer 15.1.1 snapshot in a few days.

Do not underestimate the impact of fedora breakage.
Posted Apr 25, 2025 17:49 UTC (Fri) by geert (subscriber, #98403):

Can you build all packages (incl. the Linux kernel) in Fedora 42 with the compiler that comes with Fedora 42?

Do not underestimate the impact of fedora breakage.
Posted Apr 25, 2025 18:14 UTC (Fri) by pizza (subscriber, #46):

> Can you build all packages (incl. the Linux kernel) in Fedora 42 with the compiler that comes with Fedora 42?

Yes, that is a hard requirement for being shipped as part of F42. (Though that may require shipping patches that are not yet upstream.)

Do not underestimate the impact of fedora breakage.
Posted Apr 25, 2025 22:32 UTC (Fri) by AdamW (subscriber, #48457):

No it isn't. The F42 FTBFS (Fails To Build From Source) tracker still depends on several hundred open bugs - https://bugzilla.redhat.com/show_bug.cgi?id=2300528 . Not all of those are triggered by GCC 15, but some are for sure. It's not practically possible to ensure *every* package in a distro as big as Fedora builds with a new compiler version, and the release date of the compiler is only really marginally related (GCC 15 is now officially released, but that doesn't mean all those apps have fixes upstream now).

Do not underestimate the impact of fedora breakage.
Posted Apr 25, 2025 12:43 UTC (Fri) by pizza (subscriber, #46):

> By releasing Fedora with an unreleased version of GCC, Fedora causes all such software projects to have to rush to fix GCC-15-related breakage before that date or appear broken to Fedora users.

You conveniently forget that Fedora shipping the bleeding edge of compilers has been their MO from the beginning, and that shipping an almost-released compiler happens approximately every few releases. You also conveniently leave out the fact that Fedora packagers are the primary contributors of upstream fixes for problems that using bleeding-edge GCC uncovers.

> This is not respectful of the GCC team nor of the other projects.

...You mean the GCC folks that are employed by Red Hat... and produce and maintain the Fedora GCC packages?

Meanwhile, what's the effective difference for other projects? They will still need to be fixed for GCC 15 regardless, and approximately nobody pre-emptively tries major new compiler releases before they get packaged and shipped in at least one major distribution. And even then, they probably don't care what a bleeding/rolling distribution like Fedora ships, because the only binaries they support are produced by ancient EL/LTS releases. (On that note, I recently filed a bug against AppImage's tooling because it currently barfs when you try to use it with binaries produced with <3yr-old binutils.)

> This is gcc 2.96 all over again.

Oh, you mean it produces C++ binaries that are ABI-incompatible with both the previous _and_ subsequent releases due to an ongoing years-long rewrite of the C++ standard library, plus other breakages caused by fixing innumerable spec-compliance bugs?
GCC pre-releases and venerable traditions
Posted Apr 25, 2025 11:35 UTC (Fri) by decathorpe (subscriber, #170615):

All even-numbered Fedora releases ship with new major versions of GCC; that's nothing new. And given how GCC and Fedora development cycles line up, it's usually a snapshot close to the first stable release of the new version: the Fedora 42 mass rebuild was done with a very early GCC 15 snapshot (this is basically the first "real world" testing new GCC versions get every year!), and it was now released with a very late GCC 15 snapshot that isn't quite yet the new 15.1 release (though I don't really understand why 15.1 is the first stable release and not 15.0).

It's definitely not an ideal situation, but arguably better than shipping a less-tested major version ~5 months after release instead.

GCC pre-releases and venerable traditions
Posted Apr 25, 2025 12:49 UTC (Fri) by jwakely (subscriber, #60262):

> though I don't really understand why 15.1 is the first stable release and not 15.0

Because what number do you give "not yet 15.0" snapshots? 14.99? 15.0.0 means the development trunk for the 9-10 months of new-feature development, then 15.0.1 is a pre-release snapshot as we approach the release, then 15.1.0 is the first release. After that release, new snapshots from that branch are 15.1.1, until the 15.2.0 release, then snapshots will be 15.2.1, and so on.

https://gcc.gnu.org/develop.html#num_scheme

GCC pre-releases and venerable traditions
Posted Apr 25, 2025 13:51 UTC (Fri) by mathstuf (subscriber, #69389):

> Because what number do you give "not yet 15.0" snapshots? 14.99?

That's one option. What we do on CMake is, when branching, we set the `main` branch's patch number "to the stratosphere", usually with YYYYMMDD, but a static "high" number like 100 also works (though, e.g., the kernel may prefer 1000 here since it *does* reach 100+ patch releases). The minor level is shared between the latest release branch and `main`. So if you have official releases:

    14.1.0 (released 2025-04-01)
    14.1.1 (released 2025-04-13)
    14.2.0 (released 2025-04-23)

while, correspondingly, `main` could be:

    14.1.20250401  # post-14.1.0
    14.1.20250402
    ...
    14.1.20250412
    14.1.20250413  # post-14.1.1
    14.1.20250414
    ...
    14.1.20250422
    14.2.20250423  # post-14.2.0

We bump the patch level every night, but it may also make sense to do so only if the last change was not a patch-date bump. This allows main-tracking users to get at least some gradient on headed-to-the-next-version development if needed, while not "reserving" a magic `.0` patch version interpretation or doing even/odd splits.

GCC pre-releases and venerable traditions
Posted Apr 25, 2025 14:18 UTC (Fri) by decathorpe (subscriber, #170615):

Why introduce even more special numbers? Ways to indicate pre-releases that are not based on assigning arbitrary meaning to "special" numbers exist, like 15.0.0-pre.

GCC pre-releases and venerable traditions
Posted Apr 25, 2025 16:49 UTC (Fri) by jwakely (subscriber, #60262):

15.0.0 generally sorts before 15.0.0-pre though, and the advantage of the current system is that it doesn't need to add any new parts; it fits the longstanding major.minor.patchlevel numbering scheme.

Fedora not stable distro.
Posted Apr 25, 2025 14:38 UTC (Fri) by r1w1s1 (subscriber, #169987):

Personally, I've always seen Fedora as Red Hat's testing lab -- not a stable distro. But Linus uses it as a daily driver anyway.

Fedora not stable distro.
Posted Apr 25, 2025 18:28 UTC (Fri) by marcH (subscriber, #57642):

Using a bleeding-edge distro as a "daily driver" is fine. What is not fine is not having at least one _other_, more stable system at least compile-testing release candidates. Ideally, this would of course be done in some basic kernel.org CI but, short of having any CI for release candidates for some unknown reason, some other box or VM would also do the trick.

on Fedora and GCC
Posted Apr 25, 2025 22:09 UTC (Fri) by AdamW (subscriber, #48457):

"Fedora 42 has been released, though, and the Fedora developers, for better or worse, decided to include a pre-release version of GCC 15 with it as the default compiler. The Fedora project, it seems, has decided to follow a venerable Red Hat tradition with this release."

Well, no, it's a bit more complicated than this. We include a technically-prerelease GCC with *every* even-numbered Fedora release. We have done so all the way back to Fedora 28. Even-numbered Fedora releases have .0.1 gcc builds, which - per the GCC plan, https://gcc.gnu.org/develop.html - are late (stage 4) pre-releases. The first release in a GCC series is always versioned .1. Fedora 28 had gcc 8.0.1 (pre-release for 8.1). 30 had 9.0.1, 32 had 10.0.1, and so on.

This is nothing like the ancient-history GCC 2.96 thing, because it's all worked out and co-ordinated with the GCC developers, who are *entirely* aware of it. It's effectively part of the GCC development process: rebuilding everything in Fedora with the under-development GCC shakes out a lot of bugs and issues that otherwise might not be found till much later. The kernel team in general is also aware of this, AFAIK. It really does seem like Linus was, uh, kinda headstrong in just slapping in a 'fix' for his box instead of working with Kees, here.

Copyright (c) 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds.