[HN Gopher] White space does matter in C23
___________________________________________________________________
White space does matter in C23
Author : ingve
Score : 62 points
Date : 2024-01-17 06:45 UTC (1 days ago)
(HTM) web link (gustedt.wordpress.com)
(TXT) w3m dump (gustedt.wordpress.com)
| PaulHoule wrote:
| ... puts the C in Cthulhu.
| tmtvl wrote:
| I think most programming languages have syntactically significant
| whitespace. I believe Fortran doesn't (or didn't), which helped a
| bug fly under the radar at NASA:
|
|     DO 10 I=1.10
|
| Which got interpreted as:
|
|     DO10I = 1.10
|
| Whereas the programmer wanted:
|
|     DO 10 I=1,10
|
| For a DO loop. Conversely, with SSW a language will evaluate
| these two expressions differently:
|
|     inta = 10;
|     int a = 10;
| pdw wrote:
| Algol 60 also allowed whitespace in variable names. But they
| had a solution to avoid Fortran's confusion: keywords had to be
| specially marked.
| https://en.wikipedia.org/wiki/Stropping_(syntax)
| jll29 wrote:
| White space in variable names is a bad idea.
|
| And not everything that is possible is worth doing; e.g. I
| once designed a language ("Leazy") where keywords don't have
| to be declared, and can be used as variable names, just to
| show I could still write an LL(1) recursive descent parser
| for it. You don't want that in anything for daily use, as it
| introduces confusion.
| nerdponx wrote:
| Meanwhile both SQL (and more recently Python) have tokens
| that are keywords in certain contexts and regular
| identifiers in others.
| Findecanor wrote:
| PL/I was infamous for it being possible to express valid
| code that read "IF IF THEN THEN ELSE ELSE".
| layer8 wrote:
| C++ has some contextual keywords as well:
|
|     final (C++11)
|     override (C++11)
|     import (C++20)
|     module (C++20)
| avgcorrection wrote:
| > White space in variable names is a bad idea.
|
| Pff. You can have your cake and eat it too: disallow
| whitespace in variable names except no-break space. ;)
| enriquto wrote:
| And the same thing for filenames!
|
| Writing shell script under the assumption that filenames
| do not contain spaces is a liberating experience. I want
| more of that! And it is nearly possible, by tr ' '
| 0x00A0'ing every call to fopen, (probably as an option
| for mount).
| rogerbinns wrote:
| Even more fun is zero length names. In SQLite they didn't
| require table and column names to be at least one
| character, so you can do this:
|
|     CREATE TABLE []([] []);
|
| Which will create a table with zero length name containing
| one column with a zero length name and zero length type.
| And yes you can do all the regular SQL against them
| providing you quote the zero length name.
| Someone wrote:
| > where keywords [...] can be used as variable names
|
| PL/I has that, too, because the designers thought you
| couldn't expect programmers to know all keywords. For PL/I,
| that's a correct assumption. Implementations can have
| hundreds of keywords, and some of them are single-letter
| (http://bitsavers.trailing-edge.com/pdf/ibm/series1/GC34-0084...
| pages 19-25 mention A, B, E, F, P, R, S, V and X)
| layer8 wrote:
| We'd have long debates about spaces vs. tabs in
| identifiers. ;)
| formerly_proven wrote:
| Pfft, just make your syntax a prefix-free code.
| actionfromafar wrote:
| FORTRAN also had (has?) significant columns:
|
| https://web.stanford.edu/class/me200c/tutorial_77/03_basics....
| pklausler wrote:
| Fortran '90 and later has some requirements for blanks, but a
| parser that also needs to be able to parse F'77 can't rely on
| them -- so I have to go out of the way to detect missing blanks
| and complain about them.
|
| This feature makes some tokenization ambiguous without context
| -- is MODULEPROCEDUREFOO to be interpreted as "MODULE
| PROCEDUREFOO" or "MODULE PROCEDURE FOO"? But tokenization
| without any reserved words is a tricky problem anyway.
| o11c wrote:
| Link gives me JSON, not HTML?
|
| The JSON appears to mention that this is a regression affecting
| `U"string"` where U is a macro (that expands to a string
| literal).
|
| Obviously there are numerous examples of where whitespace always
| mattered even in prior versions.
| omoikane wrote:
| R"(x)" literals are neat not just because whitespaces matter, but
| also because they are tokenized before macro expansion. Thus you
| can write a C23 detector like this:
|     #include<stdio.h>
|     #define r(R) R"()"
|     int main()
|     {
|       puts(r()[0] ? "C99" /* r() evaluates to "()" */
|                   : "C23" /* r() evaluates to "" */);
|     }
|
| Output: https://gcc.godbolt.org/z/Wj3s6KEGK
|
| I have used that trick here:
|
| https://www.ioccc.org/years.html#2015_yang
|
| (C23 wasn't a thing back then, but the same trick can be used to
| differentiate C++11 from C++98).
| silasdavis wrote:
| And that manages to be the most intelligible part of prog.c
| rwbt wrote:
| That's clever!
| defen wrote:
| This is checking for the presence of raw string literals (a GNU
| C extension), not C23. If you compile with `-std=gnu99` instead
| of `-std=c99` you'll get "C23" as output.
| Sharlin wrote:
| The context is different standard versions. Random extensions
| don't count. C23 has raw string literals, C before 23
| doesn't.
| ksherlock wrote:
| > C23 has raw string literals
|
| Are you sure about that? I only see u, u8, U, and L defined
| as encoding-prefixes.
| defen wrote:
| No it doesn't. If you don't specify a standard for GCC it
| uses GNU extensions by default.
| omoikane wrote:
| My bad, I just saw "R()" in the linked blog and thought the
| feature made it to C23, but looks like it's not standard.
|
| https://en.cppreference.com/w/c/23
|
| On the plus side, I now have a GNU extension detector.
| complianceowl wrote:
| White Space Matters
| Whitespace wrote:
| You're damn straight I do!
| downvotetruth wrote:
| Not in else case.
| jxy wrote:
| > Generally, it is often assumed that in C spaces don't
| contribute much to the interpretation of programming text
|
| I can think of only one exception. In function-like macro
| definitions, the opening parenthesis `(` must directly follow the
| identifier. Though I guess the newline is significant in macro
| definitions in general, too.
|
| Are there other places where white space matters?
___________________________________________________________________
(page generated 2024-01-18 23:00 UTC)