[HN Gopher] Progress on C23
       ___________________________________________________________________
        
       Progress on C23
        
       Author : ingve
       Score  : 82 points
       Date   : 2021-09-05 12:22 UTC (10 hours ago)
        
 (HTM) web link (thephd.dev)
 (TXT) w3m dump (thephd.dev)
        
       | gavinray wrote:
       | Apostrophes for integer-literal separators?!
       | 
       | C'mon, EVERY other language I've ever worked with that had them
       | uses underscores.
       | 
       | Why did you do this =(
       | 
       | Grateful there is now any symbol for this, but this is an
       | incredibly unintuitive one.
       | 
       | Programming languages use underscores, (most) countries use
       | commas (in varying decimal group positions).
       | 
       | What a confusing choice.
        
         | uluyol wrote:
         | C++ uses apostrophes. If you want to use numbers in header
         | files that are common to both C and C++ code, you want the
         | syntax to be the same.
        
           | gavinray wrote:
           | Welp, yeah I'd say that's a pretty good reason.
           | 
           | I then shift blame to whoever decided that was a good idea
           | for C++, tossing it on the towering pile of similar questions
           | directed towards C++'s architecture.
        
             | jcelerier wrote:
             | it would conflict with user defined literals which were
             | there before:
             | 
             |     auto operator""_01(unsigned long long l) {
             |       return 0;
             |     }
             | 
             |     int x = 100_01; // x == 0
             | 
             | so, as always, backward compat :D
             | 
             | (grepping in my ~ I see at least one occurrence of `operator
             | "" _0b` and if you allow `_0b` it's going to be hard to
             | justify not allowing `_0123456`).
        
               | WalterBright wrote:
               | A leading underscore just makes it an identifier. After
               | all, ' has a similar ambiguity - is '0' an integer or a
               | character? We use embedded _ in D with no problems.
        
       | Ericson2314 wrote:
       | What is the struct layout for _BitInt(N)? Can we please please
       | please have a "super packed" to deprecate bitfields?
        
       | chromatin wrote:
       | Nice. I welcome the inclusion of UTF16 and UTF32 types being
       | standardized, and `stdckdint.h` looks nice.
       | 
       | That said, I left C for D years ago. I get easy, almost
       | transparent use of C libraries, ability to run with or without GC
       | (`-betterC`), metaprogramming, digit group separators (one of the
       | changes slated for C23), UTF16/32 types, and an amazing standard
       | library. (Full disclosure, much of the std library Phobos is GC
       | dependent but this is being worked on)
        
         | gavinray wrote:
         | Do you use "-BetterC" (WorseD) flag though? Am curious. Also
         | yes, D is love.
        
           | chromatin wrote:
           | LOL. In truth, I use so much `std` (which is an amazing piece
           | of work) that I don't use `-betterC`. I do sometimes do my
           | own memory allocs using `std.allocator` though.
        
       | xeeeeeeeeeeenu wrote:
       | >this also means that you can use it to print the "low" bits of
       | an int when trying to just read the low-end bits as well, such as
       | by using printf("%w8d", 0xFEE);
       | 
       | Note that this happens to work when the argument's type is int
       | (or anything smaller) only because of default argument
       | promotions. For larger types it will cause undefined behaviour.
       | So, for example, printf("%w8d", 0x1LL) is not legal.
        
         | nynx wrote:
         | Insane that it's this easy to cause undefined behavior. Forgot
         | what size a "long long" is on your arch when writing a print
         | statement? Undefined behavior.
        
           | tux3 wrote:
           | Multiplying unsigned shorts is famously (?) undefined
           | behavior on most platforms.
           | 
           | They get promoted to int, and the result overflows.
           | 
           | https://stackoverflow.com/questions/33732489/
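A minimal C sketch of the promotion problem described above, assuming a platform where `unsigned short` is 16 bits and `int` is 32 bits. The helper name `mul_u16` is hypothetical. In `a * b` both operands are promoted to signed `int`, so 65535 * 65535 overflows; converting one operand to `unsigned` first keeps the arithmetic in `unsigned int`, where wraparound is well defined.

```c
#include <assert.h>
#include <limits.h>

/* Assumption of this sketch: unsigned int has at least 32 value bits. */
static_assert(UINT_MAX >= 4294967295u, "sketch assumes 32-bit unsigned int");

/* Multiply two unsigned shorts without risking signed overflow.
   The cast forces the usual arithmetic conversions to pick unsigned int
   instead of int for the multiplication. */
unsigned mul_u16(unsigned short a, unsigned short b)
{
    return (unsigned)a * b;
}
```

Writing the body as `return a * b;` would compile without complaint on most compilers, which is part of why this case is easy to miss.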
        
           | vlovich123 wrote:
           | This seems like one of the more innocuous ones that's
           | automatically caught by compiler warnings. Admittedly
           | projects have to turn that on and fix warnings (or build with
           | errors) and not all do. The standard should really mandate
           | that certain kinds of warnings today just be unconditional
           | compiler errors.
        
       | nn3 wrote:
       | I missed what the free_sized() is good for. From a naive look it
       | seems redundant.
       | 
       | Still no computed goto?
       | 
       | Some progress on standardizing inline assembler syntax would be
       | nice too.
       | 
       | Or maybe it won't matter anymore because everyone will be using
       | either gcc or clang.
        
         | aseipp wrote:
         | Modern allocators divide allocations into "size classes" based
         | on the size of the allocation (normally rounded up to some
         | power of 2 or whatever.) When you call something like free(m),
         | the allocator will often have to figure out what size class 'm'
         | was put into before it can proceed. For example, once you free
         | some memory you might not return it to the OS, but keep it
         | around in a cache structure so that it can be re-used. You can
         | only put 'm' into the right cache if you compute the size class
         | and use that. If the metadata for the object is "out of band"
         | (i.e. not next to it directly), figuring out the size class of
         | 'm' might be a little costly.
         | 
         | If your allocator can't use the sizing information when
         | free'ing, it can just be a no-op. If it can, it can result in
         | some performance gains.
         | 
         | You also get the bonus that the allocator can run stricter
         | consistency checks on the object itself, i.e. free(m,size) can
         | ensure that 'm' actually is of the given 'size', or abort the
         | program. This can help find and catch latent bugs.
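The size-class idea can be sketched in C roughly as follows. This is a toy illustration under stated assumptions, not how any real allocator is implemented: the power-of-two classes starting at 16 bytes, and the names `size_class_index`, `toy_free_sized`, and `toy_take_cached`, are all hypothetical. The point is that when the caller supplies the size, the right cache can be found with arithmetic alone, without consulting out-of-band metadata.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: class 0 holds up to 16 bytes, class 1 up to 32, ... */
enum { MIN_CLASS_SHIFT = 4, CLASS_COUNT = sizeof(size_t) * CHAR_BIT };

/* Intrusive free-list node stored inside each cached block. */
struct free_node {
    struct free_node *next;
};

static struct free_node *class_cache[CLASS_COUNT];

/* Index of the smallest power-of-two class that can hold `size` bytes. */
size_t size_class_index(size_t size)
{
    size_t index = 0;
    size_t capacity = (size_t)1 << MIN_CLASS_SHIFT;

    /* The second condition stops doubling before capacity could wrap. */
    while (capacity < size && capacity <= SIZE_MAX / 2) {
        capacity <<= 1;
        index++;
    }
    assert(index < CLASS_COUNT);
    return index;
}

/* Sized free: push the block onto the cache for its class instead of
   returning it to the OS. No metadata lookup is needed to pick the cache. */
void toy_free_sized(void *ptr, size_t size)
{
    if (ptr == NULL)
        return;
    struct free_node *node = ptr;
    size_t index = size_class_index(size);
    node->next = class_cache[index];
    class_cache[index] = node;
}

/* Pop a cached block for this size class, or return NULL if none is cached. */
void *toy_take_cached(size_t size)
{
    size_t index = size_class_index(size);
    struct free_node *node = class_cache[index];
    if (node != NULL)
        class_cache[index] = node->next;
    return node;
}
```

A checking allocator could instead compare the computed class against its own metadata and abort on a mismatch, which is the consistency check mentioned above.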
        
         | pkhuong wrote:
         | sized free is good for safety if your malloc cross-checks the
         | size argument with its metadata, and for performance if your
         | malloc assumes the size argument is correct without having to
         | hit malloc's internal metadata.
         | 
         | There's also a middle ground where you get more ILP by assuming
         | the size argument is correct (for performance), and overlap the
         | work of `free`ing memory with a confirmation that the size
         | argument matches malloc's internal metadata.
        
       | marcodiego wrote:
       | My favorite promises of c2x:
       | 
       |     Closures:
       |     http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2737.pdf
       |     Type inference for variable definitions and function returns:
       |     http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2735.pdf
       |     Type generic programming:
       |     http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2734.pdf
       |     Defer mechanism:
       |     http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2589.pdf
       | 
       | What I miss the most: the preprocessor still has no macro that
       | modifies or creates a new macro.
        
         | vlovich123 wrote:
         | At what point does this just become C++ with less library
         | support?
        
       | WalterBright wrote:
       | The article didn't mention fixing C's Biggest Mistake
       | https://www.digitalmars.com/articles/C-biggest-mistake.html which
       | has a simple and backwards compatible fix for C buffer overflows,
       | probably the single biggest cause of memory safety bugs.
        
         | __phantomderp wrote:
         | Hi! Article author here. I have long, long since waxed poetic
         | about how many bugs and problems this can solve (even just a
         | poor man's library version):
         | 
         | https://twitter.com/__phantomderp/status/1381314735174524928
         | 
         | And, very recently, I have begun to scheme and agitate for a
         | feature similar to what your article proposes:
         | 
         | https://twitter.com/__phantomderp/status/1424466518797135876
         | 
         | I am not sure people will go for `..`, and I would also like to
         | find a way to enable composition and multi-dimensionality in an
         | easier fashion (nesting arrays, for example, does not require
         | that all of the memory is laid out perfectly flat. This is the
         | case in both C and C++, and is taken advantage of by e.g. ASan
         | and other memory-shadowing runtimes that add shadowing cushion
         | around each array member).
         | 
         | As a Standards Committee person, I can't really just demand
         | "and this will go into the standard"; our charter requires 2
         | existing C compilers/stdlibs to implement the thing (unless the
         | thing is so dead-simple we can just shake hands and implement
         | it because it's not contentious) and then, after that, requires
         | consensus (which is its own roadblock to getting things done
         | when someone doesn't like the syntax/way-you-are-doing-it/gets-
         | in-the-way-of-their-implementation).
         | 
         | So, for example, if the C parser in D were to support this
         | syntax, and someone else were to support some kind of syntax
         | for this, and they all got together and wrote a paper for this
         | idea, that would count as two implementations..........
         | 
         | Hint hint. Wink wink. Nudge nudge? 0:D
        
         | Gibbon1 wrote:
         | What's gross to me is compilers often know a function has been
         | passed a pointer to a fixed-size object. And have hacky ways to
         | determine that at run time. Not to mention all these compilers
         | do know what a phat pointer is.
         | 
         | Says to me the problem is the people on the standards
         | committee.
        
           | WalterBright wrote:
           | I can understand C's reluctance to add features and
           | complexity, anything that would make it not C. But the memory
           | safety issue is such an enormous problem, and the fix is so
           | simple, and the fix has been proven in 20 years of constant
           | use in D, that I just can't see not incorporating it.
        
       | camgunz wrote:
       | AIUI the reason there's no checked division is that integer
       | division always produces an equal or smaller number, and
       | therefore can't overflow.
        
         | saagarjha wrote:
         | Integer division can overflow: INT_MIN / -1.
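A small C sketch of what a checked division would have to guard against. The helper `checked_div` is hypothetical and not part of any standard header; its return convention (true means the result could not be produced) is chosen to mirror the `ckd_add`-style macros. Besides `INT_MIN / -1`, whose mathematical result does not fit in an `int` on two's complement machines, division by zero is also undefined and is treated as a failure here.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: stores a / b in *result and returns false on success.
   Returns true, leaving *result untouched, when the division is undefined. */
bool checked_div(int *result, int a, int b)
{
    assert(result != NULL);

    if (b == 0)
        return true;                 /* division by zero */
    if (a == INT_MIN && b == -1)
        return true;                 /* quotient would be INT_MAX + 1 */

    *result = a / b;
    return false;
}
```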
        
       | MontyCarloHall wrote:
       | >I should also note that C23 will also have Binary Integer
       | Literals, so the same number can be written out in a more precise
       | grouping of binary as well:
       | 
       |     const unsigned magical_number =
       |         0b0110'0001'0110'0011'0110'0001'0110'0010;
       | 
       | Finally. It's mystifying to me why this took so long to add; not
       | everyone is fluent in translating hex digits to binary nybbles on
       | the fly. This will greatly help with clearly defining and
       | manipulating bitfields. The fact that digit separators can be
       | placed arbitrarily will further help delimit fields of variable
       | width, e.g.
       | 
       |     //                   A B C  D
       |     uint8_t flags = 0b00'1'0'11'01;
       | 
       | A and B are binary fields; C and D are quaternary fields. The
       | first two bits are padding. This is a lot more readable than
       | 
       |     uint8_t flags = 0x2D;
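For reference, the layout described above (two padding bits, one-bit fields A and B, two-bit fields C and D) can be unpacked with shifts and masks. This is a sketch; the accessor names are hypothetical, and the hex constant is used so that the code also compiles on pre-C23 compilers.

```c
#include <assert.h>
#include <stdint.h>

/* Bit layout, most significant bit first: 00 A B CC DD */
unsigned field_a(uint8_t flags) { return (flags >> 5) & 0x1u; }
unsigned field_b(uint8_t flags) { return (flags >> 4) & 0x1u; }
unsigned field_c(uint8_t flags) { return (flags >> 2) & 0x3u; }
unsigned field_d(uint8_t flags) { return flags & 0x3u; }

/* Self-check against the example value: 0x2D is 0010 1101 in binary. */
void check_example_layout(void)
{
    const uint8_t flags = 0x2D;
    assert(field_a(flags) == 1);
    assert(field_b(flags) == 0);
    assert(field_c(flags) == 3);
    assert(field_d(flags) == 1);
}
```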
        
         | WalterBright wrote:
         | I added 0b binary literals to the Datalight C compiler nearly
         | 40 years ago, and put them in D, too. It sounds like a great
         | idea, but it turns out to just not be useful in practice. One
         | of the problems is the unwieldy length of them (and yes, you
         | can group them in D with _ ).
         | 
         | I predict people will use it for a while, then abandon it.
        
         | Ericson2314 wrote:
         | what does
         | 
         |     struct X {
         |       _BitInt(1) a;
         |       _BitInt(1) b;
         |     };
         | 
         | do now?
        
         | 12thwonder wrote:
         | why not
         | 
         |     struct Foo {
         |         bool some_flag : 1;
         |         bool other_flag : 1;
         |         ....
         |     }
         | 
         | ?
        
           | MontyCarloHall wrote:
           | Because a struct of bools != a bitfield. In stdbool.h, bool is
           | a macro for _Bool (one byte on most ABIs), and true and false
           | are macros for the integer constants 1 and 0. AFAIK, there
           | aren't any compilers smart enough to implicitly pack a struct
           | of bools into a bitfield, and then make member access
           | operations implicitly mask the bitfield. Here's a quick
           | example, using the latest GCC:
           | https://godbolt.org/z/c7T54jzGT
        
             | coolreader18 wrote:
             | Not sure if they edited the comment after you commented,
             | but that is a bitfield now, at least
        
               | MontyCarloHall wrote:
               | You're right, I didn't see the :1's. Maybe it was edited,
               | maybe I was just blind?
        
             | [deleted]
        
             | isidor3 wrote:
             | I wrote a little solver for some programming puzzle.
             | Thought I was being clever by using bitfields for an array
             | of booleans to reduce memory usage and bandwidth, as they
             | seemed a natural fit for the problem I was solving.
             | 
             | Turned out that it was actually significantly faster to use
             | one byte per boolean and forgo the masking operations. I
             | assume the processor was just good enough at keeping its
             | cache filled in that particular workload, so the additional
             | masking operations just slowed things down. So I understand
             | why you might not want a compiler to automatically do this.
        
       | hgs3 wrote:
       | I'm still crossing my fingers for a defer mechanism.
        
         | quelsolaar wrote:
         | I'm a member of the C standard group and I strongly opposed
         | defer. It creates an invisible jump, at the point of the
         | execution. It's essentially a "comefrom" statement, a structure
         | that was created as a joke trying to find something worse than
         | goto. I'd much rather have people use goto. At least you can
         | see where the goto statement is and find the label where it is
         | going.
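The goto-based alternative referred to here is usually written as a single cleanup label at the end of the function. A minimal sketch, assuming one heap allocation and inputs small enough that the sums fit in `long long`; the function name and its task are hypothetical and only serve to give the cleanup path something to release.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Returns 0 on success and stores the result in *out; returns -1 on failure. */
int sum_of_squares(const int *values, size_t count, long long *out)
{
    int status = -1;
    long long *squares = NULL;
    long long total = 0;

    assert(out != NULL);

    if (values == NULL || count == 0)
        goto cleanup;

    squares = malloc(count * sizeof *squares);
    if (squares == NULL)
        goto cleanup;

    for (size_t i = 0; i < count; i++)
        squares[i] = (long long)values[i] * values[i];
    for (size_t i = 0; i < count; i++)
        total += squares[i];

    *out = total;
    status = 0;

cleanup:
    /* Every exit path lands here; free(NULL) is a no-op. */
    free(squares);
    return status;
}
```

Each jump and its target are visible in the source, which is the property the comment above is arguing for.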
        
       | Decabytes wrote:
       | I am learning C now for the first time and trying to use as many
       | of the new quality of life features as possible, at least for my
       | personal projects.
       | 
       | It's good to see that the standard committee is adding these new
       | things that make C easier to use.
       | 
       | It's been a bit of a struggle to stick just with C, because a lot
       | of people I see teaching/writing modern C just write C in a cpp
       | file and cherry-pick the C++ features they want.
       | 
       | I wonder how many standards we would have to go through before
       | the people that are writing C+ (C with some C++ but no classes,
       | RAII, etc.) are converted back to plain old C.
        
       ___________________________________________________________________
       (page generated 2021-09-05 23:01 UTC)