[HN Gopher] How to read C type declarations (2003)
       ___________________________________________________________________
        
       How to read C type declarations (2003)
        
       Author : aragonite
       Score  : 119 points
       Date   : 2024-05-17 08:57 UTC (14 hours ago)
        
 (HTM) web link (www.unixwiz.net)
 (TXT) w3m dump (www.unixwiz.net)
        
       | rep_lodsb wrote:
       | C syntax is just horrible. It would be better to have something
       | like "ARRAY [100] OF POINTER TO FUNCTION RETURNING POINTER TO
       | INT", if you need it more than once or twice you can declare a
       | type for it.
        
         | bregma wrote:
         | You missed the opening "PROCEDURE DIVISION.".
        
           | ape4 wrote:
           | ADD A TO B GIVING C would be so much clearer
        
         | pyjarrett wrote:
         | Ada syntax is close to what you described:
         | Foo : array (Positive range 1 .. 100) of access function return
         | access Integer;
        
         | PhilipRoman wrote:
         | The root problem is having both postfix and prefix operators
         | (and whatever their combination is called) for types.
         | 
         | Plus the fact that arrays in type names sometimes act as
         | pointers and sometimes not.
        
         | wruza wrote:
         | Still unclear at things like "returning pointer to array of
         | pointer to function ...".
         | 
         | The best shot is to name intermediate types, since they aren't
         | really intermediate most of the time. Properly structured
         | programs tend to disassemble complex structures to work on
         | them, so you need these names anyway.
         | 
         | That said, C compilers definitely lack a -Wunreadable-types
         | flag, which ought to be on by default.
        
           | ape4 wrote:
           | I was about to say the same thing. If you have an array of
           | widgets used in several places in your program it would be
           | odd not to give it a name. Then you can have a pointer to it,
           | etc.
        
           | bluetomcat wrote:
           | Pointers to arrays are an incredibly rare thing in real-world
           | C programming, let alone functions returning them. They
           | contain the same address as a pointer to the beginning of
           | that array, with the only difference that pointer arithmetic
           | will be done in sizeof(arr) steps, not in element-size steps.
           | 
           | It is a widely accepted practice to typedef pointers to
           | functions.
           | 
           | With pointers, more than 2 levels of indirection is an
           | indication of a badly written program.
           | 
           | In practice, the most complex declarations you should write
           | are "pointer to pointer to x", "array of pointers to x",
           | "array of function pointers returning x". Stuff like "pointer
           | to function returning pointer to pointer to array" is
           | nonsensical most of the time.
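           | 
           | A minimal sketch of the pointer-to-array point above, assuming
           | a four-element int array (the function names are made up for
           | illustration): both pointers hold the same address, but adding
           | 1 advances them by different byte counts.

```c
#include <assert.h>
#include <stddef.h>

static int arr[4] = {10, 20, 30, 40};

/* Bytes advanced by +1 on a pointer to the first element. */
size_t element_step(void) {
    int *first = arr;                 /* pointer to int */
    return (size_t)((char *)(first + 1) - (char *)first);
}

/* Bytes advanced by +1 on a pointer to the whole array. */
size_t array_step(void) {
    int (*whole)[4] = &arr;           /* pointer to array of 4 int */
    return (size_t)((char *)(whole + 1) - (char *)whole);
}

/* Both pointers compare equal once converted to void *. */
int same_address(void) {
    int *first = arr;
    int (*whole)[4] = &arr;
    assert(*whole == first);          /* *whole decays back to int * */
    return (void *)first == (void *)whole;
}
```

           | Here element_step() gives sizeof(int) and array_step() gives
           | sizeof(arr), i.e. four times as much.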
        
             | DrNosferatu wrote:
             | Rare? Unless it's people obfuscating the codebase so they
             | are able to keep their job forever without delivering any
             | new/real value.
        
         | ajross wrote:
         | > C syntax is just horrible.
         | 
         | All syntax is horrible. Rust is the en vogue tool of the moment
         | and it's objectively harder to read than C. Python used to be
         | clean and now it's sort of a mess. C++, well...
         | 
         | Some languages try to get away with having simple syntax
         | though. Javascript is in this camp, as of course are all the
         | Lisps. And the result is that you end up drowning in a sea of
         | _complicated semantics_ instead, cf. the famous "wat" video
         | or the fact that every sexpr macro-ized DSL works like its own
         | little butterfly.
         | 
         | You can't win. C is fine. Could you do better if you started
         | from scratch? Sure. Would it last? No.
        
           | nicoburns wrote:
           | > Rust is the en vogue tool of the moment and it's
           | objectively harder to read than C
           | 
           | I think you're just used to C syntax. I find Rust syntax
           | _much_ easier to read than C syntax (and that was already
           | true when I didn't know either language well).
        
             | ajross wrote:
             | Meh. Grandparent who thinks "C is horrible" just isn't used
             | to C. Familiarity breeds facility, that's no surprise. But
             | no, Rust is just bad at this point given all the stuff
             | that's been added and all the historical idioms that were
             | pushed and then abandoned. It's not C++ bad, but it's
             | objectively "worse" than C whose quirks all fit in a tiny
             | K&R paperback.
        
               | leetcrew wrote:
               | it's subjective to an extent, but requiring a "reverse
               | spiral rule" (which itself has exceptions) to understand
               | inline types is about as close to objectively bad as it
               | gets.
               | 
               | rust surely has its issues, but reading types is not one
               | of them. whatever you put inside the <> of an outermost
               | symbol is a straightforward tree parse.
        
               | jstimpfle wrote:
               | The "reverse spiral rule" is a myth and that website
               | should be shut down. It only adds confusion for the
               | already confused.
               | 
               | C type syntax has a simple guiding principle, there is no
               | type syntax and "declaration follows usage" i.e. normal
               | expression syntax. A few ugly additions that don't
               | cleanly fit in this model were made later for practical
               | reasons, but they don't change the basic idea, which is
               | that there is so little to C type syntax that you can
               | barely see it (which may also be why it's so easy to read
               | once you've grokked it).
        
               | kibwen wrote:
               | If anything is objectively wrong, it's C's grammar, and
               | it's not even close.
               | 
               | When you have an entire Wikipedia article dedicated to
               | how infamously hard your language is to parse, you have
               | no choice but to admit that you've messed up:
               | https://en.wikipedia.org/wiki/Lexer_hack
               | 
               | And let's not forget:
               | https://en.wikipedia.org/wiki/Dangling_else
        
               | uecker wrote:
               | Having written a C parser, C is not hard to parse. It
               | does not fit some nice model, and if you assume this and
               | don't know about the issues, you might waste your time.
               | It is still simple to write a C parser.
        
               | jstimpfle wrote:
               | C itself is not that hard to parse. The "lexer hack"
               | thing is a theoretical impurity but it just means you
               | have to add types to a type table -- and probably open
               | and close scopes -- while parsing. The preprocessor
               | though seems to be pretty annoying.
               | 
               | C++ must be comparatively a lot harder to parse -- C++
               | syntax is a huge amount of complexity added to a system
               | that wasn't designed with that in mind.
               | 
               | That you link "Dangling else" is curious. The Wikipedia
               | page talks about "ambiguity" but in my mind it's no
               | different than the ambiguity in "a + b * c", does it mean
               | "(a + b) * c" or "a + (b * c)"? Well, make a choice and
               | write it down.
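               | 
               | A small illustration of why the parser needs that type
               | table, assuming a typedef named T (all names here are
               | illustrative): the tokens "T * x" are a declaration where
               | T names a type, and a multiplication where an inner scope
               | redeclares T as a variable.

```c
#include <assert.h>

typedef int T;   /* at file scope, T names a type */

static_assert(sizeof(T) == sizeof(int), "T is an alias for int here");

/* Here "T * x" starts a declaration: x is a pointer to T (to int). */
int parsed_as_declaration(void) {
    T value = 5;
    T * x = &value;
    return *x;
}

/* Here an inner-scope variable named T hides the typedef, so the
   same tokens "T * x" form a multiplication expression. */
int parsed_as_expression(void) {
    int T = 6;
    int x = 7;
    return T * x;
}
```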
        
               | skitter wrote:
               | > Well, make a choice and write it down.
               | 
               | Or don't and don't allow unnecessary ambiguity. For * and
               | +, there's an agreed-upon order of operations far more
               | universal than a programming language, but it's sensible
               | to use a partial order so nobody is mistaken about
               | whether e.g. logical or and bitshift binds tighter.
               | Similarly, C could simply have disallowed omitting the
               | braces around single-line else clauses and would have
               | been a simpler language for it.
        
               | ajross wrote:
               | > For * and +, there's an agreed-upon order of operations
               | 
               | So... we collectively made a choice and wrote it down,
               | then? I think you whooshed on the point. The if/else
               | ambiguity is just a precedence rule, like many others in
               | the expression grammar of every (well, every non-lisp)
               | language that exists.
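               | 
               | A sketch of the rule C wrote down, with hypothetical
               | function names: without braces the else attaches to the
               | nearest if, and braces are the "parentheses" that
               | override it.

```c
#include <assert.h>

/* No braces: the grammar attaches the else to the nearest if, if (b). */
int nearest_if(int a, int b) {
    int result = -1;
    if (a)
        if (b)
            result = 0;
        else
            result = 1;
    return result;
}

/* Braces are needed to attach the else to the outer if instead. */
int outer_if(int a, int b) {
    int result = -1;
    if (a) {
        if (b)
            result = 0;
    } else {
        result = 1;
    }
    return result;
}

/* Spot checks of the two bindings. */
void check_dangling_else(void) {
    assert(nearest_if(1, 0) == 1);   /* inner if failed, its else ran   */
    assert(nearest_if(0, 0) == -1);  /* outer if failed, nothing ran    */
    assert(outer_if(1, 0) == -1);    /* inner if failed, no else for it */
    assert(outer_if(0, 0) == 1);     /* outer if failed, its else ran   */
}
```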
        
               | ajross wrote:
               | There are C compilers that run in 64k of RAM. Let's not
               | overstate the complexity here.
        
           | tialaramex wrote:
           | > Rust is the en vogue tool of the moment and it's
           | objectively harder to read than C
           | 
           | If this is "objectively" true please link the study which was
           | able to somehow measure that.
        
             | shrimp_emoji wrote:
             | It's subjectively objective. Rust syntax is hard and spiky,
             | like a crab's carapace.
        
           | mmaniac wrote:
           | C syntax is only better than Rust's because of its
           | simplicity. If C were as semantically rich as Rust it'd be
           | more of an unreadable mess.
           | 
           | That said, C's syntactic missteps appear to be accidents,
           | while Rust delights itself in being ugly.
        
           | samatman wrote:
           | There are several languages which are motivated by providing
           | a better experience than C in C's native domain. Rust is not
           | one of those languages.
           | 
           | Rust has an innovative, but highly opinionated, memory model:
           | the premise is writing high performance _memory safe_
           | programs, with 'zero cost' abstractions and so on. You might
           | write a program in Rust rather than C in the way you might
           | write a program in Go rather than C: because the application
           | is better modeled in a language which isn't C. For programs
           | which pertain to the domain where C should be considered on
           | its merits, if you try to write them in Rust, you may as well
           | start each module with an unsafe block.
           | 
           | But no, C is not fine. It has a stunning amount of own-goals
           | making it _pointlessly difficult_ to write a program with a
           | fine-grained memory policy that is correct. C also doesn't
           | have a syntax _at all_ until you've run CPP, and then you
           | have a context-sensitive parse. Neither of these are mere
           | technical formalities, they actively inhibit understanding a
           | C program.
        
             | steveklabnik wrote:
             | > For programs which pertain to the domain where C should
             | be considered on its merits, if you try to write them in
             | Rust, you may as well start each module with an unsafe
             | block.
             | 
             | This is often said, but does not reflect reality. At this
             | point, we have tons of bare-metal and OS-level Rust code
             | existing in the world, and it still has very little unsafe
             | overall.
        
               | samatman wrote:
               | Granted, that was put poorly, let me try again. What I
               | said doesn't in fact have the implication you took from
               | it, but I didn't explain myself particularly well.
               | 
               | There are many programs one can write in C, or Zig, or
               | some other manual-transmission language, which, if you
               | wrote them in Rust, would just be wrapped in a big unsafe
               | block. That isn't because those programs have memory
               | errors in them, it's because they have a memory policy
               | which safe Rust doesn't support.
               | 
               | Examples include certain garbage collection algorithms,
               | and embedded programs where all memory is pre-allocated
               | and references shared somewhat promiscuously. It's
               | possible to write analogous programs in safe Rust, but
               | not the same program. The space of programs which are
               | both correct, and not possible in safe Rust, is infinite.
               | They don't come with compiler guarantees, so other
               | support is needed. CompCert is a good example of an
               | approach to that which isn't Rust's.
               | 
               | Keeping in mind that I'm referring to _specific programs_,
               | not "a program which solves the problem domain".
               | Otherwise we could just say that since you can write a
               | program which solves the problem domain in Python,
               | there's no use for Rust. That wouldn't lead to a
               | productive conversation, would it.
               | 
               | People still choose C over Rust, even knowing what Rust
               | is. This isn't the early days of the evangelism strike
               | force, Steve. The above is why.
               | 
               | Sometimes they might be better off writing the analogous
               | program in Rust, or seeing if they can get the program
               | they're trying to write with limited use of unsafe
               | blocks. Sometimes not. I know you've put a lot of work
               | into your hammer over the years, but it's never going to
               | be the only tool in the box.
               | 
               | Languages, and I'm thinking of Zig here, which make it
               | easier to correctly implement a memory policy which
               | doesn't happen to be Rust's, should be applauded, and not
               | called "a massive step backwards for the industry <pouty
               | sad face>". Like it or not, they're working in the same
               | domain as Rust, based on different principles.
               | 
               | But yes, there is an infinitely large class of programs
               | which can be written in the embedded and OS spaces, in
               | Rust, with very little unsafe overall. I didn't intend to
               | imply otherwise.
        
         | inopinatus wrote:
         | uggclang: error in integer literal, did you mean "ONE HUNDRED"?
         | at line one
        
         | jcparkyn wrote:
         | I was with you until the second sentence - there's so many
         | better syntaxes C could've used but that's not one of them.
        
         | jstimpfle wrote:
         | I agree it's annoying (really bad for tooling), and what's more
         | annoying is that for a human who understands how it works,
         | there is no syntax as easy to read and write as C declaration
         | syntax. It's terse and intuitive, not adding new type syntax
         | but reusing what already exists as expression syntax.
        
       | bluetomcat wrote:
       | Treat them like ordinary expressions around an identifier. At the
       | deepest level is the identifier, then some operators around it
       | with a certain precedence, and finally, on the left, a list of
       | type specifiers indicating the final type of the expression when
       | all the operators are applied.
       | 
       | Those operators are the same ones you would use in an expression
       | to use your declared object. They are the unary dereferencing
       | operator, the array subscript operator, the function call
       | operator and the parentheses for overcoming default precedence.
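       | 
       | A minimal sketch of that reading, using a made-up declaration
       | (table and get_seven are hypothetical names): the expression
       | that uses the object repeats the operators of its declarator.

```c
#include <assert.h>

static int seven = 7;
static int *get_seven(void) { return &seven; }

/* Read outward from the identifier, applying operators by precedence:
     table[i]         subscript first:  table is an array of 2 ...
     *table[i]        then dereference: ... pointers to ...
     (*table[i])()    then call:        ... functions returning ...
     *(*table[i])()   then dereference: ... pointer to ...
     int              ... and the type specifier on the left remains. */
static int *(*table[2])(void) = { get_seven, get_seven };

/* Using the object repeats the declarator with the same operators. */
int use_like_declared(int i) {
    assert(i >= 0 && i < 2);
    return *(*table[i])();
}
```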
        
       | delta_p_delta_x wrote:
       | It is a bit sad that an entire blog post is required to explain
       | type declarations in a language as common as C. It is even worse
       | that C++ stuck with it. It is more atrocious still that C decided it
       | was good syntax to attach operator* to the name instead of the
       | type to justify this crazy business, which thankfully C++
       | reversed a little with `auto` and West const.
       | 
       | Imagine how much easier that last example would be with a
       | trailing return type, Ptr<> and Array<> parametrised types, and
       | lambda-esque function objects instead of function pointers:
       | 
       |     char * (*(**foo[][8])())[]
       | 
       | versus
       | 
       |     let foo: Array<Array<Ptr<() -> Ptr<Arr<Ptr<Char>>>>, 8>> = ...
       | 
       | It's immediately clear what the top-level type of `foo` is: some
       | parametrised `Array`. We can do better, though.
       | 
       | In this hypothetical language, one would probably dispense with
       | 'pointer to char' and 'pointer to array' and replace them with
       | view/range/span types, eliminating a layer of type
       | parametrisation and many classes of bugs:
       | 
       |     let foo: Span<Array<Ptr<() -> Ptr<Arr<StringView>>>, 8>> = ...
       | 
       | Though I am dumbfounded by why there needs to be a 'pointer to
       | array' in the function return type, and a 'pointer to function
       | pointer'. I assume these are unknown arrays themselves, which
       | means...
       | 
       |     let foo: Span<Array<Span<() -> Span<Arr<StringView>>>, 8>> = ...
       | 
       | A perfect language would probably have multi-dimensional spans,
       | which would eliminate the two `Span<Array<>>` and replace with
       | some `MultiSpan<>`.
       | 
       | Addendum: https://cdecl.org/ is a very useful and very self-aware
       | resource: 'C gibberish - English'.
        
         | Quekid5 wrote:
         | In C++, you _can_ do much of what you want with std::array,
         | std::mdspan... and you could also create a Ptr<T> with "using"
         | to define a type alias for T* -- see [0]. Of course the default
         | syntax for pointers is T* and that's not going to change, so
         | it'll be an uphill battle to keep consistent in any codebase.
         | (I guess systematic use of a rule in clang-tidy or such could
         | help with that.)
         | 
         | [0] https://en.cppreference.com/w/cpp/language/type_alias
        
           | delta_p_delta_x wrote:
           | Indeed, and there's `std::function` (albeit not being a zero-
           | cost abstraction; I believe `std::move_only_function` and
           | `std::copyable_function` in C++23 and C++26 respectively are
           | replacements) and lambda expressions. But C++ decided the
           | bracket salad `[](){}` was the best way to write lambdas, and
           | I'll never forgive the standard for it.
        
             | kccqzy wrote:
             | Any sort of type erasure on functions will have a non-zero
             | cost. That's why the only zero-cost abstraction is having a
             | template parameter (such as remove_if), but these come with
             | a code size cost instead.
        
         | neonsunset wrote:
         | Span<T> is pretty much C# syntax :)
         | 
         | (Also funny, multi-dimensional spans are coming in the form of
         | TensorSpan<T>)
        
           | delta_p_delta_x wrote:
           | Indeed. Very inspired by .NET, C++, and Rust.
        
         | optymizer wrote:
         | imo anything beyond Array<Something> should usually be aliased
         | to help with readability.
         | 
         | Array<Ptr<(Int, String) -> Bool>>
         | 
         | Should be:
         | 
         | Array<OnClickCallback>
         | 
         | Applying that to C code keeps the insane decls in check,
         | especially when using function pointer types
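         | 
         | A sketch of that aliasing in C, assuming (int, const char *)
         | as the C stand-in for (Int, String); the handler functions
         | are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One name for "pointer to function taking (int, const char *) and
   returning bool". Without it, the table below would be declared as
   bool (*handlers[2])(int, const char *). */
typedef bool (*OnClickCallback)(int, const char *);

static bool accept_positive_id(int id, const char *label) {
    return id > 0 && label != NULL;
}

static bool reject_all(int id, const char *label) {
    (void)id;
    (void)label;
    return false;
}

/* Reads left to right: an array of two OnClickCallback values. */
static const OnClickCallback handlers[2] = { accept_positive_id, reject_all };

/* Invoke the handler at index 'which' (must be 0 or 1). */
bool dispatch(size_t which, int id, const char *label) {
    assert(which < 2);
    return handlers[which](id, label);
}
```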
        
         | tempodox wrote:
         | West const is a travesty given that either side, or both, of a
         | pointer can be `const`. And with refs I prefer to see that
         | something is a `const &` in one place instead of scanning 3 km
         | of template argument list to find that out.
        
           | delta_p_delta_x wrote:
           | Ahh, what a typo. I meant East const:
           | https://hackingcpp.com/cpp/design/east_vs_west_const.html
        
         | BearOso wrote:
         | I've never seen anything that complex in my experience, and it
         | would be considered bad code pretty much everywhere.
         | 
         | Rule #1 of programming is to make code easy to understand and
         | self-documenting. This should be a sequence of typedefs with
         | self-explanatory names followed by a function call taking one
         | pointer or type and returning another.
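         | 
         | A rough sketch of such a typedef sequence for the declaration
         | discussed above. The (void) parameter lists and all names are
         | my additions, not from the article. Declaring the same object
         | both ways only compiles if the two types are compatible, so
         | the compiler checks the translation.

```c
#include <assert.h>

/* The raw declaration, with (void) parameter lists added. */
extern char *(*(**table[][8])(void))[];

/* The same type built from named steps. */
typedef char *StringArray[];             /* array of pointer to char      */
typedef StringArray *GetStrings(void);   /* function returning pointer to */
typedef GetStrings **GetStringsHandle;   /* pointer to pointer to         */

/* Definition through the typedefs; it also fixes the outer size at 1. */
GetStringsHandle table[1][8];

static_assert(sizeof table == 8 * sizeof(GetStringsHandle),
              "one row of eight handles");

static char *greetings[] = { "hello", "world" };
static StringArray *get_greetings(void) { return &greetings; }
static GetStrings *getter = get_greetings;

const char *second_greeting(void) {
    table[0][0] = &getter;
    /* Usage mirrors the raw declarator: (*(**table[i][j])())[k] */
    return (*(**table[0][0])())[1];
}
```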
        
         | uecker wrote:
         | Just create typedefs or use typeof. One could wrap this in
         | 
         |     #define _Ptr(T) typeof(T*)
         |     #define _Array(T,N) typeof(T[N])
         |     _Ptr(_Array(int, 8)) x = ...
         | 
         | and one can also define view and span types in C. No
         | hypothetical language needed. But I do not think this is an
         | improvement.
        
         | nuancebydefault wrote:
         | Still, C syntax is 100 times easier to remember than things
         | like templating, smart pointers, and the dozen flavours of
         | initializers in C++, which are used continuously as well.
        
       | cjs_ac wrote:
       | Rob Pike has some commentary on this in this blog post on Go's
       | declaration syntax: https://go.dev/blog/declaration-syntax
        
         | ninkendo wrote:
         | > Go takes its cue from here, but in the interests of brevity
         | it drops the colon
         | 
         | This was the biggest mistake IMO. Go's declaration syntax needs
         | colons, it would have been 100x more readable.
         | 
         | Compare:
         | 
         |     func foo(x int, y int) int {
         |         var sum int = x + y
         |         return sum
         |     }
         | 
         | To:
         | 
         |     func foo(x: int, y: int): int {
         |         var sum: int = x + y
         |         return sum
         |     }
         | 
         | The colons would have made it so much more clear to read for
         | me.
        
           | pjmlp wrote:
           | So many things that it misses out, not only colons.
        
           | ufo wrote:
           | The lack of colons is also hard for the parser. The
           | recursive-descent code for parsing a Go parameter declaration
           | is quite tricky and got even trickier after they introduced
           | generic types.
        
           | kbolino wrote:
           | The types stand out more with colons, certainly, but I don't
           | find the former example any harder to read. It helps that
           | types can only show up in certain places. Readability does
           | suffer when multiple consecutive parameters use the same type
           | and it's elided on all but the last, though. That's a quirk
           | unique to Go that I haven't seen in any language using colons
           | to specify types.
        
             | lpribis wrote:
             | Ada and Pascal are both languages which allow multiple
             | consecutive parameters with the same type and also use
             | `name: type` syntax. I agree, even when using Ada/Pascal
             | with colons it's difficult to mentally parse.
        
           | alberth wrote:
           | Having the colon touching the type definition always made
           | more sense to me
           | 
           | So instead of this:
           | 
           |     func foo(x: int, y: int): int {
           |         var sum: int = x + y
           |         return sum
           |     }
           | 
           | You'd have this instead:
           | 
           |     func foo(x :int, y :int) :int {
           |         var sum :int = x + y
           |         return sum
           |     }
        
             | shrimp_emoji wrote:
             | What the fuck? No.
             | 
             | That makes as much sense as `int *i`, when `int* i` is
             | obviously the correct (though heterodox) way.
        
               | teo_zero wrote:
               | And how would you read the following?
               | int* i, j;
               | 
               | C declarations simply are not "first the type, then the
               | variable"...
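               | 
               | For anyone who wants the compiler's answer, a small
               | sketch using C11 _Generic (the macro and function names
               | are made up): only i becomes a pointer, j is a plain int.

```c
#include <assert.h>

/* C11 _Generic selects on the static type of its operand. */
#define IS_INT(x)     _Generic((x), int: 1, default: 0)
#define IS_INT_PTR(x) _Generic((x), int *: 1, default: 0)

int star_belongs_to_declarator(void) {
    int* i = 0, j = 0;   /* looks like "two int pointers", but is not */

    static_assert(IS_INT_PTR(i), "i is a pointer to int");
    static_assert(IS_INT(j), "j is a plain int");

    i = &j;              /* so this assignment type-checks */
    return IS_INT_PTR(i) && IS_INT(j) && *i == 0;
}
```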
        
               | lionkor wrote:
               | you would read it, spot the problem, and fix it to make
               | it two lines.
               | 
               | if you don't have space for that one line, switch back to
               | punchcards
        
               | jandrese wrote:
               | IMHO the way comma behaves when declaring multiple
               | variables of the same type was a mistake from day one.
               | The sort of thing that only sticks around because it
               | would break too much existing code to fix it. If I were
               | in charge I'd declare the syntax deprecated in the next
               | version of C and introduce a new character, maybe | that
               | would do the same thing except it would propagate the
               | pointer as you would expect.
               | 
               | In my code I never use it. Every variable gets its own
               | line. You might think this would blow up the
               | declarations, but in practice it's not a problem,
               | especially in modern C where you can move declarations
               | down to where they're used instead of stuffing them all
               | up at the top of the function.
        
               | jstimpfle wrote:
               | The only "type" thing about the declaration above is
               | "int", and that does get applied to all the comma-
               | separated expressions -- so in that way the comma acts
               | totally reasonably.
               | 
               | People think that "int *x;" means "declare x as a
               | pointer-to-int" but really it means "declare x such that
               | *x (it's a code expression) is of type int." So it's a
               | really roundabout way to say that x is a "pointer to int"
               | while that concept of "pointer to" isn't even in the
               | language syntax.
        
               | jandrese wrote:
               | Which makes it even worse because it is inconsistent in
               | real world use. "long" is propagated through the comma,
               | but * is not.
        
               | jstimpfle wrote:
               | It's not inconsistent. There are two things in a
               | declaration: a type (here int) and an expression (here
               | *x). If you add commas you can have multiple of the
               | latter. That's it. Stop thinking of the star as part of
               | the type, it isn't. Leave it with the expression to the
               | right, and thinks make sense .
               | 
               | It's why I always cringe when I see
               | 
               |     int* x;
               | 
               | as opposed to
               | 
               |     int *x;
               | 
               | There is a reason why K&R do it that way.
        
               | jandrese wrote:
               | The idea that the * isn't part of the type has always
               | felt completely wrong to me. The entire concept of an
               | expression modifying the type doesn't provide any benefit
               | that I can see. The only thing the expression should be
               | doing is giving you the chance to assign the name to the
               | new variable you're creating. Once you define the type,
               | which might be a pointer or not, it should be fixed.
               | 
               | I get that this is a pretty fundamental disagreement
               | between me and Kernighan & Ritchie, but I think history has
               | shown that this idea of splitting exactly one aspect of
               | the type (if the variable is direct or a pointer) was a
               | mistake. It's been the cause of a lot of bugs and the
               | cases where you want to define both direct and indirect
               | types on the same line basically don't exist. For clarity
               | you should split those definitions up regardless of what
               | the syntax allows.
               | 
               | There are basically no cases where you should be writing
               | the following code even if it is technically correct:
               | int *pointer, direct;
        
               | jstimpfle wrote:
               | I never use comma, I always define one variable per line.
               | Much easier to edit and read. However, the way it works
               | is certainly not inconsistent.
               | 
               | As to your other concerns, it makes perfect sense.
               | int WHATEVER;
               | 
               | read as "WHATEVER is an int" and you can work backwards
               | from there. There is no more type syntax than that. The
               | thing is, it's much easier to look at stuff this way
               | since otherwise you'll have to shift back and forth
               | between 2 different syntaxes (types and expressions) all
               | the time.
               | 
               | I've been looking a lot for a better more algebraic
               | syntax too. All syntaxes I've found have lost on two axes
               | _in the common and simple cases_
               | 
               | - low cognitive overhead
               | 
               | - terseness
               | 
               | Yes, the principle of declaration follows usage is
               | already stretched to its limit, and C++ has certainly
               | overstretched it. But still for the bread-and-butter use
               | cases, the "better" syntaxes seem to always lose on those
               | 2 axes.
        
               | zzo38computer wrote:
               | I only declare multiple variables in one line if it is a
               | "simple" type, without pointers, arrays, functions, etc.
               | 
               | Also, code such as:
               | 
               |     int*x=y;
               | 
               | will define the initial value of x and not of *x. For
               | this reason, a space before the asterisk can be confusing
               | in this way. And, for the reasons described in another
               | comment, putting the space after the asterisk is also
               | confusing. So, I find it clearer to omit both spaces.
               | 
               | When I require more complicated types, I will generally
               | use typedef (or typeof) instead of having to deal with
               | the confusing syntax of types in C.
        
               | delta_p_delta_x wrote:
               | > There is a reason why K&R do it that way.
               | 
               | This was and remains a bad decision, full stop.
               | 
               | 'Pointer to something' _should_ have been a parametrised
               | type rather than C's weird syntax of overloading
               | operator*. And I won't accept the argument that C was too
               | old to have parametrised types: ML, Lisp, and many other
               | functional-first languages developed alongside C and had
               | them. We've just stuck with C because it got popular
               | because of UNIX and then Linux.
               | 
               | I avoid raw pointers whenever I can when I write C++, and
               | I yearn for a language that exposes pointers with a rich
               | and expressive type system instead of making them
               | glorified 64-bit unsigned ints. Pointers ought to be one-
               | dimensional unsigned parametrised affine space types[1]
               | (note: _not_ affine types, which are completely
               | different[2]), with several implications:
               | 
               | - comparing pointers to different types should be invalid
               | 
               | - subtracting pointers to the same type should return a
               | signed 'pointer difference' type
               | 
               | - adding pointers to the same type should be invalid
               | 
               | - Get the address contained by a pointer with
               | Ptr<T>::address()
               | 
               | [1]: http://videocortex.io/2018/Affine-Space-Types/
               | 
               | [2]: https://en.wikipedia.org/wiki/Substructural_type_sys
               | tem#Affi...
        
               | jstimpfle wrote:
               | Would've, should've, could've.
               | 
               | Programming languages are a huge waste of time. It's a
               | domain with _lots_ of opportunities for bike-shedding,
               | and then for hard-core R&D too. But programming is still
               | engineering, simple & easy that gets the job done wins
               | over needlessly complex. And yes, inertia is real too,
               | but if a programming language was dramatically better at
               | what C does best, I wouldn't still start new code in it.
               | 
               | When you're doing systems programming, pointers being
               | "glorified" 64-bit ints is pretty much what you need.
               | When your requirements hit reality, all the nice
               | abstractions break down quite easily.
               | 
               | > I avoid raw pointers whenever I can when I write C++
               | 
               | Empirically, this is the path to madness.
               | 
               | > - comparing pointers to different types should be
               | invalid
               | 
               | It produces a warning on my C compiler and an error on
               | C++ compiler. Note that there _can_ be pointers
               | containing equal addresses with different types -- a
               | struct and its first member have the same address.
               | 
               | > - adding pointers to the same type should be invalid
               | 
               | I think it is, or what exactly do you mean?
               | 
               | > - subtracting pointers to the same type should return a
               | signed 'pointer difference' type
               | 
               | it does, that type is called ptrdiff_t, which is a signed
               | integer type.
               | 
               | > - Get the address contained by a pointer with
               | Ptr<T>::address()
               | 
               | But you didn't want them to be glorified 64-bit ints?
               | Well, because of pointer arithmetic, which is a practical
               | necessity, they kind of are, but then again they're not
               | because C tries to abstract from that. You can kinda
               | unwrap the pointer abstraction by casting to the
               | (optional) uintptr_t type.
               | 
               | The thing is, not all machines necessarily have one flat
               | address space that contains everything from functions to
               | all data. Pointer representation isn't necessarily
               | uniform -- e.g., you can have a bigger type for void-
               | pointers that contain type information too, this way you
               | could do run-time type checking.
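               | 
               | A rough sketch of three of the points above. struct packet
               | and the function names are invented for the example, and
               | the last function assumes a platform that provides the
               | optional uintptr_t:

```c
#include <stddef.h>
#include <stdint.h>

struct packet {
    int  header;    /* first member: shares the struct's address */
    char body[16];
};

/* Subtracting two pointers into the same array yields a ptrdiff_t. */
ptrdiff_t element_distance(void)
{
    int values[8] = {0};
    int *first = &values[1];
    int *last  = &values[6];

    /* `first + last` would not compile: adding two pointers is invalid. */
    return last - first;    /* counted in elements, not bytes */
}

/* A struct and its first member compare equal once cast to a common type. */
int struct_and_first_member_share_address(void)
{
    struct packet p = {0};

    return (void *)&p == (void *)&p.header;
}

/* Unwrap the pointer abstraction through uintptr_t and wrap it again. */
int round_trips_through_uintptr(void)
{
    int value = 0;
    uintptr_t raw = (uintptr_t)(void *)&value;

    return (int *)(void *)raw == &value;
}
```

               | element_distance() returns 5 here; the other two return 1.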
        
               | nuancebydefault wrote:
               | To me int* x is preferred, here's why...
               | 
               | I read it like 'IntegerPointer x' or 'x is an
               | integerPointer' or 'x is a pointer to an integer'
               | 
               | It rhymes with the following : 'MilkGlass m' reads as:
               | 
               | 'm is a MilkGlass' or 'm is a Glass of Milk'
        
               | jstimpfle wrote:
               | Yeah but this quickly breaks down -- it's not how it
               | "really works". What about arrays for example?
               | int* x[200];
               | 
               | You can fight all you want, but C doesn't have type
               | syntax. What you're looking at is expression syntax.
        
               | nuancebydefault wrote:
               | I would read that x is a collection of 200 intPointers.
               | Is that not correct?
        
               | jstimpfle wrote:
               | Well yes but there is something off because 200 is not
               | with the "type" anymore.
               | 
               | Now do a pointer to an array of 200 ints.
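               | 
               | For comparison, a small sketch of both declarations side by
               | side (the helper function names are made up). Each
               | declarator mirrors the expression that yields an int:

```c
#include <stddef.h>

int *x[200];        /* array of 200 pointers to int:   *x[i]   is an int */
int (*p)[200];      /* pointer to array of 200 ints:   (*p)[i] is an int */

/* Number of elements in x. */
size_t count_of_x(void)
{
    return sizeof x / sizeof x[0];
}

/* Number of ints in the array that p points to (sizeof does not evaluate *p). */
size_t pointee_count_of_p(void)
{
    return sizeof *p / sizeof(int);
}

/* Use both objects the way their declarations are spelled. */
int read_through_both(void)
{
    int n = 7;
    int row[200] = {0};

    x[3] = &n;
    row[3] = 9;
    p = &row;

    return *x[3] + (*p)[3];
}
```

               | Both counts are 200, and read_through_both() returns 16.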
        
               | jayd16 wrote:
               | So what about smart pointers? Is it not inconsistent that
               | smart pointers are a type but raw pointers are not?
        
               | jstimpfle wrote:
               | Raw pointers "are" a type too. It's just that there isn't
               | really syntax to describe the type, at least in the
               | original declaration syntax.
               | 
               | Note that structs and typedefs that you define are proper
               | type names, you can switch out "int" for them. You can
               | easily "typedef int *intPtr;" and then merrily go "intPtr
               | myptr".
               | 
               | (I think cast type-specs might be a bit of an exception
               | here where the rule breaks a bit -- because they don't
               | "use" a name in the cast "expression". I don't intimately
               | understand how those work. The whole thing is an idea
               | that was stretched to the limits, especially with C++
               | where consistency has pretty much been given up).
        
               | alberth wrote:
               | What's wrong with 'x :int*' ?
        
               | skupig wrote:
               | Let me illustrate it like this
               | 
               | :does this look correct to you?
        
           | knome wrote:
           | It just looks busier without any benefit to me.
        
           | kibwen wrote:
           | The irony is that Pike's previous language, Limbo, did use
           | colons here: https://en.wikipedia.org/wiki/Limbo_(programming
           | _language)#E...
           | 
           | I can only assume he removed them, not for brevity, but as a
           | compromise for the sake of C programmers to only ask them to
           | learn one new thing rather than two.
        
           | king_geedorah wrote:
           | Might as well just drop the var keyword altogether at that
           | point.
           | 
           |     func foo(x: int, y: int): int {
           |         sum : int = x + y
           |         return x + y
           |     }
           | 
           | With type inference the type declaration just collapses out
           | to what you have now.
           | 
           |     func foo(x: int, y: int): int {
           |         sum := x + y
           |         return x + y
           |     }
           | 
           | I've seen it in a few languages and it has seemed quite nice
           | in use and appearance to me.
        
           | Zardoz84 wrote:
           | I personally find it more readable to have the type first and
           | no colon, and type inference:
           | 
           |     int func foo(int x, int y) {
           |         var sum = x + y
           |         return sum
           |     }
        
       | jsheard wrote:
       | Obligatory cdecl plug:
       | https://cdecl.org/?q=int+%28*%28*foo%29%28void+%29%29%5B3%5D
        
         | TomMasz wrote:
         | My (pardon the pun) goto utility back in my C programming days.
        
       | ckocagil wrote:
       | Using the clockwise spiral rule.
       | 
       | https://c-faq.com/decl/spiral.anderson.html
        
         | tumult wrote:
         | The spiral rule is not real and fails on even basic examples
         | like
         | 
         |     char * foo[10][5];
         | 
         | The OP article shows you how to actually read the declarations,
         | not a fake trick that will lead you astray.
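         | 
         | A quick way to check the real reading (postfix [] binds tighter
         | than prefix *, so foo is an array of 10 arrays of 5 pointers to
         | char). The helper names are invented, and _Generic assumes C11:

```c
#include <stddef.h>

char *foo[10][5];   /* array of 10 arrays of 5 pointers to char */

/* Length of the outer array. */
size_t outer_count(void)
{
    return sizeof foo / sizeof foo[0];
}

/* Length of each inner array. */
size_t inner_count(void)
{
    return sizeof foo[0] / sizeof foo[0][0];
}

/* 1 if a fully indexed element has type char *. */
int element_is_char_pointer(void)
{
    return _Generic(foo[0][0], char *: 1, default: 0);
}
```

         | The counts come out as 10 and 5, and each element is a char *.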
        
       | seodisparate wrote:
       | Reminds me of https://goshdarnfunctionpointers.com/ which has
       | helped in some cases.
        
         | dailykoder wrote:
         | I think I've been writing C for well over 10 years now and this
         | year was finally the one where I forced myself to remember
         | function pointer declarations. I don't even know why it took me
         | so long to remember such a simple thing, but possibly writing
         | "c function pointer" into my favorite search engine and hitting
         | enter was faster than thinking about it for a moment.
         | 
         | But maybe also because I changed up my general programming
         | approach a bit and try to think a bit longer before hitting up
         | a search engine. Also forced myself to read more reference
         | sheets than stackoverflow or chatgpt. And I must say, this made
         | me a lot more comfortable, because I understand more. Can
         | recommend to go through the struggle/10
        
           | shrimp_emoji wrote:
           | When they're function parameters, you can write them as
           | regular function declarations, btw. ;) Might hurt your
           | memorization efforts though.
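           | 
           | A small sketch of that, with made-up names. A parameter
           | declared with function type is adjusted to a pointer to that
           | function type, so the two prototypes below are the same:

```c
/* `f` is written like a function declaration; as a parameter it is
   adjusted to "pointer to function", as if it were int (*f)(int). */
int apply_twice(int f(int), int x)
{
    return f(f(x));
}

/* The same prototype spelled with explicit pointer syntax. A conflicting
   type here would be a compile error, so this doubles as a check. */
int apply_twice(int (*f)(int), int x);

/* Example callback. */
int add_three(int n)
{
    return n + 3;
}
```

           | Calling apply_twice(add_three, 1) gives 7.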
        
           | scoutt wrote:
           | I thought it was just me, that after 20 years of writing C I
           | still have to look how to declare function pointers.
        
             | jandrese wrote:
             | I'm weird in that I have no problem getting function pointers
             | but have to look up the syntax to declare an enum every
             | damn time.
        
           | nuancebydefault wrote:
           | What I usually do is: Write a normal function declaration,
           | prepend it with typedef and append _t to the function name.
           | DONE
           | 
           | The same trick works for anything you want to typedef!
        
       | daghamm wrote:
       | C type declarations are quite sane, until you need function
       | pointers. Then things get really crazy.
       | 
       | Consider this type declaration in Go:
       | func(string) []func(int)[]*float64
       | 
       | This took me 5 seconds to write. The C/C++ counterpart would
       | probably take an hour to get right :(
        
         | lieks wrote:
         | Hmm... Let's see...
         | 
         | Oh, functions can't return arrays in C. That would be a
         | pointer. Well, that makes it easier.
         | 
         |     double **(**(*)(char *))(int)
         | 
         | It did take a couple minutes, but I just woke up. Also,
         | idiomatically there would be a typedef:
         | 
         |     typedef double **A(int);
         |     typedef A **B(char *);
         |     B *
         | 
         | That took a few seconds to write.
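         | 
         | One way to check that the two spellings name the same type is
         | C11's _Generic. This is only a sketch: lookup, find and the
         | other helper names are invented for the example.

```c
typedef double **A(int);        /* function (int) returning double **  */
typedef A **B(char *);          /* function (char *) returning A **    */

/* 1 if `B *` and the one-line spelling are the same type. */
int spellings_agree(void)
{
    return _Generic((B *)0,
                    double **(**(*)(char *))(int): 1,
                    default: 0);
}

static double cell = 1.5;
static double *cell_ptr = &cell;

/* A function of type A. */
double **lookup(int index)
{
    (void)index;
    return &cell_ptr;
}

static A *lookup_ptr = lookup;

/* A function of type B. */
A **find(char *name)
{
    (void)name;
    return &lookup_ptr;
}

/* Walk the whole chain: B * -> A ** -> A -> double ** -> double. */
double call_through(void)
{
    char name[] = "x";
    B *entry = find;

    return **(**entry(name))(0);
}
```

         | spellings_agree() returns 1 and call_through() returns 1.5.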
        
         | Arch-TK wrote:
         | Function pointers are easy if you typedef the function type.
         | 
         | Want a pointer to function (int) returning int?
         | 
         | The difficult way:
         | 
         |     int (*p)(int);
         | 
         | The easy way:
         | 
         |     typedef int Func(int);
         |     Func *p;
         | 
         | The typedef reads just like a normal function declaration
         | except that the name is the type name rather than the function
         | symbol name. Just like a function declaration, you can add more
         | documentation by giving the parameters names, e.g.:
         | 
         |     // used for some hypothetical allocator API
         |     typedef void *Alloc(size_t size);
         | 
         | Similar trick can be used for pointers to arrays (just typedef
         | the array type). Although in practice pointers to arrays just
         | make for awful code.
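         | 
         | A short sketch of both tricks in use. Row, double_it and the
         | other names are placeholders, not from any real API:

```c
#include <stddef.h>

typedef int Func(int);      /* function type: (int) returning int */
typedef int Row[4];         /* array type: 4 ints */

/* An ordinary function whose type matches Func. */
int double_it(int n)
{
    return n * 2;
}

/* Func *p has the same type as int (*p)(int). */
int call_via_typedef(int n)
{
    Func *p = double_it;

    return p(n);
}

/* Row *row has the same type as int (*row)[4], so the length survives. */
int sum_row(Row *row)
{
    int total = 0;

    for (size_t i = 0; i < sizeof *row / sizeof (*row)[0]; i++)
        total += (*row)[i];
    return total;
}

/* Pass the address of a whole array, not a decayed pointer. */
int sum_example(void)
{
    Row r = {1, 2, 3, 4};

    return sum_row(&r);
}
```

         | call_via_typedef(21) returns 42 and sum_example() returns 10.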
        
       | Arch-TK wrote:
       | Unfortunately I was astounded, horrified, and simultaneously
       | pleased to discover that I was immediately able to decipher the
       | "hard" (huh ?????) example.
       | 
       | Although if I was writing this myself, I would split it up like
       | so:
       | 
       |     typedef char *StringArray[];
       |     typedef StringArray *Func();
       |     Func **foo[][8];
       | 
       | Which is equivalent and much easier to read. (Although still not
       | trivial. I have no idea what purpose this type really serves. The
       | type names are placeholders as a result.)
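       | 
       | One way to let the compiler confirm the equivalence is to declare
       | the same object both ways: a mismatch would be a "conflicting
       | types" error. This is a sketch; element_type_matches is a made-up
       | helper and _Generic assumes C11.

```c
typedef char *StringArray[];        /* array of pointer to char          */
typedef StringArray *Func();        /* function returning StringArray *  */

/* Two declarations of one object; they compile only if the types agree. */
extern char *(*(**foo[][8])())[];
extern Func **foo[][8];

/* 1 if an element of foo is a pointer to pointer to Func.
   The operand of _Generic is not evaluated, so foo needs no definition. */
int element_type_matches(void)
{
    return _Generic(foo[0][0], Func **: 1, default: 0);
}
```

       | element_type_matches() returns 1.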
       | 
       | C type declarations become particularly tricky specifically when
       | people try to directly declare (or typedef) pointers to
       | functions/arrays without an intermediate typedef for the
       | function/array type.
       | 
       | So if you make it a rule to typedef array types before creating
       | pointers to them (although this usually makes for some rather
       | unreadable code so maybe try avoiding it in general) and to
       | typedef function types before creating pointers to them, you will
       | usually have much more readable code.
       | 
       | As a bonus, you can then use the function typedef to provide an
       | early type-check for a callback function (even if it's a callback
       | your code doesn't directly use). E.g.:
       | 
       |     typedef int Func(int a);
       |     Func foo;
       |     int foo(int a)
       |     {
       |         return a * 2;
       |     }
       | 
       | Or, you know, switch to a higher level language.
        
         | dvhh wrote:
         | Was thinking the same, but there is always this "clever" dev
         | who insists on putting everything in one line
        
       | mmaniac wrote:
       | C type declarations (and array decays, for that matter) seem to
       | be baggage from B.
       | 
       | In B the only type is the machine word, which can perform
       | arithmetic and be dereferenced. Arrays are the only special kind
       | of declaration and are declared like
       | 
       |     auto foo[10]
       | 
       | Where auto is a storage class specifier like extern or static.
       | 
       | In that world, having declaration follow usage makes perfect
       | sense. There are no "other" kinds of data to worry about; the
       | operators you can apply to an expression describe its type
       | exactly.
        
         | WalterBright wrote:
         | The array decay is probably the worst mistake C made, and was
         | unfortunately carried over to C++. Fortunately, it is fixable:
         | 
         | https://www.digitalmars.com/articles/C-biggest-mistake.html
        
           | tialaramex wrote:
           | As usual the worst mistake is Hoare's Billion Dollar Mistake
           | (null). But yes obviously arrays shouldn't decay.
           | 
           | The "fix" would be to not have arrays decay. I believe C++
           | could probably get there if it had taken Epochs, I further
           | believe taking Epochs (P1881) when it was proposed would have
           | been extremely disruptive but perhaps possible and _worth
           | attempting_. I do not believe it's still possible, the
           | moment passed and with it, in my view, any hope of salvaging
           | C++.
           | 
           | Without Epochs, any and every such change to C++ is similarly
           | disruptive and that's too expensive so more or less nothing
           | gets fixed, instead layers of kludges must be added. That's
           | how Foonathan ends up with: class foo final
           | trivially_relocatable namespace() { ... };
        
             | WalterBright wrote:
             | The change I proposed is not disruptive in any way.
             | 
             | As for the billion dollar mistake, a null pointer reference
             | results in a seg fault. But the array decay results in
             | buffer overflows, the #1 problem in shipped C code, and the
             | buffer overflows result in memory corruption. Memory
             | corruption is much, much, much worse than a seg fault. For
             | example, buffer overflows are exploitable by malware. Seg
             | faults are not.
             | 
             | Hence, the array decay is the worst mistake.
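             | 
             | A small sketch of what decay throws away. struct slice is
             | a hypothetical pointer-plus-length pair invented for this
             | example, not the syntax from the linked proposal:

```c
#include <stddef.h>

/* In the scope where the array is declared, its size is known. */
size_t size_in_caller(void)
{
    char buf[16];

    return sizeof buf;          /* 16 */
}

/* After decay only a pointer is left; the 16 is gone. */
size_t size_after_decay(void)
{
    char buf[16];
    char *p = buf;              /* array-to-pointer conversion */

    return sizeof p;            /* sizeof(char *) */
}

/* Hypothetical fat pointer: the length travels with the data. */
struct slice {
    char  *ptr;
    size_t len;
};

/* Bounds-checked store: returns 0 on success, -1 if index is out of range. */
int slice_put(struct slice s, size_t index, char c)
{
    if (index >= s.len)
        return -1;
    s.ptr[index] = c;
    return 0;
}

/* 1 if an in-range write succeeds and an out-of-range write is refused. */
int checked_write_demo(void)
{
    char buf[16] = {0};
    struct slice s = { buf, sizeof buf };

    return slice_put(s, 3, 'x') == 0
        && slice_put(s, 16, 'y') == -1
        && buf[3] == 'x';
}
```

             | With the length kept next to the pointer, the write at
             | index 16 is rejected instead of corrupting memory.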
        
               | WalterBright wrote:
               | I am baffled by the lack of interest in incorporating the
               | proposal into C. Instead, C23 gets Unicode normalized
               | identifiers, a complex and pointless feature that is
               | easily achieved by other means, if one really really
               | wants it.
        
           | juped wrote:
           | I never knew you shared my biggest gripe with C. I really
           | wish this or an equivalent had made it into some C standard.
        
       | teddyh wrote:
       | $ sudo apt install cdecl
       | [...]
       | $ cdecl
       | Type `help' or `?' for help
       | cdecl> explain char *(*(**foo[][8])())[];
       | declare foo as array of array 8 of pointer to pointer to function
       | returning pointer to array of pointer to char
       | cdecl>
        
         | pquki4 wrote:
         | Wow, nice. Now I think every IDE should come with this...
        
         | samatman wrote:
         | Thought I'd take a crack at it in Zig:
         | 
         |     const foo: [_][8]**fn () [*][*:0]const u8;
         | 
         | This is an array of result-defined size, of arrays of eight
         | pointers to fn pointers, which returns a many-item pointer to
         | null-terminated arrays of u8. Null terminated because we're
         | going to assume that the C data structure is properly null-
         | terminated...
         | 
         | Still an insane data structure, and I'm still learning Zig, so
         | I can't promise this is precisely equivalent to the C, I got as
         | close as I could and am unwilling to try and compile, let alone
         | build, such a type.
         | 
         | No one would actually do this unless they needed `extern`
         | compatibility. Ignoring the fact (pointed out in the Fine
         | Article) that no one would use anything vaguely like this, a
         | more Zig native version might be
         | 
         |     const foo: [_][8]**fn() [][]const u8
         | 
         | Which returns a slice of slices of immutable u8s, these being
         | fat pointers which include their length.
        
           | xigoi wrote:
           | Nim equivalent:
           | 
           |     seq[array[8, ptr ptr proc(): ptr seq[ptr char]]]
           | 
           | I love having actual words instead of random punctuation
           | soup.
        
         | asimeqi wrote:
         | Why do I remember The C Programming Language book by K&R
         | explaining how to write a version of cdecl? I just checked the
         | second edition of the book and my memory seems to be wrong.
        
           | sea6ear wrote:
           | It's not listed as cdecl and (therefore?) not findable in the
           | index, but in chapter 5 - Pointers and Arrays - the book
           | presents the programs dcl and undcl that translate between C
           | declarations and English renderings of them.
           | 
           | I believe dcl mimics the cdecl program.
        
         | WalterBright wrote:
         | D Language:
         | 
         |     char*[]*function()[8][] foo;
        
         | nuancebydefault wrote:
         | I think brackets are needed in the sentence, otherwise it is
         | ambiguous?
        
       | inopinatus wrote:
       | Reminded of that time I tried to sneak this ternary delight into
       | FreeBSD src:
       | 
       |     if ((Lflag ? chown : lchown)(p->fts_accpath, s->st_uid, -1))
       |             (void)printf(" not modified: %s\n", strerror(errno));
       | 
       | which relies on the function prototypes of the chown(2) and
       | lchown(2) syscalls being otherwise identical.
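       | 
       | The same trick in miniature, with two stand-in functions instead
       | of the real syscalls (all names here are invented). Both branches
       | of ?: are function designators of the same type, so the result
       | can be called directly:

```c
/* Two functions with identical prototypes. */
int add_one(int n)
{
    return n + 1;
}

int sub_one(int n)
{
    return n - 1;
}

/* Pick the callee with ?: and call whichever one was chosen. */
int step(int upward, int n)
{
    return (upward ? add_one : sub_one)(n);
}
```

       | step(1, 10) returns 11 and step(0, 10) returns 9.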
        
         | kjs3 wrote:
         | That's demented. I don't think I'd ever think to even try that
         | in C. _chef kiss_
        
         | jandrese wrote:
         | Nothing says "I'm up to some shenanigans" like the ? : syntax
         | in C.
        
       | jxy wrote:
       | It's fun translating C declarations to English, given that
       | English uses its own special order for possessives and verb-object
       | order. If you use Japanese, you can write the Japanese from left
       | to right in order, while reading the C declaration from the
       | outside in.
       | 
       |     char *(*(**foo [][8])())[]
       | 
       | then reads as
       | 
       | charhenopointanoPei Lie henopointawoFan suGuan Shu
       | henopointahenopointano8Yao Su karanaruPei Lie noWei Ding
       | saizunoPei Lie
       | 
       | which uses the reverse of English possessive, and object verb
       | order.
        
       ___________________________________________________________________
       (page generated 2024-05-17 23:01 UTC)