hngopher.com

       [HN Gopher] I wrote the least-C C program I could
       ___________________________________________________________________
        
       I wrote the least-C C program I could
        
       Author : ingve
       Score  : 235 points
       Date   : 2022-02-21 06:55 UTC (16 hours ago)
        
 (HTM) web link (briancallahan.net)
 (TXT) w3m dump (briancallahan.net)
        
       | DougBTX wrote:
       | A "least-C" needs lazy evaluation at a minimum :-)
        
         | ivxvm wrote:
         | My thoughts exactly. It's just C with a slightly different
         | syntax. Would be way more interesting if it was using lazy
         | evaluation, or maybe some other kind of term rewriting,
         | possibly with garbage collection or some smart miniature region
         | based memory management.
        
       | WalterBright wrote:
       | > I use lots of characters that look like ASCII but are in fact
       | not ASCII but nonetheless accepted as valid identifier
       | characters.
       | 
       | Clever, I was wondering how the : was done, but it's an
       | abomination :-/
       | 
       | With some simple improvements to the language, about 99% of the C
       | preprocessor use can be abandoned and deprecated.
        
         | Koshkin wrote:
         | In C++, anyway. C's expressiveness, on the other hand, is
         | pretty weak, and a preprocessor is very useful there.
         | 
         | A better preprocessor (a C code generator, effectively) would
         | be a simple program that would interpret the <% and %> brackets
         | or similar (by "inverting" them). It is a very powerful paradigm.
        
           | WalterBright wrote:
           | You're talking about metaprogramming. I've seen C code that
           | does metaprogramming with the preprocessor.
           | 
           | If you want to use metaprogramming, you've outgrown C and
           | should consider a more powerful language. There are plenty to
           | pick from. DasBetterC, for example.
        
         | WalterBright wrote:
         | To clarify, what is needed are:
         | 
         | 1. static if conditionals
         | 
         | 2. version conditionals
         | 
         | 3. assert
         | 
         | 4. manifest constants
         | 
         | 5. modules
         | 
         | I occasionally find macro usages that would require templates,
         | but these are rare.
        
           | OskarS wrote:
           | One other thing that would be great that sometimes people use
           | the preprocessor for is having the names of variables/enums as
           | runtime strings. Like, if you have an enum and a function to
           | get the string representation for debug purposes (i.e. the
           | name of the enum as represented inside the source code):
           | 
           |     typedef enum { ONE, TWO, THREE } my_enum;
           |     const char* getEnumName(my_enum val);
           | 
           | you can use various preprocessor tricks to implement
           | getEnumName such that you don't have to change it when adding
           | more cases to the enum. This would be much better implemented
           | with some compiler intrinsic/operator like `nameof(val)` that
           | returned a string. C# does something similar with its
           | `nameof`.
        
             | Someone wrote:
             | > you can use various preprocessor tricks to implement
             | getEnumName such that you don't have to change it when
             | adding more cases to the enum.
             | 
             | For those who don't know: the X Macro
             | (https://en.wikipedia.org/wiki/X_Macro,
             | https://digitalmars.com/articles/b51.html)
        
               | OskarS wrote:
               | Hey, even an article written by Walter, that's a fun
               | coincidence! :)
               | 
               | This is slightly different than the form I've seen it,
               | but same idea: in the version I've seen, you have a
               | special file that's like "enums.txt" with contents like
               | (warning, not tested):
               | 
               |     X(red)
               |     X(green)
               |     X(blue)
               | 
               | and then you write:
               | 
               |     typedef enum {
               |         #define X(x) x,
               |         #include "enums.txt"
               |         #undef X
               |     } color;
               | 
               |     const char* getColorName(color c) {
               |         switch (c) {
               |             #define X(x) case x: return #x;
               |             #include "enums.txt"
               |             #undef X
               |         }
               |     }
               | 
               | Same idea, just using an #include instead of listing them
               | in a macro. Thinking about it, it's sort-of a compile
               | time "visitor pattern".
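
The other form mentioned above, listing the entries in a macro instead of an #include'd file, can be sketched in a single translation unit. This is a minimal sketch, not code from either linked article: COLOR_LIST, COLOR_COUNT, color_names and colorFromName are names assumed here for illustration, and the out-of-range fallback "?" is likewise an assumption.

```c
#include <assert.h>
#include <string.h>

/* Single source of truth: add a colour here and everything below follows. */
#define COLOR_LIST(X) \
    X(red)            \
    X(green)          \
    X(blue)

/* Expansion 1: the enumerators themselves. */
typedef enum {
#define X(name) name,
    COLOR_LIST(X)
#undef X
    COLOR_COUNT /* one past the last real enumerator */
} color;

/* Expansion 2: a table holding each enumerator's name as a string. */
static const char *const color_names[COLOR_COUNT] = {
#define X(name) #name,
    COLOR_LIST(X)
#undef X
};

/* Returns the source-code name of c, or "?" when c is out of range. */
const char *getColorName(color c)
{
    if ((int)c < 0 || (int)c >= COLOR_COUNT)
        return "?";
    return color_names[c];
}

/* Reverse lookup: returns 1 and stores the enumerator in *out on success. */
int colorFromName(const char *name, color *out)
{
    assert(name != NULL && out != NULL);
    for (int i = 0; i < COLOR_COUNT; i++) {
        if (strcmp(name, color_names[i]) == 0) {
            *out = (color)i;
            return 1;
        }
    }
    return 0;
}
```

Adding a fourth colour means touching only COLOR_LIST; the enum, the name table and both lookups pick it up on the next compile.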
        
             | Koshkin wrote:
             | I like that ONE == 0.
        
               | OskarS wrote:
               | Did not even think about that :) Just so used to thinking
               | of enums like that as opaque values.
        
         | BruceEel wrote:
         | Walter, D has conditional compilation, versioning and CTFE
         | without preprocessor so I guess that covers the 99% "sane"
         | functionality. Where do you draw the line between that and the
         | 1% abomination part, i.e. your thoughts on, say, compile time
         | type introspection and things like generating ('printing')
         | types/declarations?
        
           | WalterBright wrote:
           | The abomination is using the preprocessor to redefine the
           | syntax and/or invent new syntax. Supporting identifier
           | characters that look like `:` is just madness.
           | 
           | Of course, I've also opined that Unicode supporting multiple
           | encodings for the same glyph is also madness. The Unicode
           | people veered off the tracks and sank into a swamp when they
           | decided that semantic information should be encoded into
           | Unicode characters.
        
             | scatters wrote:
             | That ship sailed long before Unicode. Even ASCII has
             | characters with multiple valid glyphs (lower case a can
             | lose the ascender, and lower case g is similarly variable
             | in the number of loops), not to mention multiple characters
             | that are often represented with the same glyph (lower case
             | l, upper case I, digit 1).
        
               | WalterBright wrote:
               | That's a font issue with some fonts, not a green light
               | for blessing multiple code points with the exact same
               | glyph.
               | 
               | In fact, having a font that makes l I and 1
               | indistinguishable is plenty of good reason to NOT make
               | this a requirement.
        
             | foxfluff wrote:
             | > The Unicode people veered off the tracks and sank into a
             | swamp when they decided that semantic information should be
             | encoded into Unicode characters.
             | 
             | As if that weren't enough, they also decided to cram half-
             | assed formatting into it. You got bold letters, italics,
             | various fancy-style letters, superscripts and subscripts
             | for this and that.. all for the sake of legacy
             | compatibility. Unicode was legacy right from the beginning.
        
               | pornel wrote:
               | The "fonts" in Unicode are meant to be for math and
               | scientific symbols, and not a stylistic choice. Don't use
               | them for text, as it can be a cacophony in screen
               | readers.
               | 
               | Unicode chose to support _lossless_ conversion to and
               | from other encodings it replaces (I presume it was
               | important for adoption), so unfortunately it inherited
               | the sum of everyone else's tech debt.
        
               | WalterBright wrote:
               | Unicode did worse than that. They added code points to
               | esrever the direction of text rendering. Naturally, this
               | turned out to be useful for injecting malware into source
               | code, because having the text rendered backwards and
               | forwards _erases_ the display of the malware, so people
               | can't see it.
               | 
               | Note that nobody needs these code points to reverse text.
               | I did it above without gnisu those code points.
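
One way a tool could guard against the trick described above is to scan source text for the direction-control code points before it is reviewed or compiled. The sketch below assumes the code points meant are the explicit embeddings and overrides (U+202A to U+202E) and the isolates (U+2066 to U+2069), and that the input is valid UTF-8; count_bidi_controls is a hypothetical helper name, not part of any existing tool.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Counts bidirectional control characters in a NUL-terminated UTF-8 string.
 *   U+202A..U+202E (embeddings, overrides) encode as E2 80 AA..AE
 *   U+2066..U+2069 (isolates)              encode as E2 81 A6..A9
 */
size_t count_bidi_controls(const char *s)
{
    assert(s != NULL);
    const unsigned char *p = (const unsigned char *)s;
    size_t n = strlen(s);
    size_t count = 0;

    for (size_t i = 0; i + 2 < n; i++) {
        if (p[i] != 0xE2)
            continue; /* every code point of interest starts with E2 */
        if ((p[i + 1] == 0x80 && p[i + 2] >= 0xAA && p[i + 2] <= 0xAE) ||
            (p[i + 1] == 0x81 && p[i + 2] >= 0xA6 && p[i + 2] <= 0xA9))
            count++;
    }
    return count;
}
```

A pre-commit hook or linter could reject any source file for which this returns a non-zero count, since ordinary code rarely needs these characters outside string literals.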
        
               | WalterBright wrote:
               | Yeah, where do you stop when you start adding fonts to
               | Unicode?
        
               | [deleted]
        
               | bombcar wrote:
               | We have emojis so we're probably not far from Unicode
               | characters that blink.
        
               | hug wrote:
               | If #include <cursive.h> is wrong, I don't want to be
               | right.
        
               | foxfluff wrote:
               | I like using bold and italic text just to mess with anyone
               | who's trying to use the search function in their browser or
               | editor. SLL s R g'R  sIILR g'g'. It gets extra fun in rich
               | text editors where you can mix unicode styles and real
               | styles. prnoivcioddees oepnpdolretsusnities coonrfusing
               | anedol stohfetiwrare.
        
             | mananaysiempre wrote:
             | What other kind of difference should be encoded into
             | Unicode characters? For example, the glyphs for the Latin
             | _a_ and the Cyrillic _a_ , or the Latin _i_ and the
             | Cyrillic (Ukrainian, Belarusian, and pre-1918 Russian) _i_
             | look identical in practically every situation, and the
             | Latin (Turkish) _i_ and the Greek _i_ aren't far off. At
             | least not far off compared to the Cyrillic (most languages)
             | _d_ and the Cyrillic (Southern) _g_ -like version (from the
             | standard Cyrillic cursive), or the Cyrillic _t_ and the
             | _several_ Cyrillic (Southern) versions that are like either
             | an _m_ or a turned _m_ (from the cursive, again). Yet most
             | people who are acquainted with the relevant languages would
             | say the former are different "letters" (whatever that
             | means) and the latter are the same.
             | 
             | [Purely-Latin borderline cases: umlaut (is _not_ two dots
             | in Fraktur) vs diaeresis (languages that use it are not
             | written in Fraktur), acute (non-Polish, points past the
             | letter) vs kreska (Polish, points _at_ the letter). On the
             | other hand, the mathematical "element of" sign was still
             | occasionally typeset as an epsilon well into the 1960s.]
             | 
             | Unicode decides most of these based on the requirement to
             | roundtrip legacy encodings ("have these been ever encoded
             | differently in the same encoding?"), which seems
             | reasonable, yet results in homograph problems and at the
             | same time the Turkish case conversion botch. In any case,
             | once (sane) legacy encodings run out but you still want to
             | be consistent, what _do_ you base the encoding decisions on
             | but semantics? (On the other hand, once you start encoding
             | semantic differences, where do you stop?..) You _could_ do
             | some sort of glyph-equivalence-class thing, but that would
             | still give you no way to avoid unifying _a_ and _a_ --
             | _everyone_ who writes both writes them the same.
             | 
             | None of this touches on Unicode "canonical equivalence",
             | but your claim ("Unicode supporting multiple encodings for
             | the same glyph is [...] madness") covers more than just
             | that if I understood it correctly. And while I am attacking
             | it in a sense, it's only because I genuinely don't see how
             | this part could have been done differently in a major way.
        
               | cestith wrote:
               | I'm obviously not Walter, but I have a succinct answer
               | that may upset a few people, but avoids a lot of
               | confusion at the same time.
               | 
               | The idea of a letter in an alphabet and a printable glyph
               | for that letter are two different ideas. Unicode could
               | have and probably should have had a two-layer encoding
               | where the letters are all different but an extra step
               | resolves letters to glyphs. Where one glyph can represent
               | more than one letter, a modifier can be attached to
               | represent the parent alphabet so no semantic information
               | is lost. Comparison for "same character" would be at the
               | glyph level without modifiers, and we could have avoided
               | a bunch of different Unicode equivalence testing
               | libraries that have to be individually written,
               | maintained, and debugged. Use in something like a spell
               | checker, conversion to other character sets, or
               | stylization like cursive could have used the glyph and
               | source-language modifier both.
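
The two-layer idea above can be made concrete with a small data structure. Everything here is hypothetical and purely illustrative: script_tag, letter, GLYPH_CAPITAL_A and the three functions are invented names, and nothing like this exists in Unicode itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical alphabet tags; not part of any real standard. */
typedef enum { SCRIPT_LATIN, SCRIPT_GREEK, SCRIPT_CYRILLIC } script_tag;

/* Hypothetical two-layer character: a shared glyph id plus its alphabet. */
typedef struct {
    uint32_t   glyph;   /* layer 1: the shape that gets drawn */
    script_tag script;  /* layer 2: the alphabet the letter belongs to */
} letter;

enum { GLYPH_CAPITAL_A = 1 }; /* illustrative glyph id; 0 means "no glyph" */

/* Builds a letter; glyph id 0 is reserved in this sketch. */
letter make_letter(uint32_t glyph, script_tag script)
{
    assert(glyph != 0);
    letter l = { glyph, script };
    return l;
}

/* "Same character" at the glyph level, ignoring the alphabet modifier. */
bool same_glyph(letter a, letter b)
{
    return a.glyph == b.glyph;
}

/* Full comparison, for spell checkers or conversion to other encodings. */
bool same_letter(letter a, letter b)
{
    return a.glyph == b.glyph && a.script == b.script;
}
```

Under this scheme a Latin capital A and a Greek capital alpha would compare equal with same_glyph but not with same_letter, which is the distinction the comment asks for.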
        
               | mananaysiempre wrote:
               | (I expect Walter probably has better things to do than to
               | reply to random guys on the 'net, but we can always hope,
               | and I was curious :) )
               | 
               | First off, Unicode cursive (bold, Fraktur, monospace,
               | _etc._ ) Latin letters are not meant to be styles, they
               | are mathematical symbols. Of course, that doesn't mean
               | people aren't going to use them for that[1], and I'm not
               | convinced Unicode should have gotten into that particular
               | can of worms, but I think you can consistently say that
               | the difference between, for example, an italic X for the
               | length of a vector and a bold X for the vector itself (as
               | you could encounter in a mechanics text) is not (just)
               | one of style. Similarly for the superscripts and modifier
               | letters--a [ph] and a [pk] or a [kj] and a [kj] in an IPA
               | transcription (for which the modifiers are intended)
               | denote very different sounds (granted, ones that are
               | unlikely to be used at the same time by a single speaker
               | in a single language, but IPA is meant to be more general
               | than that).
               | 
               | (Or wait, was this a reply to my point about Russian vs
               | Bulgarian _d_? The Bulgarian one is not a cursive
               | variant, it's derived from a cursive one but is a
               | perfectly normal upright letter in both serif and sans-
               | serif, that looks exactly the same as a Latin "single-
               | storey" g as in most sans-serif fonts but never a Latin
               | "double-storey" g as in most serif fonts, and printed
               | Bulgarian _only_ uses that form--barring font problems--
               | while printed Russian never does. I guess you could
               | declare all of those to be variants of one another, even
               | if it's wrong etymologically, but even to a Cyrillic user
               | who has never been to Bulgaria that would be quite
               | baffling.)
               | 
               | As to your actual point, I don't think the comparison you
               | describe could be made language-independent enough that
               | you wouldn't still end up needing to use a language-
               | specific collation equivalence at the same time (which
               | seems to be your implication IIUC). _E.g._ a French
               | speaker would usually want _oe_ and _oe_ to compare the
               | same but different from _o_ -diaeresis, but a German
               | speaker might (or might not) want _oe_ and _o_ -umlaut to
               | compare the same, while every font renders _o_ -diaeresis
               | and _o_ -umlaut exactly the same. French speakers (but
               | possibly not in every country?) will almost always drop
               | diacritics over capital letters, and Russian speakers
               | frequently turn _io_ ( /jo/, /o/) into _e_ ( /je/, /e/)
               | except in a small set of words where there's a
               | possibility of confusion (the surnames Chebyshev and
               | Gorbachev, which end in _-iov_ /-of/, are well-known
               | victims of this confusion). _A_ is a stylistic variant of
               | _aa_ in Norwegian, but a speaker of Finnish (which
               | doesn't use _a_ ) would probably be surprised if forced
               | to treat them the same.
               | 
               | And that's only in Europe--what about Arabic, where
               | positional variants can make (what speakers think of) a
               | single letter look very different. Even _in_ Europe,
               | should _s_ and _s_ be "the same glyph"? They certainly
               | have the same phonetic value, and you always have to use
               | one or the other...
               | 
               | Of course, we already have a (font-dependent) codepoint-
               | to-glyph translation in the guise of OpenType shaping,
               | but it's not particularly useful for anything but display
               | (and even there it's non-ideal).
               | 
               | [1] https://utcc.utoronto.ca/~cks/space/blog/tech/PeopleA
               | lwaysEx...
        
               | pvg wrote:
               | _printed Bulgarian only uses that form_
               | 
               | This is a total pedantitangent but I don't think that's
               | actually true. These wikipedia pages don't talk about it
               | directly but I think give a bit of the flavour/related
               | info that suggest it's not nearly that set in stone:
               | 
               | https://bg.wikipedia.org/wiki/%D0%91%D1%8A%D0%BB%D0%B3%D0
               | %B0...
               | 
               | https://bg.wikipedia.org/wiki/%D0%93%D1%80%D0%B0%D0%B6%D0
               | %B4...
               | 
               | The second one, in particular, says early versions of
               | Peter I's Civil Script had the g-looking small d, so
               | these variants have been used concurrently for some time.
        
               | cestith wrote:
               | I made no mention of collation, alternate compositions,
               | or of fonts. All I'm saying is that Unicode from the
               | beginning could have made capital alpha and capital Latin
               | 'A' the same glyph, with a shared glyph-part representation,
               | while a separate letter-part representation could have made
               | clear which was which. O-with-umlaut and o-with-diaeresis
               | could have been done the same. Since you've mentioned
               | fonts, I'll carry on through that topic. Rather than
               | having two code points with two different entries in
               | every font, we could have considered the glyph and the
               | parent alphabet as two pieces of data and had one entry
               | in the font for the glyph.
        
               | wyldfire wrote:
               | Ignoring Unicode and focusing just on C: if the glyph
               | matches a glyph used in any existing C operator maybe it
               | shouldn't be legal as an identifier character.
        
               | mananaysiempre wrote:
               | I'm not defending either standard Unicode identifiers or
               | C Unicode identifiers (which are, incidentally, very
               | different things, see WG14/N1518), no :) The Agda people
               | make good use of various mathematical operators,
               | including ones that are very close to the language syntax
               | ( _e.g._ colon as built-in type ascription and equals as
               | built-in definition, but Unicode colon-equals as a
               | substitution operator for a user-defined type of terms in
               | a library for processing syntax), but overall I'm not
               | convinced it's worth it at all.
               | 
               | As a way to avoid going ASCII-only, though, excluding
               | only things that look like syntax might be simultaneously
               | not going far enough (how are homograph collisions
               | between user-defined identifiers any better?) and too far
               | (reliably transplanting identifiers between languages that
               | use different sets of punctuation seems like it'd be
               | torturously difficult).
        
               | WalterBright wrote:
               | It's a good question. The answer is straightforward.
               | Let's say you saw `i` in a book. How would you know if it
               | is Latin or Cyrillic?
               | 
               | By the context!
               | 
               | How would a book distinguish `a` as in `apple` from `a`
               | as in `a+b`? (Unicode has a separate letter a from a math
               | a.)
               | 
               | By the context!
               | 
               | This is what I meant by Unicode has no business adding
               | semantic content. Semantics come from context, not from
               | glyph. After all, what if I decided to write:
               | 
               | (a) first bullet point
               | 
               | (b) second bullet point
               | 
               | Now what? Is that letter a or math symbol a? There's _no
               | end_ to semantic content. It's _impossible_ to put this
               | into Unicode in any kind of reasonable manner. Trying to
               | do it leads one into a swamp of hopelessness.
               | 
               | BTW, the attached article is precisely about deliberately
               | _misusing_ identical glyphs in order to _confuse_ the
               | reader because the C compiler treats them differently.
               | What better case for semantic content for glyphs being a
               | hopelessly wrongheaded idea.
        
         | ReleaseCandidat wrote:
         | > With some simple improvements to the language, about 99% of
         | the C preprocessor use can be abandoned and deprecated.
         | 
         | Arguably the C feature most used in other languages is the C
         | preprocessor's conditional compilation for e.g. different OSes.
         | Used by languages from Fortran (yes, there exists FPP now - for
         | a suitable definition of 'now') to Haskell (yes, `{-# LANGUAGE
         | CPP #-}`).
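
For readers who have not seen it, the OS-conditional pattern referred to above looks roughly like this in C. It is a minimal sketch: `_WIN32` is the macro that Windows compilers predefine, while PATH_SEP and join_path are names assumed here for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Choose the path separator at compile time from a predefined macro. */
#if defined(_WIN32)
#define PATH_SEP '\\'
#else
#define PATH_SEP '/'
#endif

/*
 * Writes "dir<sep>file" into out, which holds cap bytes.
 * Returns 0 on success, -1 if the result did not fit or formatting failed.
 */
int join_path(char *out, size_t cap, const char *dir, const char *file)
{
    assert(out != NULL && dir != NULL && file != NULL);
    int n = snprintf(out, cap, "%s%c%s", dir, PATH_SEP, file);
    return (n >= 0 && (size_t)n < cap) ? 0 : -1;
}
```

Only one branch of the `#if` ever reaches the compiler proper, which is why other languages borrow the C preprocessor for this job.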
        
       | omgmajk wrote:
       | Hasn't everyone done at least something similar to this? I'm
       | surprised, I re-define C quite often when I'm bored.
        
       | RegW wrote:
       | Yep. Been there. Done that.
       | 
       | In the 80's I worked for a guy who insisted that we wrote all our
       | C using macros that made it look like FORTRAN, amongst much other
       | nonsense. How fondly I remember the many hilarious hours spent
       | trying to pin down the cause of unexpected results.
       | 
       | I don't remember any specific examples, but consider:
       | 
       | #define SQ(v) v*v
       | 
       | int sq = SQ(++v);
        
         | Too wrote:
         | Classic pitfall with any type of text-based preprocessor. All
         | variables inside need an excessive amount of parentheses.
         | 
         | In a similar vein there is also the "do {} while (0)" trick to
         | allow macros to appear like normal functions and end with a
         | semicolon.
         | 
         | Don't even want to imagine how many more hacks would be needed
         | to transform into another syntax using macros only.
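
Both pitfalls from this subthread can be shown in a few lines. This is a sketch under assumptions: SQ_BAD is the SQ macro from the parent comment renamed for contrast, while SQ, sq_fn, SWAP_INT and sort2 are illustrative names that appear nowhere in the thread.

```c
#include <assert.h>
#include <stddef.h>

/* As in the parent comment: the argument is pasted in textually. */
#define SQ_BAD(v) v*v

/* Parenthesise every use of the parameter and the whole expansion. */
#define SQ(v) ((v) * (v))

/* SQ_BAD(1 + 2) expands to 1 + 2*1 + 2, which is 5 rather than 9. */
_Static_assert(SQ_BAD(1 + 2) == 5, "unparenthesised macro mis-parses");
_Static_assert(SQ(1 + 2) == 9, "parenthesised macro squares the sum");

/*
 * Parentheses do not prevent double evaluation: SQ(++v) still modifies v
 * twice without sequencing, which is undefined behaviour. A function
 * evaluates its argument exactly once.
 */
int sq_fn(int v)
{
    return v * v;
}

/*
 * do { ... } while (0) turns a multi-statement macro into one statement
 * that expects a trailing semicolon, like a function call.
 */
#define SWAP_INT(a, b)        \
    do {                      \
        int swap_tmp_ = (a);  \
        (a) = (b);            \
        (b) = swap_tmp_;      \
    } while (0)

/* Orders two ints. With bare braces in the macro, the else would not parse. */
void sort2(int *lo, int *hi)
{
    assert(lo != NULL && hi != NULL);
    if (*lo > *hi)
        SWAP_INT(*lo, *hi);
    else
        (void)0; /* already ordered */
}
```

The function version is usually the better choice in modern C; the macro forms are shown only because they are what the comments above describe.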
        
       | chx wrote:
       | There was a reddit thread about crafty Unicode usage in
       | programming a few years ago.
       | 
       | https://www.reddit.com/r/rust/comments/5penft/parallelizing_...
       | 
       | > If you look closely, those aren't angle brackets, they're
       | characters from the Canadian Aboriginal Syllabics block, which
       | are allowed in Go identifiers. From Go's perspective, that's just
       | one long identifier.
       | 
       | The thread goes downhill _fast_ from there to the point where
       | 
       | > I once wrote a short Swift program with every identifier a
       | different length chain of 0-width spaces.
        
       | oefrha wrote:
       | Well, pasting the code into VS Code ruined some of the fun, since
       | the non-ASCII homoglyphs are now highlighted by default.
        
       | CapsAdmin wrote:
       | There's also https://libcello.org/ a popular (?) macro-heavy
       | library which makes C feel modern.
        
         | lionkor wrote:
         | "modern" meaning less explicit?
        
           | retrac wrote:
           | Meaning classes, algebraic data types, pattern matching,
           | boxed objects, iterators and garbage collection. All they
           | need is smart pointers or a borrow checker and it'd
           | practically be C++ or Rust, except it's rather brittle
           | because it's just a bunch of macros.
        
         | foxfluff wrote:
         | Have you ever seen it used in the wild?
        
       | gmiller123456 wrote:
       | Back when I had just learned Pascal, and was beginning to learn
       | C, I did some of this. No idea why I thought that would make it
       | easier to learn. I did not take it as far as the author of this
       | article did. But I did expand it to function calls like "#define
       | writeln printf". Looking back, I'm a bit amazed I managed to
       | learn it, as I was obviously putting more work into not learning
       | C than learning it.
        
         | vidarh wrote:
         | It was practically a rite of passage back in the days when
         | Pascal and Pascal-like languages were common to do this with
         | C....
        
         | [deleted]
        
       | raverbashing wrote:
       | > would use the C preprocessor to create his own language and
       | then write the implementation of his language in that self-
       | defined language
       | 
       | Yeah that sounds like the easiest way to make your colleagues
       | hate you
       | 
       | I "love" how we had more languages in the 70s (usually created as
       | a one-off project for people with not so much user friendliness
       | in mind) think m4, awk, tcl, etc
        
         | na85 wrote:
         | Awk is actually great. M4 not so much.
         | 
         | Some absolute lunatic solved this year's Advent of Code in m4;
         | it was impressive.
        
           | erik_seaberg wrote:
           | Terraform module args used to be very limited, and I didn't
           | know how to generate JSON it would take instead of HCL, so I
           | actually used m4 to avoid repeating every template n times.
           | And now we are sad because of course Terraform has improved
           | quite a bit.
        
         | ThinBold wrote:
         | I mean we do have a lot of (perhaps too many) markdown dialects
         | today. Wikipedia, wordpress, github, stackexchange, you name
         | it. Last time I was using a Q&A forum for calculus course, it
         | uses $$ to start and close a MathJax div.
        
           | smcl wrote:
           | My fave is Jira, where they have one syntax when creating an
           | issue and another for editing it
        
           | fhars wrote:
           | Well, at least that is the obvious way to delimit a math div,
           | isn't it?
        
         | nine_k wrote:
         | Would you consider doing string processing in C rather than in
         | Awk or Tcl?
        
           | jfk13 wrote:
           | In FORTRAN, thank you.
           | 
           | (It was a long time ago, and there was no C compiler on our
           | IBM/370...)
        
           | anthk wrote:
           | TCL is basically THE string processing language... because
           | everything is a string :p.
           | 
           | For short scripts, awk is nice, but most people would use
           | Python nowadays, and die hard Unix greybeards will use Perl
           | or TCL depending on the mood.
        
             | pjmlp wrote:
             | Version 8.4 changed it a bit.
        
           | raverbashing wrote:
           | It is a funny question because I think most languages get
           | string processing right. Pascal gets it right.
           | 
           | Except C.
        
             | Koshkin wrote:
             | But BSTR, too, is a C construct.
        
         | foxfluff wrote:
         | > Yeah that sounds like the easiest way to make your colleagues
         | hate you
         | 
         | Well I'm not Whitney's colleague but I really like his code.
        
           | smcl wrote:
           | What do you like about it? I don't think it needs to be
           | stated why the majority of people here probably hate it, but
           | I am curious why anyone would actively like it. I can maybe
           | see that there's a sense of achievement in being able to grok
           | a codebase that is often described as unreadable
        
             | foxfluff wrote:
             | It might be unreadable in the same sense as Chinese or
             | Russian is to someone who hasn't learned to read it. Learn
             | to read and it turns out not to be unreadable?
             | 
             | I like it because it makes it easier for me to see the big
             | picture. The forest and the mountain. It doesn't over-
             | emphasize the bark on the trees; it doesn't drag on and
             | make me scroll & jump through a maze of boring minute
             | detail. At the same time, it doesn't actually bury and hide
             | whatever detail there is; it's all there for when you need
             | it. Whitney also generally simplifies things a lot and
             | avoids tedious contortions others would make for
             | portability or some theoretical conception of
             | maintainability, readability, or vague "best practices".
             | It's very straightforward and -- once you get past the
             | Whitney vocabulary and general style -- there are no
             | mountains of abstractions and layers you need to grok
             | before you can work on the code.
             | 
             | The biggest problem is that after getting used to that
             | style, "normal" code starts to feel like kindergarten books
             | with very simple sentences written in horse sized capital
             | letters, a handful per page. Except that those letters are
             | not used to write a short children's story, but a complex
             | labyrinthine machine, and the over-emphasis on minute
             | detail just obscures the complexity and you end up with
             | cross cutting concerns spread over thousands of lines of
             | code and many many files. It might look clean and readable
             | on the surface, yet: there be dragons. And nobody wants to
             | tame those dragons because there's so much code. And then I
             | find myself sighing and asking: why on earth do we need an
             | entire page for what would really amount to a line of code
             | if we didn't insist on spelling everything out like it's
             | babby's third program after hello world? It's just tedious.
             | 
             | Whitney certainly writes his own sort of dragons, but it's
             | easier to keep them all in your head. For example, the b
             | compiler won't work out of the box on multiple platforms.
             | I'm fairly confident I could port it without much effort,
             | as long as the platform meets some basic requirements.
        
               | smcl wrote:
               | Yeah I don't think _anyone_ likes over-engineered
               | architecture astronaut code with too many layers and
               | unnecessary abstractions, whether that's been formatted
               | in the Whitney style or in a more conventional one. I
               | think what I can't get over are the short identifiers
               | (and filenames) and the way it just looks like a wall of
               | text without any breathing space, though looking at
               | another example someone posted[0] it seems there's a bit
               | more whitespace and structure than I remember.
               | 
               | > Learn to read and it turns out not to be unreadable?
               | 
               | There's the thing, if someone learns Russian they can
               | converse with 140 million people in Russia, similarly
               | Ukrainians and Belarusians will be fine and they could
               | probably make themselves understood through the Caucasus
               | and Central Asia. If you learn to read C written in the
               | Arthur Whitney style you can "converse" with a fairly
               | small number of people who like the Whitney style[1]. So
               | taking the example a bit further, I learned the Cyrillic
               | alphabet in an afternoon and through knowing another
               | Slavic language I can roughly parse the meaning of many
               | Russian things I read (audio is another thing entirely, I
               | can only pick out a handful of Czech/Russian homophones).
               | If I had gotten up to speed with the ngn/k codebase would
               | I be productive on one of the projects you wrote in a
               | similar style, or is there a similar productivity wall
               | where I'd have to first learn some idioms local to your
               | codebase?
               | 
               | Sorry for the questions, I know people who like this
               | style probably have to answer these questions fairly
               | frequently. I am genuinely just quite curious though.
               | 
               | [0] = https://codeberg.org/ngn/k/src/branch/master/h.c
               | 
               | [1] = is there a proper name for this or is it ok to
               | refer to it as "Whitney style"?
        
               | foxfluff wrote:
               | I don't know if there's a proper name for it. At least
               | people who are aware of the style would probably
               | recognize Whitney's name so that's the best term I've
               | come up with yet.
               | 
               | For me, the end goal isn't Whitney style, but I've been
               | pursuing effective programming all my life. When I
               | learned to code, I wasn't talking to anyone except my
               | computer, and that alone was exciting enough to make it a
               | life-long hobby and profession for me.
               | 
               | Do you know what brought me to Hacker News? Arc. And Paul
               | Graham's writings about Lisp. The message was never about
               | a popular language that everyone speaks. If anything, it
               | was rather the opposite: pg saw in lisp a powerful and
               | expressive (if niche) language that makes you competitive
               | against larger players who stick to boring mainstream
               | languages. I wasn't interested in competition or
               | startups, but merely in powerful ways to make the
               | computer do what you want. I don't particularly care if
               | I'm the only person on planet earth willing to wield that
               | power; it's for my own enjoyment. Programming for me
               | isn't about "product" so "productivity wall" isn't
               | something I think about, and complaining about
               | productivity wall would be a bit like complaining that
               | getting fit for Tour de France is time consuming, why
               | don't they just drive it by car?
               | 
               | That said, I think there are people who find K, APL, and
               | related languages very productive in their niche. I'm
               | definitely not speaking for everyone.
               | 
               | Anyway, it is the curiosity and desire to discover a
               | powerful way to command the computer that has driven me
               | to study Haskell, APL, PostScript, forth, TLA+, lisps,
               | BQN, K, Erlang, and more. "Whitney C" is just one
               | milestone along the journey, and I don't know where the
               | journey will eventually lead; I'm just not happy with any
               | existing language right now.
               | 
               | So the answer is no, learning Whitney C will not make
               | someone immediately productive with my code, just as
               | learning Java does not make you immediately productive
               | in C++. _They are different languages._ However, anything
               | you learn can shrink the productivity wall; knowing APL
               | or BQN, K, and Whitney C might make it easier to grasp
               | whatever I come up with next. That applies to all of
               | programming in general though; the more you know, the
               | more you know, and some of that knowledge will almost
               | always transfer. There will be familiar patterns and
               | ideas.
               | 
               | I also think people seriously overestimate the
               | productivity wall. As you say, one can learn to read
               | Cyrillic in an afternoon. Kana in a weekend. But learning
               | Russian or Japanese is significantly more work than
               | learning the script. In terms of scope, I'd say learning
               | APL or Whitney C is closer to learning kana than it is to
               | learning Japanese.
               | 
               | (EDIT: I also find it ironic that programmers are
               | ostensibly excited about learning new things, yet at the
               | same time programmers really love to complain about
               | languages that look alien and won't give a few weekends to
               | learn them)
        
               | smcl wrote:
               | Good to know re the name, I typed it out a couple of
               | times and wondered if I was just doing something stupid.
               | I really wasn't trying to formulate an attack on this
               | style, if someone or some organization uses it and it
               | works then more power to them. I was really just trying
               | to understand a bit better, but it's possible that is
               | something that I can only really get by experience.
               | 
               | So I find APL, J, K and friends quite fascinating (and J
               | is on my list to try) but I haven't seen much hostility
               | to them. People understandably get a bit intimidated by
               | how different it is but they usually still seem curious.
               | The real hostility is reserved for Whitney C. In this
               | case I don't think it's like - if you'll forgive me for
               | abusing that human language metaphor a bit - an English
               | speaker learning Russian, more like
               | ifAnEnglishSpeakerEncountered
               | aLocaleWhereEnglishIsWrittenA
               | ndStructuredDifferentlyToWhat
               | theyWereUsedToSomethingLikeThis.
               | 
               | I can understand why their instinct is to recoil in
               | horror and think "I already know a more standard English,
               | I'm currently learning Russian and Japanese ... I have
               | little patience for trying out this alternate form of
               | English". It's obviously an exaggerated/contrived
               | example, but this is genuinely how that C code appears to
               | an outsider at first blush (or at least it did to me and
               | a couple of my friends).
               | 
               | That said your replies have piqued my interest, I'm gonna
               | have to properly dig through that ngn/k repo some day. If
               | I turn into a Whitney C guy then I'm holding you
               | responsible :D
        
               | foxfluff wrote:
               | Yep. It's mostly a knee jerk reaction.
               | 
               | After accustomization with a style that doesn't force
               | explicit declarations of identifiers and their types,
               | verbose type conversions, line breaks and indentation
               | after every statement and brace, et cetera, one could
               | definitely make a different (and similarly exaggerated)
               | human language metaphor. For example, take some English
               | text and feed it through a parser. Feels good to read?
               | 
               |     (S (NP Parsing)
               |        (VP refers
               |            (PP to
               |                (NP (NP the activity)
               |                    (PP of
               |                        (S (VP analysing
               |                               (NP a sentence)
               |                               (PP into
               |                                   (NP its component categories and functions))))))))
               |        .)
               | 
               | That's a bit how mainstream languages feel after using
               | something that hasn't been forced into such an artificial
               | form :) If you're willing to let go of that, you can
               | write sentences and clauses on the same line, almost like
               | prose!
        
               | smcl wrote:
               | Hehe that's actually a nice way to put it. So it's a
               | little bit like the red pill, you can't really go back
               | after embracing Whitney C :D
        
       | fhars wrote:
       | I do like the style of kona if I need an example to confuse
       | people about what C looks like,
       | https://github.com/kevinlawler/kona/tree/master/src
        
         | ulucs wrote:
         | That is the commonly accepted way to write k interpreters after
         | all. ngn/k looks more like k than C too.
         | 
         | https://codeberg.org/ngn/k/src/branch/master/a.c
        
           | smcl wrote:
           | I'm so glad I don't have to work with that code, I would lose
           | my mind
        
       | rramadass wrote:
       | Check out this Arthur Whitney (Genius Language
       | Designer/Programmer) thread -
       | https://news.ycombinator.com/item?id=19481505
       | 
       | In the above thread, user "yiyus" actually went through the macro
       | code line-by-line and made explanatory notes; unbelievable! -
       | https://docs.google.com/document/d/1W83ME5JecI2hd5hAUqQ1BVF3...
        
       | BruceEel wrote:
       | There was the "Val Linker"[1] (also for the Amiga, though I can't
       | seem to find that version), written in a kind of Pascal-ish C,
       | powered by macros.. Snippet:
       | 
       |     void get_order_token()
       |     BeginDeclarations
       |     EndDeclarations
       |     BeginCode
       |      While token_break_char Is ' '
       |       BeginWhile
       |        order_token_get_char();
       |       EndWhile;
       |      copy_string(token, null_string);
       |      If IsIdentifier(token_break_char)
       |       Then
       |        While IsIdentifier(token_break_char)
       |         BeginWhile
       |          concat_char_to_string(token, token_break_char);
       |          order_token_get_char();
       |         EndWhile;
       |        lowercase_string(token);
       |       Else
       |        If token_break_char Is '['
       |         Then
       |          While token_break_char IsNot ']'
       |           BeginWhile
       |            concat_char_to_string(token, token_break_char);
       |            order_token_get_char();
       |           EndWhile;
       |          order_token_get_char();
       |          If case_ignore.val
       |           Then
       |            lowercase_string(token);
       |           EndIf;
       |         Else
       |          concat_char_to_string(token, token_break_char);
       |          order_token_get_char();
       |         EndIf;
       |       EndIf;
       |      return;
       |     EndCode
       | 
       | [1]
       | https://ftp.sunet.se/mirror/archive/ftp.sunet.se/pub/simteln...
        
       | snarfy wrote:
       | cpaint.h
       | 
       |     #include<curses.h>
       |     #include<stdio.h>
       |     #define ?
       |     #define do
       |     #define ^ *
       |     #define N -1
       |     #define call
       |     #define main.
       |     #define , ];
       |     #define = =
       |     #define = ==
       |     #define ; ];
       |     #define not !
       |     #define I int
       |     #define M 256
       |     #define or ||
       |     #define end ;}
       |     #define CALL }
       |     #define S case
       |     #define X x<<8
       |     #define size [
       |     #define <> !=
       |     #define var int
       |     #define Y y<<16
       |     #define begin {
       |     #define F FILE*f
       |     #define integer
       |     #define POOL ;}}
       |     #define W p[y*q+x]
       |     #define Z (W&(M N))
       |     #define packed char
       |     #define OK break;case
       |     #define procedure int
       |     #define fill= return
       |     #define close fclose(f)
       |     #define readln(a) c=fgetc(f)
       |     #define H(a,b) mvaddch(a,b,' ')
       |     #define writeChar(a) fputc((a),f)
       |     #define open(a,b) F=fopen((a),(b))
       |     #define A(a) attron(COLOR_PAIR(a))
       |     #define B(a) attroff(COLOR_PAIR(a))
       |     #define read(a) switch(getch()){case
       |     #define draw(a) {W=Y;W|=X;W|=((a)&0xff);}
       |     #define LOOP for(y=0;y<w;y++){for(x=0;x<q;x++){
       |     #define check if(fgetc(f)!=83){fclose(f);return N;};w=fgetc(f);q=fgetc(f)
       |     #define start initscr();clear();keypad(stdscr,TRUE);cbreak();noecho();curs_set(0)
        
       | can16358p wrote:
       | On line 12 of the header file, are there two different semicolons?
       | Is one of them actually another Unicode character that just looks
       | like a semicolon?
       | 
       | (https://github.com/ibara/cpaint/blob/21c70acba373df932920d5f...)
        
         | ThinBold wrote:
         | U+FF1B Fullwidth semicolon. The semicolon in CJKV ideographs.
        
       | johndoe0815 wrote:
       | Steve Bourne used a set of macros to enable ALGOL-like
       | programming in C and used it to implement his Unix shell -
       | https://research.swtch.com/shmacro
        
         | OskarS wrote:
         | This is obviously terrible, but my favorite part of this whole
         | thing is the fact that he included the parentheses in the
         | IF/THEN macros, so you didn't have to do it for the condition
         | in the if statement. So, like, this line doesn't need
         | parentheses around the condition:
         | 
         |     IF (n=to-from)<=1 THEN return FI
         | 
         | All modern languages that try and do a refresh on the C style
         | (Rust and Swift notably) do this, it's clearly the right idea.
         | They should just update the C and C++ syntax to make those
         | parentheses optional at this point.
         | 
         | (PS. some people would also recoil at the assignment-as-
         | expression used in that line, but that's just good clean fun!)
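A minimal sketch of how such macros can carry the parentheses themselves. The macro bodies and the function name `range_is_trivial` are assumptions made for illustration, modelled on the line quoted above rather than copied from the shell sources:

```c
#include <assert.h>

/* Hypothetical ALGOL-flavoured macros: IF opens the parenthesis and THEN
 * closes it, so the condition itself is written bare. */
#define IF   if (
#define THEN ) {
#define ELSE } else {
#define FI   ;}

/* Returns 1 when the range from..to spans at most one element. */
int range_is_trivial(int from, int to)
{
    int n;

    assert(from <= to);   /* precondition for this sketch */
    /* Expands to: if ( (n = to - from) <= 1 ) { return 1 ;} */
    IF (n = to - from) <= 1 THEN return 1 FI
    return 0;
}
```

Because `FI` supplies the closing brace, every `IF` body is a block, which also sidesteps the unbraced-statement problem raised further down the thread.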
        
           | josefx wrote:
           | They already make them optional for single nested statements,
           | which causes a gigantic mess when you extend code and forget
           | to add them.
           | 
           |     if (a)
           |         doThis();
           |         andThat();
           | 
           | is interpreted as
           | 
           |     if(a){
           |         doThis();
           |     }
           |     andThat();
           | 
           | Some compilers are nice enough to throw around misleading
           | indentation warnings, but without an explicit block
           | termination like FI this just causes issues all over the
           | place.
        
             | skocznymroczny wrote:
             | He meant making the (a) part optional to make "if a"
             | instead. Parentheses are (), {} are braces :)
        
               | josefx wrote:
               | One of the few times I get to remember that English is
               | only my second language. Can't really think of a reason
               | for not dropping them unless there is a weird corner case
               | in the existing grammar.
        
               | iainmerrick wrote:
               | If you dropped the parentheses around the condition, I
               | think you'd have to make the braces around the body
               | mandatory.
               | 
               | Many (most?) C style guides make the braces mandatory,
               | but it'd be a big step to actually change the language
               | syntax that way. Tons of existing code would need to be
               | updated... although I guess that could be automated!
        
               | OskarS wrote:
               | You could probably do an either or thing: "You can drop
               | the parentheses if you use braces. But you can't drop
               | both". Though you're right, this is why it'll never
               | happen.
               | 
               | Basically, C had a choice of eliding the parentheses
               | around the condition or the braces around a single-
               | statement block, and they chose the braces. With a half-
               | century of hindsight, that was probably the wrong choice.
        
               | thaumasiotes wrote:
               | > One of the few times I get to remember that English is
               | only my second language.
               | 
               | Most native speakers aren't aware of the terms. They'll
               | call anything by any name.
               | 
               | There is a significant constituency for "curly brackets
               | {}" and "square brackets []".
               | 
               | Anyway, there are mistakes you could make that would give
               | you away as a nonnative speaker, but that wasn't one of
               | them.
        
               | dragonwriter wrote:
               | > There is a significant constituency for "curly brackets
               | {}" and "square brackets []".
               | 
               | Yes, but there is still basically no constituency for
               | "parentheses" ("{}") or "parentheses" ("[]"), except in
               | some narrow specific contexts; in programming, one
               | example would be discussion of certain Lisp dialects
               | where either all or certain uses of "()" in classic Lisp
               | can (or, in the "certain" case, sometimes _must_ ) use
               | the others, in which context "parentheses" are sometimes
               | used generically for all three.
               | 
               | So, while there is a variety of ways paired delimiters
               | are described by native speakers, the particular use here
               | was still outside of the normal range of that variation.
        
               | jen20 wrote:
               | The GP is only correct in America, and only in formal
               | usage.
               | 
               | In everyday usage in the UK, I'd wager that most refer to
               | () as brackets, [] as square brackets and {} as curly
               | brackets.
        
               | wheybags wrote:
               | In everyday use, by non programmers, they're all just
               | brackets.
        
               | dllthomas wrote:
               | I never realized this. Does "parenthetical" still have
               | the same meaning?
        
               | camtarn wrote:
               | Yep.
               | 
               | Fun fact: parenthesis / parenthetical can mean 'a word or
               | phrase inserted as an explanation or afterthought into a
               | passage which is grammatically complete without it, in
               | writing usually marked off by brackets, dashes, or
               | commas.' (Merriam-Webster) - so you can have a
               | parenthetical statement without using actual parentheses.
        
               | drekipus wrote:
               | would that count?
        
               | [deleted]
        
         | agumonkey wrote:
         | ha, I was googling FORTRAN C macros and went nowhere, that was
         | the page I was looking for :)
        
       | auvi wrote:
       | If I remember correctly, the first Bourne Shell was written in a
       | Pascal-ish C.
        
         | agumonkey wrote:
         | It's interesting, first time I wrote C was after learning
         | programming through Java. My "C" code was all new_<type>(..) ..
         | I couldn't not think in Java syntax.
        
       | tromp wrote:
       | Lennart Augustsson once wrote the least-Haskell Haskell program,
       | amazingly without the help of a preprocessor:
       | 
       | https://augustss.blogspot.com/2009/02/regression-they-say-th...
       | 
       | And to answer the question in the top comment:
       | 
       | https://hackage.haskell.org/package/BASIC-0.1.5.0/docs/Langu...
        
       | kazinator wrote:
       | > _Caught me. I had recently heard that Arthur Whitney, author of
       | the A+, k, and q languages (which are array programming languages
       | like APL and J), would use the C preprocessor to create his own
       | language and then write the implementation of his language in
       | that self-defined language._
       | 
       | Stephen Bourne did this in the sources of the Bourne shell,
       | around 1977, to make C look like Algol.
       | 
       | Here it is in V7 Unix, 1979:
       | 
       | https://www.tuhs.org/cgi-bin/utree.pl?file=V7/usr/src/cmd/sh...
       | 
       | https://www.tuhs.org/cgi-bin/utree.pl?file=V7/usr/src/cmd/sh...
        
       | Koshkin wrote:
       | I once wrote a set of macros to emulate the syntax of Oberon, and
       | then used that to write some code that could later be easily
       | converted to real Oberon. It was a fun exercise - highly
       | recommended.
        
         | rramadass wrote:
         | Do you mean this?
         | 
         | 1) Write CPP macro language to emulate Oberon syntax.
         | 
         | 2) Write program in the above macro language.
         | 
         | 3) This program looks like Oberon but we now have two ways of
         | "compiling" it; a) Feeding it to the CPP/C Compiler or b)
         | Feeding it to the Oberon compiler directly with a little bit of
         | tweaking.
         | 
         | Have I understood it correctly?
        
           | Koshkin wrote:
           | Yes, you have.
        
       | whatsakandr wrote:
       | Once I macro'd `const auto` to `let` in my C++ program. After a
       | few moments of "haha C++ go rust", I got terrified and undid it.
        
         | krupan wrote:
         | let is now a thing in C++, but not const auto, so probably good
         | that you undid it :-)
        
       | marcodiego wrote:
       | I once saw a C header that defined "BEGIN" as {, "END" as } and
       | other pascalisms. I find it difficult to understand how some
       | people are so reluctant to change their model of thinking.
        
         | bombcar wrote:
         | I believe I've seen stuff like this used as partial help when
         | transcribing a program from Pascal or Fortran to C, from before
         | the era of automatic tooling to help.
         | 
         | Whether they'd ever go back and finish the migration is not
         | known.
        
         | pickledcods wrote:
         | The biggest pitfall with manual bulk transcribing Pascal to C
         | back in the day was that operator precedence in the two
         | languages is really different. Not only is their model of
         | thinking different, it is also wrong.
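One concrete instance of that pitfall, as a sketch. It assumes a Pascal dialect (Turbo Pascal, for example) where `and` also works on integers; there `and` binds tighter than `=`, so `flags and mask = 0` means `(flags and mask) = 0`. In C, `==` binds tighter than `&`. The function names are made up for illustration:

```c
#include <assert.h>

/* What the Pascal source `flags and mask = 0` meant. */
int mask_is_clear(unsigned flags, unsigned mask)
{
    return (flags & mask) == 0;
}

/* What the token-for-token transcription `flags & mask == 0` means in C.
 * The inner parentheses only make the compiler's own grouping visible. */
int mask_is_clear_naive(unsigned flags, unsigned mask)
{
    return (flags & (unsigned)(mask == 0)) != 0;
}
```

With `flags = 5` and `mask = 2` the first function returns 1 and the second returns 0, so a mechanical `and` to `&` substitution silently changes behaviour.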
        
       | edrxty wrote:
       | Yet another pre-processor-to-victory post. Check the header in
       | the source.
       | 
       | If we're going to use crazy header files, I want to see someone
       | get the linux kernel to build and boot while including this:
       | https://gist.github.com/aras-p/6224951
        
         | [deleted]
        
         | ARandomerDude wrote:
         | I don't quite understand these lines:
         | 
         |     #define ?
         |     #define do
         | 
         | I was under the impression it was "#define CNAME value" - what
         | does it mean when there is no value? A trip to Google didn't
         | turn up anything for me, so I'm wondering if a C master can
         | weigh in. Thanks!
        
           | cperciva wrote:
           | It defines those to a value of "", effectively stripping them
           | from the source code.
           | 
           | The most common place where this sort of thing occurs is the
           | common idiom
           | 
           |     #ifdef DEBUG
           |     #define debug(...) realdebug(__VA_ARGS__)
           |     #else
           |     #define debug(...)
           |     #endif
           | 
           | which allows you to sprinkle debug() calls through your code
           | and have them disappear if you compile without defining the
           | macro DEBUG.
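A small sketch of both cases: an object-like macro with an empty replacement list (`call`, as in the header quoted earlier in the thread) and the `debug()` idiom. Using `fprintf` in place of `realdebug`, and the function name `half`, are assumptions for illustration:

```c
#include <assert.h>
#include <stdio.h>

/* Empty replacement list: every later `call` token is simply deleted. */
#define call

#ifdef DEBUG
#define debug(...) fprintf(stderr, __VA_ARGS__)
#else
#define debug(...)   /* expands to nothing; only a bare `;` remains */
#endif

int half(int n)
{
    assert(n % 2 == 0);          /* precondition for this sketch */
    debug("halving %d\n", n);    /* disappears unless built with -DDEBUG */
    call return n / 2;           /* preprocessed to: return n / 2; */
}
```

Running the file through `cc -E` shows the effect directly: the `call` token and the `debug(...)` invocation are gone from the preprocessed output.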
        
             | asveikau wrote:
             | Of course for your quoted use it's more common to see:
             | 
             |     #define debug(...) ((void)0)
             | 
             | This forces you to use debug() as a statement. If no value
             | is defined you could omit the semicolon on a debug()
             | statement and it still compiles.
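The difference can be seen side by side in a short sketch. The macro names `debug_empty` and `debug_void` and the function `percent` are hypothetical:

```c
#include <assert.h>

#define debug_empty(...)              /* expands to nothing */
#define debug_void(...)  ((void)0)    /* expands to a no-op expression */

int percent(int part, int whole)
{
    debug_empty("percent(%d, %d)\n", part, whole)   /* no ';', still compiles */
    debug_void("percent(%d, %d)\n", part, whole);   /* drop this ';' and it won't */
    assert(whole != 0);   /* precondition for this sketch */
    return 100 * part / whole;
}
```

The standard `assert` macro uses the same trick: when `NDEBUG` is defined, `<assert.h>` defines `assert(ignore)` as `((void)0)`.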
        
         | avian wrote:
         | > #define M_PI 3.2f
         | 
         | A long time ago, a friend who was studying mathematics at the
         | time, approached me laughing hysterically and showed me a page
         | in an "intro to C"-style book. It showed an example of how one
         | would write a "get circumference of a circle" function. At the
         | top of the code, there was a #define for the value of pi.
         | 
         | The text describing the code said something like this about why
         | pi is #defined and not included directly in the expression:
         | 
         | "We define pi as a constant for two reasons: 1) it makes the
         | expressions using it more readable 2) should the value of pi
         | ever change, we will only have to change it in one place in the
         | code"
        
           | lqet wrote:
           | This #define would have almost been required by law in
           | Indiana: https://en.wikipedia.org/wiki/Indiana_Pi_Bill
        
             | jonasenordin wrote:
             | Add Pi to locale?
        
           | abdusco wrote:
           | This is just too good. It must be a joke on the author's part
           | :D
        
             | izietto wrote:
             | Actually I find it pretty clever, because the value of pi
             | may change in a programming context, as can its precision.
        
               | Hamuko wrote:
               | What, you need more than 3.2f?
        
               | fpoling wrote:
               | On a more serious note, 22.0/7, or 3.1 in base 7, is a
               | rather good approximation, sufficient for quite a few
               | problems and good for calculating things in one's head.
        
               | crdrost wrote:
               | Fun fact, 22/7 is only accurate to one part in 2,500, or
               | 0.04%, netting you one extra digit (i.e. 3.14 is less
               | accurate with 3 digits but 3.142 is more).
               | 
               | Most remember 3.14159 which is way better than 22/7, but
               | pi has an even better unusual rational approximation,
               | 355/113 gets you _two_ extra digits, it is accurate to
               | one part in 12 million, or 0.000008% . So you have to go
               | to 3.1415927 to beat it.
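These figures are easy to check. The sketch below measures the relative error of each approximation against a double-precision reference value; the names `PI_REF` and `rel_error` are made up for illustration:

```c
#include <assert.h>

/* Reference value of pi, more digits than a double can hold. */
static const double PI_REF = 3.14159265358979323846;

/* Relative error of an approximation to pi. */
double rel_error(double approx)
{
    double diff = approx - PI_REF;

    assert(approx > 0.0);          /* precondition for this sketch */
    if (diff < 0.0)
        diff = -diff;
    return diff / PI_REF;
}
```

`rel_error(22.0 / 7)` comes out at roughly 4.0e-4 (about one part in 2,500) and `rel_error(355.0 / 113)` at roughly 8.5e-8 (about one part in 12 million), matching the numbers in the comment.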
        
               | cpeterso wrote:
               | JPL answers the question: How Many Decimals of Pi Do We
               | Really Need?
               | 
               | > For JPL's highest accuracy calculations, which are for
               | interplanetary navigation, we use 3.141592653589793.
               | 
               | https://www.jpl.nasa.gov/edu/news/2016/3/16/how-many-decimal...
        
               | jazzyjackson wrote:
               | We've been wrong before :)
        
           | spockz wrote:
           | But why is it then 3.2f instead of 3.1? Seeing as 3.14 gets
           | rounded to 3.1. Or is that part of the joke and the reason
           | for why the value needs changing?
        
             | masklinn wrote:
             | Regardless of the actual answer, it can make sense to round
             | it up depending on the use case e.g. if you're calculating
             | the tolerances for a shaft onto which you place a disk,
             | computing the disk diameter (and thus volume, and weight)
             | to be larger than actual provides a safety margin which
             | rounding down would not. Any additional safety margin added
             | afterwards might not be sufficient in all round-down cases.
        
             | avian wrote:
             | Sorry, I should be more clear. This is a line from the
             | header linked by the parent comment that reminded me of
             | this story.
        
           | spullara wrote:
           | That #define is making fun of this:
           | 
           | https://www.forbes.com/sites/kionasmith/2018/02/05/indianas-...
        
           | [deleted]
        
           | ChrisRR wrote:
           | Depends if you're using biblical pi
        
           | susam wrote:
           | There is a similar joke in the book Learning Perl, 3rd
           | Edition by Randal L. Schwartz and Tom Phoenix. I wrote a post
           | about it here fourteen years ago:
           | https://susam.net/blog/from-perl-to-pi.html . Quoting from
           | the book:
           | 
           | "It's easier to type $pi than π, especially if you don't have
           | Unicode. And it will be easy to maintain the program in case
           | the value of π ever changes."
           | 
           | There is also a comment by Randal Schwartz in the comments
           | section where he credits Tom Phoenix with that particular bit
           | of humour.
        
             | avian wrote:
             | Good catch! I might be misremembering this event and it was
             | actually Perl, not C. I see Learning Perl, 3rd Edition was
             | published in 2001, which does put its publication at about
             | the right time in early 00s.
        
           | TonyTrapp wrote:
           | As funny as it sounds, there is still a bit of truth in
           | there, though: code might change from using floats to double
           | for example, so you might want to replace single-precision
           | constants by double-precision constants. Only need to replace
           | a single pi constant in that case. :)
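A sketch of that setup, assuming a project-wide scalar typedef. The names `real`, `PI`, and `circumference` are hypothetical:

```c
#include <assert.h>

/* Switching the build from float to double means editing only the
 * typedef; the constant is written with enough digits for either. */
typedef float real;
#define PI ((real)3.14159265358979323846)

real circumference(real radius)
{
    assert(radius >= 0);   /* precondition for this sketch */
    return 2 * PI * radius;
}
```

Changing `typedef float real;` to `typedef double real;` upgrades both the constant and every function written in terms of `real`, with no other edits.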
        
         | brian_herman wrote:
         | Can I do it with an #ifndef?
        
         | AtlasBarfed wrote:
         | Oh my the defines. What is language and semantics?
         | 
         | "When I use a word," Humpty Dumpty said, in a rather scornful
         | tone, "it means just what I choose it to mean - neither more
         | nor less."
         | 
         | Very ugly, ill-defined, uncodified democracy. No one tells you
         | when you vote, no one counts up the votes, there are no
         | official results.
         | 
         | When you speak or communicate with someone else, you are voting
         | with them. "I vote this word means X, for a poor/crude/coarse
         | agreement of what X is in the first place."
         | 
         | And this is propaganda at its finest. Evil doublespeak: say one
         | thing, and it actually is another. Snuck in.
         | 
         | Fantastic.
         | 
         | Related: https://github.com/Droogans/unmaintainable-code
        
       ___________________________________________________________________
       (page generated 2022-02-21 23:02 UTC)