[HN Gopher] I wrote the least-C C program I could
___________________________________________________________________
I wrote the least-C C program I could
Author : ingve
Score : 235 points
Date : 2022-02-21 06:55 UTC (16 hours ago)
(HTM) web link (briancallahan.net)
(TXT) w3m dump (briancallahan.net)
| DougBTX wrote:
| A "least-C" needs lazy evaluation at a minimum :-)
| ivxvm wrote:
| My thoughts exactly. It's just C with a slightly different
| syntax. Would be way more interesting if it was using lazy
| evaluation, or maybe some other kind of term rewriting,
| possibly with garbage collection or some smart miniature region
| based memory management.
| WalterBright wrote:
| > I use lots of characters that look like ASCII but are in fact
| not ASCII but nonetheless accepted as valid identifier
| characters.
|
| Clever, I was wondering how the : was done, but it's an
| abomination :-/
|
| With some simple improvements to the language, about 99% of the C
| preprocessor use can be abandoned and deprecated.
| Koshkin wrote:
| In C++, anyway. C's expressiveness, on the other hand, is
| pretty weak, and a preprocessor is very useful there.
|
| A better preprocessor (a C code generator, effectively) would
| be a simple program that would interpret the <% and %> brackets
| or similar (by "inverting" them). It is a very powerful paradigm.
| WalterBright wrote:
| You're talking about metaprogramming. I've seen C code that
| does metaprogramming with the preprocessor.
|
| If you want to use metaprogramming, you've outgrown C and
| should consider a more powerful language. There are plenty to
| pick from. DasBetterC, for example.
| WalterBright wrote:
| To clarify, what is needed are:
|
| 1. static if conditionals
|
| 2. version conditionals
|
| 3. assert
|
| 4. manifest constants
|
| 5. modules
|
| I occasionally find macro usages that would require templates,
| but these are rare.
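Two items on the list above (assert and manifest constants) already have rough counterparts in standard C. The following is a minimal sketch, assuming a C11 compiler; `BUFFER_SIZE`, `buffer`, and `buffer_capacity` are hypothetical names used only for illustration.

```c
#include <limits.h>

/* Manifest constant without the preprocessor: an enumerator is an
   integer constant expression, so it can size an array. */
enum { BUFFER_SIZE = 64 };

static char buffer[BUFFER_SIZE];

/* Compile-time assertions (C11): the build fails if a condition is false. */
_Static_assert(sizeof buffer == BUFFER_SIZE, "buffer must be BUFFER_SIZE bytes");
_Static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit bytes");

/* Reports the capacity that the assertions above have already checked. */
unsigned buffer_capacity(void)
{
    return (unsigned)sizeof buffer;
}
```

The other items (static if, version conditionals, modules) have no direct equivalent in C, which is where `#if` and `#include` still do the work.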
| OskarS wrote:
| One other thing that would be great that sometimes people use
| the preprocessor for is having the names of variables/enums as
| runtime strings. Like, if you have an enum and a function to
| get the string representation for debug purposes (i.e. the
| name of the enum as represented inside the source code):
|
|     typedef enum { ONE, TWO, THREE } my_enum;
|     const char* getEnumName(my_enum val);
|
| you can use various preprocessor tricks to implement
| getEnumName such that you don't have to change it when adding
| more cases to the enum. This would be much better implemented
| with some compiler intrinsic/operator like `nameof(val)` that
| returned a string. C# does something similar with its
| `nameof`.
| Someone wrote:
| > you can use various preprocessor tricks to implement
| getEnumName such that you don't have to change it when
| adding more cases to the enum.
|
| For those who don't know: the X Macro
| (https://en.wikipedia.org/wiki/X_Macro,
| https://digitalmars.com/articles/b51.html)
| OskarS wrote:
| Hey, even an article written by Walter, that's a fun
| coincidence! :)
|
| This is slightly different than the form I've seen it,
| but same idea: in the version I've seen, you have a
| special file that's like "enums.txt" with contents like
| (warning, not tested):
|
|     X(red)
|     X(green)
|     X(blue)
|
| and then you write:
|
|     typedef enum {
|         #define X(x) x,
|         #include "enums.txt"
|         #undef X
|     } color;
|
|     const char* getColorName(color c) {
|         switch (c) {
|             #define X(x) case x: return #x;
|             #include "enums.txt"
|             #undef X
|         }
|         return "unknown";
|     }
|
| Same idea, just using an #include instead of listing them
| in a macro. Thinking about it, it's sort-of a compile
| time "visitor pattern".
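For comparison, here is a sketch of the single-file variant of the X macro, where the list lives in a macro instead of an included file. The names `COLOR_LIST`, `AS_ENUMERATOR`, `AS_CASE`, `COLOR_COUNT`, and `color_name` are hypothetical and not taken from either linked article.

```c
/* Hypothetical color list; each entry is written exactly once. */
#define COLOR_LIST(X) \
    X(RED)            \
    X(GREEN)          \
    X(BLUE)

/* First expansion: each entry becomes an enumerator. */
#define AS_ENUMERATOR(name) name,
typedef enum { COLOR_LIST(AS_ENUMERATOR) COLOR_COUNT } color;
#undef AS_ENUMERATOR

/* Second expansion: each entry becomes a case that returns its own name. */
const char *color_name(color c)
{
    switch (c) {
#define AS_CASE(name) case name: return #name;
        COLOR_LIST(AS_CASE)
#undef AS_CASE
    default:
        return "unknown";
    }
}
```

Adding a color means adding one `X(...)` line; the enum and the name lookup both pick it up on the next build.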
| Koshkin wrote:
| I like that ONE == 0.
| OskarS wrote:
| Did not even think about that :) Just so used to thinking
| of enums like that as opaque values.
| BruceEel wrote:
| Walter, D has conditional compilation, versioning and CTFE
| without preprocessor so I guess that covers the 99% "sane"
| functionality. Where do you draw the line between that and the
| 1% abomination part, i.e. your thoughts on, say, compile time
| type introspection and things like generating ('printing')
| types/declarations?
| WalterBright wrote:
| The abomination is using the preprocessor to redefine the
| syntax and/or invent new syntax. Supporting identifier
| characters that look like `:` is just madness.
|
| Of course, I've also opined that Unicode supporting multiple
| encodings for the same glyph is also madness. The Unicode
| people veered off the tracks and sank into a swamp when they
| decided that semantic information should be encoded into
| Unicode characters.
| scatters wrote:
| That ship sailed long before Unicode. Even ASCII has
| characters with multiple valid glyphs (lower case a can
| lose the ascender, and lower case g is similarly variable
| in the number of loops), not to mention multiple characters
| that are often represented with the same glyph (lower case
| l, upper case I, digit 1).
| WalterBright wrote:
| That's a font issue with some fonts, not a green light
| for blessing multiple code points with the exact same
| glyph.
|
| In fact, having a font that makes l I and 1
| indistinguishable is plenty of good reason to NOT make
| this a requirement.
| foxfluff wrote:
| > The Unicode people veered off the tracks and sank into a
| swamp when they decided that semantic information should be
| encoded into Unicode characters.
|
| As if that weren't enough, they also decided to cram half-
| assed formatting into it. You got bold letters, italics,
| various fancy-style letters, superscripts and subscripts
| for this and that.. all for the sake of legacy
| compatibility. Unicode was legacy right from the beginning.
| pornel wrote:
| The "fonts" in Unicode are meant to be for math and
| scientific symbols, and not a stylistic choice. Don't use
| them for text, as it can be a cacophony in screen
| readers.
|
| Unicode chose to support _lossless_ conversion to and
| from other encodings it replaces (I presume it was
| important for adoption), so unfortunately it inherited
| the sum of everyone else's tech debt.
| WalterBright wrote:
| Unicode did worse than that. They added code points to
| esrever the direction of text rendering. Naturally, this
| turned out to be useful for injecting malware into source
| code, because having the text rendered backwards and
| forwards _erases_ the display of the malware, so people
| can't see it.
|
| Note that nobody needs these code points to reverse text.
| I did it above without gnisu those code points.
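One way a tool could guard against the attack described above is to scan source text for the direction-changing code points before it is reviewed. The following is a rough sketch, assuming UTF-8 input and covering only the explicit embedding, override, and isolate controls (U+202A to U+202E and U+2066 to U+2069); `find_bidi_control` is a hypothetical helper, not part of any existing tool.

```c
#include <stddef.h>

/* Returns the byte offset of the first bidirectional control character
   in a UTF-8 buffer, or -1 if there is none.
   U+202A..U+202E encode as E2 80 AA..AE,
   U+2066..U+2069 encode as E2 81 A6..A9. */
long find_bidi_control(const unsigned char *s, size_t len)
{
    for (size_t i = 0; i + 2 < len; i++) {
        if (s[i] != 0xE2)
            continue;
        if (s[i + 1] == 0x80 && s[i + 2] >= 0xAA && s[i + 2] <= 0xAE)
            return (long)i;
        if (s[i + 1] == 0x81 && s[i + 2] >= 0xA6 && s[i + 2] <= 0xA9)
            return (long)i;
    }
    return -1;
}
```

A pre-commit hook or editor plugin could call something like this on each file and refuse or highlight any hit.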
| WalterBright wrote:
| Yeah, where do you stop when you start adding fonts to
| Unicode?
| [deleted]
| bombcar wrote:
| We have emojis so we're probably not far from Unicode
| characters that blink.
| hug wrote:
| If #include <cursive.h> is wrong, I don't want to be
| right.
| foxfluff wrote:
| I like using bold and italic text just to mess wit anyone
| wo's trying to use te searc function in teir browser or
| editor. SLL s R g'R sIILR g'g'. t gets extra fun in rich
| text editors where you can mix unicode styles and real
| styles. prnoivcioddees oepnpdolretsusnities coonrfusing
| anedol stohfetiwrare.
| mananaysiempre wrote:
| What other kind of difference should be encoded into
| Unicode characters? For example, the glyphs for the Latin
| _a_ and the Cyrillic _a_ , or the Latin _i_ and the
| Cyrillic (Ukrainian, Belarusian, and pre-1918 Russian) _i_
| look identical in practically every situation, and the
| Latin (Turkish) _i_ and the Greek _i_ aren't far off. At
| least not far off compared to the Cyrillic (most languages)
| _d_ and the Cyrillic (Southern) _g_ -like version (from the
| standard Cyrillic cursive), or the Cyrillic _t_ and the
| _several_ Cyrillic (Southern) versions that are like either
| an _m_ or a turned _m_ (from the cursive, again). Yet most
| people who are acquainted with the relevant languages would
| say the former are different "letters" (whatever that
| means) and the latter are the same.
|
| [Purely-Latin borderline cases: umlaut (is _not_ two dots
| in Fraktur) vs diaeresis (languages that use it are not
| written in Fraktur), acute (non-Polish, points past the
| letter) vs kreska (Polish, points _at_ the letter). On the
| other hand, the mathematical "element of" sign was still
| occasionally typeset as an epsilon well into the 1960s.]
|
| Unicode decides most of these based on the requirement to
| roundtrip legacy encodings ("have these been ever encoded
| differently in the same encoding?"), which seems
| reasonable, yet results in homograph problems and at the
| same time the Turkish case conversion botch. In any case,
| once (sane) legacy encodings run out but you still want to
| be consistent, what _do_ you base the encoding decisions on
| but semantics? (On the other hand, once you start encoding
| semantic differences, where do you stop?..) You _could_ do
| some sort of glyph-equivalence-class thing, but that would
| still give you no way to avoid unifying _a_ and _a_ --
| _everyone_ who writes both writes them the same.
|
| None of this touches on Unicode "canonical equivalence",
| but your claim ("Unicode supporting multiple encodings for
| the same glyph is [...] madness") covers more than just
| that if I understood it correctly. And while I am attacking
| it in a sense, it's only because I genuinely don't see how
| this part could have been done differently in a major way.
| cestith wrote:
| I'm obviously not Walter, but I have a succinct answer
| that may upset a few people, but avoids a lot of
| confusion at the same time.
|
| The idea of a letter in an alphabet and a printable glyph
| for that letter are two different ideas. Unicode could
| have and probably should have had a two-layer encoding
| where the letters are all different but an extra step
| resolves letters to glyphs. Where one glyph can represent
| more than one letter, a modifier can be attached to
| represent the parent alphabet so no semantic information
| is lost. Comparison for "same character" would be at the
| glyph level without modifiers, and we could have avoided
| a bunch of different Unicode equivalence testing
| libraries that have to be individually written,
| maintained, and debugged. Use in something like a spell
| checker, conversion to other character sets, or
| stylization like cursive could have used the glyph and
| source-language modifier both.
| mananaysiempre wrote:
| (I expect Walter probably has better things to do than to
| reply to random guys on the 'net, but we can always hope,
| and I was curious :) )
|
| First off, Unicode cursive (bold, Fraktur, monospace,
| _etc._ ) Latin letters are not meant to be styles, they
| are mathematical symbols. Of course, that doesn't mean
| people aren't going to use them for that[1], and I'm not
| convinced Unicode should have gotten into that particular
| can of worms, but I think you can consistently say that
| the difference between, for example, an italic X for the
| length of a vector and a bold X for the vector itself (as
| you could encounter in a mechanics text) is not (just)
| one of style. Similarly for the superscripts and modifier
| letters--a [ph] and a [pk] or a [kj] and a [kj] in an IPA
| transcription (for which the modifiers are intended)
| denote very different sounds (granted, ones that are
| unlikely to be used at the same time by a single speaker
| in a single language, but IPA is meant to be more general
| than that).
|
| (Or wait, was this a reply to my point about Russian vs
| Bulgarian _d_? The Bulgarian one is not a cursive
| variant, it's derived from a cursive one but is a
| perfectly normal upright letter in both serif and sans-
| serif, that looks exactly the same as a Latin "single-
| storey" g as in most sans-serif fonts but never a Latin
| "double-storey" g as in most serif fonts, and printed
| Bulgarian _only_ uses that form--barring font problems--
| while printed Russian never does. I guess you could
| declare all of those to be variants of one another, even
| if it's wrong etymologically, but even to a Cyrillic user
| who has never been to Bulgaria that would be quite
| baffling.)
|
| As to your actual point, I don't think the comparison you
| describe could be made language-independent enough that
| you wouldn't still end up needing to use a language-
| specific collation equivalence at the same time (which
| seems to be your implication IIUC). _E.g._ a French
| speaker would usually want _oe_ and _oe_ to compare the
| same but different from _o_ -diaeresis, but a German
| speaker might (or might not) want _oe_ and _o_ -umlaut to
| compare the same, while every font renders _o_ -diaeresis
| and _o_ -umlaut exactly the same. French speakers (but
| possibly not in every country?) will almost always drop
| diacritics over capital letters, and Russian speakers
| frequently turn _io_ ( /jo/, /o/) into _e_ ( /je/, /e/)
| except in a small set of words where there's a
| possibility of confusion (the surnames Chebyshev and
| Gorbachev, which end in _-iov_ /-of/, are well-known
| victims of this confusion). _A_ is a stylistic variant of
| _aa_ in Norwegian, but a speaker of Finnish (which
| doesn't use _a_ ) would probably be surprised if forced
| to treat them the same.
|
| And that's only in Europe--what about Arabic, where
| positional variants can make (what speakers think of) a
| single letter look very different. Even _in_ Europe,
| should _s_ and _s_ be "the same glyph"? They certainly
| have the same phonetic value, and you always have to use
| one or the other...
|
| Of course, we already have a (font-dependent) codepoint-
| to-glyph translation in the guise of OpenType shaping,
| but it's not particularly useful for anything but display
| (and even there it's non-ideal).
|
| [1] https://utcc.utoronto.ca/~cks/space/blog/tech/PeopleA
| lwaysEx...
| pvg wrote:
| _printed Bulgarian only uses that form_
|
| This is a total pedantitangent but I don't think that's
| actually true. These wikipedia pages don't talk about it
| directly but I think give a bit of the flavour/related
| info that suggest it's not nearly that set in stone:
|
| https://bg.wikipedia.org/wiki/%D0%91%D1%8A%D0%BB%D0%B3%D0
| %B0...
|
| https://bg.wikipedia.org/wiki/%D0%93%D1%80%D0%B0%D0%B6%D0
| %B4...
|
| The second one, in particular, says early versions of
| Peter I's Civil Script had the g-looking small d, so
| these variants have been used concurrently for some time.
| cestith wrote:
| I made no mention of collation, alternate compositions,
| or of fonts. All I'm saying is that Unicode from the
| beginning could have had capital alpha and capital Latin
| 'A' been the same glyph with a glyph-part representation
| and a separate letter-part representation could have made
| clear which was which. O-with-umlaut and o-with-diaeresis
| could have been done the same. Since you've mentioned
| fonts, I'll carry on through that topic. Rather than
| having two code points with two different entries in
| every font, we could have considered the glyph and the
| parent alphabet as two pieces of data and had one entry
| in the font for the glyph.
| wyldfire wrote:
| Ignoring Unicode and focusing just on C: if the glyph
| matches a glyph used in any existing C operator maybe it
| shouldn't be legal as an identifier character.
| mananaysiempre wrote:
| I'm not defending either standard Unicode identifiers or
| C Unicode identifiers (which are, incidentally, very
| different things, see WG14/N1518), no :) The Agda people
| make good use of various mathematical operators,
| including ones that are very close to the language syntax
| ( _e.g._ colon as built-in type ascription and equals as
| built-in definition, but Unicode colon-equals as a
| substitution operator for a user-defined type of terms in
| a library for processing syntax), but overall I'm not
| convinced it's worth it at all.
|
| As a way to avoid going ASCII-only, though, excluding
| only things that look like syntax might be simultaneously
| not going far enough (how are homograph collisions
| between user-defined identifiers any better?) and too far
| (reliably transplanting identifiers between languages that
| use different sets of punctuation seems like it'd be
| torturously difficult).
| WalterBright wrote:
| It's a good question. The answer is straightforward.
| Let's say you saw `i` in a book. How would you know if it
| is Latin or Cyrillic?
|
| By the context!
|
| How would a book distinguish `a` as in `apple` from `a`
| as in `a+b`? (Unicode has a separate letter a from a math
| a.)
|
| By the context!
|
| This is what I meant by Unicode has no business adding
| semantic content. Semantics come from context, not from
| glyph. After all, what if I decided to write:
|
| (a) first bullet point
|
| (b) second bullet point
|
| Now what? Is that letter a or math symbol a? There's _no
| end_ to semantic content. It's _impossible_ to put this
| into Unicode in any kind of reasonable manner. Trying to
| do it leads one into a swamp of hopelessness.
|
| BTW, the attached article is precisely about deliberately
| _misusing_ identical glyphs in order to _confuse_ the
| reader because the C compiler treats them differently.
| What better case for semantic content for glyphs being a
| hopelessly wrongheaded idea.
| ReleaseCandidat wrote:
| > With some simple improvements to the language, about 99% of
| the C preprocessor use can be abandoned and deprecated.
|
| Arguably the C feature most used in other languages is the C
| preprocessor's conditional compilation for e.g. different OSes.
| Used by languages from Fortran (yes, there exists FPP now - for
| a suitable definition of 'now') to Haskell (yes, `{-# LANGUAGE
| CPP #-}`).
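As an illustration of the conditional-compilation use mentioned above, here is a minimal sketch that picks a string per operating system. It relies on the compiler-predefined macros `_WIN32`, `__APPLE__`, and `__linux__`; the function name `platform_name` is hypothetical.

```c
/* Selects a string at compile time from compiler-predefined macros.
   Only one branch survives preprocessing on any given platform. */
const char *platform_name(void)
{
#if defined(_WIN32)
    return "windows";
#elif defined(__APPLE__)
    return "apple";
#elif defined(__linux__)
    return "linux";
#else
    return "unknown";
#endif
}
```

The branches that are not selected are never seen by the compiler proper, which is why they may contain code that would not even compile on the current platform.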
| omgmajk wrote:
| Hasn't everyone done at least something similar to this? I'm
| surprised, I re-define C quite often when I'm bored.
| kajal7052 wrote:
| RegW wrote:
| Yep. Been there. Done that.
|
| In the 80's I worked for a guy who insisted that we wrote all our
| C using macros that made it look like FORTRAN, amongst much other
| nonsense. How fondly I remember the many hilarious hours spent
| trying to pin down the cause of unexpected results.
|
| I don't remember any specific examples, but consider:
|
| #define SQ(v) v*v
|
| int sq = SQ(++v);
| Too wrote:
| Classic pitfall with any type of text-based preprocessor. All
| variables inside need an excessive amount of parentheses.
|
| In a similar vein there is also the "do {} while (0)" trick to
| allow macros to appear like normal functions and end with a
| semicolon.
|
| Don't even want to imagine how many more hacks would be needed
| to transform into another syntax using macros only.
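The two pitfalls discussed above can be sketched as follows. `SQ_BAD` mirrors the `SQ` macro from the parent comment; `SQ_PAREN`, `sq`, `SWAP_INT`, and `order_pair` are hypothetical names added for illustration.

```c
/* Unparenthesized: SQ_BAD(1 + 2) expands to 1 + 2*1 + 2, which is 5. */
#define SQ_BAD(v) v*v

/* Parenthesized: SQ_PAREN(1 + 2) expands to ((1 + 2)*(1 + 2)), which is 9.
   The argument is still evaluated twice, so SQ_PAREN(++v) stays undefined. */
#define SQ_PAREN(v) ((v)*(v))

/* A function evaluates its argument exactly once, so sq(++v) is safe. */
static inline int sq(int v)
{
    return v * v;
}

/* do { } while (0) turns several statements into a single statement that
   must be followed by a semicolon, like a normal function call. */
#define SWAP_INT(a, b) do { int tmp_ = (a); (a) = (b); (b) = tmp_; } while (0)

/* Orders two ints so that *lo <= *hi. With a bare { } block in the macro,
   the semicolon after SWAP_INT(...) would detach the else and fail to compile. */
void order_pair(int *lo, int *hi)
{
    if (*lo > *hi)
        SWAP_INT(*lo, *hi);
    else
        (void)0; /* already ordered */
}
```

Parentheses fix precedence but not double evaluation, which is why the function version is usually the safer replacement for expression macros.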
| chx wrote:
| There was a reddit thread about crafty Unicode usage in
| programming a few years ago.
|
| https://www.reddit.com/r/rust/comments/5penft/parallelizing_...
|
| > If you look closely, those aren't angle brackets, they're
| characters from the Canadian Aboriginal Syllabics block, which
| are allowed in Go identifiers. From Go's perspective, that's just
| one long identifier.
|
| The thread goes downhill _fast_ from there to the point where
|
| > I once wrote a short Swift program with every identifier a
| different length chain of 0-width spaces.
| oefrha wrote:
| Well, pasting the code into VS Code ruined some of the fun, since
| the non-ASCII homoglyphs are now highlighted by default.
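The kind of check an editor performs here can be approximated very crudely by flagging any byte outside 7-bit ASCII. This is only a sketch under the assumption that the source is UTF-8; it says nothing about how VS Code actually implements its highlighting, and `count_non_ascii_bytes` is a hypothetical name.

```c
#include <stddef.h>

/* Returns the number of bytes in s that are outside the 7-bit ASCII range.
   In UTF-8, every byte of a non-ASCII character has its high bit set. */
size_t count_non_ascii_bytes(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; s++) {
        if ((unsigned char)*s >= 0x80)
            n++;
    }
    return n;
}
```

A result of zero means the text is pure ASCII, so no look-alike identifier characters can be hiding in it.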
| CapsAdmin wrote:
| There's also https://libcello.org/ a popular (?) macro-heavy
| library which makes C feel modern.
| lionkor wrote:
| "modern" meaning less explicit?
| retrac wrote:
| Meaning classes, algebraic data types, pattern matching,
| boxed objects, iterators and garbage collection. All they
| need is smart pointers or a borrow checker and it'd
| practically be C++ or Rust, except it's rather brittle
| because it's just a bunch of macros.
| foxfluff wrote:
| Have you ever seen it used in the wild?
| gmiller123456 wrote:
| Back when I had just learned Pascal, and was beginning to learn
| C, I did some of this. No idea why I thought that would make it
| easier to learn. I did not take it as far as the author of this
| article did. But I did expand it to function calls like "#define
| writeln printf". Looking back, I'm a bit amazed I managed to
| learn it, as I was obviously putting more work into not learning
| C than learning it.
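A rough sketch of what such Pascal-flavoured macros might look like follows. It is not the commenter's actual code: `BEGIN`, `END`, and `greet` are hypothetical, and `writeln` is mapped to `puts` here (rather than `printf`) so that it appends a newline.

```c
#include <stdio.h>

/* Hypothetical Pascal-flavoured keywords; the compiler only ever sees plain C. */
#define BEGIN {
#define END }
#define writeln(text) puts(text) /* puts appends the newline, like writeln */

/* Prints one line and returns 0; the body reads like Pascal but is ordinary C. */
int greet(void)
BEGIN
    writeln("Hello from not-quite-Pascal");
    return 0;
END
```

After preprocessing, the function body is just `{ puts("..."); return 0; }`, which is why error messages and debuggers still talk in C terms.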
| vidarh wrote:
| It was practically a rite of passage back in the days when
| Pascal and Pascal-like languages were common to do this with
| C....
| [deleted]
| raverbashing wrote:
| > would use the C preprocessor to create his own language and
| then write the implementation of his language in that self-
| defined language
|
| Yeah that sounds like the easiest way to make your colleagues
| hate you
|
| I "love" how we had more languages in the 70s (usually created as
| a one-off project for people with not so much user friendliness
| in mind): think m4, awk, tcl, etc.
| na85 wrote:
| Awk is actually great. M4 not so much.
|
| Some absolute lunatic solved this year's Advent of Code in m4;
| it was impressive.
| erik_seaberg wrote:
| Terraform module args used to be very limited, and I didn't
| know how to generate JSON it would take instead of HCL, so I
| actually used m4 to avoid repeating every template n times.
| And now we are sad because of course Terraform has improved
| quite a bit.
| ThinBold wrote:
| I mean we do have a lot of (perhaps too many) markdown dialects
| today. Wikipedia, wordpress, github, stackexchange, you name
| it. Last time I was using a Q&A forum for a calculus course, it
| used $$ to start and close a MathJax div.
| smcl wrote:
| My fave is Jira, where they have one syntax when creating an
| issue and another for editing it
| fhars wrote:
| Well, at least that is the obvious way to delimit a math div,
| isn't it?
| nine_k wrote:
| Would you consider doing string processing in C rather than in
| Awk or Tcl?
| jfk13 wrote:
| In FORTRAN, thank you.
|
| (It was a long time ago, and there was no C compiler on our
| IBM/370...)
| anthk wrote:
| TCL is basically THE string processing language... because
| everything is a string :p.
|
| For short scripts, awk is nice, but most people would use
| Python nowadays, and die hard Unix greybeards will use Perl
| or TCL depending on the mood.
| pjmlp wrote:
| Version 8.4 changed it a bit.
| raverbashing wrote:
| It is a funny question because I think most languages get
| string processing right. Pascal gets it right.
|
| Except C.
| Koshkin wrote:
| But BSTR, too, is a C construct.
| foxfluff wrote:
| > Yeah that sounds like the easiest way to make your colleagues
| hate you
|
| Well I'm not Whitney's colleague but I really like his code.
| smcl wrote:
| What do you like about it? I don't think it needs to be
| stated why the majority of people here probably hate it, but
| I am curious why anyone would actively like it. I can maybe
| see that there's a sense of achievement in being able to grok
| a codebase that is often described as unreadable
| foxfluff wrote:
| It might be unreadable in the same sense as Chinese or
| Russian is to someone who hasn't learned to read it. Learn
| to read and it turns out not to be unreadable?
|
| I like it because it makes it easier for me to see the big
| picture. The forest and the mountain. It doesn't over-
| emphasize the bark on the trees; it doesn't drag on and
| make me scroll & jump through a maze of boring minute
| detail. At the same time, it doesn't actually bury and hide
| whatever detail there is; it's all there for when you need
| it. Whitney also generally simplifies things a lot and
| avoids tedious contortions others would make for
| portability or some theoretical conception of
| maintainability, readability, or vague "best practices".
| It's very straightforward and -- once you get past the
| Whitney vocabulary and general style -- there are no
| mountains of abstractions and layers you need to grok
| before you can work on the code.
|
| The biggest problem is that after getting used to that
| style, "normal" code starts to feel like kindergarten books
| with very simple sentences written in horse sized capital
| letters, a handful per page. Except that those letters are
| not used to write a short children's story, but a complex
| labyrinthine machine, and the over-emphasis on minute
| detail just obscures the complexity and you end up with
| cross cutting concerns spread over thousands of lines of
| code and many many files. It might look clean and readable
| on the surface, yet: there be dragons. And nobody wants to
| tame those dragons because there's so much code. And then I
| find myself sighing and asking: why on earth do we need an
| entire page for what would really amount to a line of code
| if we didn't insist on spelling everything out like it's
| babby's third program after hello world? It's just tedious.
|
| Whitney certainly writes his own sort of dragons, but it's
| easier to keep them all in your head. For example, the b
| compiler won't work out of the box on multiple platforms.
| I'm fairly confident I could port it without much effort,
| as long as the platform meets some basic requirements.
| smcl wrote:
| Yeah I don't think _anyone_ likes over-engineered
| architecture astronaut code with too many layers and
| unnecessary abstractions, whether that's been formatted
| in the Whitney style or in a more conventional one. I
| think what I can't get over are the short identifiers
| (and filenames) and the way it just looks like a wall of
| text without any breathing space, though looking at
| another example someone posted[0] it seems there's a bit
| more whitespace and structure than I remember.
|
| > Learn to read and it turns out not to be unreadable?
|
| There's the thing, if someone learns Russian they can
| converse with 140 million people in Russia, similarly
| Ukrainians and Belarusians will be fine and they could
| probably make themselves understood through the Caucasus
| and Central Asia. If you learn to read C written in the
| Arthur Whitney style you can "converse" with a fairly
| small number of people who like the Whitney style[1]. So
| taking the example a bit further, I learned the Cyrillic
| alphabet in an afternoon and through knowing another
| Slavic language I can roughly parse the meaning of many
| Russian things I read (audio is another thing entirely, I
| can only pick out a handful of Czech/Russian homophones).
| If I had gotten up to speed with the ngn/k codebase would
| I be productive on one of the projects you wrote in a
| similar style, or is there a similar productivity wall
| where I'd have to first learn some idioms local to your
| codebase?
|
| Sorry for the questions, I know people who like this
| style probably have to answer these questions fairly
| frequently. I am genuinely just quite curious though.
|
| [0] = https://codeberg.org/ngn/k/src/branch/master/h.c
|
| [1] = is there a proper name for this or is it ok to
| refer to it as "Whitney style"?
| foxfluff wrote:
| I don't know if there's a proper name for it. At least
| people who are aware of the style would probably
| recognize Whitney's name so that's the best term I've
| come up with yet.
|
| For me, the end goal isn't Whitney style, but I've been
| pursuing effective programming all my life. When I
| learned to code, I wasn't talking to anyone except my
| computer, and that alone was exciting enough to make it a
| life-long hobby and profession for me.
|
| Do you know what brought me to Hacker News? Arc. And Paul
| Graham's writings about Lisp. The message was never about
| a popular language that everyone speaks. If anything, it
| was rather the opposite: pg saw in lisp a powerful and
| expressive (if niche) language that makes you competitive
| against larger players who stick to boring mainstream
| languages. I wasn't interested in competition or
| startups, but merely in powerful ways to make the
| computer do what you want. I don't particularly care if
| I'm the only person on planet earth willing to wield that
| power; it's for my own enjoyment. Programming for me
| isn't about "product" so "productivity wall" isn't
| something I think about, and complaining about
| productivity wall would be a bit like complaining that
| getting fit for Tour de France is time consuming, why
| don't they just drive it by car?
|
| That said, I think there are people who find K, APL, and
| related languages very productive in their niche. I'm
| definitely not speaking for everyone.
|
| Anyway, it is the curiosity and desire to discover a
| powerful way to command the computer that has driven me
| to study Haskell, APL, PostScript, forth, TLA+, lisps,
| BQN, K, Erlang, and more. "Whitney C" is just one
| milestone along the journey, and I don't know where the
| journey will eventually lead; I'm just not happy with any
| existing language right now.
|
| So the answer is no, learning Whitney C will not make
| someone immediately productive with my code, just as
| learning Java does not make you immediately productive in
| C++. _They are different languages._ However, anything
| you learn can shrink the productivity wall; knowing APL
| or BQN, K, and Whitney C might make it easier to grasp
| whatever I come up with next. That applies to all of
| programming in general though; the more you know, the
| more you know, and some of that knowledge will almost
| always transfer. There will be familiar patterns and
| ideas.
|
| I also think people seriously overestimate the
| productivity wall. As you say, one can learn to read
| cyrillic in an afternoon. Kana in a weekend. But learning
| Russian or Japanese is significantly more work than
| learning the script. In terms of scope, I'd say learning
| APL or Whitney C is closer to learning kana than it is to
| learning Japanese.
|
| (EDIT: I also find it ironic that programmers are
| ostensibly excited about learning new things, yet at the
| same time programmers really love to complain about
| languages that look alien and won't give a few weekends to
| learn them)
| smcl wrote:
| Good to know re the name, I typed it out a couple of
| times and wondered if I was just doing something stupid.
| I really wasn't trying to formulate an attack on this
| style, if someone or some organization uses it and it
| works then more power to them. I was really just trying
| to understand a bit better, but it's possible that is
| something that I can only really get by experience.
|
| So I find APL, J, K and friends quite fascinating (and J
| is on my list to try) but I haven't seen much hostility
| to them. People understandably get a bit intimidated by
| how different it is but they usually still seem curious.
| The real hostility is reserved for Whitney C. In this
| case I don't think it's like - if you'll forgive me for
| abusing that human language metaphor a bit - an English
| speaker learning Russian, more like
| ifAnEnglishSpeakerEncounteredaLocaleWhereEnglishIsWrittenAndStructuredDifferentlyToWhattheyWereUsedToSomethingLikeThis.
|
| I can understand why their instinct is to recoil in
| horror and think "I already know a more standard English,
| I'm currently learning Russian and Japanese ... I have
| little patience for trying out this alternate form of
| English". It's obviously an exaggerated/contrived
| example, but this is genuinely how that C code appears to
| an outsider at first blush (or at least it did to me and
| a couple of my friends).
|
| That said your replies have piqued my interest, I'm gonna
| have to properly dig through that ngn/k repo some day. If
| I turn into a Whitney C guy then I'm holding you
| responsible :D
| foxfluff wrote:
| Yep. It's mostly a knee jerk reaction.
|
| After accustomization with a style that doesn't force
| explicit declarations of identifiers and their types,
| verbose type conversions, line breaks and indentation
| after every statement and brace, et cetera, one could
| definitely make a different (and similarly exaggerated)
| human language metaphor. For example, take some English
| text and feed it through a parser. Feels good to read?
| (S (NP Parsing)
|    (VP refers
|        (PP to
|            (NP (NP the activity)
|                (PP of
|                    (S (VP analysing
|                           (NP a sentence)
|                           (PP into
|                               (NP its component categories and functions))))))))
|    .)
|
| That's a bit how mainstream languages feel after using
| something that hasn't been forced into such an artificial
| form :) If you're willing to let go of that, you can
| write sentences and clauses on the same line, almost like
| prose!
| smcl wrote:
| Hehe that's actually a nice way to put it. So it's a
| little bit like the red pill, you can't really go back
| after embracing Whitney C :D
| fhars wrote:
| I do like the style of kona if I need an example to confuse
| people about what C looks like:
| https://github.com/kevinlawler/kona/tree/master/src
| ulucs wrote:
| That is the commonly accepted way to write k interpreters after
| all. ngn/k looks more like k than C too.
|
| https://codeberg.org/ngn/k/src/branch/master/a.c
| smcl wrote:
| I'm so glad I don't have to work with that code, I would lose
| my mind
| rramadass wrote:
| Check out this Arthur Whitney (Genius Language
| Designer/Programmer) thread -
| https://news.ycombinator.com/item?id=19481505
|
| In the above thread, user "yiyus" actually went through the macro
| code line-by-line and made explanatory notes; unbelievable! -
| https://docs.google.com/document/d/1W83ME5JecI2hd5hAUqQ1BVF3...
| BruceEel wrote:
| There was the "Val Linker"[1] (also for the Amiga, though I can't
| seem to find that version), written in a kind of Pascal-ish C,
| powered by macros. Snippet:
|
|     void get_order_token()
|     BeginDeclarations
|     EndDeclarations
|     BeginCode
|      While token_break_char Is ' '
|       BeginWhile
|        order_token_get_char();
|       EndWhile;
|      copy_string(token, null_string);
|      If IsIdentifier(token_break_char)
|       Then
|        While IsIdentifier(token_break_char)
|         BeginWhile
|          concat_char_to_string(token, token_break_char);
|          order_token_get_char();
|         EndWhile;
|        lowercase_string(token);
|       Else
|        If token_break_char Is '['
|         Then
|          While token_break_char IsNot ']'
|           BeginWhile
|            concat_char_to_string(token, token_break_char);
|            order_token_get_char();
|           EndWhile;
|          order_token_get_char();
|          If case_ignore.val
|           Then
|            lowercase_string(token);
|           EndIf;
|         Else
|          concat_char_to_string(token, token_break_char);
|          order_token_get_char();
|         EndIf;
|       EndIf;
|      return;
|     EndCode
|
| [1]
| https://ftp.sunet.se/mirror/archive/ftp.sunet.se/pub/simteln...
| snarfy wrote:
| cpaint.h
|
|     #include<curses.h>
|     #include<stdio.h>
|     #define ?
|     #define do
|     #define ^ *
|     #define N -1
|     #define call
|     #define main.
|     #define , ];
|     #define = =
|     #define = ==
|     #define ; ];
|     #define not !
|     #define I int
|     #define M 256
|     #define or ||
|     #define end ;}
|     #define CALL }
|     #define S case
|     #define X x<<8
|     #define size [
|     #define <> !=
|     #define var int
|     #define Y y<<16
|     #define begin {
|     #define F FILE*f
|     #define integer
|     #define POOL ;}}
|     #define W p[y*q+x]
|     #define Z (W&(M N))
|     #define packed char
|     #define OK break;case
|     #define procedure int
|     #define fill= return
|     #define close fclose(f)
|     #define readln(a) c=fgetc(f)
|     #define H(a,b) mvaddch(a,b,' ')
|     #define writeChar(a) fputc((a),f)
|     #define open(a,b) F=fopen((a),(b))
|     #define A(a) attron(COLOR_PAIR(a))
|     #define B(a) attroff(COLOR_PAIR(a))
|     #define read(a) switch(getch()){case
|     #define draw(a) {W=Y;W|=X;W|=((a)&0xff);}
|     #define LOOP for(y=0;y<w;y++){for(x=0;x<q;x++){
|     #define check if(fgetc(f)!=83){fclose(f);return N;};w=fgetc(f);q=fgetc(f)
|     #define start initscr();clear();keypad(stdscr,TRUE);cbreak();noecho();curs_set(0)
| can16358p wrote:
| In the header file's line 12, are there two different semicolons? Is
| one of them actually semicolon-looking another Unicode character?
|
| (https://github.com/ibara/cpaint/blob/21c70acba373df932920d5f...)
| ThinBold wrote:
| U+FF1B Fullwidth semicolon. The semicolon in CJKV ideographs.
| johndoe0815 wrote:
| Steve Bourne used a set of macros to enable ALGOL-like
| programming in C and used it to implement his Unix shell -
| https://research.swtch.com/shmacro
| OskarS wrote:
| This is obviously terrible, but my favorite part of this whole
| thing is the fact that he included the parentheses in the
| IF/THEN macros, so you didn't have to do it for the condition
| in the if statement. So, like, this line doesn't need
| parentheses around the condition:
|
|     IF (n=to-from)<=1 THEN return FI
|
| All modern languages that try and do a refresh on the C style
| (Rust and Swift notably) do this, it's clearly the right idea.
| They should just update the C and C++ syntax to make those
| parentheses optional at this point.
|
| (PS. some people would also recoil at the assignment-as-
| expression used in that line, but that's just good clean fun!)
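|
| A minimal sketch of how macros like these can carry the parentheses
| themselves. The definitions below are an illustrative reconstruction in
| the spirit of the Bourne shell macros, not a copy of the original
| header, and range_is_trivial is a hypothetical function made up for
| this example.

```c
#include <assert.h>

/* Illustrative macros: the opening parenthesis lives in IF and the
   closing one in THEN, so the condition itself needs none. */
#define IF    if (
#define THEN  ) {
#define ELSE  } else {
#define FI    ; }

/* Hypothetical helper: returns 1 when the range [from, to) holds at
   most one element, 0 otherwise. */
int range_is_trivial(int from, int to)
{
    int n;

    /* Expands to: if ( (n = to - from) <= 1 ) { return 1 ; } */
    IF (n = to - from) <= 1 THEN return 1 FI
    return 0;
}
```

| The only parentheses left in the source line are the ones around the
| assignment, which are there for precedence and not for the if syntax.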
| josefx wrote:
| They already make them optional for single nested statements,
| which causes a gigantic mess when you extend code and forget
| to add them.
|
|     if (a)
|         doThis();
|         andThat();
|
| is interpreted as
|
|     if(a){
|         doThis();
|     }
|     andThat();
|
| Some compilers are nice enough to throw around misleading
| indentation warnings, but without an explicit block
| termination like FI this just causes issues all over the
| place.
| skocznymroczny wrote:
| He meant making the (a) part optional, so you could write "if a"
| instead. Parentheses are (), {} are braces :)
| josefx wrote:
| One of the few times I get to remember that English is
| only my second language. Can't really think of a reason
| for not dropping them unless there is a weird corner case
| in the existing grammar.
| iainmerrick wrote:
| If you dropped the parentheses around the condition, I
| think you'd have to make the braces around the body
| mandatory.
|
| Many (most?) C style guides make the braces mandatory,
| but it'd be a big step to actually change the language
| syntax that way. Tons of existing code would need to be
| updated... although I guess that could be automated!
| OskarS wrote:
| You could probably do an either or thing: "You can drop
| the parentheses if you use braces. But you can't drop
| both". Though you're right, this is why it'll never
| happen.
|
| Basically, C had a choice of eliding the parentheses
| around the condition or the braces around a single-
| statement block, and they chose the braces. With a half-
| century of hindsight, that was probably the wrong choice.
| thaumasiotes wrote:
| > One of the few times I get to remember that English is
| only my second language.
|
| Most native speakers aren't aware of the terms. They'll
| call anything by any name.
|
| There is a significant constituency for "curly brackets
| {}" and "square brackets []".
|
| Anyway, there are mistakes you could make that would give
| you away as a nonnative speaker, but that wasn't one of
| them.
| dragonwriter wrote:
| > There is a significant constituency for "curly brackets
| {}" and "square brackets []".
|
| Yes, but there is still basically no constituency for
| "parentheses" ("{}") or "parentheses" ("[]"), except in
| some narrow specific contexts; in programming, one
| example would be discussion of certain Lisp dialects
| where either all or certain uses of "()" in classic Lisp
| can (or, in the "certain" case, sometimes _must_ ) use
| the others, in which context "parentheses" are sometimes
| used generically for all three.
|
| So, while there is a variety of ways paired delimiters
| are described by native speakers, the particular use here
| was still outside of the normal range of that variation.
| jen20 wrote:
| The GP is only correct in America, and only in formal
| usage.
|
| In everyday usage in the UK, I'd wager that most refer to
| () as brackets, [] as square brackets and {} as curly
| brackets.
| wheybags wrote:
| In everyday use, by non programmers, they're all just
| brackets.
| dllthomas wrote:
| I never realized this. Does "parenthetical" still have
| the same meaning?
| camtarn wrote:
| Yep.
|
| Fun fact: parenthesis / parenthetical can mean 'a word or
| phrase inserted as an explanation or afterthought into a
| passage which is grammatically complete without it, in
| writing usually marked off by brackets, dashes, or
| commas.' (Merriam-Webster) - so you can have a
| parenthetical statement without using actual parentheses.
| drekipus wrote:
| would that count?
| [deleted]
| agumonkey wrote:
| ha, I was googling FORTRAN C macros and went nowhere, that was
| the page I was looking for :)
| auvi wrote:
| If I remember correctly, the first Bourne Shell was written in a
| Pascal-ish C.
| agumonkey wrote:
| It's interesting, first time I wrote C was after learning
| programming through Java. My "C" code was all new_<type>(..) ..
| I couldn't not think in Java syntax.
| tromp wrote:
| Lennart Augustsson once wrote the least-Haskell Haskell program,
| amazingly without the help of a preprocessor:
|
| https://augustss.blogspot.com/2009/02/regression-they-say-th...
|
| And to answer the question in the top comment:
|
| https://hackage.haskell.org/package/BASIC-0.1.5.0/docs/Langu...
| kazinator wrote:
| > _Caught me. I had recently heard that Arthur Whitney, author of
| the A+, k, and q languages (which are array programming languages
| like APL and J), would use the C preprocessor to create his own
| language and then write the implementation of his language in
| that self-defined language._
|
| Stephen Bourne did this in the sources of the Bourne shell,
| around 1977, to make C look like Algol.
|
| Here it is in V7 Unix, 1979:
|
| https://www.tuhs.org/cgi-bin/utree.pl?file=V7/usr/src/cmd/sh...
|
| https://www.tuhs.org/cgi-bin/utree.pl?file=V7/usr/src/cmd/sh...
| Koshkin wrote:
| I once wrote a set of macros to emulate the syntax of Oberon, and
| then used that to write some code that could later be easily
| converted to real Oberon. It was a fun exercise - highly
| recommended.
| rramadass wrote:
| Do you mean this?
|
| 1) Write CPP macro language to emulate Oberon syntax.
|
| 2) Write program in the above macro language.
|
| 3) This program looks like Oberon but we now have two ways of
| "compiling" it; a) Feeding it to the CPP/C Compiler or b)
| Feeding it to the Oberon compiler directly with a little bit of
| tweaking.
|
| Have i understood it correctly?
| Koshkin wrote:
| Yes, you have.
| whatsakandr wrote:
| Once I macro'd `const auto` to `let` in my C++ program. After a
| few moments of "haha C++ go rust", I got terrified and undid it.
| krupan wrote:
| let is now a thing in C++, but not const auto, so probably good
| that you undid it :-)
| marcodiego wrote:
| I once saw a C header that defined "BEGIN" as {, "END" as } and
| other pascalisms. I find it difficult to understand how some
| people are so stubborn to change their model of thinking.
| bombcar wrote:
| I believe I've seen stuff like this used as partial help when
| transcribing a program from Pascal or Fortran to C, from before
| the era of automatic tooling to help.
|
| Whether they'd ever go back and finish the migration is not
| known.
| pickledcods wrote:
| The biggest pitfall with manual bulk transcription of Pascal to C
| back in the day was that operator precedence in the two
| languages is really different. Not only is their model of
| thinking different, it is also wrong.
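|
| One concrete instance of that pitfall, as a hedged sketch: in Pascal
| dialects where "and" works on integers, it binds tighter than "=", so
| "x and 1 = 0" means "(x and 1) = 0". In C, "==" binds tighter than
| "&". The function names below are hypothetical and exist only for
| this illustration.

```c
#include <assert.h>

/* Token-for-token transcription of the Pascal expression `x and 1 = 0`.
   C parses it as x & (1 == 0), that is x & 0, so the result is always 0.
   Compilers typically warn about the missing parentheses here. */
int is_even_naive(unsigned x)
{
    return x & 1 == 0;
}

/* Parenthesized to keep the Pascal meaning: (x and 1) = 0. */
int is_even_fixed(unsigned x)
{
    return (x & 1) == 0;
}
```

| With these definitions, is_even_naive(4) yields 0 while
| is_even_fixed(4) yields 1, even though both lines look like a faithful
| copy of the Pascal source.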
| edrxty wrote:
| Yet another pre-processor-to-victory post. Check the header in
| the source.
|
| If we're going to use crazy header files, I want to see someone
| get the linux kernel to build and boot while including this:
| https://gist.github.com/aras-p/6224951
| [deleted]
| ARandomerDude wrote:
| I don't quite understand these lines:
|
|     #define ?
|     #define do
|
| I was under the impression it was "#define CNAME value" - what
| does it mean when there is no value? A trip to Google didn't
| turn up anything for me, so I'm wondering if a C master can
| weigh in. Thanks!
| cperciva wrote:
| It defines those to a value of "", effectively stripping them
| from the source code.
|
| The most common place where this sort of thing occurs is the
| common idiom
|
|     #ifdef DEBUG
|     #define debug(...) realdebug(__VA_ARGS__)
|     #else
|     #define debug(...)
|     #endif
|
| which allows you to sprinkle debug() calls through your code
| and have them disappear if you compile without defining the
| macro DEBUG.
| asveikau wrote:
| Of course for your quoted use it's more common to see:
| #define debug(...) ((void)0)
|
| This forces you to use debug() as a statement. If no value
| is defined you could omit the semicolon on a debug()
| statement and it still compiles.
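|
| A small sketch of the difference between the two forms. DEBUG_EMPTY,
| DEBUG_VOID and add_one are hypothetical names chosen for this example;
| they do not come from any library.

```c
#include <assert.h>

/* Expands to nothing at all, so the call site vanishes completely. */
#define DEBUG_EMPTY(...)
/* Expands to a no-op expression, so the call site must still be a
   well-formed statement. */
#define DEBUG_VOID(...)  ((void)0)

int add_one(int x)
{
    DEBUG_EMPTY("x = %d\n", x)   /* missing semicolon goes unnoticed */
    DEBUG_VOID("x = %d\n", x);   /* dropping this semicolon would not compile */
    return x + 1;
}
```

| The empty form lets the sloppy line through in non-debug builds, and
| the mistake only surfaces once the macro expands to a real call.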
| avian wrote:
| > #define M_PI 3.2f
|
| A long time ago, a friend who was studying mathematics at the
| time, approached me laughing hysterically and showed me a page
| in an "intro to C"-style book. It showed an example of how one
| would write a "get circumference of a circle" function. At the
| top of the code, there was a #define for the value of pi.
|
| The text describing the code said something like this about why
| pi is #defined and not included directly in the expression:
|
| "We define pi as a constant for two reasons: 1) it makes the
| expressions using it more readable 2) should the value of pi
| ever change, we will only have to change it in one place in the
| code"
| lqet wrote:
| This #define would have almost been required by law in
| Indiana: https://en.wikipedia.org/wiki/Indiana_Pi_Bill
| jonasenordin wrote:
| Add Pi to locale?
| abdusco wrote:
| This is just too good. It must be a joke on the author's part
| :D
| izietto wrote:
| Actually I find it pretty clever because pi value may
| change in a programming context, as can change its
| precision
| Hamuko wrote:
| What, you need more than 3.2f?
| fpoling wrote:
| On a more serious note, 22.0/7, or 3.1 in base 7, is a rather
| good approximation, sufficient for quite a few problems and
| good for calculating things in one's head.
| crdrost wrote:
| Fun fact, 22/7 is only accurate to one part in 2,500, or
| 0.04%, netting you one extra digit (i.e. 3.14 is less
| accurate with 3 digits but 3.142 is more).
|
| Most remember 3.14159 which is way better than 22/7, but
| pi has an even better unusual rational approximation,
| 355/113 gets you _two_ extra digits, it is accurate to
| one part in 12 million, or 0.000008% . So you have to go
| to 3.1415927 to beat it.
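|
| These comparisons are easy to check with a few lines of C. This is a
| minimal sketch: PI_REF and rel_error are names made up for the
| example, and the reference value is the 16-digit one quoted from JPL
| elsewhere in this thread.

```c
#include <assert.h>

/* Reference value of pi, 16 significant digits. */
#define PI_REF 3.141592653589793

/* Relative error of an approximation measured against PI_REF. */
double rel_error(double approx)
{
    double diff = approx - PI_REF;

    if (diff < 0)
        diff = -diff;
    return diff / PI_REF;
}
```

| Comparing rel_error(22.0 / 7) with rel_error(3.14) and
| rel_error(3.142), and rel_error(355.0 / 113) with rel_error(3.141593)
| and rel_error(3.1415927), reproduces the ordering described above.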
| cpeterso wrote:
| JPL answers the question: How Many Decimals of Pi Do We
| Really Need?
|
| > For JPL's highest accuracy calculations, which are for
| interplanetary navigation, we use 3.141592653589793.
|
| https://www.jpl.nasa.gov/edu/news/2016/3/16/how-many-decimal...
| jazzyjackson wrote:
| We've been wrong before :)
| spockz wrote:
| But why is it then 3.2f instead of 3.1? Seeing as 3.14 gets
| rounded to 3.1. Or is that part of the joke and the reason
| for why the value needs changing?
| masklinn wrote:
| Regardless of the actual answer, it can make sense to round
| it up depending on the use case e.g. if you're calculating
| the tolerances for a shaft onto which you place a disk,
| computing the disk diameter (and thus volume, and weight)
| to be larger than actual provides a safety margin which
| rounding down would not. Any additional safety margin added
| afterwards might not be sufficient in all round-down cases.
| avian wrote:
| Sorry, I should be more clear. This is a line from the
| header linked by the parent comment that reminded me of
| this story.
| spullara wrote:
| That #define is making fun of this:
|
| https://www.forbes.com/sites/kionasmith/2018/02/05/indianas-...
| [deleted]
| ChrisRR wrote:
| Depends if you're using biblical pi
| susam wrote:
| There is a similar joke in the book Learning Perl, 3rd
| Edition by Randal L. Schwartz and Tom Phoenix. I wrote a post
| about it here fourteen years ago:
| https://susam.net/blog/from-perl-to-pi.html . Quoting from
| the book:
|
| "It's easier to type $pi than π, especially if you don't have
| Unicode. And it will be easy to maintain the program in case
| the value of π ever changes."
|
| There is also a comment by Randal Schwartz in the comments
| section where he credits Tom Phoenix with that particular bit
| of humour.
| avian wrote:
| Good catch! I might be misremembering this event and it was
| actually Perl, not C. I see Learning Perl, 3rd Edition was
| published in 2001, which does put its publication at about
| the right time in early 00s.
| TonyTrapp wrote:
| As funny as it sounds, there is still a bit of truth in
| there, though: code might change from using floats to double
| for example, so you might want to replace single-precision
| constants by double-precision constants. Only need to replace
| a single pi constant in that case. :)
| brian_herman wrote:
| Can I do it with an #ifndef?
| AtlasBarfed wrote:
| Oh my the defines. What is language and semantics?
|
| "When I use a word," Humpty Dumpty said, in a rather scornful
| tone, "it means just what I choose it to mean - neither more
| nor less."
|
| Very ugly, ill-defined, uncodified democracy. No one tells you
| when you vote, no one counts up the votes, there are no
| official results.
|
| When you speak or communicate with someone else, you are voting
| with them. "I vote this word means X, for a poor/crude/coarse
| agreement of what X is in the first place."
|
| And this is propaganda at its finest. Evil doublespeak: say one
| thing, and it actually is another. Snuck in.
|
| Fantastic.
|
| Related: https://github.com/Droogans/unmaintainable-code
___________________________________________________________________
(page generated 2022-02-21 23:02 UTC)