[HN Gopher] The lost language extensions of MetaWare's High C co...
       ___________________________________________________________________
        
       The lost language extensions of MetaWare's High C compiler (2023)
        
       Author : PaulDavisThe1st
       Score  : 201 points
       Date   : 2024-09-25 14:19 UTC (8 hours ago)
        
 (HTM) web link (duriansoftware.com)
 (TXT) w3m dump (duriansoftware.com)
        
       | stefanos82 wrote:
       | Previous submission with comments
       | https://news.ycombinator.com/item?id=38938402
        
         | PaulDavisThe1st wrote:
         | Seems that the author of the piece reposted the same contents
         | at a different URL yesterday. Odd.
        
           | fanf2 wrote:
           | Because cohost is shutting down
           | https://news.ycombinator.com/item?id=41492807
        
         | JdeBP wrote:
         | It has no doubt come up again here because Joe Groff just drew
         | attention to it today on the FediVerse.
         | 
         | * https://f.duriansoftware.com/@joe/113195961485703110
        
       | JdeBP wrote:
       | I wrote up the iterator-driven for back in 2011, because it was
       | one of those things that had been long-since forgotten about;
       | along with what it would look like were it to be incorporated
       | into the (then) C++ standard.
       | 
       | I am fortunate enough to own a copy of the High C/C++ Language
       | Reference in English. (-:
       | 
       | * http://jdebp.uk./FGA/metaware-iterator-driven-for.html
       | 
       | * http://jdebp.uk./Proposals/metaware-iterator-driven-for.html
        
         | jkcxn wrote:
         | Do you know what the break/return would get compiled down to?
         | Would the yield function need to be transformed to return a
         | status code and checked at the callsite?
        
           | JdeBP wrote:
           | It's a non-local goto, also a MetaWare language extension,
           | out of the anonymous nested function that the for statement
           | body becomes to (effectively) an anonymous label right after
           | the for statement.
           | 
           | Another part of the High C/C++ Language Reference describes
           | non-local labels and jumps to them. It doesn't go into great
           | detail, but it does talk about stack unwinding, so expect
           | something similar to how High C/C++ implemented throwing
           | exceptions.
        
           | torginus wrote:
           | Not sure, but IMO you could do it by basically reversing the
           | call/return mechanism: whenever the iterator function
           | returns, it saves its state to the stack, just as it would
           | during a function call, and conversely, when the outside
           | context hands control back to the iterator, it restores
           | that state, analogous to how a return from an outside
           | context would work.
        
             | JdeBP wrote:
             | That's not at all how MetaWare implemented iterator-driven
             | for, though.
             | 
             | As Joe Groff said in the headlined post, MetaWare
             | implemented it by turning the nested body of the for
             | statement into an anonymous nested function, which is
             | called back (through a "full function pointer") from the
             | iterator function whenever there's a "yield()" in the
             | latter.
             | 
             | So there's no "whenever the iterator function returns". It
             | only returns when it has finished. The body of the for
             | statement is called by and returns to the iterator
             | function, which is in its turn called by and returns to the
             | function that the for statement is in.
             | 
             | All of the "saving state to the stack" that happens is just
             | the quite normal mechanics of function calling, with merely
             | some special mechanics to pass around a pointer to the
             | lexically outer function's activation record (which is why
             | a "full function pointer" is not a plain "function
             | pointer") as a hidden parameter so that the (anonymous)
             | lexically inner function knows where the outer one's
             | automatic storage duration variables are.
             | 
             | MetaWare also had non-local goto from within nested
             | functions back out into lexically enclosing scopes, and
             | since the for statement body is a nested function, it's
             | just a question of employing that already available
             | implementation mechanism (which in turn does the same sorts
             | of things as throwing exceptions does, unwinding the stack
             | through the iterator function) for break/continue/return
             | (and of course goto) inside the for body.
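
A rough sketch of the mechanism described above, in portable C rather than High C. The loop body becomes an ordinary function, the "full function pointer" is emulated by a hypothetical `struct closure` (code pointer plus environment pointer), and `setjmp`/`longjmp` stands in for MetaWare's non-local goto. Every name here (`closure`, `upto`, `loop_body`, `sum_until`) is made up for illustration; this is an assumption-laden analogy, not how the High C runtime actually lays things out.

```c
#include <setjmp.h>

/* Hypothetical stand-in for a "full function pointer":
   a code pointer plus a pointer to the enclosing function's variables. */
struct closure {
    void (*fn)(void *env, int value);
    void *env;
};

/* The iterator "yields" by calling the body back.
   It returns only once it has produced every value. */
static void upto(int lo, int hi, struct closure body) {
    for (int i = lo; i <= hi; i++)
        body.fn(body.env, i);
}

/* Variables of the enclosing function that the loop body needs.
   sum is volatile because it is modified between setjmp and longjmp. */
struct frame {
    volatile int sum;
    int limit;
    jmp_buf exit_label;
};

/* What the body of `for i <- upto(lo, hi) { ... }` would turn into. */
static void loop_body(void *env, int value) {
    struct frame *f = env;
    if (value > f->limit)
        longjmp(f->exit_label, 1);  /* "break": unwinds through upto() */
    f->sum += value;
}

int sum_until(int lo, int hi, int limit) {
    struct frame f = { .sum = 0, .limit = limit };
    if (setjmp(f.exit_label) == 0)
        upto(lo, hi, (struct closure){ loop_body, &f });
    /* Execution resumes here whether the loop finished or "broke" out. */
    return f.sum;
}
```

Calling `sum_until(1, 10, 4)` adds 1 through 4 and then leaves the loop early via the emulated non-local exit; `upto` itself never learns that the body wanted to stop.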
        
         | Y_Y wrote:
         | Is that supposed to be an upside-down smiley face?
        
       | amszmidt wrote:
       | So ... GCC has had some of these for ages.
       | 
       | > Pascal lets you match a range of values with case low..high;
       | > wouldn't it be great if C had that feature? High C does, another
       | > feature standard C and C++ never adopted.
       | 
       | https://gcc.gnu.org/onlinedocs/gcc/Case-Ranges.html
       | 
       | > Nested functions
       | 
       | https://gcc.gnu.org/onlinedocs/gcc/Nested-Functions.html
       | 
       | > Generators
       | 
       | GCC doesn't do those --- looks like a fun feature though!
       | 
       | My favourite, which was sadly removed was doing:
       | 
       |     foo ? zork : bork = 123;
       | 
       | Oh well...
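
For reference, a small sketch of the GNU C case-range extension linked above. It is non-standard and assumes GCC or Clang; the `classify` function and the `kind` enum are hypothetical names used only for this example.

```c
/* Hypothetical classification result, for illustration only. */
enum kind { KIND_DIGIT, KIND_LETTER, KIND_OTHER };

enum kind classify(int c) {
    switch (c) {
    case '0' ... '9':           /* GNU extension: keep the spaces around ... */
        return KIND_DIGIT;
    case 'a' ... 'z':
    case 'A' ... 'Z':
        return KIND_LETTER;
    default:
        return KIND_OTHER;
    }
}
```

The letter ranges assume an ASCII-compatible character set, where the letters are contiguous.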
        
         | JdeBP wrote:
         | Not "ages" in comparison to how long MetaWare had them. High C
         | had this stuff back in the early 1990s and 1980s.
         | 
         | The headlined article doesn't mention it, but High C/C++ had
         | modules all of those years ago, too. Anybase literals, as well.
         | Tom Pennello participated in the standardization efforts back
         | then, too, but none of this stuff made it in.
        
           | mananaysiempre wrote:
           | Is the manual online anywhere? The one for version 1.2 on
           | Bitsavers[1] doesn't mention any of those.
           | 
           | [1] https://bitsavers.computerhistory.org/pdf/metaware/High_C_La...
        
             | JdeBP wrote:
             | I suspect not. I am lucky enough to be consulting this
             | copy: https://mastodonapp.uk/@JdeBP/113199154771277394
             | 
             | (-:
        
             | mezentius wrote:
             | A later version is available here (along with the compiler
             | itself): https://winworldpc.com/product/metaware-high-c-cpp/33x
        
         | pistoleer wrote:
         | You can still do
         | 
         |     *(foo ? &zork : &bork) = 123;
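
A minimal standard-C sketch of that workaround: the conditional selects between two addresses and the result is dereferenced, so the assignment lands in whichever variable was chosen. The `pair` struct and `assign` function are hypothetical, there only to make the effect observable.

```c
/* Hypothetical helper type so both variables can be returned. */
struct pair { int zork, bork; };

struct pair assign(int foo) {
    int zork = 0, bork = 0;
    /* Pick an address with ?:, then store through it. */
    *(foo ? &zork : &bork) = 123;
    return (struct pair){ zork, bork };
}
```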
        
         | adastra22 wrote:
         | GCC's different implementation is mentioned in the article.
        
         | qalmakka wrote:
         | GCC nested functions are atrocious and deserve being banned
         | from existence. Like the article rightfully says they've been
         | implemented using weird hacks that make it basically impossible to
         | use them safely. There's a reason why Clang has categorically
         | refused to implement them.
        
         | ronsor wrote:
         | > foo ? zork : bork = 123;
         | 
         | That's kind of horrifying.
        
       | mananaysiempre wrote:
       | As a side note, these days you can fake named arguments in C, if
       | you're OK with every argument being 0 by default:
       | 
       |     // Declaration:
       |     void plot(float xlo, float xhi, float ylo, float yhi,
       |               float xinc, float yinc);
       |     struct plot_a { float xlo, xhi, ylo, yhi, xinc, yinc; };
       |     static inline void plot_i(struct plot_a _a) {
       |         // inline thunk to allow arguments to be passed in registers
       |         plot(_a.xlo, _a.xhi, _a.ylo, _a.yhi, _a.xinc, _a.yinc);
       |     }
       |     #define plot(...) (plot_i((struct plot_a){ __VA_ARGS__ }))
       | 
       |     // Call:
       |     plot(alo, ahi, blo*2.0, bhi*2.0, .yinc = y, .xinc = f(x+z));
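
A compilable variant of the trick above, as a sketch. The real `plot` would draw something; here a hypothetical stand-in just records its arguments in `last_call` (a name invented for this example) so the defaults can be inspected. `demo` is likewise only an illustration.

```c
struct plot_a { float xlo, xhi, ylo, yhi, xinc, yinc; };

/* Hypothetical stand-in for a real plotting routine: it only records
   the arguments it received. */
static struct plot_a last_call;

void plot(float xlo, float xhi, float ylo, float yhi,
          float xinc, float yinc) {
    last_call = (struct plot_a){ xlo, xhi, ylo, yhi, xinc, yinc };
}

/* Thunk: unpacks the struct and calls the real function. At this point
   the macro below is not defined yet, so plot means the function. */
static inline void plot_i(struct plot_a a) {
    plot(a.xlo, a.xhi, a.ylo, a.yhi, a.xinc, a.yinc);
}

/* From here on, plot(...) builds a compound literal; members that are
   not mentioned are zero-initialized. */
#define plot(...) (plot_i((struct plot_a){ __VA_ARGS__ }))

void demo(void) {
    /* Two positional arguments, one named; ylo, yhi and xinc default to 0. */
    plot(1.0f, 2.0f, .yinc = 0.5f);
}
```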
        
         | ori_b wrote:
         | Note: this breaks if you want to pass struct literals:
         | 
         |     plot((myfoo){x,y})
         | 
         | Macros will take the struct literals as multiple parameters:
         | 
         |     plot(
         |         .arg0=(myfoo){x,
         |         .arg1=y}
         |     )
         | 
         | C macros are best left unused when possible.
        
           | mananaysiempre wrote:
           | Nope! In general, that can be a problem, but not for this
           | specific technique:
           | 
           |     $ cpp -P
           |     void plot(float xlo, float xhi, float ylo, float yhi,
           |               float xinc, float yinc);
           |     struct plot_a { float xlo, xhi, ylo, yhi, xinc, yinc; };
           |     static inline void plot_i(struct plot_a _a) {
           |         // inline thunk to allow arguments to go in registers
           |         plot(_a.xlo, _a.xhi, _a.ylo, _a.yhi, _a.xinc, _a.yinc);
           |     }
           |     #define plot(...) (plot_i((struct plot_a){ __VA_ARGS__ }))
           |     plot((myfoo){x,y})
           |     plot(.yinc=(myfoo){x,y})
           |     ^D
           |     [...]
           |     (plot_i((struct plot_a){ (myfoo){x,y} }))
           |     (plot_i((struct plot_a){ .yinc=(myfoo){x,y} }))
           | 
           | You could argue this is excessively clever, but when you need
           | it, you really need it, so it could deserve known idiom
           | status in the right situation.
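
A small sketch of why the comma problem does not bite here. Braces do not protect commas from the preprocessor, so a one-parameter macro would see `(struct point){3` and `4}` as two arguments and fail to compile. A macro whose only parameter is `...` takes everything, commas included, as `__VA_ARGS__` and pastes it back unchanged. The names `point`, `sum`, `SUM`, and `demo_sum` are hypothetical.

```c
struct point { int x, y; };

static int sum(struct point p) { return p.x + p.y; }

/* Variadic macro: the comma inside the compound literal is simply
   part of __VA_ARGS__, so the literal reaches sum() intact. */
#define SUM(...) (sum(__VA_ARGS__))

/* By contrast, `#define SUM1(p) (sum(p))` followed by
   `SUM1((struct point){3, 4})` would be rejected, because the
   preprocessor counts two macro arguments. */

int demo_sum(void) {
    return SUM((struct point){3, 4});
}
```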
        
           | CamperBob2 wrote:
           | _C macros are best left unused when possible._
           | 
           | Blame the committee for failing to specify an obvious and
           | widely-demanded feature like named parameters.
           | 
           | The only explanation is that the people in charge of the
           | language don't write much code.
        
             | szundi wrote:
             | Or they do and don't want to learn stuff as "everything can
             | be done the old way anyway"
        
             | jimbob45 wrote:
             | There are a lot of syntactic sugar improvements the
             | committee could make that they simply refuse to. Named
             | parameters and function pointer syntax are compile-time
             | fixes that would have zero runtime costs, yet it's 2024 and
             | we've hardly budged from ANSI C.
        
               | CamperBob2 wrote:
               | Exactly. I actually think default parameters are
               | hazardous without named-parameter support. When they
               | added one, IMO they should have added the other as well,
               | so that you can specify exactly which non-default
               | parameters you're passing.
        
           | szundi wrote:
           | Say that to the Linux kernel or any embedded system
        
             | ori_b wrote:
             | I've written both kernel code and embedded systems. It's
             | easier to maintain the code when the preprocessor is
             | avoided.
        
       | notorandit wrote:
       | I for one think that we have lost a number of good opportunities
       | to make the C language a better and more powerful one.
       | 
       | IMHO what I would need with C is a powerful pre-processor like
       | jinja2 and some symbol manipulation features too.
        
         | pjmlp wrote:
         | Including having proper slices, but not even one of the
         | language authors was able to change WG14's mind on the matter.
         | 
         | That is what happens to languages that leave their authors
         | behind and embrace design by committee.
        
           | AlbertoGP wrote:
           | I'm interested in that, do you happen to have a link or
           | reference to that discussion?
        
             | pjmlp wrote:
             | This was back in the old days; what exists is Dennis's fat
             | pointers proposal.
             | 
             | https://www.bell-labs.com/usr/dmr/www/vararray.html
        
       | bhouston wrote:
       | Who was the genius behind these features? Someone was incredibly
       | forward looking in that company. Too bad it never got out into
       | the world and impacted the language standards. It is surprising
       | to see that so long ago.
       | 
       | Also previously covered here on Hacker News:
       | https://news.ycombinator.com/item?id=38938402
       | 
       | Is there a PDF copy of this somewhere?
        
         | dfawcus wrote:
         | Bitsavers has a copy of the HC 1.2 reference manual (1985)
         | 
         | It describes underscores in numbers, case ranges, named
         | parameters, nested functions, and full function variables.
         | https://bitsavers.org/pdf/metaware/High_C_Language_Reference_Manual_1.2_Nov85.pdf
         | 
         | See Appendix A - 50 odd pages from the end of the file.
        
         | layer8 wrote:
         | CLU had _for_ loops with iterators (generators) and _yield_ in
         | the mid-late 1970s [0]. The Icon programming language around
         | the same time had similar generator features [1] (with _yield_
         | spelled "suspend"). Ada (1983) also had such features I
         | believe. These weren't completely unknown language features.
         | 
         | [0] https://publications.csail.mit.edu/lcs/pubs/pdf/MIT-LCS-TR-2...
         | 
         | [1] https://dl.acm.org/doi/pdf/10.1145/800055.802034
        
           | Agingcoder wrote:
           | Ada didn't have generators back then and still doesn't (I
           | think it's being considered for inclusion soon). But it had
           | all the other features.
        
         | jmspring wrote:
         | Metaware was a prolific compiler company based out of Santa
         | Cruz in the 80s/90s. Loved what they did, they also had a very
         | interesting culture. I knew about them through _cough_ _shady_
         | _cough_ sites when learning and writing code back in the day.
        
         | canucker2016 wrote:
         | Yeah, it's sad to see those features as novel features instead
         | of part of the standard list of features most programming
         | languages supply.
         | 
         | There's a link to the compiler manuals at
         | https://winworldpc.com/product/metaware-high-c-cpp/33x
         | 
         | The C manual PDF file mentions a 2007 copyright.
        
       | omoikane wrote:
       | This seems way ahead of its time, especially with generators.
       | Maybe Fujitsu was able to just do it because they didn't bother
       | with any of the lengthy standardization processes, but that's
       | probably also why all these extensions seemed relatively unknown
       | and had to be rediscovered and reinvented in modern C/C++ decades
       | later.
        
         | ahoka wrote:
         | C could have been a much nicer language if it wasn't captured
         | by people who insisted that not even two's complement can be in
         | the standard.
        
           | dxuh wrote:
           | I guess those people were parts of the embedded software
           | industry before 2000 (maybe today, I don't know). It's a very
           | good thing that C, the lingua franca of modern computing,
           | actually runs on everything and not just on the stuff we use
           | to browse the internet.
        
           | int_19h wrote:
           | I don't think it was unreasonable at the time ANSI did the
           | standardization originally. As with Common Lisp, they had to
           | consider the numerous already-existing implementations and
           | code written with them in mind.
        
         | JdeBP wrote:
         | It was not Fujitsu. It was MetaWare, which had a fair degree of
         | experience with compilers. It had a contemporaneous Pascal
         | compiler, which was quite well known, and Pascal already had
         | nested functions.
        
         | int_19h wrote:
         | Coroutines and generators were already well-understood then
         | (see Icon!), so I think it is indeed mostly about not having to
         | worry about standardization.
        
       | pjmlp wrote:
       | Old memories, I had access to the MS-DOS version of it, still
       | preferred Borland though.
        
       | JoshTriplett wrote:
       | For anyone wondering why the string literals in the pictured
       | examples end with ¥n rather than \n, it looks like these code
       | examples were written in Shift-JIS, and Shift-JIS puts ¥ where
       | ASCII has \.
        
         | nazgulsenpai wrote:
         | Similarly, Japanese DOS prompts are C:¥ instead of C:\
        
         | Eduard wrote:
         | The author didn't provide information about when this book came
         | out, nor is there any information to find on it. But I think at
         | the book's release, Shift-JIS (the standard) didn't exist yet.
         | 
         | Rather JIS X 0201 (
         | https://en.m.wikipedia.org/wiki/JIS_X_0201 ) was used, on
         | which Shift-JIS is based.
        
         | layer8 wrote:
         | This was originally just JIS Roman [0], the Japanese ASCII
         | variant from 1969. Shift-JIS is what much later then added
         | double-byte character set support.
         | 
         | [0] https://en.wikipedia.org/wiki/JIS_X_0201
        
         | zzo38computer wrote:
         | The problem with that is that the ASCII code for a backslash is
         | also used as the second byte in a 2 byte character in Shift-
         | JIS, which can sometimes cause Japanese string literals to not
         | work properly in C. EUC-JP is better for this purpose, because
         | it does not have that problem. (Using Shift-JIS with Pascal
         | also does not have this problem, if you are using the (* *)
         | comments instead of the { } comments.)
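
To make the trail-byte problem concrete, here is a sketch that walks a Shift-JIS byte string and counts the 0x5C bytes that are really the second half of a two-byte character. It assumes the usual Shift-JIS lead-byte ranges (0x81 to 0x9F and 0xE0 to 0xFC); the function names are hypothetical.

```c
#include <stddef.h>

/* Assumed Shift-JIS lead-byte ranges for two-byte characters. */
static int is_sjis_lead(unsigned char b) {
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

/* Count 0x5C bytes that are the trail byte of a two-byte character,
   i.e. bytes a byte-oriented C lexer would misread as a backslash. */
size_t count_trail_backslashes(const unsigned char *s, size_t n) {
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (is_sjis_lead(s[i]) && i + 1 < n) {
            if (s[i + 1] == 0x5C)
                count++;
            i++;  /* skip the trail byte so it is not examined on its own */
        }
    }
    return count;
}
```

For example, the kanji 表 is encoded as 0x95 0x5C, so in the byte sequence 0x95 0x5C 0x5C 'n' the first 0x5C belongs to the kanji and only the second one starts a genuine escape.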
        
       | lexicality wrote:
       | Content aside, I'm fascinated by the typography in this book.
       | It's simultaneously beautiful and horrendous!
       | 
       | I don't know enough about Japanese orthography or keming rules to
       | be sure but it looks very much like they took a variable width
       | font with both kanji and latin characters and then hard formatted
       | it into fixed width cells?
       | 
       | Either way, it's nice that the code examples aren't in 8pt font
       | like a lot of the books I have...
        
         | _delirium wrote:
         | That's pretty common in Japanese typography that mixes in Latin
         | characters, especially in older fonts. Pretty decent
         | explanation here:
         | https://www.reddit.com/r/typography/comments/vvfmpu/comment/...
        
       | AdmiralAsshat wrote:
       | Question: Was the book from the screenshots composed in Japanese,
       | or composed in English and then translated into Japanese?
       | 
       | Since it's apparently from Fujitsu, I could see it being the
       | former, but if so, I'm impressed with the quality of the English
       | in the printf statements and code comments from non-native
       | English speakers.
        
         | layer8 wrote:
         | It's funny they're using different fonts within the string
         | literals.
        
       | dsp_person wrote:
       | Are there any good unofficial gcc plugins/extensions out there?
       | It would be cool to extend C with a thing or two without adopting
       | a full blown compiler like C2 or C3.
        
       | Sad90sNerd wrote:
       | These extensions are Ada features.
       | 
       | Ada has:
       | 
       | - Labels in the form of: Call (Param_A => 1, Param_B => "Foo");
       | 
       | - Underscores can be used in numbers of any base: X : Integer :=
       | 1_000;
       | 
       | - Nested subprograms
       | 
       | - Range based tests
        
       | zzo38computer wrote:
       | The file does not display because the browser insists on percent-
       | encoding the apostrophe but the server insists that the
       | apostrophe should not be percent-encoded, therefore resulting in
       | an error message that it won't redirect properly. I can download
       | the file properly with curl, though.
       | 
       | I think these are good ideas.
       | 
       | - Underscores in numeric literals: I think it is a good idea and
       | is also what I had wanted to do before, too. (It should be
       | allowed in hexadecimal as well as decimal)
       | 
       | - Case ranges: GNU C has this feature, too.
       | 
       | - Named arguments: This is possible with GNU C, although it
       | doesn't work without writing it to handle this (although you can
       | use macros to allow it to work with existing functions). You can
       | pass a structure, either directly to the function, or using a
       | macro containing a ({ }) block which extracts the values from the
       | structure and passes them to the function (the compiler will
       | hopefully optimize out this block and just pass the values
       | directly). You can then use the named initialization syntax
       | (which also allows arguments without names), and GNU C also
       | allows you to have duplicates in which case only one of them will
       | work, which allows you to use macros to provide default values.
       | (I have tested this and it works.)
       | 
       | - Nested functions: GNU C also has it, but does not have the
       | "full function value" like this one does, and I think it might be
       | helpful. Nonlocal exits can also be helpful. (I also think the
       | GNU's nested functions could be improved, by allowing them to be
       | declared as "static" and/or "register" in order to avoid the need
       | of trampoline implementations, although "static" and "register"
       | would both have their own additional restrictions; "static" can't
       | access local variables and functions from the function it is
       | contained in unless they are also declared as "static", and
       | "register" means the address can't be taken (therefore allowing
       | the compiler to pass the local variables as arguments to the
       | nested function).)
       | 
       | - Generator functions: I like this too and I think that it is
       | useful (I had wanted things like this before, too). It is also
       | interesting how it can work well with the nested functions.
       | 
       | There are some other things that I also think should be added
       | into a C compiler (in addition to existing GNU extensions), such
       | as:
       | 
       | - Allowing structures to contain members declared as "static".
       | This is a global value whose name is scoped to the structure
       | within the file being compiled (so, like anything else declared
       | as static, the name is not exported), so any accesses will access
       | the single shared value. Even in the case of e.g. (x->y) if y is
       | a static member then x does not need to be dereferenced so it is
       | OK if it is a null pointer.
       | 
       | - Scoped macros, which work after the preprocessor works. It may
       | be scoped to a function, a {} block inside of a function, a file,
       | a structure, etc. The macro is only expanded where that name is
       | in scope, and not in contexts where a new name is expected (e.g.
       | the name of a variable or argument being declared) (in this case
       | the macro is no longer in scope).
       | 
       | - Allow defining aliases. The name being aliased can be any
       | sequence of bytes (that is valid as a name on the target
       | computer), even if it is not otherwise valid in C (e.g. due to
       | being a reserved word). Any static declaration that does not
       | declare the value may declare the alias.
       | 
       | - Compile-time execution (with explicit declaration).
       | 
       | - Custom output sections, which can be used or moved into
       | standard sections in a portable way. These sections might not
       | even be mapped, and may have assertions, alignment, overlapping,
       | etc.
       | 
       | - Allow functions to be declared as "register". If a function is
       | declared as "static register" (so that the name is not exported),
       | then the compiler is allowed to change the calling convention to
       | work better with the rest of the program.
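
The named-arguments item in the comment above mentions using duplicate initializers so that a macro can supply default values. A minimal sketch of that idea follows, with hypothetical names (`opts`, `volume_impl`, `volume`). It relies on a later designated initializer overriding an earlier one for the same member; GCC accepts this but may warn with `-Woverride-init`.

```c
/* Hypothetical option struct; every member doubles as a named argument. */
struct opts { int width, height, depth; };

static int volume_impl(struct opts o) {
    return o.width * o.height * o.depth;
}

/* Defaults come first; anything the caller names afterwards overrides
   the default for that member. */
#define volume(...) \
    volume_impl((struct opts){ .width = 1, .height = 1, .depth = 1, __VA_ARGS__ })

int demo_volume(void) {
    return volume(.height = 5);   /* width and depth keep their defaults */
}
```

One caveat with this pattern: whether an overridden initializer expression is still evaluated is not something to depend on, so default values should be free of side effects.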
        
       | WalterBright wrote:
       | D (also in Das BetterC) has:
       | 
       | 1. underscores in literals:
       | 
       |     int a = 1_234_567;
       | 
       | 2. case ranges:
       | 
       |     case 5 .. case 6:
       | 
       | 3. named arguments:
       | 
       |     void test(int a, int b);
       |     void foo() { test(b:3, a:4); }
       | 
       | 4. nested functions:
       | 
       |     int foo(int i) {
       |         int plus(int a) { return a + i; }
       |         return plus(3);
       |     }
       | 
       | 5. static nested functions:
       | 
       |     int foo(int i) {
       |         static int plus(int a) { return a + i; }
       |         return plus(3);
       |     }
       | 
       |     Error: `static` function `test.foo.plus` cannot access
       |     variable `i` in frame of function `test.foo`
       | 
       | 6. a feature similar to generators
       | https://dlang.org/spec/statement.html#foreach_over_struct_an...
        
         | nialv7 wrote:
         | I was thinking about D the whole way while reading this. I just
         | know I am going to see Walter Bright in the comments XD.
        
           | hermanhermitage wrote:
           | Every time Walter posts it reminds me my dream language would
           | simply be C with
           | https://www.digitalmars.com/articles/C-biggest-mistake.html
           | and probably go lang style interfaces. Maybe a little less UB
           | and some extensions for memory safety proofs.
        
             | WalterBright wrote:
             | That's why DasBetterC has done very well! You could call it
             | C with array bounds checking.
             | 
             | I occasionally look at statistics on the sources of bugs
             | and security problems in released software. Array bounds
             | overflows far and away are the top cause.
             | 
             | Why aren't people just sick of array overflows? In the
             | latest C and C++ versions, all kinds of new features are
             | trumpeted, but again no progress on array overflows.
             | 
             | I can confidently say that in the 2 decades of D in
             | production use, the incidence of array overflows has
             | dropped to essentially zero. (To trigger a runtime array
             | overflow, you have to write @system code and throw a
             | compiler switch.)
             | 
             | The solution for C I proposed is backwards compatible, and
             | does not make existing code slower.
             | 
             | It would be the greatest feature added to C, singularly
             | worth more than all the other stuff in C23.
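
For readers who have not seen the idea, bounds-checked access can be approximated in today's C by carrying the length alongside the pointer. The sketch below is only an emulation with hypothetical names (`int_slice`, `SLICE_OF`, `slice_get_or`); it is not the syntax proposed in the linked article, and unlike a language-level feature nothing stops code from bypassing it.

```c
#include <stddef.h>

/* A "fat pointer": the data pointer travels together with its length. */
struct int_slice {
    int   *ptr;
    size_t len;
};

/* Build a slice from a true array (not from a pointer), so that
   sizeof can still see the element count. */
#define SLICE_OF(arr) \
    ((struct int_slice){ (arr), sizeof(arr) / sizeof((arr)[0]) })

/* Checked read: returns the element, or fallback if i is out of bounds. */
int slice_get_or(struct int_slice s, size_t i, int fallback) {
    if (i >= s.len)
        return fallback;
    return s.ptr[i];
}
```

A real implementation would more likely abort or report an error on an out-of-bounds index; the fallback value just keeps the sketch short.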
        
               | actionfromafar wrote:
               | I don't always agree but I'll join you on this particular
               | hill!
        
       | hgs3 wrote:
       | Related: The 'lcc-win' C compiler added operator overloading,
       | default function arguments, and function overloading (see
       | "generic functions") [1]. The Plan 9 C compiler introduced
       | several language extensions, some of which, like anonymous
       | structs/unions would eventually be incorporated into the C
       | standard. Present day GCC accepts the -fplan9-extensions flag [2]
       | which enables some nifty features, like automatically converting
       | a struct pointer to an anonymous field for function calls and
       | assignments.
       | 
       | [1] https://lcc-win32.services.net/C-Tutorial.pdf
       | 
       | [2] https://gcc.gnu.org/onlinedocs/gcc/Unnamed-Fields.html
        
       ___________________________________________________________________
       (page generated 2024-09-25 23:00 UTC)