[HN Gopher] The lost language extensions of MetaWare's High C co...
___________________________________________________________________
The lost language extensions of MetaWare's High C compiler (2023)
Author : PaulDavisThe1st
Score : 201 points
Date : 2024-09-25 14:19 UTC (8 hours ago)
(HTM) web link (duriansoftware.com)
(TXT) w3m dump (duriansoftware.com)
| stefanos82 wrote:
| Previous submission with comments
| https://news.ycombinator.com/item?id=38938402
| PaulDavisThe1st wrote:
| Seems that the author of the piece reposted the same contents
| at a different URL yesterday. Odd.
| fanf2 wrote:
| Because cohost is shutting down
| https://news.ycombinator.com/item?id=41492807
| JdeBP wrote:
| It has no doubt come up again here because Joe Groff just drew
| attention to it today on the FediVerse.
|
| * https://f.duriansoftware.com/@joe/113195961485703110
| JdeBP wrote:
| I wrote up the iterator-driven for back in 2011, because it was
| one of those things that had been long-since forgotten about;
| along with what it would look like were it to be incorporated
| into the (then) C++ standard.
|
| I am fortunate enough to own a copy of the High C/C++ Language
| Reference in English. (-:
|
| * http://jdebp.uk./FGA/metaware-iterator-driven-for.html
|
| * http://jdebp.uk./Proposals/metaware-iterator-driven-for.html
| jkcxn wrote:
| Do you know what the break/return would get compiled down to?
| Would the yield function need to be transformed to return a
| status code and checked at the callsite?
| JdeBP wrote:
| It's a non-local goto, also a MetaWare language extension,
| out of the anonymous nested function that the for statement
| body becomes to (effectively) an anonymous label right after
| the for statement.
|
| Another part of the High C/C++ Language Reference describes
| non-local labels and jumps to them. It doesn't go into great
| detail, but it does talk about stack unwinding, so expect
| something similar to how High C/C++ implemented throwing
| exceptions.
| torginus wrote:
| Not sure, but imo you could do it with basically reversing
| the call/return mechanism - that is, whenever the iterator
| function returns, it saves its state to the stack, just like
| if it would during a function call, and conversely, when the
| outside context hands back the control to the iterator, it
| would restore its state, analogous to how a return from an
| outside context would work.
| JdeBP wrote:
| That's not at all how MetaWare implemented iterator-driven
| for, though.
|
| As Joe Groff said in the headlined post, MetaWare
| implemented it by turning the nested body of the for
| statement into an anonymous nested function, which is
| called back (through a "full function pointer") from the
| iterator function whenever there's a "yield()" in that
| latter.
|
| So there's no "whenever the iterator function returns". It
| only returns when it has finished. The body of the for
| statement is called by and returns to the iterator
| function, which is in its turn called by and returns to the
| function that the for statement is in.
|
| All of the "saving state to the stack" that happens is just
| the quite normal mechanics of function calling, with merely
| some special mechanics to pass around a pointer to the
| lexically outer function's activation record (which is why
| a "full function pointer" is not a plain "function
| pointer") as a hidden parameter so that the (anonymous)
| lexically inner function knows where the outer one's
| automatic storage duration variables are.
|
| MetaWare also had non-local goto from within nested
| functions back out into lexically enclosing scopes, and
| since the for statement body is a nested function, it's
| just a question of employing that already available
| implementation mechanism (which in turn does the same sorts
| of things as throwing exceptions does, unwinding the stack
| through the iterator function) for break/continue/return
| (and of course goto) inside the for body.
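|
| A minimal sketch of the mechanism described above, emulated in
| standard C. It is not MetaWare's generated code: the "full function
| pointer" is spelled out as a code pointer plus an environment
| pointer, and setjmp/longjmp stands in for the non-local goto. All
| names (full_fn, upto, sum_frame, sum_body, sum_until) are
| hypothetical.

```c
#include <assert.h>
#include <setjmp.h>

/* A "full function pointer" emulated by hand: a code pointer plus a
   pointer to the enclosing function's locals (its activation record). */
struct full_fn {
    void (*code)(void *env, int value);
    void *env;
};

/* The iterator: calls the loop body once per value and only returns
   when it has run out of values. */
static void upto(int lo, int hi, struct full_fn yield)
{
    for (int i = lo; i <= hi; i++)
        yield.code(yield.env, i);
}

/* Locals of the enclosing function that the loop body must reach. */
struct sum_frame {
    volatile int sum;     /* volatile: modified between setjmp and longjmp */
    int limit;
    jmp_buf after_loop;   /* stands in for the label after the for statement */
};

/* The for body, hoisted into a function of its own. */
static void sum_body(void *env, int value)
{
    struct sum_frame *f = env;
    if (f->sum + value > f->limit)
        longjmp(f->after_loop, 1);   /* "break": exits through upto() */
    f->sum += value;
}

/* Plays the role of the function containing the iterator-driven for:
   add up lo..hi, breaking out before the sum would exceed limit. */
int sum_until(int lo, int hi, int limit)
{
    struct sum_frame f;
    f.sum = 0;
    f.limit = limit;
    if (setjmp(f.after_loop) == 0) {
        struct full_fn body = { sum_body, &f };
        upto(lo, hi, body);
    }
    return f.sum;
}
```

| Here upto() returns normally only when the loop runs to completion;
| a "break" in the body never returns to upto() at all, which matches
| the point that the iterator only returns when it has finished.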
| Y_Y wrote:
| Is that supposed to be an upside-down smiley face?
| amszmidt wrote:
| So ... GCC has had some of these for ages.
| > Pascal lets you match a range of values with case low..high;
| > wouldn't it be great if C had that feature? High C does, another
| > feature standard C and C++ never adopted.
|
| https://gcc.gnu.org/onlinedocs/gcc/Case-Ranges.html
|
| > Nested functions
|
| https://gcc.gnu.org/onlinedocs/gcc/Nested-Functions.html
|
| > Generators
|
| GCC doesn't do those --- looks like a fun feature though!
|
| My favourite, which was sadly removed was doing:
| foo ? zork : bork = 123;
|
| Oh well...
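|
| For reference, a small sketch of the GNU case-range extension linked
| above. It is accepted by GCC and Clang but is not part of ISO C up to
| and including C23; the function name classify is hypothetical.

```c
#include <assert.h>

/* Classify a character using GNU C case ranges (case low ... high).
   The spaces around "..." keep it from being lexed as part of a number
   when the bounds are integer literals. */
char classify(int c)
{
    switch (c) {
    case '0' ... '9':
        return 'd';            /* digit */
    case 'a' ... 'z':
    case 'A' ... 'Z':
        return 'l';            /* letter */
    default:
        return '?';            /* anything else */
    }
}
```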
| JdeBP wrote:
| Not "ages" in comparison to how long MetaWare had them. High C
| had this stuff back in the early 1990s and 1980s.
|
| The headlined article doesn't mention it, but High C/C++ had
| modules all of those years ago, too. Anybase literals, as well.
| Tom Pennello participated in the standardization efforts back
| then, too, but none of this stuff made it in.
| mananaysiempre wrote:
| Is the manual online anywhere? The one for version 1.2 on
| Bitsavers[1] doesn't mention any of those.
|
| [1] https://bitsavers.computerhistory.org/pdf/metaware/High_C_La...
| JdeBP wrote:
| I suspect not. I am lucky enough to be consulting this
| copy: https://mastodonapp.uk/@JdeBP/113199154771277394
|
| (-:
| mezentius wrote:
| A later version is available here (along with the compiler
| itself): https://winworldpc.com/product/metaware-high-c-cpp/33x
| pistoleer wrote:
| You can still do *(foo ? &zork : &bork) = 123;
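|
| The pointer form works because the result of ?: is not an lvalue in
| standard C, while a dereferenced pointer is. A minimal sketch (the
| helper name store_selected is hypothetical):

```c
#include <assert.h>

/* Store v into whichever of *a or *b the condition selects. */
void store_selected(int cond, int *a, int *b, int v)
{
    *(cond ? a : b) = v;
}
```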
| adastra22 wrote:
| GCC's different implementation is mentioned in the article.
| qalmakka wrote:
| GCC nested functions are atrocious and deserve being banned
| from existence. Like the article rightfully says they've been
| implemented using weird hacks that make it basically impossible to
| use them safely. There's a reason why Clang has categorically
| refused to implement them.
| ronsor wrote:
| > foo ? zork : bork = 123;
|
| That's kind of horrifying.
| mananaysiempre wrote:
| As a side note, these days you can fake named arguments in C, if
| you're OK with every argument being 0 by default:
|     // Declaration:
|     void plot(float xlo, float xhi, float ylo, float yhi,
|               float xinc, float yinc);
|     struct plot_a { float xlo, xhi, ylo, yhi, xinc, yinc; };
|     static inline void plot_i(struct plot_a _a) {
|         // inline thunk to allow arguments to be passed in registers
|         plot(_a.xlo, _a.xhi, _a.ylo, _a.yhi, _a.xinc, _a.yinc);
|     }
|     #define plot(...) (plot_i((struct plot_a){ __VA_ARGS__ }))
|
|     // Call:
|     plot(alo, ahi, blo*2.0, bhi*2.0, .yinc = y, .xinc = f(x+z));
| ori_b wrote:
| Note: this breaks if you want to pass struct literals:
| plot((myfoo){x,y})
|
| Macros will take the struct literals as multiple parameters:
| plot( .arg0=(myfoo){x, .arg1=y} )
|
| C macros are best left unused when possible.
| mananaysiempre wrote:
| Nope! In general, that can be a problem, but not for this
| specific technique:
|
|     $ cpp -P
|     void plot(float xlo, float xhi, float ylo, float yhi,
|               float xinc, float yinc);
|     struct plot_a { float xlo, xhi, ylo, yhi, xinc, yinc; };
|     static inline void plot_i(struct plot_a _a) {
|         // inline thunk to allow arguments to go in registers
|         plot(_a.xlo, _a.xhi, _a.ylo, _a.yhi, _a.xinc, _a.yinc);
|     }
|     #define plot(...) (plot_i((struct plot_a){ __VA_ARGS__ }))
|     plot((myfoo){x,y})
|     plot(.yinc=(myfoo){x,y})
|     ^D
|     [...]
|     (plot_i((struct plot_a){ (myfoo){x,y} }))
|     (plot_i((struct plot_a){ .yinc=(myfoo){x,y} }))
|
| You could argue this is excessively clever, but when you need
| it, you really need it, so it could deserve known idiom
| status in the right situation.
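|
| A self-contained sketch of the same idiom that can be compiled and
| checked. It assumes C99 compound literals and designated
| initializers; the function box_area and its helper names are
| hypothetical, chosen only to keep the example small.

```c
#include <assert.h>

/* Hypothetical function with several same-typed parameters. */
float box_area(float xlo, float xhi, float ylo, float yhi)
{
    return (xhi - xlo) * (yhi - ylo);
}

/* One member per parameter, in the same order. */
struct box_area_a { float xlo, xhi, ylo, yhi; };

/* Thunk that unpacks the struct. It is defined before the macro, so
   the call inside it still refers to the real function. */
static inline float box_area_i(struct box_area_a a)
{
    return box_area(a.xlo, a.xhi, a.ylo, a.yhi);
}

/* From here on, box_area(...) builds a compound literal from its
   arguments; members that are not mentioned are zero-initialized. */
#define box_area(...) (box_area_i((struct box_area_a){ __VA_ARGS__ }))

float demo_positional(void)
{
    return box_area(1.0f, 3.0f, 2.0f, 5.0f);
}

float demo_named(void)
{
    return box_area(.yhi = 5.0f, .ylo = 2.0f, .xhi = 3.0f, .xlo = 1.0f);
}

float demo_defaulted(void)
{
    return box_area(.xhi = 2.0f, .yhi = 4.0f);   /* xlo and ylo are 0 */
}
```

| The cost of the trick is that every omitted argument silently
| becomes 0, so it suits functions where 0 is a sensible default.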
| CamperBob2 wrote:
| _C macros are best left unused when possible._
|
| Blame the committee for failing to specify an obvious and
| widely-demanded feature like named parameters.
|
| The only explanation is that the people in charge of the
| language don't write much code.
| szundi wrote:
| Or they do and don't want to learn stuff as "everything can
| be done the old way anyway"
| jimbob45 wrote:
| There are a lot of syntactic sugar improvements the
| committee could make that they simply refuse to. Named
| parameters and function pointer syntax are compile-time
| fixes that would have zero runtime costs, yet it's 2024 and
| we've hardly budged from ANSI C.
| CamperBob2 wrote:
| Exactly. I actually think default parameters are
| hazardous without named-parameter support. When they
| added one, IMO they should have added the other as well,
| so that you can specify exactly which non-default
| parameters you're passing.
| szundi wrote:
| Say that to the Linux kernel or any embedded system
| ori_b wrote:
| I've written both kernel code and embedded systems. It's
| easier to maintain the code when the preprocessor is
| avoided.
| notorandit wrote:
| I for one think that we have lost a number of good opportunities
| to make the C language a better and more powerful one.
|
| IMHO what I would need with C is a powerful pre-processor like
| jinja2 and some symbol manipulation features too.
| pjmlp wrote:
| Including having proper slices, but not even one of the
| language authors was able to change WG14 mind on the matter.
|
| That is what happens to languages that leave their authors
| behind and embrace design by committee.
| AlbertoGP wrote:
| I'm interested in that, do you happen to have a link or
| reference to that discussion?
| pjmlp wrote:
| This was in the old days; what exists is Dennis's fat pointers
| proposal.
|
| https://www.bell-labs.com/usr/dmr/www/vararray.html
| bhouston wrote:
| Who was the genius behind these features? Someone was incredibly
| forward looking in that company. Too bad it never got out into
| the world and impacted the language standards. It is surprising
| to see that so long ago.
|
| Also previously covered here on Hacker News:
| https://news.ycombinator.com/item?id=38938402
|
| Is there a PDF copy of this somewhere?
| dfawcus wrote:
| Bitsavers has a copy of the HC 1.2 reference manual (1985)
|
| It describes underscores in numbers, case ranges, named
| parameters, nested functions, and full function variables.
| https://bitsavers.org/pdf/metaware/High_C_Language_Reference_Manual_1.2_Nov85.pdf
|
| See Appendix A - 50 odd pages from the end of the file.
| layer8 wrote:
| CLU had _for_ loops with iterators (generators) and _yield_ in
| the mid-late 1970s [0]. The Icon programming language around
| the same time had similar generator features [1] (with _yield_
| spelled "suspend"). Ada (1983) also had such features I
| believe. These weren't completely unknown language features.
|
| [0] https://publications.csail.mit.edu/lcs/pubs/pdf/MIT-LCS-TR-2...
|
| [1] https://dl.acm.org/doi/pdf/10.1145/800055.802034
| Agingcoder wrote:
| Ada didn't have generators back then and still doesn't (I
| think it's being considered for inclusion soon). But it had
| all the other features.
| jmspring wrote:
| Metaware was a prolific compiler company based out of Santa
| Cruz in the 80s/90s. Loved what they did, they also had a very
| interesting culture. I knew about them through _cough_ _shady_
| _cough_ sites when learning and writing code back in the day.
| canucker2016 wrote:
| Yeah, it's sad to see those features as novel features instead
| of part of the standard list of features most programming
| languages supply.
|
| There's a link to the compiler manuals at
| https://winworldpc.com/product/metaware-high-c-cpp/33x
|
| The C manual PDF file mentions a 2007 copyright.
| omoikane wrote:
| This seems way ahead of its time, especially with generators.
| Maybe Fujitsu was able to just do it because they didn't bother
| with any of the lengthy standardization processes, but that's
| probably also why all these extensions seemed relatively unknown
| and had to be rediscovered and reinvented in modern C/C++ decades
| later.
| ahoka wrote:
| C could have been a much nicer language if it wasn't captured
| by people who insisted that not even two's complement can be in
| the standard.
| dxuh wrote:
| I guess those people were parts of the embedded software
| industry before 2000 (maybe today, I don't know). It's a very
| good thing that C, the lingua franca of modern computing,
| actually runs on everything and not just on the stuff we use
| to browse the internet.
| int_19h wrote:
| I don't think it was unreasonable at the time ANSI did the
| standardization originally. As with Common Lisp, they had to
| consider the numerous already-existing implementations and
| code written with them in mind.
| JdeBP wrote:
| It was not Fujitsu. It was MetaWare, which had a fair degree of
| experience with compilers. It had a contemporaneous Pascal
| compiler, which was quite well known, and Pascal already had
| nested functions.
| int_19h wrote:
| Coroutines and generators were already well-understood then
| (see Icon!), so I think it is indeed mostly about not having to
| worry about standardization.
| pjmlp wrote:
| Old memories, I had access to the MS-DOS version of it, still
| preferred Borland though.
| JoshTriplett wrote:
| For anyone wondering why the string literals in the pictured
| examples end with Y=n rather than \n, it looks like these code
| examples were written in Shift-JIS, and Shift-JIS puts Y= where
| ASCII has \\.
| nazgulsenpai wrote:
| Similarly, Japanese DOS prompts are C:Y= instead of C:\
| Eduard wrote:
| The author didn't provide information about when this book came
| out, nor is there any information to find on it. But I think at
| the book's release, Shift-JIS (the standard) didn't exist yet.
|
| Rather JIS X 0201 (
| https://en.m.wikipedia.org/wiki/JIS_X_0201 ) was used, on
| which Shift-JIS is based.
| layer8 wrote:
| This was originally just JIS Roman [0], the Japanese ASCII
| variant from 1969. Shift-JIS is what much later then added
| double-byte character set support.
|
| [0] https://en.wikipedia.org/wiki/JIS_X_0201
| zzo38computer wrote:
| The problem with that is that the ASCII code for a backslash is
| also used as the second byte of some 2-byte characters in
| Shift-JIS, which can sometimes cause Japanese string literals to
| not work properly in C. EUC-JP is better for this purpose, because
| it does not have that problem. (Using Shift-JIS with Pascal
| also does not have this problem, if you are using the (* *)
| comments instead of the { } comments.)
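|
| A small sketch of the trail-byte problem described above. It assumes
| the usual Shift-JIS lead-byte ranges (0x81-0x9F and 0xE0-0xFC, the
| upper part covering vendor extensions); the function names are
| hypothetical. The katakana "so" is 0x83 0x5C and the kanji "hyou" is
| 0x95 0x5C, so both end in the byte that ASCII assigns to backslash.

```c
#include <assert.h>
#include <stddef.h>

/* Lead bytes of two-byte Shift-JIS characters. */
static int sjis_is_lead(unsigned char b)
{
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

/* Count 0x5C bytes that are the second byte of a two-byte character,
   i.e. bytes that a lexer unaware of Shift-JIS would misread as the
   start of an escape sequence. */
size_t sjis_hidden_backslashes(const unsigned char *s, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (sjis_is_lead(s[i]) && i + 1 < n) {
            if (s[i + 1] == 0x5C)
                count++;
            i++;               /* skip the trail byte */
        }
    }
    return count;
}
```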
| lexicality wrote:
| Content aside, I'm fascinated by the typography in this book.
| It's simultaneously beautiful and horrendous!
|
| I don't know enough about Japanese orthography or keming rules to
| be sure but it looks very much like they took a variable width
| font with both kanji and latin characters and then hard formatted
| it into fixed width cells?
|
| Either way, it's nice that the code examples aren't in 8pt font
| like a lot of the books I have...
| _delirium wrote:
| That's pretty common in Japanese typography that mixes in Latin
| characters, especially in older fonts. Pretty decent
| explanation here:
| https://www.reddit.com/r/typography/comments/vvfmpu/comment/...
| AdmiralAsshat wrote:
| Question: Was the book from the screenshots composed in Japanese,
| or composed in English and then translated into Japanese?
|
| Since it's apparently from Fujitsu, I could see it being the
| former, but if so, I'm impressed with the quality of the English
| in the printf statements and code comments from non-native
| English speakers.
| layer8 wrote:
| It's funny they're using different fonts within the string
| literals.
| dsp_person wrote:
| Are there any good unofficial gcc plugins/extensions out there?
| It would be cool to extend C with a thing or two without adopting
| a full blown compiler like C2 or C3.
| Sad90sNerd wrote:
| These extensions are Ada features.
|
| Ada has:
|
| - Labels in the form of: Call (Param_A => 1, Param_B => "Foo");
|
| - Underscores can be used in numbers of any base: X : Integer :=
| 1_000;
|
| - Nested subprograms
|
| - Range based tests
| zzo38computer wrote:
| The file does not display because the browser insists on percent-
| encoding the apostrophe but the server insists that the
| apostrophe should not be percent-encoded, therefore resulting in
| an error message that it won't redirect properly. I can download
| the file properly with curl, though.
|
| I think these are good ideas.
|
| - Underscores in numeric literals: I think it is a good idea and
| is also what I had wanted to do before, too. (It should be
| allowed in hexadecimal as well as decimal)
|
| - Case ranges: GNU C has this feature, too.
|
| - Named arguments: This is possible with GNU C, although it
| doesn't work without writing it to handle this (although you can
| use macros to allow it to work with existing functions). You can
| pass a structure, either directly to the function, or using a
| macro containing a ({ }) block which extracts the values from the
| structure and passes them to the function (the compiler will
| hopefully optimize out this block and just pass the values
| directly). You can then use the named initialization syntax
| (which also allows arguments without names), and GNU C also
| allows you to have duplicates in which case only one of them will
| work, which allows you to use macros to provide default values.
| (I have tested this and it works.)
|
| - Nested functions: GNU C also has it, but does not have the
| "full function value" like this one does, and I think it might be
| helpful. Nonlocal exits can also be helpful. (I also think the
| GNU's nested functions could be improved, by allowing them to be
| declared as "static" and/or "register" in order to avoid the need
| of trampoline implementations, although "static" and "register"
| would both have their own additional restrictions; "static" can't
| access local variables and functions from the function it is
| contained in unless they are also declared as "static", and
| "register" means the address can't be taken (therefore allowing
| the compiler to pass the local variables as arguments to the
| nested function).)
|
| - Generator functions: I like this too and I think that it is
| useful (I had wanted things like this before, too). It is also
| interesting how it can work well with the nested functions.
|
| There are some other things that I also think should be added
| into a C compiler (in addition to existing GNU extensions), such
| as:
|
| - Allowing structures to contain members declared as "static".
| This is a global value whose name is scoped to the structure
| within the file being compiled (so, like anything else declared
| as static, the name is not exported), so any accesses will access
| the single shared value. Even in the case of e.g. (x->y) if y is
| a static member then x does not need to be dereferenced so it is
| OK if it is a null pointer.
|
| - Scoped macros, which work after the preprocessor works. It may
| be scoped to a function, a {} block inside of a function, a file,
| a structure, etc. The macro is only expanded where that name is
| in scope, and not in contexts where a new name is expected (e.g.
| the name of a variable or argument being declared) (in this case
| the macro is no longer in scope).
|
| - Allow defining aliases. The name being aliased can be any
| sequence of bytes (that is valid as a name on the target
| computer), even if it is not otherwise valid in C (e.g. due to
| being a reserved word). Any static declaration that does not
| declare the value may declare the alias.
|
| - Compile-time execution (with explicit declaration).
|
| - Custom output sections, which can be used or moved into
| standard sections in a portable way. These sections might not
| even be mapped, and may have assertions, alignment, overlapping,
| etc.
|
| - Allow functions to be declared as "register". If a function is
| declared as "static register" (so that the name is not exported),
| then the compiler is allowed to change the calling convention to
| work better with the rest of the program.
| WalterBright wrote:
| D (also in DasBetterC) has:
|
| 1. underscores in literals:
|
|     int a = 1_234_567;
|
| 2. case ranges:
|
|     case 5: .. case 6:
|
| 3. named arguments:
|
|     void test(int a, int b);
|     void foo() { test(b:3, a:4); }
|
| 4. nested functions:
|
|     int foo(int i) {
|         int plus(int a) { return a + i; }
|         return plus(3);
|     }
|
| 5. static nested functions:
|
|     int foo(int i) {
|         static int plus(int a) { return a + i; }
|         return plus(3);
|     }
|
|     Error: `static` function `test.foo.plus` cannot access variable
|     `i` in frame of function `test.foo`
|
| 6. a feature similar to generators
| https://dlang.org/spec/statement.html#foreach_over_struct_an...
| nialv7 wrote:
| I was thinking about D the whole way while reading this. I just
| know I am going to see Walter Bright in the comments XD.
| hermanhermitage wrote:
| Every time Walter posts it reminds me my dream language would
| simply be C with
| https://www.digitalmars.com/articles/C-biggest-mistake.html
| and probably go lang style interfaces. Maybe a little less UB
| and some extensions for memory safety proofs.
| WalterBright wrote:
| That's why DasBetterC has done very well! You could call it
| C with array bounds checking.
|
| I occasionally look at statistics on the sources of bugs
| and security problems in released software. Array bounds
| overflows far and away are the top cause.
|
| Why aren't people just sick of array overflows? In the
| latest C and C++ versions, all kinds of new features are
| trumpeted, but again no progress on array overflows.
|
| I can confidently say that in the 2 decades of D in
| production use, the incidence of array overflows has
| dropped to essentially zero. (To trigger a runtime array
| overflow, you have to write @system code and throw a
| compiler switch.)
|
| The solution for C I proposed is backwards compatible, and
| does not make existing code slower.
|
| It would be the greatest feature added to C, singularly
| worth more than all the other stuff in C23.
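|
| The article linked above proposes a language-level change. What
| follows is only a library-level emulation of the idea in today's C,
| not that proposal: a hypothetical struct that carries the length
| with the pointer, plus checked accessors. All names are made up for
| illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fat pointer: the length travels with the data pointer. */
struct int_slice {
    int *ptr;
    size_t len;
};

/* Build a slice from a real array, where sizeof still knows the length.
   Must not be used on a pointer. */
#define SLICE_OF(arr) \
    ((struct int_slice){ (arr), sizeof(arr) / sizeof((arr)[0]) })

/* Checked read: returns 1 and stores the element, or 0 if out of range. */
int slice_get(struct int_slice s, size_t i, int *out)
{
    if (i >= s.len)
        return 0;
    *out = s.ptr[i];
    return 1;
}

/* Checked sub-slice [lo, hi); yields an empty slice on a bad range. */
struct int_slice slice_sub(struct int_slice s, size_t lo, size_t hi)
{
    if (lo > hi || hi > s.len)
        return (struct int_slice){ NULL, 0 };
    return (struct int_slice){ s.ptr + lo, hi - lo };
}
```

| Unlike a language feature, nothing here stops code from bypassing
| the accessors and indexing ptr directly.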
| actionfromafar wrote:
| I don't always agree but I'll join you on this particular
| hill!
| hgs3 wrote:
| Related: The 'lcc-win' C compiler added operator overloading,
| default function arguments, and function overloading (see
| "generic functions") [1]. The Plan 9 C compiler introduced
| several language extensions, some of which, like anonymous
| structs/unions would eventually be incorporated into the C
| standard. Present day GCC accepts the -fplan9-extensions flag [2]
| which enables some nifty features, like automatically converting
| a struct pointer to an anonymous field for function calls and
| assignments.
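|
| As a small illustration of the part that did reach the standard,
| here is a sketch of C11 anonymous struct and union members. The type
| and field names are hypothetical.

```c
#include <assert.h>

/* Members of an anonymous struct or union are accessed as if they
   belonged directly to the enclosing struct. */
struct value {
    int is_float;
    union {              /* anonymous union */
        int i;
        float f;
    };
    struct {             /* anonymous struct */
        int line, column;
    };
};

/* No intermediate member name is needed to reach line. */
int value_line(const struct value *v)
{
    return v->line;
}
```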
|
| [1] https://lcc-win32.services.net/C-Tutorial.pdf
|
| [2] https://gcc.gnu.org/onlinedocs/gcc/Unnamed-Fields.html
___________________________________________________________________
(page generated 2024-09-25 23:00 UTC)