[HN Gopher] C99 doesn't need function bodies, or 'VLAs are Turin...
       ___________________________________________________________________
        
       C99 doesn't need function bodies, or 'VLAs are Turing complete'
        
       Author : lsof
       Score  : 225 points
       Date   : 2022-08-04 14:47 UTC (8 hours ago)
        
 (HTM) web link (lemon.rip)
 (TXT) w3m dump (lemon.rip)
        
       | jmount wrote:
       | Wowzers, I thought that was going to be some template re-write
       | expansion trick. Nope, just willing to run nearly arbitrary code
       | if you wrap it correctly (not a complaint!).
        
       | camel-cdr wrote:
       | This is only tangential to the article:
       | 
       | C isn't Turing complete without `fseek` (as far as I can tell).
       | 
       | Turing completeness requires you to be able to read/write from an
       | infinite tape (essentially infinite memory). This isn't possible
       | in C, because `sizeof` is a constant expression, thus limiting
       | the size of any type, and importantly also pointer type, to a
       | finite number, thus making the addressable memory finite.
       | 
       | From what I can tell, the only way you could theoretically access
       | an infinite tape is using `fseek` with `SEEK_CUR`.
       | 
       | There might be more shenanigans possible with `stdio.h`, but I'm
       | pretty sure C can't be Turing complete without the `stdio.h`
       | function (assuming no special language extensions).
       | 
       | If anybody is wondering, the preprocessor isn't Turing complete
       | either, because you might theoretically have infinite memory, but
       | then you only have finite recursion, which also isn't enough for
       | Turing completeness. File iteration can have infinite recursion,
       | but can also only carry over finite state between `#include`s.
       | 
       | Edit: I'm not talking about any real world implementation, but
       | rather about the theoretical bounds of the C abstract machine.
        
         | lelanthran wrote:
         | > This isn't possible in C, because `sizeof` is a constant
         | expression, thus limiting the size of any type, and importantly
         | also pointer type, to a finite number, thus making the
         | addressable memory finite.
         | 
         | Who said it had to be finite _in theory_? `size_t` is defined
         | in terms of the implementation, not in terms of bit-widths.
         | 
         | Sure, size_t is finite in practice, but then again, so are all
         | other programming languages. In theory, there is no upper limit
         | on size_t.
        
           | camel-cdr wrote:
           | `size_t` is defined as
           | 
           | > the unsigned integer type of the result of the sizeof
           | operator (https://port70.net/~nsz/c/c11/n1570.html#7.19p2)
           | 
           | `sizeof` returns the following:
           | 
           | > If the type of the operand is a variable length array type,
           | the operand is evaluated; otherwise, the operand is not
           | evaluated and the result is an integer constant.
           | (https://port70.net/~nsz/c/c11/n1570.html#6.5.3.4p2)
           | 
           | So `sizeof(size_t)` must be a concrete integer constant for
           | any given C implementation, it can't change at runtime.
           | 
           | As far as I'm aware, there isn't an instance of an infinitely
           | large integer, even in mathematics, there are finite integers
           | and there is the concept of infinity.
        
         | dmitrygr wrote:
         | Your argument sort of imploded on itself...your claim that only
         | a file can be considered tape is insane because memory is just
         | as usable as tape and you can keep extending memory available
         | to you using sbrk() . If you're going to claim the memory is
         | finite, well so are files. (Also lseek is not part of any C
         | spec)
         | 
         | And yes neither is infinite, so every computer is just a DFA,
         | and lseek changes nothing. But given the amount of memory that
         | exists we approximate and say they are Turing machines since
         | there is enough memory to do most what we need.
        
           | camel-cdr wrote:
           | My argument is that `fseek` (not `lseek`) is the only way for
           | standard C to access an infinite tape, because it allows
           | relative seeking in a file. `fseek(file, 1, SEEK_CUR)` to
           | advance and `fseek(file, -1, SEEK_CUR)` to go back.
        
             | kps wrote:
             | The tape wouldn't have to be provided as a single flat
             | file. You could, for instance, use something like a pair of
             | files, one for control and one for data:
             | 
             |     fputc('L', tape_control);
             |     c = fgetc(tape_data);
             |     fputc(d, tape_data);
             | 
             | or a single file with controls disjoint from the tape
             | symbols.
        
               | camel-cdr wrote:
               | This would be possible, but it feels like cheating :-)
               | 
               | Idk how to express this properly, but I feel like there
               | is a difference between this and the fseek approach.
               | 
               | In a similar vein, you could say that a certain kind of
               | undefined behavior controls the tape, or writing to a
               | specific part of memory does.
               | 
               | I suppose this would fall under adding an additional
               | compiler extension.
        
               | kps wrote:
               | > ... _or writing to a specific part of memory does._
               | 
               | Memory-mapped devices are pretty common. C's original
               | platform controlled tape drives by writing to memory:
               | https://www.tuhs.org/cgi-bin/utree.pl?file=V7/usr/sys/dev/tc...
        
             | dmitrygr wrote:
             | Pointer++
             | 
               | Pointer--
             | 
             | These do the same to pointers...
        
               | camel-cdr wrote:
               | No, pointers always have a finite size, because sizeof is
               | a constant expression. A theoretical fseek isn't bound
               | by this, the compiler could implement it with an infinite
               | tape.
        
               | _gabe_ wrote:
               | A pointer is just a number that points to addressable
               | memory. Pointers don't even point to actual memory these
               | days anyways. They're just glorified indices into OS
               | tables that eventually map to actual memory. You can
               | increase a pointer to be 128 bits big, or 256 bits big if
               | you wanted to. It doesn't change anything, because it's
               | just an index.
               | 
               | Here's a clarification on Wikipedia[0] that expands on this:
               | 
               | > nearly all programming languages are Turing complete if
               | the limitations of finite memory are ignored.
               | 
               | And files are just as finite as a pointer. You can't
               | address a location in a file bigger than sizeof(void*) on
               | a modern computer. If you wanted to, you would need to
               | add special support[1] and doing this is equivalent to
               | what you would do to support a bigger memory range for
               | pointers.
               | 
               | I'm curious, do you think other languages are Turing
               | complete? As far as I know, all modern programming
               | languages use a 64 bit addressable memory space maximum.
               | 
               | [0]: https://en.m.wikipedia.org/wiki/Turing_machine
               | 
               | [1]: https://stackoverflow.com/questions/4933561/what-is-the-rela...
        
               | camel-cdr wrote:
               | > You can increase a pointer to be 128 bits big, or 256
               | bits big if you wanted to. It doesn't change anything,
               | because it's just an index.
               | 
               | Yes, you can increase the pointer size arbitrarily, but
               | it is always constant for a single C implementation, you
               | can't increase the pointer size at run time.
               | 
               | Essentially, the process of you recompiling and rerunning
               | a program on implementations with successively larger
               | pointer sizes is Turing complete, but it requires you as a
               | special actor, and I'm only concerned with what a single
               | implementation can do.
               | 
               | To put it in another way, solving the halting problem is
               | trivial for any C program + a fixed C implementation,
               | that doesn't have special language extensions.
               | 
               | > I'm curious, do you think other languages are Turing
               | complete?
               | 
               | The easiest example would be brainfuck, because you can
               | just move the tape around in it, and a theoretical
               | implementation could trivially be hooked up to a
               | theoretical infinite tape.
               | 
               | > As far as I know, all modern programming languages use
               | a 64 bit addressable memory space maximum
               | 
               | I don't know enough about how other languages are
               | defined, but I could imagine that Java allows for an
               | implementation with infinite memory, because it doesn't
               | impose any requirements on how the pointers are
               | implemented under the hood.
        
               | dmitrygr wrote:
               | You're spouting nonsense. File offset is guaranteed to be
               | expressible by the integral type off_t. Thus also
               | limited.
        
               | camel-cdr wrote:
               | You are aware that `off_t` isn't in the C standard?
               | 
               | The standard has `fgetpos`:
               | 
               | > The fgetpos function stores the current values of the
               | parse state (if any) and file position indicator for the
               | stream pointed to by stream in the object pointed to by
               | pos
               | 
               | > If a file can support positioning requests (such as a
               | disk file, as opposed to a terminal), then a file
               | position indicator associated with the stream is
               | positioned at the start (character number zero) of the
               | file, unless the file is opened with append mode in which
               | case it is implementation-defined whether the file
               | position indicator is initially positioned at the
               | beginning or the end of the file.
               | 
               | Meaning `fgetpos` needn't work for files that don't
               | support `positioning requests`, while relative offsets
               | using `fseek` could still work.
        
               | dmitrygr wrote:
               | ok, that is just more nonsense...
               | 
               | but let's go with that. fgetpos stores offset in an
               | fpos_t object...in memory...memory is finite as you
               | admit...thus the length of this fpos_t object must be
               | finite...thus there is a limit to how many bits fpos_t
               | may contain, thus it can address only finite file
               | length...
        
               | camel-cdr wrote:
               | > ...thus there is a limit to how many bits fpos_t may
               | contain
               | 
               | Yes, but my point was that `fgetpos` needn't work for
               | all FILE pointers, take for example `stdin`.
               | 
               | The standard quotes from my above comment show that a
               | FILE pointer needn't support "positioning requests", so
               | fgetpos needn't work.
        
         | munificent wrote:
         | You could just make a doubly-linked list that dynamically grows
         | new nodes (using malloc()) at either end on demand in order to
         | implement an infinite tape.
         | 
         | (Of course on a real machine the malloc() will fail at some
         | point.)
        
           | camel-cdr wrote:
           | My point was that this isn't possible, because the
           | addressable memory, while possibly humongous, is always
           | finite, because the size of pointers is constant. So at some
           | point `malloc` must return the same address, even if you had
           | theoretically infinite memory available, there is no way to
           | address that in C.
        
             | munificent wrote:
             | Ah, good point!
        
         | arcbyte wrote:
         | You are pedantically correct and wrong. Even with fseek there
         | are only so many atoms in the universe and you'd eventually run
         | into limited memory. Rather than go so esoteric as to say that
         | only programs structured with fseek are Turing complete, we
         | generally just make the jump from languages able to use
         | arbitrarily sized RAM to assuming infinite RAM and say you're
         | Turing complete enough for most purposes.
        
           | camel-cdr wrote:
           | > Even with fseek there are only so many atoms in the
           | universe and you'd eventually run into limited memory
           | 
           | I should've explained that I'm not talking about a physical
           | implementation, but about the theoretical bounds of the C
           | abstract machine.
        
             | Phrodo_00 wrote:
             | Talking theoretically, since size_t is defined as:
             | 
             | > size_t can store the maximum size of a theoretically
             | possible object of any type (including array).
             | 
             | In a proper theoretical turing machine, size_t would allow
             | you to create and address arbitrarily large arrays too.
        
               | camel-cdr wrote:
               | `sizeof(size_t)` and `size_t x; sizeof x` must be
               | constant expressions. It is possible to create a C
               | implementation where the size of a pointer is arbitrarily
               | large, but it can't grow/isn't infinite.
               | 
               | For any given C implementation, it is thus trivial to
               | theoretically solve the halting problem (assuming no
               | stdio.h stuff is used).
        
               | Phrodo_00 wrote:
               | Speaking in so much the abstract that it becomes absurd
               | (which also can be used to prove your point, but bear
               | with me here), since `size_t` is required to hold the
               | size of any object, and a turing machine has infinite
               | memory, then size_t would also need to be infinitely big,
               | and `sizeof(size_t)` would be a constant returning
               | infinity.
               | 
               | So, I don't think it having to be a constant is a problem
               | as much as having to deal with infinitely big numbers
               | inside of infinite memory (which may or may not be a
               | contradiction, depending on axioms used to define a
               | turing machine. I still need to work my way through
               | annotated turing).
               | 
               | Of course, things do get a lot simpler and grounded using
               | IO functions like you said earlier.
        
               | camel-cdr wrote:
               | I don't think an infinitely large integer constant is a
               | construct that makes sense.
               | 
               | As far as I'm aware, there isn't an instance of an
               | infinitely large integer, even in mathematics, there are
               | finite integers and there is the concept of infinity.
               | (And size_t is defined as an unsigned integer type)
        
         | cypress66 wrote:
         | I'm sure someone more knowledgeable will point out the flaw,
         | but couldn't Turing's machine definition be changed from
         | infinite tape, to arbitrarily large tape, and thus solve this
         | technicality?
        
           | camel-cdr wrote:
           | Let me put it this way.
           | 
           | Given any theoretical C implementation and a strictly
           | conforming program (that doesn't do I/O) I can theoretically
           | trivially solve the halting problem.
           | 
           | Since the possible memory is constant for any given
           | implementation, I can use the following algorithm:
           | 
           | 1. Execute one instruction of the program
           | 
           | 2. If the program terminated, goto 4.
           | 
           | 3. Insert the entire program state into a hash table.
           |    If it is already in the hash table, goto 5.
           |    Otherwise, goto 1.
           | 
           | 4. The program terminates
           | 
           | 5. The program never terminates
           | 
           | The above algorithm solves the halting problem of any C
           | program with a constant memory bound in finite time.
        
         | jwilk wrote:
         | See "Subtleties of the ANSI/ISO C standard"
         | <https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1637.pdf>,
         | section V "C is not Turing complete". They discuss file I/O
         | briefly too, but I don't follow their reasoning.
        
           | camel-cdr wrote:
           | Yes, this is basically my argument.
           | 
           | They address that `ftell` must return a finite value, but
           | don't mention that `ftell` can fail for specific FILEs:
           | 
           | > If successful, the ftell function returns the current value
           | of the file position indicator for the stream
           | 
           | But FILEs don't necessarily have a "file position indicator":
           | 
           | > If a file can support positioning requests (such as a disk
           | file, as opposed to a terminal), then a file position
           | indicator associated with the stream is positioned at the
           | start (character number zero) of the file, unless the file is
           | opened with append mode in which case it is implementation-
           | defined whether the file position indicator is initially
           | positioned at the beginning or the end of the file.
           | 
           | They don't seem to discuss the topic further.
           | 
           | > Therefore we will here only consider C programs without
           | I/O. (Our argument can be adapted to show that the programs
           | that restrict I/O to reading stdin and writing stdout are
           | Turing incomplete as well, but we will not do so here.)
           | 
           | So they don't address the problem of IO completely.
        
         | [deleted]
        
         | wgd wrote:
         | The issue of memory bounds is commonly handwaved away. Note
         | that your desktop computer is technically not Turing complete
         | either, since it only has access to a finite amount of
         | memory+disk storage, and is thus a (very large) finite state
         | machine since there are only a finite number of states it can
         | be in.
        
       | beeforpork wrote:
       | But why? Why is this VLA parameter defined this way? It seems
       | totally bizarre and unnecessary, but I suppose it must have been
       | added to the standard to solve some kind of problem? Is the
       | proposal for this feature available and gives some insight?
        
         | mananaysiempre wrote:
         | You can try trawling through the document log[1], but a big
         | part of the C99 deliberations is either inaccessible or (like
         | the drafts) explicitly kept secret from anyone but the
         | committee.
         | 
         | [1] https://www.open-std.org/jtc1/sc22/wg14/www/wg14_document_lo...
        
         | TazeTSchnitzel wrote:
         | Allowing arbitrary expressions allows self-documenting
         | signatures:
         | 
         |     void concat_strs(
         |         int str1_len,
         |         const char str1[str1_len],
         |         int str2_len,
         |         const char str2[str2_len],
         |         char out_str[str1_len + str2_len]
         |     );
         | 
         |     void manipulate_array(
         |         array_dim dim,
         |         int arr[dim.x][dim.y]
         |     );
         | 
         | Supporting things like printf() was probably not specifically
         | desired, but it would be difficult to define it in such a way
         | that it accepts all reasonable expressions and doesn't accept
         | any "unreasonable" ones.
        
         | ufo wrote:
         | I speculate it's for consistency with VLAs that are not
         | function arguments, which is the more common use for VLAs.
         | 
         | The original purpose of VLAs is to let you stack-allocate an
         | array with a length that is not known at compile time. In ANSI
         | C you must heap-allocate such arrays, using malloc.
        
         | kllrnohj wrote:
         | It seems like this was to avoid introducing constant
         | expressions (constexpr) like the ones in C++. That is why C++
         | doesn't have this issue: it can also take expressions there,
         | but they must be resolvable at compile time.
        
       | jimmaswell wrote:
       | > that declaration is functionally equivalent, and compatible to
       | the following:
       | 
       | Nice to see some vindication - I remember saying this and IRC
       | pedants trying to argue up and down that they're absolutely not
       | the same
        
       | ynfnehf wrote:
       | I once implemented FizzBuzz using this trick
       | https://www.reddit.com/r/C_Programming/comments/qqazh8/fizzb...
        
         | inglor_cz wrote:
         | Wow, this is wizardry.
        
       | jnxx wrote:
       | > Side note: even though we can't return, main is the exception
       | to the rule that reaching the closing } of a function returning
       | non-void is verboten,
       | 
       | Isn't it legal to fall off the end of a non-void function,
       | only just defined as UB to _use_ the return value?
        
         | tedunangst wrote:
         | Correct.
         | 
         | > If the } that terminates a function is reached, and the value
         | of the function call is used by the caller, the behavior is
         | undefined.
        
       | pjmlp wrote:
       | Thanks to them being yet another attack vector and funny stuff
       | like on this post, got demoted to optional on C11.
       | 
       | Additionally Google spent several years paying to clean up the
       | Linux kernel from all VLA occurrences.
       | 
       | https://www.phoronix.com/news/Linux-Kills-The-VLA
        
         | ynfnehf wrote:
         | This usage of VLAs is once again mandatory for compilers to
         | support, since C23. From Wikipedia: "Variably-modified types
         | (but not VLAs which are automatic variables allocated on the
         | stack) become a mandatory feature".
        
           | pjmlp wrote:
           | Oh well...
        
             | ynfnehf wrote:
             | At least it is still optional to allow for stack allocated
             | VLAs. Which is the attack vector you mentioned.
        
         | option_key wrote:
         | >Thanks to them being yet another attack vector and funny stuff
         | like on this post, got demoted to optional on C11.
         | 
         | Sadly, the C committee doesn't really understand what was wrong
         | with VLAs and a sizable group of its members wants to make them
         | mandatory again:
         | 
         | https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2921.pdf
         | ("Does WG14 want to make VLAs fully mandatory in C23")
        
           | chrisseaton wrote:
           | > wants to make them mandatory again
           | 
           | What does 'mandatory' mean? Like if I write a C compiler
           | without them... what are they going to do about it?
        
             | mgraczyk wrote:
             | Code that complies with the standard will be rejected by
             | your compiler. The effect would probably be that few people
             | would use your compiler.
        
             | Inityx wrote:
             | Most mainstream compilers aim to be standards compliant
        
           | planede wrote:
           | What's wrong with VLAs is their syntax. It really shouldn't
           | use the same syntax as regular C arrays, otherwise they would
           | be fine, maybe with a scary enough keyword. They are more
           | generic than alloca too, alloca being scoped to the function,
           | while VLAs being scoped to the innermost block scope that
           | contains them.
        
             | pjmlp wrote:
             | Syntax, no protection against stack corruption,...
        
               | planede wrote:
               | You can corrupt the stack without VLAs just fine. What
               | else?
        
               | einpoklum wrote:
               | With VLAs:
               | 
               | 1. The stack-smashing pattern is simple, straightforward
               | and sure to be used often. Other ways to smash the stack
               | require some more "effort"...
               | 
               | 2. It's not just _you_ who can smash the stack. It's the
               | fact that anyone who calls your function will smash the
               | stack if they pass some large numeric value.
        
               | rwmj wrote:
               | VLAs make it a lot easier to corrupt the stack by
               | accident. Unless you're quite a careful coder, stuff
               | like:
               | 
               |     f (size_t n)
               |     {
               |       char str[n];
               | 
               | leads to a possible exploit where the input is
               | manipulated so n is large, causing a DoS attack (at best)
               | or full exploit at worst. I'm not saying that banning
               | VLAs solves every problem though.
               | 
               | However the main reason we forbid VLAs in all our code is
               | because thread stacks (particularly on 32 bit or in
               | kernel) are quite limited in depth and so you want to be
               | careful with stack frame size. VLAs make it harder to
               | compute and thus check stack frame sizes at compile time,
               | making the -Wstack-usage warning less effective. Large
               | arrays get allocated on the heap instead.
        
               | Dylan16807 wrote:
               | How does a huge VLA corrupt the stack? If there's not
               | enough space but code keeps going then isn't that a
               | massive bug with your compiler or runtime?
        
               | petters wrote:
               | Welcome to the world of undefined behavior. Anything can
               | happen....
        
               | mtlmtlmtlmtl wrote:
               | I think this is a common misunderstanding about UB. It's
               | not that anything can happen, just that the standard
               | doesn't specify what happens, meaning whatever happens is
               | compiler/architecture/OS dependent. So you can't depend
               | on UB in portable code. But something definite will
               | happen, given the current state of the system. After all,
               | if it didn't, these things wouldn't be exploitable
               | either.
        
               | Dylan16807 wrote:
               | What is undefined about a large VLA? It shouldn't be
               | undefined.
               | 
               | According to wikipedia "C11 does not explicitly name a
               | size-limit for VLAs"
        
               | anyfoo wrote:
               | Okay. How do you tell the kernel that? Sure, the kernel
               | will have put a guard page or more at the end of the
               | stack, so that if you regularly push onto the stack, you
               | will eventually hit a guard page and things will blow up
               | appropriately.
               | 
               | But what if the length of your variable length array is,
               | say, gigabytes, you've blown way past the guard pages,
               | and your pointer is now in non-stack kernel land?
               | 
               | You'd have to check the stack pointer all the time to be
               | sure, that's prohibitive performance-wise. Ironically,
               | x86 kind of had that in hardware back when segmentation
               | was still used.
        
               | Dylan16807 wrote:
               | I think the normal pattern is a stack probe every page or
               | so when there's a sufficiently large allocation. There's
               | no need to check the stack pointer all the time.
               | 
               | But that's not my point. If the compiler/runtime knows it
               | will blow up if you have an allocation over 4KB or so,
               | then it needs to do something to mitigate or reject
               | allocations like that.
        
               | anyfoo wrote:
               | > I think the normal pattern is a stack probe every page
               | or so when there's a sufficiently large allocation.
               | 
               | What exactly are you doing there, in kernel code?
               | 
               | > But that's not my point. If the compiler/runtime knows
               | it will blow up if you have an allocation over 4KB or so,
               | then it needs to do something to mitigate or reject
               | allocations like that.
               | 
               | Do what exactly? Just reject stack allocations that are
               | larger than a guard page? And keep book of past
               | allocations? A lot of that needs to happen at runtime,
               | since the compiler doesn't know the size with VLAs.
        
               | jasonhansel wrote:
               | Doesn't a similar DoS risk (from allowing users to
               | allocate arbitrarily large amounts of memory) also apply
               | to the heap? You shouldn't be giving arbitrary user-
               | supplied ints to malloc either.
        
               | lelanthran wrote:
               | > Doesn't a similar DoS risk (from allowing users to
               | allocate arbitrarily large amounts of memory) also apply
               | to the heap?
               | 
               | DoS Risk? No one cares too much about that - the problem
               | with VLAs is stack smashing, which then allows arbitrary
               | user-supplied code to be executed.
               | 
               | You cannot do that with malloc() and friends.
        
               | zajio1am wrote:
               | > stuff like ... leads to a possible exploit where the
               | input is manipulated so n is large
               | 
               | The same is true for most recursive calls, should
               | recursion be also banned in programming languages?
        
               | rhexs wrote:
               | When writing secure C? In most cases, absolutely.
        
               | mtlmtlmtlmtl wrote:
               | That's not really a fair comparison though. Recursion is
               | strictly necessary to implement several algorithms. Even
               | if "banned" from the language, you would have to simulate
               | it using a heap allocated stack or something to do
               | certain things.
               | 
               | None of this applies to VLA arguments.
        
               | 10000truths wrote:
               | It's not strictly necessary precisely _because_ all
               | recursions can be "simulated" with a heap allocated
               | stack. And in fact, the "simulated" approach is almost
               | always better, from both a performance and a maintenance
               | perspective.
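To make the "heap allocated stack" idea concrete, here is a small sketch that sums a binary tree without recursion. The `struct node` layout and the function name `tree_sum` are assumptions made up for this example, not anything from the thread.

```c
#include <stddef.h>
#include <stdlib.h>

struct node {
    int value;
    struct node *left;
    struct node *right;
};

/*
 * Sum all values in a binary tree without recursion.
 * Pending nodes live in a heap-allocated array that grows on demand.
 * Returns 0 on success and stores the sum in *out; returns -1 if
 * memory runs out, a failure the caller can handle explicitly.
 */
int tree_sum(const struct node *root, long *out)
{
    size_t cap = 16, len = 0;
    const struct node **stack = malloc(cap * sizeof *stack);
    long sum = 0;

    if (stack == NULL)
        return -1;
    if (root != NULL)
        stack[len++] = root;

    while (len > 0) {
        const struct node *n = stack[--len];
        sum += n->value;

        /* Each pop can push up to two children, so ensure room for two. */
        if (len + 2 > cap) {
            size_t new_cap = cap * 2;
            const struct node **grown = realloc(stack, new_cap * sizeof *grown);
            if (grown == NULL) {
                free(stack);
                return -1;
            }
            stack = grown;
            cap = new_cap;
        }
        if (n->left != NULL)
            stack[len++] = n->left;
        if (n->right != NULL)
            stack[len++] = n->right;
    }

    free(stack);
    *out = sum;
    return 0;
}
```

The depth of the tree now costs heap memory rather than call-stack memory, and exhaustion shows up as a return code instead of a crash.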
        
               | chjj wrote:
               | You shouldn't be writing C if you're not a careful coder.
        
               | samatman wrote:
               | And if you're a careful coder writing C, you should give
               | the VLA the stink eye unless it's proving its worth.
        
               | pjmlp wrote:
               | Yeah, right.
               | 
               | https://msrc-blog.microsoft.com/2019/07/16/a-proactive-appro...
               | 
               | https://research.google/pubs/pub46800/
               | 
               | https://support.apple.com/guide/security/memory-safe-iboot-i...
               | 
               | Maybe you could give a helping hand to Microsoft, Apple
               | and Google, they are in need of careful C coders.
        
               | chjj wrote:
               | I'm not sure if you intentionally missed my point.
               | Everything in C requires careful usage. VLAs aren't
               | special: they're just yet another feature which must be
               | used carefully, if used at all.
               | 
               | Personally, I don't use them, but I don't find "they're
               | unsafe" to be a convincing reason for why they shouldn't
               | be included in the already-unsafe language. Saying
               | they're unnecessary might be a better reason.
        
               | fpoling wrote:
               | VLAs are unsafe in the worst kind of way as it is not
               | possible to query when it is safe to use them. alloca()
               | at least in theory can return null on stack overflow, but
               | there is no such provision with VLA.
        
               | pjmlp wrote:
               | The goal should be to reduce the amount of sharp edges,
               | not increase them even further.
        
               | kllrnohj wrote:
               | Hint, that means nobody should be writing C.
        
               | krallja wrote:
               | Where is the lie?
        
               | marcosdumay wrote:
               | Too bad we have all that legacy C code that won't just
               | reappear by itself on a safer language.
               | 
               | That means there are a lot of not careful enough
               | developers (AKA, human ones) that will write a lot of C
               | just because they need some change here or there.
        
               | pjmlp wrote:
               | What about not adding even more ways how we should avoid
               | using C?
        
               | arinlen wrote:
               | > _What about not adding even more ways how we should
               | avoid using C?_
               | 
               | That's a mute point for C's target audience because they
               | already understand that they need to be mindful of what
               | the language does.
        
               | mjcohen wrote:
               | What the heck. It's "moot", not "mute".
        
               | wizofaus wrote:
               | I'm curious, are there accents in which those two words
               | are homophones? Given the US tendency to pronounce
               | new/due/tune as noo/doo/toon I can imagine some might say
               | mute as moot but I can't find anything authoritative
               | online.
        
               | jcranmer wrote:
               | According to Wikipedia, East Anglia does universal yod-
               | dropping, so mute/moot would be homophonic. (See
               | https://en.wiktionary.org/wiki/Appendix:English_dialect-depe...).
               | 
               | Personally, I haven't come across anyone who pronounces
               | 'mute' without the /j/.
        
               | danuker wrote:
               | They are not perfect homophones. There is a slight i (IPA
               | j) in "mute".
               | 
               | https://en.wiktionary.org/wiki/mute
               | 
               | https://en.wiktionary.org/wiki/moot
        
               | pjmlp wrote:
               | That is like saying if sushi knives are already sharp
               | enough, there is no issue cutting fish with a samurai
               | sword instead, except at least with the knife maybe the
               | damage isn't as bad.
        
               | Koshkin wrote:
               | I like your comparison of a C programmer with a samurai.
        
               | pjmlp wrote:
               | Including that most of them end up doing Seppuku on their
               | applications.
        
               | agumonkey wrote:
               | while we're hugging them from behind
        
               | phibz wrote:
               | It's more like the C programmer is a sushi master. They
               | can make a delicious, beautifully crafted snack. But if
               | the wrong ingredients are used you'll get very sick.
        
               | samatman wrote:
               | The difference between the largest sushi knives and a
               | katana is more about who wields them than the blade
               | involved.
        
               | pjmlp wrote:
               | One ends up cutting quite a few pieces either way.
        
           | jcelerier wrote:
           | the only result of banning VLAs is to force everyone to use
           | alloca, which is even less safe.
           | 
           | exhibit A: https://lists.freedesktop.org/archives/mesa-commit/2020-Dece...
           | 
           | exhibit B: https://github.com/neovim/neovim/issues/5229
           | 
           | exhibit C: https://github.com/sailfishos-mirror/llvm-project/commit/6be...
           | 
           | etc etc
        
             | tedunangst wrote:
             | Nobody is _forced_ to use alloca, which is not less safe,
             | only equally disastrous. Just use malloc, already.
        
               | jcelerier wrote:
               | ah yes, why didn't I think of it, let me just try:
               |     #include <cstdlib>
               |     #include <span>
               | 
               |     __attribute__((annotate("realtime")))
               |     void process_floats(std::span<float> vec)
               |     {
               |         auto filter = (float*) malloc(sizeof(float) * vec.size());
               | 
               |         /* fill filter with values */
               | 
               |         for(int i = 0; i < vec.size(); i++)
               |             vec[i] *= filter[i];
               | 
               |         free(filter);
               |     }
               | 
               |     $ stoat-compile++ -c foo.cpp -emit-llvm -std=c++20
               |     $ stoat foo.bc
               |     Parsing 'foo.bc'...
               | 
               |     Error #1:
               |     process_floats(std::span<float, 18446744073709551615ul>) _Z14process_floatsSt4spanIfLm18446744073709551615EE
               |     ##The Deduction Chain:
               |     ##The Contradiction Reasons:
               |      - malloc : NonRealtime (Blacklist)
               |      - free : NonRealtime (Blacklist)
               | 
               | oh noes :((
        
               | tedunangst wrote:
               | Here's a nickel, kid.
               | 
               | The bullshit about oh my embedded systems doesn't have
               | dynamic memory is bullshit. You either know how big your
               | stack is and how many elements there are, and you make
               | the array that big. Or you don't know and you're fucked.
               | 
               | You can't clever your way out of not knowing how big to
               | make the array with magic stack fairy pretend dynamic
               | memory. You can only fuck up. Is there room for 16
               | elements? The array is 16. Is there room for 32? It's 32.
        
               | dataflow wrote:
               | I think the parent comment was about malloc not being
               | real-time? Not about storage space.
               | 
               | Though I do wonder why there can't be a form of malloc
               | that allocates in a stack like fashion in real time to
               | satisfy the formal verifier?
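One common shape for "malloc that allocates in a stack-like fashion" is a bump (arena) allocator over a buffer reserved up front. The sketch below is only an illustration under stated assumptions: the `arena_*` names are hypothetical, the code uses C11 (`_Alignof`, `max_align_t`), and the caller is assumed to supply a suitably aligned buffer. Whether a given verifier accepts it is a separate question.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-capacity arena; names are illustrative only. */
struct arena {
    unsigned char *base; /* caller-provided, suitably aligned buffer */
    size_t cap;          /* total bytes available */
    size_t used;         /* bytes handed out so far */
};

void arena_init(struct arena *a, void *buf, size_t cap)
{
    a->base = buf;
    a->cap = cap;
    a->used = 0;
}

/*
 * Bump allocation: a bounded number of arithmetic steps, no locks and
 * no system calls. Sizes are rounded up so each block stays aligned.
 * Returns NULL when the arena is exhausted.
 */
void *arena_alloc(struct arena *a, size_t size)
{
    const size_t align = _Alignof(max_align_t);
    size_t rounded;
    void *p;

    if (size > SIZE_MAX - (align - 1))
        return NULL;
    rounded = (size + align - 1) / align * align;
    if (rounded > a->cap - a->used)
        return NULL;
    p = a->base + a->used;
    a->used += rounded;
    return p;
}

/* Stack-like release: remember a mark, then roll back to it later. */
size_t arena_mark(const struct arena *a)
{
    return a->used;
}

void arena_reset(struct arena *a, size_t mark)
{
    a->used = mark;
}
```

A real-time routine would take a mark on entry, allocate scratch space, and reset to the mark on exit, which mirrors how a stack frame behaves while still reporting exhaustion through a NULL return.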
        
               | remexre wrote:
               | > Though I do wonder why there can't be a form of malloc
               | that allocates in a stack like fashion in real time
               | 
               | I think that's basically what the LLVM SafeStack pass
               | does -- stack variables that might be written out of
               | bounds are moved to a separate stack so they can't smash
               | the return address.
        
               | kllrnohj wrote:
               | Real time also generally means your input sizes are
               | bounded and known, otherwise the algorithm itself isn't
               | realtime and malloc isn't the reason why.
               | 
               | But strictly speaking the only problem is a malloc/free
               | that can lock (you can end up with priority inversion).
               | So a lock-free malloc would be realtime just fine, it
               | doesn't have to be stack growth only.
        
               | dataflow wrote:
               | > Real time also generally means your input sizes are
               | bounded and known, otherwise the algorithm itself isn't
               | realtime and malloc isn't the reason why.
               | 
               | I think you meant to say something else? Real-time is a
               | property of the system indicating a constraint on the
               | latency from the input to the output--it doesn't
               | constrain the input itself. (Otherwise even 'cat'
               | wouldn't be real-time!)
        
               | jcelerier wrote:
               | > You either know how big your stack is
               | 
               | that's in most systems I target a run-time property, not
               | a compile-time one
        
       | svnpenn wrote:
       | This kind of insanity is exactly why I don't use stuff like C
       | anymore.
       | 
       | It's a landmark and incredible language, but it's chock full of
       | potholes and foot-shotguns (footguns that blow off your entire
       | leg). Even huge companies can't get it right, which makes sense.
       | It started as a language basically without guardrails, and
       | because of the extreme deference to the holy backward
       | compatibility, it's more or less always going to be that.
        
         | coliveira wrote:
         | C has many problems, but I don't see this as one. Very few
         | people even realize this functionality exists, and I don't see
         | anyone promoting its use.
        
         | MayeulC wrote:
         | I think that's more of an issue with C++.
         | 
         | Usually developers stick to known patterns and subsets. Sure,
         | you can write "clever" code that's more appropriate for IOCCC
         | than as system-critical code, but that's the case with most
         | programming languages.
         | 
         | There is such a thing as "bad code" and "good code". A lot of
         | code linters can catch "code smells", and features can often be
         | disabled at the compiler level.
         | 
         | On the other hand, not having to resort to inline assembler or
         | writing 500x the amount of code in some cases is why we have
         | such features. Complex features are useful in some (complex)
         | cases, but are not to be overused.
        
         | lelanthran wrote:
         | > It's a landmark and incredible language, but it's chock full
         | of potholes and foot-shotguns (footguns that blow off your
         | entire leg).
         | 
         | There are very few footguns in C, compared to (say) C++ (or
         | Python).
         | 
         | I can guarantee you that no employed C developer is putting
         | IOCCC type code into shipping products. Pick any language you
         | like, and turn up the code golfing to 11, and you'll be equally
         | horrified.
        
         | jmount wrote:
         | I agree. The incredibly semantics-hostile optimizer ("undefined
         | means I can do anything", whereas in old C "undefined" just
         | meant you were no longer sure what number was the result of an
         | overflow) just takes the cake.
        
           | tooltower wrote:
           | What old C do you mean? I can't think of any version where
           | undefined had a defined meaning
        
             | hxtk wrote:
             | Not defined as part of the standard, but before compilers
             | got as smart about their optimizations, it was easier to
             | have behavior that was technically undefined but could be
             | reasoned about in practice.
             | 
             | Now that compilers know cleverer optimizations, undefined
             | behavior is often impossible to reason about because the
             | compiler can change your logic into something else that is
             | more optimal and is equivalent to your logic only in well-
             | defined cases.
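A small sketch of the kind of code this describes, using signed integer overflow as the example. The function names are hypothetical. The first function relies on wraparound that the standard leaves undefined, so an optimizer is permitted to assume it never happens; the second stays inside defined behavior by checking before it adds.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Fragile: signed overflow is undefined, so a compiler is allowed to
 * assume x + 1 never wraps and may fold this test to "false".
 */
bool will_overflow_fragile(int x)
{
    return x + 1 < x;
}

/*
 * Well-defined: compare against the limits before doing the arithmetic.
 * Returns false (leaving *out untouched) if a + b would overflow.
 */
bool add_checked(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *out = a + b;
    return true;
}
```

The checked version gives the same answer at every optimization level because it never executes the overflowing addition.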
        
             | marssaxman wrote:
             | It's not that the behavior was defined by the C standard,
             | but that you could confidently predict what a given
             | compiler would generate, for a given platform, so it was
             | quite normal to write programs which made productive use of
             | officially-undefined behavior when that was the behavior
             | you wanted. (It may well have _been_ defined by that
             | particular compiler's documentation.)
             | 
             | Nowadays, people are used to thinking entirely inside the
             | abstraction provided by the language spec, but that was not
             | the case in the '80s and early '90s. The boundary between
             | the C language model and the underlying machine
             | architecture was porous. You would include snippets of
             | assembly language in your C code, if you wanted to do
             | something more quickly than you thought the compiler could
             | do it; and your C code would take full advantage of your
             | knowledge about the underlying memory layouts, calling
             | conventions, register usage, and so forth.
             | 
             | It did not really matter that there were gaps in the "C
             | machine" abstraction because nobody was really programming
             | against the "C machine"; they were programming against
             | their actual hardware, using C as a tool for generating
             | machine code.
        
               | tremon wrote:
               | _you could confidently predict what a given compiler
               | would generate, for a given platform_
               | 
               | But then you're no longer writing Standard C. You're
               | writing compiler-flavoured C, which is another source of
               | footguns for projects that outlive the compiler (or
               | version) they were originally written for. Which is fine,
               | as you say, for embedded-style projects that only target
               | one processor model and one compiler version. But I don't
               | think that applies to many of today's projects.
        
             | badsectoracula wrote:
             | You can read the C89 rationale here[0] but in general the
             | point of undefined behavior was (and maybe still is, i
             | didn't check other rationales) partly to let
             | implementations not bother with catching hard-to-catch
             | errors for things that could actually happen and partly to
             | allow for implementation extensions for things they didn't
             | want to or couldn't define.
             | 
             | In addition the entire idea of introducing undefined,
             | unspecified and implementation-defined behaviors was to let
             | existing implementations do, for the most part, whatever
             | they were already doing while still being standards
             | conformant (ok, the rationale's exact words are to "allow a
             | certain variety among implementations", but in practice C
             | compilers already existed in 1989 and the companies behind
             | them most likely wanted to call them as "C89 conformant"
             | without having to make significant changes).
             | 
             | C89 didn't define undefined behavior because that wouldn't
             | make sense, but it did define what it means and going by
             | the C89 rationale about what it was meant to be used for,
             | clearly the idea wasn't the extremist "breaking your code
             | at the slight whiff of UB because optimizations" but
             | "letting you do things that we can't or don't want to
             | define while keeping our own hands clear".
             | 
             | The "letting you" bit is important which is why they have
             | the distinction between "strictly conforming program" and
             | "conforming program" (i.e. minus the "strict") - which
             | essentially has the former only be for "maximally portable"
             | programs and the latter being "whatever conforming
             | implementations accept", with conforming implementations
             | being any C implementation that can compile strictly
             | conforming programs - regardless of any extensions the
             | implementation may have as long as these do not affect the
             | strictly conforming programs.
             | 
             | In other words it was C89 Committee's way of saying "a
             | (conforming) C program is basically anything a C compiler
             | compiles as long as said C compiler also compiles C
             | programs that adhere to the strict conformance we defined
             | here" - which BTW flies in the face of the entire idea that
             | introducing a single instance of "undefined behavior" makes
             | the entire program not "valid C" anymore (after all, a program
             | with undefined behavior can still be a conforming program
             | as long as it is accepted by a compiler that also accepts
             | strictly conforming programs).
             | 
             | This is the sort of circular self-referencing logic you get
             | when committees try to standardize something that already
             | has a bunch of not necessarily compatible implementations
             | while also trying to not ruffle the feathers of the
             | companies behind them too much.
             | 
             | It'd be an amusing tale if only some people (who you can
             | ignore anyway) didn't get into fights about what page x,
             | paragraph y, verse z of the Standard[1] say and decades
             | later funneling that logic into compilers (which are
             | somewhat harder to ignore) that break existing working code
             | while Bible thumping their standards book whenever someone
             | goes "WTF, this thing used to work before i upgraded the
             | compiler"[3]
             | 
             | [0] http://www.lysator.liu.se/c/rat/title.html
             | 
             | [1] Capitalization Intentional
             | 
             | [2] Yes, C was considered one at some point :-P
             | 
             | [3] "No, it is not valid C, it couldn't have worked. You
             | clearly imagined it."
        
       | ezoe wrote:
       | This is just too much of insanity I can witness today.
        
       | a-dub wrote:
       | the fact that you can do recursion before even entering the
       | function is amusing. not THAT strange though, i imagine the
       | compiler just gloms a preamble onto the executing function's
       | stack frame.
       | 
       | amusing to think that the goal of them is probably to make it
       | easier to avoid buffer overruns, but then they can just be extended
       | themselves to cause similar problems anyway.
        
         | legalcorrection wrote:
         | You are entering the function. It's not in the function body in
         | the source file, but the code is almost certainly inserted at
         | the beginning of the function in the compiled output.
        
           | [deleted]
        
       | tjoff wrote:
       | Excellent, didn't know that you could reference previous function
       | arguments :)
       | 
       | Though I do feel that this is something different from VLAs (no
       | critique of the article though!). In my head at least the key
       | factor about VLA is that you can allocate an array on the stack
       | when the size isn't known at compile time. But that is not what
       | the author does.
       | 
       | A VLA as specified in a function definition or a declaration
       | (such as: _void f(int n, float v[n]);_ ) is known at compile
       | time. And even if trickery is used to do a runtime-calculation of
       | [n] the array itself is not allocated on the stack. And thus also
       | not subject of the common criticism of VLAs.
       | 
       | So, at least in my opinion the arguments (discussed in this
       | thread) regarding VLAs are mostly orthogonal to the syntax
       | (ab)used here.
        
       | ufo wrote:
       | They mention that while loops are limited because C doesn't have
       | tail recursion. However, in my experience gcc and clang are
       | pretty decent at tail recursion. It happens via the -foptimize-
       | sibling-calls setting, which is enabled by default on -O2 or
       | higher.
       | 
       | The caveat is that the standard doesn't _guarantee_ these
       | optimizations. But there are some non-standard __attribute__
       | declarations that can help with that.
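For illustration, a tail-recursive function of the shape that sibling-call optimization can turn into a loop. The function `sum_to` is a made-up example. Nothing in the C standard promises the call is eliminated, so deep inputs may still exhaust the stack when built without optimization; Clang's `musttail` statement attribute is one non-standard way to demand it.

```c
/*
 * Tail-recursive sum of 1..n: the recursive call is the last thing the
 * function does, so a compiler that performs sibling-call optimization
 * (for example gcc or clang with -foptimize-sibling-calls) can replace
 * it with a jump. Without that optimization each call uses a new frame.
 */
unsigned long sum_to(unsigned long n, unsigned long acc)
{
    if (n == 0)
        return acc;
    return sum_to(n - 1, acc + n);
}
```

Calling `sum_to(10, 0)` walks n down to zero while carrying the running total in `acc`, which is what makes the call a tail call in the first place.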
        
         | [deleted]
        
         | monkpit wrote:
         | > opiltimization
         | 
         | Is this some portmanteau for "compile time optimization"?
        
           | ufo wrote:
           | Nah, it was just a typo. But I like your idea!
        
           | SavageBeast wrote:
           | On the topic of "compile time optimization" ... There was
           | once a merry bunch of very excellent and brilliant crack
           | smokers at Google who created a project called GWT. The
           | raison d'etre of the project was simply "We know you can
           | write JS but Java has more guardrails - why don't you write
           | Java and we'll transpile it down to better JS than you're
           | smart enough to write because JS basically sucks and is
           | always changing and browsers are F'd".
           | 
           | Before it was all said and done you could enable a "deep
           | compile" option that would even look at your java code for
           | cases that would never execute - rarely execute (you're a
           | dumb human right?) - etc and build an opinionated JS runtime
           | around it for performance and size (size complexity? Space-
           | Time-Trade-Off).
           | 
           | I was enamored with GWT for quite some time and developed a
           | high level of skill using the tool. I still miss it frankly.
           | People who seek to write one language and execute another are
           | fundamentally insane. Even though this is basically necessary
           | with a lang like C (one step above shifting bits and
           | understanding instructions etc) those people still amaze me.
           | 
           | I'm glad I wasn't born with the requisite intellect to go
           | down such rabbit holes myself. This makes me feel dumb and be
           | OK with it at the same time.
        
             | mrkeen wrote:
             | > People who seek to write one language and execute another
             | are fundamentally insane.
             | 
             | That would be people who use compilers.
        
               | chucksmash wrote:
               | You're technically right, which is the best kind of not
               | invited to happy hour.
        
               | SavageBeast wrote:
               | I wrote more or less the same reply but opted not to post
               | it - well done on you sir!
        
         | lsof wrote:
         | Well, I meant that loops are limited within the normal
         | constraints of the C standard, which doesn't offer tail call
         | elimination, but indeed as you point out it is easy to work
         | around it in practice using clang or gcc optimization passes.
        
       | google234123 wrote:
       | What is the best alternative when I want to allocate some
       | reasonably sized array on the stack and don't want to always
       | reserve the worst case size? alloca?
        
         | enriquto wrote:
         | You don't need an alternative. VLAs are perfectly appropriate
         | for that use case.
        
         | tedunangst wrote:
         | Reserve the maximum every time.
        
       | AshamedCaptain wrote:
       | Similar to making all your computations in the expressions for
       | default arguments in python (and/or C++) and leaving the function
       | bodies empty. Fancy, but not that mindboggling. What may surprise
       | people is how and when these expressions are evaluated since they
       | differ between languages.
        
         | jstanley wrote:
         | It might not be _mind-boggling_, but it is highly unusual to
         | people who "think in C". This is not C! This kind of stuff is
         | why purists stick to C89.
         | 
         | I'm uneasy about variable-length arrays at the best of times.
         | It hadn't even occurred to me that you might have a variable-
         | length array as a function argument.
        
           | AlotOfReading wrote:
           | Serious question, but what "purists" remain on ANSI C? Even
           | Linux has abandoned it as of 5.19.
           | 
           | The only times I use it are when I'm writing something that
           | requires ridiculous portability even to bizarre platforms and
           | compilers. C99 is far more ergonomic in comparison, even if
           | you have to avoid using some of the truly terrible design
           | decisions that came along for the ride.
        
             | arinlen wrote:
             | > _Serious question, but what "purists" remain on ANSI C?_
             | 
             | It means nothing. It's a baseless attempt at an appeal to
             | authority that has no merit.
        
             | alar44 wrote:
             | Embedded probably.
        
               | AlotOfReading wrote:
               | That's where I work. I haven't seen c89 in years outside
               | those "extreme portability" projects like Linux and curl.
        
           | jsmith45 wrote:
           | VLAs in a prototype are sensible, because they decay to a
           | pointer and thus function purely as documentation.
           | 
           | In the function definition, it seems to basically also decay
           | to a pointer, except that the compiler adds basically an
           | assertion that the value in the brackets is greater than zero
           | to the top of the function. That actually seems quite
           | weird to me.
           | 
           | I'd have been fine with the VLA acting like a proper local
           | VLA with respect to things like sizeof, in which case the
           | assertion makes sense. Or I'd have also been fine with it
           | totally decaying to a pointer, making it just documentation.
           | This half-way in between state is quite weird.
        
             | unnah wrote:
             | Note that sizeof is useful on the inner dimensions of
             | multi-dimensional array arguments: the following prints the
             | value of m.
             |     int foo(int n, int m, double x[n][m]) {
             |         printf("sizeof(x[0])=%zd\n", sizeof(x[0])/sizeof(double));
             |     }
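A variant of the snippet above that returns the inner dimension instead of printing it, as a sketch of why `sizeof` still works there. The name `inner_dim` is hypothetical, and the code assumes a compiler with VLA support (optional since C11).

```c
#include <stddef.h>

/*
 * The parameter x is adjusted to "pointer to array of m doubles", so the
 * outer dimension n cannot be recovered via sizeof, but the inner one can:
 * sizeof(x[0]) is computed at run time as m * sizeof(double).
 */
size_t inner_dim(int n, int m, double x[n][m])
{
    (void)n; /* the outer dimension only documents intent in this sketch */
    return sizeof(x[0]) / sizeof(double);
}
```

Passing a `double grid[2][5]` together with `n = 2` and `m = 5` yields 5, because the element type of the pointer carries `m` even though the outer array has decayed.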
        
           | silon42 wrote:
           | or RIIR is suddenly preferable.
        
           | Koshkin wrote:
           | > _purists stick to C89_
           | 
           | But the _true Scotsman_ ... uh, purist sticks to the K&R C as
           | C89 has already lost some of the original elegance and
           | simplicity. (Never mind that it's a challenge since the
           | general community has moved on.)
        
           | MayeulC wrote:
           | Oh, but designated initializers are so useful!
        
           | arinlen wrote:
           | > _This is not C! This kind of stuff is why purists stick to
           | C89._
           | 
           | This personal assertion holds no water.
           | 
           | People hold/held onto C89 because compilers like Visual
           | Studio's C compiler failed to support anything beyond C89 for
           | ages.
           | 
           | https://devblogs.microsoft.com/cppblog/c11-and-c17-standard-...
           | 
           | It's my understanding that Microsoft adopted a role in the C
           | standardization committee that was akin to sabotaging any
           | update beyond C89.
        
             | pjmlp wrote:
             | Microsoft saw no value in supporting C when C++ is a better
             | option [0], they caved in because Microsoft Loves Linux
             | (alongside key FOSS projects) implies loving C as well.
             | 
             | Note that they aren't supporting the optional annexes and
             | stuff like C atomics aren't supported.
             | 
             | [0] - https://herbsutter.com/2012/05/03/reader-qa-what-about-vc-an...
        
         | Izkata wrote:
         | > Similar to making all your computations in the expressions
         | for default arguments in python
         | 
         | Python default arguments are evaluated only once at function
         | definition, not every function call. It's the source of one
         | interesting WTF for anyone who assumes otherwise:
         |     def foo(a=[]):
         |         a.append(1)
         |         return a
         | 
         |     >>> foo()
         |     [1]
         |     >>> foo()
         |     [1, 1]
         |     >>> foo()
         |     [1, 1, 1]
        
         | omoikane wrote:
         | It's not quite the same as default arguments in that default
         | arguments are evaluated at run time, whereas array lengths for
         | C++ (and not C) need to be a compile time constant expression.
         | 
         |     #include <stdio.h>
         | 
         |     // "puts" makes "f" not constexpr.
         |     constexpr int f() { return puts("hello"); }
         | 
         |     // error: size of array 'argv' is not an integral
         |     // constant-expression
         |     int main(int argc, char *argv[f()]) {}
        
           | MayeulC wrote:
           | As you point out, VLAs in C are evaluated at run time too,
           | making it very similar to abusing default arguments.
        
         | planede wrote:
         | C++ doesn't allow accessing other function arguments in a
         | default argument, so in some sense the VLA trick is more
         | powerful.
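         | 
         | A rough sketch of what that looks like, again assuming the
         | compiler evaluates array-parameter length expressions at call
         | time. The names last_double, record_double and doubled are
         | hypothetical; the point is only that the length expression can
         | read n, which is declared earlier in the same parameter list:

```c
/* Hypothetical global that makes the side effect observable. */
int last_double = 0;

/* The length expression uses the earlier parameter n. The comma
 * operator stores 2 * n and then yields 1 as the array length. */
void record_double(int n, char unused[(last_double = 2 * n, 1)])
{
    (void)n;
    (void)unused;
}

/* Wrapper that passes a real one-byte buffer and returns the value
 * computed inside the parameter list. */
int doubled(int n)
{
    char dummy[1] = {0};

    record_double(n, dummy);
    return last_double;
}
```

         | The function body does no work at all; the whole computation
         | sits in the declarator.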
        
         | josefx wrote:
         | > What may surprise people is how and when these expressions
         | are evaluated since they differ between languages.
         | 
         | What could possibly be surprising about reusing the exact same
         | object on every function call, especially when you default to
         | an empty list, set or dict. Getting an actual empty list is as
         | easy as defaulting to None and explicitly checking for it, so
         | there isn't even a reason to do it any differently. /s
         | 
         | Who the hell thought that this was sane behavior in a language
         | with mostly mutable objects?
        
           | a-dub wrote:
           | i don't think it's a matter of someone thinking it was sane
           | behavior, it's just what falls out from the rest of the rules
           | of the language. it seems that any attempt to fix it would be
           | convenient but inconsistent.
        
           | guipsp wrote:
           | You had me in the first half.
        
       | SavageBeast wrote:
       | I haven't thought about C in years - the one where the author
       | passes in a printf as a char array element and de-references it
       | to execute it ... gives me chills. Y'all kids have fun - I'm
       | going to stick with my VM over here and call it a day. This makes
       | even modern JS look sane by comparison. Excellent write up too.
        
         | nneonneo wrote:
         | Nit: the char array doesn't get dereferenced at all. The entire
         | computation happens as a side effect of computing the length of
         | the array. This is more of a case where there's an unexpected
         | expression context that can be abused for fun.
        
           | SavageBeast wrote:
           | The fact the array element isn't in quotes bothers me to some
           | degree - with or without, it's crazy, but the fact that the
           | compiler just accepts it as a language construct as opposed
           | to a value makes me want to drink another beer. But again,
           | I'm thinking of dereferencing here - and as you said, that's
           | not the case.
        
             | nneonneo wrote:
             | It's not an array element :) The printf statement is part
             | of the length expression, i.e. it's setting the length of
             | the array to the result of the printf call. So it is indeed
             | a "value".
             | 
             | This isn't much different from writing something like this
             | in JS:
             | 
             |     var a = [];
             |     a[console.log("Hello"), 1] = 42;
             | 
             | except that this indexes the array as opposed to setting
             | its length.
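             | 
             | The same idea can be sketched with an ordinary local VLA
             | instead of a parameter, which is a variation on the
             | article's trick rather than the trick itself. The name
             | hello_length is made up, and the sketch assumes stdout is
             | writable so that printf succeeds:

```c
#include <stddef.h>
#include <stdio.h>

/* printf runs first; its return value (the number of characters
 * written) becomes the length of the array. */
size_t hello_length(void)
{
    char buffer[printf("Hello\n")];

    /* sizeof applied to a VLA is computed at run time. */
    return sizeof buffer;
}
```

             | Nothing is ever stored in the array; the call happens only
             | because the length has to be computed.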
        
               | SavageBeast wrote:
               | I spent the past 10 minutes figuring out what to search
               | for (C is very rusty here) regarding C arrays and
               | initialization. Now that you point it out, it seems
               | obvious. It certainly wasn't obvious when I read it
               | though. This is some really obtuse use of a language here
               | - hilariously so really. The code in the array is
               | executed as a function that determines the array size and
               | I see that now - thanks. If someone on my team did this
               | for any reason I'm not sure if I'd shoot them or put them
               | in charge of something more important. If nothing else,
               | THIS is why code reviews exist. Still, it's an
               | impressive use of a compiler. Thanks for nit picking -
               | this has been very entertaining!
        
               | lololol0l wrote:
        
         | asguy wrote:
         | > I haven't thought about C in years ... I'm going to stick
         | with my VM over here and call it a day.
         | 
         | What's your VM written in?
        
           | kllrnohj wrote:
           | Probably not C regardless. How many widely used VMs are in C?
           | Python is about it, right?
        
           | Banana699 wrote:
           | Safe Rust.
           | 
           | https://deno.land/
        
             | asguy wrote:
             | Except for most of deno that's written in Unsafe C++.
             | 
             | https://v8.dev/
        
               | Banana699 wrote:
               | A JS runtime is a lot more than the core engine, or else
               | nodejs is just a thin wrapper over V8 as well, which is
               | obviously absurd. So without a count of the source lines
               | for each language's portion of the implementation (very
               | rough but acceptable first approximation of how much is
               | implemented in each), you can't claim C++ is doing most
               | of the heavy lifting and be taken seriously.
               | 
               | PS. the worst C++ code is still a hell of a lot more safe
               | than the best C code. This is easy to convince yourself
               | of by noting that C++ mostly supersets C and only
               | deviates to add _more_ safety and static checking, not
               | less. So it's a strict improvement over C, not by a lot,
               | but an improvement nonetheless.
               | 
               | Here are other examples for VMs not written in any of
               | those 2 braindead languages though:
               | 
               | [1] Bun : JS runtime in Zig. https://bun.sh/
               | 
               | [2] Squeak : Smalltalk VM written in Smalltalk.
               | http://www.vpri.org/pdf/tr1997001_backto.pdf
               | 
               | [3] PyPy : Not actually a single VM but an entire
               | framework/toolchain to write VMs, most famous of which is
               | one for Python. The language used to write VMs for the
               | framework is a subset of Python called RPython.
               | https://doc.pypy.org/en/latest/
               | 
               | [4] Maxine Virtual Machine : a JVM written entirely in
               | Java. It's not the only one.
               | https://en.wikipedia.org/wiki/Maxine_Virtual_Machine ;
               | https://news.ycombinator.com/item?id=15733645
               | 
               | Modern VM research is far beyond the 50 year old
               | assembler that thinks itself a programming language. The
               | future is here, just not very evenly distributed.
        
               | asguy wrote:
               | It was intended as a joke, but your response is a massive
               | cope. Have you actually looked at the production
               | C/C++/assembly code bases you're depending on?
               | 
               | Get a grip.
        
         | lsof wrote:
         | That printf exploit is quite impressive I think.
        
       | charcircuit wrote:
       | >The astute reader might point out that these two versions of sum
       | are not equivalent because the recursive definition may cause a
       | stack overflow for large enough values of n. This is
       | unfortunately true, and the major hurdle for the practicality of
       | disembodied C, but does not preclude Turing completeness (an
       | ideal Turing machine has infinite memory at its disposal).
       | 
       | It does preclude Turing completeness. The stack is bounded above
       | by the number of distinct pointer values times the word size.
        
         | addaon wrote:
         | Does it actually? Or does the number of elements on the stack
         | /that you have taken the address of/ have that upper bound?
         | 
         | That is, could you build a compliant C implementation that
         | implements a stack using an infinite memory (e.g. in a delay
         | loop between the user and a mirror moving away through space),
         | where each entry in the stack contains a convenient sized value
         | (word, byte, whatever) and a tag? If the tag is clear, the
         | value is contained directly in the infinite memory; if the tag
         | is set, then the value is indirected into a fixed-sized memory.
         | On taking the address of an item on the stack, if it is not
         | already indirected, relocate it and indirect.
         | 
         | Of course the mechanism for doing stack unwinding on return
         | needs to use relative operations ("drop five elements") and not
         | chase frame pointers, but that seems trivial.
         | 
         | I admit I'm not super-familiar with all the details of the C
         | specification, but it's not obvious to me that this would
         | violate spec, and it would be Turing complete.
        
           | fhars wrote:
           | Yes it does. C is defined in terms of implementation-defined
           | but finite-sized pointers and integers, so the computational
           | model of C is a finite state machine. But the size of the
           | state space is so friggin' HUGE that that argument is
           | completely irrelevant for all practical purposes.
        
             | addaon wrote:
             | Is this true? C is defined in terms of the operations that
             | can be performed on objects in memory, which includes the
             | concept of memory addresses and address manipulation; but
             | I'm not aware of anything that would require that objects
             | that do not have their address taken have unique addresses,
             | or addresses at all. This is the same use of the as-if
             | principle that allows so many local variables to live their
             | entire life in registers. An infinite stack to support
             | unlimited recursion seems possible within the C machine.
        
               | tedunangst wrote:
               | The C standard specifies that there is a constant
               | UINTPTR_MAX.
        
               | addaon wrote:
               | Yes. But, as I mention in the proposed implementation,
               | this limits the number of unique addresses in an
               | implementation. Values that do not need addresses do not
               | consume this limited resource. There would be a limit on
               | the number of items on the stack that can have their
               | address taken, but I can't see why this would limit the
               | number of items on the stack.
               | 
               | In e.g. the fizzbuzz implementation mentioned in another
               | comment [1], even without tail call optimization the
               | stack growth implied by the recursive call does not
               | require the number of /addressed/ items on the stack to
               | grow. (Thinking about it, I believe this is an equivalent
               | statement to "the tail call optimization is valid.")
               | 
               | [1] https://old.reddit.com/r/C_Programming/comments/qqazh
               | 8/fizzb...
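               | 
               | As a small illustration of the kind of function this
               | argument is about (the names sum_recursive and
               | sum_iterative are hypothetical, and real compilers still
               | have a finite stack): the recursive version below never
               | takes the address of any local, so none of its frames
               | needs a distinct pointer value.

```c
/* Recursive sum of 0..n. No local variable has its address taken, so
 * under the as-if rule the frames need no unique addresses. */
unsigned long sum_recursive(unsigned long n)
{
    return n == 0 ? 0 : n + sum_recursive(n - 1);
}

/* Iterative version with constant stack usage, for comparison. */
unsigned long sum_iterative(unsigned long n)
{
    unsigned long total = 0;

    for (unsigned long i = 1; i <= n; i++) {
        total += i;
    }
    return total;
}
```

               | Both return the same values; only the recursive one grows
               | the stack, and it does so without consuming addresses.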
        
       ___________________________________________________________________
       (page generated 2022-08-04 23:00 UTC)