[HN Gopher] Pnut: A C to POSIX shell compiler you can trust
       ___________________________________________________________________
        
       Pnut: A C to POSIX shell compiler you can trust
        
       Author : feeley
       Score  : 160 points
       Date   : 2024-07-24 00:22 UTC (22 hours ago)
        
 (HTM) web link (pnut.sh)
 (TXT) w3m dump (pnut.sh)
        
       | o11c wrote:
       | It's a bad sign when I immediately look at the screenshot and see
       | quoting bugs.
        
         | laurenth wrote:
         | Author here,
         | 
         | Because all shell variables in code generated by pnut are
         | numbers, variables never contain whitespace or special
         | characters and don't need to be quoted. We considered quoting
         | all variable expansions as this is generally seen as best
         | practice in shell programming, but thought it hurt readability
         | and decided not to.
         | 
         | If you think there are other issues, please let me know!
        
           | taviso wrote:
           | I think they're talking about the cp example, doesn't seem
           | like it would handle filenames with spaces!
           | 
           | Super neat project, btw!
        
             | laurenth wrote:
             | You're right, thanks for the bug report. It should now be
             | fixed :)
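The quoting point in this subthread can be illustrated with a small sketch. It assumes the default `IFS`; `count_args` is a hypothetical helper for illustration and is not part of Pnut. Numeric values survive unquoted expansion, while strings such as filenames do not.

```shell
#!/bin/sh
# count_args: print how many arguments the shell actually passed.
count_args() { echo "$#"; }

n=42                 # numeric value, like the variables in Pnut-generated code
file="my file.txt"   # arbitrary string, e.g. a filename taken from argv

count_args $n        # one argument: digits contain no IFS characters
count_args $file     # two arguments: the unquoted expansion is split at the space
count_args "$file"   # one argument: quoting keeps the filename intact
```

This is why an example such as cp, which handles user-supplied strings, needs quoted expansions even if purely numeric variables do not.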
        
       | cozzyd wrote:
       | Can finally port systemd to shell to quell the rebellion.
        
         | carapace wrote:
         | Damned if that isn't the funniest thing I've heard in a long
         | time.
        
       | akoboldfrying wrote:
       | I was puzzled by the example C function containing pointers. Do I
       | understand correctly that you implement pointers in shell by
       | having a shell variable _0 for the first "byte" of "memory", a
       | shell variable _1 for the second, etc.?
        
         | laurenth wrote:
         | Author here,
         | 
         | That's correct! Unlike Bash and other modern shells, the POSIX
         | standard doesn't include arrays or any other data structures.
         | The way we found around this limitation is to use arithmetic
         | expansion and indexed shell variables (which start with `_`,
         | as you noted) to get random memory access.
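A minimal sketch of the indexed-variable idea described above, assuming one shell variable per memory cell (`_0`, `_1`, ...). The `alloc` helper and the `heap_top` and `result` names are hypothetical and are not taken from Pnut's runtime.

```shell
#!/bin/sh
# Emulate a flat memory: cell N lives in the shell variable _N.
heap_top=0

# alloc N: reserve N cells and leave the base address in the global `result`.
alloc() {
  result=$heap_top
  heap_top=$(( heap_top + $1 ))
}

# Rough equivalent of: int *a = malloc(3); a[0] = 10; a[1] = 20; a[2] = 12;
alloc 3
a=$result
: $(( _$((a + 0)) = 10 ))   # the inner expansion computes the address,
: $(( _$((a + 1)) = 20 ))   # the outer one assigns to the variable _<address>
: $(( _$((a + 2)) = 12 ))

# Rough equivalent of: sum = a[0] + a[1] + a[2];
sum=$(( _$((a + 0)) + _$((a + 1)) + _$((a + 2)) ))
echo "$sum"
```

The nested arithmetic expansion is what turns a computed address into a variable name, so no arrays or `eval` are needed.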
        
           | thesnide wrote:
           | I used almost the same idea, but with files in my
           | https://github.com/steveschnepp/shlibs
        
           | osmsucks wrote:
           | Since I experimented with something similar in the past to
           | mimic multidimensional arrays: depending on the
           | implementation this can absolutely _kill_ performance. IIRC,
           | Dash does a linear lookup of variable names, so when you
           | create tons of variables each lookup starts taking longer and
           | longer.
        
             | n_plus_1_acc wrote:
             | I hope you're not compiling C to sh for performance
             | reasons.
        
               | osmsucks wrote:
               | It's not about performance, it's about viability. If the
               | result is so slow that it's unusable, it doesn't matter
               | how portable it ends up being.
        
               | laurenth wrote:
               | We haven't found this to be an issue for Pnut. One of the
               | metrics we use for performance is how much time it takes
               | to bootstrap Pnut, and dash takes around a minute which
               | is about the time taken by bash. This is with Pnut
               | allocating around 150KB of memory when compiling itself,
               | showing that Dash can still be useful even when hundreds
               | of KBs are allocated.
               | 
               | One thing we did notice is that subshells can be a
               | bottleneck when the environment is large, and so we
               | avoided subshells as much as possible in the runtime
               | library. Did you observe the same in your testing?
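One common way to avoid subshells, sketched here under the assumption that functions return values through a global variable rather than through command substitution. The function names and the `result` variable are hypothetical, not Pnut's actual runtime API.

```shell
#!/bin/sh
# Subshell style: $(square_sub 7) makes the shell run the function in a
# subshell, which gets a copy of the (possibly large) environment.
square_sub() { echo $(( $1 * $1 )); }

# Subshell-free style: the caller reads the global `result` afterwards.
square_var() { result=$(( $1 * $1 )); }

# Sum of squares 1..100 without creating a single subshell.
total=0
i=1
while [ "$i" -le 100 ]; do
  square_var "$i"
  total=$(( total + result ))
  i=$(( i + 1 ))
done
echo "$total"
```

The trade-off is readability: the global-variable convention is less natural than `$(...)`, but it keeps the cost of a call independent of how many variables exist.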
        
       | forrestthewoods wrote:
       | Hrmmm. But why?
       | 
       | Quite frankly I think Bash scripting is awful and frequently wish
       | shell scripts were written in a real and debuggable language. For
       | anything non-trivial that is.
       | 
       | I feel like I'd rather write C and compile it with Cosmopolitan C
       | to give me a cross-platform binary than this.
       | 
       | Neat project. Definitely clever. But it's headed in the opposite
       | direction from what I'd prefer...
        
         | marcodiego wrote:
         | Master Foo once said to a visiting programmer: "There is more
         | Unix-nature in one line of shell script than there is in ten
         | thousand lines of C."
         | 
         | The programmer, who was very proud of his mastery of C, said:
         | "How can this be? C is the language in which the very kernel of
         | Unix is implemented!"
         | 
         | Master Foo replied: "That is so. Nevertheless, there is more
         | Unix-nature in one line of shell script than there is in ten
         | thousand lines of C."
         | 
         | The programmer grew distressed. "But through the C language we
         | experience the enlightenment of the Patriarch Ritchie! We
         | become as one with the operating system and the machine,
         | reaping matchless performance!"
         | 
         | Master Foo replied: "All that you say is true. But there is
         | still more Unix-nature in one line of shell script than there
         | is in ten thousand lines of C."
         | 
         | The programmer scoffed at Master Foo and rose to depart. But
         | Master Foo nodded to his student Nubi, who wrote a line of
         | shell script on a nearby whiteboard, and said: "Master
         | programmer, consider this pipeline. Implemented in pure C,
         | would it not span ten thousand lines?"
         | 
         | The programmer muttered through his beard, contemplating what
         | Nubi had written. Finally he agreed that it was so.
         | 
         | "And how many hours would you require to implement and debug
         | that C program?" asked Nubi.
         | 
         | "Many," admitted the visiting programmer. "But only a fool
         | would spend the time to do that when so many more worthy tasks
         | await him."
         | 
         | "And who better understands the Unix-nature?" Master Foo asked.
         | "Is it he who writes the ten thousand lines, or he who,
         | perceiving the emptiness of the task, gains merit by not
         | coding?"
         | 
         | Upon hearing this, the programmer was enlightened.
        
           | forrestthewoods wrote:
           | And then the programmer had to debug a hundred line shell
           | script and they realized it should have all been written in
           | Python or Rust instead.
           | 
           | Master Foo is shorthand for Fool.
        
             | binary132 wrote:
             | Shell is just one way. There's nothing that says we can't
             | do better than shell, but what it's good at is saving
             | programmer time when the need isn't there for more, and
             | Rust is definitely not good at that.
        
               | forrestthewoods wrote:
               | My rule of thumb:
               | 
               |     Shell: <= 5 lines
               |     Python: <= 500 lines
               |     Rust: > 500 lines
               | 
               | Although to be honest I'd be perfectly happy if Shell was
               | restricted to single line commands only.
               | 
               | I've wasted a lot of time and energy deciphering
               | undebuggable shell scripts that were written to "save
               | programmer time". Not a fan.
        
               | therein wrote:
               | This rule of thumb is clearly too simplified, even as far
               | as the definition goes.
               | 
               | Sometimes you just want to execute 50 lines with little
               | logic.
               | 
               | Sometimes you just have some simple logic that needs to
               | be repeated.
               | 
               | Sometimes that logic is complicated, sometimes it is not.
        
               | forrestthewoods wrote:
               | Sometimes someone writes 50 lines of simple logic. And
               | then sometimes someone else needs to figure out why it's
               | not working. That person gets very cranky and wastes a
               | lot of time when those "simple" 50 lines aren't
               | debuggable.
               | 
               | If shell scripting didn't exist I would be totally fine
               | with that. There are _far_ more scripts that I wish were
               | written in a real language than the other way around.
        
               | eichin wrote:
               | My rule (and the code review policy I impose) emphasizes
               | complexity instead - a 50 line shell script is great if
               | it doesn't use if or case. (It's not so much of a strict
               | rule as "once you're nesting conditionals, or using any
               | shell construct that really needs a comment to explain
               | the shell and not your code, you should probably already
               | have switched to python." This is in parallel with "error
               | handling in this case is critical, do you _really_ think
               | your bash is accurate enough?")
               | 
               | I wasn't the strictest reviewer (most feared, sure, but
               | not strictest) at least partly because my personal line
               | for "oh that bit of shell is obvious" is _way_ too high.
        
             | shric wrote:
             | Master Foo long predates Python and Rust.
        
               | forrestthewoods wrote:
               | Masters live to be surpassed by their students. Just
               | because something was best in class in the 80s doesn't
               | mean it should still be used.
        
               | thesnide wrote:
               | Very true, but also student hubris is legendary. Which is
               | perfectly fine, as we all know successful students.
               | 
               | But let's not blind ourselves with survivorship bias. Not
               | everything new and very bright will stand the test of
               | time.
               | 
               | So let's take everything with a grain of salt, and wait
               | until time has chosen its champions. Which might not
               | be the best technology, as we have learned.
        
           | donatj wrote:
           | I was going to cite this after reading the parent comment.
           | Was very glad to see you beat me to it!
        
           | VitoVan wrote:
           | http://catb.org/~esr/writings/unix-koans/ten-thousand.html
        
           | wruza wrote:
           | This koan shows the power of a one-liner, not shell scripting
           | in general. Both Master Foo and Nubi would agree that a
           | string/array manipulating function in bash isn't worth their
           | time when python exists.
        
         | luism6n wrote:
         | I'm not the OP, but I think the goal is to make it cross
         | architecture. Cross platform C compiler would give you cross OS
         | compatibility, but chip architecture would still be fixed, I
         | think.
         | 
         | I.e., you can take your compiled.sh and run it on an obscure
         | processor with an obscure OS; as long as it's POSIX, it should
         | work...
        
           | MobiusHorizons wrote:
           | I believe the goal is to defeat the compiler trust thought
           | exercise where a malicious compiler could replicate itself
           | when being asked to compile the compiler. Since this produces
           | human readable code instead of assembly, the idea is it
           | allows bootstrapping a trusted compiler, since pnut.sh and
           | any output shell executables are directly auditable.
           | 
           | I suppose the trust moves to the shell executable then, but
           | at least you could run the bootstrapping with multiple shells
           | and expect identical output.
        
             | laurenth wrote:
             | That's the idea!
             | 
             | As you point out, it moves the trust from the binary to the
             | shell executable, but the shell is already a key piece of
             | any build process and requires a minimum level of trust.
             | The technique of bootstrapping on multiple shells and
             | comparing the outputs is known as Diverse Double-Compiling[0],
             | and we think POSIX shell is particularly
             | suited for this use case since it has so many
             | implementations from different and likely independent
             | sources.
             | 
             | The age and stability of the POSIX shell standard also play
             | in our favor. Old shell binaries should be able to bootstrap
             | Pnut, and those binaries may be less likely to be
             | compromised as the trusting trust attack was less known at
             | that time, akin to low-background steel[1] that was made
             | before nuclear bombs contaminated the atmosphere and steel
             | produced after that time.
             | 
             | 0: https://dwheeler.com/trusting-trust/
             | 1: https://en.wikipedia.org/wiki/Low-background_steel
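The "run it on several shells and compare" step can be sketched as follows. This is only the comparison part, not the full bootstrap procedure. The `run_diverse` name and the list of candidate shells are assumptions made for illustration; only shells found on the machine are used.

```shell
#!/bin/sh
# run_diverse SCRIPT [ARGS...]: run SCRIPT under every available shell from a
# hypothetical candidate list and check that all outputs have the same checksum.
run_diverse() {
  script=$1
  shift
  reference=
  tested=0
  for shell in sh dash bash ksh mksh yash; do
    command -v "$shell" >/dev/null 2>&1 || continue   # skip shells not installed
    sum=$("$shell" "$script" "$@" | cksum)
    tested=$(( tested + 1 ))
    if [ -z "$reference" ]; then
      reference=$sum                                   # first shell sets the baseline
    elif [ "$sum" != "$reference" ]; then
      echo "output differs under $shell" >&2
      return 1
    fi
  done
  echo "$tested shells agree"
}

# Hypothetical usage, assuming pnut.sh and pnut.c are in the current directory:
#   run_diverse pnut.sh pnut.c
```

A mismatch does not say which shell is wrong, only that at least one of them deserves a closer look.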
        
         | wahern wrote:
         | I don't know about the specific motivations for this project,
         | but if you're curious about why work like this might have
         | serious real-world relevance beyond scratching an itch, idle
         | exploration, or meeting a research paper quota, you can look to
         | similar work and literature:
         | 
         | GNU Mes: https://www.gnu.org/software/mes/
         | 
         | Stage0: https://bootstrapping.miraheze.org/wiki/Stage0
         | 
         | Ribbit (same authors): https://github.com/udem-dlteam/ribbit
         | 
         | stage0-posix: https://github.com/oriansj/stage0-posix
         | 
         | Bootstrappable Builds: https://bootstrappable.org/
         | 
         | See also this LWN article about bootstrappable and reproducible
         | builds: https://lwn.net/Articles/841797/ It contains a plethora
         | of interesting links.
        
         | oguz-ismail wrote:
         | > Hrmmm. But why?
         | 
         | because Bash goes brrrr
        
       | andrewf wrote:
       | Looking forward to the point where this can build autoconf. It's
       | great that the generated ./configure script is portable but if I
       | want to make substantial changes to the project I need to find a
       | binary for my machine (and version differences can be quite
       | substantial)
        
         | akdev1l wrote:
         | This is going further into the hell that is shell-generated
         | scripts that culminated in the xz-utils attack.
         | 
         | We would benefit from steering away from auto-generated
         | scripts. Autoconf included.
        
         | jcranmer wrote:
         | > Looking forward to the point where this can build autoconf.
         | 
         | Autoconf is a perl program that turns (heavily customized) m4
         | files into shell scripts. How does a C compiler help there?
        
           | andrewf wrote:
           | > Autoconf is a perl program
           | 
           | Oof, did not realize.
        
       | rubicks wrote:
       | I can't wait to see the shell equivalents for ptrace, setjmp, and
       | dlopen.
        
         | actionfromafar wrote:
         | Do you _really_?
         | 
         | Maybe then I can also interest you in an exception handler for
         | DOS batch scripts:
         | 
         | https://stackoverflow.com/a/55501133/193892
        
       | wahern wrote:
       | This is very cool, regardless of how serious it was intended to
       | be taken. Before base-64 encoders/decoders became more common as
       | preinstalled commands in the environments I found myself on, I
       | wrote a base64 utility in mostly pure POSIX shell:
       | https://25thandClement.com/~william/2023/base64.sh
       | 
       | If this project had existed I might have opted to compile my
       | C-based base-64 encoder and decoder routines, suitably tweaked
       | for pnut's limitations.
       | 
       | I say base64.sh is mostly pure not because it relies on shell
       | extensions, but because the only non-builtins it depends on are
       | od(1) or, alternatively, dd(1) to assist with binary I/O. And
       | preferably od(1), as reading certain control characters, like
       | NUL, into a shell variable is especially dubious. The encoder is
       | designed to operate on a stream of decimal encoded bytes. (See
       | decimals_fast for using od to encode stdin to decimals, and
       | decimals_slow for using dd for the same.)
       | 
       | It looks like pnut uses `read -r` for reading input. In addition
       | to NULs and related raw byte issues, I was worried about chunking
       | issues (e.g. truncation or errors) on binary data, e.g. no
       | newlines within LINE_BUF bytes. Have you tested binary I/O much?
       | Relatedly, how many different shell implementations have you
       | tested your core scheme with? In addition to bash, dash, and
       | various incarnations of /bin/sh on the BSDs, I also tested
       | base64.sh with Solaris' system shells (ksh88 and ksh93
       | derivatives), as well as AIX's (ksh88 derivative). AIX had some
       | odd quirks with pipelines even with plain text I/O.
       | (Unfortunately Polar Home is gone, now, so I have no easy way to
       | play with AIX; maybe that's for the better.)
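A minimal sketch of the od-based approach mentioned above, turning stdin into a stream of decimal byte values so that NUL and other control bytes never need to be stored in a shell variable. The function name is hypothetical and this is not the code from base64.sh.

```shell
#!/bin/sh
# bytes_to_decimals: read raw bytes on stdin, print one decimal value per line.
bytes_to_decimals() {
  # -An: no address column, -v: do not collapse repeated lines,
  # -tu1: unsigned decimal, one byte per value.
  od -An -v -tu1 | while read -r line; do
    for byte in $line; do   # intentionally unquoted: split on od's padding
      echo "$byte"
    done
  done
}

# Three bytes: 'A', NUL, 0xFF.
printf 'A\000\377' | bytes_to_decimals
```

Because the bytes arrive as decimal text, a line-oriented `read -r` is safe here even for binary input.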
        
         | laurenth wrote:
         | One of the examples we include is a base64 encoder/decoder:
         | https://github.com/udem-dlteam/pnut/blob/main/examples/compiled/base64.sh
         | 
         | It doesn't support NULs as you pointed out, but it's
         | interesting to see similarities between your implementation and
         | the one generated by Pnut.
         | 
         | Because we use `read -r`, we haven't tested reading binary
         | files. Fortunately, the shell's `printf` function can emit all
         | 256 characters so Pnut can at least output binary files. This
         | makes it possible for Pnut to have an x86 backend for the use of
         | reproducible builds.
         | 
         | Regarding the use of `read`, one constraint we set ourselves
         | when writing Pnut is to not use any external utilities,
         | including those that are specified by the POSIX standard (other
         | than `read` and `printf`). This maximizes portability of the
         | code generated by Pnut and is enough for the reproducible build
         | use case.
         | 
         | We're still looking for ways to integrate existing shell code
         | with C. One way this can be done is through the use of the
         | `#include_shell` directive which includes existing shell code
         | in the generated shell script. This makes it possible to call
         | the necessary utilities to read raw bytes without having Pnut
         | itself depend on less portable utilities.
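The binary-output side can be sketched with `printf` octal escapes. This is an illustration under the assumption that a byte is passed as a decimal value from 0 to 255; `put_byte` is a hypothetical name, not necessarily what Pnut's runtime uses.

```shell
#!/bin/sh
# put_byte N: write the single byte with decimal value N (0-255) to stdout.
put_byte() {
  # Build a three-digit octal escape such as \101 and let printf emit it.
  printf "\\$(( $1 / 64 ))$(( $1 / 8 % 8 ))$(( $1 % 8 ))"
}

# Emit the bytes 72, 105, 10, i.e. "Hi" followed by a newline.
put_byte 72
put_byte 105
put_byte 10
```

Only arithmetic expansion and the `printf` builtin are involved, so no external utility or subshell is needed for output.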
        
           | teo_zero wrote:
           | Sorry, but since the very goal of base64 is to encode
           | "uncomfortable" bytes, saying that your example doesn't work
           | with uncomfortable bytes is like providing a fibonacci demo
           | that only works with arguments less than 3, or a clock that
           | only shows correct time twice a day.
           | 
           | I'd choose a different example to showcase pnut.
        
             | wahern wrote:
             | In the context of what it seems to be _primarily_
             | attempting to achieve, _assisting_ in the bootstrapping of
             | more complex environments directly or indirectly dependent
             | on C, I found the base64 example (more so the SHA-256
             | example in the same directory) quite interesting and
             | evidence of the sophistication of pnut notwithstanding the
             | limitations. And as was pointed out, it wouldn't be
             | difficult to hack in the ability to read binary data: just
             | swap in a replacement for the getchar routine, such as I've
             | done with od. In fact, that ease is one of the most
             | fascinating aspects of this project--they've built a
             | conceptually powerful execution model for the shell that
             | can be directly targeted when compiling C code, as opposed
             | to indirection through an intermediate VM (e.g. a P-code
             | interpreter in shell). It has its limitations, but those
             | can be addressed. Given the constraints, the foundation is
             | substantial and powerful even from a utilitarian
             | perspective.
             | 
             | When people discuss Turing completeness and related
             | concepts one of the unstated caveats is that neither the
             | concept itself, nor most solutions or environments,
             | meaningfully address the problem of I/O with the external
             | environment. pnut is kind of exceptional in this regard,
             | even with the limitations.
        
       | theamk wrote:
       | If you are wondering how it handles C-only functions.. it does
       | not.
       | 
       | open(..., O_RDWR | O_EXCL) -> runtime error, "echo "Unknow file
       | mode" ; exit 1"
       | 
       | lseek(fd, 1, SEEK_HOLE); -> invalid code (uses undefined _lseek)
       | 
       | socket(AF_UNIX, SOCK_STREAM, 0); -> same (uses undefined _socket)
       | 
       | looking closer at "cp" and "cat" examples, write() call does not
       | handle errors at all. Forget about partial writes, it does not
       | even return -1 on failures.
       | 
       | "Compiler you can Trust", indeed... maybe you can trust it to get
       | all the details wrong?
        
         | PhilipRoman wrote:
         | Implementation issues aside, while technically it should be
         | possible to seek a file descriptor from shell through a
         | suitable helper program in C, I believe none of the POSIX
         | utilities provide this facility
        
           | oguz-ismail wrote:
           | _head_ , _read_ , and _sed_ can be used for seeking forward
           | according to POSIX (see the INPUT FILES section here
           | <https://pubs.opengroup.org/onlinepubs/9799919799/utilities/V...>).
           | I doubt non-GNU implementations support it though.
        
             | Someone wrote:
             | If it's in POSIX, chances are the BSDs implement it, too.
             | 
             | I think seeking a specific number of bytes and then writing
             | data there will be a problem, though.
             | 
             | For seeking _n_ bytes, neither _read_ nor _sed_ will work; they
             | work with lines.
             | 
             |  _sed_ is the only one of those that can write, and POSIX
             | doesn't appear to have the _-i_ option for in-place editing
             | (https://pubs.opengroup.org/onlinepubs/9699919799/utilities/s...)
             | 
             | So, I think _head_ for seeking followed by _sed_ (or _ed_
             | or _vi_ , but _sed_ is the simpler tool, I think) for
             | replacing the first _n_ characters, redirecting to a temp
             | file and then doing a _mv_ is your only option.
             | 
             | Advantage will be that writes will be atomic; disadvantage
             | that it will be slow
        
             | rwmj wrote:
             | head was used for this purpose in the xz backdoor.
        
             | hun3 wrote:
             | I think dd might be more reliable. (Is dd POSIX?)
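dd is specified by POSIX, including the `skip`, `seek`, `count` and `conv=notrunc` operands. A minimal sketch of byte-level seeking with it follows; the helper names are hypothetical, and `bs=1` is chosen for simplicity rather than speed.

```shell
#!/bin/sh
# read_at FILE OFFSET COUNT: print COUNT bytes of FILE starting at byte OFFSET.
read_at() {
  dd if="$1" bs=1 skip="$2" count="$3" 2>/dev/null
}

# write_at FILE OFFSET: overwrite FILE in place, starting at byte OFFSET,
# with the bytes read from stdin. conv=notrunc keeps the rest of the file.
write_at() {
  dd of="$1" bs=1 seek="$2" conv=notrunc 2>/dev/null
}

# Hypothetical usage:
#   read_at data.bin 6 5
#   printf 'W' | write_at data.bin 6
```

Unlike the temp-file-and-mv approach, an in-place write like this is not atomic.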
        
         | x5a17ed wrote:
         | maybe access to libc functions can be achieved through
         | something like <https://github.com/taviso/ctypes.sh>. Although
         | that very specific implementation seems to require explicitly
         | bash and is not broadly POSIX Shell compatible as Pnut wants to
         | be.
        
         | Cloudef wrote:
         | There seems to be libc in the repo but many functions are TODO
         | https://github.com/udem-dlteam/pnut/tree/main/portable_libc
         | 
         | Otherwise the builtins seem to be here
         | https://github.com/udem-dlteam/pnut/blob/main/runtime.sh
         | 
         | FYI all your functions are not "C functions", but rather POSIX
         | functions. I did not expect it to be complete, but it's still
         | impressive for what it is.
        
           | westurner wrote:
           | There are Linux ports of the plan9 `syscall` binary, which is
           | presumably necessary to implement parts of libc with shell
           | scripts:
           | https://stackoverflow.com/questions/10196395/os-system-calls...
           | 
           | I don't remember there being a way to keep a server listening
           | on a /dev/tcp/$ip/$port port, for sockets from shell scripts
           | with shellcheck at least
        
       | kxndnenfn wrote:
       | This is quite interesting! Without having dug deeper into it,
       | seeing the human readable output I assume quite different
       | semantics from C?
       | 
       | The C to shell transpiler I'm aware of will output unreadable
       | code (elvm using 8cc with sh backend)
        
       | 1vuio0pswjnm7 wrote:
       | "Because Pnut can be distributed as a human-readable shell script
       | (`pnut.sh`), it can serve as the basis for a reproducible build
       | system. With a POSIX compliant shell, `pnut.sh` is sufficiently
       | powerful to compile itself and, with some effort,
       | [TCC](https://bellard.org/tcc/). Because TCC can be used to
       | bootstrap GCC, this makes it possible to bootstrap a fully
       | featured build toolchain from only human-readable source files
       | and a POSIX shell.
       | 
       | Because Pnut doesn't support certain C features used in TCC, Pnut
       | features a native code backend that supports a larger subset of
       | C99. We call this compiler `pnut-exe`, and it can be compiled
       | using `pnut.sh`. This makes it possible to compile `pnut-exe.c`
       | using `pnut.sh`, and then compile TCC, all from a POSIX shell."
       | 
       | Is there anywhere we can see a step-by-step demo of this process?
       | 
       | Curious if the authors tried NetBSD or OpenBSD, or using another
       | small C compiler, e.g., pcc.
       | 
       | Historically, tcc was problematic for NetBSD and its forks. Not
       | sure about today, but tcc is _still_ in NetBSD pkgsrc WIP which
       | suggests problems remain.
        
       | dsp_person wrote:
       | I use linux-vt-setcolors in my startup, which would be a bit more
       | convenient if it was a shell script instead of C, but it uses
       | ioctl.
       | 
       | Trying to compile with this tool fails with "comp_glo_decl:
       | unexpected declaration"
        
       | gojomybeloved wrote:
       | Love this!
        
       | teo_zero wrote:
       | Just to be clear, the input must be written in a subset of C,
       | because many constructs are not recognized, like unsigned types,
       | static variables, [] arrays, etc.
       | 
       | Is there a plan to remove such limitations?
        
         | blueflow wrote:
         | These are restrictions of the target language and there isn't
         | much pnut can do about this.
        
           | fulafel wrote:
           | Surely unsigned (aka modulo) arithmetic and arrays are
           | expressible in shell script?
           | 
           | edit: For reference, someone's take on building out better
           | bash-like array functionality in posix shell:
           | https://github.com/friendly-bits/POSIX-arrays (there's only
           | very rudimentary array support built-in to posix sh,
           | basically working with stuff in $@ using set -- arg1 arg2..)
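Modulo arithmetic can be sketched with a mask. This assumes the shell's arithmetic is wider than 32 bits with two's complement behaviour, which holds for common shells on 64-bit systems but is not guaranteed by POSIX. The function names and the `result` variable are hypothetical.

```shell
#!/bin/sh
# 32-bit unsigned arithmetic emulated by masking with 2^32 - 1.
u32_add() { result=$(( ($1 + $2) & 4294967295 )); }
u32_sub() { result=$(( ($1 - $2) & 4294967295 )); }
u32_mul() { result=$(( ($1 * $2) & 4294967295 )); }

u32_add 4294967295 1   # UINT32_MAX + 1 wraps around to 0
echo "$result"
u32_sub 0 1            # 0 - 1 wraps around to UINT32_MAX
echo "$result"
```

Comparison and division need more care than this, because the shell still treats the operands as signed values.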
        
       | voidUpdate wrote:
       | When I'm told that "I can trust" something that I feel like I had
       | no reason to distrust, it makes me feel even more suspicious of
       | it
        
         | leni536 wrote:
         | https://www.smbc-comics.com/comic/2008-09-15
        
         | throwaway2037 wrote:
         | Yeah, I cringed when I saw that too. It violates an important
         | rule of selling: Never tell the customer "Trust me".
        
         | tzot wrote:
         | Perhaps you're old enough to remember the Sledge's[1] motto:
         | "Trust me... I know what I'm doing." HHBS Perusing the pnut
         | site I did not understand either why this is software I can
         | trust.
         | 
         | [1] https://www.imdb.com/title/tt0090525/
        
         | Q-Q3 wrote:
         | Hi there! I believe the mention of "trust" is related to the
         | paper _Reflections on Trusting Trust_ by Ken Thompson
         | https://www.cs.cmu.edu/~rdriley/487/papers/Thompson_1984_Ref...
         | Though I do think the tagline used could definitely be improved
         | from a marketing standpoint.
        
       | osmsucks wrote:
       | I'm writing something similar, but it's based on its own
       | scripting language. The idea of transpiling C sounds appealing
       | but impractical: how do they plan to compile, say, things using
       | mmap, setjmp, pthreads, ...? It would be better to clearly
       | promise only a restricted subset of C.
        
       | vermon wrote:
       | If the end goal is portability for C, would Cosmopolitan Libc be
       | a better choice because it supports a lot more features and
       | probably runs faster?
        
         | Y_Y wrote:
         | I can't run cosmolibc on Android, for example. Then again this
         | converter is somewhat limited and didn't accept any of the
         | IOCCC code I gave it.
        
           | hnlmorg wrote:
           | > I can't run cosmolibc on Android, for example.
           | 
           | You can:
           | 
           | https://justine.lol/cosmo3/
           | 
           | > After nearly one year of development, I'm pleased to
           | announce our version 3.0 release of the Cosmopolitan library.
           | [...] we invented a new linker that lets you build fat
           | binaries which can run on these platforms: AMD ... ARM64
           | 
           | https://github.com/jart/cosmopolitan/releases/tag/3.5.3
           | 
           | > This release fixes Android support. You can now run LLMs on
           | your phone using Cosmopolitan software like llamafile. See
           | 78d3b86 for further details. Thank you @aj47 (techfren.net)
           | for bug reports and testing efforts.
        
             | Y_Y wrote:
             | Thanks for the link!
             | 
             | My comment was based on cloning master yesterday and trying
             | to build redbean but hitting what looks like
             | https://github.com/jart/cosmopolitan/issues/940
             | 
             | Indeed it looks like the commit you mentioned should have
             | fixed the issue with the pointer having too many bits for
             | the weird kernel used on android and some raspis. Fingers
             | crossed that release works.
             | 
             | edit:
             | 
             | Testing that release on Termux 118, stock Android 14 on a
             | moto g73 5G (XT2237-2):
             | 
             |     ~/cosmopolitan $ uname -a
             |     Linux localhost 5.10.205-android12-9-00027-g4d6c07fc6342-ab11525972 #1 SMP PREEMPT Mon Mar 4 18:49:33 UTC 2024 aarch64 Android
             |     ~/cosmopolitan $ /data/data/com.termux/files/home/cosmopolitan/build/bootstrap/cocmd
             |     ape error: /data/data/com.termux/files/home/cosmopolitan/build/bootstrap/cocmd: prog mmap failed w/ errno 12
        
             | actionfromafar wrote:
             | Can you run it on RISCV Android?!
        
               | hnlmorg wrote:
               | No, but Android on RISC-V isn't even considered stable.
               | So you'll be manually compiling a fair chunk of code to
               | get it running. Adding a few extra tools to your build
               | pipeline isn't going to be a deal breaker.
        
       | okaleniuk wrote:
       | I love things like these because they shake our perception of
       | normal loose. And who said our perception of normal doesn't
       | deserve a good shake?
       | 
       | A C to shell compiler might seem impractical, but you know what
       | is even more impractical? Having a separate language for a build
       | system. And yet, here we are. Using Shell, Make or CMake to build
       | a C program is only acceptable because it has always been so.
       | It's a "perceived normality" in the C world.
       | 
       | There is no good reason, however, why CMake isn't a C library. With
       | the build system being a library, we could write, read, and, most
       | importantly, debug build scripts just like any other part of the
       | buildable. We already have includeOS, why not includeMake?
        
         | bregma wrote:
         | Why would you need a screwdriver or a glass cutter if you
         | already have a hammer?
        
           | okaleniuk wrote:
           | With C, you have the whole toolbox and the toolbox factory.
        
             | defrost wrote:
             | Both the tweezers _and_ the bit flipping magnet .. and who
             | would want anything more?
        
               | thechao wrote:
               | Yeah -- C would be ok as a build system language if it
               | was easy to: invoke & manage subprocesses; build & manage
               | dynamic dependency graphs; and, easily work with file
               | metadata.
               | 
               | Or... work with me: Make does that, well enough.
        
             | evilotto wrote:
             | > So we stopped selling those [hammer factory] schematics
             | and started selling hammer-factory-building factories.
             | 
             | https://web.archive.org/web/20180722051250/http://discuss.j
             | o...
        
         | RHSeeger wrote:
         | > you know what is even more impractical? Having a separate
         | language for a build system
         | 
         | Why is it you think that?
        
         | jcelerier wrote:
         | > but you know what is even more impractical? Having a separate
         | language for a build system.
         | 
         | I disagree. For a very simple example it really makes life
         | easier to not have to care about quoting filenames in build
         | systems and just list a.c b.cpp etc., while you really want
         | strings to be quoted in normal programming languages. Build
         | systems that tried to be based on syntax of existing PLs (for
         | instance Meson, QBS) are a real PITA for me when compared to
         | CMake due to a lot of such affordances.
        
         | eichin wrote:
         | DSLs ("microlanguages", at the time) were a big idea in the
         | late 80s - by being expressive in ways closer to the problem,
         | they could leave out irrelevant things and the bugs they lead
         | to. (Do you really want to have to explicitly call malloc() in
         | your build tools? and does gdb really feel like the right tool
         | for debugging one?)
        
         | skinner927 wrote:
         | Have you tried Zig? Its build system is configured in the
         | language. It's actually a binary you build and run to build
         | your project. Obviously the standard library has facilities for
         | making building easy.
        
       | layer8 wrote:
       | Can you trust that it faithfully reproduces undefined behavior?
       | ;)
        
       | itvision wrote:
       | Instantly make your C code 200 times slower without any effort!
        
         | actionfromafar wrote:
         | I think it probably takes some effort; not all C programs will
         | compile on this thing.
        
         | chasil wrote:
         | It would actually be interesting to see how much faster dash is
         | than everything else.
        
           | throwaway2037 wrote:
           | Why is Dash frequently touted as so much faster than Bash?
           | What is different?
        
             | tzot wrote:
             | It is much simpler (and therefore less resource-hungry)
             | than bash.
        
             | chasil wrote:
             | On rhel9, this is a list of my installed shells. You might
             | notice that dash is smaller than ls (and the rest of the
             | shells).
             | 
             |     $ ll /bin/bash /bin/dash /bin/ksh93 /bin/ls /bin/mksh
             |     -rwxr-xr-x. 1 root root 1389064 May  1 00:59 /bin/bash
             |     -rwxr-xr-x. 1 root root  128608 May  9  2023 /bin/dash
             |     -rwxr-xr-x. 1 root root 1414912 Apr  9 07:26 /bin/ksh93
             |     -rwxr-xr-x. 1 root root  140920 Apr  8 08:20 /bin/ls
             |     -rwxr-xr-x. 1 root root  325208 Jan  9  2022 /bin/mksh
             | 
             |     $ rpm -qi dash | tail -4
             |     Description :
             |     DASH is a POSIX-compliant implementation of /bin/sh that aims to be as small as
             |     possible. It does this without sacrificing speed where possible. In fact, it is
             |     significantly faster than bash (the GNU Bourne-Again SHell) for most tasks.
        
           | laurenth wrote:
           | From our experience, ksh is generally faster, and dash sits
           | between ksh and bash. One reason is that dash stores
           | variables using a very small hash table with only 37
           | entries[0] meaning variable access quickly becomes linear as
           | memory usage grows. But even with that, dash is still
           | surprisingly fast -- when compiling `pnut.c` with `pnut.sh`,
           | dash comes in second place:
           | 
           |     ksh93: 31s
           |     dash:  1m06s
           |     bash:  1m19s
           |     zsh:   >15m
           | 
           | [0]: https://git.kernel.org/pub/scm/utils/dash/dash.git/tree/
           | src/...
           | 
           | EDIT: ksh93, not ksh
        
             | AdmiralAsshat wrote:
             | People still use KornShell?
        
               | chasil wrote:
               | All of Android is still based on a pdksh-derivative known
               | as mksh, which is an enormous install base.
               | 
               | http://www.mirbsd.org/mksh.htm
               | 
               | OpenBSD switched their default shell to their own pdksh-
               | derivative known as oksh.
               | 
               | https://github.com/ibara/oksh
               | 
               | There was an effort to (re)start ksh93 development, but
               | AT&T halted this effort. The bugfixes from the failed
               | effort have moved back into Korn's last release.
               | 
               | https://github.com/ksh93/ksh/tree/dev
        
             | cb321 wrote:
             | For me `dash` compiles in just a few seconds. If you link
             | to a 1-line problem (here, #define VTABSIZE 39), then why
             | not boost that to 79 or 113, say, re-compile the shell and
             | re-run your benchmark? Might lead to a change in upstream
             | that could benefit everyone.
        
               | chasil wrote:
               | Or rework the array so realloc() can expand its size?
        
               | cb321 wrote:
               | Yes.. Another fine idea, just more work than a 2
               | character edit. :-)
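
One rough way to probe the claim that variable access slows down as the
number of live variables grows is a micro-benchmark like the sketch below.
It is not taken from pnut; the function name `fill_and_sum` and the `_N`
variable naming are assumptions made for illustration. It creates N numbered
variables, reads each one back, and prints their sum.

```shell
# Hypothetical micro-benchmark (not from pnut): create N numbered
# variables _0 .. _(N-1), then read every one of them back.
fill_and_sum() {
  n=$1 i=0 total=0
  # Write phase: _i = i
  while [ "$i" -lt "$n" ]; do
    : $((_$i = i))
    i=$((i + 1))
  done
  # Read phase: total += _i
  i=0
  while [ "$i" -lt "$n" ]; do
    total=$((total + _$i))
    i=$((i + 1))
  done
  echo "$total"
}

fill_and_sum 1000   # sum of 0..999, prints 499500
```

Saved to a file (say `bench.sh`, a hypothetical name), it could be timed
under each shell with `time dash bench.sh`, `time bash bench.sh`, and so on,
raising N to see how each shell scales. Results will vary by machine and
shell version.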
        
       | atilaneves wrote:
       | I'm still figuring out why anyone would want to write a shell
       | script in C. That sounds like torture to me.
        
       | Retr0id wrote:
       | Can it do wrapping arithmetic?
       | 
       | The `sum` example doesn't seem to do wrapping, but signed int
       | overflow is technically UB so I guess they're fine not to.
       | 
       | Switching it to `unsigned int` gives me:
       | 
       | code.c:1:1 syntax error: unsupported type
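
For reference, 32-bit unsigned wraparound can be emulated in shell
arithmetic by masking after each operation. The sketch below is not how pnut
handles integers (per the error above, it rejects `unsigned int`); `add_u32`
is a hypothetical helper, and it assumes the shell evaluates arithmetic in
more than 32 bits, which is typical on 64-bit systems.

```shell
# Hypothetical helper (not part of pnut): add two values modulo 2^32.
# Assumes shell arithmetic is wider than 32 bits.
add_u32() {
  echo $(( ($1 + $2) & 4294967295 ))
}

add_u32 4294967295 1    # prints 0
add_u32 7 8             # prints 15
```

Masking every intermediate result this way costs an extra operation per
arithmetic step, which is one plausible reason a compiler targeting shell
might choose not to offer it.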
        
       | iod wrote:
       | I am sorry if this comes off as negative, but every example
       | provided on the site, when compiled and then fed into
       | ShellCheck[1], generates warnings about non-portable and ambiguous
       | problems with the script. What exactly are we supposed to trust?
       | 
       | [1] https://www.shellcheck.net
        
         | laurenth wrote:
         | It seems ShellCheck errs on the side of caution when checking
         | arithmetic expansions and some of its recommendations are not
         | relevant in the context they are given. For example, on
         | `cat.sh`, one of the lines that are marked in red is:
         | 
         |     In examples/compiled/cat.sh line 7:
         |     : $((_$__ALLOC = $2)) # Track object size
         |       ^-- SC1102 (error): Shells disambiguate $(( differently or not at all. For $(command substitution), add space after $( . For $((arithmetics)), fix parsing errors.
         |       ^-----------------^ SC2046 (warning): Quote this to prevent word splitting.
         |         ^--------------^ SC2205 (warning): (..) is a subshell. Did you mean [ .. ], a test expression?
         |                    ^-- SC2283 (error): Remove spaces around = to assign (or use [ ] to compare, or quote '=' if literal).
         |                      ^-- SC2086 (info): Double quote to prevent globbing and word splitting.
         | 
         | It seems to be parsing the arithmetic expansion as a command
         | substitution, which then causes the analyzer to produce errors
         | that aren't relevant. ShellCheck's own documentation[0] mentions
         | this in the exceptions section, and the code is generated such
         | that quoting and word splitting are not an issue (because
         | variables never contain whitespace or special characters).
         | 
         | It also warns about `let` being undefined in POSIX shell, but
         | `let` is defined in the shell script so it's a false positive
         | that's caused by the use of the `let` keyword specifically.
         | 
         | If you think there are other issues or ways to improve Pnut's
         | compatibility with ShellCheck, please let us know!
         | 
         | 0: https://www.shellcheck.net/wiki/SC1102
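
The flagged line is easier to read with the expansion order in mind:
`$__ALLOC` and `$2` are substituted first, and only then is the result
evaluated as arithmetic, so the assignment targets a numbered variable such
as `_5`. A minimal sketch of the same idiom, with hypothetical `store` and
`load` helpers that are not part of pnut's runtime:

```shell
# Hypothetical helpers (not pnut's runtime): treat variables _0, _1, ...
# as memory cells addressed by number.
store() { : $((_$1 = $2)); }   # memory[$1] = $2
load()  { echo $((_$1)); }     # print memory[$1]

__ALLOC=5
store "$__ALLOC" 12            # same shape as: : $((_$__ALLOC = $2))
load 5                         # prints 12
echo "$_5"                     # the cell is an ordinary variable; prints 12
```

Because both the address and the value are plain integers here, nothing in
the expansion can contain whitespace or glob characters, which is the
argument made above for leaving these expansions unquoted.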
        
       | metadat wrote:
       | Also see this related submission from May, 2024:
       | 
       |  _Amber: Programming language compiled to Bash_
       | https://news.ycombinator.com/item?id=40431835 (318 comments)
       | 
       | ---
       | 
       | Pnut doesn't seem to differentiate between `int` and `int*`
       | function parameters. That's weird, and doesn't come across as
       | trustworthy at all! Shouldn't the use of pointers be disallowed
       | instead?
       | 
       |     int test1(int a, int len) {
       |       return a;
       |     }
       | 
       |     int test2(int* a, int len) {
       |       return a;
       |     }
       | 
       | Both compile to the exact same thing:
       | 
       |     : $((len = a = 0))
       |     _test1() { let a $2; let len $3
       |       : $(($1 = a))
       |       endlet $1 len a
       |     }
       | 
       |     : $((len = a = 0))
       |     _test2() { let a $2; let len $3
       |       : $(($1 = a))
       |       endlet $1 len a
       |     }
       | 
       | The "runtime library" portion at the bottom of every script is
       | nigh unreadable.
       | 
       | Even still, it's a cool concept.
        
       | JoshTriplett wrote:
       | Several times I've found myself wishing for the reverse: a shell-
       | to-binary compiler or JIT.
        
       ___________________________________________________________________
       (page generated 2024-07-24 23:08 UTC)