[HN Gopher] Pnut: A C to POSIX shell compiler you can trust
___________________________________________________________________
Pnut: A C to POSIX shell compiler you can trust
Author : feeley
Score : 160 points
Date : 2024-07-24 00:22 UTC (22 hours ago)
(HTM) web link (pnut.sh)
(TXT) w3m dump (pnut.sh)
| o11c wrote:
| It's a bad sign when I immediately look at the screenshot and see
| quoting bugs.
| laurenth wrote:
| Author here,
|
| Because all shell variables in code generated by pnut are
| numbers, variables never contain whitespace or special
| characters and don't need to be quoted. We considered quoting
| all variable expansions as this is generally seen as best
| practice in shell programming, but thought it hurt readability
| and decided not to.
|
| If you think there are other issues, please let me know!
| taviso wrote:
| I think they're talking about the cp example, doesn't seem
| like it would handle filenames with spaces!
|
| Super neat project, btw!
| laurenth wrote:
| You're right, thanks for the bug report. It should now be
| fixed :)
| cozzyd wrote:
| Can finally port systemd to shell to quell the rebellion.
| carapace wrote:
| Damned if that isn't the funniest thing I've heard in a long
| time.
| akoboldfrying wrote:
| I was puzzled by the example C function containing pointers. Do I
| understand correctly that you implement pointers in shell by
| having a shell variable _0 for the first "byte" of "memory", a
| shell variable _1 for the second, etc.?
| laurenth wrote:
| Author here,
|
| That's correct! Unlike Bash and other modern shells, the POSIX
| standard doesn't include arrays or any other data structures.
| The way we found around this limitation is to use arithmetic
| expansion and indexed shell variables (those starting with
| `_`, as you noted) to get random memory access.
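Editorial sketch, not Pnut's actual runtime: the scheme described above (one shell variable per memory cell, addressed through arithmetic expansion) could look roughly like this. The helper names `store` and `load` and the address 100 are hypothetical, chosen only for illustration.

```shell
# Sketch: emulate flat "memory" with one shell variable per cell (_0, _1, ...).

# store ADDR VALUE  -> writes VALUE into the variable named _ADDR
store() { : $((_$1 = $2)); }

# load DEST ADDR    -> copies the cell _ADDR into the variable named DEST
load() { : $(($1 = _$2)); }

# "Pointer" arithmetic is ordinary integer arithmetic on the address.
p=100
store "$p" 72            # *p = 72
store "$((p + 1))" 105   # *(p + 1) = 105
load first "$p"
load second "$((p + 1))"
echo "$first $second"    # prints: 72 105
```

The shell expands `$1` and `$2` before evaluating the arithmetic, so `_$1 = $2` becomes an assignment such as `_100 = 72`.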
| thesnide wrote:
| I used almost the same idea, but with files in my
| https://github.com/steveschnepp/shlibs
| osmsucks wrote:
| Since I experimented with something similar in the past to
| mimic multidimensional arrays: depending on the
| implementation this can absolutely _kill_ performance. IIRC,
| Dash does a linear lookup of variable names, so when you
| create tons of variables each lookup starts taking longer and
| longer.
| n_plus_1_acc wrote:
| I hope you're not compiling C to sh for performance
| reasons.
| osmsucks wrote:
| It's not about performance, it's about viability. If the
| result is so slow that it's unusable, it doesn't matter
| how portable it ends up being.
| laurenth wrote:
| We haven't found this to be an issue for Pnut. One of the
| metrics we use for performance is how much time it takes
| to bootstrap Pnut, and dash takes around a minute which
| is about the time taken by bash. This is with Pnut
| allocating around 150KB of memory when compiling itself,
| showing that Dash can still be useful even when hundreds
| of KBs are allocated.
|
| One thing we did notice is that subshells can be a
| bottleneck when the environment is large, and so we
| avoided subshells as much as possible in the runtime
| library. Did you observe the same in your testing?
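Editorial sketch, not taken from Pnut's runtime library: one common way to avoid a subshell is to have the caller pass the name of a result variable instead of capturing output with `$(...)`. It uses the same `: $(($1 = ...))` pattern visible in the compiled output quoted elsewhere in this thread. The function names are hypothetical.

```shell
# Subshell path: in most shells, $(...) forks a child process to capture output.
square_subshell() { echo "$(($1 * $1))"; }

# Subshell-free path: the caller passes the NAME of the result variable,
# and the function assigns to it through arithmetic expansion.
square_into() { : $(($1 = $2 * $2)); }

a=$(square_subshell 7)   # goes through a command substitution
square_into b 7          # plain function call, no command substitution
echo "$a $b"             # prints: 49 49
```

Both calls compute the same value; only the second avoids the command substitution.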
| forrestthewoods wrote:
| Hrmmm. But why?
|
| Quite frankly I think Bash scripting is awful and frequently wish
| shell scripts were written in a real and debuggable language. For
| anything non-trivial that is.
|
| I feel like I'd rather write C and compile it with Cosmopolitan C
| to give me a cross-platform binary than this.
|
| Neat project. Definitely clever. But it's headed in the opposite
| direction from what I'd prefer...
| marcodiego wrote:
| Master Foo once said to a visiting programmer: "There is more
| Unix-nature in one line of shell script than there is in ten
| thousand lines of C."
|
| The programmer, who was very proud of his mastery of C, said:
| "How can this be? C is the language in which the very kernel of
| Unix is implemented!"
|
| Master Foo replied: "That is so. Nevertheless, there is more
| Unix-nature in one line of shell script than there is in ten
| thousand lines of C."
|
| The programmer grew distressed. "But through the C language we
| experience the enlightenment of the Patriarch Ritchie! We
| become as one with the operating system and the machine,
| reaping matchless performance!"
|
| Master Foo replied: "All that you say is true. But there is
| still more Unix-nature in one line of shell script than there
| is in ten thousand lines of C."
|
| The programmer scoffed at Master Foo and rose to depart. But
| Master Foo nodded to his student Nubi, who wrote a line of
| shell script on a nearby whiteboard, and said: "Master
| programmer, consider this pipeline. Implemented in pure C,
| would it not span ten thousand lines?"
|
| The programmer muttered through his beard, contemplating what
| Nubi had written. Finally he agreed that it was so.
|
| "And how many hours would you require to implement and debug
| that C program?" asked Nubi.
|
| "Many," admitted the visiting programmer. "But only a fool
| would spend the time to do that when so many more worthy tasks
| await him."
|
| "And who better understands the Unix-nature?" Master Foo asked.
| "Is it he who writes the ten thousand lines, or he who,
| perceiving the emptiness of the task, gains merit by not
| coding?"
|
| Upon hearing this, the programmer was enlightened.
| forrestthewoods wrote:
| And then the programmer had to debug a hundred line shell
| script and they realized it should have all been written in
| Python or Rust instead.
|
| Master Foo is shorthand for Fool.
| binary132 wrote:
| Shell is just one way. There's nothing that says we can't
| do better than shell, but what it's good at is saving
| programmer time when the need isn't there for more, and
| Rust is definitely not good at that.
| forrestthewoods wrote:
| My rule of thumb:
|
|     Shell:  <= 5 lines
|     Python: <= 500 lines
|     Rust:   > 500 lines
|
| Although to be honest I'd be perfectly happy if Shell was
| restricted to single line commands only.
|
| I've wasted a lot of time and energy deciphering
| undebuggable shell scripts that were written to "save
| programmer time". Not a fan.
| therein wrote:
| This rule of thumb is clearly too simplified, even as far
| as the definition goes.
|
| Sometimes you just want to execute 50 lines with little
| logic.
|
| Sometimes you just have some simple logic that needs to
| be repeated.
|
| Sometimes that logic is complicated, sometimes it is not.
| forrestthewoods wrote:
| Sometimes someone writes 50 lines of simple logic. And
| then sometimes someone else needs to figure out why it's
| not working. That person gets very cranky and wastes a
| lot of time when those "simple" 50 lines aren't
| debuggable.
|
| If shell scripting didn't exist I would be totally fine
| with that. There are _far_ more scripts that I wish were
| written in a real language than the other way around.
| eichin wrote:
| My rule (and the code review policy I impose) emphasizes
| complexity instead - a 50 line shell script is great if
| it doesn't use if or case. (It's not so much of a strict
| rule as "once you're nesting conditionals, or using any
| shell construct that really needs a comment to explain
| the shell and not your code, you should probably already
| have switched to python." This is in parallel with "error
| handling in this case is critical, do you _really_ think
| your bash is accurate enough?")
|
| I wasn't the strictest reviewer (most feared, sure, but
| not strictest) at least partly because my personal line
| for "oh that bit of shell is obvious" is _way_ too high.
| shric wrote:
| Master Foo long predates Python and Rust.
| forrestthewoods wrote:
| Masters live to be surpassed by their students. Just
| because something was best in class in the 80s doesn't
| mean it should still be used.
| thesnide wrote:
| Very true, but also student hubris is legendary. Which is
| perfectly fine, as we all know successful students.
|
| But let's not blind ourselves with survivorship bias. Not
| everything new and very bright will stand the test of
| time.
|
| So let's take everything with a grain of salt, and wait
| until time has chosen its champions. Which might not
| be the best technology, as we have learned.
| donatj wrote:
| I was going to cite this right after reading the parent
| comment. Was very glad to see you beat me to it!
| VitoVan wrote:
| http://catb.org/~esr/writings/unix-koans/ten-thousand.html
| wruza wrote:
| This koan shows the power of a one-liner, not shell scripting
| in general. Both Master Foo and Nubi would agree that a
| string/array manipulating function in bash isn't worth their
| time when python exists.
| luism6n wrote:
| I'm not the OP, but I think the goal is to make it cross
| architecture. Cross platform C compiler would give you cross OS
| compatibility, but chip architecture would still be fixed, I
| think.
|
| I.e., you can take your compiled.sh and run it on an obscure
| processor with an obscure OS; as long as it's POSIX, it should
| work...
| MobiusHorizons wrote:
| I believe the goal is to defeat the compiler trust thought
| exercise where a malicious compiler could replicate itself
| when being asked to compile the compiler. Since this produces
| human readable code instead of assembly, the idea is it
| allows bootstrapping a trusted compiler, since pnut.sh and
| any output shell executables are directly auditable.
|
| I suppose the trust moves to the shell executable then, but
| at least you could run the bootstrapping with multiple shells
| and expect identical output.
| laurenth wrote:
| That's the idea!
|
| As you point out, it moves the trust from the binary to the
| shell executable, but the shell is already a key piece of
| any build process and requires a minimum level of trust.
| The technique of bootstrapping on multiple shells and
| comparing the outputs is known as Double Diverse
| Compiling[0] and we think POSIX shell is particularly
| suited for this use case since it has so many
| implementations from different and likely independent
| sources.
|
| The age and stability of the POSIX shell standard also play
| in our favor. Old shell binaries should be able to bootstrap
| Pnut, and those binaries may be less likely to be
| compromised as the trusting trust attack was less known at
| that time, akin to low-background steel[1] that was made
| before nuclear bombs contaminated the atmosphere and steel
| produced after that time.
|
| 0: https://dwheeler.com/trusting-trust/
| 1: https://en.wikipedia.org/wiki/Low-background_steel
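Editorial sketch, not from Pnut or the thread: the comparison step described above (run the same script under several independent shells and check that the outputs match) could be approximated with a helper like the one below. The function name, its argument order, and the use of `mktemp -d` are assumptions made for illustration.

```shell
# Hypothetical helper: run SCRIPT on INPUT under each listed shell and
# succeed only if every installed shell produces byte-identical output.
# usage: same_output_across_shells SCRIPT INPUT SHELL...
same_output_across_shells() {
  script=$1 input=$2
  shift 2
  dir=$(mktemp -d) || return 2
  n=0 status=0
  for shell in "$@"; do
    # Skip shells that are not installed on this machine.
    command -v "$shell" >/dev/null 2>&1 || continue
    n=$((n + 1))
    "$shell" "$script" "$input" > "$dir/out.$n" || status=1
    # Compare every later output against the first one.
    if [ "$n" -gt 1 ] && ! cmp -s "$dir/out.1" "$dir/out.$n"; then
      echo "output differs under $shell" >&2
      status=1
    fi
  done
  rm -rf "$dir"
  # Report failure if no shell ran at all.
  [ "$n" -gt 0 ] && return "$status"
  return 2
}
```

A call such as `same_output_across_shells pnut.sh pnut.c dash bash ksh93` would then compare three shells, assuming `pnut.sh` takes the C file as its argument and writes the compiled script to standard output.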
| wahern wrote:
| I don't know about the specific motivations for this project,
| but if you're curious about why work like this might have
| serious real-world relevance beyond scratching an itch, idle
| exploration, or meeting a research paper quota, you can look to
| similar work and literature:
|
| GNU Mes: https://www.gnu.org/software/mes/
|
| Stage0: https://bootstrapping.miraheze.org/wiki/Stage0
|
| Ribbit (same authors): https://github.com/udem-dlteam/ribbit
|
| stage0-posix: https://github.com/oriansj/stage0-posix
|
| Bootstrappable Builds: https://bootstrappable.org/
|
| See also this LWN article about bootstrappable and reproducible
| builds: https://lwn.net/Articles/841797/ It contains a plethora
| of interesting links.
| oguz-ismail wrote:
| > Hrmmm. But why?
|
| because Bash goes brrrr
| andrewf wrote:
| Looking forward to the point where this can build autoconf. It's
| great that the generated ./configure script is portable but if I
| want to make substantial changes to the project I need to find a
| binary for my machine (and version differences can be quite
| substantial)
| akdev1l wrote:
| This is going further into the hell that is shell-generated
| scripts that culminated in the xz-utils attack.
|
| We would benefit from steering away from auto-generated
| scripts. Autoconf included.
| jcranmer wrote:
| > Looking forward to the point where this can build autoconf.
|
| Autoconf is a perl program that turns (heavily customized) m4
| files into shell scripts. How does a C compiler help there?
| andrewf wrote:
| > Autoconf is a perl program
|
| Oof, did not realize.
| rubicks wrote:
| I can't wait to see the shell equivalents for ptrace, setjmp, and
| dlopen.
| actionfromafar wrote:
| Do you _really_?
|
| Maybe then I can also interest you in an exception handler for
| DOS batch scripts:
|
| https://stackoverflow.com/a/55501133/193892
| wahern wrote:
| This is very cool, regardless of how serious it was intended to
| be taken. Before base-64 encoders/decoders became more common as
| preinstalled commands in the environments I found myself on, I
| wrote a base64 utility in mostly pure POSIX shell:
| https://25thandClement.com/~william/2023/base64.sh
|
| If this project had existed I might have opted to compile my
| C-based base-64 encoder and decoder routines, suitably tweaked
| for pnut's limitations.
|
| I say base64.sh is mostly pure not because it relies on shell
| extensions, but because the only non-builtins it depends on are
| od(1) or, alternatively, dd(1) to assist with binary I/O. And
| preferably od(1), as reading certain control characters, like
| NUL, into a shell variable is especially dubious. The encoder is
| designed to operate on a stream of decimal encoded bytes. (See
| decimals_fast for using od to encode stdin to decimals, and
| decimals_slow for using dd for the same.)
|
| It looks like pnut uses `read -r` for reading input. In addition
| to NULs and related raw byte issues, I was worried about chunking
| issues (e.g. truncation or errors) on binary data, e.g. no
| newlines within LINE_BUF bytes. Have you tested binary I/O much?
| Relatedly, how many different shell implementations have you
| tested your core scheme with? In addition to bash, dash, and
| various incarnations of /bin/sh on the BSDs, I also tested
| base64.sh with Solaris' system shells (ksh88 and ksh93
| derivatives), as well as AIX's (ksh88 derivative). AIX had some
| odd quirks with pipelines even with plain text I/O.
| (Unfortunately Polar Home is gone, now, so I have no easy way to
| play with AIX; maybe that's for the better.)
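Editorial sketch, not taken from base64.sh: the od-based approach described above turns raw input into a stream of decimal byte values, so NUL and other awkward bytes never have to be stored in a shell variable. It might look roughly like this. The function name `bytes_to_decimals` is hypothetical; the `od` options used (`-A n`, `-v`, `-t u1`) are specified by POSIX.

```shell
# Print the bytes on stdin as space-separated unsigned decimals on one line.
bytes_to_decimals() {
  # od emits padded columns over several lines; the unquoted expansion
  # lets the shell split on whitespace, and "$*" rejoins with single spaces.
  set -- $(od -A n -v -t u1)
  echo "$*"
}

printf 'A\000\n' | bytes_to_decimals   # prints: 65 0 10
```

The NUL byte survives because it only ever appears as the text `0` in od's output.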
| laurenth wrote:
| One of the examples we include is a base64 encoder/decoder:
| https://github.com/udem-dlteam/pnut/blob/main/examples/compiled/base64.sh
|
| It doesn't support NULs as you pointed out, but it's
| interesting to see similarities between your implementation and
| the one generated by Pnut.
|
| Because we use `read -r`, we haven't tested reading binary
| files. Fortunately, the shell's `printf` function can emit all
| 256 characters so Pnut can at least output binary files. This
| makes it possible for Pnut to have an x86 backend for the use of
| reproducible builds.
|
| Regarding the use of `read`, one constraint we set ourselves
| when writing Pnut is to not use any external utilities,
| including those that are specified by the POSIX standard (other
| than `read` and `printf`). This maximizes portability of the
| code generated by Pnut and is enough for the reproducible build
| use case.
|
| We're still looking for ways to integrate existing shell code
| with C. One way this can be done is through the use of the
| `#include_shell` directive which includes existing shell code
| in the generated shell script. This makes it possible to call
| the necessary utilities to read raw bytes without having Pnut
| itself depend on less portable utilities.
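Editorial sketch, not Pnut's actual runtime: emitting an arbitrary byte with nothing but `printf` can be done by formatting the value as octal and feeding it back as an escape in the format string. The function name `emit_byte` is hypothetical. For brevity the sketch uses a command substitution, which a runtime trying to avoid subshells would presumably replace with something else.

```shell
# Write one byte, given as a decimal value 0..255, to stdout.
emit_byte() {
  # The inner printf turns e.g. 80 into "120"; the outer one interprets "\120".
  printf "\\$(printf '%o' "$1")"
}

# Emits the bytes for "Pnut" followed by a newline.
emit_byte 80; emit_byte 110; emit_byte 117; emit_byte 116; emit_byte 10
```

Each byte is written by its own `printf` call, so an octal escape can never run into a following digit.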
| teo_zero wrote:
| Sorry, but since the very goal of base64 is to encode
| "uncomfortable" bytes, saying that your example doesn't work
| with uncomfortable bytes is like providing a fibonacci demo
| that only works with arguments less than 3, or a clock that
| only shows correct time twice a day.
|
| I'd choose a different example to showcase pnut.
| wahern wrote:
| In the context of what it seems to be _primarily_
| attempting to achieve, _assisting_ in the bootstrapping of
| more complex environments directly or indirectly dependent
| on C, I found the base64 example (more so the SHA-256
| example in the same directory) quite interesting and
| evidence of the sophistication of pnut notwithstanding the
| limitations. And as was pointed out, it wouldn't be
| difficult to hack in the ability to read binary data: just
| swap in a replacement for the getchar routine, such as I've
| done with od. In fact, that ease is one of the most
| fascinating aspects of this project--they've built a
| conceptually powerful execution model for the shell that
| can be directly targeted when compiling C code, as opposed
| to indirection through an intermediate VM (e.g. a P-code
| interpreter in shell). It has its limitations, but those
| can be addressed. Given the constraints, the foundation is
| substantial and powerful even from a utilitarian
| perspective.
|
| When people discuss Turing completeness and related
| concepts one of the unstated caveats is that neither the
| concept itself, nor most solutions or environments,
| meaningfully address the problem of I/O with the external
| environment. pnut is kind of exceptional in this regard,
| even with the limitations.
| theamk wrote:
| If you are wondering how it handles C-only functions.. it does
| not.
|
| open(..., O_RDWR | O_EXCL) -> runtime error, "echo "Unknow file
| mode" ; exit 1"
|
| lseek(fd, 1, SEEK_HOLE); -> invalid code (uses undefined _lseek)
|
| socket(AF_UNIX, SOCK_STREAM, 0); -> same (uses undefined _socket)
|
| looking closer at "cp" and "cat" examples, write() call does not
| handle errors at all. Forget about partial writes, it does not
| even return -1 on failures.
|
| "Compiler you can Trust", indeed... maybe you can trust it to get
| all the details wrong?
| PhilipRoman wrote:
| Implementation issues aside, while technically it should be
| possible to seek a file descriptor from shell through a
| suitable helper program in C, I believe none of the POSIX
| utilities provide this facility
| oguz-ismail wrote:
| _head_, _read_, and _sed_ can be used for seeking forward
| according to POSIX (see the INPUT FILES section here
| <https://pubs.opengroup.org/onlinepubs/9799919799/utilities/V...>).
| I doubt non-GNU implementations support it though.
| Someone wrote:
| If it's in POSIX, chances are the BSDs implement it, too.
|
| I think seeking a specific number of bytes and then writing
| data there will be a problem, though.
|
| For seeking _n_ bytes, neither _read_ nor _sed_ will work; they
| work with lines.
|
| _sed_ is the only one of those that can write, and POSIX
| doesn't appear to have the _-i_ option for in-place editing
| (https://pubs.opengroup.org/onlinepubs/9699919799/utilities/s...)
|
| So, I think _head_ for seeking followed by _sed_ (or _ed_
| or _vi_, but _sed_ is the simpler tool, I think) for
| replacing the first _n_ characters, redirecting to a temp
| file and then doing a _mv_ is your only option.
|
| The advantage is that writes will be atomic; the disadvantage
| is that it will be slow.
| rwmj wrote:
| head was used for this purpose in the xz backdoor.
| hun3 wrote:
| I think dd might be more reliable. (Is dd POSIX?)
| x5a17ed wrote:
| maybe access to libc functions can be achieved through
| something like <https://github.com/taviso/ctypes.sh>. Although
| that very specific implementation seems to require explicitly
| bash and is not broadly POSIX Shell compatible as Pnut wants to
| be.
| Cloudef wrote:
| There seems to be libc in the repo but many functions are TODO
| https://github.com/udem-dlteam/pnut/tree/main/portable_libc
|
| Otherwise the builtins seem to be here
| https://github.com/udem-dlteam/pnut/blob/main/runtime.sh
|
| FYI all your functions are not "C functions", but rather POSIX
| functions. I did not expect it to be complete, but it's still
| impressive for what it is.
| westurner wrote:
| There are Linux ports of the plan9 `syscall` binary, which is
| presumably necessary to implement parts of libc with shell
| scripts:
| https://stackoverflow.com/questions/10196395/os-system-calls...
|
| I don't remember there being a way to keep a server listening
| on a /dev/tcp/$ip/$port port, for sockets from shell scripts
| with shellcheck at least
| kxndnenfn wrote:
| This is quite interesting! Without having dug deeper into it,
| seeing the human readable output I assume quite different
| semantics from C?
|
| The C to shell transpiler I'm aware of will output unreadable
| code (elvm using 8cc with sh backend)
| 1vuio0pswjnm7 wrote:
| "Because Pnut can be distributed as a human-readable shell script
| (`pnut.sh`), it can serve as the basis for a reproducible build
| system. With a POSIX compliant shell, `pnut.sh` is sufficiently
| powerful to compile itself and, with some effort,
| [TCC](https://bellard.org/tcc/). Because TCC can be used to
| bootstrap GCC, this makes it possible to bootstrap a fully
| featured build toolchain from only human-readable source files
| and a POSIX shell.
|
| Because Pnut doesn't support certain C features used in TCC, Pnut
| features a native code backend that supports a larger subset of
| C99. We call this compiler `pnut-exe`, and it can be compiled
| using `pnut.sh`. This makes it possible to compile `pnut-exe.c`
| using `pnut.sh`, and then compile TCC, all from a POSIX shell."
|
| Is there anywhere we can see a step-by-step demo of this process?
|
| Curious if the authors tried NetBSD or OpenBSD, or using another
| small C compiler, e.g., pcc.
|
| Historically, tcc was problematic for NetBSD and its forks. Not
| sure about today, but tcc is _still_ in NetBSD pkgsrc WIP which
| suggests problems remain.
| dsp_person wrote:
| I use linux-vt-setcolors in my startup, which would be a bit more
| convenient if it was a shell script instead of C, but it uses
| ioctl.
|
| Trying to compile with this tool fails with "comp_glo_decl:
| unexpected declaration"
| gojomybeloved wrote:
| Love this!
| teo_zero wrote:
| Just to be clear, the input must be written in a subset of C,
| because many constructs are not recognized, like unsigned types,
| static variables, [] arrays, etc.
|
| Is there a plan to remove such limitations?
| blueflow wrote:
| These are restrictions of the target language and there isn't
| much pnut can do about this.
| fulafel wrote:
| Surely unsigned (aka modulo) arithmetic and arrays are
| expressible in shell script?
|
| edit: For reference, someone's take on building out better
| bash-like array functionality in posix shell:
| https://github.com/friendly-bits/POSIX-arrays (there's only
| very rudimentary array support built-in to posix sh,
| basically working with stuff in $@ using set -- arg1 arg2..)
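Editorial sketch, not something Pnut is stated to do: the point about modulo arithmetic can be illustrated by masking results to 32 bits. This assumes the shell's arithmetic is wider than 32 bits (POSIX only requires signed long, which is 64-bit on common 64-bit platforms). The function names are hypothetical.

```shell
# Unsigned 32-bit add and subtract with C-style wraparound.
# The result is stored in the variable named by $1, without a subshell.
u32_add() { : $(($1 = ($2 + $3) & 0xFFFFFFFF)); }
u32_sub() { : $(($1 = ($2 - $3) & 0xFFFFFFFF)); }

u32_add r 4294967295 1   # UINT_MAX + 1 wraps to 0
echo "$r"                # prints: 0
u32_sub r 0 1            # 0 - 1 wraps to UINT_MAX
echo "$r"                # prints: 4294967295
```

Multiplication would need more care, since the intermediate product of two 32-bit values can exceed what a signed 64-bit integer holds.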
| voidUpdate wrote:
| When I'm told that "I can trust" something that I feel like I had
| no reason to distrust, it makes me feel even more suspicious of
| it
| leni536 wrote:
| https://www.smbc-comics.com/comic/2008-09-15
| throwaway2037 wrote:
| Yeah, I cringed when I saw that too. It violates an important
| rule of selling: Never tell the customer "Trust me".
| tzot wrote:
| Perhaps you're old enough to remember the Sledge's[1] motto:
| "Trust me... I know what I'm doing." HHBS Perusing the pnut
| site I did not understand either why this is software I can
| trust.
|
| [1] https://www.imdb.com/title/tt0090525/
| Q-Q3 wrote:
| Hi there! I believe the mention of "trust" is related to the
| paper _Reflections on Trusting Trust_ by Ken Thompson
| https://www.cs.cmu.edu/~rdriley/487/papers/Thompson_1984_Ref...
| Though I do think the tagline used could definitely be improved
| from a marketing standpoint.
| osmsucks wrote:
| I'm writing something similar, but it's based on its own
| scripting language. The idea of transpiling C sounds appealing
| but impractical: how do they plan to compile, say, things using
| mmap, setjmp, pthreads, ...? It would be better to clearly
| promise only a restricted subset of C.
| vermon wrote:
| If the end goal is portability for C, would Cosmopolitan Libc be
| a better choice because it supports a lot more features and
| probably runs faster?
| Y_Y wrote:
| I can't run cosmolibc on Android, for example. Then again this
| converter is somewhat limited and didn't accept any of the
| IOCCC code I gave it.
| hnlmorg wrote:
| > I can't run cosmolibc on Android, for example.
|
| You can:
|
| https://justine.lol/cosmo3/
|
| > After nearly one year of development, I'm pleased to
| announce our version 3.0 release of the Cosmopolitan library.
| [...] we invented a new linker that lets you build fat
| binaries which can run on these platforms: AMD ... ARM64
|
| https://github.com/jart/cosmopolitan/releases/tag/3.5.3
|
| > This release fixes Android support. You can now run LLMs on
| your phone using Cosmopolitan software like llamafile. See
| 78d3b86 for further details. Thank you @aj47 (techfren.net)
| for bug reports and testing efforts.
| Y_Y wrote:
| Thanks for the link!
|
| My comment was based on cloning master yesterday and trying
| to build redbean but hitting what looks like
| https://github.com/jart/cosmopolitan/issues/940
|
| Indeed it looks like the commit you mentioned should have
| fixed the issue with the pointer having too many bits for
| the weird kernel used on android and some raspis. Fingers
| crossed that release works.
|
| edit:
|
| Testing that release on Termux 118, stock Android 14 on a
| moto g73 5G (XT2237-2):
|
|     ~/cosmopolitan $ uname -a
|     Linux localhost 5.10.205-android12-9-00027-g4d6c07fc6342-ab11525972 #1 SMP PREEMPT Mon Mar 4 18:49:33 UTC 2024 aarch64 Android
|     ~/cosmopolitan $ /data/data/com.termux/files/home/cosmopolitan/build/bootstrap/cocmd
|     ape error: /data/data/com.termux/files/home/cosmopolitan/build/bootstrap/cocmd: prog mmap failed w/ errno 12
| actionfromafar wrote:
| Can you run it on RISCV Android?!
| hnlmorg wrote:
| No, but Android on RISC-V isn't even considered stable.
| So you'll be manually compiling a fair chunk of code to
| get it running. Adding a few extra tools to your build
| pipeline isn't going to be a deal breaker.
| okaleniuk wrote:
| I love things like these because they shake our perception of
| normal loose. And who said our perception of normal doesn't
| deserve a good shake?
|
| A C to shell compiler might seem impractical, but you know what
| is even more impractical? Having a separate language for a build
| system. And yet, here we are. Using Shell, Make or CMake to build
| a C program is only acceptable because it has always been so.
| It's a "perceived normality" in the C world.
|
| There is no good reason, however, why CMake isn't a C library. With
| build system being a library, we could write, read, and, most
| importantly, debug build scripts just like any other part of the
| buildable. We already have includeOS, why not includeMake?
| bregma wrote:
| Why would you need a screwdriver or a glass cutter if you
| already have a hammer?
| okaleniuk wrote:
| With C, you have the whole toolbox and the toolbox factory.
| defrost wrote:
| Both the tweezers _and_ the bit flipping magnet .. and who
| would want anything more?
| thechao wrote:
| Yeah -- C would be ok as a build system language if it
| was easy to: invoke & manage subprocesses; build & manage
| dynamic dependency graphs; and, easily work with file
| metadata.
|
| Or... work with me: Make does that, well enough.
| evilotto wrote:
| > So we stopped selling those [hammer factory] schematics
| and started selling hammer-factory-building factories.
|
| https://web.archive.org/web/20180722051250/http://discuss.jo...
| RHSeeger wrote:
| > you know what is even more impractical? Having a separate
| language for a build system
|
| Why is it you think that?
| jcelerier wrote:
| > but you know what is even more impractical? Having a separate
| language for a build system.
|
| I disagree. For a very simple example it really makes life
| easier to not have to care about quoting filenames in build
| systems and just list a.c b.cpp etc., while you really want
| strings to be quoted in normal programming languages. Build
| systems that tried to be based on syntax of existing PLs (for
| instance Meson, QBS) are a real PITA for me when compared to
| CMake due to a lot of such affordances.
| eichin wrote:
| DSLs ("microlanguages", at the time) were a big idea in the
| late 80s - by being expressive in ways closer to the problem,
| they could leave out irrelevant things and the bugs they lead
| to. (Do you really want to have to explicitly call malloc() in
| your build tools? and does gdb really feel like the right tool
| for debugging one?)
| skinner927 wrote:
| Have you tried Zig? Its build system is configured in the
| language. It's actually a binary you build and run to build
| your project. Obviously the standard library has facilities for
| making building easy.
| layer8 wrote:
| Can you trust that it faithfully reproduces undefined behavior?
| ;)
| itvision wrote:
| Instantly make your C code 200 times slower without any effort!
| actionfromafar wrote:
| I think it probably takes some effort; not all C programs will
| compile on this thing.
| chasil wrote:
| It would actually be interesting to see how much faster dash is
| than everything else.
| throwaway2037 wrote:
| Why is Dash frequently touted as so much faster than Bash?
| What is different?
| tzot wrote:
| It is much simpler (and therefore less resource-hungry)
| than bash.
| chasil wrote:
| On rhel9, this is a list of my installed shells. You might
| notice that dash is smaller than ls (and the rest of the
| shells).
|
|     $ ll /bin/bash /bin/dash /bin/ksh93 /bin/ls /bin/mksh
|     -rwxr-xr-x. 1 root root 1389064 May  1 00:59 /bin/bash
|     -rwxr-xr-x. 1 root root  128608 May  9  2023 /bin/dash
|     -rwxr-xr-x. 1 root root 1414912 Apr  9 07:26 /bin/ksh93
|     -rwxr-xr-x. 1 root root  140920 Apr  8 08:20 /bin/ls
|     -rwxr-xr-x. 1 root root  325208 Jan  9  2022 /bin/mksh
|
|     $ rpm -qi dash | tail -4
|     Description :
|     DASH is a POSIX-compliant implementation of /bin/sh that aims to be
|     as small as possible. It does this without sacrificing speed where
|     possible. In fact, it is significantly faster than bash (the GNU
|     Bourne-Again SHell) for most tasks.
| laurenth wrote:
| From our experience, ksh is generally faster, and dash sits
| between ksh and bash. One reason is that dash stores
| variables using a very small hash table with only 37
| entries[0] meaning variable access quickly becomes linear as
| memory usage grows. But even with that, dash is still
| surprisingly fast -- when compiling `pnut.c` with `pnut.sh`,
| dash comes in second place:
|
|     ksh93: 31s
|     dash:  1m06s
|     bash:  1m19s
|     zsh:   >15m
|
| [0]: https://git.kernel.org/pub/scm/utils/dash/dash.git/tree/src/...
|
| EDIT: ksh93, not ksh
| AdmiralAsshat wrote:
| People still use KornShell?
| chasil wrote:
| All of Android is still based on a pdksh-derivative known
| as mksh, which is an enormous install base.
|
| http://www.mirbsd.org/mksh.htm
|
| OpenBSD switched their default shell to their own pdksh-
| derivative known as oksh.
|
| https://github.com/ibara/oksh
|
| There was an effort to (re)start ksh93 development, but
| AT&T halted this effort. The bugfixes from the failed
| effort have moved back into Korn's last release.
|
| https://github.com/ksh93/ksh/tree/dev
| cb321 wrote:
| For me `dash` compiles in just a few seconds. If you link
| to a 1-line problem (here, #define VTABSIZE 39), then why
| not boost that to 79 or 113, say, re-compile the shell and
| re-run your benchmark? Might lead to a change in upstream
| that could benefit everyone.
| chasil wrote:
| Or rework the array so realloc() can expand its size?
| cb321 wrote:
| Yes.. Another fine idea, just more work than a 2
| character edit. :-)
| atilaneves wrote:
| I'm still figuring out why anyone would want to write a shell
| script in C. That sounds like torture to me.
| Retr0id wrote:
| Can it do wrapping arithmetic?
|
| The `sum` example doesn't seem to do wrapping, but signed int
| overflow is technically UB so I guess they're fine not to.
|
| Switching it to `unsigned int` gives me:
|
| code.c:1:1 syntax error: unsupported type
| iod wrote:
| I am sorry if this comes off as negative, but every example
| provided on the site, when compiled and then fed into
| ShellCheck[1], generates warnings about non-portable and ambiguous
| problems with the script. What exactly are we supposed to trust?
|
| [1] https://www.shellcheck.net
| laurenth wrote:
| It seems ShellCheck errs on the side of caution when checking
| arithmetic expansions and some of its recommendations are not
| relevant in the context they are given. For example, on
| `cat.sh`, one of the lines that are marked in red is:
|
|     In examples/compiled/cat.sh line 7:
|     : $((_$__ALLOC = $2)) # Track object size
|       ^-- SC1102 (error): Shells disambiguate $(( differently or not at all. For $(command substitution), add space after $( . For $((arithmetics)), fix parsing errors.
|       ^-----------------^ SC2046 (warning): Quote this to prevent word splitting.
|         ^--------------^ SC2205 (warning): (..) is a subshell. Did you mean [ .. ], a test expression?
|                    ^-- SC2283 (error): Remove spaces around = to assign (or use [ ] to compare, or quote '=' if literal).
|                      ^-- SC2086 (info): Double quote to prevent globbing and word splitting.
|
| It seems to be parsing the arithmetic expansion as a command
| substitution, which then causes the analyzer to produce errors
| that aren't relevant. ShellCheck's own documentation[0] mentions
| this in the exceptions section, and the code is generated such
| that quoting and word splitting are not an issue (because
| variables never contain whitespace or special characters).
|
| It also warns about `let` being undefined in POSIX shell, but
| `let` is defined in the shell script so it's a false positive
| that's caused by the use of the `let` keyword specifically.
|
| If you think there are other issues or ways to improve Pnut's
| compatibility with Shellcheck, please let us know!
|
| 0: https://www.shellcheck.net/wiki/SC1102
| metadat wrote:
| Also see this related submission from May, 2024:
|
| _Amber: Programming language compiled to Bash_
| https://news.ycombinator.com/item?id=40431835 (318 comments)
|
| ---
|
| Pnut doesn't seem to differentiate between `int' and `int*'
| function parameters. That's weird, and doesn't come across as
| trustworthy at all! Shouldn't the use of pointers be disallowed
| instead?
|
|     int test1(int a, int len) {
|       return a;
|     }
|     int test2(int* a, int len) {
|       return a;
|     }
|
| Both compile to the exact same thing:
|
|     : $((len = a = 0))
|     _test1() { let a $2; let len $3
|       : $(($1 = a))
|       endlet $1 len a
|     }
|     : $((len = a = 0))
|     _test2() { let a $2; let len $3
|       : $(($1 = a))
|       endlet $1 len a
|     }
|
| The "runtime library" portion at the bottom of every script is
| nigh unreadable.
|
| Even still, it's a cool concept.
| JoshTriplett wrote:
| Several times I've found myself wishing for the reverse: a shell-
| to-binary compiler or JIT.
___________________________________________________________________
(page generated 2024-07-24 23:08 UTC)