[HN Gopher] POSIX.1-2024 is published
       ___________________________________________________________________
        
       POSIX.1-2024 is published
        
       Author : phoebos
       Score  : 139 points
       Date   : 2024-06-14 11:37 UTC (11 hours ago)
        
 (HTM) web link (ieeexplore.ieee.org)
 (TXT) w3m dump (ieeexplore.ieee.org)
        
       | stephenr wrote:
       | I wonder how long till
       | https://pubs.opengroup.org/onlinepubs/9699919799/ has the new
       | revision.
        
       | a-french-anon wrote:
       | Some goodies for POSIX sh programmers:
       | 
       | * readlink/realpath
       | (https://austingroupbugs.net/view.php?id=1457)
       | 
       | * find -print0, xargs -0 and read -d
       | (https://austingroupbugs.net/view.php?id=243)
       | 
       | * find -iname (https://austingroupbugs.net/view.php?id=1031)
       | 
       | * sed -E (https://austingroupbugs.net/view.php?id=528)
       | 
       | * set -o pipefail (https://austingroupbugs.net/view.php?id=789)
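To make the sed -E item above concrete, here is a minimal sketch comparing an extended regular expression with the equivalent basic one. The input string and variable names are made up for illustration, and it assumes the local sed already accepts -E (GNU and BSD sed do).

```shell
# Hypothetical input line, used only for this illustration.
version_line="release 12.4.1 (stable)"

# ERE via -E: groups and + need no backslashes.
ere_result=$(printf '%s\n' "$version_line" |
    sed -E 's/^release ([0-9]+)\.([0-9]+)\.([0-9]+).*/\1 \2 \3/')

# Same match as a BRE: \( \) for groups, [0-9][0-9]* instead of +.
bre_result=$(printf '%s\n' "$version_line" |
    sed 's/^release \([0-9][0-9]*\)\.\([0-9][0-9]*\)\.\([0-9][0-9]*\).*/\1 \2 \3/')

echo "ERE: $ere_result"
echo "BRE: $bre_result"
```

Both commands should extract the same three numbers; the ERE form is simply easier to read and to get right.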
        
         | clausecker wrote:
         | Also c17 -G to create shared objects, SIGWINCH and
         | tcgetwinsize() to query the size of a terminal window, lots of
         | new shell features, make is now somewhat useful, gettext() and
         | associated commands are in, asprintf(), C17 support, strlcpy,
         | strlcat, and many more all new and exciting features.
        
           | a-french-anon wrote:
           | asprintf is a pretty cool one.
           | https://frippery.org/make/2024.html details some of the make
           | changes.
        
           | thechao wrote:
           | asprintf is a nice toy, but it should really take a context
           | and a "realloc" function pointer to be useful, in general.
           | Here's hoping for 2040!
        
             | tedunangst wrote:
             | How many other posix functions take allocators?
        
           | cperciva wrote:
           | _c17 -G to create shared objects_
           | 
           | Finally! It always seemed very strange to me that posix said
           | that shared objects were a thing and provided a rtld API for
           | using them, but never specified how to _create_ them.
        
         | dwheeler wrote:
         | I agree! You can blame me for some of those proposals :-), my
         | thanks to the POSIX team for getting this out the door.
         | 
         | The sed -E option makes it easy to _portably_ use extended
         | regular expressions.
         | 
         | The find -print0, xargs -0, and read -d provide portable ways
         | to securely process lists of files. They were already widely
         | implemented, but now they're officially part of the spec and
         | can be counted on being present in many other places.
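A small sketch of why NUL-delimited file lists matter, assuming a find and xargs that already implement -print0 and -0. The file names are invented for the demonstration, and the example sticks to find and xargs because read -d support still varies between shells.

```shell
# Build a scratch directory with awkward (hypothetical) file names.
tmpdir=$(mktemp -d)
: > "$tmpdir/plain.txt"
: > "$tmpdir/with space.txt"
: > "$tmpdir/with
newline.txt"

# NUL-delimited: each file name arrives as exactly one argument.
safe_count=$(find "$tmpdir" -type f -print0 | xargs -0 sh -c 'echo "$#"' sh)

# Whitespace-delimited: xargs splits names on spaces and newlines,
# so the argument count no longer matches the number of files.
naive_count=$(find "$tmpdir" -type f | xargs sh -c 'echo "$#"' sh)

echo "files seen with -print0/-0: $safe_count"
echo "arguments seen without:     $naive_count"

rm -rf "$tmpdir"
```

Only the NUL-delimited pipeline reports one argument per file; the other one silently hands broken path fragments to the command.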
        
           | stephenr wrote:
           | Thanks, those will indeed be useful. Looks like `pipefail` is
           | already in Dash https://salsa.debian.org/debian/dash/-/blame/
           | debian/unstable...
        
           | EuAndreh wrote:
           | wow, I just realized you're the same dwheeler from the work
           | on diverse double-compilation!
           | 
           | Thanks for the improvements on POSIX, I've read many issues
           | and discussions raised by you in the past couple of years.
           | 
           | In fact, I think it was one of your comments on make(1)'s
           | dynamic dependency graph that reassured me I had a correct
           | grasp on its execution model!
        
         | casey2 wrote:
         | I for one am overjoyed that i can now type readlink instead of
         | invoking a shell script. I know newbies will also be overjoyed
         | now that they can just read a LLONG_MAX page manual containing
         | the solution to all their problems
        
         | jwilk wrote:
         | Also $'...' strings:
         | https://austingroupbugs.net/view.php?id=249
        
       | carterschonwald wrote:
       | Is there a changelog for these standards?
        
         | diggan wrote:
         | Not an official one, as far as I'm aware. Maybe the Wikipedia
         | article is the best resource for that right now:
         | https://en.wikipedia.org/wiki/POSIX#Versions (but doesn't seem
         | to have info on POSIX.1-2024 yet)
        
         | crote wrote:
         | I keep getting surprised at how little the tech industry as a
         | whole seems to care about documentation. Way too many authors
         | just upload the new PDF to some website - of course overwriting
         | the old one.
         | 
         | As an implementer I'm often more interested in the _exact
         | changes_ than in the current wording. My product is already
         | supporting the old spec, what do I need to change to support
         | the new one? A redlined version is more valuable than the full
         | PDF. Bonus points if it actually comes with the reasoning
         | behind it so I don't have to guess why some
         | seemingly-arbitrary change was made.
         | 
         | My dream documentation is a simple Markdown file (or similar)
         | stored in a git repository. It allows me to see the current
         | version, the old version, the diff, _and_ the commit messages
         | can even store the reasoning.
        
           | LegionMammal978 wrote:
           | Though with a sufficiently-gnarly release history, even the
           | Git repo might not tell the full story. I've recently been
           | tracing the history of one particular library that's been
           | around since the early 2000s, and hardly any of the versioned
           | Git tags correspond exactly to the files in the released
           | tarballs. Finding all the releases was quite tedious: the
           | tarballs were published on multiple websites, some tarballs
           | were updated in place (without changing the version number),
           | one website (which held some versions exclusive to it)
           | routinely deleted very old versions, and that website also no
           | longer exists outside the Internet Archive. Overall, some of
           | the releases have been totally lost to time, and the Git repo
           | is of no help in reconstructing them.
        
           | jwilk wrote:
           | POSIX folks "have no plans to move to a public git repository
           | for managing the development of the standard."
           | 
           | https://lore.kernel.org/linux-
           | man/04801FEA-3560-4BA5-93EF-76...
        
       | mike_hock wrote:
       | So, what's new?
        
       | susam wrote:
       | I am hoping this appears at
       | https://pubs.opengroup.org/onlinepubs/9699919799/ soon. This is
       | the link I use most often to go through the specification. In
       | fact, I owe a lot of my shell scripting skills to this online
       | resource.
       | 
       | As a specific example, the seemingly simple matter of when the
       | shell decides to split a string based on $IFS and when it does
       | not was quite confusing to me until I went through the
       | specification here:
       | https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V...
       | 
       | For example, if
       | 
       |     a="foo bar"
       | 
       | then
       | 
       |     ls $a
       | 
       | will split the value into two fields (thus two arguments to ls).
       | Of course we should surround $a with double-quotes to avoid the
       | field splitting. However the following is fine:
       | 
       |     case $a in
       | 
       | No field splitting occurs here. However, to be kind to your code
       | reviewer, you might want to double-quote this anyway for the sake
       | of simplicity and consistency. Behaviour like this is specified
       | in sections "Field Splitting" and "Case Conditional Construct" of
       | the aforementioned link. Specification documents like this were
       | formative in my journey toward learning to write shell scripts
       | confidently.
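The behaviour described above can be checked with a short script. This is a sketch assuming the default IFS; count_args is a helper invented for the illustration.

```shell
a="foo bar"

# Hypothetical helper: prints how many arguments it received.
count_args() { echo "$#"; }

unquoted_count=$(count_args $a)    # field splitting applies: two fields
quoted_count=$(count_args "$a")    # quoted: one field

# The word after "case" is not subject to field splitting,
# so the unquoted $a still matches the full string.
case $a in
    "foo bar") case_matched=yes ;;
    *)         case_matched=no ;;
esac

echo "unquoted: $unquoted_count, quoted: $quoted_count, case matched: $case_matched"
```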
        
         | thebeardisred wrote:
         | The note on the front page of the Austin Groups' website says
         | as much with regards to publication:
         | 
         | > June 14, 2024: IEEE Std 1003.1-2024 has been published by
         | IEEE. The Open Group Base Specifications, Issue 8 has been
         | published by The Open Group. At this stage only PDF is
         | available. The HTML edition to follow soon.
         | 
         | https://www.opengroup.org/austin/
        
         | lordmauve wrote:
         | > However the following is fine:
         | >
         | >     case $a in
         | >
         | > No field splitting occurs here
         | 
         | This kind of bullshit is how I made a career rewriting people's
         | buggy shell scripts in Python
        
           | kstrauser wrote:
           | Wow, no kidding. I've been writing little shell scripts to do
           | random things for literally decades and this is the first
           | time I heard about it.
           | 
           | I also rewrite my stuff in Python as soon as it becomes
           | nontrivial.
        
         | chubot wrote:
         | I think many people are even more surprised by this:
         | 
         |     x=$a  # not split!  It means the same thing as x="$a"
         | 
         | They were taught that you have to quote everything, which is a
         | reasonable rule to follow, but it's not true.
         | 
         | ---
         | 
         | I never wrote about this on the Oils blog
         | (https://www.oilshell.org/ ), but the post would be titled:
         | 
         |  _Shell Has Context Sensitive Evaluation_
         | 
         | Basically the two contexts you should think of are:
         | 
         | (1) EVAL WORD SEQUENCE
         | 
         | This occurs in 2 places in POSIX shell:
         | 
         |     ls $x$y   # simple command is a sequence of words
         |     for i in $x$y; do echo $i; done  # for loop
         | 
         | And 1 place in bash:
         | 
         |     a=( $x$y )  # array literal
         | 
         | In these cases, the shell "wants" a sequence of strings, not a
         | single one. So it does splitting.
         | 
         | ---
         | 
         | (2) EVAL WORD TO STRING
         | 
         | But there are many other contexts where the shell does not
         | "want" a sequence of strings.
         | 
         | It wants a SINGLE string. And conversely, it actually JOINS
         | arrays of strings, rather than splitting.
         | 
         | Usually "$@" is an array / sequence of strings, while $@ or $*
         | is a string, roughly speaking.
         | 
         | But the shell doesn't want sequences of strings in MANY cases,
         | e.g.
         | 
         |     a=$@  # I only want 1 string here, so I JOIN rather than splitting
         |     echo hi > "$@"   # redirect arg (not all shells agree though!)
         |     case "$@" in ... esac  # as you point out
         | 
         | So the bottom line is that variables aren't really strings OR
         | arrays of strings. Whatever the shell wants, it converts it to.
         | 
         | And shells also DISAGREE on the specifics of those rules. POSIX
         | shell has the array "$@", but arrays in general are not in
         | POSIX.
         | 
         | ---
         | 
         | And even worse, think about this case:
         | 
         |     local x=$a
         | 
         | Does it behave like an assignment, which wants a single string?
         | 
         | Or does it behave like a simple command, which wants a
         | sequence?
         | 
         | You can look at it both ways. The bottom line is that
         | assignment builtins are special and they don't follow the
         | normal rules of simple commands. Shells have differed, but
         | POSIX decided on this awhile ago.
         | 
         | ---
         | 
         | This is all of course mind numbing trivia that has no real
         | reason for existing ... YSH fixes it, and it's now pure native
         | C++, no more Python.
         | 
         |  _YSH Doesn't Require Quoting Everywhere_ -
         | https://www.oilshell.org/blog/2021/04/simple-word-eval.html
         | (Oil was renamed to YSH since this blog post was written)
         | 
         |  _Simple Word Evaluation in Unix Shell_ -
         | https://www.oilshell.org/release/latest/doc/simple-word-eval...
         | 
         | In YSH you can tell just by looking whether it's a single
         | string or an array.
         | 
         |     ls $a  # identical to ls "$a"
         |     ls @myarray  # splice an array
         | 
         | It never "molests" your variables. There's no auto-conversion,
         | and you can upgrade to those rules with
         | 
         |     shopt --set ysh:upgrade
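The two evaluation contexts from this comment can be observed in plain POSIX sh. A minimal sketch, assuming the default IFS; count_args is a helper made up for the demonstration, and $* is used for the join because shells agree on it more than on an unquoted $@ in assignments.

```shell
# Hypothetical helper: prints how many arguments it received.
count_args() { echo "$#"; }

a="foo bar"
x=$a                      # assignment wants ONE string: no splitting
split_count=$(count_args $a)     # simple command wants a word sequence

set -- one two three
joined=$*                 # assignment context: parameters are JOINED
at_count=$(count_args "$@")          # "$@" stays a sequence of words
joined_count=$(count_args "$joined") # the joined value is one word

echo "x=[$x] split_count=$split_count"
echo "joined=[$joined] at_count=$at_count joined_count=$joined_count"
```

The same variable yields one string or several words depending only on where it is expanded, which is the context sensitivity described above.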
        
           | parasense wrote:
           | Sometimes when I'm being lazy, say for example I have a
           | variable with some folder/file absolute path... I'll convert
           | the slashes to spaces, and word split there, sending the
           | results into the stack.                   $ readarray -d '/'
           | <<<${PWD:1:-1}         $ echo ${MAPFILE[@]}
           | 
           | and you get a nice list of folders you can push/pop as you
           | wish...
        
           | unnah wrote:
           | That's also one of the more notable incompatibilities of zsh
           | - by default it treats $a the same as "$a", and you're
           | supposed to use arrays if you want multiple words. Although
           | I'm not an expert - maybe ysh and zsh differ in the details
           | here?
        
             | chubot wrote:
             | Yup, in terms of ls $a -- YSH happens to be like zsh. OSH
             | is compatible with bash and does the POSIX word splitting,
             | but YSH is not.
             | 
             | In general YSH is pretty different than zsh though -- it's
             | more of a Python/JS-like language with structured data,
             | e.g.
             | 
             |     ysh$ var a = ['list', 'of', 'strings']
             |     ysh$ write -- @a
             |     list
             |     of
             |     strings
             | 
             | In zsh I still think that's the pretty obscure "${a[@]}"
             | rather than @a.
             | 
             | Arrays are also "flat" in zsh -- you can't have an array of
             | arrays, because there's no garbage collector. But YSH has
             | arbitrarily nested JSON-like data structures, and JSON
             | serialization built in.
             | 
             | I need to put some code examples on the home page, but for
             | now - https://www.oilshell.org/release/latest/doc/ysh-
             | tour.html
        
             | arp242 wrote:
             | You can use $=a to get word splitting in zsh.
             | 
             | While incompatible, the zsh behaviour makes a lot more
             | sense.
             | 
             | You can use "setopt sh_word_split" to get the POSIX
             | behaviour.
             | 
             | Or split explicitly with ${(s: :)a}, ${(s:SPLIT-ON-
             | THIS:)a}, etc. (not compatible with anything but zsh).
        
         | thomashabets2 wrote:
         | I make a habit to always quote strings with "${a?}". That way a
         | typo variable won't blindly go ahead and do the wrong thing.
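A sketch of what the "${a?}" habit buys you. The variable names are invented, with taget_dir as a deliberate typo of target_dir; the failing expansion runs in a subshell so the demonstration itself keeps going.

```shell
target_dir="/tmp/build"

# ${var?} makes a non-interactive shell exit when var is unset.
# Running it in a subshell lets us observe the failure safely.
if (: "${taget_dir?}") 2>/dev/null; then
    guarded_result="expanded silently"
else
    guarded_result="aborted"
fi

# Without the ?, the typo quietly expands to an empty string.
plain_result="[${taget_dir}]"

echo "with \${var?}: $guarded_result"
echo "without:      $plain_result"
```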
        
           | matrss wrote:
           | I can recommend starting every bash script with
           | set -euxo pipefail
           | 
           | The "u" has basically the same effect as the question mark,
           | but for every variable usage.
        
             | ko1nksm wrote:
             | set -o pipefail is now POSIX compliant. Use it in all POSIX
             | shell scripts, not just bash scripts.
        
               | matrss wrote:
               | Oh wow, I didn't know that. I am pretty sure I had a sh
               | trip up on it just recently, so I thought it still was
               | bash only. Yes, use it everywhere possible.
        
               | yjftsjthsd-h wrote:
               | Emphasis on _now_ - it's new in this version, so
               | probably expect a little bit of delay before it's
               | actually available everywhere. But yes, excellent
               | addition and I'm happy to have it available more broadly
               | now.
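Until the option is available everywhere, a script can probe for it instead of assuming it. This is one possible approach, not something the standard prescribes: try the option in a subshell first, so an older sh does not abort on an unknown option.

```shell
# Probe in a subshell: an sh without pipefail fails there, not here.
if (set -o pipefail) 2>/dev/null; then
    set -o pipefail
    have_pipefail=yes
else
    have_pipefail=no
fi

# With pipefail the failing "false" decides the pipeline status;
# without it, only the last command ("true") counts.
pipeline_status=0
false | true || pipeline_status=$?

echo "pipefail available: $have_pipefail, status of 'false | true': $pipeline_status"
```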
        
             | cesarb wrote:
             | I do the same, except for the "x" (that is, "set -euo
             | pipefail"); depending on what you're doing, "set -x" might
             | be helpful, might be too much noise, or it could even break
             | things which were not expecting the extra output (and in
             | the worst case, it might end up echoing secret tokens into
             | your build logs).
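To illustrate the leak risk, this sketch captures the trace that set -x writes to stderr. The token value is a made-up placeholder, not a real credential.

```shell
# Run a traced subshell and capture its stderr, where xtrace output goes.
trace=$(
    (
        set -x
        token="s3cr3t-example"   # hypothetical placeholder secret
        : "$token"
    ) 2>&1
)

# Check whether the placeholder value shows up in the trace text.
case $trace in
    *s3cr3t-example*) leaked=yes ;;
    *)                leaked=no ;;
esac

echo "secret visible in trace: $leaked"
```

Anything that collects stderr, such as a CI build log, would receive the same text.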
        
               | matrss wrote:
               | I find it to be a good default when writing a script, but
               | yes, it can get noisy and potentially leak stuff, if that
               | is what your script deals with. That hasn't been a
               | concern in the settings I usually use it, though.
               | 
               | I am not sure how it could break anything though, unless
               | you are parsing stderr of your script in a subsequent
               | step, which would seem unusual anyway.
        
             | tommiegannert wrote:
             | Oh, it seems set -u was fixed with regards to arrays in
             | 2011 or so:
             | https://stackoverflow.com/questions/7577052/unbound-
             | variable...
             | 
             | Maybe I can start using it again. (I think I noticed that
             | issue while I was at Google, and they used an older version
             | of Bash.)
        
             | Maledictus wrote:
             | I prefer set -Eeuxo pipefail, so my ERR trap is inherited
             | in functions.
        
       | dveeden2 wrote:
       | Where is POSIX actually useful today?
       | 
       | Is it mostly for shell scripts? Aren't people targeting bash or
       | basic Bourne shell features instead of POSIX? Is shellcheck
       | checking for best practices instead of POSIX compliance?
       | 
       | And for other applications (GUI, servers, etc) strict POSIX
       | compliance might be too restrictive?
       | 
       | And with many things being Linux (or Linux-like like WSL) the
       | need for this might be less?
       | 
       | Are Android and/or iOS fully POSIX compliant?
       | 
       | Any good blog or presentation describing the current state of
       | POSIX?
        
         | susam wrote:
         | > Aren't people targeting bash or basic Bourne shell features
         | instead of POSIX?
         | 
         | I know many banks still have AIX systems with shells like
         | ksh88, ksh93, etc. as the default shell. So if a shell script
         | is written to work with a POSIX shell (instead of a particular
         | shell), it has a better chance of running on such systems.
         | 
         | Also, on Debian, the default non-interactive shell is dash [1].
         | This is the Debian Almquist Shell (dash). It is a POSIX-
         | compliant shell derived from ash. So again, if we write system
         | scripts for Debian and want it to run on Debian without any
         | hassle, it makes sense to write the system scripts to conform
         | to POSIX shell. Although shellcheck cannot perform full POSIX
         | compliance check at this time, it is still a pretty good tool
         | that can help with checking compliance with dash in particular.
         | 
         | [1]: https://packages.debian.org/stable/dash
        
           | throw0101d wrote:
           | > _So again, if we write system scripts for Debian and want
           | it to run on Debian without any hassle, it makes sense to
           | write the scripts to conform to POSIX shell._
           | 
           | Or explicitly use _bash_ in your shebang.
           | 
           | One of the problems with Bash is that it insists on doing
           | _bash_ -y things even when you tell it to act like _sh_.
           | 
           | People ask why you should write (or at least test) code to be
           | multi-platform (even the basics of running it on BSD or
           | macOS): it's because it forces you to be honest. Things
           | change and initial assumptions may not be the same forever.
           | 
           | * https://wiki.debian.org/Shell
           | 
           | * https://archlinux.org/packages/?name=checkbashisms
           | 
           | * https://wiki.ubuntu.com/DashAsBinSh
        
           | lye wrote:
           | Bash has `Priority: required` and is marked as "essential",
           | it's available on every Debian system.
        
             | susam wrote:
             | Indeed, Bash is always available on Debian! After all,
             | Debian uses Bash as the default interactive shell. But
             | that's not the point of writing system scripts for dash.
             | They are written for dash because that's still the default
             | non-interactive shell. And it is so because dash is leaner
             | and faster. Quoting from the link I posted in my previous
             | comment:
             | 
             | > Since it executes scripts faster than bash, and has fewer
             | library dependencies (making it more robust against
             | software or hardware failures), it is used as the default
             | system shell on Debian systems.
        
         | kstrauser wrote:
         | I've wondered the same. The most common standard I've seen in
         | everyday work for a long time now is "runs on my Mac and the
         | Linux server we're deploying to".
         | 
         | I'm not talking about shops that ship software that customers
         | receive and install on prem on their HPUX or whatever. That's
         | still a thing and people have to take that into account. I'm
         | grateful I'm no longer among them.
        
         | jjmarr wrote:
         | The original discussion on why Debian switched from bash to
         | dash for /bin/sh is insightful.
         | 
         | https://lwn.net/Articles/343924/
         | 
         | One big factor is for performance reasons in shell scripts. At
         | the time, the switch decreased boot times for Debian by 7.5%.
         | Bourne shell features add a lot of overhead and that's not
         | always an acceptable tradeoff.
         | 
         | Also, if you're using bash features in a script, you can always
         | just add #!/bin/bash to the top of your file instead of
         | #!/bin/sh to force a bash compatible shell.
        
           | arp242 wrote:
           | > One big factor is for performance reasons in shell scripts.
           | At the time, the switch decreased boot times for Debian by
           | 7.5%. Bourne shell features add a lot of overhead and that's
           | not always an acceptable tradeoff.
           | 
           | I seem to recall it was much smaller than that, something
           | like 4% on a 2008 EEE-PC or something like that, but I can't
           | find any numbers on that right now.
           | 
           | The Debian startup scripts were already POSIX; it's not hard
           | to get better performance out of zsh or bash by avoiding
           | expensive process lookups.
           | 
           | Overall, I consider this to be mostly a myth, or at least
           | extremely simplistic.
        
           | cesarb wrote:
           | > One big factor is for performance reasons in shell scripts.
           | At the time, the switch decreased boot times for Debian by
           | 7.5%.
           | 
           | And that should no longer be a relevant factor, since most of
           | the boot process is now implemented directly in C (within
           | systemd), instead of a bunch of shell scripts.
        
           | cayley_graph wrote:
           | Prefer `#!/usr/bin/env bash` instead, since /bin/bash isn't a
           | standardized location for bash (even across Linux distros).
           | The former causes $PATH to be searched for bash.
        
             | n_plus_1_acc wrote:
             | Is /usr/bin/env a standardized location?
        
             | mappu wrote:
             | This is pretty common advice but I think it is fighting the
             | previous war. This idea is useful for virtualenv-type
             | tricks if you want to ensure use of your personal version
             | of the interpreter on a shared system, but you have to boil
             | an ocean of scripts. You don't know if you caught them all.
             | Docker won instead - a quick filesystem namespace
             | comprehensively catches everything. Just use #!/bin/bash.
             | 
             | EDIT: I was thinking about Linux, but I suppose macOS users
             | are stuck with needing this for Homebrew-supplied bash?
        
         | throw0101d wrote:
         | > _Where is POSIX actually useful today?_
         | 
         | Defining a stable API to code against?
         | 
         | > _And with many things being Linux (or Linux-like like WSL)
         | the need for this might be less?_
         | 
         | Define "being Linux". RHEL? Ubuntu? Other? Is _/bin/sh_ linked
         | to Bash or something?
         | 
         | * https://mywiki.wooledge.org/Bashism
         | 
         | * https://linux.die.net/man/1/checkbashisms
         | 
         | > _Are Android and/or iOS fully POSIX compliant?_
         | 
         | UNIX(r) Certified Products include macOS:
         | 
         | * https://www.opengroup.org/openbrand/register/
         | 
         | POSIX:
         | 
         | * https://posix.opengroup.org/register.html
        
         | chuckadams wrote:
         | It's as good as any other standard: as a baseline of agreed-
         | upon "correct" behavior for interoperability. Anything you add
         | from there is gravy. It's more for implementers than users,
         | e.g. for someone writing a new shell rather than writing shell
         | scripts. Having the standard handy is also pretty useful when
         | writing foreign interfaces, like the posix modules in python
         | and perl.
         | 
         | As for the current state of POSIX, well, you're looking at it.
         | Might find a blog or two of someone on the POSIX committees,
         | but the organizations aren't the kind that keep blogs. Probably
         | best to just dive into the Wikipedia article on POSIX and start
         | following the references on the bottom. You'll probably want to
         | look into SUS, the Single Unix Specification, as well: it's
         | identical to POSIX (plus curses for some reason) but it's the
         | label that OS vendors may use rather than POSIX. macOS and some
         | Linux distributions claim to be fully SUS-compliant; Linux as a
         | whole does not, because its official scope is limited to the
         | kernel which only implements a subset of POSIX.
         | 
         | Fun fact: the name "POSIX" was coined by Richard Stallman.
        
         | denvaar wrote:
         | My view is that shell scripting feels like the wild-west, so I
         | try to conform to POSIX to maintain some level of sanity,
         | though it feels restrictive at times. I rely on ShellCheck to
         | help me write shell scripts that are POSIX compliant.
        
       | koolala wrote:
       | wasm support?
        
         | Jtsummers wrote:
         | What would it even mean for an interface description to
         | "support" WASM which is a (virtual) machine target for
         | implementations?
        
           | koolala wrote:
           | webrtc, ws, postmessage?
        
             | koolala wrote:
             | ircv3 did it
        
               | eqvinox wrote:
               | With all due respect, you don't seem to understand what
               | POSIX is.
        
               | koolala wrote:
               | I appreciate your respect. Can you contrast POSIX with
               | IRC and explain why IRC can work on the web but POSIX
               | can't?
               | 
               | https://ircv3.net/specs/extensions/websocket
        
             | Jtsummers wrote:
             | Those aren't part of Wasm. WebRTC and WebSockets both
             | predate Wasm by 6 years, and neither requires Wasm to work.
             | postMessage is part of Web Workers, also separate from
             | Wasm.
             | 
             | I'll ask again, what would it mean for POSIX to "support"
             | Wasm?
        
       | ykonstant wrote:
       | Did we get `local`?
        
         | ko1nksm wrote:
         | No. https://www.austingroupbugs.net/view.php?id=767
        
         | a-french-anon wrote:
         | Sadly no. That's one of the few things that can't be done without,
         | but semantics can subtly vary between implementations. Which is
         | why I do a runtime check for them:
         | https://git.sr.ht/~q3cpma/scripts/tree/b4b3c62f6a77828d0c445...
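A runtime check for `local` could look roughly like the sketch below. This is not the code behind the truncated link above; has_local, probe and probe_var are names invented here, and the probe only tests the most basic scoping behaviour.

```shell
# Hypothetical probe: succeeds if "local" exists and actually scopes
# a variable to the function. Runs in a subshell so a missing
# "local" cannot disturb the calling script.
has_local() {
    (
        probe() {
            local probe_var=inner 2>/dev/null || exit 1
            [ "$probe_var" = inner ]
        }
        probe_var=outer
        probe && [ "$probe_var" = outer ]
    ) 2>/dev/null
}

if has_local; then
    local_supported=yes
else
    local_supported=no
fi

echo "local supported: $local_supported"
```

A script could use the result to pick a fallback, for example by prefixing variable names by hand when `local` is missing.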
        
       | tiffanyh wrote:
       | The PDF is behind a login wall :(
        
         | colinsane wrote:
         | dead on arrival as far as i'm concerned.
         | 
         | i guess the parts of the ecosystem low enough to care about
         | things like POSIX compliance are mostly attached to some
         | foundation or other, so maybe those foundations will purchase
         | copies for their core maintainers? but that's a pretty counter-
         | intuitive thing. i wonder if there are large closed-source
         | POSIX implementors out there that this is aimed at, but are
         | there really enough closed-source implementations out there for
         | any of them to care about compatibility with _each other_?
        
         | blueflow wrote:
         | Leak when?
        
       ___________________________________________________________________
       (page generated 2024-06-14 23:02 UTC)