[HN Gopher] POSIX.1-2024 is published
___________________________________________________________________
POSIX.1-2024 is published
Author : phoebos
Score : 139 points
Date : 2024-06-14 11:37 UTC (11 hours ago)
(HTM) web link (ieeexplore.ieee.org)
(TXT) w3m dump (ieeexplore.ieee.org)
| stephenr wrote:
| I wonder how long till
| https://pubs.opengroup.org/onlinepubs/9699919799/ has the new
| revision.
| a-french-anon wrote:
| Some goodies for POSIX sh programmers:
|
| * readlink/realpath
| (https://austingroupbugs.net/view.php?id=1457)
|
| * find -print0, xargs -0 and read -d
| (https://austingroupbugs.net/view.php?id=243)
|
| * find -iname (https://austingroupbugs.net/view.php?id=1031)
|
| * sed -E (https://austingroupbugs.net/view.php?id=528)
|
| * set -o pipefail (https://austingroupbugs.net/view.php?id=789)
| clausecker wrote:
| Also c17 -G to create shared objects, SIGWINCH and
| tcgetwinsize() to query the size of a terminal window, lots of
| new shell features, make is now somewhat useful, gettext() and
| associated commands are in, asprintf(), C17 support, strlcpy,
| strlcat, and many more all new and exciting features.
| a-french-anon wrote:
| asprintf is a pretty cool one.
| https://frippery.org/make/2024.html details some of the make
| changes.
| thechao wrote:
| asprintf is a nice toy, but it should really take a context
| and a "realloc" function pointer to be useful, in general.
| Here's hoping for 2040!
| tedunangst wrote:
| How many other posix functions take allocators?
| cperciva wrote:
| _c17 -G to create shared objects_
|
| Finally! It always seemed very strange to me that posix said
| that shared objects were a thing and provided a rtld API for
| using them, but never specified how to _create_ them.
| dwheeler wrote:
| I agree! You can blame me for some of those proposals :-), my
| thanks to the POSIX team for getting this out the door.
|
| The sed -E option makes it easy to _portably_ use extended
| regular expressions.
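|
| A minimal sketch of what sed -E buys you, assuming a sed that
| already accepts -E (GNU and BSD sed do). first_number is a
| hypothetical helper name used only for illustration:

```shell
# Print the first run of digits found in the argument.
# With -E the pattern is an extended regular expression, so the group
# and the + quantifier need no backslashes.
first_number() {
    printf '%s\n' "$1" | sed -E 's/^[^0-9]*([0-9]+).*$/\1/'
}

first_number 'austingroupbugs issue 528'
```

| As a basic regular expression the same pattern would need \( \)
| for the group and \{1,\} in place of +.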
|
| The find -print0, xargs -0, and read -d provide portable ways
| to securely process lists of files. They were already widely
| implemented, but now they're officially part of the spec and
| can be counted on being present in many other places.
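|
| A rough sketch of the NUL-delimited pattern, assuming find and
| xargs implementations that already support -print0 and -0. The
| helper names count_files and remove_backups are hypothetical:

```shell
# count_files: count regular files below a directory, even when names
# contain spaces or newlines.
count_files() {
    # -print0 ends every path with a NUL byte, the one byte that cannot
    # occur in a file name, so counting NULs counts paths.
    find "$1" -type f -print0 | tr -dc '\000' | wc -c | tr -d ' '
}

# remove_backups: delete every *.bak file below a directory.
# xargs -0 splits its input on NUL only, so no name is word-split.
remove_backups() {
    find "$1" -type f -name '*.bak' -print0 | xargs -0 rm -f --
}
```

| A newline-delimited pipeline would miscount or mangle a file whose
| name contains a newline; the NUL-delimited one does not.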
| stephenr wrote:
| Thanks, those will indeed be useful. Looks like `pipefail` is
| already in Dash https://salsa.debian.org/debian/dash/-/blame/
| debian/unstable...
| EuAndreh wrote:
| wow, I just realized you're the same dwheeler from the work
| on diverse double-compilation!
|
| Thanks for the improvements on POSIX, I've read many issues
| and discussions raised by you in the past couple of years.
|
| In fact, I think it was one of your comments on make(1)'s
| dynamic dependency graph that reassured me I had a correct
| grasp on its execution model!
| casey2 wrote:
| I for one am overjoyed that i can now type readlink instead of
| invoking a shell script. I know newbies will also be overjoyed
| now that they can just read a LLONG_MAX page manual containing
| the solution to all their problems
| jwilk wrote:
| Also $'...' strings:
| https://austingroupbugs.net/view.php?id=249
| carterschonwald wrote:
| Is there a changelog for these standards?
| diggan wrote:
| Not an official one, as far as I'm aware. Maybe the Wikipedia
| article is the best resource for that right now:
| https://en.wikipedia.org/wiki/POSIX#Versions (but doesn't seem
| to have info on POSIX.1-2024 yet)
| crote wrote:
| I keep getting surprised at how little the tech industry as a
| whole seems to care about documentation. Way too many authors
| just upload the new PDF to some website - of course overwriting
| the old one.
|
| As an implementer I'm often more interested in the _exact
| changes_ than in the current wording. My product is already
| supporting the old spec, what do I need to change to support
| the new one? A redlined version is more valuable than the full
| PDF. Bonus points if it actually comes with the reasoning
| behind it so I don't have to guess why some seemingly-arbitrary
| change was made.
|
| My dream documentation is a simple Markdown file (or similar)
| stored in a git repository. It allows me to see the current
| version, the old version, the diff, _and_ the commit messages
| can even store the reasoning.
| LegionMammal978 wrote:
| Though with a sufficiently-gnarly release history, even the
| Git repo might not tell the full story. I've recently been
| tracing the history of one particular library that's been
| around since the early 2000s, and hardly any of the versioned
| Git tags correspond exactly to the files in the released
| tarballs. Finding all the releases was quite tedious: the
| tarballs were published on multiple websites, some tarballs
| were updated in place (without changing the version number),
| one website (which held some versions exclusive to it)
| routinely deleted very old versions, and that website also no
| longer exists outside the Internet Archive. Overall, some of
| the releases have been totally lost to time, and the Git repo
| is of no help in reconstructing them.
| jwilk wrote:
| POSIX folks "have no plans to move to a public git repository
| for managing the development of the standard."
|
| https://lore.kernel.org/linux-
| man/04801FEA-3560-4BA5-93EF-76...
| mike_hock wrote:
| So, what's new?
| susam wrote:
| I am hoping this appears at
| https://pubs.opengroup.org/onlinepubs/9699919799/ soon. This is
| the link I use most often to go through the specification. In
| fact, I owe a lot of my shell scripting skills to this online
| resource.
|
| As a specific example, the seemingly simple matter of when the
| shell decides to split a string based on $IFS and when it does
| not was quite confusing to me until I went through the
| specification here:
| https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V...
|
| For example, if a="foo bar"
|
| then ls $a
|
| will split the value into two fields (thus two arguments to ls).
| Of course we should surround $a with double-quotes to avoid the
| field splitting. However the following is fine:
| case $a in
|
| No field splitting occurs here. However, to be kind to your code
| reviewer, you might want to double-quote this anyway for the sake
| of simplicity and consistency. Behaviour like this is specified
| in sections "Field Splitting" and "Case Conditional Construct" of
| the aforementioned link. Specification documents like this were
| formative in my journey toward learning to write shell scripts
| confidently.
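|
| The difference can be observed directly. This is a small sketch
| assuming a POSIX-style shell with the default $IFS; count_args is
| a hypothetical helper that reports how many arguments it received:

```shell
count_args() { echo "$#"; }

a="foo bar"

unquoted=$(count_args $a)     # field splitting applies: two arguments
quoted=$(count_args "$a")     # quoting suppresses it: one argument

# The word after "case" is not subject to field splitting, so the
# unquoted $a is still matched as the single string "foo bar".
case $a in
    "foo bar") matched=yes ;;
    *)         matched=no ;;
esac

echo "$unquoted $quoted $matched"
```

| Under those assumptions this should print "2 1 yes".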
| thebeardisred wrote:
| The note on the front page of the Austin Groups' website says
| as much with regards to publication:
|
| > June 14, 2024: IEEE Std 1003.1-2024 has been published by
| IEEE. The Open Group Base Specifications, Issue 8 has been
| published by The Open Group. At this stage only PDF is
| available. The HTML edition to follow soon.
|
| https://www.opengroup.org/austin/
| lordmauve wrote:
| > However the following is fine:
| >
| > case $a in
| >
| > No field splitting occurs here
|
| This kind of bullshit is how I made a career rewriting people's
| buggy shell scripts in Python
| kstrauser wrote:
| Wow, no kidding. I've been writing little shell scripts to do
| random things for literally decades and this is the first
| time I heard about it.
|
| I also rewrite my stuff in Python as soon as it becomes
| nontrivial.
| chubot wrote:
| I think many people are even more surprised by this:
| x=$a # not split! It means the same thing as x="$a"
|
| They were taught that you have to quote everything, which is a
| reasonable rule to follow, but it's not true.
|
| ---
|
| I never wrote about this on the Oils blog
| (https://www.oilshell.org/ ), but the post would be titled:
|
| _Shell Has Context Sensitive Evaluation_
|
| Basically the two contexts you should think of are:
|
| (1) EVAL WORD SEQUENCE
|
| This occurs in 2 places in POSIX shell:
|
|     ls $x$y                          # simple command is a sequence of words
|     for i in $x$y; do echo $i; done  # for loop
|
| And 1 place in bash:
|
|     a=( $x$y )                       # array literal
|
| In these cases, the shell "wants" a sequence of strings, not a
| single one. So it does splitting.
|
| ---
|
| (2) EVAL WORD TO STRING
|
| But there are many other contexts where the shell does not
| "want" a sequence of strings.
|
| It wants a SINGLE string. And conversely, it actually JOINS
| arrays of strings, rather than splitting.
|
| Usually "$@" is an array / sequence of strings, while $@ or $*
| is a string, roughly speaking.
|
| But the shell doesn't want sequences of strings in MANY cases,
| e.g.
|
|     a=$@                   # I only want 1 string here, so I JOIN rather than splitting
|     echo hi > "$@"         # redirect arg (not all shells agree though!)
|     case "$@" in ... esac  # as you point out
|
| So the bottom line is that variables aren't really strings OR
| arrays of strings. Whatever the shell wants, it converts it to.
|
| And shells also DISAGREE on the specifics of those rules. POSIX
| shell has the array "$@", but arrays in general are not in
| POSIX.
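|
| The two contexts can be sketched in plain POSIX sh. This assumes
| the default $IFS and uses "$*" for the join, since shells disagree
| on an unquoted $@ in assignments. demo_contexts and its variable
| names are hypothetical:

```shell
demo_contexts() {
    # Positional parameters set here are local to the function call.
    set -- "one two" three

    # Word-sequence context: the unquoted $1 is split on $IFS.
    words=0
    for w in $1; do words=$((words + 1)); done

    # Single-string context: an assignment does not split.
    copy=$1

    # "$*" joins all parameters with the first character of $IFS.
    joined="$*"
}

demo_contexts
echo "$words|$copy|$joined"
```

| With a default $IFS this should print "2|one two|one two three".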
|
| ---
|
| And even worse, think about this case:
|
|     local x=$a
|
| Does it behave like an assignment, which wants a single string?
|
| Or does it behave like a simple command, which wants a
| sequence?
|
| You can look at it both ways. The bottom line is that
| assignment builtins are special and they don't follow the
| normal rules of simple commands. Shells have differed, but
| POSIX decided on this awhile ago.
|
| ---
|
| This is all of course mind numbing trivia that has no real
| reason for existing ... YSH fixes it, and it's now pure native
| C++, no more Python.
|
| _YSH Doesn't Require Quoting Everywhere_ -
| https://www.oilshell.org/blog/2021/04/simple-word-eval.html
| (Oil was renamed to YSH since this blog post was written)
|
| _Simple Word Evaluation in Unix Shell_ -
| https://www.oilshell.org/release/latest/doc/simple-word-eval...
|
| In YSH you can tell just by looking whether it's a single string
| or an array.
|
|     ls $a        # identical to ls "$a"
|     ls @myarray  # splice an array
|
| It never "molests" your variables. There's no auto-conversion,
| and you can upgrade to those rules with
|
|     shopt --set ysh:upgrade
| parasense wrote:
| Sometimes when I'm being lazy, say for example I have a
| variable with some folder/file absolute path... I'll convert
| the slashes to spaces, and word split there, sending the
| results into the stack.
|
|     $ readarray -d '/' <<<${PWD:1:-1}
|     $ echo ${MAPFILE[@]}
|
| and you get a nice list of folders you can push/pop as you
| wish...
| unnah wrote:
| That's also one of the more notable incompatibilities of zsh
| - by default it treats $a the same as "$a", and you're
| supposed to use arrays if you want multiple words. Although
| I'm not an expert - maybe ysh and zsh differ in the details
| here?
| chubot wrote:
| Yup, in terms of ls $a -- YSH happens to be like zsh. OSH
| is compatible with bash and does the POSIX word splitting,
| but YSH is not.
|
| In general YSH is pretty different than zsh though -- it's
| more of a Python/JS-like language with structured data,
| e.g.
|
|     ysh$ var a = ['list', 'of', 'strings']
|     ysh$ write -- @a
|     list
|     of
|     strings
|
| In zsh I still think that's the pretty obscure "${a[@]}"
| rather than @a.
|
| Arrays are also "flat" in zsh -- you can't have an array of
| arrays, because there's no garbage collector. But YSH has
| arbitrarily nested JSON-like data structures, and JSON
| serialization built in.
|
| I need to put some code examples on the home page, but for now -
| https://www.oilshell.org/release/latest/doc/ysh-tour.html
| arp242 wrote:
| You can use $=a to get word splitting in zsh.
|
| While incompatible, the zsh behaviour makes a lot more
| sense.
|
| You can use "setopt sh_word_split" to get the POSIX
| behaviour.
|
| Or split explicitly with ${(s: :)a}, ${(s:SPLIT-ON-THIS:)a},
| etc. (not compatible with anything but zsh).
| thomashabets2 wrote:
| I make a habit to always quote strings with "${a?}". That way a
| typo variable won't blindly go ahead and do the wrong thing.
| matrss wrote:
| I can recommend starting every bash script with
| set -euxo pipefail
|
| The "u" has basically the same effect as the question mark,
| but for every variable usage.
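|
| A small sketch comparing the two guards. cfg is a deliberately
| unset, hypothetical variable, and each probe runs in a subshell so
| that a failed expansion ends only that subshell:

```shell
# Plain expansion of an unset variable silently yields an empty string.
if (set +u; unset cfg; : "$cfg") 2>/dev/null; then plain=ok; else plain=failed; fi

# ${cfg?} turns the unset variable into an error for this one expansion.
if (set +u; unset cfg; : "${cfg?}") 2>/dev/null; then guarded=ok; else guarded=failed; fi

# set -u applies the same check to every expansion in the shell.
if (set -u; unset cfg; : "$cfg") 2>/dev/null; then nounset=ok; else nounset=failed; fi

echo "plain=$plain guarded=$guarded nounset=$nounset"
```

| Only the unguarded expansion succeeds; both guards stop the subshell
| before the command runs.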
| ko1nksm wrote:
| set -o pipefail is now POSIX compliant. Use it in all POSIX
| shell scripts, not just bash scripts.
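|
| What the option changes, as a hedged sketch: it assumes the option
| is off in the calling shell, and a shell that predates
| POSIX.1-2024 may reject it outright, which the subshell also
| contains. The variable names are hypothetical:

```shell
# By default a pipeline's exit status is that of its last command,
# so the failure of "false" is masked by "true".
if (false | true); then default_status=ok; else default_status=failed; fi

# With pipefail, any failing stage makes the whole pipeline fail.
# The subshell keeps the option from leaking into the rest of the
# script.
if (set -o pipefail; false | true) 2>/dev/null; then
    pipefail_status=ok
else
    pipefail_status=failed
fi

echo "default=$default_status pipefail=$pipefail_status"
```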
| matrss wrote:
| Oh wow, I didn't know that. I am pretty sure I had a sh
| trip up on it just recently, so I thought it still was
| bash only. Yes, use it everywhere possible.
| yjftsjthsd-h wrote:
| Emphasis on _now_ - it's new in this version, so
| probably expect a little bit of delay before it's
| actually available everywhere. But yes, excellent
| addition and I'm happy to have it available more broadly
| now.
| cesarb wrote:
| I do the same, except for the "x" (that is, "set -euo
| pipefail"); depending on what you're doing, "set -x" might
| be helpful, might be too much noise, or it could even break
| things which were not expecting the extra output (and in
| the worst case, it might end up echoing secret tokens into
| your build logs).
| matrss wrote:
| I find it to be a good default when writing a script, but
| yes, it can get noisy and potentially leak stuff, if that
| is what your script deals with. That hasn't been a
| concern in the settings I usually use it, though.
|
| I am not sure how it could break anything though, unless
| you are parsing stderr of your script in a subsequent
| step, which would seem unusual anyway.
| tommiegannert wrote:
| Oh, it seems set -u was fixed with regards to arrays in
| 2011 or so:
| https://stackoverflow.com/questions/7577052/unbound-
| variable...
|
| Maybe I can start using it again. (I think I noticed that
| issue while I was at Google, and they used an older version
| of Bash.)
| Maledictus wrote:
| I prefer set -Eeuxo pipefail, so my ERR trap is inherited
| in functions.
| dveeden2 wrote:
| Where is POSIX actually useful today?
|
| Is it mostly for shell scripts? Aren't people targeting bash or
| basic Bourne shell features instead of POSIX? Is shellcheck
| checking for best practices instead of POSIX compliance?
|
| And for other applications (GUI, servers, etc) strict POSIX
| compliance might be too restrictive?
|
| And with many things being Linux (or Linux-like like WSL) the
| need for this might be less?
|
| Are Android and/or iOS fully POSIX compliant?
|
| Any good blog or presentation describing the current state of
| POSIX?
| susam wrote:
| > Aren't people targeting bash or basic Bourne shell features
| instead of POSIX?
|
| I know many banks still have AIX systems with shells like
| ksh88, ksh93, etc. as the default shell. So if a shell script
| is written to work with a POSIX shell (instead of a particular
| shell), it has a better chance of running on such systems.
|
| Also, on Debian, the default non-interactive shell is dash [1].
| This is the Debian Almquist Shell (dash). It is a POSIX-
| compliant shell derived from ash. So again, if we write system
| scripts for Debian and want it to run on Debian without any
| hassle, it makes sense to write the system scripts to conform
| to POSIX shell. Although shellcheck cannot perform full POSIX
| compliance check at this time, it is still a pretty good tool
| that can help with checking compliance with dash in particular.
|
| [1]: https://packages.debian.org/stable/dash
| throw0101d wrote:
| > _So again, if we write system scripts for Debian and want
| it to run on Debian without any hassle, it makes sense to
| write the scripts to conform to POSIX shell._
|
| Or explicitly use _bash_ in your shebang.
|
| One of the problems with Bash is that it insists on doing
| _bash_ -y things even when you tell it to act like _sh_.
|
| People ask why you should write (or at least test) code to be
| multi-platform (even the basics of running it on BSD or
| macOS): it's because it forces you to be honest. Things
| change and initial assumptions may not be the same forever.
|
| * https://wiki.debian.org/Shell
|
| * https://archlinux.org/packages/?name=checkbashisms
|
| * https://wiki.ubuntu.com/DashAsBinSh
| lye wrote:
| Bash has `Priority: required` and is marked as "essential",
| it's available on every Debian system.
| susam wrote:
| Indeed, Bash is always available on Debian! After all,
| Debian uses Bash as the default interactive shell. But
| that's not the point of writing system scripts for dash.
| They are written for dash because that's still the default
| non-interactive shell. And it is so because dash is leaner
| and faster. Quoting from the link I posted in my previous
| comment:
|
| > Since it executes scripts faster than bash, and has fewer
| library dependencies (making it more robust against
| software or hardware failures), it is used as the default
| system shell on Debian systems.
| kstrauser wrote:
| I've wondered the same. The most common standard I've seen in
| everyday work for a long time now is "runs on my Mac and the
| Linux server we're deploying to".
|
| I'm not talking about shops that ship software that customers
| receive and install on prem on their HPUX or whatever. That's
| still a thing and people have to take that into account. I'm
| grateful I'm no longer among them.
| jjmarr wrote:
| The original discussion on why Debian switched from bash to
| dash for /bin/sh is insightful.
|
| https://lwn.net/Articles/343924/
|
| One big factor is for performance reasons in shell scripts. At
| the time, the switch decreased boot times for Debian by 7.5%.
| Bourne shell features add a lot of overhead and that's not
| always an acceptable tradeoff.
|
| Also, if you're using bash features in a script, you can always
| just add #!/bin/bash to the top of your file instead of
| #!/bin/sh to force a bash compatible shell.
| arp242 wrote:
| > One big factor is for performance reasons in shell scripts.
| At the time, the switch decreased boot times for Debian by
| 7.5%. Bourne shell features add a lot of overhead and that's
| not always an acceptable tradeoff.
|
| I seem to recall it was much smaller than that, something
| like 4% on a 2008 EEE-PC or something like that, but I can't
| find any numbers on that right now.
|
| The Debian startup scripts were already POSIX; it's not hard
| to get better performance out of zsh or bash by avoiding
| expensive process lookups.
|
| Overall, I consider this to be mostly a myth, or at least
| extremely simplistic.
| cesarb wrote:
| > One big factor is for performance reasons in shell scripts.
| At the time, the switch decreased boot times for Debian by
| 7.5%.
|
| And that should no longer be a relevant factor, since most of
| the boot process is now implemented directly in C (within
| systemd), instead of a bunch of shell scripts.
| cayley_graph wrote:
| Prefer `#!/usr/bin/env bash` instead, since /bin/bash isn't a
| standardized location for bash (even across Linux distros).
| The former causes $PATH to be searched for bash.
| n_plus_1_acc wrote:
| Is /usr/bin/env a standardized location?
| mappu wrote:
| This is pretty common advice but I think it is fighting the
| previous war. This idea is useful for virtualenv-type
| tricks if you want to ensure use of your personal version
| of the interpreter on a shared system, but you have to boil
| an ocean of scripts. You don't know if you caught them all.
| Docker won instead - a quick filesystem namespace
| comprehensively catches everything. Just use #!/bin/bash.
|
| EDIT: I was thinking about Linux, but I suppose macOS users
| are stuck with needing this for Homebrew-supplied bash?
| throw0101d wrote:
| > _Where is POSIX actually useful today?_
|
| Defining a stable API to code against?
|
| > _And with many things being Linux (or Linux-like like WSL)
| the need for this might be less?_
|
| Define "being Linux". RHEL? Ubuntu? Other? Is _/bin/sh_ linked
| to Bash or something?
|
| * https://mywiki.wooledge.org/Bashism
|
| * https://linux.die.net/man/1/checkbashisms
|
| > _Are Android and/or iOS fully POSIX compliant?_
|
| UNIX(r) Certified Products include macOS:
|
| * https://www.opengroup.org/openbrand/register/
|
| POSIX:
|
| * https://posix.opengroup.org/register.html
| chuckadams wrote:
| It's as good as any other standard: as a baseline of agreed-
| upon "correct" behavior for interoperability. Anything you add
| from there is gravy. It's more for implementers than users,
| e.g. for someone writing a new shell rather than writing shell
| scripts. Having the standard handy is also pretty useful when
| writing foreign interfaces, like the posix modules in python
| and perl.
|
| As for the current state of POSIX, well, you're looking at it.
| Might find a blog or two of someone on the POSIX committees,
| but the organizations aren't the kind that keep blogs. Probably
| best to just dive into the Wikipedia article on POSIX and start
| following the references on the bottom. You'll probably want to
| look into SUS, the Single Unix Specification, as well: it's
| identical to POSIX (plus curses for some reason) but it's the
| label that OS vendors may use rather than POSIX. macOS and some
| Linux distributions claim to be fully SUS-compliant; Linux as a
| whole does not, because its official scope is limited to the
| kernel which only implements a subset of POSIX.
|
| Fun fact: the name "POSIX" was coined by Richard Stallman.
| denvaar wrote:
| My view is that shell scripting feels like the wild-west, so I
| try to conform to POSIX to maintain some level of sanity,
| though it feels restrictive at times. I rely on ShellCheck to
| help me write shell scripts that are POSIX compliant.
| koolala wrote:
| wasm support?
| Jtsummers wrote:
| What would it even mean for an interface description to
| "support" WASM which is a (virtual) machine target for
| implementations?
| koolala wrote:
| webrtc, ws, postmessage?
| koolala wrote:
| ircv3 did it
| eqvinox wrote:
| With all due respect, you don't seem to understand what
| POSIX is.
| koolala wrote:
| I appreciate your respect. Can you contrast POSIX with
| IRC and explain why IRC can work on the web but POSIX
| can't?
|
| https://ircv3.net/specs/extensions/websocket
| Jtsummers wrote:
| Those aren't part of Wasm. WebRTC and WebSockets both
| predate Wasm by 6 years, and neither require Wasm to work.
| postMessage is part of Web Workers, also separate from
| Wasm.
|
| I'll ask again, what would it mean for POSIX to "support"
| Wasm?
| ykonstant wrote:
| Did we get `local`?
| ko1nksm wrote:
| No. https://www.austingroupbugs.net/view.php?id=767
| a-french-anon wrote:
| Sadly no. That's one of the few things that can't be done without,
| but semantics can subtly vary between implementations. Which is
| why I do a runtime check for them:
| https://git.sr.ht/~q3cpma/scripts/tree/b4b3c62f6a77828d0c445...
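|
| One way such a runtime check could look. This is a hypothetical
| sketch, not the code from the linked script; has_local only tests
| that "local" exists and keeps the outer variable intact:

```shell
has_local() {
    (
        probe=outer
        # If "local" is missing, the command fails and the subshell exits.
        f() { local probe=inner || exit 1; }
        f
        # A working "local" leaves the outer value untouched.
        [ "$probe" = outer ]
    ) 2>/dev/null
}

if has_local; then
    echo "local is usable"
else
    echo "local is missing or does not scope variables"
fi
```

| It does not probe the subtler differences between implementations,
| such as whether a local variable starts out unset or inherits the
| outer value.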
| tiffanyh wrote:
| The PDF is behind a login wall :(
| colinsane wrote:
| dead on arrival as far as i'm concerned.
|
| i guess the parts of the ecosystem low enough to care about
| things like POSIX compliance are mostly attached to some
| foundation or other, so maybe those foundations will purchase
| copies for their core maintainers? but that's a pretty counter-
| intuitive thing. i wonder if there are large closed-source
| POSIX implementors out there that this is aimed at, but are
| there really enough closed-source implementations out there for
| any of them to care about compatibility with _each other_?
| blueflow wrote:
| Leak when?
___________________________________________________________________
(page generated 2024-06-14 23:02 UTC)