[HN Gopher] Deep Down the Rabbit Hole: Bash, OverlayFS, and a 30...
___________________________________________________________________
Deep Down the Rabbit Hole: Bash, OverlayFS, and a 30-Year-Old
Surprise
Author : Deeg9rie9usi
Score : 54 points
Date : 2025-06-25 13:43 UTC (9 hours ago)
(HTM) web link (sigma-star.at)
(TXT) w3m dump (sigma-star.at)
| x0x0 wrote:
| Saving this to explain why software is hard.
|
| For a long time, inode numbers from readdir() had certain
| semantics. Supporting overlay filesystems required changing those
| semantics. Piles of software were written against the old
| semantics; and even some of the most common have not been
| upgraded.
| JdeBP wrote:
| The opposite, if anything. Very little was written against the
| old semantics, with most of the time the supplied C library
| providing what was needed, and so the code that _did_ rely upon
| old semantics barely got exercised. A little-used shim that had
| been broken wasn't noticed, in other words, until just the
| right combination of circumstances got the shim being used on a
| platform where it would break.
|
| What there _are_ piles of, are softwares that reinvent the C
| library, all too often in little bits of conditionally-compiled
| code that have either been reinvented or nicked from some old C
| library and sit unused in every platform that that application
| is nowadays ported to. Every time that I see a build log
| dutifully informing me that it has checked for <string.h> or
| some other thing that has been standard for 35 years I wonder
| (a) why that is thought to be necessary in 2025, and (b) what
| sort of shims would get used if the check ever failed.
| saurik wrote:
| FWIW, if you are cross-compiling, while you might get a vaguely
| usable result by ignoring all of the warnings and letting worst-
| common-denominator defaults get applied, you absolutely should be
| paying more attention and either manually providing autoconf the
| answers it needs or (if at all possible, as this is more general)
| make sure to tell it how to run a binary on the target system
| (maybe in an emulator or over ssh)... you shouldn't just be
| YOLOing a cross-compile like this and expecting it to work (not
| to say that this wasn't a good bug in the fallback to fix, just
| that the premise is awkward).
| iforgotpassword wrote:
| Like for example when compiling Linux (plus user space) from
| Windows XP using only the official Services for Unix package
| from Microsoft as a starting point.
| pogopop77 wrote:
| Interesting investigation, good read. Definitely illustrates how
| new paradigms (i.e. overlay filesystems) can subtly affect
| behaviors in ways that are complex to track down.
| akoboldfrying wrote:
| Remember, folks: It's not enough to check $WEARING_PANTS before
| stepping outside. You need to check !$PANTS_BROKEN && !$SOLARIS
| too.
| jwilk wrote:
| > Once the bug report becomes publicly visible, it will be linked
| here.
|
| Here it is:
| https://lists.gnu.org/archive/html/bug-bash/2025-06/msg00149...
| chubot wrote:
| Wow great bug!
|
| > Bash forgot to reset errno before the call. For about 30 years,
| no one noticed
|
| I have to say, this part of the POSIX API is maddening!
|
| 99% of the time, you don't need to set errno = 0 before making a
| call. You check for a non-zero return, and only then look at
| errno.
|
| But SOMETIMES you need to set errno = 0, because in this case
| readdir() returns NULL on both error and EOF.
|
| I actually didn't realize this before working on
| https://oils.pub/
|
| ---
|
| And it should go without saying: Oils simply uses libc - we don't
| need to support systems with a broken getcwd()!
|
| Although a funny thing is that I just fixed a bug related to $PWD
| that AT&T ksh (the original shell, that bash is based on) hasn't
| fixed for 30+ years too!
|
| (and I didn't realize it was still maintained)
|
| https://www.illumos.org/issues/17442
|
| https://github.com/oils-for-unix/oils/issues/2058
|
| There is a subtle issue with respect to:
|
| 1) "trusting" the $PWD value you inherit from another process
|
| 2) Respecting symlinks - this is the reason the shell can't just
| call getcwd()!
|
|     if (*p != '/' || stat(p, &st1) || stat(".", &st2) ||
|         st1.st_dev != st2.st_dev || st1.st_ino != st2.st_ino)
|       p = 0;
|
| Basically, the shell considers BOTH the inherited $PWD and the
| value of getcwd() to determine its $PWD. It can't just use one or
| the other!
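
A rough, self-contained sketch of that decision, assuming a POSIX system. The function names pwd_is_trustworthy and choose_pwd are hypothetical and are not taken from ksh, bash, or Oils; the logic simply expands the device/inode comparison quoted in the comment and falls back to getcwd() when the inherited value does not check out.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Return 1 if `candidate` is an absolute path naming the same directory
 * as ".", judged by device and inode number; 0 otherwise.
 * Hypothetical helper for illustration. */
int pwd_is_trustworthy(const char *candidate)
{
    struct stat st_candidate, st_dot;

    if (candidate == NULL || candidate[0] != '/')
        return 0;
    if (stat(candidate, &st_candidate) != 0 || stat(".", &st_dot) != 0)
        return 0;
    return st_candidate.st_dev == st_dot.st_dev &&
           st_candidate.st_ino == st_dot.st_ino;
}

/* Pick the working directory to report: keep the inherited $PWD (which
 * may contain symlinks the user navigated through) when it passes the
 * check, otherwise ask the C library for the physical path.
 * Writes into buf and returns it, or NULL on failure. */
char *choose_pwd(char *buf, size_t size)
{
    const char *inherited = getenv("PWD");

    if (pwd_is_trustworthy(inherited) && strlen(inherited) < size) {
        strcpy(buf, inherited);
        return buf;
    }
    return getcwd(buf, size);
}
```

Keeping the inherited value when it is valid preserves the symlinked path the user actually typed, while the stat() comparison guards against a stale or forged $PWD.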
| justincormack wrote:
| Most of the stuff that configure scripts check is obsolete, and
| breaks in situations like this as the checks are often not
| workable without running code. It is likely the check does not
| apply to any system that has existed for decades. Lots of systems
| have disabled it, e.g. Nix in 2017 [1]
|
| [1]
| https://github.com/NixOS/nixpkgs/commit/dff0ba38a243603534c9...
| arp242 wrote:
| I had a look at the bash source code a few years back, and
| there are tons of hacks and workarounds for 1980s-era systems.
| Looking at the git log, GETCWD_BROKEN was added in bash 1.14
| from 1996, presumably to work around some system at the time (a
| system which was perhaps already old in 1996, but it's not
| detailed which).
|
| Also, that getcwd.c which contains the getcwd() fallback and
| bug is in K&R C, which should be a hint at how well maintained
| all of this is. Bash takes "don't fix it if it ain't broke" to
| new levels, to the point of introducing breakage like here (the
| bash-malloc is also notorious for this - no idea why that's
| still enabled by default).
| malkia wrote:
| Autoconf is the prime example of easy vs simple.
|
| It looks easy on the surface to roll down support for any kind of
| operating system there is, based on auto-detection and then #if
| HAVE_THIS or #if HAVE_THAT, but it breaks in ways that may be
| really hard to untangle later.
|
| I'd rather have a limited set of configurations targeting
| specific platforms/flavors, and knowing that no matter how I
| compile it, I would know what is `#define`-d and what is not,
| instead of guessing on what the "host" might have.
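
One way the "limited set of configurations" idea might look in C, as a sketch only. The macro names PLATFORM_NAME and USE_LIBC_GETCWD and the function platform_name are hypothetical; the point is that each supported platform gets an explicit block and anything else fails at compile time instead of falling through to a guessed default.

```c
#include <stddef.h>

/* One explicit block per supported platform. The macro names
 * PLATFORM_NAME and USE_LIBC_GETCWD are hypothetical, for illustration. */
#if defined(__linux__)
#  define PLATFORM_NAME "linux"
#  define USE_LIBC_GETCWD 1
#elif defined(__APPLE__)
#  define PLATFORM_NAME "macos"
#  define USE_LIBC_GETCWD 1
#elif defined(__FreeBSD__)
#  define PLATFORM_NAME "freebsd"
#  define USE_LIBC_GETCWD 1
#else
   /* No silent fallback: an unknown target must be configured by hand. */
#  error "Unsupported platform: add an explicit configuration block."
#endif

/* Report which configuration block was selected at compile time. */
const char *platform_name(void)
{
    return PLATFORM_NAME;
}
```

The trade-off is that porting to a new system requires editing the source, but what is defined for a given target can be read off directly rather than inferred from a configure run.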
___________________________________________________________________
(page generated 2025-06-25 23:01 UTC)