[HN Gopher] Ghosts of Unix past, part 3: Unfixable designs (2010)
___________________________________________________________________
Ghosts of Unix past, part 3: Unfixable designs (2010)
Author : wmanley
Score : 135 points
Date : 2021-05-17 14:25 UTC (8 hours ago)
(HTM) web link (lwn.net)
(TXT) w3m dump (lwn.net)
| kazinator wrote:
| Forgets threads!
|
| Speaking of signals, the mess of when POSIX threads collided with
| signals. And with fork. And chdir being process wide ...
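|
| The "chdir being process wide" point can be seen directly. A
| minimal sketch, assuming CPython on a POSIX system; the helper
| name chdir_in_thread is made up for illustration:

```python
import os
import tempfile
import threading


def chdir_in_thread(path):
    """Run os.chdir(path) in a worker thread, then report the caller's cwd."""
    worker = threading.Thread(target=os.chdir, args=(path,))
    worker.start()
    worker.join()
    # Evaluated in the calling thread, which never called chdir itself.
    return os.getcwd()
```

| The calling thread sees the new directory because the working
| directory belongs to the process, not to the thread.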
| Juntu wrote:
| Biometrical specimens can be another 5 to 20 years. No threads
| old now tomorrow I can create data to show easy tasks the year is
| decade plus one passed part 2 and 1. No threads
| jude- wrote:
| I think "unfixable" is in the eye of the beholder. A lot of these
| design choices make a lot more sense if you believe that complex
| Unix applications are supposed to be built from multiple loosely-
| coupled programs that perform exactly one task each, and
| communicate by piping data to one another. I think the two design
| decisions discussed in this article only create problems for
| developers who write programs that do "too much," such that the
| programs can no longer make best use of the OS facilities.
|
| > Unix signals
|
| Unix signals are an asynchronous best-effort form of out-of-band
| IPC. Because the programs that make up your Unix application
| _already use_ pipes for IPC (which are synchronous and reliable),
| the role of the signal handler in a program would be to either
| absorb the signal by taking some localized action, or translate
| the signal into some piped IPC message to other program(s) in the
| application to consume and handle.
|
| It's been pointed out elsewhere that threads and signals don't
| play nicely. But that shouldn't be a problem for a multi-program
| Unix application -- you'd keep the multi-threaded logic in a
| separate program(s) from the signal-handling logic, and have the
| signal-handling program forward the multi-threaded program the
| signal data in-band, via a pipe. For example, you might factor
| the application into a supervisor program and one or more
| subordinate programs (which can be multi-threaded), and have the
| supervisor intercept signals and route the relevant IPC
| notification to subordinates via pipes.
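|
| The "translate the signal into a piped message" idea could be
| sketched roughly as follows. This is only an illustration, not
| anything from the article: forward_signals_to_pipe and
| next_signal are hypothetical names, and the one-byte message
| format is an assumption.

```python
import os
import signal


def forward_signals_to_pipe(signums):
    """Install handlers that turn each listed signal into one byte on a pipe.

    Returns the read end; whoever holds it consumes signals in-band.
    """
    read_fd, write_fd = os.pipe()
    os.set_blocking(write_fd, False)  # never block inside a handler

    def handler(signum, frame):
        try:
            os.write(write_fd, bytes([signum]))
        except BlockingIOError:
            pass  # pipe is full: this notification is dropped

    for signum in signums:
        signal.signal(signum, handler)
    return read_fd


def next_signal(read_fd):
    """Block until a forwarded signal arrives and return it."""
    data = os.read(read_fd, 1)
    return signal.Signals(data[0])
```

| A supervisor would hand the read end (or a message derived from
| it) to its subordinates instead of letting them install handlers.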
|
| > Unix permissions
|
| The "one user and one group" model for files stops being so
| limiting if you can make it so the different programs that make
| up your application run as different users and groups. For
| example, a "logger" program in your application would have a
| separate user/group ID than a "database" program, and in doing
| so, ensure that the "logger" program can only access log state,
| and the "database" program can only access database state.
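|
| A small model of the classic owner/group/other check shows why
| separate accounts help. It is a simplification (no root, no
| ACLs), and the numeric IDs are made up for the example:

```python
import stat


def may_read(mode, file_uid, file_gid, proc_uid, proc_gids):
    """Classic Unix read check: only the first matching class applies."""
    if proc_uid == file_uid:
        return bool(mode & stat.S_IRUSR)
    if file_gid in proc_gids:
        return bool(mode & stat.S_IRGRP)
    return bool(mode & stat.S_IROTH)


# Hypothetical accounts for the two programs.
LOGGER_UID, LOGGER_GID = 1001, 1001
DATABASE_UID, DATABASE_GID = 1002, 1002

# A log file owned by the logger account, mode rw-r-----.
LOG_FILE = dict(mode=0o640, file_uid=LOGGER_UID, file_gid=LOGGER_GID)
```

| With these values the "logger" account can read the log file and
| the "database" account cannot, with no application-level checks.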
| zxzax wrote:
| But that's more working around the problems than fixing them,
| no? You can't really move all signal handling logic to a
| dedicated process, because every process can receive signals.
| And as for databases: every major production database I've seen
| implements its own authentication and permission scheme, so it
| can do things like provide ACLs on a more granular (per-table,
| sometimes per-row) basis.
| jude- wrote:
| > But that's more working around the problems than fixing
| them, no?
|
| I don't think Unix signal behavior is the problem. The
| problems outlined in the article stem from people using them
| inappropriately. The new signal syscalls introduced in Linux
| over the years haven't stopped people from misusing them.
|
| > You can't really move all signal handling logic to a
| dedicated process, because every process can receive signals.
|
| Processes are not obliged to take action in response to
| signals. But, they _could_ simply propagate the signal data
| to the parent process via a pipe file descriptor it inherits.
| Then, you _could_ place all the signal-handling logic into a
| supervisor -- the supervisor would get notified via a pipe
| when one of its descendants receives a signal, and take
| appropriate action.
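|
| One way to sketch "propagate the signal data to the parent via
| an inherited pipe" is below. The (pid, signal) record layout and
| the name spawn_reporting_child are assumptions for illustration;
| the child uses sigwait so it needs no asynchronous handler.

```python
import os
import signal
import struct

REPORT = struct.Struct("ii")  # (pid of receiver, signal number)


def spawn_reporting_child(write_fd, signum=signal.SIGUSR1):
    """Fork a child that reports the next `signum` it gets on write_fd."""
    # Block before forking so the child can never take the default action.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signum})
    pid = os.fork()
    if pid == 0:
        try:
            received = signal.sigwait({signum})  # synchronous receive
            os.write(write_fd, REPORT.pack(os.getpid(), received))
        finally:
            os._exit(0)
    signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)  # parent only
    return pid
```

| The supervisor then reads REPORT.size bytes from the other end
| of the pipe and decides what to do.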
|
| > And as for databases: every major production database I've
| seen implements its own authentication and permission scheme,
| so it can do things like provide ACLs on a more granular
| (per-table, sometimes per-row) basis.
|
| No one said an application can't have its own authentication
| and permission scheme. All I am saying is that if you factor
| your application into multiple processes running under
| different system-level user accounts, you can get more
| mileage out of the Unix permission system than you could
| otherwise, because the kernel would be able to distinguish
| individual pieces of your application as having different
| sets of permissions.
| bombcar wrote:
| Isn't the case with something like signals that it simply needs
| to be left as is, with a new API for interprocess communication
| developed alongside?
|
| It seems pretty clear that they're used for way more than
| originally expected (did threads even exist when signals began?)
| - and I suspect a number of systems use other communication paths
| already.
| Jasper_ wrote:
| A proper inter-process communication mechanism that supports
| multicast has been proposed to the kernel multiple times and
| denied every time. So it's probably not going to happen.
| nine_k wrote:
| Why was it denied?
| cbsmith wrote:
| Unix has a plethora of IPC APIs, almost all of which were
| invented after signals (e.g. sockets).
|
| Signals themselves got new APIs long before signalfd: sigaction
| and posix real-time signals were already a thing, as were posix
| threads, when Linux was invented.
| User23 wrote:
| What's really sad is that multi user system interrupts were
| long since a solved problem when Unix was developed. I don't
| know why that existing body of knowledge wasn't applied.
| pjc50 wrote:
| Which current system uses this? Do you have a shorter
| explanation than the Dijkstra link below, like API docs?
| coldtea wrote:
| So, like warts in C and Go that were not present or fixed in
| languages a decade or more earlier?
| pjmlp wrote:
| Yep, somehow there is a common line going on there.
| linschn wrote:
| Do you have a reference on pre-1969 multi user interrupts
| being solved? Also, Unix was developed on a ~16kB RAM machine
| IIRC... Maybe that's the reason?
| User23 wrote:
| Sure, here's Dijkstra on it[1]. The X1[2] was a
| significantly more limited machine than the PDP-11. I
| believe that work predates Unix by nearly a decade.
|
| [1] https://www.cs.utexas.edu/users/EWD/transcriptions/EWD1
| 3xx/E...
|
| [2] https://ub.fnwi.uva.nl/computermuseum//X1.html
| linschn wrote:
| Thanks
| pm215 wrote:
| The impression I got from reading the v6 kernel code in the
| Lions book was that signal handling had been added in as a
| solution to a few specific problems (like "we're going to
| kill this process but maybe it should get a chance to clean
| up first"). If you're thinking about them from that viewpoint
| then the (now) well-known problems like interruption-of-
| syscalls and the initial "when you take a signal the handler
| gets unregistered" don't seem like such a big deal -- after
| all, the process is going to exit anyway.
| pjmlp wrote:
| Just like safer systems programming languages predate C by
| about 10 years.
|
| It is not as if the UNIX culture was to pay attention to best
| practices being done on other systems.
| icedchai wrote:
| Threads arrived relatively late to Unix, long after signals. I
| think Solaris 2.x was the first mainstream Unix to have
| threads.
| usr1106 wrote:
| Are you talking about kernel threads or user space threads? I
| believe none of them were really in Unix very early, but the
| time of introduction varied.
| icedchai wrote:
| I was referring to kernel threads.
| usr1106 wrote:
| I vaguely remember that kernel threads were something new
| in HP-UX at the end of the 1990s.
|
| The question is, was that late? Windows NT had kernel
| threads from the beginning, so maybe a few years earlier.
| But then it took NT years to become stable enough to be
| used in servers, so saying they were generally ahead
| would not be a correct description.
|
| So if Unix is considered late (according to the GGP) and
| NT not a comparable competitor, who was really early? If
| anybody.
| QuesnayJr wrote:
| Didn't OS/2 have threads?
| icedchai wrote:
| BeOS was heavily threaded.
| pjmlp wrote:
| Xerox PARC workstations, Solo OS, Topaz are some early
| examples.
| pavon wrote:
| Title should have (2010)
| bediger4000 wrote:
| Lots of difference between how unfixable design problems are
| treated in open source vs closed source operating systems. Is
| this difference good or bad?
| [deleted]
| surajrmal wrote:
| I would argue being open source vs closed source doesn't
| matter. The governance model does, as do the priorities of the
| project. This isn't to say you need a BDFL or single company
| running the project to address "unfixable" problems, but they
| certainly do seem to help.
|
| On a more meta note, open source means a lot of different
| things. There is actually a lot of nuance in the different
| styles of open source. Linux vs chromium vs that project that
| just does source dumps. Whether they accept contributions,
| accept bug reports/feature requests, allow you to build from
| source (source dumps often don't include a working build
| system), have open communication channels, etc can all vary. I
| hope we have more specific terms for the different styles of
| open source in the future.
| bombcar wrote:
| I agree - though open source does allow for a major fork if
| the users don't agree with the developers (or project
| leadership) on how it should be fixed.
|
| Project leadership is the most significant aspect - look at
| Linus's absolute declaration that the kernel can "never break
| userspace" meaning that once an API is exported to userspace
| it never gets removed.
|
| This is actually similar to Microsoft's philosophy though
| theirs is more "business oriented" (nobody will buy Win95 if
| their DOS and Win 3.1 programs won't work). Another example
| of this is Knuth's TeX code.
|
| Open source developments seem to lean (in general) more
| toward "rip it out and replace everything" (see for example
| internal kernel APIs not exported to userspace) because
| access to the source means they can fix the things that touch
| it. Closed source programs more likely just die and get
| entirely replaced, otherwise they roughly try to keep working
| as is.
| cbsmith wrote:
| Open source isn't a governance model.
| zxzax wrote:
| Something else that doesn't often get brought up here is that
| kill(2) itself is an unfixable race condition waiting to happen.
| It's only safe to use that to signal direct child processes. In
| Linux, programs should be using the newer pidfd_send_signal
| syscall in almost every case where they would otherwise use
| kill(2).
|
| Edit: waitpid is also similarly broken and unfixable for a lot of
| the same reasons as signals; pidfds and waitid(P_PIDFD, ...)
| should be replacing most uses of that as well.
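|
| A rough sketch of the pidfd-based replacements mentioned above,
| assuming Linux 5.4+ and Python 3.9+ (where os.pidfd_open,
| signal.pidfd_send_signal and os.P_PIDFD exist). The helper name
| stop_and_reap is made up:

```python
import os
import signal


def stop_and_reap(pidfd, sig=signal.SIGTERM):
    """Signal the process behind pidfd, then reap it via the same descriptor.

    The caller must be the parent of that process for waitid to succeed.
    """
    signal.pidfd_send_signal(pidfd, sig)
    return os.waitid(os.P_PIDFD, pidfd, os.WEXITED)
```

| Neither call names a numeric pid, so neither can land on an
| unrelated process that later reuses the number.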
| colonwqbang wrote:
| Could you explain what the problem is, or provide a link?
| [deleted]
| taviso wrote:
| The primary problem is that a process could exit, and the pid
| might be recycled, so you kill() the wrong process.
| MaxBarraclough wrote:
| The classic ABA problem, then.
|
| https://en.wikipedia.org/wiki/ABA_problem
| orthonormal wrote:
| Had no idea about this! Thank you. Now I'm starting to
| understand why undefined behavior safety is such a walled
| garden. All sorts of snakes might be lurking beneath.
| asveikau wrote:
| This is a (rare?) instance where I would say Win32 gives
| you some remedy over POSIX. On Windows, you can open a
| handle and deal with that, rather than a pid. Once the
| handle is open, it isn't subject to this recycling problem.
|
| However if opening a handle based on pid you may want to
| double-check that the handle matches your expectation
| before using it, since that would be prone to the same
| race.
| simcop2387 wrote:
| This is actually what pidfd does on linux, you get a
| handle to the process that lets you interact with it in
| that same manner. Once the process exits the handle gets
| closed and all the operations will report an error even
| if you have a recycled pid
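|
| The "operations report an error" behaviour might look like this
| in practice. A hedged sketch, assuming Linux 5.3+ and Python
| 3.9+; signal_if_alive is a hypothetical helper. Note that the
| descriptor itself stays open until the holder closes it; what
| changes is that signalling through it fails with ESRCH once the
| process has been reaped.

```python
import os
import signal


def signal_if_alive(pidfd, sig=signal.SIGTERM):
    """Send sig through pidfd; return False if that process is gone."""
    try:
        signal.pidfd_send_signal(pidfd, sig)
        return True
    except ProcessLookupError:  # ESRCH: exited and already reaped
        return False
```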
| hnarn wrote:
| This might be naive of me, but isn't this in a way fixed by
| systemd service handling? Assuming the process in question
| is, in fact, handled as a service of course.
| zxzax wrote:
| The issue is solved in any service manager as long as the
| service doesn't fork: when you are the parent process you
| can ensure that you don't reap the child before sending a
| signal.
|
| Once the service forks then it becomes a problem. If you
| use cgroups it can be solved separately with the cgroup
| freezer, but there are still some open issues with this
| in systemd:
| https://github.com/systemd/systemd/issues/13101
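|
| For the non-forking case, "don't reap the child before sending
| a signal" could be sketched as below. This assumes the caller is
| the direct parent and has not yet waited on the pid; the helper
| name signal_unreaped_child is made up.

```python
import os
import signal


def signal_unreaped_child(pid, sig=signal.SIGTERM):
    """Signal a direct, not-yet-reaped child unless it has already exited."""
    # WNOWAIT only peeks: the child stays waitable, so its pid stays reserved.
    state = os.waitid(os.P_PID, pid, os.WEXITED | os.WNOHANG | os.WNOWAIT)
    if state is not None:
        return False  # already exited (a zombie); nothing left to signal
    # Safe even if the child exits right now: until we reap it, the pid
    # cannot be handed to another process.
    os.kill(pid, sig)
    return True
```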
| Quekid5 wrote:
| Correct.
| Jasper_ wrote:
| I've always felt like it should be the case that as long as
| a pidfd for that process is open, the pid doesn't get
| recycled, so you could open the pidfd and then use kill
| safely, then close it later. Means you wouldn't need a
| whole bunch of new syscalls.
|
| Unfortunately, it seems like this idea was rejected during
| the introduction of pidfd.
| zxzax wrote:
| I think the idea with that was it would lead to denial-
| of-service type situations where some process could leak
| a bunch of pidfds and then that would cause exhaustion of
| pids everywhere else.
| Jasper_ wrote:
| A process could also do that by spawning a bunch of
| processes if it wanted to.
| zxzax wrote:
| Not really, most systems should set RLIMIT_NPROC to
| prevent that. If pidfds held onto the pid, it would
| create a new denial-of-service that allowed random other
| processes to keep zombie processes open, and the fix for
| it would actually allow you to circumvent that limit!
| Jasper_ wrote:
| You can also set RLIMIT_NOFILE to limit the number of
| FDs the app can open if you're worried about it.
| zxzax wrote:
| I don't think that would necessarily solve it, since the
| maximum number you can have open is still RLIMIT_NPROC *
| RLIMIT_NOFILE, right? It seems it would still be a
| problem as long as it's greater than RLIMIT_NPROC. Edit:
| I suppose you could fix it as long as you could guarantee
| that NPROC * NOFILE * maxlogins < kernel.pid_max... but
| to me this is piling on more workarounds.
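|
| The inequality above can be checked with plain arithmetic. This
| models only the hypothetical "an open pidfd pins its pid" rule
| being debated, not how Linux behaves; the function names and the
| sample limits are illustrative assumptions.

```python
def pinned_pid_bound(nproc, nofile, maxlogins=1):
    """Worst-case pids pinned if every open fd were a pidfd."""
    return nproc * nofile * maxlogins


def bound_is_safe(nproc, nofile, maxlogins, pid_max):
    """True when the worst case cannot exhaust the pid space."""
    return pinned_pid_bound(nproc, nofile, maxlogins) < pid_max
```

| For instance, 4096 processes with 1024 descriptors each already
| reach 4194304 pinned pids, so a pid_max of 4194304 would not be
| enough under that rule.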
| zxzax wrote:
| The problem was discussed quite a lot around the introduction
| of pidfds:
|
| https://lwn.net/Articles/773459/
|
| https://lwn.net/Articles/784831/
|
| In essence it's a classic TOCTTOU.
| SavantIdiot wrote:
| Regarding file permissions: This kind of archaeology (or
| forensics) is very important. Why? Because it exposes the trial-
| and-error over a multi-decade evolution. Sometimes trial-and-
| error is used as a pejorative (brute force hacking), but over the
| course of decades as technology advances, it is inevitable. The
| author is clear to point out that much of the issue here was due
| to scalability, but I think there is something else at work:
| unknown unknowns. It is impossible to be a 100% defensive
| software architecture team, and "room to grow" is usually
| jettisoned because it can lead to sloppy code, or worse, attack
| vectors. It's such a hard problem and analyses like these papers
| are a first step in what I believe will become a full-blown
| historic discipline of software meta-thought. I say "become"
| because you can't really do this kind of analysis with 5, 10 or
| 20 years of history: you need multiple decades, and that is just
| now upon us.
|
| I think there are applications for this kind of study. It can
| very clearly feed back into current practices, and possibly even
| more formalized language syntax that can be defensive and
| extensible. I would also love to see if this kind of analysis
| bears out which aspects of various languages (and architectural
| OS decisions) proved to be the most robust. Like with hemoglobin:
| it is one of the largest and oldest genes, it is hard to break
| via mutation, and is shared by every animal with oxygenated
| blood cells. Something was done right with that design!
| nooyurrsdey wrote:
| Really enjoyed reading about the struggles of implementing file
| permissions.
|
| It seems like something that should be so simple, but once you
| sit down and try to build it you'll realize you have to support
| so many use cases. I bet if you asked everyone on HN how they'd
| do it, you'd end up with so many confident answers that also had
| shortcomings themselves.
| williesleg wrote:
| Well kids, get off your ass and write a new operating system. We
| invented it, you all sit on your ass and do nothing but complain.
| With ads.
| primis wrote:
| One of the complaints brought up - the 16 bit group/uid seems to
| have been fixed quite nicely in modern linux systems by adding an
| additional 16 bits to each. It seems these problems aren't
| "unfixable" after all
| mmcgaha wrote:
| The idea that having ownership and permission bits at the file
| level is a problem fixed by moving the permissions to the
| directory level completely hand-waves over the fact that hard
| links exist in unix file systems. They need to think a little
| harder about that Mencken quote.
| wmanley wrote:
| See the next article in the series "Ghosts of Unix past, part
| 4: High-maintenance designs"[1]. This specifically addresses
| how the existence of hard links is elegant in itself, but
| exports complexity to other parts of the system.
|
| [1]: https://lwn.net/Articles/416494/
| Macha wrote:
| Proper ACLs exist on Linux these days as an alternative to
| user/group permissioning as well for use cases which call for a
| more powerful system.
| bombcar wrote:
| The "systematic" part of things is relatively easy to handle
| (as the code can be made to handle anything complex) - it's
| the "user" interface that is harder. A user with root access
| wants to give access to a given file/directory to a user -
| this needs to be made easy to do successfully and securely.
| Too many times I've seen entire web directories 777 because
| they just wanted it to work.
|
| Commands providing "why user X can't access Y" and
| recommended solutions can help.
| samf wrote:
| Yes, this is a major problem when you introduce ACLs into
| unix-like systems. A comment in the article mentions the
| "Richacl" work. A key problem with this work was that even
| "chmod 777" might not get you out of a situation where an
| ACL was denying access. It's been over ten years since I've
| been involved in this; it might have changed.
|
| The POSIX draft ACLs had the same problem, where a chmod
| might not grant you the permission that you're asking for.
| Back when Solaris implemented POSIX draft ACLs, they needed
| to change many user-level interfaces (e.g., the chmod
| command and the ftp daemon) to have a chmod request work
| the way end users expected.
| devchix wrote:
| Have you worked with setfacl(1), getfacl(1) recently? The
| agony they inflict makes me want to die. Do you need log dirs
| read by a non-root logreader? Are there nested subdirs? What
| are the defaults? Extra crispy boss-mode: is SELinux on? I
| think the extended ACLs have taken us further into the weeds,
| and I think the permission architecture needs to be rethought
| entirely. It was designed for shared university-type
| computing resources at a time when 30 profs and researchers
| shared dirs and commingled a set of files, and daemons are
| users with own places to keep things. No longer. The RBAC and
| inheritance model, I dunno, they may work correctly but they
| are so fiddly with so many knobs and intersections that you
| end up front-loading a huge amount of work; nobody wants to
| do that, nor have I seen it done correctly, with design and
| intent.
| GauntletWizard wrote:
| I'm actually fully behind the POSIX permissions model as a
| solution for this: If you have a group that all needs
| read/write, no big deal. If you have a group that needs to
| write and the world reads, no big deal. If you have a group
| that writes and another group that reads: No big deal, so
| long as you have a third group that's the union of both
| groups and can have a multi-level subdirectory (where a/
| has 750 and a/b/ has 775). If you have groups that need to
| read and groups that need to write in a more complicated
| (or somehow path-specific) problem, you probably need a
| daemon or setuid program to moderate access, and that's
| okay.
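|
| The writers/readers/union-group layout can be modelled with
| the classic class check. This is a simplified sketch (no root,
| no ACLs); the group IDs and helper names are hypothetical.

```python
R, W, X = 4, 2, 1


def class_bits(mode, owner, group, uid, gids):
    """Return the rwx bits of the first class that matches the process."""
    if uid == owner:
        return (mode >> 6) & 7
    if group in gids:
        return (mode >> 3) & 7
    return mode & 7


# Hypothetical gids; EVERYONE is the union of writers and readers.
WRITERS, READERS, EVERYONE = 2001, 2002, 2003
OUTER = dict(mode=0o750, owner=0, group=EVERYONE)  # a/
INNER = dict(mode=0o775, owner=0, group=WRITERS)   # a/b/


def access(uid, gids, want):
    """Can this process use a/b/ with permission `want` (R or W)?"""
    if not class_bits(uid=uid, gids=gids, **OUTER) & X:
        return False  # cannot even traverse a/
    need = want | X
    return (class_bits(uid=uid, gids=gids, **INNER) & need) == need
```

| Writers match the group class on a/b/, readers fall through to
| "other" there, and anyone outside the union group is stopped at
| a/.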
|
| Happy to argue it or simply be told I'm wrong, but I've yet
| to encounter a not-insane permissions model that I couldn't
| solve with some "simple" nested groups (that in and of
| itself is a tooling problem, but a solvable one) and POSIX.
| trasz wrote:
| Linux is probably the last major system not supporting NFSv4
| ACLs. Windows (obviously), MacOS X, Solaris, FreeBSD - all
| those support them - for at least a decade now.
___________________________________________________________________
(page generated 2021-05-17 23:00 UTC)