[HN Gopher] IPC - Unix Signals
___________________________________________________________________
IPC - Unix Signals
Author : goodyduru
Score : 43 points
Date : 2023-10-09 13:09 UTC (9 hours ago)
(HTM) web link (goodyduru.github.io)
(TXT) w3m dump (goodyduru.github.io)
| butterisgood wrote:
| Signals are such a mess - especially with a multi-threaded
| process/service.
|
| I suspect very strongly you'll get a much better behaved system
| if you stick to unix domain sockets.
|
| Then, at least, you get some form of authn from the permissions
| of the filesystem namespace.
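|
| For reference, a minimal sketch of that idea: an AF_UNIX
| listener placed inside a mode-0700 directory, so the filesystem
| decides who may connect. The /tmp/myapp path is an illustrative
| placeholder, not something from the article.
|
|   /* Access control comes from the permissions on the containing
|    * directory: only its owner can traverse it and reach the
|    * socket. A server could additionally check SO_PEERCRED. */
|   #include <errno.h>
|   #include <stdio.h>
|   #include <string.h>
|   #include <sys/socket.h>
|   #include <sys/stat.h>
|   #include <sys/un.h>
|   #include <unistd.h>
|
|   int main(void) {
|       const char *dir = "/tmp/myapp";           /* hypothetical */
|       const char *path = "/tmp/myapp/ctl.sock";
|
|       if (mkdir(dir, 0700) == -1 && errno != EEXIST) {
|           perror("mkdir");
|           return 1;
|       }
|       unlink(path);                   /* remove any stale socket */
|
|       int fd = socket(AF_UNIX, SOCK_STREAM, 0);
|       if (fd == -1) { perror("socket"); return 1; }
|
|       struct sockaddr_un addr = { .sun_family = AF_UNIX };
|       strncpy(addr.sun_path, path, sizeof(addr.sun_path) - 1);
|
|       if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) == -1 ||
|           listen(fd, 8) == -1) {
|           perror("bind/listen");
|           return 1;
|       }
|       printf("listening on %s\n", path);
|       /* accept() loop would go here */
|       close(fd);
|       return 0;
|   }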
| jlokier wrote:
| _> A new pid is always more than the previously assigned pid
| during an OS uptime_
|
| This isn't true. The pid increments most of the time, but
| occasionally the new pid jumps down to a seemingly arbitrary
| value to start a new range. It is switching between unused
| ranges. This happens on my Linux machines all the time.
|
| However, it doesn't matter for what the article describes: it
| doesn't depend on the pid always increasing, it only depends on
| the pid for each process being unique.
|
| Pid reuse usually doesn't matter, but it occasionally matters
| with how signals are typically used. Many scripts are written to
| signal a process that might have terminated already, e.g. using
| a pid stored in an environment variable or file. This usually
| just returns an error, but if there has been enough system
| activity in the meantime, the pid may belong to a new process
| that the script does not intend to signal, so a random process
| may be killed unexpectedly.
| rcr wrote:
| Linux introduced pidfds for this reason.
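|
| A minimal sketch of what that looks like (assuming Linux >= 5.3;
| raw syscalls are used because the glibc wrappers are relatively
| recent):
|
|   /* A pidfd pins one specific process: if it exits and its pid
|    * is recycled, the fd does not start referring to the new
|    * process, so the signal can't hit the wrong target. */
|   #define _GNU_SOURCE
|   #include <signal.h>
|   #include <stdio.h>
|   #include <sys/syscall.h>
|   #include <sys/types.h>
|   #include <sys/wait.h>
|   #include <unistd.h>
|
|   int main(void) {
|       pid_t child = fork();
|       if (child == -1) { perror("fork"); return 1; }
|       if (child == 0) { pause(); _exit(0); }  /* child just waits */
|
|       int pidfd = (int)syscall(SYS_pidfd_open, child, 0);
|       if (pidfd == -1) { perror("pidfd_open"); return 1; }
|
|       /* Signal through the fd instead of kill(pid, ...). */
|       if (syscall(SYS_pidfd_send_signal, pidfd, SIGTERM, NULL, 0) == -1) {
|           perror("pidfd_send_signal");
|           return 1;
|       }
|       close(pidfd);
|       waitpid(child, NULL, 0);
|       return 0;
|   }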
| duped wrote:
| I feel a bit nerd sniped: On Linux PIDs are only unique within
| a PID namespace. But that doesn't matter because the entire
| point of a PID namespace is to isolate processes.
|
| But it's worth mentioning because it can mitigate the problem
| you've laid out. If you have a long-running process that may
| need to kill its children, you can start it within a new PID
| namespace such that it may only kill its children (or
| descendants).
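|
| A minimal sketch of that setup (needs CAP_SYS_ADMIN, i.e.
| typically root): after unshare(CLONE_NEWPID) the next fork()ed
| child becomes pid 1 of a fresh namespace, and any kill() issued
| from inside it can only name processes in that namespace.
|
|   #define _GNU_SOURCE
|   #include <sched.h>
|   #include <stdio.h>
|   #include <sys/types.h>
|   #include <sys/wait.h>
|   #include <unistd.h>
|
|   int main(void) {
|       if (unshare(CLONE_NEWPID) == -1) { perror("unshare"); return 1; }
|
|       pid_t child = fork();     /* pid 1 inside the new namespace */
|       if (child == -1) { perror("fork"); return 1; }
|
|       if (child == 0) {
|           printf("inside the namespace my pid is %d\n", (int)getpid());
|           /* exec the long-running supervisor here; its kill()
|            * calls can only reach its own descendants */
|           _exit(0);
|       }
|       waitpid(child, NULL, 0);
|       return 0;
|   }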
| rcr wrote:
| This doesn't entirely solve the problem of potentially
| killing the wrong process though
| duped wrote:
| No, but it's a lot less likely for a PID to get reused by
| accident within a new PID namespace.
| remram wrote:
| That should only be a problem if you're running as root.
| Running as a dedicated user is also a mitigation.
| goodyduru wrote:
| Author here. Thanks for the feedback! Corrected the error.
| jeffbee wrote:
| Anyone who thinks they understand unix signals is fooling
| themselves. Anyway, the claim that you can exchange half a
| million small messages per second using signals is based on a
| misunderstanding. The benchmark suite in question passes no
| data; it only ping-pongs the signal.
|
| https://github.com/goldsborough/ipc-bench
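|
| The shape of what's being measured is roughly this (a sketch,
| not the benchmark's actual code): no payload moves, each side
| just waits for SIGUSR1 and signals the other back.
|
|   #include <signal.h>
|   #include <stdio.h>
|   #include <sys/types.h>
|   #include <sys/wait.h>
|   #include <unistd.h>
|
|   #define ROUNDS 100000
|
|   int main(void) {
|       sigset_t set;
|       sigemptyset(&set);
|       sigaddset(&set, SIGUSR1);
|       sigprocmask(SIG_BLOCK, &set, NULL); /* collect via sigwait() */
|
|       pid_t parent = getpid();
|       pid_t child = fork();
|       if (child == -1) { perror("fork"); return 1; }
|
|       int sig;
|       if (child == 0) {                   /* "pong" side */
|           for (int i = 0; i < ROUNDS; i++) {
|               sigwait(&set, &sig);        /* wait for the ping */
|               kill(parent, SIGUSR1);      /* answer with a pong */
|           }
|           _exit(0);
|       }
|       for (int i = 0; i < ROUNDS; i++) {  /* "ping" side */
|           kill(child, SIGUSR1);
|           sigwait(&set, &sig);
|       }
|       waitpid(child, NULL, 0);
|       printf("%d round trips, zero bytes of payload\n", ROUNDS);
|       return 0;
|   }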
| loeg wrote:
| > A new pid is always more than the previously assigned pid
| during an OS uptime[1]
|
| This is not universal. Pids can be reused in unixes. The
| footnote appears to be missing; I went to check whether it
| provided this context.
|
| > Signals are plenty fast. Cloudflare benchmarked 404,844 1KB
| messages per second. That can suit most performance needs.
|
| Uh, how is Cloudflare sending 1kB messages with signals? Neither
| this article nor the linked Cloudflare post seem to elaborate on
| this.
| goodyduru wrote:
| Author here. I've corrected the errors. Thanks!
| Denvercoder9 wrote:
| > Uh, how is Cloudflare sending 1kB messages with signals?
| Neither this article nor the linked Cloudflare post seem to
| elaborate on this.
|
| Cloudflare links to the code they used; it doesn't actually
| seem to transfer any message (note that the README there also
| marks unix signals as broken):
| https://github.com/goldsborough/ipc-bench/blob/master/source...
| toth wrote:
| If you look at the link to the Cloudflare post [1], they are
| comparing a bunch of different IPC methods. I bet what they did
| is: for all the other ones (shared memory, mmapped files, unix
| sockets, etc.) they did send 1KB messages, but for signals they
| are just sending the signal and its associated int. It's the
| only thing that makes sense to me.
|
| [1] https://blog.cloudflare.com/scalable-machine-learning-at-
| clo...
| jandrese wrote:
| That's a good question. Maybe they set up 1 KB of shared memory
| and notify the far process that it is ready via a signal? Seems
| a bit clunky.
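|
| A sketch of that arrangement, purely to illustrate the guess
| (not anything the Cloudflare post documents): the writer fills a
| 1 KB shared mapping, then the signal only says "look at the
| buffer".
|
|   #define _DEFAULT_SOURCE
|   #include <signal.h>
|   #include <stdio.h>
|   #include <string.h>
|   #include <sys/mman.h>
|   #include <sys/types.h>
|   #include <sys/wait.h>
|   #include <unistd.h>
|
|   #define MSG_SIZE 1024
|
|   int main(void) {
|       /* Anonymous mapping shared across the fork below. */
|       char *buf = mmap(NULL, MSG_SIZE, PROT_READ | PROT_WRITE,
|                        MAP_SHARED | MAP_ANONYMOUS, -1, 0);
|       if (buf == MAP_FAILED) { perror("mmap"); return 1; }
|
|       sigset_t set;
|       sigemptyset(&set);
|       sigaddset(&set, SIGUSR1);
|       sigprocmask(SIG_BLOCK, &set, NULL);
|
|       pid_t child = fork();
|       if (child == -1) { perror("fork"); return 1; }
|
|       if (child == 0) {                 /* reader */
|           int sig;
|           sigwait(&set, &sig);          /* "data is ready" */
|           printf("reader saw: %.16s...\n", buf);
|           _exit(0);
|       }
|       /* writer */
|       memset(buf, 'x', MSG_SIZE - 1);
|       buf[MSG_SIZE - 1] = '\0';
|       kill(child, SIGUSR1);             /* carries no data itself */
|       waitpid(child, NULL, 0);
|       return 0;
|   }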
| loeg wrote:
| Cloudflare's table has a separate row for shared memory, so,
| who knows.
| generalizations wrote:
| In FreeBSD at least, PIDs can also be randomized.
| dale_glass wrote:
| That's a bit misleading without mentioning all the mess that
| comes with signals.
|
| Threads are a problem. Reentrancy is a problem, so you can't just
| call printf from your signal handler. Libraries are a problem:
| independent users of signals will easily step on each other's
| toes. There's a lack of general purpose signals, which compounds
| the problem. Signals are inherited when forking. Signals are not
| queued and therefore can be lost.
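|
| For reference, the usual workaround for the reentrancy point is
| to do almost nothing in the handler itself, e.g. (a minimal
| sketch):
|
|   #include <signal.h>
|   #include <stdio.h>
|   #include <string.h>
|   #include <unistd.h>
|
|   static volatile sig_atomic_t got_sigusr1 = 0;
|
|   static void handler(int sig) {
|       (void)sig;
|       got_sigusr1 = 1;      /* async-signal-safe: only set a flag */
|   }
|
|   int main(void) {
|       struct sigaction sa;
|       memset(&sa, 0, sizeof(sa));
|       sa.sa_handler = handler;
|       sigemptyset(&sa.sa_mask);
|       sigaction(SIGUSR1, &sa, NULL);
|
|       for (;;) {
|           pause();          /* returns once a signal is delivered */
|           if (got_sigusr1) {
|               got_sigusr1 = 0;
|               /* printf and friends are safe out here, in normal
|                * program context, not inside the handler */
|               printf("handled SIGUSR1 in the main loop\n");
|           }
|       }
|   }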
|
| Signals carry no data, so they can't tell the program "This
| specific pipe closed"; you have to figure out for yourself which
| specific thing the signal is relevant to.
|
| I'm probably forgetting more stuff.
|
| TL;DR: Signals are a pain in the butt and IMO undesirable to use
| under most modern circumstances. If you have something better you
| can use, you probably should. It would be nice to have kdbus or
| something similar with more functionality and fewer footguns.
| Joker_vD wrote:
| > We can start our processes with the same process groups and
| send signals to each other using kill(0, signum). With this,
| there's no need for pid exchange, and IPC can be carried out in
| blissful pids ignorance.
|
| Just make sure your processes don't launch any other processes
| not written by you: you don't know who else might be sneaky
| enough to use this method. Compounded by the fact that Linux
| process groups don't nest, this leads to very "funny" debugging
| sessions where signals that aren't supposed to arrive suddenly
| do, or signals that are supposed to arrive don't.
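|
| A minimal sketch of what the quoted kill(0, signum) actually
| does: the signal goes to every member of the caller's process
| group, including a subprocess that merely inherited the group.
|
|   #include <signal.h>
|   #include <stdio.h>
|   #include <sys/types.h>
|   #include <sys/wait.h>
|   #include <unistd.h>
|
|   int main(void) {
|       sigset_t set;
|       sigemptyset(&set);
|       sigaddset(&set, SIGUSR1);
|       sigprocmask(SIG_BLOCK, &set, NULL); /* inherited by the child */
|
|       pid_t child = fork();
|       if (child == -1) { perror("fork"); return 1; }
|
|       if (child == 0) {
|           /* This child never asked to take part, but it shares
|            * the process group, so the group-wide signal reaches
|            * it anyway. Opting out would need setpgid(0, 0). */
|           int sig;
|           sigwait(&set, &sig);
|           printf("child %d got the group-wide signal\n", (int)getpid());
|           _exit(0);
|       }
|
|       kill(0, SIGUSR1);            /* pid 0 == "my process group" */
|       waitpid(child, NULL, 0);
|       return 0;
|   }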
| duped wrote:
| What's wrong with sockets? They have quite a few advantages for
| IPC over Unix signals.
|
| - They're scoped within the program (no global signal handler
| that you need to trampoline out of)
|
| - The number of IPC channels you can create is effectively
| unbounded (not really true, but much less limiting than signals).
| If another process or part of the program needs IPC you can just
| open a new channel without breaking code or invariants relied
| upon by any other IPC channels.
|
| - Reads/writes can be handled asynchronously without interrupting
| any thread in the program.
|
| - You can use them across the network (AF_INET), VMs (AF_VSOCK),
| or restrict locally (AF_UNIX)
|
| - Unix sockets can be used to send file descriptors around, even
| if they're opened by a child after forking. That includes using
| Unix sockets to send the file descriptor of _other_ sockets (eg
| program 1 is talking to program 2 on the same machine and program
| 2 opens an IPC channel with program 3 and wants to send it back
| to program 1).
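|
| A minimal sketch of the descriptor-passing part (SCM_RIGHTS over
| a socketpair; the file being opened is just an arbitrary
| example):
|
|   #include <fcntl.h>
|   #include <stdio.h>
|   #include <string.h>
|   #include <sys/socket.h>
|   #include <sys/types.h>
|   #include <sys/uio.h>
|   #include <sys/wait.h>
|   #include <unistd.h>
|
|   static void send_fd(int sock, int fd) {
|       char byte = 'x';             /* at least one byte must move */
|       struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
|       char cbuf[CMSG_SPACE(sizeof(int))];
|       struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1,
|                             .msg_control = cbuf,
|                             .msg_controllen = sizeof(cbuf) };
|       struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
|       cmsg->cmsg_level = SOL_SOCKET;
|       cmsg->cmsg_type = SCM_RIGHTS; /* "these bytes are an fd" */
|       cmsg->cmsg_len = CMSG_LEN(sizeof(int));
|       memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));
|       sendmsg(sock, &msg, 0);
|   }
|
|   static int recv_fd(int sock) {
|       char byte;
|       struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
|       char cbuf[CMSG_SPACE(sizeof(int))];
|       struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1,
|                             .msg_control = cbuf,
|                             .msg_controllen = sizeof(cbuf) };
|       if (recvmsg(sock, &msg, 0) == -1) return -1;
|       struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
|       if (cmsg == NULL) return -1;
|       int fd;
|       memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
|       return fd;                    /* a new fd in this process */
|   }
|
|   int main(void) {
|       int sv[2];
|       if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1) return 1;
|
|       if (fork() == 0) {            /* child: receive and use it */
|           close(sv[0]);
|           int fd = recv_fd(sv[1]);
|           char buf[64];
|           ssize_t n = read(fd, buf, sizeof(buf) - 1);
|           if (n > 0) { buf[n] = '\0'; printf("child read: %s", buf); }
|           _exit(0);
|       }
|       close(sv[1]);
|       int fd = open("/etc/hostname", O_RDONLY);
|       send_fd(sv[0], fd);
|       wait(NULL);
|       return 0;
|   }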
|
| I feel like just because you can use signals for IPC doesn't mean
| you should.
| ajross wrote:
| So, nothing is wrong with sockets. Sockets (both unix domain
| and TCP) are the overwhelming choice for IPC mechanisms in new
| code, have been for the last few decades, and will be when we
| all retire.
|
| Nonetheless it's not uncommon to have a communication style
| where (1) messages are extremely simple, with either no
| metadata or just a single number associated with them, and (2)
| may be sent to any of a large number of receiving processes. And
| there, signals have a few advantages: you don't need to manage
| a separate file descriptor per receiver, you don't need to
| write a full protocol parser, you can test or trigger your API
| from the command line with e.g. /usr/bin/kill. They're good
| stuff.
|
| But do be aware of the inherent synchronization problems with
| traditional handlers. Signals interrupt the target's code and
| can't block to wait for it (they act like "interrupts" in that
| sense), so traditional synchronization strategies don't work
| and there are traps everywhere. If you're writing new signal
| code you really want to be using signalfd (which, yeah, re-
| introduces the "one extra file descriptor per receiver" issue).
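|
| A minimal sketch of the signalfd approach (Linux-specific): the
| signal becomes an ordinary readable fd that can sit in the same
| poll()/epoll loop as the sockets, instead of interrupting the
| program.
|
|   #include <poll.h>
|   #include <signal.h>
|   #include <stdio.h>
|   #include <sys/signalfd.h>
|   #include <unistd.h>
|
|   int main(void) {
|       sigset_t set;
|       sigemptyset(&set);
|       sigaddset(&set, SIGUSR1);
|       sigprocmask(SIG_BLOCK, &set, NULL);  /* no async delivery */
|
|       int sfd = signalfd(-1, &set, 0);
|       if (sfd == -1) { perror("signalfd"); return 1; }
|
|       printf("try: kill -USR1 %d\n", (int)getpid());
|
|       struct pollfd pfd = { .fd = sfd, .events = POLLIN };
|       poll(&pfd, 1, -1);                   /* wait like any fd */
|
|       struct signalfd_siginfo info;
|       read(sfd, &info, sizeof(info));
|       printf("got signal %u from pid %u\n",
|              (unsigned)info.ssi_signo, (unsigned)info.ssi_pid);
|
|       close(sfd);
|       return 0;
|   }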
| duped wrote:
| Using a signal handler for IPC isn't really "simple" though,
| since the handler needs to be async signal safe itself. You
| don't need a "full protocol parser" for sockets either. You
| can send/receive C structs just fine. It's also not hard to
| write to sockets from the command line.
|
| > But do be aware of the inherent synchronization problems
| with traditional handlers.
|
| This is why you should almost never use signal handlers for
| IPC, because they're full of footguns and not actually simple
| to use.
|
| Increasing the number of file descriptors doesn't seem like
| much of a burden. If your app actually pushes on those limits
| you need to be messing with ulimits anyway.
| ajross wrote:
| > You don't need a "full protocol parser" for sockets
| either. You can send/receive C structs just fine. It's also
| not hard to write to sockets from the command line.
|
| Yeah, that's how you get vulnerabilities. For the same
| reason that "Thou shalt not write async signal handlers
| without extreme care", thou must never send data down a
| socket without a clear specification for communication. It
| doesn't (and shouldn't) have to be a "complicated"
| protocol. But it absolutely must be a fully specified one.
|
| And signals, as mentioned, don't really have that
| shortcoming due to their extreme simplicity. They're just
| interrupts and don't pass data (except in the sense that
| they cause the target to re-inspect state, etc...). They're
| a tool in the box and you should know how they work.
|
| Mostly the point is just "IPC is hard", and choosing tools
| isn't really the hard part. Signals have their place.
| [deleted]
| goodyduru wrote:
| Author here. I agree with you on preferring sockets to signals
| for IPC. They're more straightforward and stable to use. This
| article is one of a series I'm writing on IPC. Writing about
| them ensures I understand them better. I wrote one on
| sockets[1].
|
| [1] https://goodyduru.github.io/os/2023/10/03/ipc-unix-domain-
| so...
___________________________________________________________________
(page generated 2023-10-09 23:02 UTC)