[HN Gopher] Fork() without exec() is dangerous in large programs...
       ___________________________________________________________________
        
       Fork() without exec() is dangerous in large programs (2016)
        
       Author : pcr910303
       Score  : 107 points
       Date   : 2022-06-14 14:12 UTC (8 hours ago)
        
 (HTM) web link (www.evanjones.ca)
 (TXT) w3m dump (www.evanjones.ca)
        
       | tuxoko wrote:
       | The suggestions here aren't really great. What you should do is
       | already written in the fork(2) manpage
       | https://man7.org/linux/man-pages/man2/fork.2.html
       | 
       | "After a fork() in a multithreaded program, the child can safely
       | call only async-signal-safe functions (see signal-safety(7))
       | until such time as it calls execve(2)."
       | 
       | So just use only async-signal-safe functions
       | https://man7.org/linux/man-pages/man7/signal-safety.7.html
       | 
       | I don't know why so many people still hit this issue when the
       | documentation already tells you what you can and cannot do. I've
       | done this sort of thing without any issue.
        
         | the_mitsuhiko wrote:
         | Because in practice it's almost impossible to know if you are
         | single threaded still. Plenty of system libraries have
         | background threads.
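
A minimal sketch of why this is hard to check, assuming Linux (where /proc/self/task has one entry per OS thread). The helper names are made up for illustration. `threading.active_count()` only sees threads the Python interpreter knows about, so threads started by native libraries are invisible to it.

```python
import os
import threading


def os_thread_count():
    """Return the number of OS-level threads in this process, or None.

    Assumption: Linux, where /proc/self/task lists one entry per thread.
    """
    try:
        return len(os.listdir("/proc/self/task"))
    except OSError:
        return None


def looks_single_threaded():
    """Best-effort check to run before calling fork() without exec()."""
    count = os_thread_count()
    if count is None:
        # No /proc available: fall back to the interpreter's own view,
        # which misses threads created by native libraries.
        count = threading.active_count()
    return count == 1
```

Even this is only a snapshot: a library can start a thread right after the check returns.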
        
       | krylon wrote:
       | The devil, as they say, is in the details.
       | 
       | But to be fair, the only times I can recall using fork() without
       | exec() were forking network servers, and that was mostly me
       | learning about doing network stuff, and a forking server was the
       | easiest to implement manually.
       | 
       | Oh yeah, and that one time I accidentally wrote a fork bomb
       | trying to stress test a DNS server. At least I learned something
       | from my mistake. ;-)
       | 
       | EDIT: To me, using fork() without exec() is kind of like operator
       | overloading - there are cases where it absolutely is the right
       | tool, but these aren't very numerous, so one should exercise
       | caution. A lot.
        
       | Strilanc wrote:
       | Another danger using fork is it duplicates the internal state of
       | pseudo random number generators. It's a great way to accidentally
       | take the same random samples in every process, utterly trashing
       | any statistics you were intending to do. Bonus: the python
       | multiprocessing module silently uses fork by default. Person A
       | writes a "make multiprocessing convenient" library, Person B
       | writes a sampling library, you put them together and...
       | _whoops!_.
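
A small sketch of the effect, assuming a Unix system with os.fork(). The function name is hypothetical. It uses a private `random.Random` instance because the generator's state is ordinary process memory and is copied into the child along with everything else.

```python
import os
import random


def sample_in_parent_and_child(seed=None):
    """Fork, draw one sample on each side, return (parent_value, child_value)."""
    rng = random.Random(seed)  # private generator, state lives in our memory
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: report the next sample to the parent, then exit immediately
        # without running the parent's cleanup handlers.
        try:
            os.close(read_fd)
            os.write(write_fd, repr(rng.random()).encode())
        finally:
            os._exit(0)
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as pipe:
        child_value = float(pipe.read().decode())
    os.waitpid(pid, 0)
    # The parent's copy of the state was not advanced by the child's draw.
    return rng.random(), child_value
```

Both values come out identical, whether or not a seed was given, because both processes continue from the same copied state.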
        
         | cryptonector wrote:
         | Libraries like that should use pthread_atfork() to
         | automatically reset/reseed/whatever state as needed at fork()
         | time.
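
For illustration, a sketch of the same idea in Python, where `os.register_at_fork()` plays the role of pthread_atfork(). `ForkAwareRng` and `fork_and_report` are hypothetical names, and reseeding from OS entropy is only one possible policy.

```python
import os
import random


class ForkAwareRng:
    """Generator that reseeds itself in the child after os.fork()."""

    def __init__(self):
        self._rng = random.Random()
        self.reseeds = 0
        # The hook runs in the child after each os.fork(). Registered hooks
        # cannot be removed again, so this suits long-lived objects only.
        os.register_at_fork(after_in_child=self._reseed)

    def _reseed(self):
        self._rng.seed()  # no argument: take fresh entropy from the OS
        self.reseeds += 1

    def random(self):
        return self._rng.random()


def fork_and_report(rng):
    """Fork once; return (parent_value, child_value, child_reseed_count)."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        try:
            os.close(read_fd)
            os.write(write_fd, f"{rng.random()!r} {rng.reseeds}".encode())
        finally:
            os._exit(0)
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as pipe:
        value, reseeds = pipe.read().decode().split()
    os.waitpid(pid, 0)
    return rng.random(), float(value), int(reseeds)
```

The hooks only fire for forks that go through os.fork(); a process created some other way never runs them.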
        
           | Strilanc wrote:
           | I don't think that's really a viable strategy in practice in
           | an ecosystem as complex as Python's. There are too many
           | libraries and too many little corner cases and interactions
           | around what the behavior should be.
           | 
           | For example, suppose I am using library A and I initialized
           | the random number generator with a fixed seed. Clearly when I
           | fork it's not appropriate for A to reseed, because I wanted
           | fixed behavior. Something is very wrong so probably there
           | should be an exception. But now suppose I was using library B
           | which was using A and B handles getting system entropy to
           | seed A. Now it _is_ clear that when I fork I probably want B
           | to reseed A, but alas A has already raised an exception
           | because it was given a (from its perspective) fixed seed. So
           | now A needs to be redesigned to be given a seed and like some
           | sort of intent on what should happen when forking, and oh my
           | god wow this is creating a lot of work for everyone
           | everywhere this is not actually going to be done consistently
           | and cannot be trusted.
        
             | cryptonector wrote:
             | If you're writing a simulation or a test, then you'll want
             | the PRNG to stay unchanged, and you'll want to be in
             | control of any reseeding.
             | 
             | For all other RNG uses, you really do want it to reseed.
             | 
             | A cryptographic PRNG vs. a simulation PRNG are very
             | different things, and should be different libraries.
        
           | agwa wrote:
           | pthread_atfork functions aren't called if the application
           | calls the clone syscall directly. The right solution is
           | MADV_WIPEONFORK on Linux, or MINHERIT_ZERO on OpenBSD:
           | 
           | https://www.metzdowd.com/pipermail/cryptography/2017-Novembe...
        
             | cryptonector wrote:
             | That helps with memory mappings, but it doesn't help with
             | file descriptors -- you still have to be careful with
             | those.
        
       | chubot wrote:
       | FWIW this is the same reason you can't implement a
       | portable Unix shell in portable Go. (And similar issues with an
       | init daemon)
       | 
       | Go only exports syscall.ForkExec() -- there is no exported Fork(),
       | because the things you can do between the calls could break Go's
       | threaded runtime. (Goroutines are multiplexed onto OS threads.)
       | 
       | Some elaboration on that:
       | https://lobste.rs/s/hj3np3/mvdan_sh_posix_shell_go#c_qszuer
       | 
       | That is, the space between fork and exec is where pipelines are
       | implemented, but also entire subinterpreters/subshells. The shell
       | actually uses copy-on-write usefully. (And yes I'm aware that
       | there's a good argument that the shell is almost the ONLY program
       | that needs fork() !)
       | 
       | ----
       | 
       | A lot of people have asked me why not implement Oil in Go and
       | various other languages, so I wrote this page:
       | 
       | https://github.com/oilshell/oil/wiki/FAQ:-Why-Not-Write-Oil-...
       | 
       | So the funny thing is that Python is a lower level language than
       | Go for this particular problem. It doesn't do anything weird with
       | regard to syscalls. I'm still looking for help on this (and
       | donations to pay people other than me):
       | 
       |  _Oil Is Being Implemented "Middle Out"_
       | https://www.oilshell.org/blog/2022/03/middle-out.html
        
         | throwaway892238 wrote:
         | I am always thoroughly amazed when people develop programming
         | languages that are intentionally difficult to use.
        
           | chubot wrote:
           | Please don't post superficial and petty comments. Instead
           | think more deeply about tradeoffs in designing for
           | concurrency
        
         | gizmo686 wrote:
         | > The shell actually uses copy-on-write usefully. (And yes I'm
         | aware that there's a good argument that the shell is almost the
         | ONLY program that needs fork() !)
         | 
         | It's been a while since I looked at it, but I believe Android
         | uses fork for its copy-on-write semantics to optimize app
         | startup. On boot it initializes a single instance of the app
         | runtime environment. Then when you launch apps that initial
         | process is forked. As a result you do not need to reinitialize
         | the runtime for every app launch.
        
           | epmos wrote:
           | This is moderately common for environments where you are
           | pushing a lot of startup work into the dynamic linker and
           | will be launching processes frequently. Loading shared
           | libraries for example.
           | 
           | You have a parent process which uses dlopen() to load all the
           | libraries you want to avoid re-linking. When you want to
           | spawn a child, you fork() and then, rather than exec(), you
           | dlopen() an object with your child's main() and call it. For
           | the case where you have enough libraries this is much faster
           | than an exec(), saving tens of seconds on every application
           | launch if you have a really bad case of C++.
           | 
           | There are some small surprises which become obvious with a little
           | thought. You are responsible for everything that normally
           | happens in your process before main() is called. ASLR is only
           | done once per session. People rarely think to fix-up argv[]
           | for ps and friends in the first version.
        
           | chubot wrote:
           | Yes I think the argument is that Android (and Chrome) could
           | use something like vfork or posix_spawn().
           | 
           | I'm not sure which, if any; I'd like to see an analysis of
           | that... The issue is what kernel state is preserved/shared
           | across the process creation call.
           | 
           | Every process sort of has a "mirror" in kernel memory. The
           | user memory is CoW, and I suppose you also have to choose
           | whether to copy or reference every kernel data structure as
           | well --- open files in FD tables which point to
           | disk/pipes/sockets, locks which seem to be nonsensical, etc.
           | 
           | But probably you can get the "warmup" property without the
           | full semantics of fork(). That is the CoW of user memory is a
           | somewhat separate choice from the kernel data structures.
           | 
           | ----
           | 
           | As far as the shell .... In the recent linked thread, Ninja
           | uses posix_spawn because it has a simple use of subprocesses:
           | https://news.ycombinator.com/item?id=30503382
           | 
           | So there are definitely cases where a shell uses fork/exec
           | like Ninja, so you could imagine optimizing it. But the
           | subshell/subinterpreter case is probably the most general --
           | the language semantics depend on it. And it's actually
           | useful, e.g. this "alternative shell challenge":
           | 
           | https://www.oilshell.org/blog/2020/02/good-parts-sketch.html...
           | 
           | and I used that pattern literally yesterday and this morning,
           | etc.
           | 
           | ----
           | 
           | edit: looks like fish already addresses this, i.e. where you
           | can use posix_spawn in a shell, and where you can't!
           | https://news.ycombinator.com/item?id=31743230
        
         | jepler wrote:
         | I think this turns out to be a tangent, but at least
         | superficially it is possible for a C program to "do" shell
         | pipelines without use of fork or vfork (directly) but rather by
         | posix_spawn. I suppose "portable go" does not directly wrap
         | posix_spawn so this option may not be on the table for you.
         | 
         | Basics:
         | https://gist.github.com/ec8469273c7808d46c7285cd056d0104
         | 
         | Typical use: `./a.out seq 3 2 9 -- cat -n` is similar to `seq 3
         | 2 9 | cat -n` except that the return value is nonzero if either
         | side's return value is nonzero.
         | 
         | that said, I wouldn't be surprised if there's something
         | important I'm overlooking here.
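
A rough sketch of the same idea through Python's `os.posix_spawnp()` (this is not the code from the gist). `spawn_pipeline` is a hypothetical helper that also captures the right-hand side's output so the result can be inspected.

```python
import os


def _exit_code(status):
    """Translate a waitpid() status into a shell-style exit code."""
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return 128 + os.WTERMSIG(status)


def spawn_pipeline(left_argv, right_argv):
    """Run `left | right` without calling fork() directly.

    Returns (code, output): code is nonzero if either side failed, and
    output is whatever the right-hand command wrote to stdout.
    """
    link_r, link_w = os.pipe()  # left stdout -> right stdin
    out_r, out_w = os.pipe()    # right stdout -> this process
    # os.pipe() descriptors are close-on-exec, so each child keeps only
    # the copies that the dup2 file actions create for it.
    left_pid = os.posix_spawnp(
        left_argv[0], left_argv, os.environ,
        file_actions=[(os.POSIX_SPAWN_DUP2, link_w, 1)],
    )
    right_pid = os.posix_spawnp(
        right_argv[0], right_argv, os.environ,
        file_actions=[
            (os.POSIX_SPAWN_DUP2, link_r, 0),
            (os.POSIX_SPAWN_DUP2, out_w, 1),
        ],
    )
    # Close our copies so each reader sees EOF once its writer exits.
    for fd in (link_r, link_w, out_w):
        os.close(fd)
    with os.fdopen(out_r, "rb") as pipe:
        output = pipe.read()
    _, left_status = os.waitpid(left_pid, 0)
    _, right_status = os.waitpid(right_pid, 0)
    return (_exit_code(left_status) or _exit_code(right_status)), output
```

Calling `spawn_pipeline(["seq", "3", "2", "9"], ["cat", "-n"])` mirrors the example above. Error handling (for instance a failed second spawn) is left out to keep the sketch short.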
        
           | chubot wrote:
           | Hm very interesting. I haven't used or seen posix_spawn()
           | used before, but I saw in a recent thread that Ninja uses it,
           | so that's a positive sign.
           | 
           | I saved this here!
           | https://github.com/oilshell/oil/issues/1161
        
       | garethrowlands wrote:
       | "The received wisdom suggests that Unix's unusual combination of
       | fork() and exec() for process creation was an inspired design. In
       | this paper, we argue that fork was a clever hack for machines and
       | programs of the 1970s that has long outlived its usefulness and
       | is now a liability. We catalog the ways in which fork is a
       | terrible abstraction for the modern programmer to use, describe
       | how it compromises OS implementations, and propose alternatives."
       | 
       | from _A Fork in the Road_,
       | <https://www.microsoft.com/en-us/research/uploads/prod/2019/0...>
        
         | cryptonector wrote:
         | Discussed in:
         | 
         | https://hn.algolia.com/?q=https%3A%2F%2Fwww.microsoft.com%2F...
         | 
         | and also here:
         | 
         | https://news.ycombinator.com/item?id=30502392
        
         | indrora wrote:
         | To present some contrast:
         | 
         | Windows doesn't have fork(). It has a real, fully mature thread
         | and process model. In Windows NT, every process consists of a
         | handle that is a "Process", which in turn points to a structure
         | containing a list of "Threads". A process is done when its main
         | thread exits or all threads exit, whichever is defined by the
         | main process. Fork/Exec is replaced with CreateProcess (or
         | ShellExecute, your choice).
         | 
         | For a very zen-like example of the fork/exec and pipe
         | management that you'd do on a POSIX system done in Windows, the
         | [MSDN Docs](https://docs.microsoft.com/en-us/windows/win32/procthread/cr...)
         | are quite informative.
        
         | wongarsu wrote:
         | fork() and exec() is a reflection of the unix philosophy: small
         | tools for separate tasks, and as a result you have great
         | composability. You can set up rights etc. for your new process by
         | calling any number of system calls between fork() and exec().
         | 
         | Windows is a great example of the alternative (following
         | "Windows philosophy"): There are about three different API
         | calls for creating a new process, each with a heap of
         | complicated optional arguments. The API becomes more complex,
         | less composable, less extensible and less powerful. But also
         | easier to reason about and easier for the kernel to provide,
         | and arguably with fewer footguns.
        
           | phendrenad2 wrote:
           | Has Linux gained syscalls equivalent to those Windows API
           | calls yet? Or is linux too different from the windows kernel
           | to make that happen? (In that case, what does WINE do?)
        
             | lanstin wrote:
             | On Linux, the libc fork function(s) is (are) a wrapper around the
             | clone system call, which is more flexible (they added some stuff
             | to make Wine work better):
             | 
             | https://lwn.net/Articles/826313/
             | 
             | Clone is also used to start threads, IIRC:
             | 
             | https://stackoverflow.com/questions/4856255/the-difference-b...
        
           | fpoling wrote:
           | Windows approach is not the only alternative. Simply provide
           | API to create a process in a suspended state, then adjust its
           | properties based on pid/handle and then start the process
           | execution.
        
             | ncmncm wrote:
             | A huge number of tricky thread problems go away if the
             | child thread is blocked at startup, and allowed to run only
             | after the parent allows it. To retrofit this, it is easiest
             | to lock a mutex before spawning and have the child block on
             | that. Then the parent unlocks it to let the child run.
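
The retrofit described here might look like the following, using Python threads for brevity. `start_gated` is a hypothetical helper; the `setup` callback stands in for whatever the parent needs to adjust before the child is allowed to run.

```python
import threading


def start_gated(target, setup):
    """Start `target` in a new thread that stays blocked until `setup` is done."""
    gate = threading.Lock()
    gate.acquire()  # taken before spawning, so the child must wait

    def runner():
        with gate:  # blocks here until the parent releases the gate
            pass
        target()

    thread = threading.Thread(target=runner)
    thread.start()
    try:
        setup(thread)  # parent-side preparation while the child is parked
    finally:
        gate.release()  # now the child may run
    return thread
```

The lock is acquired and released by the same (parent) thread, so the same shape carries over to mutex types that insist on that.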
        
               | lanstin wrote:
               | I think shared libraries can spawn threads on their
               | load/init phase that you don't know about. Then you are
               | hosed but you only know about it due to sporadic weird
               | problems, that if you restart on failure, e.g. a pre-
               | forked worker scenario, you might never even really care
               | about.
        
           | klodolph wrote:
           | fork() is not a small tool.
           | 
           | It is a MASSIVE, unwieldy tool that is difficult to use
           | correctly. It happens to have a small interface.
        
             | adastra22 wrote:
             | ...is it?
             | 
             | In its original implementation, fork() was pretty
             | trivial. All it did was create a new process entry in the
             | kernel table, with all the pages and capabilities and such
             | copied from the original process. Then mark all pages as
             | copy-on-write, and return to the caller. Maybe not trivial,
             | but much less complicated than loading an executable file
             | from disk.
             | 
             | My understanding of Linux internals is maybe 20 years out
             | of date, so I am legitimately curious what makes fork() so
             | complicated these days.
        
               | cryptonector wrote:
               | fork() is not trivial now. Processes are huge now -- they
               | have huge heaps among other things. Copying all that is
               | expensive. In the 80s we tried COW, but that turns out to
               | be very slow as well. What operating systems do now is
               | immediately copy the resident set, then do COW for the
               | rest of writable memory, but in large, multi-threaded
               | processes, this is still too slow.
               | 
               | Use vfork() or posix_spawn().
        
               | duskwuff wrote:
               | Using fork() also means you end up with shared ownership
               | of resources like file descriptors, which can have some
               | pretty weird consequences.
        
               | bregma wrote:
               | Or more importantly, IPC mechanisms like mutexes. If
               | they're in shared memory, you now have two problems. The
               | runtime of a very _very_ popular scripting language does
               | this.
        
               | cryptonector wrote:
               | This is true with all process creation APIs.
               | 
               | Windows defaults to CLOEXEC semantics and you have to
               | opt-in to child process inheriting open file handles, and
               | that has caused problems.
               | 
               | Unix defaults to not-CLOEXEC semantics, and that too has
               | caused problems.
        
               | temac wrote:
               | The Windows default can cause problems because of simple
               | logic bugs.
               | 
               | The Unix default can cause unsolvable problems because of
               | races between threads.
               | 
               | You should use CLOEXEC everywhere. Except you can't
               | because you are using libraries.
        
               | cryptonector wrote:
               | True. The main problem with the Unix default is that
               | there wasn't a way to set O_CLOEXEC on all new FDs race-
               | free until recently. That's a real problem. FD leaks to
               | children can be bad, but most of the time they are not
               | the end of the world, and often one can steal a
               | closefrom() implementation from a BSD or Illumos as a
               | workaround when you know exactly what you want to allow
               | the child to inherit.
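
To make the race concrete, a short sketch with hypothetical helper names, assuming a Unix system. Passing O_CLOEXEC to open() sets the flag as part of the call itself, so there is no window in which another thread could fork and exec while the descriptor is still inheritable; setting it later with fcntl() leaves such a window. (Python has created its own descriptors non-inheritable by default since 3.4, so the flag here is spelled out only for clarity.)

```python
import fcntl
import os


def is_cloexec(fd):
    """Return True if `fd` will be closed automatically on exec()."""
    return bool(fcntl.fcntl(fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC)


def open_cloexec(path, flags=os.O_RDONLY):
    """Open `path` with close-on-exec set atomically by the kernel."""
    return os.open(path, flags | os.O_CLOEXEC)


def allow_inheritance(fd):
    """Explicit opt-in for the one descriptor a child is meant to inherit."""
    os.set_inheritable(fd, True)
```

This only covers descriptors your own code opens. Descriptors opened inside libraries are a separate problem.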
        
               | asveikau wrote:
               | closefrom() comes in handy for this. It's missing on some
               | platforms (notably glibc and mac iirc) but actually not
               | too hard to implement a work-alike.
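
One possible work-alike, sketched in Python. It assumes the open descriptors can be listed via /proc/self/fd or /dev/fd and otherwise falls back to a brute-force loop; the optional `keep` set is an addition for descriptors that must survive.

```python
import os


def _open_fds():
    """List this process's open descriptors, or None if they can't be listed."""
    for directory in ("/proc/self/fd", "/dev/fd"):
        try:
            return [int(name) for name in os.listdir(directory)]
        except OSError:
            continue
    return None


def closefrom(lowfd, keep=()):
    """Close every descriptor >= lowfd except those listed in `keep`."""
    keep = set(keep)
    fds = _open_fds()
    if fds is None:
        # Brute force: try every possible descriptor number. This can be
        # slow when the descriptor limit is very large.
        fds = range(lowfd, os.sysconf("SC_OPEN_MAX"))
    for fd in fds:
        if fd >= lowfd and fd not in keep:
            try:
                os.close(fd)
            except OSError:
                pass  # not open (e.g. the descriptor used for the listing)
```

In a multithreaded process this still races with other threads opening descriptors, so it belongs in the child, right before exec().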
        
               | cryptonector wrote:
               | I've copied a closefrom() many a time.
               | 
               | A common hack I've had to add is an argument FD that is
               | not to be closed because, e.g., an flock is held on it.
        
               | spc476 wrote:
               | Hmmm, from
               | <https://www.man7.org/linux/man-pages/man3/posix_spawn.3.html>:
               | 
               | > The posix_spawn() and posix_spawnp() functions provide the
               | > functionality of a combined fork(2) and exec(3), with some
               | > optional housekeeping steps in the child process before the
               | > exec(3). These functions are not meant to replace the fork(2)
               | > and execve(2) system calls. In fact, they provide only a
               | > subset of the functionality that can be achieved by using
               | > the system calls.
               | 
               | Also, there's no way to set resource limits in the child
               | process, nor switch user or group ID, using
               | posix_spawn().
        
               | adastra22 wrote:
               | Hrm. Googling "fork linux copy-on-write" seems to find a
               | lot of stack overflow answers from 2014-2015 claiming
               | Linux marks pages as copy-on-write when fork() is called.
               | I didn't see anything more recent in the first page of
               | results.
               | 
               | I could see it being worthwhile to immediately copy a few
               | pages, like the top of the stack, but copying the whole
               | resident set seems excessive. Especially since some of
               | that data might not even be written to.
        
               | CHY872 wrote:
               | So the problem is what happens to the old and new
               | processes after the fork. To CoW, you need to mark all
               | the pages read only in _both_ old and new which means
               | that every memory write in the caller will now pagefault
               | since the OS now has to lazily copy on both sides. So
               | with true copy on write the fixed costs may be low but
               | the marginal cost per memory write may be high in both
               | parent and child. In this case you can see why the
               | resident set is copied, yes? It's the smallest amount of
               | memory that guarantees predictable performance subsequent
               | to the call returning.
        
               | cryptonector wrote:
               | If the parent is threaded and the host has more than one
               | CPU, then fork() == TLB shootdowns, which are slow.
               | 
               | As well there's the cost of all those page faults that
               | the two processes are likely to take to do the copying.
               | 
               | And lastly there's all sorts of complexity involving
               | multiple parent threads calling fork(), or the child
               | calling fork() again (or vfork()) before calling exec.
               | 
               | It's just much easier to copy the resident set and mark
               | the address space as being CoW, because now you only have
               | to worry about page faults for pages that are not in core
               | anyways and so were going to fault anyways, and that
               | means you don't have to worry about TLB shootdowns either
               | (if a page is not in core, it's not referenced by any TLB
               | either). You still have the multi-fork issues, but now
               | you can use an atomic reference count on the address
               | space.
        
               | moomin wrote:
               | It was simple when processes were simple, but
               | requirements got serious.
        
               | turminal wrote:
               | > Then mark all pages as copy-on-write, and return to the
               | caller.
               | 
               | Unix actually copied the memory over initially.
        
             | [deleted]
        
           | sitkack wrote:
           | Fork is not a reflection of unix philosophy! It _is_ a
           | beautiful hack tho.
        
       | [deleted]
        
       | layer8 wrote:
       | New programming language implementations should maybe make fork()
       | and multithreading be mutually exclusive at link time by default,
       | and only allow them together in an unsafe-I-know-what-I'm-doing
       | mode (if at all).
        
       | infogulch wrote:
       | The dense fog lifts, tree branches part, a ray of light shines
       | down on the ruins of a moss-covered pedestal revealing the hidden
       | intentions of the ancients. A plaque states "The operational
       | semantics of the most basic primitives of your operating system
       | are optimized to simplify the implementation of command line
       | shells." You look upon the pedestal, pause in respect, then turn
       | away disappointed but unsurprised. As you walk you shake your
       | head trying to evict the after image of a beam of light
       | illuminating a turd.
        
         | tom_ wrote:
         | They called it RISC iX because it was RISC.
         | 
         | They called it Minix because it was mini.
         | 
         | And they called it POSIX because...
        
         | infogulch wrote:
         | I wrote this silly little bit on a previous fork() thread and
         | touched it up a little.
         | https://news.ycombinator.com/item?id=30504470
        
           | cryptonector wrote:
           | It was a glorious comment then, and it is now.
           | 
           | BTW, it's unclear whether the turd is... the specific truth
           | revealed, or the revelation itself (since it could be
           | incorrect). It's still a glorious comment.
        
             | infogulch wrote:
             | Thank you! The goal was to evoke the same emotional
             | response I felt when this idea struck me (I think when I
             | was reading earlier comments in that thread in fact), so I
             | guess there's as much in the sequence as the specific
             | objects. The semantic design of fork was always
             | incomprehensible to me (not what but _why_ ), so the
             | setting and pedestal and plaque represent how strongly it
             | struck me and the depth of its historical explanatory
             | power. The turd is how I feel about it. I don't resent the
             | designer's history or motivations or incentives and I'm
             | happy to know the truth. I'm just sad/depressed that this
             | is the reason why things are the way they are, that the
             | original design intent is so misaligned with the needs of
             | modern systems, how it limits our capabilities today even
             | down to hardware, and how unlikely it is for this to
             | change. I don't suppose a turd would last very long on a
             | pedestal, or that the ancients would put a turd there on
             | purpose. Maybe they put something beautiful there back
             | then, but times have changed and now we're stuck with it
             | and it's kind of shit.
        
               | cryptonector wrote:
               | Don't get too sad about legacy. A lot of things were
               | brilliant once that aren't now. I do still feel that
               | fork+exec _was_ brilliant _then_ , just not _now_.
               | Deprecation is hard, but we can celebrate what legacy
               | made possible.
               | 
               | Now, if only we had a time machine...
        
               | infogulch wrote:
               | Agreed. I'm not sure where to go from here, but I think I
               | smell fresh air through the WASM/WASI doorway.
               | 
               | "If in doubt, Meriadoc, always follow your nose." -
               | Gandalf
        
       | perryizgr8 wrote:
       | It's only dangerous if you use libraries without fully
       | understanding what they're doing. And most well designed
       | libraries will avoid creating threads, and will do so only when
       | you make it explicit that you want it to happen.
       | 
       | I also find that libraries that absolutely need to make their own
       | threads are better off being their own process. Then you can use
       | proper communication methods to pass data.
        
         | aidenn0 wrote:
         | Per TFA, a large fraction of OS X system libraries use threads,
         | so if you're developing for the Macintosh, fork() is already
         | out.
        
       | zokier wrote:
       | See also this recent 340 (!) comment thread about the issues of
       | fork https://news.ycombinator.com/item?id=30502392
        
         | phendrenad2 wrote:
         | Top comment there is gold (and makes a good point).
        
       | medoc wrote:
       | fork() also presents performance issues for programs with a large
       | virtual space. Here vfork() helps, but it has even more pitfalls
       | than fork(). I had written a small doc about converting the
       | recollindex Recoll indexer from fork() to vfork() a while ago:
       | https://www.lesbonscomptes.com/recoll/pages/idxthreads/forki...
        
       | Skunkleton wrote:
       | Odd to not see any mention of vfork. vfork solves the problems
       | with fork/exec for large programs.
        
         | tsimionescu wrote:
         | vfork() does the opposite of solving these problems. While
         | there are a few functions that you can call after fork(), there
         | is absolutely no function you can call after vfork() before
         | exec(). You can't even write most local variables.
         | 
         | vfork() solves the problem of not wasting so much time on
         | fork() when you're just going to call exec() afterwards (fork()
         | does A LOT of work - potentially, anyway).
        
           | cryptonector wrote:
           | > vfork() does the opposite of solving these problems. While
           | there are a few functions that you can call after fork(),
           | there is absolutely no function you can call after vfork()
           | before exec(). You can't even write most local variables.
           | 
           | Mostly wrong.
           | 
           | You can call functions on the child side of vfork(), but you
           | don't want to exec() in them -- you want to exec() in the
           | same function that called vfork().
           | 
           | And you can write to local variables, but you have to be
           | careful about it.
           | 
           | There's a ton of vfork()-using code that does these things.
           | 
           | Now, it's true that a compiler optimizer that knows nothing
           | about vfork() but knows about _exit()'s semantics, could
           | delete code it thinks is unreachable. So there is some issue,
           | but you can just disable the optimizer if you run into this.
        
             | tsimionescu wrote:
             | That's all undefined behavior, under POSIX at least [0]:
             | 
             | > The vfork() function has the same effect as fork(2),
             | except that the behavior is undefined if the process
             | created by vfork() either modifies any data other than a
             | variable of type pid_t used to store the return value from
             | vfork(), or returns from the function in which vfork() was
             | called, or calls any other function before successfully
             | calling _exit(2) or one of the exec(3) family of functions.
             | 
             | So sure - you can do these things, but they have very
             | little defined semantics after vfork().
             | 
             | It is true that Linux describes the semantics more clearly,
             | so perhaps on Linux it is safer to use.
             | 
             | [0] https://man7.org/linux/man-pages/man2/vfork.2.html
        
               | cryptonector wrote:
               | If you can call _exit() and exec, then you can call other
               | functions too. I do believe that the Open Group has
               | changed the description of vfork() to discourage its use
               | because of an old and incorrect paper from the 80s.
               | Actual implementations of vfork() are not as dangerous as
               | the Open Group text purports them to be.
               | 
               | Moreover, most posix_spawn() implementations use vfork(),
               | and they call more functions than _exit() and exec on the
               | child side.
               | 
               | Let's be reasonable about these things.
        
               | int_19h wrote:
               | An implementation of posix_spawn() is usually owned by
               | the same people who implemented vfork(), so _they_ know
               | what is and isn't safe to call _in that particular
               | implementation_. But we don't, and we shouldn't assume
               | that the implementation will not change. That's exactly
               | why public APIs and stability guarantees exist.
        
               | cryptonector wrote:
               | Not on Linux!
        
         | remexre wrote:
         | Isn't vfork much worse in terms of the problem the author is
         | talking about, since the child can now acquire locks in the
         | _parent's_ address space?
        
           | stefan_ wrote:
           | I thought the point of vfork is that they do not share an
           | address space. But there are other things still shared and
           | they should really just have a CreateProcess.
        
             | monocasa wrote:
             | They still share an address space until exec replaces it
             | for one of them. Particularly awful is that they share the
             | same mutable stack which is a pathway that only leads to
             | the inner circle of hell.
        
               | stefan_ wrote:
               | Assuming you call exec, of course. To not call exec after
               | vfork is not an option; one of the many ways the fork
               | family of functions are fundamentally broken.
        
               | monocasa wrote:
               | Well, without undefined behavior you can also call
               | _exit(), continue within the same function, and receive
               | conforming signals. Unfortunately this isn't always
               | spelled out and there's code out there that definitely
               | does other work invoking undefined behavior.
        
             | remexre wrote:
             | no, fork creates a new address space, vfork doesn't
             | 
             | the posix_spawn mentioned in the article is effectively the
             | equivalent of CreateProcess
        
               | medoc wrote:
               | Last time I looked, posix_spawn() just called fork/exec
        
               | int_19h wrote:
               | That's an implementation detail at this point. The idea
               | is to have a single syscall that takes all the
               | information needed to spawn the process, and does so
               | atomically, without the need to spread it across several
               | calls. On Win32, that's CreateProcess(). On POSIX, the
               | equivalent is posix_spawn().
        
       | dang wrote:
       | Discussed at the time:
       | 
       |  _Fork() without exec() is dangerous in large programs_ -
       | https://news.ycombinator.com/item?id=12302539 - Aug 2016 (101
       | comments)
        
       | throwaway892238 wrote:
       | > When I ran into this problem, I was just trying to run all of
       | Bluecore's unit tests on my Mac laptop. We use nose's
       | multiprocess mode, which uses Python's multiprocessing module to
       | utilize multiple CPUs. Unfortunately, the tests hung, even though
       | they passed on our Linux test server.
       | 
       | There will never be a time at which you can reliably expect any
       | program developed on one system to "just work" on a different
       | system. This person wasted a lot of time tracking down what was
       | essentially a portability bug. Did they _need_ this to be
       | portable? Was this time well spent generating business value?
       | 
       | Pick one system for development through production, stick to it.
       | There will be portability bugs hiding in your code, but you will
       | never have to fix them. You will be upset for a minute that you
       | can't use a different system, but you will get over it.
        
         | csours wrote:
         | I can write code that works NOW, but there is no guarantee that
         | code will work in the future.
        
       | billpg wrote:
       | I asked this a year or so ago. Interesting to read this article
       | in light of that discussion.
       | 
       | https://news.ycombinator.com/item?id=863871 (13 years? Yikes!)
        
         | cryptonector wrote:
         | fork() and the exec system calls exist because they were easy
         | to implement in the 70s on PDPs, and fork() was cheap enough
         | then, but much more importantly, it got the shell developers
         | out of having to write and evolve a more complex API in the
         | kernel. With fork/exec a shell developer could try lots of
         | variations for executing a pipe command w/o having to develop
         | any more kernel code.
         | 
         | For example, until BSD came along, not much had to change in
         | kernel land for any shell. Job control meant that the shell
         | would need to put all the processes for a job in the same pgrp,
         | and also there was a need to add `setsid()`.
        
       | legalcorrection wrote:
       | Won't stop the *nix know-nothings from criticizing Windows for
       | not natively supporting fork().
        
         | krylon wrote:
         | Funny side note, Perl fakes fork() on Windows using (I believe)
         | threads. I am not sure if that is better or worse than Windows
         | having fork() natively, though.
         | 
         | Some people would probably argue that if you use Perl, you have
         | much bigger problems to worry about, but that's another debate.
        
       | wruza wrote:
       | Curious why there isn't an interface in which all required
       | handles and resources could be passed to a child process
       | explicitly. E.g.:
       | 
       |     execvpehm(
       |       ...,
       |       int *handles, size_t,
       |       void **pages, size_t,
       |       /* etc */
       |     );
       | 
       | Would remove so many headaches with concurrency and accidental
       | inheritance.
        
       | ttoinou wrote:
       | > Only use fork in toy programs. The challenge is that successful
       | toy programs grow into large ones, and large programs eventually
       | use threads. It might be best just to not bother.
       | 
       | How do you create a new process and pipe it data in a fast
       | fashion without using fork, exec or posix_spawn ?
        
         | cryptonector wrote:
         | posix_spawn or vfork+exec
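
A minimal sketch of the posix_spawn route, written in Python because its `os` module wraps the call directly. The helper name `spawn_and_pipe` and the upper-casing child command are made up for illustration; the only real APIs used are `os.pipe`, `os.posix_spawn` (Python 3.8+) and its `file_actions` tuples.

```python
import os
import sys


def spawn_and_pipe(data: bytes) -> bytes:
    """Feed `data` to a child's stdin and return what it writes to stdout.

    Hypothetical helper: the child is just a Python one-liner that
    upper-cases its input. No fork() is called from Python code here.
    """
    in_r, in_w = os.pipe()    # parent writes in_w, child reads in_r as stdin
    out_r, out_w = os.pipe()  # child writes out_w as stdout, parent reads out_r

    child_code = "import sys; sys.stdout.write(sys.stdin.read().upper())"
    argv = [sys.executable, "-c", child_code]

    # Descriptors from os.pipe() are close-on-exec in Python, so the child
    # only ends up with the two descriptors explicitly dup2'ed below.
    file_actions = [
        (os.POSIX_SPAWN_DUP2, in_r, 0),
        (os.POSIX_SPAWN_DUP2, out_w, 1),
    ]
    pid = os.posix_spawn(sys.executable, argv, os.environ,
                         file_actions=file_actions)

    # The parent must close the child's ends, or read() never sees EOF.
    os.close(in_r)
    os.close(out_w)

    # Fine for small inputs; larger ones need interleaved reads and writes
    # to avoid filling the pipe buffers.
    with os.fdopen(in_w, "wb") as to_child:
        to_child.write(data)
    with os.fdopen(out_r, "rb") as from_child:
        result = from_child.read()

    os.waitpid(pid, 0)
    return result


if __name__ == "__main__":
    print(spawn_and_pipe(b"hello from the parent\n"))
```

Everything the child needs is passed in one call, which is the property the thread compares to CreateProcess().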
        
         | masklinn wrote:
         | > without using fork, exec or posix_spawn ?
         | 
         | Did you manage to forget what you'd read two paragraphs earlier
         | when you reached this bit? Because the essay's first
         | recommendation is literally:
         | 
         | > Only use fork _to immediately call exec (or just use
         | posix_spawn)_.
         | 
         | It seems difficult to infer "don't use exec or posix_spawn"
         | from this.
        
         | ehvatum wrote:
         | Use a shared memory region. For example:
         | https://github.com/erikhvatum/py_interprocess_shared_memory_...
        
       | elankart wrote:
       | Let's also not forget to call out how APIs have to be "fork"
       | aware. I'm surprised fork is still widely in use given all of
       | these downsides.
        
         | tyingq wrote:
         | I imagine it comes up fairly often for languages that don't do
         | well with threads. No shortage of those.
        
       | jordemort wrote:
       | If you mix fork with threads, you're going to have a [undefined
       | behavior] time. It seems like if you link with the sqlite that
       | comes with macOS, you're using threads whether you like it or
       | not. I think ending up at "you shouldn't use fork() at all" is a
       | bit of an extreme conclusion, though.
       | 
       | BTW, article title needs a (2016). It appears that the relevant
       | Python bug has long since been closed, by avoiding linking with
       | the system sqlite on macOS.
        
         | klodolph wrote:
         | fork + threads is not undefined behavior. It is safe as long as
         | you only do "async-signal safe" functions in the child. The
         | child will be single-threaded.
         | 
         | The Linux man pages have a list of safe operations after fork
         | here:
         | https://man7.org/linux/man-pages/man7/signal-safety.7.html
         | 
         | Note that this includes most of your standard syscalls, like
         | (importantly) write(), read(), close(), chdir(), as well as
         | certain "obviously safe" library functions like strlen(),
         | memcpy(), etc.
         | 
         | Non-multithreaded programs can fork() how they like and do
         | whatever they want after (mostly).
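
To make the hazard concrete, here is a small Python sketch (the function name `child_sees_lock_held` is invented for this example). A background thread holds a lock at the moment of fork(); the child gets a copy of the locked lock but not the thread that would release it, so a blocking acquire in the child would never return. The sketch uses a non-blocking acquire so it reports the problem instead of hanging.

```python
import os
import threading


def child_sees_lock_held() -> bool:
    """Fork while another thread holds a lock; report what the child sees."""
    lock = threading.Lock()
    holding = threading.Event()
    release = threading.Event()

    def worker():
        with lock:
            holding.set()    # tell the main thread the lock is now held
            release.wait()   # keep holding it until told to let go

    thread = threading.Thread(target=worker)
    thread.start()
    holding.wait()           # fork only once the lock is definitely held

    pid = os.fork()
    if pid == 0:
        # Child: only the forking thread exists here, so nobody can ever
        # release the copied lock. A blocking acquire() would deadlock.
        got_it = lock.acquire(blocking=False)
        os._exit(0 if got_it else 42)

    # Parent: let the worker finish, then collect the child's verdict.
    release.set()
    thread.join()
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) == 42


if __name__ == "__main__":
    print("child found the lock held:", child_sees_lock_held())
```

The same thing happens with locks inside malloc or a library, which is why the child is limited to async-signal-safe calls. Recent Python versions may also print a DeprecationWarning about forking a multi-threaded process when this runs.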
        
           | cryptonector wrote:
           | > fork + threads is not undefined behavior. It is safe as
           | long as you only do "async-signal safe" functions in the
           | child. The child will be single-threaded.
           | 
           | Yes, but the async-signal-safe restriction is pretty severe,
           | so you have to know what you're doing. Yes, that's also true
           | of vfork(), but at least vfork() will be much faster.
           | 
           | > Non-multithreaded programs can fork() how they like and do
           | whatever they want after (mostly).
           | 
           | Only as long as they haven't used libraries that are not
           | fork-safe prior to calling fork(). And you still need to do
           | things like fflush() stdio handles prior to fork()ing.
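
The flush-before-fork point can be sketched in Python, with a buffered file object standing in for a C stdio handle (an analogy, not the same code path). The helper `write_then_fork` is hypothetical: it leaves data in a userspace buffer, forks, and lets both processes flush.

```python
import os
import tempfile


def write_then_fork(flush_first: bool) -> bytes:
    """Return the file contents after parent and child both flush."""
    fd, path = tempfile.mkstemp()
    f = os.fdopen(fd, "wb")          # buffered writer, like a FILE*
    try:
        f.write(b"header\n")         # sits in the userspace buffer for now
        if flush_first:
            f.flush()                # the fix: empty the buffer before fork

        pid = os.fork()
        if pid == 0:
            # Child: flush whatever buffer contents were copied by fork,
            # as a normal exit() would do for stdio streams.
            try:
                f.flush()
            finally:
                os._exit(0)

        os.waitpid(pid, 0)
        f.close()                    # parent flushes its own copy here
        with open(path, "rb") as check:
            return check.read()
    finally:
        f.close()                    # no-op if already closed
        os.unlink(path)


if __name__ == "__main__":
    print(write_then_fork(flush_first=False))
    print(write_then_fork(flush_first=True))
```

Without the flush, the pending bytes exist in both processes after fork and are written twice; with it, they are written once.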
        
             | klodolph wrote:
             | Yes, I've read the man page, but thank you for repeating
             | the info here.
        
         | masklinn wrote:
         | > I think ending up at "you shouldn't use fork() at all" is a
         | bit of an extreme conclusion, though.
         | 
         | Is it? There are more descriptive (as opposed to procedural)
         | APIs which behave in a safer and more well-defined manner to do
         | it these days. Unless you're implementing a shell, fork has
         | never been a great tool.
         | 
         | As one commenter noted 3 months back:
         | 
         | > The dense fog lifts, tree branches part, a ray of light beams
         | down on a pedestal revealing the hidden intentions of the
         | ancients. A plaque states "The operational semantics of the
         | most basic primitives of your operating system are designed to
         | simplify the implementation of shells." You hesitantly lift
         | your eyes to the item presented upon the pedestal, take a pause
         | in respect, then turn away slumped and disappointed but not
         | entirely surprised. As you walk you shake your head trying to
         | evict the after image of a beam of light illuminating a turd.
         | 
         | https://news.ycombinator.com/item?id=30504470
        
           | aidenn0 wrote:
           | IMO, fork is strictly better than threads as a tool for
           | having operations perform off of the main thread; they get
           | all the state they need at the beginning and they can use IPC
           | to return the result.
           | 
           | TFA does allow for off-process operations, but all of the
           | inputs to the operation would need to be passed explicitly.
           | In this sense, I suppose TFA isn't arguing against
           | multiprocessing per-se, but against the specific type that
           | implicitly includes all of the current process state (which
           | has both up- and down-sides).
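
A rough Python sketch of the pattern described here, assuming the process is still single-threaded when it forks: each child inherits the input implicitly through fork and sends its result back over a pipe. The name `forked_sum` and the fixed 8-byte result encoding are choices made for this example only.

```python
import os
import struct


def forked_sum(numbers, workers=2):
    """Sum `numbers` by forking `workers` children, one slice each."""
    children = []
    for i in range(workers):
        chunk = numbers[i::workers]   # the child sees this via fork, no copy-in
        r, w = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: compute, send one 8-byte integer back, and leave
            # without running the parent's cleanup handlers.
            status = 1
            try:
                os.close(r)
                os.write(w, struct.pack("!q", sum(chunk)))
                status = 0
            finally:
                os._exit(status)
        os.close(w)                   # parent keeps only the read end
        children.append((pid, r))

    total = 0
    for pid, r in children:
        with os.fdopen(r, "rb") as pipe:
            total += struct.unpack("!q", pipe.read(8))[0]
        os.waitpid(pid, 0)
    return total


if __name__ == "__main__":
    print(forked_sum(list(range(1000)), workers=4))
```

This is only safe under the stated assumption; if any library has started a thread before the fork, the caveats discussed elsewhere in the thread apply.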
        
             | quinoablast wrote:
             | That is basically how NGINX works if you run it in daemon
             | mode. When you start or reload the server, the main process
             | initializes common state then forks to become a worker
             | process. Although I would recommend avoiding any IPC past
             | that if possible
        
             | masklinn wrote:
             | > In this sense, I suppose TFA isn't arguing against
             | multiprocessing per-se, but against the specific type that
             | implicitly includes all of the current process state (which
             | has both up- and down-sides).
             | 
             | You don't have to suppose anything, TFA specifically says
             | that you should use posix_spawn or immediately exec() after
             | forking.
             | 
             | It doesn't imply or hint, let alone say, that threads are
             | superior, it only mentions them because they interact badly
             | with fork() and that's the issue they'd hit. It's not like
             | threads are the only thing which interacts badly with fork.
        
           | cryptonector wrote:
           | As the author of a gist that trashes on fork(), I do
           | nonetheless use it, usually early in daemons' lives:
           | 
           |   - to daemonize
           |   - to fork multiple worker processes
           | 
           | And maybe POSIX-ish shells should use fork() for subshells,
           | naturally.
           | 
           | But I think that's about it for good uses of fork().
           | 
           | For all process spawning uses of fork() I strongly recommend
           | vfork() or posix_spawn() instead.
        
             | masklinn wrote:
             | Isn't the first use-case a pretty debatable / bad one? By
             | daemonizing internally, you make service management and
             | supervision of the program much more difficult, and if you
             | include a non-daemonizing mode for debugging you now have
             | two different runmodes with a pretty significant semantics
             | difference, only one of which is easily inspectable.
        
               | gunapologist99 wrote:
               | Your program probably knows how (if it wished) to manage
               | its own resources far better than an external program
               | ever could.
        
               | masklinn wrote:
               | Said no sysadmin ever.
        
               | lanstin wrote:
               | I love this sysadmin comment and the GP dev comment. The
               | key is to get the sysadmin team putting requirements to
               | the code, whether the restarting ends up in process or as
               | a nice small unix separate tool that just does restarts
               | well, is an outcome of a process.
        
               | cryptonector wrote:
               | Daemonizing is a thing of the past with modern restarter
               | frameworks, like SMF, systemd, supervisord, etc. But
               | daemonizing was always an _option_, not a requirement,
               | and as an option, it's safe enough to provide it for
               | those who don't use a restarter.
        
         | int_19h wrote:
         | It's not just SQLite in macOS.
         | 
         | http://sealiesoftware.com/blog/archive/2017/6/5/Objective-C_...
         | 
         | In general, I don't see how one could safely rely on a third-
         | party library spawning or not spawning threads unless they
         | explicitly make guarantees regarding not using them as part of
         | their public contract.
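
One way to at least observe the problem is to count OS-level threads rather than the ones the language runtime knows about. The sketch below is Linux-specific (it reads `/proc/self/task`) and returns `None` elsewhere; the helper name `native_thread_count` is made up for this example.

```python
import os
import threading


def native_thread_count():
    """Number of OS threads in this process, or None if it can't be read.

    Linux-only assumption: each thread appears as an entry under
    /proc/self/task. On macOS and the BSDs this returns None.
    """
    try:
        return len(os.listdir("/proc/self/task"))
    except OSError:
        return None


if __name__ == "__main__":
    # threading.active_count() only sees threads created through Python;
    # threads started inside a C library show up only in the native count.
    print("python threads:", threading.active_count())
    print("native threads:", native_thread_count())
```

A check like this is a diagnostic at best: a library can start a thread between the check and the fork, so it does not replace a documented guarantee.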
        
       | ridiculous_fish wrote:
       | fish shell uses posix_spawn sometimes because of its performance
       | benefits. We can't use it in the following cases:
       | 
       | 1. No analog to tcsetpgrp, so it's no good if job control is
       | enabled
       | 
       | 2. No analog to fchdir, meaning you have to synchronize with
       | fchdir elsewhere in the program
       | 
       | 3. Error codes do not convey enough information for good error
       | messages (e.g. if a file doesn't exist, posix_spawn doesn't tell
       | you which file)
       | 
       | 4. Inconsistent behavior around dup2 fd redirections and
       | O_CLOEXEC.
       | 
       | 5. Inconsistent behavior for shebangless scripts
       | 
       | These are basically deal-breakers so fish also supports a
       | fork/exec path. However the performance benefits of posix_spawn
       | are too real to ignore so fish uses posix_spawn when it can, and
       | fork/exec when it must.
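
Item 2 is easy to illustrate: with fork/exec the child can change directory on its own before exec, without touching the parent's working directory. The Python sketch below is not fish's code; `cwd_seen_by_child` is a hypothetical helper, and it assumes the calling process is single-threaded.

```python
import os
import sys


def cwd_seen_by_child(directory: str) -> str:
    """Run a child in `directory` via fork/exec and return its getcwd()."""
    r, w = os.pipe()
    argv = [sys.executable, "-c", "import os; print(os.getcwd())"]

    pid = os.fork()
    if pid == 0:
        # Child: these steps affect only the child, so the parent needs
        # no synchronization around its own working directory.
        try:
            os.chdir(directory)
            os.dup2(w, 1)             # stdout now goes to the pipe
            os.execv(sys.executable, argv)
        finally:
            os._exit(127)             # reached only if something failed

    os.close(w)
    with os.fdopen(r, "r") as pipe:
        output = pipe.read()
    os.waitpid(pid, 0)
    return output.strip()


if __name__ == "__main__":
    before = os.getcwd()
    print("child ran in:", cwd_seen_by_child("/"))
    print("parent unchanged:", os.getcwd() == before)
```

With plain posix_spawn there is no portable file action for this step, so the parent would have to chdir itself and coordinate with the rest of the program.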
        
         | cryptonector wrote:
         | Shells should really use `vfork()` to exec, and `fork()` for
         | subshells (maybe).
        
         | chubot wrote:
         | OK very interesting ... is the performance benefit on certain
         | platforms, or everywhere? A previous thread says Ninja is
         | faster on OS X and Solaris because of it.
         | 
         | Though that does seem like a large number of corner cases,
         | probably learned through painful experience :-/
        
           | ridiculous_fish wrote:
           | It has performance benefits on both Linux (where it can use
           | vfork) and macOS/BSDs (where it has a kernel implementation).
           | 
           | I tweeted a little about it here, with some perf numbers:
           | https://twitter.com/ridiculous_fish/status/12328893907639336...
        
         | jepler wrote:
         | ah a good list as a companion to my simple posix_spawn pipe
         | example in another sub-thread here! thank you!
        
         | lgg wrote:
         | Not sure if it's worth the platform-specific code, but for #2
         | macOS 10.15+ has `posix_spawn_file_actions_addfchdir_np()`.
         | 
         | I think most of these are deficiencies in the available
         | posix_spawn actions, not anything inherent. Of course getting
         | all the relevant OSes to add new functionality is a huge pain.
         | The error handling seems bad though.
        
       ___________________________________________________________________
       (page generated 2022-06-14 23:01 UTC)