[HN Gopher] Teleforking a Process onto a Different Computer
___________________________________________________________________
Teleforking a Process onto a Different Computer
Author : kaladin-jasnah
Score : 56 points
Date : 2022-05-31 20:37 UTC (2 hours ago)
(HTM) web link (thume.ca)
(TXT) w3m dump (thume.ca)
| cl0ckt0wer wrote:
| This sounds like Kafka, but lower-level.
| latenightcoding wrote:
| Not even close
| basementcat wrote:
| Both MOSIX and openMOSIX supported fork()ing to another node on
| the network. https://en.m.wikipedia.org/wiki/MOSIX
| AaronFriel wrote:
| The Cloud Haskell project and language is likely the only one to
| get this right, thanks to strictly enforced purity. Absent global
| mutable state, it's much simpler to understand whether it's safe
| and possible to serialize a closure and run it somewhere else.
| (fork(2) being a closure by another name.)
|
| In almost all other languages there's just no way to know if a
| closure is holding on to a file descriptor.
|
| Critics may say the Haskell closures could contain
| `unsafePerformIO`, but as the saying goes: now you have two
| problems.
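|
| A minimal sketch of that idea, using plain GHC StaticPointers
| (the mechanism underneath Cloud Haskell's serializable
| closures) rather than Cloud Haskell's own API: 'static' only
| accepts environment-free top-level values, so shipping a
| closure amounts to shipping its key, assuming both ends run
| the same binary.
|
|     {-# LANGUAGE StaticPointers #-}
|     import GHC.StaticPtr
|
|     -- A top-level, environment-free function; 'static'
|     -- rejects anything that captures local state.
|     addOne :: Int -> Int
|     addOne = (+ 1)
|
|     -- The key is all that needs to cross the wire.
|     key :: StaticKey
|     key = staticKey (static addOne)
|
|     main :: IO ()
|     main = do
|       -- The "remote" side resolves the key in its own static
|       -- pointer table.
|       mptr <- unsafeLookupStaticPtr key
|                 :: IO (Maybe (StaticPtr (Int -> Int)))
|       case mptr of
|         Just ptr -> print (deRefStaticPtr ptr 41)  -- 42
|         Nothing  -> putStrLn "unknown static pointer"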
| gpderetta wrote:
| Isn't fork more of a continuation?
|
| /pedantic
| gnufx wrote:
| Is Cloud [sigh] Haskell still alive?
|
| For comparison, two old distributed lexical scope systems were
| Cardelli's Obliq and Kelsey's(?) Kali Scheme. From what I
| remember, not like remote forking, though.
| fleddr wrote:
| In the late 90s I attended a tour at Holland Signaal, a Dutch
| defense company producing radar and anti-missile systems.
|
| I remember vividly how they demonstrated an unbreakable process.
| They had a computer running a process, and no matter what happened
| to that computer, the next one would flawlessly continue the
| process down to the cycle, with no chance of corruption or
| skipping a beat.
|
| It may very well be that this is actually not very difficult, but
| it seemed difficult and impressive.
|
| Perhaps more shocking were the ultra-high-resolution radar
| screens, some 3 generations ahead of anything I had seen in the
| consumer space, showing an incredible live visualization of the
| airspace: exactly which plane is where, the model/type, age, fuel
| on board, hostile/friendly, all of it.
|
| They even had a "situation room" with a holodeck chair in the
| middle, full of controls. The entire room was covered in wall-
| size screens showing the airspace of the entire country, being
| analyzed live.
|
| Sounds very 2022, not 1998.
| a-dub wrote:
| this is a fun hack. it would be interesting to look at some real-
| world workloads and compare whether this sort of init-once, ship-
| the-initialized-memory-image-everywhere style is faster than just
| initializing everywhere.
| tenken wrote:
| Doesn't Erlang support these ideas of distributed computing? And
| if I recall correctly, Clipper supported remote execution of
| objects, or sharing object code in a distributed fashion.
| mghfreud wrote:
| Isn't this exactly what VM migration in the cloud is?
| mlyle wrote:
| No. VM migration moves entire virtual computers. Forking makes
| a copy of a process with the current state; this moves that
| single duplicated process to a different machine.
| mghfreud wrote:
| A virtual computer is a bunch of processes.
| speed_spread wrote:
| And a kernel. And drivers. And devices. And busses. And
| interrupts.
| Animats wrote:
| That used to be in some UNIX variants, such as UCLA Locus and the
| IBM derivatives of that. But it never got to be a Linux thing.
| Fnoord wrote:
| Was VMS capable of achieving this as well?
| jonathaneunice wrote:
| Congratulations! You have just reinvented the core idea of UCLA's
| LOCUS distributed computing project from 1979.
| https://en.wikipedia.org/wiki/LOCUS
|
| Reinventing LOCUS also has a strong heritage. Bell Labs' Plan 9,
| for example, did so in part in the late 1980s.
|
| While never a breakout commercial success, tele-forking and its
| slightly more advanced cousins, machine-to-machine process
| migration and cluster-wide process pools, intrigued some of the
| best minds in distributed computing for 20+ years.
|
| Unfortunately "it's complicated" to implement well, especially
| when you try to tele-spawn and manage resources beyond compute
| cycles (network connections, files, file handles, ...) that are
| important to scale up the idea.
| zozbot234 wrote:
| > Unfortunately "it's complicated" to implement well,
| especially when you try to tele-spawn and manage resources
| beyond compute cycles (network connections, files, file
| handles, ...)
|
| Aren't all of these resources namespaced/containerized in
| modern Linux? This should make it feasible to checkpoint and
| restore them on the same machine (via, e.g. the CRIU patchset)
| and true location-independence is not _that_ much harder. One
| of the hardest parts (not even implemented in plan9, AFAICT) is
| distributed shared memory (allowing for sharing a _single_
| virtual address space across cluster nodes), but even that AIUI
| has some research-level implementations.
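|
| As a concrete sketch of the checkpoint/restore step, here's
| what driving the criu binary from Haskell might look like (the
| pid and image directory are made up; real use needs root, and
| the process tree has to be dumpable):
|
|     -- Sketch only: checkpoint a process with CRIU, then
|     -- restore it from the image files. Shipping the image
|     -- directory to another machine is the "telefork" part.
|     import System.Process (callProcess)
|
|     checkpoint :: Int -> FilePath -> IO ()
|     checkpoint pid dir =
|       -- criu dump freezes the tree rooted at pid and writes
|       -- its state into dir.
|       callProcess "criu"
|         ["dump", "-t", show pid, "-D", dir, "--shell-job"]
|
|     restore :: FilePath -> IO ()
|     restore dir =
|       -- criu restore recreates the processes from dir.
|       callProcess "criu" ["restore", "-D", dir, "--shell-job"]
|
|     main :: IO ()
|     main = checkpoint 12345 "/tmp/ckpt"  -- hypothetical pid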
| gowld wrote:
| Also, Haskell's Control.Distributed.Fork
|
| > This module provides a common interface for offloading an IO
| action to remote executors.
|
| > It uses StaticPointers language extension and distributed-
| closure library for serializing closures to run remotely. This
| blog post[1] is a good introduction for those.
|
| > In short, if you need a Closure a:
|
| > One important constraint when using this library is that it
| assumes the remote environment is capable of executing the
| exact same binary. In most cases, this requires your host
| environment to be Linux.
|
| https://hackage.haskell.org/package/distributed-fork-0.0.1.3...
|
| [1] https://blog.ocharles.org.uk/blog/guest-
| posts/2014-12-23-sta...
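|
| A minimal round trip with the distributed-closure library
| mentioned above, assuming its closure/unclosure functions and
| the Binary instance on Closure; a real deployment would hand
| the bytes to a remote executor running the same binary:
|
|     {-# LANGUAGE StaticPointers #-}
|     import Control.Distributed.Closure
|       (Closure, closure, unclosure)
|     import Data.Binary (decode, encode)
|
|     double :: Int -> Int
|     double = (* 2)
|
|     main :: IO ()
|     main = do
|       let c     = closure (static double)
|           bytes = encode c  -- what would go over the wire
|           c'    = decode bytes :: Closure (Int -> Int)
|       print (unclosure c' 21)  -- prints 42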
| felixgallo wrote:
| Always worth a reread:
| https://joearms.github.io/published/2013-11-21-My-favorite-e...
| eointierney wrote:
| <About a year later I had to write a paper. One of the
| disadvantages of being a researcher is that in order to get
| money you have to write a paper about something or other, the
| paper is never about what currently interests you at the
| moment, but must be about what the project that financed your
| research expects to read about.
|
| Well I had my gossip network setup on planet lab and I could
| tell it to become anything, so I told it to become a content
| distribution network and used a gossip algorithm to make
| copies of the same file on all machines on the network and wrote
| a paper about it and everybody was happy.>
|
| I miss Joe, not that I ever met him, but his attitude and good
| humour are inspiring.
| gnufx wrote:
| For what it's worth, at least for HPC-ish distributed computing,
| this sort of thing turns out not to be terribly worthwhile. We
| have a standard for distribution of computation, shared memory,
| i/o, and process starting in MPI (and, for instance, DMTCP to
| migrate the distributed application if necessary, though I think
| DMTCP needs a release).
|
| I don't know what its current status is, but the HPC-ish Bproc
| system has/had an rfork [1]. Kerrighed, probably the most HPC-
| oriented SSI system, died, as did the Plan 9-ish xcpu, though
| that was a bit different.
|
| 1.
| https://www.penguinsolutions.com/computing/documentation/scy...
| zozbot234 wrote:
| The biggest benefit is arguably that codes designed for
| "telefork" and perhaps remote threads can also be scaled _down_
| to a single shared-memory machine, and run far more efficiently
| than if they had been coded using the MPI approach, whilst
| adding hardly any overhead when running in a cluster, assuming
| the codes are designed properly.
| daenz wrote:
| Lambda/Cloud Functions are starting to converge on this idea. It
| will eventually get streamlined and ergonomic enough that it
| appears you're executing an expensive or async function locally,
| except you aren't.
___________________________________________________________________
(page generated 2022-05-31 23:00 UTC)