http://catern.com/compdist.html
Your computer is a distributed system
Most computers are essentially distributed systems internally, and
provide their more familiar programming interface as an abstraction
on top of that underlying system. Piercing through this abstraction
can yield greater performance, but most programmers do not need to
do so. This is something unique: an abstraction that hides the
distributed nature of a system and actually succeeds.
Many have observed that today's computers are distributed systems
implementing the abstraction of the shared-memory multiprocessor. The
wording in the title is from the 2009 paper "Your computer is already
a distributed system. Why isn't your OS?". Some points from that and
other sources:
* In a typical computer there are many components running
concurrently, e.g. graphics cards, storage devices, and network
cards, all running their own programs ("firmware") on their own
cores, independent of the main CPU and communicating over an
internal network (e.g. a PCI bus).
* Components can appear and disappear and fail independently, up to
and including CPU cores, and other components have to deal with
this. Much like on larger-scale unreliable networks, timeouts are
used to detect communication failures.
* Components can be malicious and send adversarial inputs; e.g., a
malicious USB or Thunderbolt device can try to exploit
vulnerabilities in other hardware components in the system,
sometimes successfully. The system should be robust to such
attacks, just as with larger-scale distributed systems.
* These components are all updated on their own schedule. Often the
computer's owner is not allowed to push their own updates to
these components.
* The latency between different components of a computer is highly
relevant; e.g., the latency of communication between the CPU and
main memory or storage devices is huge compared to communication
within the CPU.
* To reduce these latencies, we put caches in the middle, just as
we use caches and CDNs in larger distributed systems.
* Those caches are themselves sophisticated distributed systems,
implemented with message passing between individual cores.
Keeping them consistent has a latency cost just as it does in
larger distributed systems; the sketch just after this list
illustrates that cost.
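To make that coherence cost concrete, here is a minimal sketch in
Rust (not from the paper; the iteration count, thread count, and
128-byte alignment are arbitrary choices for illustration). It times
two threads incrementing one shared counter, where every increment
forces the coherence protocol to pass the cache line between cores,
against two threads incrementing separate counters padded onto their
own cache lines. On a typical multicore machine the shared case is
noticeably slower, though the exact ratio depends on the hardware.

    use std::sync::atomic::{AtomicU64, Ordering};
    use std::thread;
    use std::time::{Duration, Instant};

    const ITERS: u64 = 10_000_000;

    // Two threads hammer a single counter: every increment forces the
    // coherence protocol to move ownership of that cache line between
    // cores.
    fn contended() -> Duration {
        let counter = AtomicU64::new(0);
        let start = Instant::now();
        thread::scope(|s| {
            for _ in 0..2 {
                s.spawn(|| {
                    for _ in 0..ITERS {
                        counter.fetch_add(1, Ordering::Relaxed);
                    }
                });
            }
        });
        start.elapsed()
    }

    // Two threads each own a counter on its own cache line: no bouncing.
    fn independent() -> Duration {
        #[repr(align(128))] // keep each counter on a separate cache line
        struct Padded(AtomicU64);
        let counters = [Padded(AtomicU64::new(0)), Padded(AtomicU64::new(0))];
        let start = Instant::now();
        thread::scope(|s| {
            for c in &counters {
                s.spawn(move || {
                    for _ in 0..ITERS {
                        c.0.fetch_add(1, Ordering::Relaxed);
                    }
                });
            }
        });
        start.elapsed()
    }

    fn main() {
        println!("shared cache line:    {:?}", contended());
        println!("separate cache lines: {:?}", independent());
    }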
As the shared-memory multiprocessor is actually a distributed system
underneath, we can, if necessary, reason explicitly about that
underlying system and use the same techniques that larger-scale
distributed systems use:
* Performing operations closer to the data, e.g. through compute
elements embedded into storage devices and network interfaces, is
faster.
* Offloading heavy computations to a "remote service" can be
faster, both because the service can run with better resources,
and because the service will have better locality.
* Communicating directly between devices rather than going through
the centralized main memory reduces load on the CPU and improves
latency.
* Using message passing instead of shared memory is faster, for the
same reasons larger distributed systems don't use distributed
shared memory; the sketch after this list shows what that style
looks like.
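As a sketch of that last point in ordinary multithreaded code (Rust
here; the Request type and the map behind it are invented for
illustration), one thread can own a piece of state outright and serve
it over channels, so other threads interact with it only by message
passing rather than by sharing its memory:

    use std::collections::HashMap;
    use std::sync::mpsc;
    use std::thread;

    // Messages sent to the thread that owns the map; a Get carries a
    // channel for the reply, so no other thread touches the map directly.
    enum Request {
        Put(String, u64),
        Get(String, mpsc::Sender<Option<u64>>),
    }

    fn main() {
        let (tx, rx) = mpsc::channel::<Request>();

        // The "service" thread: sole owner of the state, handling one
        // message at a time, with no locks.
        let owner = thread::spawn(move || {
            let mut map = HashMap::new();
            for req in rx {
                match req {
                    Request::Put(k, v) => {
                        map.insert(k, v);
                    }
                    Request::Get(k, reply) => {
                        let _ = reply.send(map.get(&k).copied());
                    }
                }
            }
        });

        // A "client": interacts with the state only by sending messages.
        tx.send(Request::Put("x".into(), 42)).unwrap();
        let (reply_tx, reply_rx) = mpsc::channel();
        tx.send(Request::Get("x".into(), reply_tx)).unwrap();
        println!("x = {:?}", reply_rx.recv().unwrap());

        // Dropping the sender closes the channel and lets the owner exit.
        drop(tx);
        owner.join().unwrap();
    }

This is the same shape a larger system takes when it puts data behind
a network service instead of mapping it into every node's memory.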
But most programs do not do such things; the abstraction of the
shared-memory multiprocessor is sufficient for them.
That this abstraction is successful is surprising. In distributed
systems, it is often supposed that you cannot abstract away the
issues of distributed programming. One classic and representative
quote:
It is our contention that a large number of things may now go
wrong due to the fact that RPC tries to make remote procedure
calls look exactly like local ones, but is unable to do it
perfectly.
- Section 1, A Critique of the Remote Procedure Call Paradigm
Following this, most approaches to large-scale distributed
programming today make the "distributed" part explicit: they expose
(and require programmers to handle) network failures, latency,
insecure networks, partial failure, and so on. Dealing with these
issues is notoriously hard for programmers.
So it is surprising to find that essentially all programs are written
in an environment that abstracts away its own distributed nature: the
shared-memory multiprocessor, as discussed above. Such programs are
filled with RPCs that appear to be local operations. Accesses to
memory and disk are synchronous RPCs to relatively distant hardware,
which block the execution of the program while waiting for a
response. But for most programs, this is completely fine.
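To see the synchronous-RPC framing in code, consider an ordinary
blocking file read, shown here as a minimal Rust sketch (the path is
just an arbitrary example). Nothing at the call site reveals that the
program is issuing a request to a relatively distant device and
blocking until the reply arrives:

    use std::fs;
    use std::time::Instant;

    fn main() -> std::io::Result<()> {
        let start = Instant::now();
        // Looks like a local function call, but it is a synchronous
        // request to the storage device (or the kernel's page cache):
        // the program blocks here until the "remote" side responds.
        let bytes = fs::read("/etc/hostname")?;
        println!("read {} bytes in {:?}", bytes.len(), start.elapsed());
        Ok(())
    }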
This suggests that it may be possible to abstract away the
distributed nature of larger-scale systems. Perhaps eventually we can
use the same abstractions for distributed systems of any size, from
individual computers to globe-spanning networks.