https://percisely.xyz/syncronicity
SyncronicityCongruenceSing About MeRAFT ElectionsSlow FinalityWhat is
a DAO?Bank Fraud
Time, Clocks, and the Ordering of Events in a Distributed System -
Paper Review
This is a review of Leslie Lamport's paper Time, Clocks, and the
Ordering of Events in a Distributed System. I focus on the paper's
proof of conditions under which a distributed system of physical
clocks can produce an ordering of events, which does not violate
causality. The proof lies in the appendix of the paper, and is
extremely brief: Not unlike reading a mystery novel. I expand on the
paper by presenting the proof, all the math worked out, explained
with a less mysterious level of brevity.
---------------------------------------------------------------------
Introduction
We are presented with a set of processes, PPP, each Pi[?]PP_i \in PPi [?]
P has a clock CiC_iCi . Ci C_i\langle a \rangleCi returns the
timestamp of event aaa occurring on PiP_iPi ; Ci(t)C_i(t)Ci (t)
returns the timestamp of PiP_iPi 's clock at physical time ttt. We
assume all processes are moving at the same speed, so there's no need
to worry about relativity.
We define a relation -\rightarrow-, or "happens before", as a
relation on the set of events such that:
1. If aaa and bbb are events occurring on PiP_iPi , and Ci
C_i\langle a \rangle < C_i \langle b \rangleCi ,
then a-ba \rightarrow ba-b.
2. If aaa is the sending of a message, and bbb the receiving, then
a-ba \rightarrow ba-b.
A nice property of -\rightarrow- is, aaa can causally effect bbb if
and only if a-ba\rightarrow ba-b. Because, for events aaa and bbb
occurring on PiP_iPi and PjP_jPj , aaa can only effect bbb if PjP_j
Pj has learned about aaa by receiving a message from PiP_iPi .
To break ties, we order events according to [?]\prec[?].
a[?]b={i =Cj Ci ,otherwise\begin{equation} a \prec
b = \begin{cases} i < j, & \text{if \( C_i\langle a\rangle = C_j\
langle b\rangle \)} \\ C_i\langle a\rangle < C_j\langle b\rangle, & \
text{otherwise} \end{cases} \end{equation}a[?]b={i , if
Ci =Cj otherwise
[?]\prec[?] will never violate causality, as two events occurring at the
same time can't cause each other.
For our system of clocks to create an ordering of events satisfying -
\rightarrow-, we define two clock conditions.
C1: If aaa and bbb are events on PiP_iPi , and aaa comes before bbb,
then Ci C_i\langle a\rangle < C_i\langle b\rangleCi .
C2: If aaa is the sending of a message by process PiP_iPi , and bbb
the message's receipt by process PjP_jPj , then Ci C_i\langle
a\rangle < C_j\langle b\rangleCi .
C1 ensures the first rule of -\rightarrow- is satisfied, and C2 the
second. Thus, a system of clocks satisfying the clock conditions will
order events with a -\rightarrow- relation.
Logical Clocks
Before considering physical clocks, we consider a simpler clock.
Logical clocks satisfy the clock conditions, but don't order events
in a way compatible with our notion of time.
To order events with a logical clock, processes follow two rules.
IR1: Each process PiP_iPi increments CiC_iCi between any two
successive events.
IR2: (a) If an event aaa is the sending of a message mmm by process
PiP_iPi , then the message contains a timestamp Tm=Ci T_m = C_i\
langle a\rangleTm =Ci . (b) Upon receiving a message mmm in event
bbb, a process PjP_jPj sets CjC_jCj greater than or equal to its
present value, and greater than TmT_mTm .
Processes implementing these rules satisfy the clock conditions C1
and C2, by IR1 and IR2 respectively.
Time, and Logical Clock Time
In a sense, logical clock time and the time we are used to line up. [?]
\prec[?] guarantees events happening before a message is sent, are
ordered before events on the receiving process after the message is
received. This preserves causality, as in order for an event on PiP_i
Pi to cause an event on PjP_jPj a message must be sent between the
processes.
When events can not cause eachother, our logical clocks order events
according to the number of events preceding them. This has no basis
in our typical notion of time. To address this, we turn to physical
clocks.
Physical Clocks
Physical clocks track time as we perceive it. A watch is a physical
clock, and like a watch, physical clocks can become out of sync. This
can cause causality to break; If one clock is so far ahead of another
the receiving timestamp is smaller than the sending one, then receive
-\rightarrow- send.
This section models a physical clock, derives the conditions under
which causality can break, and proves what conditions will ensure it
does not.
Physical Clock Conditions
To model a physical clock, we put two conditions on its behavior.
PC1: There exists a constant k[?]1k \ll 1k[?]1, such that for all iii:
|dCi(t)dt-1|1\frac{dC(t)}{dt} > 1dtdC(t) >1, and too slow dC(t)
dt<1\frac{dC(t)}{dt} < 1dtdC(t) <1.
Breaking Causality
For any two processes communicating, there is a lower bound on how
quickly they may exchange messages. At the limit, this is the
distance between the processes, divided by the speed of light. We
call this lower bound m\mum.
Our goal is to find values of [?]\epsilon[?] for which, for any message,
sent by PiP_iPi at ttt, and received by PjP_jPj at t't't', Ci(t)t+mt'
> t + \mut'>t+m. Thus, to satisfy C2:
Ci(t)<=Cj(t+m)\begin{equation} C_i(t) \leq C_j(t + \mu) \end{equation}
Ci (t)<=Cj (t+m)
By PC1, the slowest possible clock advances at a rate dCj(t)dt>(1-k)\
frac{dC_j(t)}{dt}>(1-k)dtdCj (t) >(1-k). Thus, Cj(t+m)>Cj(t)+(1-k)
mC_j(t+\mu) > C_j(t) + (1-k)\muCj (t+m)>Cj (t)+(1-k)m. We plug this
lower bound for Cj(t+m)C_j(t+\mu)Cj (t+m) into the right side of (2),
yielding a bound on [?]\epsilon[?].
Ci(t)<=Cj(t+m)Ci(t)<=Cj(t)+(1-k)mCi(t)-Cj(t)<=(1-k)m[?]<=m(1-k)\begin
{equation} \begin{aligned} C_i(t) &\leq C_j(t + \mu) \\ C_i(t) &\leq
C_j(t) + (1-k)\mu \\ C_i(t) - C_j(t) &\leq (1-k)\mu \\ \epsilon &\leq
\mu(1-k) \end{aligned} \end{equation}Ci (t)Ci (t)Ci (t)-Cj (t)[?] <=Cj (
t+m)<=Cj (t)+(1-k)m<=(1-k)m<=m(1-k)
If this inequality is true, [?]\prec[?] will satisfy C2. And thus, for
software satisfying C1, [?]\prec[?] will preserve causality. If the
inequality is false, it will be possible to order the receiving of a
message before its sending, causing [?]\prec[?] to break causality.
Preserving Causality
Having discovered conditions under which causality breaks, we set
about describing an algorithm which preserves causality, and prove it
works. Our algorithm is the same one used for logical clocks,
slightly modified for physical time.
IRP1: The physical clocks for each process are always advancing: dCi
(t)dt>0\frac{dC_i(t)}{dt} > 0dtdCi (t) >0.
IRP2: (a) For each iii, if PiP_iPi sends a message mmm at physical
time ttt, then mmm contains a timestamp Tm=Ci(t)T_m = C_i(t)Tm =Ci (t
). (b) Upon receiving a message mmm at t't't', process PjP_jPj sets
Cj(t')>=max[?](Cj(t'),Tm+m)C_j(t') \geq \max(C_j(t'), T_m + \mu)Cj (t')>=
max(Cj (t'),Tm +m).
We model our processes as being arranged in a graph, the edges of
which represent communication links. We make three assumptions about
our graph and the behavior of its processes.
1. The diameter, ddd, of the graph is the shortest number of links
which connects any two processes.
2. For a message sent at physical time ttt, and received at t't't',
we define the total delay vm=t'-tv_m = t' - tvm =t'-t. vm
=C2(t2)+(1-k)(t-t2)for t>=t2\begin{equation} \begin{aligned} C_2
(t) \geq C_2(t_2) + (1-k)(t-t_2) \\ \quad \text{for } t \geq t_2 \end
{aligned} \end{equation}C2 (t)>=C2 (t2 )+(1-k)(t-t2 )for t>=t2
Per IRP2, after the message has been received, the clock of the
receiving process must satisfy C2(t2)>=C1(t1)+mmC_2(t_2) \geq C_1(t_1)
+ \mu_mC2 (t2 )>=C1 (t1 )+mm . Combining this with (4).
C2(t)>=C1(t1)+mm+(1-k)(t-t2)\begin{equation} C_2(t) \geq C_1(t_1) + \
mu_m + (1-k)(t-t_2) \end{equation}C2 (t)>=C1 (t1 )+mm +(1-k)(t-t2 )
Turning our attention to the second term on the right side of (5).
(1-k)(t-t2)=(1-k)(t-t1+t1-t2)=(1-k)(t-t1)+(1-k)(t1-t2)=(1-k)(t-t1)+
(k-1)(t2-t1)\begin{equation} \begin{aligned} (1-k)(t-t_2) &= (1-k)
(t-t_1+t_1-t_2) \\ &= (1-k)(t-t_1) + (1-k)(t_1-t_2) \\ &= (1-k)
(t-t_1) + (k-1)(t_2-t_1) \end{aligned} \end{equation}(1-k)(t-t2 ) =(1
-k)(t-t1 +t1 -t2 )=(1-k)(t-t1 )+(1-k)(t1 -t2 )=(1-k)(t-t1 )+(k-1)(t2
-t1 )
Combining (5) and (6).
C1(t1)+mm+(1-k)(t-t2)=C1(t1)+mm+(1-k)(t-t1)+(k-1)(t2-t1)=C1(t1)+(1-k)
(t-t1)-[(t2-t1)-mm]+k(t2-t1)\begin{equation} \begin{aligned} & C_1
(t_1) + \mu_m + (1-k)(t-t_2) \\ &= C_1(t_1) + \mu_m + (1-k)(t-t_1) +
(k-1)(t_2-t_1) \\ &= C_1(t_1) + (1-k)(t-t_1) - [(t_2 - t_1) - \mu_m]
+ k(t_2-t_1) \end{aligned} \end{equation} C1 (t1 )+mm +(1-k)(t-t2 )=
C1 (t1 )+mm +(1-k)(t-t1 )+(k-1)(t2 -t1 )=C1 (t1 )+(1-k)(t-t1 )-[(t2 -
t1 )-mm ]+k(t2 -t1 )
We now briefly digress into delay.
xm=vm-mm=(t2-t1)-mm<=x\begin{equation} \begin{aligned} \xi_m &= v_m -
\mu_m \\ &= (t_2-t_1) - \mu_m \\ &\leq \xi \end{aligned} \end
{equation}xm =vm -mm =(t2 -t1 )-mm <=x
Combining (7) and (8).
C2(t)>=C1(t1)+(1-k)(t-t1)-[(t2-t1)-mm]+k(t2-t1)>=C1(t1)+(1-k)(t-t1)-x+k
(t2-t1)>=C1(t1)+(1-k)(t-t1)-x\begin{equation} \begin{aligned} C_2(t) &
\geq C_1(t_1) + (1-k)(t-t_1) - [(t_2 - t_1) - \mu_m] + k(t_2-t_1) \\
&\geq C_1(t_1) + (1-k)(t-t_1) - \xi + k(t_2-t_1) \\ &\geq C_1(t_1) +
(1-k)(t-t_1) - \xi \end{aligned} \end{equation}C2 (t) >=C1 (t1 )+(1-k)
(t-t1 )-[(t2 -t1 )-mm ]+k(t2 -t1 )>=C1 (t1 )+(1-k)(t-t1 )-x+k(t2 -t1 )
>=C1 (t1 )+(1-k)(t-t1 )-x
(9) sets a lower bound on the receiving clock's time, in terms of the
sending clock. Now, imagine a series of processes [P1,...,PN][P_1,\
dots,P_N][P1 ,...,PN ]. At time tit_iti , process PiP_iPi receives a
message from process Pi-1P_{i-1}Pi-1 , sent at time ti-1't_{i-1}'ti-
1' , where ti-1'<=ti<=ti't_{i-1}' \leq t_i \leq t_i'ti-1' <=ti <=ti' .
Consider what happens when we repeatedly apply (9) to C3C_3C3 .
C3(t)>=C2(t2')+(1-k)(t-t2')>=[C1(t1')+(1-k)(t2'-t1')-x]+(1-k)(t-t2')-x=
C1(t1')+(1-k)(t-t1')-2x\begin{equation} \begin{aligned} C_3(t) &\geq
C_2(t_2') + (1-k)(t-t_2') \\ &\geq [C_1(t_1') + (1-k)(t_2'-t_1') - \
xi] +(1-k)(t-t_2') -\xi \\ &= C_1(t_1') + (1-k)(t-t_1') - 2\xi \end
{aligned} \end{equation}C3 (t) >=C2 (t2' )+(1-k)(t-t2' )>=[C1 (t1' )+(1
-k)(t2' -t1' )-x]+(1-k)(t-t2' )-x=C1 (t1' )+(1-k)(t-t1' )-2x
Every time we apply the inequality, it subtracts x\xix, and decreases
the time and clock subscript by one. For a process PiP_iPi , there is
a chain of (i-1)(i-1)(i-1) processes leading up to it, so (9) can be
applied (i-1)(i-1)(i-1) times to find a lower bound on PiP_iPi 's
clock.
Cn+1(t)>=C1(t1')+(1-k)(t-t1')-nxfor t>tn\begin{equation} \begin
{aligned} C_{n+1}(t) \geq C_1(t_1')+(1-k)(t-t_1')-n\xi \\ \quad \text
{for } t > t_n \end{aligned} \end{equation}Cn+1 (t)>=C1 (t1' )+(1-k)(t
-t1' )-nxfor t>tn
Finally, as a cleanup step, as C1(t1')>=(1-k)(t1'-t1)C_1(t_1') \geq
(1-k)(t_1'-t_1)C1 (t1' )>=(1-k)(t1' -t1 ), we can write (11) in terms
of t1t_1t1 .
Cn+1(t)>=C1(t1)+(1-k)(t-t1)-nxfor t>tn+1\begin{equation} \begin
{aligned} C_{n+1}(t) \geq C_1(t_1)+(1-k)(t-t_1)-n\xi \\ \quad \text
{for } t > t_{n+1} \end{aligned} \end{equation}Cn+1 (t)>=C1 (t1 )+(1-k
)(t-t1 )-nxfor t>tn+1
You may want to take a bong rip at this point.
Now imagine this scenario, applied to our graph of processes. Recall
assumptions (1) and (3), which say our graph has diameter ddd, and
every process sends a message to every other process every t\taut
units of time. If we imagine timestamps propagating through the graph
as messages are sent, every d(t+v)d(\tau + v)d(t+v) units of time, a
timestamp from every process will have had the opportunity to
propagate to every other process.
We can thus apply our earlier inequality for the minimum value of a
clock in a series of processes, for a series of processes of length
ddd. For any two processes, when t>=t1+d(t+v)t\geq t_1+d(\tau + v)t>=t1
+d(t+v):
Ci(t)>=Cj(t1)+(1-k)(t-t1)-dxfor t>=t1+d(t+v)\begin{equation} \begin
{aligned} C_i(t) \geq C_j(t_1)+(1-k)(t-t_1)-d\xi \\ \quad \text{for }
t \geq t_1 + d(\tau + v) \end{aligned} \end{equation}Ci (t)>=Cj (t1 )+
(1-k)(t-t1 )-dxfor t>=t1 +d(t+v)
This sets a lower bound on the value of a clock, in terms of some
known value of another clock and the amount of time that has passed.
We are now half way to finding [?]\epsilon[?]. In the next section, we
look for an upper bound.
The Fastest Clock
Say at time txt_xtx , CxC_xCx is the clock with the highest value.
By PC1:
Ci(t)<=Cx(tx)+(1+k)(t-tx)for t>=tx\begin{equation} \begin{aligned} C_i
(t) \leq C_x(t_x) + (1+k)(t - t_x) \\ \quad \text{for } t \geq t_x \
end{aligned} \end{equation}Ci (t)<=Cx (tx )+(1+k)(t-tx )for t>=tx
However, this argument is subtly incomplete. When a process receives
a message, it sets its clock's value to be >=Tm+mm\geq T_m + \mu_m>=Tm
+mm (IRP2). Imagine PxP_xPx sends a message at txt_xtx , setting Tm
=Cx(tx)T_m = C_x(t_x)Tm =Cx (tx ). Upon receiving the message at time
t't't', the receiver sets their clock to Tm+mmT_m + \mu_mTm +mm . Is
it possible for Tm+mm>Cx(tx)+(1+k)(t'-tx)T_m + \mu_m > C_x(t_x) +
(1+k)(t'-t_x)Tm +mm >Cx (tx )+(1+k)(t'-tx )? If so, (14) would be
incorrect.
Let's examine the inequality.
Cx(tx)+(1+k)(t'-tx)=tx+d(t+v)\begin
{equation} \begin{aligned} C_x(t_x) + (1-k)(t-t_x)-d\xi \leq \\ C_i
(t) \leq C_x(t_x) + (1+k)(t - t_x) \\ \quad \text{for } t \geq t_x +
d(\tau + v) \end{aligned} \end{equation}Cx (tx )+(1-k)(t-tx )-dx<=Ci (
t)<=Cx (tx )+(1+k)(t-tx )for t>=tx +d(t+v)
This holds for all iii, and thus bounds the value of a clock. The
farthest off two clocks can be, is the gap between the fastest and
slowest possible clocks. That is, the difference between the upper
and lower bound of (16).
Cx(tx)+(1+k)(t-tx)-[Cx(tx)+(1-k)(t-tx)-dx]=(1+k)(t-tx)-(1-k)(t-tx)+dx
=2k(t-tx)+dx\begin{equation} \begin{aligned} &C_x(t_x) + (1+k)(t -
t_x) - [C_x(t_x) + (1-k)(t-t_x)-d\xi] \\ &= (1+k)(t-t_x) - (1-k)
(t-t_x) + d\xi \\ &= 2k(t-t_x) + d\xi \end{aligned} \end{equation} Cx
(tx )+(1+k)(t-tx )-[Cx (tx )+(1-k)(t-tx )-dx]=(1+k)(t-tx )-(1-k)(t-
tx )+dx=2k(t-tx )+dx
Thus, the maximum difference between clocks in our system is:
|Ci(t)-Cj(t)|<=2k(t-tx)+dxfor t>=tx+d(t+v)\begin{equation} \begin
{aligned} |C_i(t) - C_j(t)| \leq 2k(t-t_x) + d\xi \\ \quad \text{for
} t \geq t_x + d(\tau + v) \end{aligned} \end{equation}|Ci (t)-Cj (t)
|<=2k(t-tx )+dxfor t>=tx +d(t+v)
Notice how (18) applies to all ttt, txt_xtx , where t>=tx+d(t+v)t \geq
t_x + d(\tau + v)t>=tx +d(t+v). This means, for any ttt, we can find a
txt_xtx , such that t-tx=d(t+v)t-t_x = d(\tau + v)t-tx =d(t+v). This
is the lowest value for t-txt-t_xt-tx , so we can replace the
t-txt-t_xt-tx term in (18).
|Ci(t)-Cj(t)|<=2k(t-tx)+dx<=2kd(t+v)+dx=d(2k(t+v)+x)for t>=t0+d(t+v)\
begin{equation} \begin{aligned} |C_i(t) - C_j(t)| &\leq 2k(t-t_x) + d
\xi \\ &\leq 2kd(\tau + v) + d\xi \\ &= d(2k(\tau + v) + \xi) \\ &\
text{for } t \geq t_0 + d(\tau + v) \end{aligned} \end{equation}|Ci (
t)-Cj (t)| <=2k(t-tx )+dx<=2kd(t+v)+dx=d(2k(t+v)+x)for t>=t0 +d(t+v)
This concludes our proof for a bound on [?]\epsilon[?]. Our algorithm
takes d(t+v)d(\tau + v)d(t+v) units of time to synchronize, and then
the amount two clocks may be out of sync is bounded.
[?]<=d(2k(t+v)+x)\begin{equation} \begin{aligned} \epsilon \leq d(2k(\
tau + v) + \xi) \\ \end{aligned} \end{equation}[?]<=d(2k(t+v)+x)
Conclusion
In equation (3), we found the maximum [?]\epsilon[?] that preserves
causality. We can combine (3) and (20) investigate this further.
[?]<=(1-k)md(2k(t+v)+x)<=(1-k)m\begin{aligned} \epsilon &\leq (1-k)\mu \\
d(2k(\tau + v) + \xi) &\leq (1-k)\mu \\ \end{aligned}[?]d(2k(t+v)+x) <=(
1-k)m<=(1-k)m
All of these variables, except t\taut describe physical properties of
the system. t\taut, the rate at which messages are sent, is a
property of the software. So, we rearrange the equation in terms of t
\taut.
t<=(1-k)m/d-x2k-v\tau \leq \frac{(1-k)\mu/d - \xi}{2k} - vt<=2k(1-k)m/d
-x -v
Selecting a value for t\taut which satisfied this inequality, would
guarantee ordering events such that, if aaa caused bbb, a[?]ba \prec ba
[?]b.
Thanks to M for reviewing.