[HN Gopher] Lamport's Bakery algorithm, demonstrated in Python
       ___________________________________________________________________
        
       Lamport's Bakery algorithm, demonstrated in Python
        
       Author : eigenvalue
       Score  : 51 points
       Date   : 2024-01-19 15:57 UTC (1 days ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | eigenvalue wrote:
       | I was watching the very good oral history of Leslie Lamport from
       | the Computer History Museum and was struck by how he said that
       | the Bakery Algorithm was the thing he was most proud of, despite
       | Paxos being much more elaborate and, in my mind, important. He
       | also described the Bakery Algorithm as being the only thing that
       | he "discovered" rather than "invented"; he later clarified that
       | this was because it "came out of nowhere" and had no precedent to
       | his knowledge from any previous ideas from him or anyone else.
       | 
       | Although it seems pretty simple on the surface, it came as a
       | surprise to many people who had been working hard on the mutual
       | exclusion problem in the early 1970s, with one of Lamport's
       | colleagues (Anatol Holt) describing it as "impossible" upon first
       | hearing it. Not only did it completely solve the mutual exclusion
       | problem, but it managed to do so without requiring atomicity of
       | reads or writes, and it was fully described and proved in just 3
       | pages. This was something that both Dijkstra and Knuth were
       | unable to do despite each writing at least one paper on the
       | subject prior to Lamport's solution in 1974.
       | 
       | Anyway, I thought it might be fun to demonstrate the principles
       | of how it works in Python. In some ways, Python is uniquely
       | unsuited for this because it has a Global Interpreter Lock (GIL),
       | the main purpose of which is precisely to avoid the kinds of
       | mutual exclusion issues that the Bakery Algorithm solves! Still,
       | we can "fake it" by using the multiprocessing library and by
       | introducing random delays at various points in the code.
       | 
       | Here is the oral history video btw:
       | 
       | https://www.youtube.com/watch?v=SXt3-iZpQQc
        
         | svat wrote:
         | Thanks for sharing! That video is part 1, and here is part 2:
         | https://www.youtube.com/watch?v=uK9yGNuGWKE
         | 
         | And here are the transcripts:
         | https://archive.computerhistory.org/resources/access/text/20...
         | (Part 1, 2016-08-12) and
         | https://archive.computerhistory.org/resources/access/text/20...
         | (Part 2, 2016-11-11)
         | 
         | (All four available from https://www.computerhistory.org/collec
         | tions/oralhistories/?s...)
         | 
         | The papers by Dijkstra and Knuth that you mention are I think:
         | 
         | * (1965 Dijkstra) "Solution of a problem in concurrent
         | programming control" (in CACM)
         | 
         | * (1966 Knuth) "Additional comments on a problem in concurrent
         | programming control" (Letter to CACM, reprinted with an
         | addendum as Chapter 13 of his _Selected Papers on Design of
         | Algorithms_ )
         | 
         | [From the addendum and comments on the latter, I don't get the
         | impression that it had an error (and I don't get that
         | impression from Lamport's comments in the transcripts either).
         | He cites Gary L. Peterson's "Myths About the Mutual Exclusion
         | Problem" (1981,
         | https://zoo.cs.yale.edu/classes/cs323/doc/Peterson.pdf) as a
         | later improvement though. But I may have misunderstood, as I
         | have not read any of these papers/letters or even understood
         | the question :)]
        
           | eigenvalue wrote:
           | Thanks for adding those references. I realized that Lamport
           | also wrote about this stuff on his website listing all his
           | publications, which is a good read:
           | 
           | "This paper describes the bakery algorithm for implementing
           | mutual exclusion. I have invented many concurrent algorithms.
           | I feel that I did not invent the bakery algorithm, I
           | discovered it. Like all shared-memory synchronization
           | algorithms, the bakery algorithm requires that one process be
           | able to read a word of memory while another process is
           | writing it. (Each memory location is written by only one
           | process, so concurrent writing never occurs.) Unlike any
           | previous algorithm, and almost all subsequent algorithms, the
           | bakery algorithm works regardless of what value is obtained
           | by a read that overlaps a write. If the write changes the
           | value from 0 to 1, a concurrent read could obtain the value
           | 7456 (assuming that 7456 is a value that could be in the
           | memory location). The algorithm still works. I didn't try to
           | devise an algorithm with this property. I discovered that the
           | bakery algorithm had this property after writing a proof of
           | its correctness and noticing that the proof did not depend on
           | what value is returned by a read that overlaps a write. I
           | don't know how many people realize how remarkable this
           | algorithm is. Perhaps the person who realized it better than
           | anyone is Anatol Holt, a former colleague at Massachusetts
           | Computer Associates. When I showed him the algorithm and its
           | proof and pointed out its amazing property, he was shocked.
           | He refused to believe it could be true. He could find nothing
           | wrong with my proof, but he was certain there must be a flaw.
           | He left that night determined to find it. I don't know when
           | he finally reconciled himself to the algorithm's correctness.
           | 
           | Several books have included emasculated versions of the
           | algorithm in which reading and writing are atomic operations,
           | and called those versions "the bakery algorithm". I find that
           | deplorable. There's nothing wrong with publishing a
           | simplified version, as long as it's called a simplified
           | version.
           | 
           | What is significant about the bakery algorithm is that it
           | implements mutual exclusion without relying on any lower-
           | level mutual exclusion. Assuming that reads and writes of a
           | memory location are atomic actions, as previous mutual
           | exclusion algorithms had done, is tantamount to assuming
           | mutually exclusive access to the location. So a mutual
           | exclusion algorithm that assumes atomic reads and writes is
           | assuming lower-level mutual exclusion. Such an algorithm
           | cannot really be said to solve the mutual exclusion problem.
           | Before the bakery algorithm, people believed that the mutual
           | exclusion problem was unsolvable--that you could implement
           | mutual exclusion only by using lower-level mutual exclusion.
           | Brinch Hansen said exactly this in a 1972 paper. Many people
           | apparently still believe it. (See [91].)
           | 
           | The paper itself does not state that it is a "true" mutual
           | exclusion algorithm. This suggests that I didn't realize the
           | full significance of the algorithm until later, but I don't
           | remember.
           | 
           | For a couple of years after my discovery of the bakery
           | algorithm, everything I learned about concurrency came from
           | studying it. Papers like [25], [33], and [70] were direct
           | results of that study. The bakery algorithm was also where I
           | introduced the idea of variables belonging to a process--that
           | is, variables that could be read by multiple processes, but
           | written by only a single process. I was aware from the
           | beginning that such algorithms had simple distributed
           | implementations, where the variable resides at the owning
           | process, and other processes read it by sending messages to
           | the owner. Thus, the bakery algorithm marked the beginning of
           | my study of distributed algorithms.
           | 
           | The paper contains one small but significant error. In a
           | footnote, it claims that we can consider reads and writes of
           | a single bit to be atomic. It argues that a read overlapping
           | a write must get one of the two possible values; if it gets
           | the old value, we can consider the read to have preceded the
           | write, otherwise to have followed it. It was only later, with
           | the work eventually described in [70], that I realized the
           | fallacy in this reasoning."
           | 
           | Link:
           | 
           | https://lamport.azurewebsites.net/pubs/pubs.html#monitor1
        
       | exDM69 wrote:
       | > Even if this read and write are not atomic, the system still
       | works
       | 
       | This statement is not really true for modern computers,
       | especially ARM and other architectures which are more sensitive
       | to memory orderings than x86 CPUs.
       | 
       | The reason is the lack of memory barriers.
       | 
       | The CPU and the compiler are free to reorder loads and stores.
       | Memory barriers are inserted in between to enforce specific
       | ordering.
       | 
       | You need an acquire memory barrier after acquiring a lock to make
       | sure that any loads and stores inside the critical section do not
       | get reordered to before the lock was grabbed.
       | 
       | So while the algorithm may work for acquiring a critical section
       | without atomics, it does not enforce that the stuff inside the
       | critical section stays inside the critical section within the
       | same thread.
       | 
       | In fact most normal loads and stores in ARM are "atomic" but to
       | achieve consistency in concurrent programming they are followed
       | or preceded by an appropriate memory barrier instruction e.g. in
       | std::atomic_load or _store.
        
         | arter4 wrote:
         | >You need an acquire memory barrier after acquiring a lock to
         | make sure that any loads and stores inside the critical section
         | do not get reordered to before the lock was grabbed.
         | 
         | What do you mean? That loads and stores inside a critical
         | section could, in fact, be reordered by the CPU and be executed
         | before grabbing a lock?
        
           | exDM69 wrote:
           | Exactly. A load from or store to memory protected by a mutex
           | could be executed before the store that sets the mutex as
           | locked. Unless there is an appropriate kind of memory barrier
           | in between.
        
         | gpderetta wrote:
         | Actually acquire/release barriers or operations are not enough.
         | I'm pretty sure you also need a #StoreLoad for example between
         | Step 1 and Step 2 in the lock algorithm, to prevent the loads
         | from other thread state in step 2 to be reordered before the
         | stores to the own thread state in step 1.
         | 
         | Generally #StoreLoad is required when you need bidirectional
         | synchronization between two or more threads.
         | 
         | Re atomics, I think the algorithm still requires word sized
         | operations to be atomic, but I think that was considered a
         | given. What the algorithm doesn't require is atomic load/stores
         | (or more complex RMW) across more than one word.
        
           | Vecr wrote:
           | Could you implement it using AtomicUsize (using only relaxed
           | operations) in Rust? With AtomicUsize representing the words.
           | You aren't allowed to put pointers in those.
        
       | ur-whale wrote:
       | > The Bakery Algorithm's unique feature is its ability to ensure
       | mutual exclusion in concurrent programming without requiring
       | atomic reads and writes.
       | 
       | This is the key, and somewhat surprising. Had someone told me
       | critical section was possible without hardware support (atomic
       | reads and writes), my knee-jerk reaction would have been "not
       | possible".
        
       | eigenvalue wrote:
       | One additional clarification is useful to make, which is WHY the
       | algorithm is robust to non-atomic reads and writes.
       | 
       | Basically, if one process is reading a value while it is being
       | written to by another process, then it's possible that the
       | reading process will not just get an old value, but it might get
       | a complete garbage value caused by a partially written or garbled
       | memory value. But even then it still works, and doesn't even
       | require any kind of "retry" logic, for this reason:
       | 
       | Recall that when a process wants to enter the critical section,
       | it first looks at the numbers (or tickets) assigned to other
       | processes to determine its own number. The process chooses a
       | number that is one higher than the highest number it observes.
       | 
       | Now, if a process reads a garbage value due to a concurrent
       | write, it typically just ignores this value. This is because the
       | garbage value is unlikely to be higher than the maximum valid
       | number it can read from other processes. As a result, this
       | garbage value does not impact the process's decision about what
       | number to take for itself.
       | 
       | The key is that the process bases its decision on the valid
       | numbers it can read. The algorithm does not depend on every read
       | being perfect. If a process cannot determine a clear maximum due
       | to a garbage value, it bases its decision on the highest valid
       | number it can discern.
       | 
       | The process then enters a waiting phase where it checks if it's
       | its turn to enter the critical section, based on its number and
       | others'. Again, the presence of a garbage value in one of the
       | reads does not impede this process, as the algorithm only
       | requires a relative ordering based on numbers.
       | 
       | Despite the possibility of reading invalid values, the bakery
       | algorithm maintains safety (no two processes are in the critical
       | section simultaneously) and liveness (every process eventually
       | gets its turn). This is because its core logic - ordering based
       | on numbers and waiting for its turn - remains intact.
        
         | vlovich123 wrote:
         | But stochastically, it's possible you're given a legal value
         | that violates mutual exclusion, no?
         | 
         | I'm very fuzzy as to how the busy wait loop that checks all
         | sibling threads for their lock counter number can distinguish
         | this scenario. Cause you could be TID 1, and obtain ticket X
         | even though TID 2 got there first and entered mutual exclusion
         | because at the time when it checked you weren't even assigned a
         | ticket and entered mutual exclusion... I'm missing something
         | subtle about the algorithm.
        
           | eigenvalue wrote:
           | The concern is that if a process reads a partially updated
           | ticket number from another process, it might end up with a
           | ticket number that falsely represents its position in the
           | queue. The algorithm is designed to handle such cases:
           | 
           | If the read value is too low (because of a partial update),
           | the process will wait longer than necessary, which doesn't
           | violate mutual exclusion but may affect fairness temporarily.
           | 
           | If the read value is too high, it doesn't allow the reading
           | process to jump ahead in the queue unfairly.
           | 
           | The key point is that even if a process reads a partially
           | updated value, it either waits longer than necessary or gets
           | a number that still respects the ordering. This is because
           | the algorithm uses relative ordering (i.e., based on
           | comparing ticket numbers) rather than absolute values.
        
       | valray wrote:
       | If I understand this correctly, the algorithm depends on the
       | underlying platform to ensure correct ordering. Two processes may
       | each get the same bakery ticket number. The conflict is resolved
       | by the process with the lower process id being allowed to go
       | first. This process ID is in effect a ticket number issued by the
       | underlying hardware. If the processes were running in a
       | distributed environment, there would not be a common underlying
       | platform to resolve this conflict.
        
         | eigenvalue wrote:
         | In that context you could simply pick another thing as the tie
         | breaker, like the IP address expressed as an integer (ie,
         | remove the periods and treat like a number) being lower.
        
           | valray wrote:
           | Ah, yes, that would work.
           | 
           | But then there is reliance on an underlying platform to
           | resolve the conflict by providing a globally unique
           | identifier. Instead of process ID, there is the IP address
           | mechanism.
           | 
           | So it's turtles all the way down, if I understand correctly.
        
             | eigenvalue wrote:
             | Not really. Each process could, at the beginning, generate
             | a random nonce and broadcast it and thereafter use that as
             | its tiebreaker number. The point is to just have a
             | consistent way to break ties that is convenient to
             | implement in whatever domain you're working in. And in a
             | networked domain, without a valid IP address, the worker
             | isn't going to be doing much in the process anyway since it
             | wouldn't be able to communicate at all.
        
       | brchr wrote:
       | Leslie Lamport gives a short oral history of the Bakery Algorithm
       | in Episode 3 of "Algorithms at Work":
       | https://www.audible.com/pd/Episode-3-Concurrency-How-to-Coor...
        
       ___________________________________________________________________
       (page generated 2024-01-20 23:00 UTC)