1419 Subj : Using lock-free PCOW (Partial Copy On Write) To : comp.programming.threads From : Joe Seigh Date : Mon Mar 28 2005 07:32 am This is a rough draft of a write up on how to do lock-free PCOW so it goes without saying that it needs a little more work. For instance I need to write go into semantics a little more. ----------------------------------------------- Lock-free PCOW (Partial Copy On Write) -- While the rules are fairly well understood for lock-free programming using COW (Copy On Write), the rules for lock- free programming using PCOW (Partial Copy On Write) the rules are not so well understood. For COW, the writer threads 1) make a new copy of the object, 2) atomically change with release semantics, pointers to the old object to the new object, 3) use some form of GC, e.g. RCU, hazard pointers, atomic reference counting, Boehm style GC, etc…, to delete the old object when no reader threads are accessing it. Reader threads must access the object only once through one of the pointers with acquire semantics. The pointers to the object are what will be called serialization points. The rules are simple. You have to be able to access the data structure through one or more what I will call serialization points, which are atomic. And the updates to the data structure have to be atomically done at those points as well. PCOW Linked Lists -- A common example of a PCOW object would be a linked list. In this case, the serialization points would be the list anchor and the forward pointer links, those being the pointers a read only list traversal would use. The obvious question at this point is how can a serialization point be in the list since you've already accessed the list to get to it? The answer is it can't unless you resort to a recursive definition of a list. That is a list is a node and a sublist. That makes the forward pointer in a node a serialization point for the sublist following the node. In this case the semantics of the list must be preserved. So if a serialization point splits a list into two parts, listA and listB, and listB' is the new list formed by adding or deleting a node at the list head, then (listA, listB') must have valid semantics as well. So to insert a new node in a list, you'd atomically set at the insertion point the sublist comprised of the new node and the sublist following the insertion point. To delete a node, you'd atomically set at the deletion point prior to the node being deleted to the sublist following the deleted node and then GC the deleted node. PCOW Trees -- For a PCOW tree, the serialization points would depend on how read access traversal is performed. Any traversal has to be performed such that nodes are only visited once and each subtree pointer only accessed once. Most tree searches and traversals are performed this way. The big problem is tree semantics. A binary search tree subpartitions the search space of nodes on each successive node visit. Simple addition or removal of leaf nodes is no problem. Interior nodes are problematic if the new subtree has to have nodes moved around to preserve tree properties. In this case, the moved nodes (any node which has its links changed) have to be copied instead and the resulting subtree atomically set through its serialization point, i.e. the link pointer of its parent node. The node containing the serialization point isn't considered a moved node and doesn't get copied. Generally for a tree, you're only going to have one serialization point. Example. A / \ B C / \ D E / \ F G Removing node E would give you A / \ B C / \ D F' \ G where the serialization point was B's right subtree pointer and F' is a copy of F which was moved. Note: if you end up with two disjoint subtrees and two serialization points, then you have in effect two separate atomic operations. If the object semantics cannot handle two separate atomic operations, then you have to combine the two into a single subtree, meaning copying parent nodes to the common subtree root. Lists revisited, moving list nodes -- An example of swapping two list nodes (if that makes any sense). A -> B -> C -> D Swapping B and C gets you A -> C' -> B' -> D where A's link pointer is the serialization point. Read access with dependent load ordering vs. load acquire ordering -- Generally, accesses through a serialization point require acquire memory semantics. If dependent load memory semantics are available then dependent loads can be used but they must be used to access every node reachable from the serialization point since visibility is guaranteed only on a per node basis. For example with a queue with the queue head as the only insertion point, load acquire is required for the queue head and normal loads for the rest of the list, or dependent loads for the queue head and the rest of the list. -- Joe Seigh . 0