1419
Subj : Using lock-free PCOW (Partial Copy On Write)
To   : comp.programming.threads
From : Joe Seigh
Date : Mon Mar 28 2005 07:32 am

This is a rough draft of a write up on how to do
lock-free PCOW so it goes without saying that it
needs a little more work.  For instance I need to
write go into semantics a little more.

-----------------------------------------------

Lock-free PCOW (Partial Copy On Write) --

While the rules are fairly well understood for lock-free
programming using COW (Copy On Write), the rules for lock-
free programming using PCOW (Partial Copy On Write) the
rules are not so well understood.

For COW, the writer threads
1)	make a new copy of the object,
2)	atomically change with release semantics, pointers to
the old object to the new object,
3)	use some form of GC, e.g. RCU, hazard pointers, atomic
reference counting, Boehm style GC, etc…, to delete
the old object when no reader threads are accessing
it.

Reader threads must access the object only once through one
of the pointers with acquire semantics.

The pointers to the object are what will be called
serialization points.


The rules are simple.  You have to be able to access the
data structure through one or more what I will call
serialization points, which are atomic.  And the updates to
the data structure have to be atomically done at those
points as well.

PCOW Linked Lists --

A common example of a PCOW object would be a linked list.
In this case, the serialization points would be the list
anchor and the forward pointer links, those being the
pointers a read only list traversal would use.  The obvious
question at this point is how can a serialization point be
in the list since you've already accessed the list to get
to it?  The answer is it can't unless you resort to a
recursive definition of a list.  That is a list is a node
and a sublist.  That makes the forward pointer in a node a
serialization point for the sublist following the node.  In
this case the semantics of the list must be preserved.  So
if a serialization point splits a list into two parts,
listA and listB, and listB' is the new list formed by
adding or deleting a node at the list head, then (listA,
listB') must have valid semantics as well.

So to insert a new node in a list, you'd atomically set at
the insertion point the sublist comprised of the  new node
and the sublist following the insertion point.

To delete a node, you'd atomically set at the deletion
point prior to the node being deleted to the sublist
following the deleted node and then GC the deleted node.

PCOW Trees --

For a PCOW tree, the serialization points would depend on
how read access traversal is performed.  Any traversal has
to be performed such that nodes are only visited once and
each subtree pointer only accessed once.  Most tree
searches and traversals are performed this way.  The big
problem is tree semantics.  A binary search tree
subpartitions the search space of nodes on each successive
node visit.  Simple addition or removal of leaf nodes is no
problem.  Interior nodes are problematic if the new subtree
has to have nodes moved around to preserve tree properties.
In this case, the moved nodes (any node which has its links
changed) have to be copied instead and the resulting
subtree atomically set through its serialization point,
i.e. the link pointer of its parent node.  The node
containing the serialization point isn't considered a moved
node and doesn't get copied.

Generally for a tree, you're only going to have one
serialization point.

Example.

        A
      /   \
    B      C
   /  \
D    E
      / \
    F    G


Removing node E would give you


        A
      /   \
    B      C
   /  \
D    F'
        \
         G

where the serialization point was B's right subtree
pointer and F' is a copy of F which was moved.


Note:  if you end up with two disjoint subtrees and two
serialization points, then you have in effect two separate
atomic operations.  If the object semantics cannot handle
two separate atomic operations, then you have to combine
the two into a single subtree, meaning copying parent nodes
to the common subtree root.

Lists revisited, moving list nodes --

An example of swapping two list nodes (if that makes any
sense).

A -> B -> C -> D

Swapping B and C gets you

A -> C' -> B' -> D

where A's link pointer is the serialization point.



Read access with dependent load ordering vs. load acquire
ordering --

Generally, accesses through a serialization point require
acquire memory semantics.  If dependent load memory
semantics are available then dependent loads can be used
but they must be used to access every node reachable from
the serialization point since visibility is guaranteed only
on a per node basis.

For example with a queue with the queue head as the only
insertion point, load acquire is required for the queue
head and normal loads for the rest of the list, or
dependent loads for the queue head and the rest of the
list.


-- 
Joe Seigh

.

0