Subj : Re: Proxy collector based on RCU+SMR hazard pointers
To   : comp.programming.threads
From : Joe Seigh
Date : Mon Aug 15 2005 03:18 pm

Joe Seigh wrote:
> d|dq wrote:
>> What is it? :)
>>
>> Please send me something to read that will tell me more about the
>> background of this information.
> 
> 
> The RCU+SMR stuff is here
> http://atomic-ptr-plus.sourceforge.net/
> 
> The atomic_ptr proxy stuff is in the atomic-ptr-plus package in
> appc.h which uses atomic_ptr.  I haven't published the proxy
> collector based on RCU+SMR yet.  I should probably test it a
> bit more first.
> 
There was a bug, but it was a testcase bug.  However, I noticed the
API is not quite right.  You need to maintain exclusive access (keep
holding the mutex) when calling smr_defer() for delinked nodes, so that
nodes are queued for deletion in the same FIFO order in which they were
delinked.  This is also true for appc (atomic pointer proxy collector),
but it is a little trickier there since you might get deadlock if the
dtor tries to get the same lock.  There's a simple workaround for that,
but I don't really support appc anymore.

So the smrsample code would change from

		pthread_mutex_lock(&mutex);
		p = lfq_dequeue(&q);
		pthread_mutex_unlock(&mutex);

		if (p) {
			node = containerof(p, node_t, link);
			node->seqnum++;

			//
			// free item
			//
			node->defer.func = &defer_free;
			node->defer.arg = p;
			smr_defer(&(node->defer));
		}


to
		pthread_mutex_lock(&mutex);
		p = lfq_dequeue(&q);

		if (p) {
			node = containerof(p, node_t, link);
			node->seqnum++;

			//
			// free item
			//
			node->defer.func = &defer_free;
			node->defer.arg = p;
			smr_defer(&(node->defer));
		}
		pthread_mutex_unlock(&mutex);
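
A minimal sketch of why the unlock has to move below smr_defer().  This is
a toy model, not the smrsample code: toy_dequeue() and toy_defer() are
hypothetical stand-ins for lfq_dequeue() and smr_defer(), and the "queue"
is just a counter.  With the dequeue and the defer in one critical
section, the order recorded by toy_defer() always matches dequeue order,
however the writer threads are scheduled.

```c
#include <pthread.h>

#define NITEMS   1000
#define NWRITERS 4

static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static int next_item;			/* toy queue: yields 0, 1, 2, ... in FIFO order */
static int defer_log[NITEMS];		/* order in which items reached toy_defer() */
static int defer_count;

/* Hypothetical stand-in for lfq_dequeue(); returns -1 when empty. */
static int toy_dequeue(void) { return next_item < NITEMS ? next_item++ : -1; }

/* Hypothetical stand-in for smr_defer(); it only records arrival order. */
static void toy_defer(int item) { defer_log[defer_count++] = item; }

static void *writer(void *arg)
{
	int item;

	(void)arg;
	do {
		pthread_mutex_lock(&mutex);
		item = toy_dequeue();
		if (item >= 0)
			toy_defer(item);	/* still inside the critical section */
		pthread_mutex_unlock(&mutex);
	} while (item >= 0);
	return NULL;
}

/* Returns 1 if every item was deferred in exactly the order it was dequeued. */
int run_fifo_demo(void)
{
	pthread_t t[NWRITERS];
	int i, started = 0;

	next_item = 0;
	defer_count = 0;
	for (i = 0; i < NWRITERS; i++)
		if (pthread_create(&t[started], NULL, writer, NULL) == 0)
			started++;
	for (i = 0; i < started; i++)
		pthread_join(t[i], NULL);

	if (started == 0 || defer_count != NITEMS)
		return 0;
	for (i = 0; i < NITEMS; i++)
		if (defer_log[i] != i)
			return 0;
	return 1;
}
```

If the unlock is moved back up between toy_dequeue() and toy_defer(), one
writer can dequeue item 0, another can dequeue item 1, and the second can
reach toy_defer() first, so the deferred order becomes 1, 0.  (In this toy
the unlocked toy_defer() would also be a plain data race.)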

With proxy smr, the writer code looks like

		pthread_mutex_lock(&mutex);
		p2 = lfq_dequeue(&q);

		if (p2) {
			p = proxy;
			proxy = p2;
			node = containerof(p, node_t, link);
			node->seqnum++;

			//
			// free the previous proxy (p), not the node just delinked
			//
			node->defer.func = &defer_free;
			node->defer.arg = p;
			smr_defer(&(node->defer));
		}
		pthread_mutex_unlock(&mutex);

You just swap the newly delinked node with the current proxy node and
defer the free of the old proxy instead.
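
The swap on its own can be sketched as below.  The names proxy_swap() and
dummy are hypothetical, and starting the proxy out as a dummy node that
was never in the list is an assumption; the post doesn't say how the
proxy is initialized.  The store to proxy is a plain assignment here; real
code would need whatever ordering smrload() expects on the reader side.

```c
#include <stddef.h>

typedef struct node {
	struct node *next;
	int id;
} node_t;

/* Assumption: the proxy starts as a dummy node that was never queued. */
node_t dummy = { NULL, -1 };
node_t *proxy = &dummy;

/*
 * Install the newly delinked node as the proxy and return the previous
 * proxy, which is the node to hand to the deferred free.
 * The caller is assumed to hold the writer mutex.
 */
node_t *proxy_swap(node_t *delinked)
{
	node_t *old = proxy;

	proxy = delinked;
	return old;
}
```

The effect is that every delinked node's free lags by one dequeue: the
node passed to the deferred free is always the previous proxy, and the
node just delinked sits in the proxy slot until the next dequeue.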
Reader code changes from

		for (p = smrload(local, &q.tail);
			p != NULL;
			p = smrload(local, &(p->next)))
		{
		  ...
		}
		smrnull(local);		// clear hazard pointer


to
		smrload(local, &proxy);	// one hazard pointer, on the proxy, for the whole traversal
		for (p = atomic_load_depends(&q.tail);
			p != NULL;
			p = atomic_load_depends(&(p->next)))
		{
		   ...
		}
		smrnull(local);		// clear hazard pointer
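
A single-threaded toy model of why one hazard pointer on the proxy can
cover the whole traversal.  It assumes deferred frees are processed
strictly oldest-first and stop at the first node still covered by a
hazard pointer, which is how I read the FIFO requirement above; the real
smr code may well work differently.  All toy_* names are hypothetical,
and there is only one reader slot.

```c
#include <stddef.h>

#define MAXDEFER 16

typedef struct node {
	int id;
	int freed;		/* set instead of actually calling free() */
} node_t;

static node_t *hazard;			/* the single reader's hazard pointer */
static node_t *fifo[MAXDEFER];		/* deferred nodes, oldest first */
static size_t head, tail;

/* Reader side: protect the current proxy, then drop the protection. */
void toy_protect(node_t *p) { hazard = p; }
void toy_unprotect(void)    { hazard = NULL; }

/* Writer side: queue a retired node (writer mutex assumed held). */
int toy_defer(node_t *n)
{
	if (tail == MAXDEFER)
		return -1;
	fifo[tail++] = n;
	return 0;
}

/*
 * Free deferred nodes oldest-first and stop at the first one that is
 * still protected.  Returns the number of nodes freed.
 */
size_t toy_reclaim(void)
{
	size_t n = 0;

	while (head < tail && fifo[head] != hazard) {
		fifo[head++]->freed = 1;
		n++;
	}
	return n;
}
```

In this model, if a reader protects the proxy and the writer then retires
that proxy followed by later nodes, nothing is reclaimed until the reader
clears its hazard pointer, because the protected proxy sits at the head
of the FIFO and everything the reader can still reach was deferred after
it.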


On my Linux box, with the readers running all out, reads/sec/thread are approximately
  16000 for rcu+smr
  19000 for appc
  22000 for rcu+smr proxy

It probably doesn't matter, since it's a linked list, and if you want
performance you should be looking at a more appropriate data structure.
  

-- 
Joe Seigh

When you get lemons, you make lemonade.
When you get hardware, you make software.
