Subj : Re: Lock Free -- where to start To : comp.programming.threads From : Chris Thomasson Date : Sun Oct 02 2005 08:46 pm > Also remember that MOST (though not all) of the time required to acquire a > lock is the exact same time (and machine instructions) necessary for a > lock-free transaction. There ARE real savings, especially on machines like > Alpha, PowerPC, Itanium, where you can often avoid expensive "fence" or > "barrier" instructions that are critical for locks but not necessarily for > (some) lock-free operations; but that isn't much of a > factor on a Pentium. ^^^^^^^ mfence aside, sometimes a locked atomic operation ( full fence ) will actually lock the bus. All requests will be blocked until the atomic operation has completed. This can have a system wide effect, and can reduce "performance". > Sometimes you can reliably read without any memory system synchronization, > but you can't ever reliably write that way if the result needs to be > shared. Yes, generally writes use atomic operations and membars. Just to be clear, when your readers are not using any membars you can actually perform a high number of modifications in a number of ways. For instance, you could simply use a mutex to atomically remove the node representing the object from a collection, please note that this does not effect concurrent readers because the object itself has not been modified. You then send the node through the "collector", and when it comes out on the other side you know that there are no outstanding references. Now you can safely modify the object and use a mutex to push it back into the collection. Another way to perform a write is to pop the node; create a copy; modify the copy; membar; push the copy; send node through the "collector". You could use a mutex to protect the entire operation. Another method is to use an eventcount. Writers would use a mutex to isolate the object; perform its modifications; and increment the eventcount. Readers would read the eventcount; read the data; read the eventcount again and retry the read operation if the counts were not equal. Please note that the eventcount method adds some overhead to the readers... > The first SMP machine I programmed (a short-lived and long gone variant of > the PDP-11) had only one instruction that synchronized memory across > processors: ASRB (arithmetic shift right, byte). That was it. The only > "lock free" operation you could do was an arithmetic divide by a power of > two (and only for very small ranges of data). Humm... You could actually use asrb to implement my Virtually Zero-Overhead Object Management (VZOOM) solution! :) > That's as close as I've come to a machine that only implements "a lock" in > hardware. Once you get to "compare and swap", you gain an extremely > powerful tool to operate on real data instead of a lock -- and we've > finally reached the point where this flexibility is pretty nearly > universal. And that enables lock-free programming to become something more > than a curiosity. Agreed. .