Subj : Re: STL Containers w/ Virtually Zero-Overhead Lock-Free Reads... To : (Usenet) From : Chris Thomasson Date : Sun Sep 25 2005 11:08 am >> a well built generic lock-free read container doesn’t >> need to rely on internally defined forms of mutual exclusion in order to >> provide "read-only access" that doesn't use any nasty atomic operations >> and/or StoreLoad style memory barriers... > > Of course. But the effort you put into making your vector lock-free is > not needed to make the above work, so in effect it goes to waste. Correct. > , so in effect it goes to waste. My solution, wrt STL containers, doesn’t involve lock-free or lock-based writes. It’s strictly a solution for readers. > And even lock free primitives like CAS are significantly more expensive > than their "plain old" counterparts, so making the vector lock free > actually imposes an efficiency cost for no benefit in this case. Humm... I wonder what “plain old” counterparts you might be referring to? I totally agree that atomic operations are very expensive for modern processors to execute. This is one of the main reasons why I am working on completely eliminating them from read-only iterations. > Anyway "read-only access" usually works just fine even with a > completely thread-unaware container Humm... This is not really true in a multi-threaded environment... What happens if a writer thread pops a node and frees it while there are a couple of reader’s threads concurrently performing iterations? How can we efficiently and portably ensure that the readers don’t segfault? I will refer to this specific situation as the “reader/writer problem”. Now, you might be thinking that shared_ptr could be a viable solution to the problem. Unfortunately, it’s simply not compatible; believe it or not, shared_ptr is not a “true” atomic reference count: http://groups.google.com/group/comp.lang.c++.moderated/msg/0c6f5c5ab72164d0?hl=en http://groups.google.com/group/comp.programming.threads/msg/a60ce03f406dfcb5 http://groups.google.com/group/comp.programming.threads/msg/93983d56839ffb45 Even if shared_ptr could handle this type of reference counting, it still relies on expensive atomic operations and StoreLoad style memory barriers for virtually every reference modification. This produces a high amount of unneeded overhead. I guess one could use full-blown garbage collection. IMHO, that would be overkill and, unfortunately, garbage collectors still produce unneeded overhead in the form of stop-the-world techniques, memory barriers, ect... Luckily, all is not lost! :) There are some existing solutions to the reader/writer problem: IBM: Read, Copy and Update (RCU) IBM: Safe Memory Reclamation (SMR) SUN: Repeat Offender Problem (ROP) All of these work; however, they are not practical for any sort of “generic solution”. Here are just a couple of reasons why: http://groups.google.com/group/comp.programming.threads/msg/b28bdb0051e2c2c6 > , so I'm not sure what you're driving at here. I am very interested in incorporating a flexible solution to the reader/writer problem into STL container logic. This would give them the inherent ability to perform “extremely well” on various modern multi-processor systems. >>> Suppose we had lock-free vectors. Now one thread calls >>> sort(v.begin(), v.end()) while another calls v.push_back(5). >> >> These would be considered writes. > > Yes. > >>> What does either thread see? >> >> It depends on which thread executes first; remember writes are mutually >> excluded. > > The individual writes to the vector are mutually excluded (when you > have control over them -- which you don't, since the sort algorithm > gets random access iterators and can thus grab addresses and > references into the vector), but the call to sort consists of many > individual writes to the vector. Even if you _could_ mutually exclude > each one (which you can't, for reasons I just described) there's no > reason to think that push_back doesn't sneak between two of them. The calls to sort and push_back could be protected by an external mutex. The individual writes would fall under the same critical-section; therefore they would be atomic. Please note, I am not advocating lock-free writes in any way, shape or form. You are 100% correct in your assertion that lock-free writes are generally useless wrt STL Containers; however, this is not true for lock-free reads... >>> It seems to me that even if it's possible to >>> come up with coherent semantics for this case (which in itself is >>> debatable), it's impossible to come up with _useful_ semantics. >>> If containers are to be shared across threads, their state will almost >>> always be just a small piece of some larger invariant, in which case >>> you'll need to synchronize or "lock-free-ize" at a different level. >> >> This is completely compatible with virtually zero-overhead lock-free >> reads. > > Yes, but it renders wasteful any special measures taken or cycles > spent to do lock free reads. How so? In the realm of lock-free reads, special measures simply have to be taken if you don’t want your collection to basically be a massive race-condition. Writers need to be specially coordinated with the readers in order to solve the nasty reader/writer problem... >> Many existing application defined algorithms that decided to use STL >> containers in multithreaded environments use mutexs to protect everything >> anyway; therefore they are already compatible with the requirements of >> the >> highly flexible lock-free algorithm I am using. Application defined >> algorithm can literately pick and choose exactly which if their >> "read-only >> operations" can be executed without a lock. For instance, a thread that >> makes frequent read-only const_iteration's over a shared collection would >> be >> a great candidate to "lock-free-ize"... > > I thought you were "lock-free-izing" the container. Not the entire container. I just think that they should give an application programmer the ability to “lock-free-ize” read-only iterations. That's all. > If so, you have > done nothing to protect the entire process of iterating over it from > being disturbed by writes. This is exactly where a reader/writer solution would come into play. They can directly address this type of situation. > If not, maybe you should revise your description of what you're proposing. I agree. I have a feeling that I am not sufficiently and clearly stating my case. For that, I am sorry. Let me attempt to clear some things up: 1. I am proposing incorporating a solution to the reader/writer problem into the STL container logic. 2. I am not advocating lock-free writes in any way, shape, or form. I think that STL containers should provide the ability for lock-free read-only access to application programmers. This would allow STL containers to efficiently address multi-processor systems. I believe this ability would further help C++ enter the world of high-performance multi-threading... Any thoughts? BTW: Thank you, David, for your input, patience and more importantly, your time. I really do appreciate it. :) [ See http://www.gotw.ca/resources/clcm.htm for info about ] [ comp.lang.c++.moderated. First time posters: Do this! ] .