Subj : Re: Memory visibility and MS Interlocked instructions
To : comp.programming.threads
From : Alexander Terekhov
Date : Fri Aug 26 2005 06:37 pm

Marcin 'Qrczak' Kowalczyk wrote:
>
> "David Schwartz" writes:
>
> >> But this would require that MS insert membars on all volatile
> >> accesses, because there is, in general, no way to know whether
> >> another part of the program uses an interlocked instruction.
> >> Do MS and MS-compatible compilers really do that?
> >
> > On x86, it's not needed. I'm not sure about other platforms.
>
> I guess MS doesn't care about other platforms.

Actually MS doesn't seem to care much about the implications on X86
either. Regarding making C/C++ volatiles sequentially consistent [SC]
like in revised Java (and its .Net clone so to speak):

Consider having thread A execute the following, where initially x, y
and z are all zero:

  atomic_store(&x, 1);     [annotation: SC volatile store]
  r1 = atomic_load(&y);    [annotation: SC volatile load]
  if (!r1) ++z;

while thread B executes:

  atomic_store(&y, 1);     [annotation: SC volatile store]
  r2 = atomic_load(&x);    [annotation: SC volatile load]
  if (!r2) ++z;

Under a sequentially consistent interpretation, one of the atomic
store operations must execute first. Hence r1 and r2 cannot both be
zero, and hence there is no data race involving z. There are data
races involving x and y, but those accesses are made through the
atomic operations library, and hence must be allowed. Atomic accesses
are not meaningful if there is no data race.

The difficulty is that there are strong reasons to support variants of
atomic store and atomic load that allow them to be reordered, i.e.
that allow the atomic load to become visible to other threads before
the atomic store. For example, preventing this reordering on some
common X86 processors incurs a penalty of over 100 processor cycles in
each thread.
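
For readers who want to try the sequentially consistent case, here is a minimal sketch in C++11, whose std::atomic postdates this discussion. It assumes that store and load with std::memory_order_seq_cst can stand in for the atomic_store and atomic_load of the two-thread program above. The names Outcome, run_once and count_both_zero are hypothetical helpers invented for this illustration, not part of any library or of the quoted proposal.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Result of one run of the two-thread program.
struct Outcome {
    int r1;
    int r2;
    int z;
};

// Hypothetical helper: runs thread A and thread B once. x and y are
// sequentially consistent atomics; z is an ordinary int.
Outcome run_once() {
    std::atomic<int> x{0};
    std::atomic<int> y{0};
    int z = 0;
    int r1 = -1;  // written only by thread A, read after join
    int r2 = -1;  // written only by thread B, read after join

    std::thread a([&] {
        x.store(1, std::memory_order_seq_cst);
        r1 = y.load(std::memory_order_seq_cst);
        if (!r1) ++z;
    });
    std::thread b([&] {
        y.store(1, std::memory_order_seq_cst);
        r2 = x.load(std::memory_order_seq_cst);
        if (!r2) ++z;
    });
    a.join();
    b.join();
    return Outcome{r1, r2, z};
}

// Hypothetical helper: repeats the experiment and counts the runs in
// which both loads returned zero. Under SC this count must stay 0.
int count_both_zero(int runs) {
    int both_zero = 0;
    for (int i = 0; i < runs; ++i) {
        Outcome o = run_once();
        // At most one thread can have taken the ++z branch.
        assert(o.z == 0 || o.z == 1);
        if (o.r1 == 0 && o.r2 == 0) ++both_zero;
    }
    return both_zero;
}
```

With seq_cst on all four operations, at most one thread ever reaches ++z, so the plain int is safe here. Weakening the orderings to release stores and acquire loads would permit r1 == r2 == 0, at which point the two increments of z race and the program has undefined behaviour, so that variant is deliberately not shown as runnable code.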
Both the ordinary load and store operations, as well as the acquire
and release versions from the preceding section, will allow this
reordering. For variants that allow reordering, the above program
should really invoke undefined semantics, since r1 and r2 can both be
zero, and hence there is a data race on the ordinary variable accesses
to z.

To Joe: why don't you simply spend your weekend studying the archives.

jupiter.robustserver.com/pipermail/cpp-threads_decadentplace.org.uk

regards,
alexander.