Subj : Re: Memory visibility and MS Interlocked instructions
To   : comp.programming.threads
From : Peter Dimov
Date : Fri Aug 26 2005 02:41 pm

Scott Meyers wrote:

[...]

> What's important about my understanding is that guaranteeing that R sees the
> correct value depends on both W and R taking actions. It's not enough for
> W alone to use a membar or a lock, and it's not enough for R alone to use a
> membar or a lock: both must do something to ensure that W's write is
> visible to R.
>
> This model does not seem to be consistent with the documented semantics of
> Microsoft's Interlocked instructions at
> http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dllproc/base/about_synchronization.asp,
> which "ensure that previous read and write requests have completed and are
> made visible to other processors, and ensure that that no subsequent read
> or write requests have started." The page gives this example:
>
>   BOOL volatile fValueHasBeenComputed = FALSE;
>
>   void CacheComputedValue()
>   {
>     if (!fValueHasBeenComputed)
>     {
>       iValue = ComputeValue();
>       InterlockedExchange((LONG*)&fValueHasBeenComputed, TRUE);
>     }
>   }
>
>   The InterlockedExchange function ensures that the value of iValue is
>   updated for all processors before the value of fValueHasBeenComputed is
>   set to TRUE.
>
> What confuses me is that here only the writer needs to take an action to
> guarantee that readers will see things in the proper order, where my
> understanding had been that the reader, too, would have to take some action
> before reading iValue and fValueHasBeenComputed to ensure that it didn't
> get a stale value for one or both.

Correct. Consider this reader:

  if( fValueHasBeenComputed )
  {
    // do something with iValue
  }

Nothing stops the compiler or the hardware from reading iValue before
fValueHasBeenComputed.
Even if the writes to iValue and fValueHasBeenComputed are ordered by the
writer, it is still possible for the reader to access an uninitialized
iValue because of a speculative load.

This can be fixed with a mutex, in which case the lock release/acquire
"handshake" will provide the necessary ordering. It can also be fixed by
attaching an "acquire" label to the load of fValueHasBeenComputed, which
will prevent the speculative execution of the subsequent loads (and stores,
but these generally don't cross a conditional branch).
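
A minimal sketch of the "acquire" fix, for illustration only. It uses C++11 std::atomic (which postdates this thread) in place of the Win32 Interlocked call, so it is a substitute technique rather than the MSDN code. ComputeValue's body, TryReadValue and RunOnce are hypothetical names made up for the example, and a single writer thread is assumed.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Stand-ins for the names in the quoted example (types are assumptions).
static int iValue = 0;                                  // plain, non-atomic data
static std::atomic<bool> fValueHasBeenComputed{false};  // publication flag

static int ComputeValue() { return 42; }  // placeholder computation

// Writer: write the data, then publish the flag with release semantics,
// so the write to iValue cannot be reordered after the flag store.
// Assumes only one thread ever calls this function.
void CacheComputedValue() {
    if (!fValueHasBeenComputed.load(std::memory_order_relaxed)) {
        iValue = ComputeValue();
        fValueHasBeenComputed.store(true, std::memory_order_release);
    }
}

// Reader: the acquire load keeps the later read of iValue from being
// performed (or speculated) ahead of the load of the flag.
bool TryReadValue(int& out) {
    if (fValueHasBeenComputed.load(std::memory_order_acquire)) {
        out = iValue;
        return true;
    }
    return false;
}

// Hypothetical driver: one reader spins until the flag is set while one
// writer publishes the value; returns what the reader observed.
int RunOnce() {
    int seen = -1;
    std::thread reader([&seen] {
        while (!TryReadValue(seen)) {
            std::this_thread::yield();
        }
    });
    std::thread writer(CacheComputedValue);
    writer.join();
    reader.join();
    assert(seen == ComputeValue());  // reader saw the published value
    return seen;
}
```

Both halves are needed: the release store orders the writer's side, and the acquire load orders the reader's side. Dropping either one to memory_order_relaxed would reintroduce the problem described above.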