Subj : Re: Memory visibility and MS Interlocked instructions
To   : comp.programming.threads
From : Alexander Terekhov
Date : Wed Aug 31 2005 01:09 pm

Mayan Moudgill wrote:
[...]
> Having looked at Andy's answer, it appears that the following is possible:
>
> Y,Z are initially 0.
> processor 1 writes Y with 2.
> processor 2 reads 2 from Y and writes Z with that value.
> processor 3 reads 2 from Z and 0 from Y.
>
> This is the "obvious" behavior assuming a processor with in-order stores
> (via a store buffer) using ...

Sure. Contrast it with (that's what I thought Joe was driving at when
he alleged that PC is somewhat more relaxed than RC... I mean his
claim that "processor consistency doesn't give you acquire and release
as they are commonly understood"):

X,Y,Z are initially 0.
processor 1 writes X with 42.
processor 1 writes Y with 2.
processor 2 reads 2 from Y and writes Z with that value.
processor 3 reads 2 from Z and 0 from X.

Boom! PC doesn't allow that behavior because before processor 1 is
allowed to perform store to Y *with respect to any other processor*,
preceding store to X must be performed *with respect to all other
processors* ("as if" of course). See the PC "conditions"
(1990-rc-isca.pdf) under "Extension to Dubois’ Abstraction"
(1993-tr-68.pdf) "Performing a Memory Request"? terms.

http://research.compaq.com/wrl/people/kourosh/papers/1995_thesis.pdf

"The models we have considered up to now all provide the appearance
of a single copy of memory to the programmer. Processor consistency
(PC) [GLL+90, GGH93b] is the first model that we consider where the
multiple-copy aspects of the memory are exposed to the programmer. [1]

[1] The processor consistency model described here is distinct from
the (informal) model proposed by Goodman [Goo89, Goo91].

[...]

The conceptual system consists of several processors each with their
own copy of the entire memory.
By modeling memory as being replicated at every processing node, we
can capture the non-atomic effects that arise due to presence of
multiple copies of a single memory location. Since the memory no
longer behaves as a single logical copy, we need to extend the notion
of read and write memory operations to deal with the presence of
multiple copies. Read operations are quite similar to before and
remain atomic. The only difference is that a read is satisfied by the
memory copy at the issuing processor's node (i.e., read from Pi is
serviced by Mi). Write operations no longer appear atomic, however.
Each write operation conceptually results in all memory copies
corresponding to the location to be updated to the new value.
Therefore, we model each write as a set of n sub-operations,
W(1) ... W(n), where n is the number of processors, and each
sub-operation represents the event of updating one of the memory
copies (e.g., W(1) updates the location in M1)."

regards,
alexander.
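
The two examples above can be checked mechanically against the
replicated-memory picture from the quoted thesis passage. What follows
is only a rough sketch in Python, not the formal PC definition from
the cited papers. All names in it (put, reachable_outcomes, allows,
the registers r2y, r3z, r3y, r3x) are made up for illustration. It
assumes that a store updates the issuing processor's own copy at once,
that a processor's later store may not reach any other copy until its
earlier store has reached all of them, and that each location has a
single writer (coherence between competing writers is not modeled).

```python
def put(t, i, v):
    """Return a copy of tuple t with element i replaced by v."""
    return t[:i] + (v,) + t[i + 1:]


def reachable_outcomes(programs, locations):
    """Enumerate final register values over all interleavings.

    programs: one list per processor of ("ld", loc, reg) or
    ("st", loc, src) ops, where src is an int or a register name.
    """
    n = len(programs)
    idx = {name: i for i, name in enumerate(locations)}
    start = (
        (0,) * n,                                        # program counters
        tuple((0,) * len(locations) for _ in range(n)),  # copies M1..Mn
        ((),) * n,                                       # pending stores
        (),                                              # (reg, value) pairs
    )
    seen, stack, outcomes = {start}, [start], set()
    while stack:
        pcs, mems, queues, regs = stack.pop()
        if all(pcs[p] == len(programs[p]) for p in range(n)):
            outcomes.add(regs)  # registers can no longer change
            continue
        successors = []
        for p in range(n):
            if pcs[p] < len(programs[p]):
                # Execute processor p's next instruction in program order.
                kind, loc, arg = programs[p][pcs[p]]
                i, new_pcs = idx[loc], put(pcs, p, pcs[p] + 1)
                if kind == "ld":
                    # A read is serviced by the issuer's own copy.
                    r = dict(regs)
                    r[arg] = mems[p][i]
                    new_regs = tuple(sorted(r.items()))
                    successors.append((new_pcs, mems, queues, new_regs))
                else:
                    value = dict(regs)[arg] if isinstance(arg, str) else arg
                    new_mems = put(mems, p, put(mems[p], i, value))
                    others = frozenset(range(n)) - {p}
                    entry = (i, value, others)
                    new_queues = put(queues, p, queues[p] + (entry,))
                    successors.append((new_pcs, new_mems, new_queues, regs))
            if queues[p]:
                # Deliver one sub-operation W(t) of p's oldest pending store.
                i, value, remaining = queues[p][0]
                for t in remaining:
                    new_mems = put(mems, t, put(mems[t], i, value))
                    left = remaining - {t}
                    head = ((i, value, left),) if left else ()
                    new_queues = put(queues, p, head + queues[p][1:])
                    successors.append((pcs, new_mems, new_queues, regs))
        for s in successors:
            if s not in seen:
                seen.add(s)
                stack.append(s)
    return outcomes


def allows(programs, locations, wanted):
    """True if some interleaving ends with the wanted register values."""
    return any(wanted.items() <= dict(o).items()
               for o in reachable_outcomes(programs, locations))


p2 = [("ld", "Y", "r2y"), ("st", "Z", "r2y")]

# Mayan's example: Y,Z are initially 0.
first = [[("st", "Y", 2)], p2, [("ld", "Z", "r3z"), ("ld", "Y", "r3y")]]
print(allows(first, ["Y", "Z"], {"r2y": 2, "r3z": 2, "r3y": 0}))

# The contrasting example: X,Y,Z are initially 0.
second = [[("st", "X", 42), ("st", "Y", 2)], p2,
          [("ld", "Z", "r3z"), ("ld", "X", "r3x")]]
print(allows(second, ["X", "Y", "Z"], {"r2y": 2, "r3z": 2, "r3x": 0}))
```

Under these assumptions the first check prints True and the second
prints False. The difference comes only from the rule that the store
to Y waits in the queue behind the store to X; with independent
per-copy delivery the second outcome would be reachable as well.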