Subj : Re: Memory visibility and MS Interlocked instructions
To   : comp.programming.threads
From : Alexander Terekhov
Date : Mon Sep 05 2005 08:23 pm

Sean Kelly wrote:
>
> Alexander Terekhov wrote:
> > Alexander Terekhov wrote:
> > [...]
> > > Nah, loops are needed for LR-SC on Power. For x86, it is just a single
> > > load followed by InterlockedCompareExchange(&addr, temp, temp) [MP
> >
> > Silly me. InterlockedCompareExchange(&addr, 42, 42) should work just
> > fine. I've asked Andy Glew of Intel to confirm it ("Intel x86 memory
> > model question" thread on comp.arch).
>
> A load/store combination would definitely work,

What load/store combination?

> but if CMPXCHG would
> work as well then so much the better. Is a separate load even
> necessary then?

No.

> Assuming *addr != 42 then we've essentially loaded
> addr twice in a row.

CMPXCHG on x86 always performs a (hopefully StoreLoad+LoadLoad fenced)
load followed by a (LoadStore+StoreStore fenced) store (plus trailing
MFENCE, so to speak). (CMPXCHG is supposed to be "fully fenced".)

You just need to ensure that the "source operand" register has the same
value as the "Accumulator = AL, AX, EAX, or RAX depending on whether a
byte, word, doubleword, or quadword comparison is being performed".
CMPXCHG will store the loaded value (if it's different) in the
accumulator.

Or am I just reading faked ia32 specs?

regards,
alexander.
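
For illustration, the "CMPXCHG as a fenced load" trick described above might be sketched in C as follows. This is only a sketch under stated assumptions: it uses C11 `atomic_compare_exchange_strong` as a stand-in for the Win32 `InterlockedCompareExchange` named in the thread, and the helper name `fenced_load` is hypothetical. On x86, compilers typically lower this call to `LOCK CMPXCHG`, but that is an implementation detail and not something the C standard promises.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical helper (not from the thread): read *addr with a
 * compare-and-swap whose comparand and new value are the same constant. */
static int fenced_load(atomic_int *addr)
{
    assert(addr != NULL);

    /* Arbitrary constant; plays the role of the accumulator (EAX). */
    int expected = 42;

    /* Desired value equals the comparand, so:
     *  - if *addr == 42, 42 is written back and nothing visibly changes;
     *  - otherwise nothing is written and the current value of *addr
     *    is copied into `expected`.
     * The default memory order is sequentially consistent. */
    atomic_compare_exchange_strong(addr, &expected, 42);

    /* Either way, `expected` now holds the value that was in *addr. */
    return expected;
}
```

No separate plain load is needed beforehand: the value comes back through `expected`, which mirrors the accumulator behaviour described in the post.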