Subj : Re: Memory visibility and MS Interlocked instructions
To   : comp.programming.threads
From : Joe Seigh
Date : Mon Aug 29 2005 01:08 pm

Sean Kelly wrote:
> Joe Seigh wrote:
>
>>If you're thinking that doesn't matter because processor consistency
>>gives you release and acquire semantics between two processors. But
>>what if you have 3 (or more) processors? For example, processor A
>>initializes an object and stores its address in X. Processor B reads X.
>>So far, so good. All the stores by A are read in proper order by B.
>>Now processor B stores the object address in Y and processor C reads Y.
>>Now there's a problem. There's no guarantee that processor C will see
>>the writes by A in proper order (i.e. relative to the read of Y). So
>>processor consistency doesn't give you acquire and release as they are
>>commonly understood.
>
>
> So assuming this were the case, how would memory ordering be achieved
> on Intel/AMD? The instruction set has precious few (i.e. no)
> instructions to achieve this.

Any of the serializing instructions, e.g. CPUID, LOCK-prefixed
instructions, etc. Linux uses a dummy XCHG against the stack (implied
LOCK). The xFENCE instructions are offered as being more efficient,
since basically all they do is serialize, without doing something else
as well.

>
>
>>And you *really* do need to read the AMD docs.
>
>
> I've got them but have been preferring the Intel docs as they're a bit
> more readable. I assume this is in the section on the memory model?
>

Yes, in "AMD64 Architecture Programmer’s Manual Volume 2: System
Programming", chapter 7.

   "Out-of-order reads are allowed. Out-of-order reads can occur as a
   result of out-of-order instruction execution or speculative
   execution. The processor can read memory out-of-order to allow
   out-of-order execution to proceed."

Seems pretty clear IMO.
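
For reference, here is a minimal sketch of the three-processor case from the quoted text, written with C++11 std::atomic. That is an assumption on my part: the library postdates this thread, and the names Object, X, Y and run_chain are hypothetical. It only shows the guarantee being asked for, namely acquire and release that chain from A through B to C. It says nothing about which instructions a given compiler emits or what a particular processor guarantees on its own.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical names throughout; nothing here comes from the thread itself.
struct Object { int payload; };

std::atomic<Object*> X{nullptr};   // written by A, read by B
std::atomic<Object*> Y{nullptr};   // written by B, read by C

// Runs the A -> B -> C hand-off once and returns the payload C observed.
int run_chain() {
    static Object obj{0};
    obj.payload = 0;
    X.store(nullptr, std::memory_order_relaxed);
    Y.store(nullptr, std::memory_order_relaxed);
    int seen = -1;

    std::thread a([&] {
        obj.payload = 42;                            // A initializes the object
        X.store(&obj, std::memory_order_release);    // A publishes its address in X
    });
    std::thread b([&] {
        Object* p;
        while (!(p = X.load(std::memory_order_acquire)))
            std::this_thread::yield();               // B waits until X is set
        Y.store(p, std::memory_order_release);       // B republishes the address in Y
    });
    std::thread c([&] {
        Object* p;
        while (!(p = Y.load(std::memory_order_acquire)))
            std::this_thread::yield();               // C waits until Y is set
        seen = p->payload;                           // C reads what A wrote
    });

    a.join();
    b.join();
    c.join();
    assert(seen != -1);                              // C did run and read a value
    return seen;
}
```

Under the C++ memory model the release/acquire pairs on X and Y chain together, so C has to observe A's initialization and run_chain() returns 42. The question in this thread is whether plain loads and stores on the hardware give you the same chaining without a fence.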
They don't mention it as far as I can see offhand, but out-of-order
execution would also allow stores to occur before logically previous
reads in some cases, so MFENCE might be safer than SFENCE for release
semantics.

If there's a question of whether important apps are relying on a
stricter de facto memory model, AMD probably knows the answer already.

--
Joe Seigh

When you get lemons, you make lemonade.
When you get hardware, you make software.