Subj : Re: [x86] CMPXCHG timing
To   : comp.programming.threads
From : Michael Pryhodko
Date : Thu Mar 31 2005 05:33 pm

> > I understand that this question is quite platform-specific and can be
> > considered offtopic here, but...
> >
> > Suppose this situation:
> > 1. x86 platform (i.e. P6, Xeon and so on)
> > 2. there are two different memory locations M1 and M2
> > 3. M1 is cached in processor P1's cache
> > 4. M2 is cached in processor P2's cache (P2 is a second processor)
> > 5. M1 = 0 and M2 = 0
> >
> > CMPXCHG &M1, 0, 1
> > CMPXCHG &M2, 0, 1
> >
> > Can it be safely assumed that the execution time for the instructions
> > above will be the same? (Considering that the first instruction will
> > be executed on P1, and the second on P2.)
>
> On average they'll probably be the same. Why would it matter? If
> you're using timing-dependent logic, your logic is wrong.

Unfortunately I need a precise answer :). And not all timing-dependent
logic is wrong.

> > // store buffers are empty
> > mov aligned_mem_addr, value_32bit
> > sfence // to flush store buffers
> >
> > Does the timing NOT depend on the aligned_mem_addr value? Any 'memory
> > bank' or 'DIMM module' issues?
>
> No. Cache line state will affect it. But again, why should you care?

I am trying to "delay" one processor until the others finish a "write and
flush store buffers" sequence, by having it "write to another memory
location" (all memory locations involved are cached).

Could you be more specific about "cache line state"? I wonder how a
"flush store buffers" operation is performed when the buffer contains
only one store -- is it constant-time ("generate and send a message to
the cache, wait for the cache coherency mechanism to finish"), or
something else?

Basically, I have built a sophisticated, ultra-fast lock which works on
x86, but I need some guarantees from the hardware to prove its
correctness. Maybe I should post the idea here?

Bye.
Sincerely yours, Michael.