Subj : Re: CMPXCHG timing
To   : comp.programming.threads
From : David Schwartz
Date : Mon Apr 04 2005 07:47 pm

"Michael Pryhodko" wrote in message
news:1112663578.315153.228560@g14g2000cwa.googlegroups.com...

>> First of all, you would always put the lock in its own cache line. A
>> processor would have to own the cache line in order to perform the
>> unlocked compare and exchange anyway. Your argument about a 486PC isn't
>> convincing because the pipeline depth is so much shorter that the LOCK
>> prefix hurts each CPU less. Plus, there aren't massively parallel SMP
>> 486 systems.

> 1. If m_lock variable is not cached, processor could (or will?) lock
> system bus.

I'm not sure what you're talking about here. There are quite a few
cases. On a modern processor, the system bus is never locked. The cache
line is acquired and not released for the duration of the LOCKed
operation.

> 2. I do not see any connection with pipeline depth, AFAIK 'sfence' and
> 'LOCK' does not invalidate pipeline.

They do. The cost of a fence or LOCK is controlled by the pipeline
depth. For example, a store fence requires stores to be classified as
either "before" or "after" the fence. This requires the fence to occur
at one specific time, not at a different time in each of the various
pipelines.

> What do you mean by "owning" cache line? if you mean LOCK'ing it (in
> order to avoid bus lock) -- it is not true, because only XCHG
> implicitly locks, not CMPXCHG.

Whether or not you lock the compare/exchange, the processor must
acquire the cache line before it can do anything. And whether or not
you lock it, the bus will not be locked, only the cache line might be.
Assuming the locked variable is in its own cache line (which is the
only sensible way to do it), the cost of the LOCK prefix is due to
pipeline issues, same as for the fence.

DS
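
To make the "lock in its own cache line" advice concrete, here is a minimal C++ sketch. It is not code from this thread: the names PaddedLock, try_lock, unlock, and kCacheLine are hypothetical, and the 64-byte line size is an assumption that holds on many x86 parts but is not guaranteed. On x86, compilers typically implement compare_exchange_strong with a LOCK CMPXCHG instruction, though that is up to the compiler.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Assumed cache-line size; 64 bytes is common on x86 but not guaranteed.
constexpr std::size_t kCacheLine = 64;

// Hypothetical lock type. alignas rounds both the alignment and the size
// of the struct up to a full line, so no unrelated data shares the line
// with the lock word.
struct alignas(kCacheLine) PaddedLock {
    std::atomic<int> state{0};  // 0 = free, 1 = held
};

// Make one attempt to take the lock. Returns true if this call changed
// the state from free to held, false if the lock was already held.
inline bool try_lock(PaddedLock& l) {
    int expected = 0;
    return l.state.compare_exchange_strong(expected, 1,
                                           std::memory_order_acquire,
                                           std::memory_order_relaxed);
}

// Release the lock. The assert is a debug-only check that it was held.
inline void unlock(PaddedLock& l) {
    assert(l.state.load(std::memory_order_relaxed) == 1);
    l.state.store(0, std::memory_order_release);
}
```

A caller would spin or back off while try_lock returns false and call unlock when done. The padding costs some memory per lock, but it keeps other threads' writes to neighbouring data from pulling the lock's cache line away from the processor that owns it.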