Subj : Re: What is the real costs of LOCK on x86 multiprocesor machine?
To   : comp.programming.threads
From : Mayan Moudgill
Date : Thu Aug 04 2005 08:17 am

Joe Seigh wrote:
> MESI is required to maintain coherency so that only one
> processor has a cache line exclusive at a time and thus
> keeping updates from getting lost. If you didn't have
> false sharing, you wouldn't need this.
>
> The big problem is with the strong part of strongly
> coherent. When you have more cache traffic than you
> need due to repeated refreshes of invalidated cache
> lines, that has to slow things down. Hardware designers
> don't get it.

I'm not quite sure I get what you mean.

In the MESI protocol, a shared line will get invalidated when some other
processor wants to write to the line (E or S->I). This means that if a
processor is writing to an I line, then it has to fetch the new value.
That is unavoidable (other than the case of false sharing).

[Caveat: if the line in the other core is E but not M, then the data
does not have to be fetched, but you still have to wait for the ack]

Now, where do you see an opportunity to optimize the transaction?
(again, excluding false sharing)

One optimization which has been talked about (though I don't know
whether any processor actually implements it) is to allow loads (and
dependent ops) to execute speculatively against an I line. If it turns
out that the line was E, then the speculative ops are committed; if it
turns out it was M, then they are restarted.
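
To make the invalidate-then-refetch argument concrete, here is a rough sketch of the MESI transitions for a single cache line. It is a toy model only: the names `State`, `Line`, and `ping_pong` are made up for illustration, there is no timing or write-back, and it does not distinguish whether a fetch is served from memory or from another cache (so the E-versus-M caveat is not modeled).

```python
from enum import Enum


class State(Enum):
    M = "Modified"
    E = "Exclusive"
    S = "Shared"
    I = "Invalid"


class Line:
    """One cache line tracked across several cores (hypothetical toy model)."""

    def __init__(self, n_cores):
        self.state = [State.I] * n_cores
        self.fetches = 0        # times a core had to bring the data in
        self.invalidations = 0  # copies in other caches that were invalidated

    def _other_holders(self, core):
        # Cores other than `core` that currently hold a valid copy.
        return [c for c in range(len(self.state))
                if c != core and self.state[c] is not State.I]

    def read(self, core):
        if self.state[core] is not State.I:
            return  # M, E, or S: the load hits locally
        self.fetches += 1
        holders = self._other_holders(core)
        for c in holders:
            self.state[c] = State.S  # an M or E copy elsewhere degrades to S
        self.state[core] = State.S if holders else State.E

    def write(self, core):
        if self.state[core] is State.M:
            return  # already owned and dirty: no coherence traffic
        if self.state[core] is State.I:
            self.fetches += 1  # writing to an I line: must fetch the new value
        # From S (or I) the writer must wait until all other copies are gone.
        for c in self._other_holders(core):
            self.state[c] = State.I  # M, E, or S -> I in the other caches
            self.invalidations += 1
        self.state[core] = State.M


def ping_pong(rounds):
    """Two cores take turns writing the same line (true sharing)."""
    line = Line(2)
    for _ in range(rounds):
        line.write(0)
        line.write(1)
    return line


if __name__ == "__main__":
    result = ping_pong(3)
    print(result.fetches, result.invalidations, result.state)
```

In this model `ping_pong(3)` ends with 6 fetches and 5 invalidations: every write after the first invalidates the other core's copy, and the invalidated core has to fetch again before its next write. That is the traffic the post calls unavoidable when both cores really do write the same data.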