Subj : Re: Memory visibility and MS Interlocked instructions
To   : comp.programming.threads
From : David Hopwood
Date : Sat Aug 27 2005 08:05 pm

Joe Seigh wrote:
> Joe Seigh wrote:
>
>> * The one exception I'm aware of (because I use it myself) is atomic
>> load depends and we can get away with that because so much of lock-free
>> depends on pointer swizzling.
>
> Speaking of that, what do most people think are its semantics? A qualified
> acquire (#LoadLoad+#LoadStore) or just a qualified #LoadLoad?

AFAICS, only the latter is needed in most cases. For example, consider a
publisher/subscriber pattern. The publisher needs a #StoreStore; the
subscriber needs a dependent #LoadLoad. The subscriber does not store to
the published data at all, so there is no reason why it would need a
#LoadStore.

> I believe Linux uses the former. The only platform that requires an
> explicit memory barrier, Alpha, has no #LoadLoad equivalent, just a full
> membar and a #StoreStore membar. And current dependent loads on the other
> platforms have acquire semantics AFAIK.
>
> I'm thinking of making the atomic_load_depends in my set of atomics just
> provide #LoadLoad semantics, because I strongly suspect that Intel and/or
> AMD will break the dependent load hack down the road.

Given that it is basically just a coincidence that it happens to work on
the current processor implementations, that's quite possible.

> If you have acquire semantics, then you will be forced to use MFENCE, which
> will affect performance more than just using LFENCE for #LoadLoad semantics.
>
> If you go the #LoadLoad route, then writing to shared objects accessed by
> dependent load will require explicit synchronization to ensure full acquire
> semantics, something that is likely in use anyway.
>
> I'm trying to avoid being blindsided by Intel/AMD who seem to have almost
> no awareness of what's going on in synchronization.

Hey, that's not fair. Intel's and AMD's documentation clearly does *not*
guarantee anything about dependent loads. If it breaks, tough. This is no
different from any other implementation-defined behaviour.

I see no reason why Intel or AMD should be constrained to continue to
support every random property of their current processor models that some
bunch of hackers might rely on without any justification from the docs.

--
David Hopwood
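
As a rough sketch of the publisher/subscriber pattern described above, here
it is in C11 <stdatomic.h> terms (a standard that postdates this thread; the
names "node", "publish", and "subscribe" are illustrative, not from the
original post). The publisher's memory_order_release store covers the
#StoreStore requirement, and the subscriber's memory_order_consume load is
the standardized form of the dependent #LoadLoad / atomic_load_depends being
discussed. In practice most compilers promote consume to acquire, which is
exactly the stronger ordering the posters are debating.

    /* Sketch only: release/consume publication of a heap object. */
    #include <stdatomic.h>
    #include <stdlib.h>

    struct node {
        int payload;
    };

    static _Atomic(struct node *) published = NULL;

    void publish(int value)
    {
        struct node *n = malloc(sizeof *n);
        if (!n)
            return;
        n->payload = value;   /* ordinary store to the data */
        /* Release store: the data store above may not be reordered past
         * this pointer store, i.e. the #StoreStore the publisher needs. */
        atomic_store_explicit(&published, n, memory_order_release);
    }

    int subscribe(void)
    {
        /* Consume load: the dereference below is data-dependent on 'n',
         * so it is ordered after the load (dependent #LoadLoad) without a
         * full acquire barrier; the subscriber never stores to the data,
         * so no #LoadStore ordering is required. */
        struct node *n = atomic_load_explicit(&published, memory_order_consume);
        return n ? n->payload : -1;
    }

On Alpha, which has no dependent-load guarantee, the consume load would have
to be implemented with an explicit barrier, matching the point made above
about it being the one platform that needs one.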