Subj : Re: lockless low-overhead 'pipes' (w/semaphores)
To   : comp.programming.threads
From : lindahlb
Date : Tue Apr 19 2005 09:14 pm

> It depends upon your assumed memory model. If, for example, you're just
> assuming a POSIX-compliant environment, there's no guarantee the consumer
> will read the data the producer wrote (instead of stale data stuck in some
> cache somewhere), since the consumer doesn't hold the same lock the producer
> held.

Ok, so in the problem you describe, the producer stores some data, which ends up sitting in one processor's local cache (say L1 or L2), and when the consumer, executing on another processor, attempts to access that data, it wouldn't see the cached data held by the other processor? Don't today's SMP architectures guarantee visibility in situations like this (i.e. through cache coherency)? If they do, the data is stored and THEN 'len' is updated. If this order is preserved, how could the consumer see the updated 'len' before the 'data' has reached memory?

If you're talking about out-of-order reads/writes, at both the compiler and the processor level: the atomic_* functions imply a memory barrier (lock prefix on i486+), so that's a non-issue, right? Or does the lock prefix not enforce ordering, so that I need a separate instruction to provide the store-store (producer) and load-store (consumer) barriers? Also, if the lock prefix doesn't work the way I thought it did, are there any other places that need some type of memory barrier that I may be missing?

However, if the problem you mention exists NOT because of out-of-order stores/loads, but because 'len' could reach memory while the 'data' is still stuck in the cache, even after a 'lock' prefix, then I don't see how any program executing on multiple processors could guarantee coherency between data on separate processors. If there's no way to guarantee cache coherency, then synchronization would be near impossible, no?
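To make the ordering I am asking about concrete, here is a minimal sketch of the "store 'data', THEN update 'len'" pattern. It is not the code from my pipe: it uses C11 <stdatomic.h> in place of the atomic_* functions, and the names spsc_buf, spsc_put, spsc_get and SPSC_CAP are made up for illustration. It assumes exactly one producer thread and one consumer thread, and an append-only buffer with no wraparound.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define SPSC_CAP 64              /* hypothetical capacity, for illustration */

struct spsc_buf {
    char          data[SPSC_CAP];
    atomic_size_t len;           /* number of bytes published so far */
};

static void spsc_init(struct spsc_buf *b)
{
    atomic_init(&b->len, 0);
}

/* Producer side: write the bytes first, THEN publish the new 'len'. */
static int spsc_put(struct spsc_buf *b, const char *src, size_t n)
{
    /* Only the producer writes 'len', so a relaxed load is enough here. */
    size_t cur = atomic_load_explicit(&b->len, memory_order_relaxed);
    if (n > SPSC_CAP - cur)
        return -1;               /* not enough room left */
    memcpy(b->data + cur, src, n);
    /* Release store: the data writes above may not be reordered after it. */
    atomic_store_explicit(&b->len, cur + n, memory_order_release);
    return 0;
}

/* Consumer side: 'from' is the consumer's own read position. */
static size_t spsc_get(struct spsc_buf *b, size_t from, char *dst, size_t max)
{
    /* Acquire load: the data reads below may not be reordered before it. */
    size_t avail = atomic_load_explicit(&b->len, memory_order_acquire);
    assert(from <= avail);       /* the reader is never ahead of the writer */
    size_t n = avail - from;
    if (n > max)
        n = max;
    memcpy(dst, b->data + from, n);
    return n;
}
```

The intent is that spsc_put runs only in the producer thread and spsc_get only in the consumer thread, with the consumer adding each return value to its own 'from'. The release/acquire pair on 'len' is the store-then-publish ordering in question: if the consumer observes the new 'len', it is also supposed to observe the bytes written before it.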

I thought I understood this quite well, but clearly I must be missing something (unless I am not enforcing an order between 'len' and 'data' via the lock prefix). Could you please elaborate? I would really appreciate the help, thanks.