Subj: Memory Barriers, Compiler Optimizations, etc.
To:   comp.programming.threads
From: Scott Meyers
Date: Tue Feb 01 2005 08:38 pm

I've encountered documents (including, but not limited to, postings on this newsgroup) suggesting that acquire barriers and release barriers are not standalone entities but are instead associated with loads and stores, respectively. Furthermore, making them standalone seems to change semantics (see below). Yet APIs for inserting them via languages like C or C++ seem to deal with barriers alone -- there are no associated reads or writes. (See, for example, http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vclang/html/vclrf_readwritebarrier.asp.)

So I have two questions about this. First, are acquire/release properly part of loads/stores, or does it make sense for them to be standalone? If the former, how are programmers in languages like C/C++ expected to make the association between reads/writes and memory barriers?

Next, is it reasonable to assume that compilers will recognize memory barrier instructions and not perform code motion that is contrary to their meaning? For example:

    x = a;
    insertAcquireBarrier();   // or x.acquire = 1 if the barrier
                              // should not be standalone
    y = b;

Assuming that x, y, a, and b are all distinct locations, is it reasonable to assume that no compiler will move the assignment to y above the barrier, or is it necessary to declare x and y volatile to prevent such code motion?

Finally, is the following reasoning (prepared for another purpose, but usable here, I hope) about the semantics of memory barriers correct? Based on the ISCA '90 paper introducing release consistency, I think of an acquire barrier as a way of saying "I'm about to enter a critical section," and a release barrier as a way of saying "I'm about to leave a critical section."
So consider this situation, where we want to ensure that memory location 1 is accessed before memory location 2:

    Access memory location 1
    Announce entry to critical section    // acquire barrier
    Announce exit from critical section   // release barrier
    Access memory location 2

We have to prevent stuff from moving out of the critical section, but there's no reason to keep stuff from moving into it. That is, if x is a shared variable, we need to access it only within a critical section, but if y is thread-local, compilers can perform code motion to move access of y into the critical section without harm (except that the critical section is now going to take longer to execute). Neither access above is inside the critical section, so both can be moved:

    Announce entry to critical section
    Access memory location 1              // moved this down
    Access memory location 2              // moved this up
    Announce exit from critical section

But within a critical section, instructions can be reordered at will, as long as they are independent. So let's assume that the two memory locations are independent. That makes this reordering possible:

    Announce entry to critical section
    Access memory location 2
    Access memory location 1
    Announce exit from critical section

And now we're hosed. On the other hand, if the memory barriers are part of the loads/stores, we have this:

    Acquire & access memory location 1
    Access memory location 2

Because you can't move subsequent accesses up above an acquire (i.e., you can't move something out of a critical section), you're guaranteed that location 1 must be accessed before location 2.

Thanks for all clarifications,

Scott