Subj : Re: Memory Barriers, Compiler Optimizations, etc.
To : comp.programming.threads
From : Gianni Mariani
Date : Thu Feb 03 2005 10:24 pm

Scott Meyers wrote:
> On Tue, 01 Feb 2005 23:14:30 -0800, Gianni Mariani wrote:
>
>>All acquire does is to guarantee that any load (memory fetch) operations,
>>possibly many, that have been requested before the barrier instruction
>>are completed before any subsequent memory fetch operations.
>
>
> I post this with great trepidation, because I've gotten this backwards
> several times before, but my understanding is that an acquire guarantees
> that subsequent memory operations will not take place before any operations
> preceding the acquire, i.e., that memory references "after" the barrier (in
> program order) won't migrate up to "before" the barrier. However, it's a
> unidirectional barrier, so memory operations preceding the barrier may
> migrate down to after it. Conceptually, we can move memory operations into
> the critical section, but we can't move operations inside the critical
> section to above the acquire (i.e., out of the critical section).

I don't have any trepidation; I suspect that if I am wrong, I'll get
told sooner or later. :-)

It depends on the architecture. I came across another list of names for
barriers:

LoadLoad
LoadStore
StoreLoad
StoreStore

ref: http://gee.cs.oswego.edu/dl/jmm/cookbook.html

>
> Did I get it wrong again, did I misread what you wrote, or is there a
> misstatement above?

Given the proliferation of kinds of memory barriers, you may be right or
we may both be right. Documentation seems scant, in some respects because
we have a vast number of "theoretical" machines to deal with, both hard
and virtual.
>
>
>>volatile int v1 = BAD;
>>volatile bool done = false;
>>
>>reader:
>>a: bool is_done = done;
>>b: aquire();
>>c: if ( is_done ) play_with( v1 );
>
>
> Yes, but consider:
>
> reader:
> int x = 22;
> a: bool is_done = done;
> b: aquire();
> c: if ( is_done ) play_with( v1 );
>
> The assignment to x can be moved down to between b and c, right?

I suspect there could be a CPU that would do this, yes.

> Also, the
> acquire is really meant to be associated with the store to is_done, right?

Yes, it is meant to be associated with load(done)/load(v1). That is,
is_done is just a temporary that is only available to a single thread
(on its stack), so no barriers are required to read from is_done, since
no other thread can change it.

read(done) -> is_done
aquire
check(is_done) then read(v1)

Hence, if done is read and it is false, v1 is not used. Worst case, v1
is not read even though it is no longer BAD.

Combine this with the writer:

x: v1 = GOOD;
y: release();
z: done = true;

store(v1)
release (make the new v1 visible)
store(done)

There is no way that is_done can be true without v1 being GOOD.
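
For anyone who wants to try the reader/writer pairing above, here is a rough sketch in C++ using std::atomic and std::atomic_thread_fence. This is only an assumption about how aquire() and release() might map onto a standard library; the posts above do not name any particular API. Note also that done is a std::atomic<bool> here rather than volatile, because volatile by itself gives no inter-thread ordering in C++. The enum Value and the helper run_once() are hypothetical names added for the illustration.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

enum Value { BAD = 0, GOOD = 1 };   // hypothetical stand-ins for BAD/GOOD

int v1 = BAD;                       // plain data, published through `done`
std::atomic<bool> done{false};      // flag (atomic here, not volatile)

void writer() {
    v1 = GOOD;                                            // x: store(v1)
    std::atomic_thread_fence(std::memory_order_release);  // y: release()
    done.store(true, std::memory_order_relaxed);          // z: store(done)
}

// Returns true if the flag was seen set; only then is v1 read into `seen`.
bool reader(int& seen) {
    bool is_done = done.load(std::memory_order_relaxed);  // a: read(done)
    std::atomic_thread_fence(std::memory_order_acquire);  // b: aquire()
    if (is_done) {                                        // c: check, then read(v1)
        seen = v1;
        return true;
    }
    return false;
}

// Hypothetical driver: one writer, one reader that polls until done is seen.
int run_once() {
    v1 = BAD;
    done.store(false);
    int seen = BAD;
    std::thread w(writer);
    std::thread r([&seen] { while (!reader(seen)) { /* spin */ } });
    w.join();
    r.join();
    assert(seen == GOOD);   // is_done was true, so v1 must have been GOOD
    return seen;
}
```

The fences sit exactly where y: and b: sit in the pseudocode. The release fence keeps the store to v1 from sinking below the store to done, and the acquire fence keeps the load of v1 from rising above the load of done, so a reader that sees done == true also sees v1 == GOOD.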