Subj : Re: Memory Barriers, Compiler Optimizations, etc.
To   : comp.programming.threads
From : Scott Meyers
Date : Thu Feb 03 2005 09:01 am

On Wed, 02 Feb 2005 07:43:15 -0500, Joseph Seigh wrote:
> On Tue, 1 Feb 2005 20:38:08 -0800, Scott Meyers <Usenet@aristeia.com> wrote:
> > Assuming that x, y, a, and b are all distinct locations, is it reasonable
> > to assume that no compiler will move the assignment to y above the barrier,
> > or is it necessary to declare x and y volatile to prevent such code motion?
> 
> Hypothetically, yes.  Volatile wouldn't help as it has no meaning for
> threads.  If the variables are only known to the local scope, i.e., they're
> not external and haven't had their address taken, then the compiler can move them
> wherever it wants since no other thread can see them.

My concern wrt volatile was that treatments of memory issues refer to
"program order" as if it's the same as "source code order," but with
compilers moving stuff around prior to code generation, "source code order"
may be quite different from "program order."  At least in C++, if I want to
ensure that the relative order of these reads is preserved,

  x = a;    // I want x to be read before y
  y = b;

declaring x and y volatile will do it.  Compilers can still move the reads
around wrt reads and writes of non-volatile data, but to remain compliant
with the C++ standard, x must be read before y in the generated code, i.e.,
in program order.
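
A minimal sketch of the volatile case, with hypothetical names and values
that are not from the thread. It assumes only the standard's rule that
volatile accesses are observable behavior, so the compiler must emit the
accesses to x and y (loads or stores) in source order relative to each
other. It says nothing about what the hardware does afterwards.

```cpp
#include <cassert>

// Hypothetical globals for illustration only.
volatile int x = 0;   // volatile: every access is observable behavior
volatile int y = 0;
int a = 1;            // ordinary (non-volatile) data
int b = 2;

void copy_in_order() {
    x = a;   // the volatile store to x must be emitted first ...
    y = b;   // ... and the volatile store to y second
    // The plain reads of a and b carry no such ordering guarantee.
}

int read_in_order() {
    int first = x;    // volatile load of x is emitted first
    int second = y;   // volatile load of y is emitted second
    return first * 10 + second;
}
```

Calling copy_in_order() and then read_in_order() exercises both directions;
the compiler may still move the accesses to a and b around the volatile ones.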

However, if compilers recognize and respect the semantics of membars, the
need for volatile goes away, because I can just stick a membar between the
reads (which I need anyway), and the problem is solved.
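
As a rough sketch of "a membar between the reads," here is one way it could
look using std::atomic_thread_fence. That facility is an assumption on my
part: it comes from a later C++ standard (C++11) than the compilers under
discussion, and the names below are hypothetical. The variables are made
atomic because, under that standard's rules, a fence only orders atomic
operations.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical shared locations.
std::atomic<int> x{0};
std::atomic<int> y{0};

struct Pair { int first; int second; };

Pair read_x_then_y() {
    int first = x.load(std::memory_order_relaxed);
    // Stand-in for the membar: the load of y below is not to be hoisted
    // above the load of x, by the compiler or by the hardware.
    std::atomic_thread_fence(std::memory_order_acquire);
    int second = y.load(std::memory_order_relaxed);
    return {first, second};
}
```

No volatile appears here; the fence is what the compiler has to recognize
and respect.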

Incidentally, I understand how compiler intrinsics like Microsoft's
_ReadWriteBarrier are recognized by compilers, but from what I've read in
this group, there seems to be the assumption that calling an externally
defined function containing assembler will prevent code motion across 
calls to the function, because compilers must pessimistically assume that
calls to the function affect all memory locations.  With increasingly
aggressive cross-module inlining technology available, this seems like a
bet that gets worse and worse with time.  It's not hard to imagine a build
system that can see that a called function doesn't affect the value of a
global variable and thus move a read or write of that variable across the
call.  Is there a reason this can't happen, or are we just lucky that our
tools are, for the time being, both conservative and kind of dumb?
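
For contrast with the opaque-function approach, here is a sketch of a barrier
the compiler is told about explicitly, so nothing depends on the optimizer
being unable to see into a callee. It assumes std::atomic_signal_fence from
C++11, which in practice acts as a compiler-only barrier much like an
intrinsic such as _ReadWriteBarrier; the function and variable names are
hypothetical.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical globals.
int data = 0;
int ready = 0;

// Explicit compiler-level barrier. It constrains compiler code motion and
// typically emits no hardware fence instruction, so it is not a substitute
// for a real membar between threads.
inline void compiler_barrier() {
    std::atomic_signal_fence(std::memory_order_seq_cst);
}

void publish(int value) {
    data = value;
    compiler_barrier();   // the store to data is not to be sunk below this point
    ready = 1;
}
```

Because the barrier is inline and visible, cross-module inlining cannot
weaken it the way it might weaken a call that merely looks opaque.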

Regarding the other responses to my post, I have to study them before I
respond. 

Thanks,

Scott
