Subj : Re: Memory visibility and MS Interlocked instructions
To   : comp.programming.threads
From : Seongbae Park
Date : Wed Aug 31 2005 01:01 am

Joe Seigh <jseigh_01@xemaps.com> wrote:
> processor 1:       store X; load X; load Y;
> processor 2:       store Y; load Y; load X;

This is a bad example. Let's make it a bit more specific:

Initially, X=Y=0
P1:       store 1,X; load X,r1; load Y,r2;
P2:       store 1,Y; load Y,r3; load X,r4;

Then, under either TSO or PC, all four of the following combinations are
possible after executing the above code sequences:

r1 r2 r3 r4
 1  0  1  0  
 1  0  1  1
 1  1  1  0
 1  1  1  1

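because both models allow the second load on each processor (P1: load Y and
P2: load X) to return its value before the earlier store on that processor
has become visible to the other processor. The first load on each processor
always returns 1, since a processor sees its own store.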
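
To check the table by brute force, here is a rough sketch that enumerates
every interleaving of the two sequences. It assumes the usual operational
picture of TSO (a FIFO store buffer per processor, loads forwarded from the
processor's own buffer, one shared memory). The function name tso_outcomes
and the tuple encoding of the instructions are made up for illustration;
this is a toy model, not a statement about any particular hardware.

```python
def tso_outcomes(programs, registers):
    """Explore all interleavings under a simple store-buffer model of TSO.

    programs:  one list of (op, var, arg) tuples per processor; arg is the
               value for "store" and the destination register for "load".
    registers: register names, in the order they are reported.
    """
    outcomes, seen = set(), set()

    def explore(pcs, bufs, mem, regs):
        state = (pcs, bufs, mem, regs)
        if state in seen:
            return
        seen.add(state)
        finished = True
        for p, prog in enumerate(programs):
            if bufs[p]:  # drain the oldest buffered store to shared memory
                finished = False
                (var, val), rest = bufs[p][0], bufs[p][1:]
                new_mem = tuple(sorted({**dict(mem), var: val}.items()))
                explore(pcs, bufs[:p] + (rest,) + bufs[p + 1:], new_mem, regs)
            if pcs[p] < len(prog):  # execute the next instruction in order
                finished = False
                op, var, arg = prog[pcs[p]]
                new_pcs = pcs[:p] + (pcs[p] + 1,) + pcs[p + 1:]
                if op == "store":
                    # A store only enters the local buffer at this point.
                    new_bufs = bufs[:p] + (bufs[p] + ((var, arg),),) + bufs[p + 1:]
                    explore(new_pcs, new_bufs, mem, regs)
                else:
                    # A load sees the newest matching buffered store,
                    # otherwise whatever is in shared memory.
                    pending = [v for a, v in bufs[p] if a == var]
                    val = pending[-1] if pending else dict(mem)[var]
                    new_regs = tuple(sorted({**dict(regs), arg: val}.items()))
                    explore(new_pcs, bufs, mem, new_regs)
        if finished:
            final = dict(regs)
            outcomes.add(tuple(final[r] for r in registers))

    n = len(programs)
    # Initially X=Y=0, all store buffers empty, no registers written.
    explore((0,) * n, ((),) * n, (("X", 0), ("Y", 0)), ())
    return outcomes


P1 = [("store", "X", 1), ("load", "X", "r1"), ("load", "Y", "r2")]
P2 = [("store", "Y", 1), ("load", "Y", "r3"), ("load", "X", "r4")]
for row in sorted(tso_outcomes([P1, P2], ["r1", "r2", "r3", "r4"])):
    print(*row)
```

The printed rows should match the table above. The row with r2=0 and r4=0
comes from the interleaving where both loads run while both stores are
still sitting in their store buffers.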

If this PC is what I think it is[1], then the following example would
distinguish between PC and TSO:

Initially, X=Y=0
P1:  store 1,X
P2:  if (X==1) store 1,Y
P3:  if (Y==1) load X,r1

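After executing this code, the outcome where P3 sees Y==1 and yet gets
r1==0 is possible on PC but not on TSO.
This is because PC does not require a store to become visible to all other
processors at the same time: P2 may see P1's store before P3 does,
and P3 may see P2's store before it sees P1's store.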
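
The same point as a small brute-force sketch. Each processor gets its own
view of memory, and a store is delivered to the other processors either all
at once (standing in for TSO) or one processor at a time (standing in for
PC). This is my own simplification, not the formal definition from [1]: it
only captures whether stores are atomic, which is all this example needs,
and it would not be adequate for cases with several writers to one
location. The names reachable_r1 and atomic_stores are made up for
illustration.

```python
X, Y = 0, 1  # indices into each processor's private view of memory


def reachable_r1(atomic_stores):
    """Return every final value of r1 (None: P3 never executed the load)."""
    results, seen = set(), set()

    def write(views, p, var, val):
        view = list(views[p])
        view[var] = val
        return views[:p] + (tuple(view),) + views[p + 1:]

    def explore(pcs, views, inflight, r1):
        state = (pcs, views, inflight, r1)
        if state in seen:
            return
        seen.add(state)
        moves = []

        # Deliver an in-flight store: to every remaining processor at once
        # (atomic stores) or to a single processor (non-atomic stores).
        for entry in inflight:
            var, val, dests = entry
            groups = [dests] if atomic_stores else [frozenset([d]) for d in dests]
            for group in groups:
                new_views = views
                for d in group:
                    new_views = write(new_views, d, var, val)
                rest = inflight - {entry}
                if dests - group:
                    rest = rest | {(var, val, dests - group)}
                moves.append((pcs, new_views, rest, r1))

        if pcs[0] == 0:  # P1: store 1,X
            moves.append(((1, pcs[1], pcs[2]), write(views, 0, X, 1),
                          inflight | {(X, 1, frozenset({1, 2}))}, r1))
        if pcs[1] == 0:  # P2: if (X==1) store 1,Y  (modelled as one step)
            if views[1][X] == 1:
                moves.append(((pcs[0], 1, pcs[2]), write(views, 1, Y, 1),
                              inflight | {(Y, 1, frozenset({0, 2}))}, r1))
            else:
                moves.append(((pcs[0], 1, pcs[2]), views, inflight, r1))
        if pcs[2] == 0:  # P3: test Y, and skip the load unless Y==1
            nxt = 1 if views[2][Y] == 1 else 2
            moves.append(((pcs[0], pcs[1], nxt), views, inflight, r1))
        elif pcs[2] == 1:  # P3: load X,r1
            moves.append(((pcs[0], pcs[1], 2), views, inflight, views[2][X]))

        if not moves:  # everything has executed and been delivered
            results.add(r1)
        for move in moves:
            explore(*move)

    explore((0, 0, 0), ((0, 0),) * 3, frozenset(), None)
    return results


tso_like = reachable_r1(atomic_stores=True)
pc_like = reachable_r1(atomic_stores=False)
print("r1==0 reachable with atomic stores:    ", 0 in tso_like)
print("r1==0 reachable with non-atomic stores:", 0 in pc_like)
```

With atomic stores, P2 can only see X==1 once P3 can see it too, so r1==0
never shows up. With non-atomic stores, X=1 can reach P2 first, then Y=1
can reach P3 while X=1 is still in flight to P3.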

> TSO would require that both processors don't see the old value stored
> by the other processor.  PC doesn't require that.
> 
> Point 6 is that store followed by a load "as if" optimization.  It's subject to
> PC definition constraints.

I haven't read this thread at all and am jumping in,
so forgive me if I've made a wrong assumption about PC.

[1] PC as defined in Gharachorloo's 1990 ISCA paper.
-- 
#pragma ident "Seongbae Park, compiler, http://blogs.sun.com/seongbae/"
