Subj : Re: std::msync
To   : comp.programming.threads
From : Alexander Terekhov
Date : Fri Apr 01 2005 12:55 am

< Forward Inline >

-------- Original Message --------
Message-ID:
Subject: Re: slb/hsb et al

On Thu, 31 Mar 2005 16:36:22 +0300, Peter Dimov wrote:
> Alexander Terekhov wrote:
> >> The final issue is that the reference counted objects in libstdc++
> >> (std::string, std::locale::facet) may also be broken *unless*
> >> __exchange_and_add has some memory visibility guarantees that I don't
> >> understand,
> >
> > It's really simple. See release/acquire requirement above.
> >
> > For immutable stuff we need
> >
> > msync::slb (instead of full conventional release) memory semantics
> > when the result is not zero;
> >
> > msync::hsb (instead of full conventional acquire) memory semantics
> > when the result is zero.
> >
> > slb stands for "sink load barrier".
> > hsb stands for "hoist store barrier".
>
> Yeah, I lied. I think I do understand. ;-)
>
> But to clarify, sink X means that preceding X cannot cross the barrier.

Right. But unidirectional stuff works in conjunction with some atomic
memory operation (read/write/read-modify-write), not as fence
instructions. So it actually means that preceding X cannot cross the
memory operation labeled with the sink X barrier. It imposes ordering
between preceding X stuff and the operation labeled with the sink X
barrier.

> Hoist Y means following Y cannot cross the barrier. Right?

Exactly.

>
> Am I correct in thinking that Sparc-style #XY means sink-Y + hoist-X? That
> is, #LoadStore means Sink-Load + Hoist-Store, i.e. what we want above.

Yep. Sparc doesn't use unidirectional labels. They have bidirectional
fences of various types.

> A PPC isync is slb in your notation?

isync works in conjunction with OP + conditional branching and has the
effect of "full" acquire for that OP.

> lwsync fence?

http://www-128.ibm.com/developerworks/eserver/articles/powerpc.html

(see "Storage barriers and ordering of storage accesses" table)

It doesn't provide store-load fencing (an expensive thing even on IA32).
So it can't be used as a full fence. That's the more expensive "sync"
instruction.

>
> And just to beat the subject to death, am I correct that the empty
> lock/unlock regions are equivalent to .mf with the same location
> used in both? (.mf is the label, the operation doesn't matter.)

Yes, on the same lock. Sort of. Fences don't use "locations". They are
bidirectional and impose ordering between preceding and subsequent
operations.

>
> And that they can be replaced with .rel in release() and .acq
> in weak_release(), again with the same location used in both
> places?

No. ".rel" and ".acq" are unidirectional labels. They must be used to
"mark" specific operations. They don't work as bidirectional fences.
Bidirectional fences can be used to achieve the effect of ".rel" and
".acq", but it doesn't really work the other way around. For op.rel you
do "fence, op" and for op.acq you do "op, fence" (placing the fence
before the op for .rel and after the op for .acq). But it is less
efficient than unidirectional labels because fences impose redundant
constraints.

regards,
alexander.
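
The difference between a labeled operation and the "fence, op" /
"op, fence" substitutes can be sketched in C++ roughly as follows. This
is only an illustration under stated assumptions: it uses the C++11
<atomic> interface, which postdates this thread, and Counted,
release_labeled, release_fenced and self_check are hypothetical names,
not libstdc++ code. C++ has no msync::slb or msync::hsb, so conventional
release and acquire stand in for them here and are stronger than what
immutable data would need.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical reference-counted object; not code from libstdc++.
struct Counted {
    std::atomic<long> refs{1};
};

// Labeled form: the ordering constraint is attached to the decrement
// itself. Returns true when the caller dropped the last reference.
// acq_rel is an assumption standing in for the slb/hsb pair.
bool release_labeled(Counted& c) {
    return c.refs.fetch_sub(1, std::memory_order_acq_rel) == 1;
}

// Fence form: "fence, op" stands in for op.rel and "op, fence" for
// op.acq. The fences are bidirectional, so they constrain more than
// the labeled decrement does.
bool release_fenced(Counted& c) {
    std::atomic_thread_fence(std::memory_order_release);      // fence, op
    bool last = c.refs.fetch_sub(1, std::memory_order_relaxed) == 1;
    if (last) {
        std::atomic_thread_fence(std::memory_order_acquire);  // op, fence
    }
    return last;
}

// Single-threaded check of the counting logic only. It does not
// exercise the memory ordering, which matters only across threads.
void self_check() {
    Counted a;
    a.refs.store(2, std::memory_order_relaxed);
    assert(!release_labeled(a));  // one reference still held
    assert(release_labeled(a));   // last reference dropped

    Counted b;
    b.refs.store(2, std::memory_order_relaxed);
    assert(!release_fenced(b));
    assert(release_fenced(b));
}
```

Both functions are meant to be called by whichever thread gives up a
reference; the one that sees true may destroy the object. The fenced
version should behave the same but, as noted above, it imposes
redundant constraints.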