Subj : Re: Memory visibility and MS Interlocked instructions
To   : comp.programming.threads
From : Chris Thomasson
Date : Thu Sep 08 2005 01:28 am

"Chris Thomasson" <_no_spam_cristom@no_spam_comcast._net> wrote in message
news:jJGdnQm6WqhhCIreRVn-rA@comcast.com...
>> CPU1:
>>
>> st.rel X 1
>> ld.acq Y
>>
>> CPU2:
>>
>> st.rel Y 1
>> ld.acq X
>
> The loads can migrate above the stores...
>
> http://groups.google.com/group/comp.programming.threads/msg/68ba70e66d6b6ee9?hl=en
>
> you need to do this:
>
> st.rel 'location1'
> mf
> ld.acq 'location2'
>
> in order to ensure the stores effects become visible before the subsequent
> ld.acq effects become applied.

Here is a classic algorithm that requires an mfence / LOCK (to a dummy
thread-local location) in order to work correctly on IA-32 systems:

http://groups.google.com/group/comp.programming.threads/msg/1e45b4b16bad9784?hl=en

This algorithm will crash if "any" of the operations that follow the mfence
become visible 'before' "any" of the preceding operations.

You have to be careful on IA-32 wrt implementing some of the more
"sensitive" lock-free algorithms...

Any thoughts?
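
For anyone who wants to poke at this, here is a rough sketch of the st.rel / mf / ld.acq sequence using C++11 atomics. This is not the code from the linked algorithm; the names X, Y, cpu1, cpu2 and count_both_zero are made up for illustration, and std::atomic_thread_fence(std::memory_order_seq_cst) is only standing in for the 'mf'. On x86 compilers typically lower that fence to mfence or a LOCK'ed instruction on a stack location, but that is a compiler detail, not something to rely on.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical shared locations standing in for X and Y in the quoted post.
static std::atomic<int> X{0};
static std::atomic<int> Y{0};

// CPU1: st.rel X 1 ; mf ; ld.acq Y
static int cpu1() {
    X.store(1, std::memory_order_release);
    // Full fence: keeps the load below from being satisfied before the
    // store above is globally visible (the store-load case).
    std::atomic_thread_fence(std::memory_order_seq_cst);
    return Y.load(std::memory_order_acquire);
}

// CPU2: st.rel Y 1 ; mf ; ld.acq X
static int cpu2() {
    Y.store(1, std::memory_order_release);
    std::atomic_thread_fence(std::memory_order_seq_cst);
    return X.load(std::memory_order_acquire);
}

// Runs the two sequences concurrently `iterations` times and returns how
// many runs ended with BOTH loads observing 0. With the fences in place
// the C++ memory model does not allow that outcome.
int count_both_zero(int iterations) {
    assert(iterations >= 0);
    int both_zero = 0;
    for (int i = 0; i < iterations; ++i) {
        // Reset before the threads start; thread creation orders these
        // stores before anything the new threads do.
        X.store(0, std::memory_order_relaxed);
        Y.store(0, std::memory_order_relaxed);
        int r1 = -1;
        int r2 = -1;
        std::thread t1([&r1] { r1 = cpu1(); });
        std::thread t2([&r2] { r2 = cpu2(); });
        t1.join();
        t2.join();
        if (r1 == 0 && r2 == 0) {
            ++both_zero;
        }
    }
    return both_zero;
}
```

Calling count_both_zero(2000) should come back with 0. If you remove the two fences, the both-zero result becomes legal, although spawning threads per run makes the timing window small, so a test like this may not actually catch it on a given box.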