Subj : Re: CMPXCHG timing
To   : comp.programming.threads
From : Michael Pryhodko
Date : Thu Mar 31 2005 10:04 pm

[implementation with problem skipped]

So far it looks like I am talking to myself. :) Well, it seems I have
found why the proposed code does not work. It seems that the internal
implementation of CMPXCHG does not conform to the operation pseudocode
listed in the Intel Architecture manual, i.e.:
the processor reads the value from memory, compares it with EAX, and if
they are different the processor WRITES BACK the original value!! Shit!!
Who decided to do that, and why??? Until someone explains to me the
reasons behind this decision I will consider it a very "bright" and
"brilliant" move from Intel.
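
To make the claim above concrete, here is a minimal single-threaded
sketch of the behaviour being described. It is only a model, not real
hardware: the names ModelCell and model_cmpxchg are made up for this
illustration, and the "writes" counter exists only to show that a write
cycle is assumed to happen on both the success and the failure path.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a memory operand; not part of the original code.
struct ModelCell {
    std::int32_t value;  // contents of the memory operand
    int writes;          // write cycles issued to the operand so far
};

// Models one CMPXCHG as described in the post (assumption, not a spec).
// Returns true (ZF = 1) when the exchange happened. `eax` plays the
// role of the accumulator and receives the observed value on failure.
inline bool model_cmpxchg(ModelCell& dest, std::int32_t& eax, std::int32_t src)
{
    const std::int32_t observed = dest.value;  // read half
    if (observed == eax) {
        dest.value = src;       // success: the new value is stored
        ++dest.writes;
        return true;
    }
    dest.value = observed;      // failure: the ORIGINAL value is written back
    ++dest.writes;
    eax = observed;
    return false;
}
```

Without a LOCK prefix the read half and the write half are not one
atomic step, so in this model a store from another processor that lands
between them would be overwritten by the written-back value.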

Proof: if you replace the unlock function with:

----------------------------------------------------------------------
    void unlock(LONG dwId) throw()
    {
        LONG dummy = 5;
        if (m_lock != dwId)
        {
            __asm { int 3 }
        }

        __asm {
            mov ebx, this
            mov edx, dwId
ENTER_UNLOCK:
            mov [ebx]this.m_lock, 0
            sfence

            // CAS(&dummy, dummy, -dummy); note that neg is arithmetic
            // negation, not bitwise NOT
            mov ecx, dummy
            mov eax, ecx
            neg ecx
            cmpxchg dummy, ecx

            mfence

            // if (m_lock == dwId) the stored 0 was overwritten: retry
            cmp [ebx]this.m_lock, edx
            jne EXIT_UNLOCK
            pause
            jmp ENTER_UNLOCK
EXIT_UNLOCK:
        }
    }
----------------------------------------------------------------------

the test app starts working. But as far as I understand, this
implementation is still wrong: it could store 0 into m_lock more than
once, possibly letting several threads enter simultaneously and thus
violating the lock guarantee (if we have more than 2 competing
threads). :(
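
One possible interleaving behind that worry can be replayed step by
step. The sketch below is single-threaded and purely hypothetical: the
skipped lock() is ASSUMED to be a non-LOCKed cmpxchg(m_lock, 0, id) that
enters on success, thread ids 1, 2 and 3 are invented, and the ordering
of the steps is just one schedule that the write-back behaviour would
allow.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical replay of one schedule; returns how many threads end up
// inside the critical section at the same time.
inline int replay_double_entry()
{
    std::int32_t m_lock = 1;  // thread 1 owns the lock and is in unlock()
    int inside = 0;           // threads currently in the critical section

    // Thread 3: read half of a failing cmpxchg(m_lock, 0, 3); it sees 1.
    const std::int32_t stale = m_lock;

    // Thread 1: first "mov m_lock, 0" in unlock().
    m_lock = 0;

    // Thread 2: cmpxchg(m_lock, 0, 2) succeeds, so thread 2 enters.
    if (m_lock == 0) { m_lock = 2; ++inside; }

    // Thread 3: write half of the failing cmpxchg puts the stale 1 back.
    m_lock = stale;

    // Thread 1: "cmp m_lock, dwId" matches, so it loops and stores 0 again.
    if (m_lock == 1) { m_lock = 0; }

    // Thread 3: retry of cmpxchg(m_lock, 0, 3) succeeds while thread 2
    // is still inside.
    if (m_lock == 0) { m_lock = 3; ++inside; }

    return inside;
}
```

Under these assumptions the replay ends with two threads inside, and it
needs three competing threads to get there.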


Bye.
Sincerely yours, Michael.
