Subj : Re: NPTL2 or something
To : comp.programming.threads
From : SenderX
Date : Tue Jan 04 2005 07:12 am

> Windows doesn't have anything like pthread_cancel so I don't include one.
> Anyway the eventcount api for windows is different than the unix one,
> there's no point in making them the same.

The eventcount for windows uses atomic_ptr, correct? A simple eventcount for linux could use futexes directly.

I posted some *broken pseudo-code for a windows eventcount a while back that used an event per-thread to build its waitsets. That limits it to SCHED_OTHER. I was thinking of using your idea of a semaphore per-eventcount generation. I believe you posted some windows condvar code that used a semaphore per-waitset. I think you used a simple reference count to determine when it was ok to give up the current waitset's semaphore. That should get rid of the SCHED_OTHER-only restriction.

> I'm staying away from C++ exceptions for now since any system that
> combines synchronous and asynchronous exceptions displays a profound
> and fundamental lack of common sense.

I do like the simplicity of C-style return codes. Mixing threads with exceptions, stack-local destructors and cancellation seems to be a real pain in the neck... ;)

* http://groups-beta.google.com/group/comp.programming.threads/msg/46226c151fda7dd4

I can't post the fully fixed eventcount code, but I can briefly explain how the race-condition can and will occur...

The posted eventcount code does the actual comparison of the local eventcount with the current eventcount as it's locking the waitset. This gives it the ability to atomically compare the counts and lock the waitset. It also brings up a nasty race condition if the following conditions are "all" met:

1. the CAS succeeds
2. the eventcount comparison succeeded
3. the waitset was previously locked
4. the waiters bit was set
5. the thread waits for the lock
6. more than one thread concurrently meets conditions 1-5

An evil race-condition can occur if a "single" one of the threads in condition 6 meets all of the following conditions "after" it gets signaled:

1. the CAS succeeds
2. the eventcount comparison failed

The "problem" thread would not have locked the waitset because the eventcount comparison failed. So, it will not call an unlock function. This can and will leave the other threads waiting for the lock unsignalled forever. This deadlocks the algorithm as a whole.

The solution is to require that a "problem" thread signals another thread in the lock waitset. That's it. Very simple... ;(
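
To make the fix a bit more concrete, here is a rough C++ sketch of the rule. It is not the fixed eventcount code, and it is not the posted pseudo-code either. It assumes a mutex and a condition variable in place of the real CAS on the lock word, and every name in it (WaitsetLock, lock_if_unchanged, signal_one_locked, run_demo) is made up for illustration. The only point is the bail-out path: a thread that consumed a lock wakeup and then fails the eventcount comparison hands the wakeup to the next waiter.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the eventcount's waitset lock. A mutex replaces
// the CAS on the real lock word; none of these names come from the posted code.
class WaitsetLock {
    std::mutex m_;
    std::condition_variable cv_;
    unsigned count_ = 0;   // current eventcount
    bool locked_ = false;  // waitset lock bit
    int waiters_ = 0;      // threads parked in the lock waitset
    int wakeups_ = 0;      // signals sent but not yet consumed

    // Hand one wakeup to the lock waitset if an unsignalled thread is parked.
    // Caller must hold m_.
    void signal_one_locked() {
        if (wakeups_ < waiters_) {
            ++wakeups_;
            cv_.notify_all();
        }
    }

public:
    unsigned get() {
        std::lock_guard<std::mutex> g(m_);
        return count_;
    }

    int waiters() {
        std::lock_guard<std::mutex> g(m_);
        return waiters_;
    }

    // Signal the eventcount: bump the count so later comparisons fail.
    void advance() {
        std::lock_guard<std::mutex> g(m_);
        ++count_;
    }

    // Compare the caller's local count and lock the waitset in one step.
    // Returns true if the lock was taken, false if the count has moved on.
    bool lock_if_unchanged(unsigned local) {
        std::unique_lock<std::mutex> g(m_);
        bool was_signalled = false;
        for (;;) {
            if (count_ != local) {
                // Comparison failed: this thread never owns the lock, so it
                // will never call unlock(). If it consumed a wakeup to get
                // here, it must pass that wakeup on to another waiter.
                if (was_signalled) signal_one_locked();
                return false;
            }
            if (!locked_) {
                locked_ = true;
                return true;
            }
            ++waiters_;
            cv_.wait(g, [this] { return wakeups_ > 0; });
            --wakeups_;
            --waiters_;
            was_signalled = true;
        }
    }

    void unlock() {
        std::lock_guard<std::mutex> g(m_);
        assert(locked_);
        locked_ = false;
        signal_one_locked();  // wakes at most one parked thread
    }
};

// Park n threads behind a held lock, advance the count, then unlock once.
// Returns how many of the n threads came back with "comparison failed".
int run_demo(int n) {
    WaitsetLock ec;
    unsigned local = ec.get();
    bool owner = ec.lock_if_unchanged(local);  // calling thread holds the lock
    std::vector<int> failed(n, 0);
    std::vector<std::thread> threads;
    for (int i = 0; i < n; ++i)
        threads.emplace_back([&ec, &failed, local, i] {
            failed[i] = ec.lock_if_unchanged(local) ? 0 : 1;
        });
    while (ec.waiters() < n) std::this_thread::yield();
    ec.advance();  // every parked thread's comparison will now fail
    ec.unlock();   // wakes exactly one of them
    for (auto& t : threads) t.join();
    int total = 0;
    for (int f : failed) total += f;
    return owner ? total : -1;
}
```

In this model, if you take out the `if (was_signalled) signal_one_locked();` line, run_demo(3) should hang in join(): the first woken thread returns without unlocking, and the other two stay parked with nobody left to signal them.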