Subj : Re: Memory visibility and the C compiler... ( a bit long )
To : comp.programming.threads
From : Eric Sosman
Date : Fri Jan 14 2005 03:16 pm

SenderX wrote:
>>SenderX wrote:
>>[...]
>>
>>>>Reordering across "critical instructions" (inline asm
>>>>and/or external asm functions) is still possible (and is in fact
>>>>desirable as long as it doesn't break anything).
>>>
>>>Are you talking about how the C compiler can reorder calls to external
>>>assembled functions?
>>
>>Yes. For example, an external assembled function with the effect of
>>
>>   int f(...) {
>>     return 42;
>>   }
>>
>>can be reordered in all the ways you can imagine. Oder?
>
> I see. I wonder how any pthread implementation gets around this fact? Most
> pthread impls already use processor specific assembly to implement their
> mutex api's. Could a compiler reorder variables around calls to a
> lock-unlock function pair in a way that could mess things up?

The compiler is a part of the Pthreads implementation, and must
refrain from doing things that would violate the Pthreads semantics.
For most compilers this is not too difficult: the things that happen
in pthread_mutex_lock(), say, are modelled well by C's notions of
side-effects and sequence points. Since the compiler doesn't usually
"know" what global data an external function might read or write, it
won't take the risk of reordering accesses to those data across
function calls.

Things get tougher if some or all of the Pthreads functions are
visible to the compiler (perhaps as inline functions), or if
aggressive optimization is done at what used to be called "link
time." In such cases the compiler might be fooled: if it "knows"
that the only object touched by pthread_mutex_lock() is the mutex,
it might decide to move the lock call to a point after the access to
the object the lock protects -- after all, Pthreads makes no explicit
connection between the lock and the protected object, so as far as
the compiler can tell they are independent.

In such an implementation, the Pthreads functions need to be flagged
in some implementation-magical way to prevent unacceptable
optimizations. There'll be some kind of signal to the compiler --
perhaps a #pragma or something of the sort -- declaring that despite
appearances, pthread_mutex_lock() should be treated as if it read
and wrote all the accessible memory, so the optimizer dare not move
the actual call out from between its nominal sequence points.

When you're trying to roll your own synchronization with something
like sequences of asm() code inserted inline amid the C code, you
need to know how to prevent the compiler from being too aggressive.
For compiler X you may simply "know" that asm() code never gets
rearranged; for compiler Y you may need to utter the magic _Pragma
to prevent the rearrangement. This is one of the reasons it's
difficult to rebuild your kernel with a different C compiler ...

--
Eric.Sosman@sun.com
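
To make the two cases above concrete, here is a rough sketch rather than anything authoritative. It assumes a GCC- or Clang-style compiler, where an empty extended asm statement with a "memory" clobber is the usual "magic" for telling the optimizer that all accessible memory may have been read or written. The names COMPILER_BARRIER, increment_locked, publish, counter, payload and ready are made up for illustration; other compilers spell the barrier differently, if they offer one at all.

```c
#include <pthread.h>

/* Assumption: GCC/Clang extended asm. The empty statement emits no
   instruction; the "memory" clobber only stops the *compiler* from
   moving loads and stores across this point. It is not a hardware
   memory barrier. */
#define COMPILER_BARRIER() __asm__ __volatile__("" ::: "memory")

static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static long counter = 0;  /* protected by counter_lock, by convention only */

/* Ordinary case: rely on the implementation. Nothing in the source ties
   counter to counter_lock; the compiler must simply not move the access
   outside the lock/unlock pair. */
long increment_locked(void)
{
    long seen;

    pthread_mutex_lock(&counter_lock);
    seen = ++counter;
    pthread_mutex_unlock(&counter_lock);
    return seen;
}

/* Roll-your-own case (hypothetical): keep the compiler from sinking the
   store to payload below the store to ready. */
static int payload;
static volatile int ready;

void publish(int value)
{
    payload = value;
    COMPILER_BARRIER();
    ready = 1;
}
```

Note that publish() addresses only the compiler half of the problem. On hardware that reorders stores, a real implementation would also need whatever fence instruction the processor requires, which is one more reason to leave this to pthread_mutex_lock() when you can.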