Subj : Re: Memory Barriers, Compiler Optimizations, etc.
To : comp.programming.threads
From : David Schwartz
Date : Sun Feb 06 2005 12:50 pm

"Scott Meyers" wrote in message
news:MPG.1c6fc390ebc1c3ed9897c2@news.hevanet.com...

> On Sat, 5 Feb 2005 17:26:11 -0800, David Schwartz wrote:

>> This is a meaningless requirement because it doesn't say *where* the
>> order needs to be preserved. One could argue that an L2 cache violates
>> this requirement and the C standard requires you to disable the L2
>> cache for volatile accesses. The problem is that the standard simply
>> calls the order of such accesses part of the 'observable behavior' of
>> the program with no concept of how or where such a thing is to be
>> observed.

> I'd imagine it's the observable behavior of the abstract machine, not any
> real machine, since the entire standard involves only an abstract machine.

Exactly. The problem is, real machines don't have a (single,
well-defined) point from which they can be observed. This makes the
observation requirement meaningless.

> My take would be if x and y are volatile and the generated code (program
> order) accesses x before y, the compiler is off the hook, regardless of
> what happens on any real machine at runtime.

Nonsense. The C++ standard applies to the entire machine, not just the
compiler. A compiler does not conform to the C++ standard merely because
it generates code that might conform (in the sense of an abstract
machine) on some hypothetical hardware. The system as a whole complies
if the compiler generates code that conforms when run on a particular
piece of hardware. The standard is about an abstract machine, not
compiled code. In fact, it doesn't even require the compiler to generate
object code at all. It just requires certain particular results.
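
For concreteness, the case under dispute can be sketched as below. This
is only an illustration, not code from either poster: the names x, y,
and store_then_load are hypothetical. It shows two volatile objects
accessed in a fixed program order, which is all the source text itself
can express.

```cpp
// Hypothetical example: x, y, and store_then_load are illustrative names.
volatile int x = 0;
volatile int y = 0;

// In the abstract machine, each volatile access below happens in the
// order written. Nothing in this source says at which point of a real
// machine (emitted instructions, cache, memory controller) that order
// is supposed to be observable.
int store_then_load() {
    x = 1;          // volatile write to x
    y = 2;          // volatile write to y, after x in program order
    int a = x;      // volatile read of x
    int b = y;      // volatile read of y, after x in program order
    return a * 10 + b;
}
```

Called from a single thread, store_then_load() returns 12 however the
hardware orders the underlying memory traffic, so the result alone says
nothing about where the order of the accesses could be observed.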

> After all, single-threaded programs will always behave as if x is
> accessed before y, regardless of what the hardware does (at least
> that's my understanding),

The sentence above may or may not be true, but it's definitely not about
the C++ standard. The C++ standard is not modified by how particular
hardware implementations act. It's either comprehensible in terms of an
abstract machine or it's not. The observation requirement for volatile
accesses is, quite literally, incomprehensible in terms of an abstract
machine.

> and the standards have no concept of more than one thread. I can imagine
> programmers wanting stronger guarantees, but I can't imagine compiler
> writers offering weaker guarantees. Are there real compilers where use
> of volatile does not have the effect of totally ordering accesses in
> the generated code to volatile data?

The standard is not about the generated code itself; it's about what the
generated code does when it's run on the hardware. You cannot have a
conforming C++ compiler whose target is "no hardware in particular". The
C++ standard is in terms of an abstract machine and a conforming compiler
must conform on some particular piece of hardware.

One could argue that hardware on which the concept of observability of
memory accesses is impossible makes it impossible to write a conforming
C++ compiler. On modern x86 systems, you *cannot* enforce the order of
volatile variable accesses in the sense that the C++ standard appears to
require. However, you can't just arbitrarily pick one point in the
implementation and say "ahh, that's where the C++ standard was talking
about observing, between the compiler and the processor executing the
compiled code" because between the processor and the memory controller
is an equally valid point of observation.

DS