Subj : Re: Memory reordering problem (or is it?)
To   : comp.programming.threads
From : Giancarlo Niccolai
Date : Wed Jun 15 2005 12:01 pm

Maciej Sobczak wrote:

> Hi,
>
> Let's consider two threads (ThreadA and ThreadB), ThreadA executing the
> following (C++, but treat it as pseudocode):
>
> int i = 7;
> int j = 0;
> // ...
> j = i;
> i = 0;
>
> Absent any synchronization, I understand that due to memory reordering
> (and no visibility between threads guarantee in general), ThreadB might
> at some point see this:
>
> i == 0 and j == 0
>
> Now, consider that ThreadA does the following instead:
>
> int *p1 = new int;
> int *p2 = 0;
> // ...
> p2 = p1;
> p1 = 0;
>
> and that ThreadB is a simple Garbage Collector that scans the program
> memory (well, its own *perception* of the memory) for whatever it finds
> to be a pointer and uses the following condition:
>
> p1 == 0 and p2 == 0
>
> as an indication that it is OK to reclaim the dynamic object and its
> memory. Gray smoke after that.
>
> I'm not an expert in today's GC technology, but considering the fact
> that it has a lot to do with threads, I would like to ask you whether
> this is considered to be a problem and what GC (or the language and its
> runtime) can do to avoid this.
>
> Regards,
>

The most common solution, and not one of the worst, is to stop all threads
at a safe "stop point", run the collection, and let the threads resume once
the collection is done. This is usually called "stop-the-world" collection;
a sequential mark-sweep is the classic algorithm to run this way. With the
threads parked at a synchronization point, the collector sees a consistent
view of memory, so the reordering you describe cannot mislead it.

Another approach, which may or may not be better depending on your situation
and requirements, is to never stop the threads and use a "concurrent"
garbage collector. The most common is the "three color" (tri-color marking)
approach, in which the GC thread endlessly scans memory in search of
unreachable nodes (the description is imprecise, but it gives the idea).
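To make the "three color" idea a bit more concrete, here is a minimal single-threaded sketch of the marking step. It is only an illustration under stated assumptions: the names (`Node`, `Color`, `tricolor_mark`, `make_example_heap`) are hypothetical, the "heap" is just a vector of nodes addressed by index, and nothing here handles threads that keep modifying pointers while marking runs. A real concurrent collector needs extra machinery (typically a write barrier) so that a move such as `p2 = p1; p1 = 0;` cannot hide a live object from the scan.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical types for illustration; not taken from any real collector.
// White: not yet seen. Gray: seen, children not yet scanned. Black: fully scanned.
enum class Color { White, Gray, Black };

struct Node {
    Color color = Color::White;
    std::vector<std::size_t> refs;  // indices of the nodes this node points to
};

// Marks everything reachable from `roots` and returns the indices of the
// nodes that are still White afterwards (unreachable, hence reclaimable).
std::vector<std::size_t> tricolor_mark(std::vector<Node>& heap,
                                       const std::vector<std::size_t>& roots) {
    for (Node& n : heap) n.color = Color::White;

    // The gray set is the work list: roots start out Gray.
    std::deque<std::size_t> gray;
    for (std::size_t r : roots) {
        if (heap[r].color == Color::White) {
            heap[r].color = Color::Gray;
            gray.push_back(r);
        }
    }

    // Scan one Gray node at a time: shade its White children Gray,
    // then turn the node itself Black.
    while (!gray.empty()) {
        std::size_t cur = gray.front();
        gray.pop_front();
        for (std::size_t child : heap[cur].refs) {
            if (heap[child].color == Color::White) {
                heap[child].color = Color::Gray;
                gray.push_back(child);
            }
        }
        heap[cur].color = Color::Black;
    }

    std::vector<std::size_t> garbage;
    for (std::size_t i = 0; i < heap.size(); ++i) {
        if (heap[i].color == Color::White) garbage.push_back(i);
    }
    return garbage;
}

// Example heap: 0 -> 1 -> 2 is one chain, 3 -> 4 is another.
std::vector<Node> make_example_heap() {
    std::vector<Node> heap(5);
    heap[0].refs.push_back(1);
    heap[1].refs.push_back(2);
    heap[3].refs.push_back(4);
    return heap;
}
```

With node 0 as the only root, calling `tricolor_mark` on the example heap leaves nodes 0, 1 and 2 Black and reports nodes 3 and 4 as garbage. The invariant that matters is that a Black node never points to a White one; the concurrent variants exist to preserve that invariant while the program threads keep running.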
The algorithm is less efficient than a mark-sweep, but it never stops the
threads, which may be good if you need high responsiveness. On a massively
parallel machine tuned for your program (e.g. an SMP system with a number of
CPUs close to the number of your threads), it may even yield higher overall
performance.

More can be found at:
http://www.memorymanagement.org/

Best,
Giancarlo Niccolai.