Subj : Re: recursive mutexes
To : comp.programming.threads
From : Uenal Mutlu
Date : Thu May 19 2005 07:56 pm

"Markus Elfring" wrote
> > Sounds interesting, but I would be convinced only after testing
> > their performance. And here I doubt it can be faster, because
> > I myself had experimented with such structures too and had read papers
> > on this, but unfortunately the performance was very poor due to the
> > additional code checks one has to make. It sums up and degrades
> > the performance.
>
> Would you like to publish your test cases where you got the bad experiences from?

I only did a basic study. After I saw its complexity and its limitations,
I went back to normal lock methods, because I was after a generally usable,
fast method for mutex operations.

> > This assumption is by the fact that you need to put more code to check.
> > That is: more code must be executed; even just two or three if statements
> > can mean too much compared to a classical mutex method using atomic counter.
>
> Would you like to perform a more detailed analysis to show concrete numbers for the
> effects on factors like "code size", "execution speed", "memory consumption",
> "concurrency/parallelization" and "throughput"?

Sorry, I haven't done testing that deep. An atomic counter usually takes
just 4 bytes, with no memory allocation etc., since that would be overkill
for performance. I just measure the elapsed CPU clock ticks.

> How do you think about to compare your approach with the available non-blocking
> synchronization implementations?

I have yet to see one that is generally usable. As far as I understand, such a
generally usable lock-free method cannot exist.

> By the way, the optimization technique "loop unrolling" can produce "more code" with
> improved runtime behaviour under specific conditions.
> Can you measure each statement sequence or function call with precise processor cycles and
> cache latencies to get an estimation for the time ranges?

My measurements are not that sophisticated; I simply measure the net effect
by timing the elapsed clock ticks over a few million iterations of a loop.
I have also looked over some papers on this to see its complexity, and
weighed their limitations against their advantages. In the end I came to the
conclusion that it's not suitable for general use: too complicated, too costly
in terms of execution, and too limited in its use. I concluded that it wasn't
worth investing more time in this. What's your experience?
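For readers who want to try the kind of coarse measurement described above, here is a minimal C++ sketch. It is not the code used for the post; the names `AtomicCounterLock` and `time_lock_loop` are hypothetical, and the lock is only one assumed reading of "a classical mutex method using atomic counter" (a non-recursive spinlock on a single 4-byte atomic). It times a lock/unlock loop with `std::clock()`, which reports processor time in ticks, not exact cycle counts.

```cpp
#include <atomic>
#include <cstdint>
#include <ctime>
#include <mutex>

// Hypothetical minimal lock on one 4-byte atomic counter (0 = free, 1 = held).
// Assumption for illustration only: it spins instead of blocking and is not recursive.
class AtomicCounterLock {
public:
    void lock() {
        std::uint32_t expected = 0;
        // Spin until the counter is swapped from 0 to 1.
        while (!state_.compare_exchange_weak(expected, 1,
                                             std::memory_order_acquire,
                                             std::memory_order_relaxed)) {
            expected = 0;  // a failed exchange stores the observed value here
        }
    }
    void unlock() { state_.store(0, std::memory_order_release); }

private:
    std::atomic<std::uint32_t> state_{0};
};

// Locks and unlocks `iterations` times, incrementing `counter` inside the
// critical section, and returns the elapsed processor time in clock ticks.
template <typename Lock>
std::clock_t time_lock_loop(Lock& lock, long iterations, long& counter) {
    const std::clock_t start = std::clock();
    for (long i = 0; i < iterations; ++i) {
        lock.lock();
        ++counter;
        lock.unlock();
    }
    return std::clock() - start;
}
```

The same template accepts `std::mutex` or `std::recursive_mutex`, so calling `time_lock_loop` with a few million iterations on each lock type gives a rough single-threaded comparison. It says nothing about behaviour under contention, and the tick resolution (`CLOCKS_PER_SEC`) varies by platform.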