Subj : Re: pthread_create and memory
To   : comp.programming.threads
From : Marcin 'Qrczak' Kowalczyk
Date : Tue Mar 15 2005 12:20 am

David Hopwood writes:

> The point is that an OS should support a thread abstraction
> implemented in such a way that it can scale to large numbers
> of simultaneous threads, without requiring concurrent language
> implementations to essentially duplicate that functionality.

I agree that this is a worthwhile goal, but it's hard not only because
of scalability. Given an efficient and scalable OS-thread interface,
I still don't know how to implement garbage collection:

a) Taking a mutex for each allocation is out of the question.
   The cost of allocation would be about 20 times higher.

b) Giving each thread a separate young generation, and stopping all
   threads in safe places on each GC, is something I would probably
   be able to implement, but it would probably not perform well either.

   With the current 128 kB default for the global young heap there are
   an awful lot of minor GCs. They are very fast: in the current design
   the cost of a GC is proportional to the amount of live data, and the
   design of my language and some properties of its implementation
   yield a lot of short-lived data. If each minor GC had to iterate
   over all threads and ensure that they are in safe places, and this
   happened several times a second, half of the parallelism would be
   thrown out of the window.

   Also, a thread object would be about 20 times larger. This doesn't
   count the system stack. It's hard to decide what to do with it:
   on one hand my implementation hardly uses it and it would be
   a pity to waste a large stack, and on the other hand making the
   stack small would limit foreign code: in rare cases it would bite.

c) I have no idea how to implement a truly parallel GC which can run
   together with the mutator. I've seen papers which describe such
   a thing, but I haven't understood them yet.

-- 
   __("<         Marcin Kowalczyk
   \__/       qrczak@knm.org.pl
    ^^     http://qrnik.knm.org.pl/~qrczak/
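
To make the difference between (a) and (b) concrete, here is a minimal sketch in C of the two allocation paths. It is not code from the implementation discussed in this post. All names (shared_nursery, local_nursery, alloc_shared, alloc_local, NURSERY_BYTES) are hypothetical, the only number taken from the post is the 128 kB young heap size, and a NULL return merely stands in for "a minor GC would have to run here".

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical nursery size; 128 kB is the default mentioned in the post. */
#define NURSERY_BYTES ((size_t)128 * 1024)

/* Round a request up to a multiple of 8 bytes. */
static size_t round_up8(size_t n) { return (n + 7u) & ~(size_t)7u; }

/* Option (a): one global young heap, every allocation takes a mutex. */
typedef struct {
    pthread_mutex_t lock;
    size_t used;
    unsigned char space[NURSERY_BYTES];
} shared_nursery;

static shared_nursery g_shared = { PTHREAD_MUTEX_INITIALIZER, 0, {0} };

static void *alloc_shared(size_t n) {
    void *p = NULL;
    if (n > NURSERY_BYTES) return NULL;      /* also guards the rounding */
    n = round_up8(n);
    pthread_mutex_lock(&g_shared.lock);      /* paid on every allocation */
    if (n <= NURSERY_BYTES - g_shared.used) {
        p = g_shared.space + g_shared.used;
        g_shared.used += n;
    }
    pthread_mutex_unlock(&g_shared.lock);
    return p;                                /* NULL: a minor GC is needed */
}

/* Option (b): each thread bumps a pointer in its own young generation. */
typedef struct {
    size_t used;
    unsigned char space[NURSERY_BYTES];
} local_nursery;

static _Thread_local local_nursery t_local;  /* C11 thread-local storage */

static void *alloc_local(size_t n) {
    void *p;
    if (n > NURSERY_BYTES) return NULL;
    n = round_up8(n);
    if (n > NURSERY_BYTES - t_local.used)
        return NULL;                         /* NULL: a minor GC is needed */
    p = t_local.space + t_local.used;        /* no lock, no atomics */
    t_local.used += n;
    return p;
}
```

The fast path of alloc_local touches only thread-private memory, which is why (b) avoids the per-allocation locking cost of (a). The sketch leaves out the hard part that (b) is worried about: when a nursery fills up, the collector still has to bring every other thread to a safe place before it can run.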