Subj : Re: pthread_create and memory
To   : comp.programming.threads
From : Marcin 'Qrczak' Kowalczyk
Date : Thu Mar 17 2005 10:06 pm

Giancarlo Niccolai writes:

> Actually, also xharbour generates C code, but that is even 1800
> times slower than the equivalent C. I suggest you to do some
> benchmark on simple math operations, so you have a precise
> measurement.

The Ackermann function is 5 times slower than in C. It does integer
arithmetic and function calls. It's slower because of dynamic typing,
because of portability (the stack is emulated to make it easier to do
tail calls, GC and threads), and because of increased safety (integers
won't overflow unless they become really huge, the stack won't overflow
unless it eats a quarter of RAM, and stack overflow is not fatal unless
the program keeps ignoring stack overflow signals).

The producer/consumer test is 5 times faster than in C, because my
threads are green.

I suspect that some allocation-intensive programs might be faster than
in C because my memory allocator is faster than malloc.

> 500 times? it is quite a lot.

I was surprised too. The program does lots of allocation. For other
programs it might be less, but not much less for CPU-bound work.

> Of course it doesn't matter if the GC loop is short, but to minimize
> your work and make it work also on MT, I suggest you to reduce the
> GC calls forcing to be no more than X per second and then performing
> a suspension of currently ready-to-be-suspended threads; when all
> ready-to-be-suspended threads are suspended, enter the GC.

I don't understand. Are you talking about the current scheme, where
access to the runtime is serialized, or a hypothetical MT runtime?
If an MT runtime, how is it designed?

In an MT runtime the only scheme I can currently imagine implementing
is giving each thread a separate young generation and stopping all
threads on every GC.
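
A minimal sketch of that scheme (a separate young generation per thread, all threads stopped on every GC), written in Python only to show the handshake. It is not how the runtime discussed here is implemented. The names `World`, `safepoint`, `request_gc`, `mutator` and `run` are invented for illustration, and every young object is assumed to survive so that the bookkeeping is easy to check.

```python
import threading


class World:
    """Hypothetical stop-the-world coordinator (illustration only)."""

    def __init__(self, n_threads):
        self.cond = threading.Condition()
        self.active = n_threads   # mutator threads that have not finished
        self.parked = 0           # mutators currently stopped for a GC
        self.gc_requested = False
        self.collections = 0      # number of completed collections
        self.nurseries = {}       # thread name -> its private young generation
        self.old = []             # shared old generation

    def register(self, name):
        """Give a mutator its own young generation."""
        with self.cond:
            return self.nurseries.setdefault(name, [])

    def safepoint(self):
        """Polled by mutators; stops the caller if a GC is pending."""
        with self.cond:
            if self.gc_requested:
                self._park()

    def request_gc(self):
        """Called by a mutator whose young generation is full."""
        with self.cond:
            self.gc_requested = True
            self._park()

    def unregister(self):
        """A finished mutator no longer has to be waited for."""
        with self.cond:
            self.active -= 1
            if self.gc_requested and self.parked == self.active:
                self._collect_and_resume()

    def _park(self):
        # Caller holds self.cond.
        self.parked += 1
        if self.parked == self.active:
            self._collect_and_resume()   # last thread to stop does the GC
        else:
            epoch = self.collections
            while self.collections == epoch:
                self.cond.wait()         # sleep until the GC has finished

    def _collect_and_resume(self):
        # Every live mutator is stopped here, so objects may be moved.
        # Simplification: every young object is treated as a survivor.
        for nursery in self.nurseries.values():
            self.old.extend(nursery)
            nursery.clear()
        self.collections += 1
        self.gc_requested = False
        self.parked = 0
        self.cond.notify_all()


def mutator(world, name, n_allocs, nursery_size):
    nursery = world.register(name)
    for i in range(n_allocs):
        world.safepoint()                 # stop here if someone wants a GC
        if len(nursery) >= nursery_size:
            world.request_gc()            # our young generation is full
        nursery.append((name, i))         # "allocate" without taking a lock
    world.unregister()


def run(n_threads=3, n_allocs=100, nursery_size=8):
    world = World(n_threads)
    threads = [
        threading.Thread(target=mutator, args=(world, t, n_allocs, nursery_size))
        for t in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return world


if __name__ == "__main__":
    w = run()
    print("collections:", w.collections, "promoted objects:", len(w.old))
```

Allocation itself takes no lock in this sketch, but every collection, even one triggered by a single full young generation, waits for all live threads to reach a safepoint. That wait is the synchronization cost in question.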

Note that threads must be stopped even for a minor GC, because newly
allocated data might have been exchanged with other threads, and they
will not like it being moved without warning.

I don't even know whether this would be faster than the current
serialized scheme. It will certainly be slower on a uniprocessor,
because it has larger overheads everywhere, except perhaps in cases
where some threads can run while other threads are waiting for disk
I/O. It might be faster on SMP, unless the cost of synchronizing all
threads whenever any of them wants to GC outweighs the benefits of
parallelism.

It seems the only sensible MT GC would have to be truly parallel,
and this is unfortunately too hard for me to implement.

> As there will be less GCs in MT mode, your program will require more
> resource, but architectures where MT can be usefully applied to
> scripts won't probably be affected very much.

I'm not going to limit it to scripting. After an interpreter is done,
I will be able to implement macros I've already designed; after macros
are done, I will be able to design a more convenient C interface; after
a better C interfacing scheme is implemented, I will be able to make
a good binding to Gtk+. I already have most of the mechanism needed to
make Gtk+ play well with threads. With an increasing number of bindings
to C libraries I'm going to conquer the Linux desktop :-)

--
   __("<         Marcin Kowalczyk
   \__/       qrczak@knm.org.pl
    ^^     http://qrnik.knm.org.pl/~qrczak/