Subj : Re: Futexes are wrong! (Are they?)
To   : comp.programming.threads
From : David Schwartz
Date : Sat Jun 11 2005 07:57 pm

"Jomu" wrote in message
news:1118534401.537945.242020@g49g2000cwa.googlegroups.com...

>> No! The thread gets scheduled and already knows which I/O can be
>> done (say, because a 'select' or 'poll' call has already returned).
>> It only does non-blocking I/O to the network, so the thread can use
>> up its full timeslice. File writes are likewise non-blocking. You
>> should typically only see context switches on file reads that can't
>> be predicted or cached.

> You don't read all I write, it happens... It's okay, in your model,
> if the traffic has already happened. What do you do when your select
> returns no events, but it happens a few microseconds later?

You would block in select.

> Would you spend the rest of your time slice busy-waiting, just in
> case, to ensure no evil context switch?

In the case where there is no work to do, it doesn't matter what you
do. You don't have to be efficient in this case. All that matters is
how efficient you are when there is lots of work to do.

> With a non-blocking select you are actually polling your data, as
> opposed to "interrupt-driven", where every thread waits on a blocked
> read for traffic to happen. What guarantee do you have that there
> would be more than one fd active in your timeslice? Are there some
> numbers I am not aware of? Nobody has produced them, as far as I
> follow these discussions. And without something like that, I really
> can't see how one can "autotune" its parameters.

I never said anything about a non-blocking select. If there is no work
at all to do, you can do no better than to block in select. I have no
idea what you're thinking when you ask questions like these; they show
no comprehension at all of the model I'm talking about. If there's
only one connection that needs work done on it, you can do no better
than to do that work in the thread that's already running anyway.
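To make that concrete, here is a rough sketch of the kind of loop I
mean. It is only a sketch: 'conns', 'nconns' and 'service_conn' are
placeholders, every fd is assumed to be O_NONBLOCK already, and error
and EOF handling are omitted.

    #include <sys/select.h>

    extern int  conns[];              /* connected fds, all O_NONBLOCK */
    extern int  nconns;
    extern void service_conn(int fd); /* non-blocking reads/writes only */

    void event_loop(void)
    {
        for (;;) {
            fd_set rfds;
            int i, maxfd = -1;

            FD_ZERO(&rfds);
            for (i = 0; i < nconns; i++) {
                FD_SET(conns[i], &rfds);
                if (conns[i] > maxfd)
                    maxfd = conns[i];
            }

            /* Nothing to do? Then blocking here costs nothing. */
            if (select(maxfd + 1, &rfds, NULL, NULL, NULL) <= 0)
                continue;

            /* Work to do: the thread that is already running services
               every ready connection before giving up the CPU. */
            for (i = 0; i < nconns; i++)
                if (FD_ISSET(conns[i], &rfds))
                    service_conn(conns[i]);
        }
    }

The only place this thread can block is the select call, and it only
blocks when there is genuinely nothing to do, which, again, is exactly
the case where efficiency doesn't matter.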
> As the topic says, "Futexes are wrong...": signaling-thread
> preemption can be bypassed for (IMHO) the most popular use case. As
> JS told me before - no use in a cheap context switch if you must
> have more of them instead of one. But the bypass which started this
> topic ensures exactly that - efficient, non-ping-pong preemptive
> signaling, which, coupled with a lightweight context switch, makes
> thr-per-con very much a bare-metal solution.

It is bad practice to architect your design around the particulars of
one implementation of a standard.

>> Of course, that's why you use I/O discovery techniques like
>> 'select' and 'poll' and non-blocking I/O.

> And what happens when the incoming data is "just around the corner"
> at that moment? I am sure it would happen often enough to make that
> scheme pretty unusable. "My" approach would ensure a consumer is
> right there to accept the incoming data.

I have no idea what the heck you're talking about. If data is just
around the corner, there are two possibilities. Either you can
nevertheless do work now, or there's nothing to do but wait. In the
first case, you do the work. In the second case, you wait. There is no
other option, and this argument has no bearing on
thread-per-connection versus thread pool.

> What are the drawbacks of thr-per-con? To this moment, it's banging
> on sync primitives for malloc and I/O (and this is not really
> banging, as every thread accesses its fd and its fd only). Also,
> memory management is a far cry from the K&R one, so it's also a
> non-issue here. With lock-free and wait-free techniques for
> synchronization around the corner, it will surely become a non-issue
> in short time. What remains are those few pages of VM for the stack,
> and a few hundred bytes of in-kernel data. With 64-bit machines
> already in consumer hardware, and their tons of gigabytes of VM,
> it's already a problem of the past.

The drawbacks of thread-per-connection are numerous, and you've barely
scratched the surface of them. For one thing, POSIX only guarantees
you 64 threads. Are you okay with a design that might not support more
than 64 connections?

The main drawback of thread-per-connection is that you totally lose
control over what work you do, turning that over to the scheduler.
POSIX does not guarantee any notion of "fairness" in thread
scheduling, and you generally do need some fairness in work
scheduling. So you cannot map threads to work one-to-one.

>> This just isn't a problem. Yes, you will hear people complaining
>> that they can't get their automatic tuning algorithms to work
>> correctly, but that's because they're aiming for perfection. It is
>> still trivially easy to do *way* better than a thread-per-connection
>> model can do.

> Could be. Surely a few years ago, with O(n) and O(n^2) schedulers
> and VM managers - but that's past time too.

You only have the guarantees the standard gives you. Rely on other
things and your performance becomes dependent upon the details of the
implementation. That's bad.

DS