From nobody  Mon Oct 19 11:30:17 1998
Received: (from nobody@localhost)
          by hub.freebsd.org (8.8.8/8.8.8) id LAA03985;
          Mon, 19 Oct 1998 11:30:17 -0700 (PDT)
          (envelope-from nobody)
Message-Id: <199810191830.LAA03985@hub.freebsd.org>
Date: Mon, 19 Oct 1998 11:30:17 -0700 (PDT)
From: info@highwind.com
To: freebsd-gnats-submit@freebsd.org
Subject: pthread_cond_wait() spins the CPU
X-Send-Pr-Version: www-1.0

>Number:         8375
>Category:       kern
>Synopsis:       pthread_cond_wait() spins the CPU
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    freebsd-bugs
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Mon Oct 19 11:40:00 PDT 1998
>Closed-Date:    Thu Dec 10 15:00:20 PST 1998
>Last-Modified:  Thu Dec 10 15:01:19 PST 1998
>Originator:     Robert Fleischman
>Release:        3.0 with LATEST libc_r
>Organization:
HighWind Software, Inc.
>Environment:
% uname -a
FreeBSD zonda.highwind.com 3.0-19980831-SNAP FreeBSD 3.0-19980831-SNAP #0: Mon Aug 31 14:03:19 GMT 1998     root@make.ican.net:/usr/src/sys/compile/GENERIC  i386

>Description:
If a number of threads are waiting on a condition variable, the
application spins the CPU at 100% and eventually hangs. I've included
a test program below that illustrates the problem.
>How-To-Repeat:
/* Illustration of FreeBSD pthread_cond_wait() bug
   
   This program sets up a conditional wait and fires off a dozen threads
   that simply wait for the condition. Once the threads are started, 
   the main thread loops signalling the condition once a second.

   Normally, this should result in "Signalling" and "Got Condition!"
   being printed once a second. However, because of bugs in FreeBSD's
   libc_r, pthread_cond_wait() spins the CPU and no progress is made.

   g++ -o condWaitBug -D_REENTRANT -D_THREAD_SAFE -g -Wall condWaitBug.C -pthread

*/

#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>	// for timeval and select()
#include <unistd.h>

pthread_mutex_t lock;
pthread_cond_t condition;

static void *condThread(void *)
{
    // Wait until we are signalled, then print.
    while (true) {
	assert(!::pthread_cond_wait(&condition, &lock));
	::printf("Got Condition!\n");
    }
}

int main(int, char **)
{
    // Initialize Lock
    pthread_mutexattr_t lock_attr;
    assert(!::pthread_mutexattr_init(&lock_attr));
    assert(!::pthread_mutex_init(&lock, &lock_attr));
    assert(!::pthread_mutexattr_destroy(&lock_attr));

    // Initialize Condition
    pthread_condattr_t cond_attr;
    assert(!::pthread_condattr_init(&cond_attr));
    assert(!::pthread_cond_init(&condition, &cond_attr));
    assert(!::pthread_condattr_destroy(&cond_attr));

    // Lock the lock
    assert(!::pthread_mutex_lock(&lock));

    // Spawn off a dozen threads to get signalled
    for (int j = 0; j < 12; ++j) {
	pthread_t tid;
	pthread_attr_t attr;
	assert(!::pthread_attr_init(&attr));
	assert(!::pthread_create(&tid, &attr, condThread, 0));
	assert(!::pthread_attr_destroy(&attr));
    }

    // Sleep for 3 seconds to make sure the threads started up.
    timeval timeout;
    timeout.tv_sec = 3;
    timeout.tv_usec = 0;
    ::select(0, 0, 0, 0, &timeout);

    for (int k = 0; k < 60; ++k) {
	::printf("Signalling\n");
	::pthread_cond_signal(&condition);

	// Sleep for a second
	timeout.tv_sec = 1;
	timeout.tv_usec = 0;
	::select(0, 0, 0, 0, &timeout);
    }

    return EXIT_SUCCESS;
}

>Fix:

>Release-Note:
>Audit-Trail:

From: "Daniel M. Eischen" <eischen@vigrid.com>
To: freebsd-gnats-submit@freebsd.org, info@highwind.com
Cc: jb@freebsd.org
Subject: Re: kern/8375: pthread_cond_wait() spins the CPU
Date: Fri, 23 Oct 1998 12:40:19 -0400

 
 The spinlock ordering of the cond_wait and mutex structures
 can lead to a deadlock.  Also, kernel scheduling initiated
 by a SIGVTALRM can cause improper mutex and condition
 variable operation.
 
 Attached is a patch that fixes these problems and also
 makes pthread_mutex_[try]lock and pthread_cond_[timed]wait
 return EDEADLK when the caller has not locked the mutex.
 
 Dan Eischen
 eischen@vigrid.com
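 To make the mechanism concrete: the sched_lock and yield_on_sched_unlock
 fields added by the patch form a recursive "defer the scheduler" count.
 A minimal userland model of just that bookkeeping (field names taken from
 the patch, everything else stubbed out) behaves like this:

```c
#include <assert.h>

/* Model of the per-thread fields added by the patch. */
struct thread_model {
    int sched_lock;              /* recursion depth of the scheduler lock */
    int yield_on_sched_unlock;   /* a timer tick arrived while locked */
    int yields;                  /* stand-in for sched_yield() calls */
};

static void sched_lock(struct thread_model *t)
{
    t->sched_lock++;             /* may be taken recursively */
}

static void sched_unlock(struct thread_model *t)
{
    t->sched_lock--;
    /* Only the outermost unlock performs the deferred yield. */
    if (t->sched_lock == 0 && t->yield_on_sched_unlock) {
        t->yield_on_sched_unlock = 0;
        t->yields++;             /* the real code calls sched_yield() */
    }
}

/* What the SIGVTALRM path does when it finds scheduling locked out. */
static void timer_tick(struct thread_model *t)
{
    if (t->sched_lock)
        t->yield_on_sched_unlock = 1;   /* remember, yield later */
    else
        t->yields++;                    /* normal preemption path */
}
```

 A tick that lands inside a nested lock/unlock pair is remembered and the
 yield happens exactly once, at the outermost sched_unlock.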
 
 --------------446B9B3D2781E494167EB0E7
 Content-Type: text/plain; charset=us-ascii; name="uthread.diffs"
 Content-Transfer-Encoding: 7bit
 Content-Disposition: inline; filename="uthread.diffs"
 
 *** pthread_private.h.orig	Thu Oct 22 12:15:26 1998
 --- pthread_private.h	Thu Oct 22 12:16:04 1998
 ***************
 *** 446,451 ****
 --- 446,463 ----
   	/* Signal number when in state PS_SIGWAIT: */
   	int		signo;
   
 + 	/*
 + 	 * Set to non-zero when this thread has locked out thread
 + 	 * scheduling.  We allow this lock to be recursively taken.
 + 	 */
 + 	int		sched_lock;
 + 
 + 	/*
 + 	 * Set to TRUE if this thread should yield after reenabling
 + 	 * thread scheduling.
 + 	 */
 + 	int		yield_on_sched_unlock;
 + 
   	/* Miscellaneous data. */
   	char		flags;
   #define PTHREAD_EXITING		0x0100
 ***************
 *** 655,660 ****
 --- 667,674 ----
   void    _thread_kern_sched(struct sigcontext *);
   void    _thread_kern_sched_state(enum pthread_state,char *fname,int lineno);
   void    _thread_kern_set_timeout(struct timespec *);
 + void    _thread_kern_sched_lock(void);
 + void    _thread_kern_sched_unlock(void);
   void    _thread_sig_handler(int, int, struct sigcontext *);
   void    _thread_start(void);
   void    _thread_start_sig_handler(void);
 *** uthread_cond.c.orig	Sat Oct 17 19:46:57 1998
 --- uthread_cond.c	Thu Oct 22 11:44:46 1998
 ***************
 *** 137,142 ****
 --- 137,150 ----
   	 */
   	else if (*cond != NULL ||
   	    (rval = pthread_cond_init(cond,NULL)) == 0) {
 + 		/*
 + 		 * Disable thread scheduling.  If the thread blocks
 + 		 * in some way, thread scheduling is reenabled.  When
 + 		 * the thread wakes up, thread scheduling will again
 + 		 * be disabled.
 + 		 */
 + 		_thread_kern_sched_lock();
 + 
   		/* Lock the condition variable structure: */
   		_SPINLOCK(&(*cond)->lock);
   
 ***************
 *** 150,184 ****
   			 */
   			_thread_queue_enq(&(*cond)->c_queue, _thread_run);
   
 - 			/* Unlock the mutex: */
 - 			pthread_mutex_unlock(mutex);
 - 
 - 			/* Wait forever: */
 - 			_thread_run->wakeup_time.tv_sec = -1;
 - 
   			/* Unlock the condition variable structure: */
   			_SPINUNLOCK(&(*cond)->lock);
   
 ! 			/* Schedule the next thread: */
 ! 			_thread_kern_sched_state(PS_COND_WAIT,
 ! 			    __FILE__, __LINE__);
   
 ! 			/* Lock the condition variable structure: */
 ! 			_SPINLOCK(&(*cond)->lock);
   
 ! 			/* Lock the mutex: */
 ! 			rval = pthread_mutex_lock(mutex);
   			break;
   
   		/* Trap invalid condition variable types: */
   		default:
   			/* Return an invalid argument error: */
   			rval = EINVAL;
   			break;
   		}
   
 ! 	/* Unlock the condition variable structure: */
 ! 	_SPINUNLOCK(&(*cond)->lock);
   	}
   
   	/* Return the completion status: */
 --- 158,193 ----
   			 */
   			_thread_queue_enq(&(*cond)->c_queue, _thread_run);
   
   			/* Unlock the condition variable structure: */
   			_SPINUNLOCK(&(*cond)->lock);
   
 ! 			/* Unlock the mutex: */
 ! 			if ((rval = pthread_mutex_unlock(mutex)) == 0) {
 ! 
 ! 				/* Wait forever: */
 ! 				_thread_run->wakeup_time.tv_sec = -1;
   
 ! 				/* Schedule the next thread: */
 ! 				_thread_kern_sched_state(PS_COND_WAIT,
 ! 			 	   __FILE__, __LINE__);
   
 ! 				/* Lock the mutex: */
 ! 				rval = pthread_mutex_lock(mutex);
 ! 			}
   			break;
   
   		/* Trap invalid condition variable types: */
   		default:
 + 			/* Unlock the condition variable structure: */
 + 			_SPINUNLOCK(&(*cond)->lock);
 + 
   			/* Return an invalid argument error: */
   			rval = EINVAL;
   			break;
   		}
   
 ! 		/* Reenable thread scheduling. */
 ! 		_thread_kern_sched_unlock();
   	}
   
   	/* Return the completion status: */
 ***************
 *** 201,206 ****
 --- 210,223 ----
   	 */
   	else if (*cond != NULL ||
   	    (rval = pthread_cond_init(cond,NULL)) == 0) {
 + 		/*
 + 		 * Disable thread scheduling.  If the thread blocks
 + 		 * in some way, thread scheduling is reenabled.  When
 + 		 * the thread wakes up, thread scheduling will again
 + 		 * be disabled.
 + 		 */
 + 		_thread_kern_sched_lock();
 + 
   		/* Lock the condition variable structure: */
   		_SPINLOCK(&(*cond)->lock);
   
 ***************
 *** 221,226 ****
 --- 238,246 ----
   			 */
   			_thread_queue_enq(&(*cond)->c_queue, _thread_run);
   
 + 			/* Unlock the condition structure: */
 + 			_SPINUNLOCK(&(*cond)->lock);
 + 
   			/* Unlock the mutex: */
   			if ((rval = pthread_mutex_unlock(mutex)) != 0) {
   				/*
 ***************
 *** 230,245 ****
   				 */
   				_thread_queue_deq(&(*cond)->c_queue);
   			} else {
 - 				/* Unlock the condition variable structure: */
 - 				_SPINUNLOCK(&(*cond)->lock);
 - 
   				/* Schedule the next thread: */
   				_thread_kern_sched_state(PS_COND_WAIT,
   				    __FILE__, __LINE__);
   
 - 				/* Lock the condition variable structure: */
 - 				_SPINLOCK(&(*cond)->lock);
 - 
   				/* Lock the mutex: */
   				if ((rval = pthread_mutex_lock(mutex)) != 0) {
   				}
 --- 250,259 ----
 ***************
 *** 253,265 ****
   
   		/* Trap invalid condition variable types: */
   		default:
   			/* Return an invalid argument error: */
   			rval = EINVAL;
   			break;
   		}
   
 ! 	/* Unlock the condition variable structure: */
 ! 	_SPINUNLOCK(&(*cond)->lock);
   	}
   
   	/* Return the completion status: */
 --- 267,282 ----
   
   		/* Trap invalid condition variable types: */
   		default:
 + 			/* Unlock the condition structure: */
 + 			_SPINUNLOCK(&(*cond)->lock);
 + 
   			/* Return an invalid argument error: */
   			rval = EINVAL;
   			break;
   		}
   
 ! 		/* Reenable thread scheduling. */
 ! 		_thread_kern_sched_unlock();
   	}
   
   	/* Return the completion status: */
 ***************
 *** 276,281 ****
 --- 293,306 ----
   	if (cond == NULL || *cond == NULL)
   		rval = EINVAL;
   	else {
 + 		/*
 + 		 * Disable thread scheduling.  If the thread blocks
 + 		 * in some way, thread scheduling is reenabled.  When
 + 		 * the thread wakes up, thread scheduling will again
 + 		 * be disabled.
 + 		 */
 + 		_thread_kern_sched_lock();
 + 
   		/* Lock the condition variable structure: */
   		_SPINLOCK(&(*cond)->lock);
   
 ***************
 *** 299,304 ****
 --- 324,332 ----
   
   		/* Unlock the condition variable structure: */
   		_SPINUNLOCK(&(*cond)->lock);
 + 
 + 		/* Reenable thread scheduling. */
 + 		_thread_kern_sched_unlock();
   	}
   
   	/* Return the completion status: */
 ***************
 *** 315,320 ****
 --- 343,356 ----
   	if (cond == NULL || *cond == NULL)
   		rval = EINVAL;
   	else {
 + 		/*
 + 		 * Disable thread scheduling.  If the thread blocks
 + 		 * in some way, thread scheduling is reenabled.  When
 + 		 * the thread wakes up, thread scheduling will again
 + 		 * be disabled.
 + 		 */
 + 		_thread_kern_sched_lock();
 + 
   		/* Lock the condition variable structure: */
   		_SPINLOCK(&(*cond)->lock);
   
 ***************
 *** 342,347 ****
 --- 378,386 ----
   
   		/* Unlock the condition variable structure: */
   		_SPINUNLOCK(&(*cond)->lock);
 + 
 + 		/* Reenable thread scheduling. */
 + 		_thread_kern_sched_unlock();
   	}
   
   	/* Return the completion status: */
 *** uthread_create.c.orig	Sat Oct 17 19:46:57 1998
 --- uthread_create.c	Thu Oct 22 11:44:46 1998
 ***************
 *** 90,95 ****
 --- 90,97 ----
   			new_thread->stack = stack;
   			new_thread->start_routine = start_routine;
   			new_thread->arg = arg;
 + 			new_thread->sched_lock = 0;
 + 			new_thread->yield_on_sched_unlock = 0;
   
   			/*
   			 * Write a magic value to the thread structure
 *** uthread_kern.c.orig	Thu Oct 22 12:20:14 1998
 --- uthread_kern.c	Thu Oct 22 23:44:55 1998
 ***************
 *** 196,201 ****
 --- 196,206 ----
   		/* Check if there is a current thread: */
   		if (_thread_run != &_thread_kern_thread) {
   			/*
 + 			 * This thread no longer needs to yield the CPU.
 + 			 */
 + 			_thread_run->yield_on_sched_unlock = 0;
 + 
 + 			/*
   			 * Save the current time as the time that the thread
   			 * became inactive: 
   			 */
 ***************
 *** 1303,1307 ****
 --- 1308,1333 ----
   		}
   	}
   	return;
 + }
 + 
 + void
 + _thread_kern_sched_lock(void)
 + {
 + 	/* Allow the scheduling lock to be recursively set. */
 + 	_thread_run->sched_lock++;
 + }
 + 
 + void
 + _thread_kern_sched_unlock(void)
 + {
 + 	/* Decrement the scheduling lock count. */
 + 	_thread_run->sched_lock--;
 + 
 + 	/* Check to see if we need to yield. */
 + 	if ((_thread_run->sched_lock == 0) &&
 + 	    (_thread_run->yield_on_sched_unlock != 0)) {
 + 		_thread_run->yield_on_sched_unlock = 0;
 + 		sched_yield();
 + 	}
   }
   #endif
 *** uthread_mutex.c.orig	Sat Oct 17 19:46:58 1998
 --- uthread_mutex.c	Thu Oct 22 11:44:46 1998
 ***************
 *** 168,173 ****
 --- 168,181 ----
   	 * initialization:
   	 */
   	else if (*mutex != NULL || (ret = init_static(mutex)) == 0) {
 + 		/*
 + 		 * Disable thread scheduling.  If the thread blocks
 + 		 * in some way, thread scheduling is reenabled.  When
 + 		 * the thread wakes up, thread scheduling will again
 + 		 * be disabled.
 + 		 */
 + 		_thread_kern_sched_lock();
 + 
   		/* Lock the mutex structure: */
   		_SPINLOCK(&(*mutex)->lock);
   
 ***************
 *** 215,220 ****
 --- 223,231 ----
   
   		/* Unlock the mutex structure: */
   		_SPINUNLOCK(&(*mutex)->lock);
 + 
 + 		/* Reenable thread scheduling. */
 + 		_thread_kern_sched_unlock();
   	}
   
   	/* Return the completion status: */
 ***************
 *** 234,239 ****
 --- 245,258 ----
   	 * initialization:
   	 */
   	else if (*mutex != NULL || (ret = init_static(mutex)) == 0) {
 + 		/*
 + 		 * Disable thread scheduling.  If the thread blocks
 + 		 * in some way, thread scheduling is reenabled.  When
 + 		 * the thread wakes up, thread scheduling will again
 + 		 * be disabled.
 + 		 */
 + 		_thread_kern_sched_lock();
 + 
   		/* Lock the mutex structure: */
   		_SPINLOCK(&(*mutex)->lock);
   
 ***************
 *** 241,270 ****
   		switch ((*mutex)->m_type) {
   		/* Fast mutexes do not check for any error conditions: */
   		case MUTEX_TYPE_FAST:
 ! 			/*
 ! 			 * Enter a loop to wait for the mutex to be locked by the
 ! 			 * current thread: 
 ! 			 */
 ! 			while ((*mutex)->m_owner != _thread_run) {
 ! 				/* Check if the mutex is not locked: */
 ! 				if ((*mutex)->m_owner == NULL) {
 ! 					/* Lock the mutex for this thread: */
 ! 					(*mutex)->m_owner = _thread_run;
 ! 				} else {
 ! 					/*
 ! 					 * Join the queue of threads waiting to lock
 ! 					 * the mutex: 
 ! 					 */
 ! 					_thread_queue_enq(&(*mutex)->m_queue, _thread_run);
 ! 
 ! 					/* Unlock the mutex structure: */
 ! 					_SPINUNLOCK(&(*mutex)->lock);
 ! 
 ! 					/* Block signals: */
 ! 					_thread_kern_sched_state(PS_MUTEX_WAIT, __FILE__, __LINE__);
 ! 
 ! 					/* Lock the mutex again: */
 ! 					_SPINLOCK(&(*mutex)->lock);
   				}
   			}
   			break;
 --- 260,295 ----
   		switch ((*mutex)->m_type) {
   		/* Fast mutexes do not check for any error conditions: */
   		case MUTEX_TYPE_FAST:
 ! 			if ((*mutex)->m_owner == _thread_run)
 ! 				ret = EDEADLK;
 ! 			else {
 ! 				/*
 ! 				 * Enter a loop to wait for the mutex to be
 ! 				 * locked by the current thread: 
 ! 				 */
 ! 				while ((*mutex)->m_owner != _thread_run) {
 ! 					/* Check if the mutex is not locked: */
 ! 					if ((*mutex)->m_owner == NULL) {
 ! 						/* Lock the mutex for this thread: */
 ! 						(*mutex)->m_owner = _thread_run;
 ! 					} else {
 ! 						/*
 ! 						 * Join the queue of threads
 ! 						 * waiting to lock the mutex: 
 ! 						 */
 ! 						_thread_queue_enq(&(*mutex)->m_queue,
 ! 						    _thread_run);
 ! 
 ! 						/* Unlock the mutex structure: */
 ! 						_SPINUNLOCK(&(*mutex)->lock);
 ! 	
 ! 						/* Schedule the next thread: */
 ! 						_thread_kern_sched_state(PS_MUTEX_WAIT,
 ! 						    __FILE__, __LINE__);
 ! 
 ! 						/* Lock the mutex structure again: */
 ! 						_SPINLOCK(&(*mutex)->lock);
 ! 					}
   				}
   			}
   			break;
 ***************
 *** 283,288 ****
 --- 308,316 ----
   
   					/* Reset the lock count for this mutex: */
   					(*mutex)->m_data.m_count = 0;
 + 
 + 					/* Unlock the mutex structure: */
 + 					_SPINUNLOCK(&(*mutex)->lock);
   				} else {
   					/*
   					 * Join the queue of threads waiting to lock
 ***************
 *** 293,302 ****
   					/* Unlock the mutex structure: */
   					_SPINUNLOCK(&(*mutex)->lock);
   
 ! 					/* Block signals: */
   					_thread_kern_sched_state(PS_MUTEX_WAIT, __FILE__, __LINE__);
   
 ! 					/* Lock the mutex again: */
   					_SPINLOCK(&(*mutex)->lock);
   				}
   			}
 --- 321,330 ----
   					/* Unlock the mutex structure: */
   					_SPINUNLOCK(&(*mutex)->lock);
   
 ! 					/* Schedule the next thread: */
   					_thread_kern_sched_state(PS_MUTEX_WAIT, __FILE__, __LINE__);
   
 ! 					/* Lock the mutex structure again: */
   					_SPINLOCK(&(*mutex)->lock);
   				}
   			}
 ***************
 *** 307,319 ****
   
   		/* Trap invalid mutex types: */
   		default:
   			/* Return an invalid argument error: */
   			ret = EINVAL;
   			break;
   		}
   
 ! 		/* Unlock the mutex structure: */
 ! 		_SPINUNLOCK(&(*mutex)->lock);
   	}
   
   	/* Return the completion status: */
 --- 335,350 ----
   
   		/* Trap invalid mutex types: */
   		default:
 + 			/* Unlock the mutex structure: */
 + 			_SPINUNLOCK(&(*mutex)->lock);
 + 
   			/* Return an invalid argument error: */
   			ret = EINVAL;
   			break;
   		}
   
 ! 		/* Reenable thread scheduling. */
 ! 		_thread_kern_sched_unlock();
   	}
   
   	/* Return the completion status: */
 ***************
 *** 328,333 ****
 --- 359,372 ----
   	if (mutex == NULL || *mutex == NULL) {
   		ret = EINVAL;
   	} else {
 + 		/*
 + 		 * Disable thread scheduling.  If the thread blocks
 + 		 * in some way, thread scheduling is reenabled.  When
 + 		 * the thread wakes up, thread scheduling will again
 + 		 * be disabled.
 + 		 */
 + 		_thread_kern_sched_lock();
 + 
   		/* Lock the mutex structure: */
   		_SPINLOCK(&(*mutex)->lock);
   
 ***************
 *** 381,386 ****
 --- 420,428 ----
   
   		/* Unlock the mutex structure: */
   		_SPINUNLOCK(&(*mutex)->lock);
 + 
 + 		/* Reenable thread scheduling. */
 + 		_thread_kern_sched_unlock();
   	}
   
   	/* Return the completion status: */
 *** uthread_sig.c.orig	Thu Oct 22 12:28:16 1998
 --- uthread_sig.c	Thu Oct 22 12:14:24 1998
 ***************
 *** 115,120 ****
 --- 115,128 ----
   			yield_on_unlock_thread = 1;
   
   		/*
 + 		 * Check if the scheduler interrupt has come when
 + 		 * the currently running thread has disabled thread
 + 		 * scheduling.
 + 		 */
 + 		else if (_thread_run->sched_lock)
 + 			_thread_run->yield_on_sched_unlock = 1;
 + 
 + 		/*
   		 * Check if the kernel has not been interrupted while
   		 * executing scheduler code:
   		 */
 
 --------------446B9B3D2781E494167EB0E7--
 

From: Dmitrij Tejblum <dima@tejblum.dnttm.rssi.ru>
To: "Daniel M. Eischen" <eischen@vigrid.com>
Cc: freebsd-gnats-submit@FreeBSD.ORG, jb@FreeBSD.ORG
Subject: Re: kern/8375: pthread_cond_wait() spins the CPU 
Date: Sat, 24 Oct 1998 03:22:31 +0400

 Daniel, 
 
 IMO, your _thread_kern_sched_[un]lock() is a bad idea. These functions 
 defeat the purpose of spinlocks. What is the need to do _SPINLOCK/_SPINUNLOCK 
 when scheduling is blocked? Your code does it a lot. OTOH, spinlocks are 
 designed exactly to make rescheduling harmless. And they work; the only 
 problem is that the spinlocks are released at a slightly wrong time. (BTW, why 
 do you disable scheduling in pthread_cond_signal and pthread_cond_broadcast?)
 
 The whole concept of disabling the scheduler is suspicious. There are data 
 structures, and they have to be locked sometimes to provide atomic access to 
 them; why ever disable scheduling? Just lock and unlock properly...
 
 Dima
 
 

From: Daniel Eischen <eischen@vigrid.com>
To: dima@tejblum.dnttm.rssi.ru, eischen@vigrid.com
Cc: freebsd-gnats-submit@FreeBSD.ORG, jb@FreeBSD.ORG
Subject: Re: kern/8375: pthread_cond_wait() spins the CPU
Date: Sat, 24 Oct 1998 00:28:38 -0400 (EDT)

 > IMO, your _thread_kern_sched_[un]lock() is a bad idea. These functions 
 > defeat the purpose of spinlocks. What is the need to do _SPINLOCK/_SPINUNLOCK 
 > when scheduling is blocked? Your code does it a lot. OTOH, spinlocks are 
 > designed exactly to make rescheduling harmless. And they work; the only 
 > problem is that the spinlocks are released at a slightly wrong time. (BTW, why 
 > do you disable scheduling in pthread_cond_signal and pthread_cond_broadcast?)
 
 I don't see any other way of making pthread_cond_[timed]wait     
 bulletproof without disabling scheduling.  You shouldn't    
 allow nesting of spinlocks being taken if there is a chance
 of creating a deadlock.  Let's assume that you do not nest
 the condition variable and mutex spinlocks.  Then pthread_cond_wait
 looks something like this:
 
 		/* Lock the condition variable structure: */
 		_SPINLOCK(&(*cond)->lock);
 
 		/* Process according to condition variable type: */
 		switch ((*cond)->c_type) {
 		/* Fast condition variable: */
 		case COND_TYPE_FAST:
 			/* Set the wakeup time: */
 			_thread_run->wakeup_time.tv_sec = abstime->tv_sec;
 			_thread_run->wakeup_time.tv_nsec = abstime->tv_nsec;
 
 			/* Reset the timeout flag: */
 			_thread_run->timeout = 0;
 
 			/*
 			 * Queue the running thread for the condition
 			 * variable:
 			 */
 			_thread_queue_enq(&(*cond)->c_queue, _thread_run);
 
 			/* Unlock the condition variable structure: */
 			_SPINUNLOCK(&(*cond)->lock);
 
 			/* Unlock the mutex: */
 			if ((rval = pthread_mutex_unlock(mutex)) != 0) {
 
 What happens if you get a SIGVTALRM right after the spinunlock
 of the condition lock and before the pthread_mutex_lock?  Another
 thread can get scheduled and possibly dequeue the thread that
 you just placed on the condition variable queue.  When this
 thread comes back and unlocks the mutex, his state will change
 to PS_COND_WAIT but he'll never be woken up because he's not 
 in the queue anymore.
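 The interleaving above can be modeled as a toy one-waiter state machine
 (no real threads, just the ordering Dan describes; the names here are
 illustrative, not libc_r's):

```c
#include <assert.h>

/* One-slot model of the condition variable's wait queue. */
struct cond_model {
    int queued;    /* waiter is on c_queue */
    int asleep;    /* waiter reached PS_COND_WAIT */
    int woken;     /* a signal actually reached the sleeping waiter */
};

/* Waiter, step 1: enqueue itself and release the cond spinlock. */
static void waiter_enqueue(struct cond_model *c) { c->queued = 1; }

/* Waiter, step 2: change state to PS_COND_WAIT and go to sleep. */
static void waiter_sleep(struct cond_model *c) { c->asleep = 1; }

/* Signaller: dequeue a waiter; the wakeup only lands if it is asleep. */
static void signaller(struct cond_model *c)
{
    if (c->queued) {
        c->queued = 0;
        if (c->asleep)
            c->woken = 1;   /* normal wakeup */
        /* else: the wakeup is lost -- the waiter isn't sleeping yet */
    }
}
```

 If the signaller runs in the window between enqueue and sleep, the waiter
 is dequeued while still awake and its later sleep never ends; with
 scheduling locked out across the window, the wakeup always lands.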
 
 You need to disable thread scheduling.  The way I've done it,
 it doesn't hurt anything and will prevent needless thrashing
 of threads.  You want mutex_lock and cond_wait and friends
 to be atomic and this does that without needless thrashing.
 Note that _thread_kern_sched_lock will *not* block scheduling
 when the thread waits during a spinlock or other blocking
 operations that invoke the thread scheduler.
 
 > The whole concept of disabling the scheduler is suspicious. There are data 
 > structures, and they have to be locked sometimes to provide atomic access to 
 > them; why ever disable scheduling? Just lock and unlock properly...
 
 Sure, where you can.  BTW, I didn't invent this idea myself.
 I stole it from VxWorks and it has the same semantics as
 taskLock and taskUnlock do in VxWorks.  This is also how
 pthread_mutex_lock and pthread_cond_wait are implemented.
 
 Perhaps I was a little overzealous in my usage of _thread_kern_sched_lock
 in pthread_cond_signal and pthread_cond_broadcast.  They are
 probably not needed there, but they would let the operation
 continue without causing needless thrashing.
 
 I didn't take out the spinlocks because I thought they would
 work across processes, whereas locking out thread scheduling,
 at least as implemented here, only works within the process.
 
 Dan Eischen
 eischen@vigrid.com

From: Daniel Eischen <eischen@vigrid.com>
To: dima@tejblum.dnttm.rssi.ru, eischen@vigrid.com
Cc: freebsd-gnats-submit@FreeBSD.ORG, jb@FreeBSD.ORG
Subject: Re: kern/8375: pthread_cond_wait() spins the CPU
Date: Sat, 24 Oct 1998 00:46:23 -0400 (EDT)

 BTW, pthread_mutex_lock also has the same problem as
 pthread_cond_wait if thread scheduling comes at an
 inopportune time:
 
 		_SPINLOCK(&(*mutex)->lock);
 
 		/* Process according to mutex type: */
 		switch ((*mutex)->m_type) {
 		/* Fast mutexes do not check for any error conditions: */
 		case MUTEX_TYPE_FAST:
 			/*
 			 * Enter a loop to wait for the mutex to be locked by the
 			 * current thread: 
 			 */
 			while ((*mutex)->m_owner != _thread_run) {
 				/* Check if the mutex is not locked: */
 				if ((*mutex)->m_owner == NULL) {
 					/* Lock the mutex for this thread: */
 					(*mutex)->m_owner = _thread_run;
 				} else {
 					/*
 					 * Join the queue of threads waiting to lock
 					 * the mutex: 
 					 */
 					_thread_queue_enq(&(*mutex)->m_queue, _thread_run);
 
 					/* Unlock the mutex structure: */
 					_SPINUNLOCK(&(*mutex)->lock);
 
 					/* Block signals: */
 					_thread_kern_sched_state(PS_MUTEX_WAIT, __FILE__, __LINE__);
 
 If thread scheduling is kicked off right after the last
 SPINUNLOCK, then you can also have a thread removed
 from the mutex queue, but it'll never get woken up.
 
 Dan Eischen
 eischen@vigrid.com

From: John Birrell  <jb@cimlogic.com.au>
To: eischen@vigrid.com (Daniel Eischen)
Cc: dima@tejblum.dnttm.rssi.ru, eischen@vigrid.com,
        freebsd-gnats-submit@FreeBSD.ORG, jb@FreeBSD.ORG
Subject: Re: kern/8375: pthread_cond_wait() spins the CPU
Date: Sat, 24 Oct 1998 15:02:14 +1000 (EST)

 Daniel Eischen wrote:
 > BTW, pthread_mutex_lock also has the same problem as
 > pthread_cond_wait if thread scheduling comes at an
 > inopportune time:
 [...]
 > 					 * Join the queue of threads waiting to lock
 > 					 * the mutex: 
 > 					 */
 > 					_thread_queue_enq(&(*mutex)->m_queue, _thread_run);
 > 
 > 					/* Unlock the mutex structure: */
 > 					_SPINUNLOCK(&(*mutex)->lock);
 > 
 > 					/* Block signals: */
 > 					_thread_kern_sched_state(PS_MUTEX_WAIT, __FILE__, __LINE__);
 > 
 > If thread scheduling is kicked off right after the last
 > SPINUNLOCK, then you can also have a thread removed
 > from the mutex queue, but it'll never get woken up.
 
 The simple solution to this is to change the thread state to PS_MUTEX_WAIT
 while the mutex is locked, then enter the scheduler without changing the
 state. I don't think that the problem is one of locking - just the
 possibility that the thread state will be overwritten at an inopportune
 time (i.e. the thread state may be changed to PS_RUNNING before it
 gets a chance to set its state to PS_MUTEX_WAIT).
 
 
 -- 
 John Birrell - jb@cimlogic.com.au; jb@freebsd.org http://www.cimlogic.com.au/
 CIMlogic Pty Ltd, GPO Box 117A, Melbourne Vic 3001, Australia +61 418 353 137

From: Dmitrij Tejblum <dima@tejblum.dnttm.rssi.ru>
To: Daniel Eischen <eischen@vigrid.com>
Cc: freebsd-gnats-submit@FreeBSD.ORG, jb@FreeBSD.ORG
Subject: Re: kern/8375: pthread_cond_wait() spins the CPU 
Date: Sat, 24 Oct 1998 15:49:13 +0400

 Daniel Eischen wrote:
 > I don't see any other way of making pthread_cond_[timed]wait     
 > bulletproof without disabling scheduling.  You shouldn't    
 > allow nesting of spinlocks being taken if there is a chance
 > of creating a deadlock.  Let's assume that you do not nest
 > the condition variable and mutex spinlocks.  
 
 Why? Frankly, I don't see any harm here. Sure, it is not safe to 
 _SPINUNLOCK the condition lock before pthread_mutex_unlock. So it has to be 
 done the other way around.
 
 > You need to disable thread scheduling.  The way I've done it,
 > it doesn't hurt anything and will prevent needless thrashing
 > of threads.  You want mutex_lock and cond_wait and friends
 > to be atomic and this does that without needless thrashing.
 
 I don't think the "thrashing" is dangerous. Spinlocks are held for a very 
 short period of time (this is the main assumption in their design and 
 implementation). The chances that a thread holding a spinlock will be 
 preempted are very low. Even if the thread is preempted, the chances 
 that another thread will try to lock the same spinlock are low.
 
 Dima
 
 

From: Dmitrij Tejblum <dima@tejblum.dnttm.rssi.ru>
To: John Birrell <jb@cimlogic.com.au>
Cc: eischen@vigrid.com (Daniel Eischen), freebsd-gnats-submit@FreeBSD.ORG,
        jb@FreeBSD.ORG
Subject: Re: kern/8375: pthread_cond_wait() spins the CPU 
Date: Sat, 24 Oct 1998 15:49:22 +0400

 John Birrell wrote:
 > Daniel Eischen wrote:
 > > BTW, pthread_mutex_lock also has the same problem as
 > > pthread_cond_wait if thread scheduling comes at an
 > > inopportune time:
 > [...]
 > > 					 * Join the queue of threads waiting to lock
 > > 					 * the mutex: 
 > > 					 */
 > > 					_thread_queue_enq(&(*mutex)->m_queue, _thread_run);
 > > 
 > > 					/* Unlock the mutex structure: */
 > > 					_SPINUNLOCK(&(*mutex)->lock);
 > > 
 > > 					/* Block signals: */
 > > 					_thread_kern_sched_state(PS_MUTEX_WAIT, __FILE__, __LINE__);
 > > 
 > > If thread scheduling is kicked off right after the last
 > > SPINUNLOCK, then you can also have a thread removed
 > > from the mutex queue, but it'll never get woken up.
 > 
 > The simple solution to this is to change the thread state to PS_MUTEX_WAIT
 > while the mutex is locked, then enter the scheduler without changing the
             ^^^^^ spinlock?
 > state. I don't think that the problem is one of locking - just the
 > possibility that the thread state will be overwritten at an inopportune
 > time (i.e. the thread state may be changed to PS_RUNNING before it
 > gets a chance to set its state to PS_MUTEX_WAIT).
 
 IMO, whether this _SPINUNLOCK comes too early or the thread state is changed 
 too late is not that important :-). Anyway, I would suggest adding a 
 'spinlock_t *' parameter to _thread_kern_sched_state. _thread_kern_sched_state 
 would set the state, unlock the spinlock, and enter the scheduler. This would 
 be useful in quite a lot of places. (Richard Seaman sent a patch with a 
 similar idea, but I don't like something in it.)
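 A stubbed sketch of that proposed interface (the names and prototype here
 are hypothetical; the real declarations live in pthread_private.h) could
 look like this:

```c
#include <assert.h>

/* Minimal stand-ins for the libc_r types involved. */
typedef int spinlock_t;
enum pthread_state { PS_RUNNING, PS_MUTEX_WAIT, PS_COND_WAIT };

static enum pthread_state cur_state = PS_RUNNING;
static int scheduler_entered = 0;
static enum pthread_state state_seen_by_scheduler;

static void spin_unlock(spinlock_t *l) { *l = 0; }

/*
 * Sketch of the proposed variant: the state change and the spinlock
 * release happen before the scheduler runs, because the scheduler is
 * only entered afterwards, from inside this one call.
 */
static void sched_state_unlock(enum pthread_state state, spinlock_t *lock,
                               const char *fname, int lineno)
{
    (void)fname; (void)lineno;        /* kept for the debug thingys */
    cur_state = state;                /* 1: set state under the lock */
    spin_unlock(lock);                /* 2: release the spinlock */
    scheduler_entered = 1;            /* 3: enter the scheduler */
    state_seen_by_scheduler = cur_state;
}
```

 The scheduler then always observes the wait state the caller intended,
 since no other thread can run between steps 1 and 3.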
 
 Dima
 
 

From: Daniel Eischen <eischen@vigrid.com>
To: dima@tejblum.dnttm.rssi.ru, jb@cimlogic.com.au
Cc: eischen@vigrid.com, freebsd-gnats-submit@FreeBSD.ORG, jb@FreeBSD.ORG
Subject: Re: kern/8375: pthread_cond_wait() spins the CPU
Date: Sat, 24 Oct 1998 08:30:57 -0400 (EDT)

 > IMO, whether this _SPINUNLOCK comes too early or the thread state is changed 
 > too late is not that important :-). Anyway, I would suggest adding a 
 > 'spinlock_t *' parameter to _thread_kern_sched_state. _thread_kern_sched_state 
 > would set the state, unlock the spinlock, and enter the scheduler. This would 
 > be useful in quite a lot of places. (Richard Seaman sent a patch with a 
 > similar idea, but I don't like something in it.)
 
 I don't like that idea because it adds complication.  It's not
 necessary in most cases.  The thread schedule locking is
 very simple and could be more generally useful in the future
 for other things.
 
 Dan Eischen
 eischen@vigrid.com

From: Daniel Eischen <eischen@vigrid.com>
To: eischen@vigrid.com, jb@cimlogic.com.au
Cc: dima@tejblum.dnttm.rssi.ru, freebsd-gnats-submit@FreeBSD.ORG,
        jb@FreeBSD.ORG
Subject: Re: kern/8375: pthread_cond_wait() spins the CPU
Date: Sat, 24 Oct 1998 08:52:45 -0400 (EDT)

 > > If thread scheduling is kicked off right after the last
 > > SPINUNLOCK, then you can also have a thread removed
 > > from the mutex queue, but it'll never get woken up.
 > 
 > The simple solution to this is to change the thread state to PS_MUTEX_WAIT
 > while the mutex is locked, then enter the scheduler without changing the
 > state. I don't think that the problem is one of locking - just the
 > possibility that the thread state will be overwritten at an inopportune
 > time (i.e. the thread state may be changed to PS_RUNNING before it
 > gets a chance to set its state to PS_MUTEX_WAIT).
 
 That works.  If you want to keep your __FILE__ and __LINE__
 debug thingys, you'll have to add another interface for
 _thread_kern_sched_state that doesn't change the state,
 though.
 
 Dan Eischen
 eischen@vigrid.com

From: Daniel Eischen <eischen@vigrid.com>
To: dima@tejblum.dnttm.rssi.ru, eischen@vigrid.com
Cc: freebsd-gnats-submit@FreeBSD.ORG, jb@FreeBSD.ORG
Subject: Re: kern/8375: pthread_cond_wait() spins the CPU
Date: Sat, 24 Oct 1998 09:54:39 -0400 (EDT)

 > > I don't see any other way of making pthread_cond_[timed]wait     
 > > bulletproof without disabling scheduling.  You shouldn't    
 > > allow nesting of spinlocks being taken if there is a chance
 > > of creating a deadlock.  Let's assume that you do not nest
 > > the condition variable and mutex spinlocks.  
 >
 > Why? Frankly, I don't see any harm here. Sure, it is not safe to 
 > _SPINUNLOCK the condition lock before pthread_mutex_unlock. So it has to be 
 > done the other way around.
 
 I thought I found a way that you could get a deadlock
 condition here, but it's early in the morning here and
 I can't see how it could happen anymore.  Surely after
 the thread returns from _thread_kern_sched_state you
 don't want to relock (spinlock) the condition variable.
 That fixes the original problem posted.
 
 Dan Eischen
 eischen@vigrid.com

From: Daniel Eischen <eischen@vigrid.com>
To: dima@tejblum.dnttm.rssi.ru, eischen@vigrid.com
Cc: freebsd-gnats-submit@FreeBSD.ORG, jb@FreeBSD.ORG
Subject: Re: kern/8375: pthread_cond_wait() spins the CPU
Date: Sat, 24 Oct 1998 11:08:34 -0400 (EDT)

 > IMO, your _thread_kern_sched_[un]lock() is a bad idea. These functions 
 > defeat the purpose of spinlocks. What is the need to do _SPINLOCK/_SPINUNLOCK 
 > when scheduling is blocked? Your code does it a lot. OTOH, spinlocks are 
 > designed exactly to make rescheduling harmless. And they work; the only 
 > problem is that the spinlocks are released at a slightly wrong time. (BTW, why 
 > do you disable scheduling in pthread_cond_signal and pthread_cond_broadcast?)
 
 Especially for pthread_cond_broadcast, you do not want to
 get preempted while walking the condition variable queue
 to signal waiters.  If you are preempted, other threads
 run, time increments, and that can cause a thread that
 would normally be signalled to be woken up with a timeout
 (assuming it was doing a pthread_cond_timedwait).  At the
 time of the pthread_cond_broadcast, all threads that are
 currently on the waiting list should be woken up.  You do
 not want the scheduler to run and wakeup some of these threads.
 
 Another thing that looks possible, is that the condition
 variable queue can be corrupted.  When a thread is on
 the queue and it times out, I don't see where the scheduler
 removes it from the queue (_thread_queue_remove isn't used).
 What if it times out and immediately does another pthread_cond_timedwait?
 I'll look at this some more later today - perhaps I'm just
 overlooking something.
 
 Dan Eischen
 eischen@vigrid.com
State-Changed-From-To: open->closed 
State-Changed-By: steve 
State-Changed-When: Thu Dec 10 15:00:20 PST 1998 
State-Changed-Why:  
Dan Eischen <eischen@vigrid.com> says that this problem has since 
been fixed.  Thanks Dan for the quick reply. 
>Unformatted:
