From jin@iss-p1.lbl.gov Tue Jun  1 17:57:18 1999
Return-Path: <jin@iss-p1.lbl.gov>
Received: from iss-p1.lbl.gov (iss-p1.lbl.gov [131.243.2.47])
	by hub.freebsd.org (Postfix) with ESMTP id B0D4714E63
	for <FreeBSD-gnats-submit@freebsd.org>; Tue,  1 Jun 1999 17:57:18 -0700 (PDT)
	(envelope-from jin@iss-p1.lbl.gov)
Received: (from jin@localhost)
	by iss-p1.lbl.gov (8.9.3/8.9.3) id RAA22111;
	Tue, 1 Jun 1999 17:57:18 -0700 (PDT)
	(envelope-from jin)
Message-Id: <199906020057.RAA22111@iss-p1.lbl.gov>
Date: Tue, 1 Jun 1999 17:57:18 -0700 (PDT)
From: Jin Guojun (FTG staff) <jin@iss-p1.lbl.gov>
Reply-To: jin@iss-p1.lbl.gov
To: FreeBSD-gnats-submit@freebsd.org
Subject: pthread_kill cannot kill select() threads, etc.
X-Send-Pr-Version: 3.2

>Number:         11984
>Category:       kern
>Synopsis:       pthread_kill cannot kill select() threads, etc.
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    freebsd-bugs
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Tue Jun  1 18:00:01 PDT 1999
>Closed-Date:    Wed Dec 15 22:27:42 PST 1999
>Last-Modified:  Wed Dec 15 22:40:07 PST 1999
>Originator:     Jin Guojun (FTG staff)
>Release:        FreeBSD 3.2-RELEASE i386
>Organization:
>Environment:

	All released versions with libc_r

>Description:

	(1)	pthread_setcanceltype() and pthread_cancel() are missing.
	(2)	pthread_kill() does not function correctly on a thread
		blocked in select(). See the small program in How-To-Repeat.
	To build it without the CCS library, replace the u_thread_* calls
	with plain thread/pthread calls; this version is written for
	universal thread programming.  You may get stdef.h and <uthread.h>
	from the CCS library at
		ftp://george.lbl.gov/pub/ccs/ccs-2.1.tgz

	When you run a.out, you will see all of the print-out.
	Under Solaris/Linux the program then terminates, but under
	FreeBSD a.out hangs. ktrace shows that it hangs in select();
	however, "stest finished" was printed, which means that the
	select() in the test thread has been killed, so where does the
	other select() come from?

>How-To-Repeat:

	
/*	cc -DUSE_PTHREAD pselect.c -l[c_r/pthread/etc.]
*/

#ifdef	CCS_ECL2
#include <stdef.h>
#include <uthread.h>
#else
#include <pthread.h>
#endif

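#include <sys/types.h>
#include <sys/time.h>
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#ifdef	__FreeBSD__
#include <err.h>
#endif

/* Thread body: block in select() until the thread is signalled. */
void *
test(void * np)
{
int	fd = (int)(long) np;
fd_set	rfds;
	FD_ZERO(&rfds);
	FD_SET(fd, &rfds);
	printf("stest started\n");
	/* nfds is 2 as in the original report; fd is normally 3 here,
	 * so this select() watches nothing and blocks until interrupted. */
	select(2, &rfds, NULL, NULL,	NULL);
	printf("stest finished\n");
u_thread_exit(0);
	return (NULL);
}

int
main(void)
{
char	*fn="/tmp/stest", *fmt="can't open file %s\n";
u_thread_t	tid;
int	fd = open(fn, O_RDONLY | O_CREAT, 0660);
	if (fd < 0)	{
#ifdef	__FreeBSD__
		errx(1, fmt, fn);
#else
		printf(fmt, fn);
		exit(1);
#endif
	}

	/* Start the select() thread, give it time to block, then signal it. */
	u_thread_create(&tid, 0, test, (void *)(long) fd);
	sleep(2);
	printf("main kill %d\n", tid);
	u_thread_kill(tid, /* SIGKILL | SIGINT */ SIGTERM);
	printf("main quit\n");
u_thread_exit(0);
}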


>Fix:
	
	Not known yet.

>Release-Note:
>Audit-Trail:

From: Jin Guojun <j_guojun@lbl.gov>
To: freebsd-gnats-submit@freebsd.org, jin@iss-p1.lbl.gov
Cc:  
Subject: Re: kern/11984: pthread_kill cannot kill select() threads, etc.
Date: Thu, 08 Jul 1999 12:46:56 -0700

 Tracing down into the libc_r library, I found two problems:
 
 (1) pthread_exit(void *status)
 {
 	int             sig;
 	long            l;
 	pthread_t       pthread;
 
 	/* Check if this thread is already in the process of exiting: */
 	if ((_thread_run->flags & PTHREAD_EXITING) != 0) {
 		char msg[128];
 		snprintf(msg, sizeof(msg), "Thread %p has called pthread_exit() "
 		    "from a destructor. POSIX 1003.1 1996 s16.2.5.2 does not "
 		    "allow this!",_thread_run);
 		PANIC(msg);
 	}
 
 	/* Flag this thread as exiting: */
 	_thread_run->flags |= PTHREAD_EXITING;
 ...
 }
 
 PTHREAD_EXITING is defined as 0x100, but _thread_run->flags is of type
 char. A char holds only 8 bits, so bit 8 is lost on assignment and
 PTHREAD_EXITING can never be checked or set.
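
The truncation described above can be reproduced in isolation. The following is a minimal sketch, assuming an 8-bit char; the struct and function names are hypothetical and are not taken from libc_r, and only the value 0x100 comes from the report. It uses unsigned char so that the conversion is well defined; plain char loses the bit in the same way on common compilers.

```c
#include <limits.h>

#define FLAG_EXITING	0x100	/* value the report gives for PTHREAD_EXITING */

struct narrow_thread { unsigned char flags; };	/* 8-bit flags, as in the report */
struct wide_thread   { int flags; };		/* wide enough to hold bit 8 */

/* Returns 1 if the flag is still set after being stored, 0 if it was lost. */
int narrow_flag_sticks(void)
{
	struct narrow_thread t = { 0 };

	t.flags |= FLAG_EXITING;	/* result is cut back to 8 bits: bit 8 vanishes */
	return (t.flags & FLAG_EXITING) != 0;
}

/* Same operation on an int field, where the flag survives. */
int wide_flag_sticks(void)
{
	struct wide_thread t = { 0 };

	t.flags |= FLAG_EXITING;
	return (t.flags & FLAG_EXITING) != 0;
}
```

With CHAR_BIT equal to 8, narrow_flag_sticks() returns 0 and wide_flag_sticks() returns 1, which is why widening the flags field (or choosing a flag value below 0x100) makes the check effective.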
 
 
 (2) _thread_gc(pthread_addr_t arg)
 {
 ....
                 /*                  
                  * Check if this is not the last thread and there is no
                  * memory to free this time around. 
                  */                      
                 if (!f_done && p_stack == NULL && pthread_cln == NULL) {
                         /* Get the current time.
                          *      
                          * Note that we can't use clock_gettime(2) on
                          * 2.2.x; use gettimeofday(2) instead.
                          */      
                         struct timeval abstimeval;
                                  
                         if (gettimeofday(&abstimeval, NULL) != 0)
                                 PANIC("gc cannot get time");
                         TIMEVAL_TO_TIMESPEC(&abstimeval, &abstime);
                 
                         /*       
                          * Do a backup poll in 10 seconds if no threads
                          * die before then. 
                          */
                         abstime.tv_sec += 10;
                                  
                         /*       
                          * Wait for a signal from a dying thread or a
                          * timeout (for a backup poll).
                          */     
 /* Line 215 */           if ((ret = pthread_cond_timedwait(&_gc_cond,
                             &_gc_mutex, &abstime)) != 0 &&
                             ret != ETIMEDOUT)
                                 PANIC("gc cannot wait for a signal");  
                 }
                                          
                 /* Unlock the garbage collector mutex: */
                 if (pthread_mutex_unlock(&_gc_mutex) != 0)
                         PANIC("Cannot unlock gc mutex");
                                          
 ...
 }
 
 At line 215, a dying thread is not able to signal the condition that
 the gc thread waits on, so pthread_exit() hangs forever.
 
 
 Fix: unknown; it seems to me this logic needs to be reworked.
 Workaround:	force the gc thread to call exit(0) on timeout, but this
 is only a temporary solution.
 
 	-Jin
 
State-Changed-From-To: open->feedback 
State-Changed-By: nate 
State-Changed-When: Thu Jul 8 13:14:31 PDT 1999 
State-Changed-Why:  
Update to the most recent version of the code, in FreeBSD 3.2R or 3.2-stable, 
and see if the bugs still exist in the new codebase.  Many changes have been 
made since the code in 3.0R, which appears to be what you are using despite 
what the PR stated. 

From: Alfred Perlstein <bright@wintelcom.net>
To: freebsd-gnats-submit@freebsd.org
Cc: jin@iss-p1.lbl.gov, nate@freebsd.org
Subject: Re: kern/11984: pthread_kill cannot kill select() threads, etc.
Date: Tue, 10 Aug 1999 23:21:02 +0000 (GMT)

 Although pthread_cancel() isn't implemented, the select() bug seems to
 be fixed on -current and -stable, as of August 10th 1999 at least.
 
 Can we close this PR?
 
 -Alfred Perlstein - [bright@rush.net|alfred@freebsd.org]
 Wintelcom systems administrator and programmer
    - http://www.wintelcom.net/ [bright@wintelcom.net]

----
15 December 1999 (jasone)
The select bug is fixed on both -current and -stable, and pthread_cancel()
has been added to -current.
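
With pthread_cancel() available, a thread blocked in select() can also be stopped without signals, because select() is a cancellation point under deferred cancellation. The following is a minimal sketch under that assumption, using standard POSIX calls only; the names blocked_in_select and cancel_select_thread are hypothetical, and a pipe that is never written to stands in for the descriptor.

```c
#include <pthread.h>
#include <sys/select.h>
#include <unistd.h>

/* Thread body: block in select() on a descriptor that never becomes ready. */
static void *blocked_in_select(void *arg)
{
	int fd = *(int *)arg, oldtype;
	fd_set rfds;

	/* Deferred is already the default; it is set here only to show the call. */
	pthread_setcanceltype(PTHREAD_CANCEL_DEFERRED, &oldtype);
	FD_ZERO(&rfds);
	FD_SET(fd, &rfds);
	select(fd + 1, &rfds, NULL, NULL, NULL);	/* cancellation point */
	return NULL;	/* not reached when the thread is cancelled */
}

/*
 * Returns 1 if the thread ended through cancellation, 0 if it returned
 * normally, or -1 on setup failure.
 */
int cancel_select_thread(void)
{
	int pfd[2];
	pthread_t tid;
	void *result = NULL;

	if (pipe(pfd) != 0)
		return -1;
	if (pthread_create(&tid, NULL, blocked_in_select, &pfd[0]) != 0) {
		close(pfd[0]);
		close(pfd[1]);
		return -1;
	}
	/* A request made before the thread reaches select() stays pending
	 * and is acted on when the thread gets there. */
	pthread_cancel(tid);
	pthread_join(tid, &result);
	close(pfd[0]);
	close(pfd[1]);
	return result == PTHREAD_CANCELED;
}
```

On a conforming implementation cancel_select_thread() is expected to return 1.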
 
State-Changed-From-To: feedback->closed 
State-Changed-By: jasone 
State-Changed-When: Wed Dec 15 22:27:42 PST 1999 
State-Changed-Why:  
Resolved. 
>Unformatted:
