From tijl@coosemans.org  Tue Jan 12 21:55:05 2010
Return-Path: <tijl@coosemans.org>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 3144A1065672
	for <FreeBSD-gnats-submit@freebsd.org>; Tue, 12 Jan 2010 21:55:05 +0000 (UTC)
	(envelope-from tijl@coosemans.org)
Received: from mailrelay005.isp.belgacom.be (mailrelay005.isp.belgacom.be [195.238.6.171])
	by mx1.freebsd.org (Postfix) with ESMTP id 8BE988FC08
	for <FreeBSD-gnats-submit@freebsd.org>; Tue, 12 Jan 2010 21:55:04 +0000 (UTC)
Received: from 59.24-201-80.adsl-dyn.isp.belgacom.be (HELO kalimero.tijl.coosemans.org) ([80.201.24.59])
  by relay.skynet.be with ESMTP; 12 Jan 2010 22:55:02 +0100
Received: from kalimero.tijl.coosemans.org (kalimero.tijl.coosemans.org [127.0.0.1])
	by kalimero.tijl.coosemans.org (8.14.3/8.14.3) with ESMTP id o0CLt2Rt003164
	for <FreeBSD-gnats-submit@freebsd.org>; Tue, 12 Jan 2010 22:55:02 +0100 (CET)
	(envelope-from tijl@coosemans.org)
Message-Id: <201001122255.02052.tijl@coosemans.org>
Date: Tue, 12 Jan 2010 22:54:49 +0100
From: Tijl Coosemans <tijl@coosemans.org>
To: FreeBSD-gnats-submit@freebsd.org
Subject: [patch] race condition in traced process signal handling

>Number:         142757
>Category:       kern
>Synopsis:       [kernel] [patch] fix race condition in traced process signal handling
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-bugs
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Tue Jan 12 22:00:09 UTC 2010
>Closed-Date:    Tue May 11 20:04:32 UTC 2010
>Last-Modified:  Tue May 11 20:04:32 UTC 2010
>Originator:     Tijl Coosemans
>Release:        FreeBSD 8.0-STABLE i386
>Organization:
>Environment:
>Description:
There's a race condition in the kernel signal handling code that causes
signals to not be delivered to traced processes.

When a process is being traced, it is stopped whenever it receives a
signal so that the parent can intervene. The parent can let the signal
through, suppress it, or replace it with a different signal. In the
latter two cases the original signal has to be taken off the child's
signal queue. Currently this happens after the child has resumed. If
the same signal number is sent again between the resume and the
removal, the child may remove that new instance, which should have
stayed pending.

>How-To-Repeat:
I've attached two test cases that lock up (most easily on a
uniprocessor system).

race1.c:

This program calls ptrace(PT_ATTACH,...) and ptrace(PT_DETACH,...) in a
loop. It locks up while waiting for the child to stop after a PT_ATTACH
call because the SIGSTOP from that PT_ATTACH isn't delivered. This
happens when the child resumes from the previous PT_DETACH only after
the parent has already done the next PT_ATTACH. The child then removes
the new SIGSTOP from its set of pending signals, believing it is
removing the previous one. Instead, the wait4 call should have acted as
a barrier guaranteeing that the previous SIGSTOP has been delivered,
handled and removed, so that a new SIGSTOP can be sent.

This code was derived from a Windows API function in emulators/wine
to read the memory of another process. Calling this function multiple
times in a row tends to fail.

race2.c:

This program shows the same problem but looks more like a GDB debugging
session involving signals. It locks up when the signal sent by the kill
call is never delivered.

For the sake of simplicity the signal is sent by the parent, which GDB
would never do, but imagine debugging a multi-threaded program where
threads send signals to one another. Under certain conditions those
signals may not be delivered.

>Fix:
I've attached a patch (RELENG_8) that I believe fixes the problem.

Basically it takes the signal off the queue before the process is
stopped instead of after the process resumes. If the parent decides to
let the signal through, it is added to the queue again. However, one
cannot simply use sigqueue_get() and sigqueue_add() for this, because
sigqueue_add() would put the siginfo data back at the end of the queue,
which would change the delivery order. Moreover, sigqueue_add() might
fail and return an error.

So the patch introduces a sigqueue_get_partial() function which removes
the signal from sigsets like sigqueue_get() does but leaves siginfo
data on the queue. If the parent decides to let the signal through, it
is simply added to the same sigsets again. If the signal shouldn't go
through or another signal should be delivered the siginfo data is taken
off the queue.

The patch also adds a SIGADDSET(queue->sq_kill, sig) call in the case
where a different signal should be delivered. I believe it is needed
to match what sigqueue_add() would do in that case (adding a signal
without siginfo data).

--- race1.c begins here ---
#include <sys/types.h>
#include <sys/ptrace.h>
#include <sys/wait.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

int main( void ) {
    pid_t pid;
    int status;

    /* fork dummy child process */
    pid = fork();
    if( pid == 0 ) {
        /* child does nothing */
        for( ;; ) {
            sleep( 1 );
        }
    } else {
        /* parent */
        sleep( 1 );
        for( ;; ) {
            /* loop: attach, wait, detach */
            printf( "attach " ); fflush( stdout );
            ptrace( PT_ATTACH, pid, ( caddr_t ) 0, 0 );

            printf( "wait " ); fflush( stdout );
            wait4( pid, &status, 0, NULL );

            printf( "detach " ); fflush( stdout );
            ptrace( PT_DETACH, pid, ( caddr_t ) 1, 0 );

            printf( "ok\n" ); fflush( stdout );
        }
    }

    return 0;
}
--- race1.c ends here ---

--- race2.c begins here ---
#include <sys/types.h>
#include <sys/ptrace.h>
#include <sys/wait.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

int main( void ) {
    pid_t pid;
    int status;

    /* fork dummy child process */
    pid = fork();
    if( pid == 0 ) {
        /* child does nothing */
        for( ;; ) {
            sleep( 1 );
        }
    } else {
        /* parent */
        sleep( 1 );
        ptrace( PT_ATTACH, pid, ( caddr_t ) 0, 0 );
        wait4( pid, &status, 0, NULL );
        for( ;; ) {
            /* loop: continue, kill, wait */
            printf( "continue " ); fflush( stdout );
            ptrace( PT_CONTINUE, pid, ( caddr_t ) 1, 0 );

            printf( "kill " ); fflush( stdout );
            kill( pid, SIGINT );

            printf( "wait " ); fflush( stdout );
            wait4( pid, &status, 0, NULL );

            printf( "ok\n" ); fflush( stdout );
        }
    }

    return 0;
}
--- race2.c ends here ---

--- patch-ptrace begins here ---
--- sys/kern/kern_sig.c.orig	2010-01-12 15:34:42.000000000 +0100
+++ sys/kern/kern_sig.c	2010-01-12 16:24:22.000000000 +0100
@@ -312,6 +312,38 @@ sigqueue_get(sigqueue_t *sq, int signo, 
 	return (signo);
 }
 
+int
+sigqueue_get_partial(sigqueue_t *sq, int signo, ksiginfo_t **si)
+{
+	struct ksiginfo *ksi;
+	int count = 0;
+
+	KASSERT(sq->sq_flags & SQ_INIT, ("sigqueue not inited"));
+
+	*si = NULL;
+
+	if (!SIGISMEMBER(sq->sq_signals, signo))
+		return (0);
+
+	if (SIGISMEMBER(sq->sq_kill, signo)) {
+		count++;
+		SIGDELSET(sq->sq_kill, signo);
+	}
+
+	TAILQ_FOREACH(ksi, &sq->sq_list, ksi_link) {
+		if (ksi->ksi_signo == signo) {
+			if (count == 0)
+				*si = ksi;
+			if (++count > 1)
+				break;
+		}
+	}
+
+	if (count <= 1)
+		SIGDELSET(sq->sq_signals, signo);
+	return (signo);
+}
+
 void
 sigqueue_take(ksiginfo_t *ksi)
 {
@@ -2523,6 +2555,17 @@ issignal(struct thread *td, int stop_all
 			continue;
 		}
 		if (p->p_flag & P_TRACED && (p->p_flag & P_PPWAIT) == 0) {
+			ksiginfo_t *ksi;
+
+			/*
+			 * Clear old signal, but keep siginfo on queue.
+			 */
+			queue = &td->td_sigqueue;
+			if (sigqueue_get_partial(queue, sig, &ksi) == 0) {
+				queue = &p->p_sigqueue;
+				sigqueue_get_partial(queue, sig, &ksi);
+			}
+
 			/*
 			 * If traced, always stop.
 			 */
@@ -2531,17 +2574,12 @@ issignal(struct thread *td, int stop_all
 			mtx_lock(&ps->ps_mtx);
 
 			if (sig != newsig) {
-				ksiginfo_t ksi;
-
-				queue = &td->td_sigqueue;
 				/*
-				 * clear old signal.
-				 * XXX shrug off debugger, it causes siginfo to
-				 * be thrown away.
+				 * Remove old signal from queue.
 				 */
-				if (sigqueue_get(queue, sig, &ksi) == 0) {
-					queue = &p->p_sigqueue;
-					sigqueue_get(queue, sig, &ksi);
+				if( ksi != NULL ) {
+					sigqueue_take(ksi);
+					ksiginfo_tryfree(ksi);
 				}
 
 				/*
@@ -2557,10 +2595,18 @@ issignal(struct thread *td, int stop_all
 				 * Put the new signal into td_sigqueue. If the
 				 * signal is being masked, look for other signals.
 				 */
+				SIGADDSET(queue->sq_kill, sig);
 				SIGADDSET(queue->sq_signals, sig);
 				if (SIGISMEMBER(td->td_sigmask, sig))
 					continue;
 				signotify(td);
+			} else {
+				/*
+				 * Restore old signal.
+				 */
+				if( ksi == NULL )
+					SIGADDSET(queue->sq_kill, sig);
+				SIGADDSET(queue->sq_signals, sig);
 			}
 
 			/*
--- sys/sys/signalvar.h.orig	2010-01-12 16:25:43.000000000 +0100
+++ sys/sys/signalvar.h	2010-01-12 16:26:16.000000000 +0100
@@ -358,6 +358,7 @@ void	sigqueue_delete_set(struct sigqueue
 void	sigqueue_delete(struct sigqueue *queue, int sig);
 void	sigqueue_move_set(struct sigqueue *src, sigqueue_t *dst, sigset_t *);
 int	sigqueue_get(struct sigqueue *queue, int sig, ksiginfo_t *info);
+int	sigqueue_get_partial(struct sigqueue *queue, int sig, ksiginfo_t **info);
 int	sigqueue_add(struct sigqueue *queue, int sig, ksiginfo_t *info);
 void	sigqueue_collect_set(struct sigqueue *queue, sigset_t *set);
 void	sigqueue_move(struct sigqueue *, struct sigqueue *, int sig);
--- patch-ptrace ends here ---

>Release-Note:
>Audit-Trail:

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/142757: commit references a PR
Date: Wed, 20 Jan 2010 11:58:16 +0000 (UTC)

 Author: kib
 Date: Wed Jan 20 11:58:04 2010
 New Revision: 202692
 URL: http://svn.freebsd.org/changeset/base/202692
 
 Log:
   When traced process is about to receive the signal, the process is
   stopped and debugger may modify or drop the signal. After the changes to
   keep process-targeted signals on the process sigqueue, another thread
   may note the old signal on the queue and act before the thread removes
   changed or dropped signal from the process queue. Since process is
   traced, it usually gets stopped. Or, if the same signal is delivered
   while process was stopped, the thread may erroneously remove it,
   intending to remove the original signal.
   
   Remove the signal from the queue before notifying the debugger. Restore
   the siginfo to the head of sigqueue when signal is allowed to be
   delivered to the debuggee, using newly introduced KSI_HEAD ksiginfo_t
   flag. This preserves required order of delivery. Always restore the
   unchanged signal on the curthread sigqueue, not to the process queue,
   since the thread is about to get it anyway, because sigmask cannot be
   changed.
   
   Handle failure of reinserting the siginfo into the queue by falling
   back to sq_kill method, calling sigqueue_add with NULL ksi.
   
   If debugger changed the signal to be delivered, use sigqueue_add()
   with NULL ksi instead of only setting sq_signals bit.
   
   Reported by:	Gardner Bell <gbell72 rogers com>
   Analyzed and first version of fix by:	Tijl Coosemans <tijl coosemans org>
   PR:	142757
   Reviewed by:	davidxu
   MFC after:	2 weeks
 
 Modified:
   head/sys/kern/kern_sig.c
   head/sys/sys/signalvar.h
 
 Modified: head/sys/kern/kern_sig.c
 ==============================================================================
 --- head/sys/kern/kern_sig.c	Wed Jan 20 11:55:14 2010	(r202691)
 +++ head/sys/kern/kern_sig.c	Wed Jan 20 11:58:04 2010	(r202692)
 @@ -357,7 +357,10 @@ sigqueue_add(sigqueue_t *sq, int signo, 
  
  	/* directly insert the ksi, don't copy it */
  	if (si->ksi_flags & KSI_INS) {
 -		TAILQ_INSERT_TAIL(&sq->sq_list, si, ksi_link);
 +		if (si->ksi_flags & KSI_HEAD)
 +			TAILQ_INSERT_HEAD(&sq->sq_list, si, ksi_link);
 +		else
 +			TAILQ_INSERT_TAIL(&sq->sq_list, si, ksi_link);
  		si->ksi_sigq = sq;
  		goto out_set_bit;
  	}
 @@ -378,7 +381,10 @@ sigqueue_add(sigqueue_t *sq, int signo, 
  			p->p_pendingcnt++;
  		ksiginfo_copy(si, ksi);
  		ksi->ksi_signo = signo;
 -		TAILQ_INSERT_TAIL(&sq->sq_list, ksi, ksi_link);
 +		if (si->ksi_flags & KSI_HEAD)
 +			TAILQ_INSERT_HEAD(&sq->sq_list, ksi, ksi_link);
 +		else
 +			TAILQ_INSERT_TAIL(&sq->sq_list, ksi, ksi_link);
  		ksi->ksi_sigq = sq;
  	}
  
 @@ -2492,6 +2498,7 @@ issignal(struct thread *td, int stop_all
  	struct sigacts *ps;
  	struct sigqueue *queue;
  	sigset_t sigpending;
 +	ksiginfo_t ksi;
  	int sig, prop, newsig;
  
  	p = td->td_proc;
 @@ -2529,24 +2536,22 @@ issignal(struct thread *td, int stop_all
  		if (p->p_flag & P_TRACED && (p->p_flag & P_PPWAIT) == 0) {
  			/*
  			 * If traced, always stop.
 +			 * Remove old signal from queue before the stop.
 +			 * XXX shrug off debugger, it causes siginfo to
 +			 * be thrown away.
  			 */
 +			queue = &td->td_sigqueue;
 +			ksi.ksi_signo = 0;
 +			if (sigqueue_get(queue, sig, &ksi) == 0) {
 +				queue = &p->p_sigqueue;
 +				sigqueue_get(queue, sig, &ksi);
 +			}
 +
  			mtx_unlock(&ps->ps_mtx);
  			newsig = ptracestop(td, sig);
  			mtx_lock(&ps->ps_mtx);
  
  			if (sig != newsig) {
 -				ksiginfo_t ksi;
 -
 -				queue = &td->td_sigqueue;
 -				/*
 -				 * clear old signal.
 -				 * XXX shrug off debugger, it causes siginfo to
 -				 * be thrown away.
 -				 */
 -				if (sigqueue_get(queue, sig, &ksi) == 0) {
 -					queue = &p->p_sigqueue;
 -					sigqueue_get(queue, sig, &ksi);
 -				}
  
  				/*
  				 * If parent wants us to take the signal,
 @@ -2561,10 +2566,20 @@ issignal(struct thread *td, int stop_all
  				 * Put the new signal into td_sigqueue. If the
  				 * signal is being masked, look for other signals.
  				 */
 -				SIGADDSET(queue->sq_signals, sig);
 +				sigqueue_add(queue, sig, NULL);
  				if (SIGISMEMBER(td->td_sigmask, sig))
  					continue;
  				signotify(td);
 +			} else {
 +				if (ksi.ksi_signo != 0) {
 +					ksi.ksi_flags |= KSI_HEAD;
 +					if (sigqueue_add(&td->td_sigqueue, sig,
 +					    &ksi) != 0)
 +						ksi.ksi_signo = 0;
 +				}
 +				if (ksi.ksi_signo == 0)
 +					sigqueue_add(&td->td_sigqueue, sig,
 +					    NULL);
  			}
  
  			/*
 
 Modified: head/sys/sys/signalvar.h
 ==============================================================================
 --- head/sys/sys/signalvar.h	Wed Jan 20 11:55:14 2010	(r202691)
 +++ head/sys/sys/signalvar.h	Wed Jan 20 11:58:04 2010	(r202692)
 @@ -234,6 +234,7 @@ typedef struct ksiginfo {
  #define	KSI_EXT		0x02	/* Externally managed ksi. */
  #define KSI_INS		0x04	/* Directly insert ksi, not the copy */
  #define	KSI_SIGQ	0x08	/* Generated by sigqueue, might ret EGAIN. */
 +#define	KSI_HEAD	0x10	/* Insert into head, not tail. */
  #define	KSI_COPYMASK	(KSI_TRAP|KSI_SIGQ)
  
  #define	KSI_ONQ(ksi)	((ksi)->ksi_sigq != NULL)
 
State-Changed-From-To: open->closed 
State-Changed-By: kib 
State-Changed-When: Tue May 11 20:03:56 UTC 2010 
State-Changed-Why:  
Fix is committed into HEAD and RELENG_8. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=142757 
>Unformatted:
