From boris@brooknet.com.au  Sun Mar  6 01:20:41 2005
Return-Path: <boris@brooknet.com.au>
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id 02E4816A4D6; Sun,  6 Mar 2005 01:20:41 +0000 (GMT)
Received: from bloodwood.hunterlink.net.au (smtp-local.hunterlink.net.au [203.12.144.17])
	by mx1.FreeBSD.org (Postfix) with ESMTP
	id 4F9FD43D58; Sun,  6 Mar 2005 01:20:37 +0000 (GMT)
	(envelope-from boris@brooknet.com.au)
Received: from localhost (ppp2DA6.dyn.pacific.net.au [61.8.45.166])
	by bloodwood.hunterlink.net.au (8.12.8/8.12.8) with ESMTP id j261KTlt010181;
	Sun, 6 Mar 2005 12:20:30 +1100
Received: by localhost (Postfix, from userid 1001)
	id 701FB17D8; Sun,  6 Mar 2005 12:21:46 +1100 (EST)
Message-Id: <20050306012146.701FB17D8@localhost>
Date: Sun,  6 Mar 2005 12:21:46 +1100 (EST)
From: Sam Lawrance <boris@brooknet.com.au>
Reply-To: Sam Lawrance <boris@brooknet.com.au>
To: FreeBSD-gnats-submit@freebsd.org
Cc: current@freebsd.org
Subject: Swapped out procs not brought in immediately after child exits
X-Send-Pr-Version: 3.113
X-GNATS-Notify:

>Number:         78474
>Category:       kern
>Synopsis:       [kernel] [patch] swapped out procs not brought in immediately after child exits
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sun Mar 06 01:30:19 GMT 2005
>Closed-Date:    Sun Dec 18 04:53:55 GMT 2005
>Last-Modified:  Sun Dec 18 04:53:55 GMT 2005
>Originator:     Sam Lawrance
>Release:        FreeBSD 5.4-PRERELEASE i386
>Organization:
>Environment:
System: FreeBSD dirk.no.domain 5.4-PRERELEASE FreeBSD 5.4-PRERELEASE #10: Sun Mar 6 10:45:13 EST 2005 root@dirk.no.domain:/usr/testbuild/src5/sys/i386/compile/GENERIC i386


>Description:

I run -stable on my lonely box, but AFAICS this affects current.

This problem is similar in flavour to one that I reported a while ago,
since fixed.

Here's an example. Below we have a login, shell and su which have
swapped out, and a shell which is active:

root 4291  0.0  0.0  1664     0  v3  IWs  -         0:00.00 login [pam] (login)
sam  4298  0.0  0.0  2260     0  v3  IW   -         0:00.00 -bash (bash)
root 4299  0.0  0.0  1644     0  v3  IW   -         0:00.00 su
root 4300  0.0  0.4  2952  1132  v3  S+    3:23PM   0:00.66 su (bash)

When 4300 exits, it sits in the zombie state for a long time, waiting
for its parent 4299 to be swapped in.  The same happens in turn when
4299 and 4298 exit.

The kernel call stack for 4300 would be something like

	exit1
	  kern_exit
	    wakeup (parent process as wait channel)
	      sleepq_broadcast
	        sleepq_resume_thread (on parent process)
	          setrunnable

In setrunnable, curthread->td_pflags is flagged with TDP_WAKEPROC0 to
indicate the vm scheduler should be awoken to do its thing.

David Xu's original change was to check for TDP_WAKEPROC0 in
critical_exit() and call wakeup(&proc0) from there. Things were arranged
this way to prevent a lock order reversal (LOR) between sched_lock and
the sleepqueue locks.

That scheme doesn't take into account that exit1() does a
critical_enter() that has no corresponding critical_exit() in that
thread (because the exiting thread grabs sched_lock which is held across
cpu_throw).

So the wakeup is not done, and we just have to wait for the vm's tsleep
on proc0 to time out. The same thing might occur across other exit
points, but I don't know what they are.
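
The lost wakeup can be reproduced in a small user-space model. This is
only a sketch under assumptions, not kernel code: the model_* names and
the flag value are made up for illustration, the check in
model_critical_exit() is assumed to fire only when the nesting count
drops to zero, and holding sched_lock is modelled as one extra level of
critical nesting.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag value; only the name mirrors the kernel's. */
#define TDP_WAKEPROC0 0x01

/* Minimal stand-in for the two struct thread fields that matter here. */
struct model_thread {
	int td_pflags;		/* private per-thread flags */
	int td_critnest;	/* critical section nesting depth */
};

/* Counts simulated wakeup(&proc0) calls. */
int proc0_wakeups;

static void
model_critical_enter(struct model_thread *td)
{
	td->td_critnest++;
}

/* Assumed scheme: deliver the deferred wakeup when nesting reaches 0. */
static void
model_critical_exit(struct model_thread *td)
{
	assert(td->td_critnest > 0);
	if (--td->td_critnest == 0 && (td->td_pflags & TDP_WAKEPROC0)) {
		td->td_pflags &= ~TDP_WAKEPROC0;
		proc0_wakeups++;
	}
}

/* setrunnable() on a swapped-out process: only record the request. */
static void
model_setrunnable_swapped(struct model_thread *curtd)
{
	curtd->td_pflags |= TDP_WAKEPROC0;
}

/* Balanced critical section: the deferred wakeup is delivered. */
bool
model_normal_wakeup(void)
{
	struct model_thread td = { 0, 0 };

	proc0_wakeups = 0;
	model_critical_enter(&td);
	model_setrunnable_swapped(&td);
	model_critical_exit(&td);
	return (proc0_wakeups == 1);
}

/* Exit path: nesting never returns to 0 before the thread is gone. */
bool
model_exit_path_loses_wakeup(void)
{
	struct model_thread td = { 0, 0 };

	proc0_wakeups = 0;
	model_critical_enter(&td);	/* critical_enter() in exit1() */
	model_setrunnable_swapped(&td);	/* via wakeup(p->p_pptr) */
	model_critical_enter(&td);	/* models taking sched_lock */
	model_critical_exit(&td);	/* critical_exit(): 2 -> 1 */
	/* cpu_throw() would follow; the thread never runs again. */
	return (proc0_wakeups == 0 && (td.td_pflags & TDP_WAKEPROC0) != 0);
}
```

In this model both functions return true: the balanced path delivers the
wakeup, while the exit path leaves the flag set with no wakeup issued.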

>How-To-Repeat:

Run a shell somewhere (the first). From it, run su, another shell, or
similar (the second). Wait until the first shell has been swapped out
(this might require running some memory hogs). Exit the second shell.
Notice that the second shell takes a long time to exit.
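
If no memory hog is at hand, something along these lines may do. It is
a hypothetical helper, not part of the original report: hog_alloc() and
its size parameter are invented here, and how much memory is needed to
push the first shell out to swap depends on the machine.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Allocate `mib` MiB and write to every byte so that the pages are
 * actually backed by memory.  Returns the buffer, or NULL if the
 * allocation failed.  The caller keeps it around for as long as the
 * memory pressure is wanted, then frees it.
 */
void *
hog_alloc(size_t mib)
{
	size_t len = mib * 1024 * 1024;
	void *buf;

	assert(mib > 0);
	buf = malloc(len);
	if (buf != NULL)
		memset(buf, 0xA5, len);	/* touch every page */
	return (buf);
}
```

Calling hog_alloc() from a small main() with a size close to physical
RAM, then sleeping, should keep the pressure on until the program is
killed.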

>Fix:

A possible solution might be to wakeup(&proc0) after waking the parent
and before grabbing sched_lock:

Index: kern_exit.c
===================================================================
RCS file: /home/ncvs/FreeBSD/src/sys/kern/kern_exit.c,v
retrieving revision 1.256
diff -u -r1.256 kern_exit.c
--- kern_exit.c	29 Jan 2005 14:03:41 -0000	1.256
+++ kern_exit.c	6 Mar 2005 01:17:35 -0000
@@ -503,6 +503,7 @@
 	mtx_unlock_spin(&sched_lock);
 	wakeup(p->p_pptr);
 	PROC_UNLOCK(p->p_pptr);
+	wakeup(&proc0);
 	mtx_lock_spin(&sched_lock);
 	critical_exit();
 
>Release-Note:
>Audit-Trail:

From: Sam Lawrance <boris@brooknet.com.au>
To: FreeBSD-gnats-submit@FreeBSD.org, freebsd-bugs@FreeBSD.org
Cc:  
Subject: Re: kern/78474: Swapped out procs not brought in immediately after
	child exits
Date: Sun, 06 Mar 2005 14:15:50 +1100

 A note: that fix is a bit trivialised. Really you'd want to check for
 (td->td_pflags & TDP_WAKEPROC0) first. Pretend the words SAMPLE ONLY are
 stamped across it in red letters.
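
The check mentioned above could look roughly like the following. This is
a user-space model, not the change that was eventually committed:
model_thread, model_flush_wakeproc0() and the flag value are
hypothetical stand-ins, and incrementing a counter takes the place of
wakeup(&proc0).

```c
#include <assert.h>

/* Hypothetical flag value; only the name mirrors the kernel's. */
#define TDP_WAKEPROC0 0x01

/* Minimal stand-in for the one struct thread field used here. */
struct model_thread {
	int td_pflags;
};

/* Counts simulated wakeup(&proc0) calls. */
int proc0_wakeups;

/*
 * Issue the swapper wakeup only if setrunnable() asked for one, and
 * clear the request so it is not delivered twice.  Returns 1 if a
 * wakeup was issued, 0 otherwise.
 */
int
model_flush_wakeproc0(struct model_thread *td)
{
	if ((td->td_pflags & TDP_WAKEPROC0) == 0)
		return (0);
	td->td_pflags &= ~TDP_WAKEPROC0;
	proc0_wakeups++;	/* stands in for wakeup(&proc0) */
	return (1);
}
```

Compared with the unconditional wakeup in the sample patch, this avoids
waking the swapper on every process exit when nothing was swapped out.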
 
Adding to audit trail from misfiled PRs 78745 and 78746: 
Date: Sun, 06 Mar 2005 09:30:10 +0800
From: David Xu <davidxu@freebsd.org>

 My first patch was to make TDP_WAKEPROC0 a per-CPU flag, so that when you
 switch to another thread there must be a critical_exit(). But scottl told
 me that using a per-CPU flag reduces performance to a visible degree; I
 assume he is using a badly designed P4, a long-pipeline core. At least on
 a PIII, I do not see any performance reduction when using a per-CPU flag.
 
 David Xu
Date: Sun, 06 Mar 2005 09:43:34 +0800
From: David Xu <davidxu@freebsd.org>

 >Run a shell somewhere (first). Su or run another shell or similar (second).
 >Wait until the first shell has swapped out (might require running some other
 >memory hogs). Exit the second shell. Notice that the second shell takes a
 >long time to exit.
 >
 This reminds me that it is another swappable kernel stack problem. If we
 didn't have swappable kernel stacks, we wouldn't even need the
 TDP_WAKEPROC0 hack. Interesting. :)
 
 David Xu
State-Changed-From-To: open->closed 
State-Changed-By: lawrance 
State-Changed-When: Sun Dec 18 04:26:03 UTC 2005 
State-Changed-Why:  
Fixed in HEAD kern_exit.c rev 1.261. 
Merged to 5 kern_exit.c rev 1.245.2.11. 
Does not affect 4. 

Issue resolved. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=78474 
>Unformatted:
