From nobody@FreeBSD.org  Fri Oct  1 17:58:13 2004
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id DFC6B16A4CE
	for <freebsd-gnats-submit@FreeBSD.org>; Fri,  1 Oct 2004 17:58:13 +0000 (GMT)
Received: from www.freebsd.org (www.freebsd.org [216.136.204.117])
	by mx1.FreeBSD.org (Postfix) with ESMTP id B6AB843D5A
	for <freebsd-gnats-submit@FreeBSD.org>; Fri,  1 Oct 2004 17:58:13 +0000 (GMT)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.12.11/8.12.11) with ESMTP id i91HwDCL023343
	for <freebsd-gnats-submit@FreeBSD.org>; Fri, 1 Oct 2004 17:58:13 GMT
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.12.11/8.12.11/Submit) id i91HwDiG023342;
	Fri, 1 Oct 2004 17:58:13 GMT
	(envelope-from nobody)
Message-Id: <200410011758.i91HwDiG023342@www.freebsd.org>
Date: Fri, 1 Oct 2004 17:58:13 GMT
From: Aleksey Pesternikov <apesternikov@yahoo.com>
To: freebsd-gnats-submit@FreeBSD.org
Subject: kqueue + EVFILT_TIMER = kernel panic 
X-Send-Pr-Version: www-2.3

>Number:         72234
>Category:       kern
>Synopsis:       kqueue + EVFILT_TIMER = kernel panic
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    jmg
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Fri Oct 01 18:00:45 GMT 2004
>Closed-Date:    Mon Oct 18 16:52:01 GMT 2004
>Last-Modified:  Mon Oct 18 16:52:01 GMT 2004
>Originator:     Aleksey Pesternikov
>Release:        5.3-BETA6
>Organization:
>Environment:
FreeBSD x2.reveredata.com 5.3-BETA6 FreeBSD 5.3-BETA6 #8: Thu Sep 30 16:22:11 PDT 2004     root@x2.reveredata.com:/usr/src/sys/i386/compile/X2  i386

kernel configuration:
include GENERIC
ident           X2
options         VFS_AIO
options         HZ=1000
options         SHMMAXPGS=65536
options         SEMMNI=40
options         SEMMNS=240
options         SEMUME=40
options         SEMMNU=120


FreeBSD loki.reveredata.com 5.3-BETA6 FreeBSD 5.3-BETA6 #3: Mon Sep 27 19:33:45 EDT 2004     root@loki.reveredata.com:/usr/obj/usr/src/sys/LOKI  i386

kernel configuration:
include GENERIC
ident           LOKI
options         HZ=1000
options         NMBCLUSTERS=65535


>Description:
After executing the attached program (possibly several times), both systems crash:

kernel trap 12 with interrupts disabled



Fatal trap 12: page fault while in kernel mode
cpuid=0; apic id = 00
fault virtual address          = 0x108
fault code                     = supervisor read, page not present
instruction pointer            = 0x8:0xc0649b14
stack pointer                  = 0x10:0xe4de6c5c
frame pointer                  = 0x10:0xe4de6c74
code segment                   = base 0x0, limit 0xfffff, type 0x1b
                               = DPL 0, pres 1, def32 1, gran 1
processor eflags               = resume, IOPL = 0
current process                = 36 (swi5: clock sio)
trap number                    = 12
panic: page fault
cpuid = 0
Uptime: 16h12m13s

It looks like the kernel does not clean up the (timer-related?) kqueue structures belonging to a process after the process exits or is killed.

The bug appeared sometime after 5.2.1.

>How-To-Repeat:
The problem is 100% reproducible:

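#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#include <err.h>
#include <stddef.h>

int main(void)
{
	struct kevent ke;
	int kq;

	/* Create the kqueue; it is deliberately never closed explicitly. */
	if ((kq = kqueue()) == -1)
		err(1, "kqueue");

	/* Register a periodic 1000 ms timer; EV_ONESHOT is left off on purpose. */
	EV_SET(&ke, 12345, EVFILT_TIMER, EV_ADD /* | EV_ONESHOT */, 0, 1000 /* msec */, 0);
	if (kevent(kq, &ke, 1, NULL, 0, NULL) == -1)
		err(1, "kevent");

	/* Exit with the timer still registered. */
	return 0;
}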

>Fix:
      
>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-bugs->jmg 
Responsible-Changed-By: maxim 
Responsible-Changed-When: Fri Oct 1 19:35:04 GMT 2004 
Responsible-Changed-Why:  
Reverting rev. 1.80 of sys/kern/kern_event.c does help.  John, could 
you please take a look at this? 

http://www.freebsd.org/cgi/query-pr.cgi?pr=72234 
State-Changed-From-To: open->feedback 
State-Changed-By: jmg 
State-Changed-When: Fri Oct 1 20:02:26 GMT 2004 
State-Changed-Why:  
Could you get DDB compiled into your kernel and give me a backtrace ("tr" 
at the db> prompt)?  This will help me isolate the problem. 

Thanks. 

From: Aleksey Pesternikov <apesternikov@reveredata.com>
To: freebsd-gnats-submit@FreeBSD.org, apesternikov@yahoo.com
Cc:  
Subject: Re: kern/72234: kqueue + EVFILT_TIMER = kernel panic
Date: Fri, 01 Oct 2004 17:56:15 -0700

 Stopped at:       _mtx_lock_sleep+0xba: cmpl $0x4, 0x108(%ebx)
 
 db> tr
 _mtx_lock_sleep(c3066800,c22d3190,0,0,0) at _mtx_lock_sleep+0xba
 filt_timerexpire(c27c4264,ffc00014,0,34,8a1a2453) at filt_timerexpire+0x55
 softclock(0,0,0,0,0) at softclock+0x260
 ithread_loop(c2280c00,e4de3d48,0,0,0) at ithread_loop+0x1a4
 fork_exit(c0645d53,c2280c00,e4de3d48) at fork_exit+0x80
 fork_trampoline() at fork_trampoline+0x8
 --- trap 0x1, eip = 0, esp = 0xe4de3d7c, ebp = 0 ---
 
 

From: John-Mark Gurney <gurney_j@resnet.uoregon.edu>
To: Aleksey Pesternikov <apesternikov@reveredata.com>
Cc:  
Subject: Re: kern/72234: kqueue + EVFILT_TIMER = kernel panic
Date: Sat, 2 Oct 2004 21:06:01 -0700

 
 Ok, try the attached patch.  v1.80 enforced the setting/clearing of
 KN_DETACHED, but the timer filter was never updated to support it.
 This normally happens through knlist_add/_remove, but since timers
 don't need a global list or lock, they do not use these functions.
 
 It appears I can't attach the patch, so here it is inline:
 Index: kern_event.c
 ===================================================================
 RCS file: /home/ncvs/src/sys/kern/kern_event.c,v
 retrieving revision 1.82
 diff -u -r1.82 kern_event.c
 --- kern_event.c	14 Sep 2004 18:38:16 -0000	1.82
 +++ kern_event.c	2 Oct 2004 21:11:32 -0000
 @@ -441,6 +441,7 @@
  	}
  
  	kn->kn_flags |= EV_CLEAR;		/* automatically set */
 +	kn->kn_flags &= ~EV_DETACHED;		/* knlist_add usually sets it */
  	MALLOC(calloutp, struct callout *, sizeof(*calloutp),
  	    M_KQUEUE, M_WAITOK);
  	callout_init(calloutp, 1);
 @@ -461,6 +462,7 @@
  	callout_drain(calloutp);
  	FREE(calloutp, M_KQUEUE);
  	atomic_add_int(&kq_ncallouts, -1);
 +	kn->kn_flags |= EV_DETACHED;	/* knlist_remove usually clears it */
  }
  
  /* XXX - move to kern_timeout.c? */
 
 -- 
   John-Mark Gurney				Voice: +1 415 225 5579
 
      "All that I will do, has been done, All that I have, has not."

From: John-Mark Gurney <gurney_j@resnet.uoregon.edu>
To: Aleksey Pesternikov <apesternikov@reveredata.com>
Cc: freebsd-gnats-submit@FreeBSD.org
Subject: Re: kern/72234: kqueue + EVFILT_TIMER = kernel panic
Date: Mon, 4 Oct 2004 10:14:40 -0700

 John-Mark Gurney wrote this message on Sat, Oct 02, 2004 at 21:06 -0700:
 > Ok, try attached patch...  v1.80 enforced the setting/clearing of
 > KN_DETACHED, but the timer filter was never updated to support it...
 > This normally happens through knlist_add/_remove, but since timers
 > don't need a global list or lock, they do not use these functions..
 
 The last patch was completely misspelled (it used kn_flags and EV_DETACHED
 where kn_status and KN_DETACHED were meant); here is the updated version:
 Index: kern_event.c
 ===================================================================
 RCS file: /home/ncvs/src/sys/kern/kern_event.c,v
 retrieving revision 1.79.2.2
 diff -u -r1.79.2.2 kern_event.c
 --- kern_event.c	17 Sep 2004 17:53:16 -0000	1.79.2.2
 +++ kern_event.c	4 Oct 2004 17:11:31 -0000
 @@ -441,6 +441,7 @@
  	}
  
  	kn->kn_flags |= EV_CLEAR;		/* automatically set */
 +	kn->kn_status &= ~KN_DETACHED;		/* knlist_add usually sets it */
  	MALLOC(calloutp, struct callout *, sizeof(*calloutp),
  	    M_KQUEUE, M_WAITOK);
  	callout_init(calloutp, 1);
 @@ -461,6 +462,7 @@
  	callout_drain(calloutp);
  	FREE(calloutp, M_KQUEUE);
  	atomic_add_int(&kq_ncallouts, -1);
 +	kn->kn_status |= KN_DETACHED;	/* knlist_remove usually clears it */
  }
  
  /* XXX - move to kern_timeout.c? */
 
State-Changed-From-To: feedback->patched 
State-Changed-By: jmg 
State-Changed-When: Wed Oct 13 20:56:46 GMT 2004 
State-Changed-Why:  
patched in -current, still needs to be applied to RELENG_5 

State-Changed-From-To: patched->closed 
State-Changed-By: jmg 
State-Changed-When: Mon Oct 18 16:51:41 GMT 2004 
State-Changed-Why:  
Patch MFC'd to RELENG_5. 

>Unformatted:
