From rb@gid.co.uk  Mon May 12 04:51:51 1997
Received: from agora.rdrop.com (root@agora.rdrop.com [199.2.210.241])
          by hub.freebsd.org (8.8.5/8.8.5) with ESMTP id EAA14841
          for <FreeBSD-gnats-submit@freebsd.org>; Mon, 12 May 1997 04:51:50 -0700 (PDT)
Received: from isbalham.ist.co.uk (isbalham.ist.co.uk [192.31.26.1])
	by agora.rdrop.com (8.8.5/8.8.5) with ESMTP id EAA05853
	for <FreeBSD-gnats-submit@freebsd.org>; Mon, 12 May 1997 04:51:16 -0700 (PDT)
Received: from gid.co.uk (uucp@localhost)
          by isbalham.ist.co.uk (8.8.4/8.8.4) with UUCP
	  id MAA15459 for freebsd.org!FreeBSD-gnats-submit; Mon, 12 May 1997 12:36:52 +0100 (BST)
Message-Id: <11992.199705121141@seagoon.gid.co.uk>
Date: Mon, 12 May 1997 12:41:10 +0100
From: Bob Bishop <rb@gid.co.uk>
Reply-To: rb@gid.co.uk
To: FreeBSD-gnats-submit@freebsd.org
Subject: trap 12 in lockstatus()
X-Send-Pr-Version: 3.2

>Number:         3581
>Category:       kern
>Synopsis:       intermittent trap 12 in lockstatus()
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Mon May 12 05:00:01 PDT 1997
>Closed-Date:    Sun May 13 06:29:59 PDT 2001
>Last-Modified:  Sun May 13 06:30:57 PDT 2001
>Originator:     Bob Bishop
>Release:        FreeBSD 3.0-CURRENT i386
>Organization:
GID ltd
>Environment:
      -current from CTM src-cur.2880
      Kernel is generic except for no cpu I386_CPU and
      built with DDB and config -g. The problem also occurs
      without DDB and -g.

      FreeBSD 3.0-CURRENT #0: Sun May 11 16:12:44 BST 1997
      rb@hal:/source/cleansrc/sys/compile/HAL
      CPU: Cyrix 486DX2 (486-class CPU)
      Origin = "CyrixInstead"  Device ID = 0x321b  Stepping=3
      Revision=2
      real memory  = 20971520 (20480K bytes)
      avail memory = 8351744 (8156K bytes)
      bdevsw_add_generic: adding D_DISK flag for device 7
      bdevsw_add_generic: adding D_DISK flag for device 16
      bdevsw_add_generic: adding D_DISK flag for device 17
      Probing for devices on the ISA bus:
**** "disabled", "not probed", "not found" omitted for clarity
      sc0 at 0x60-0x6f irq 1 on motherboard
      sc0: VGA color <16 virtual consoles, flags=0x0>
      sio0 at 0x3f8-0x3ff irq 4 on isa
      sio0: type 16450
      sio1 at 0x2f8-0x2ff irq 3 on isa
      sio1: type 16450
      lpt0 at 0x378-0x37f irq 7 on isa
      lpt0: Interrupt-driven port
      lp0: TCP/IP capable interface
      fdc0 at 0x3f0-0x3f7 irq 6 drq 2 on isa
      fdc0: NEC 72065B
      fd0: 1.44MB 3.5in
      aha0 at 0x330-0x333 irq 11 drq 5 on isa
      aha0: waiting for scsi devices to settle
      scbus0 at aha0 bus 0
      sd0 at scbus0 target 3 lun 0
      sd0: <IBM DCAS-32160 S60B> type 0 fixed SCSI 2
      sd0: Direct-Access 2063MB (4226725 512 byte sectors)
      1 3C5x9 board(s) on ISA found at 0x300
      ep0 at 0x300-0x30f irq 10 on isa
      ep0: aui/utp/bnc[*BNC*] address 00:a0:24:c4:f3:62
      npx0 on motherboard
      npx0: INT 16 interface

>Description:

	Every 3-4 days or so, apparently uncorrelated with machine
	activity, I get:

	Fatal trap 12: page fault while in kernel mode.

	It's always in lockstatus(); the following trace is
	typical:

	lockstatus(lkp=34)
	ufs_islocked(ap=f409bf54)
	vfs_msync(mp=f0f1aa00, flags=2)
	sync()
	vfs_update()
	[etc]

	The argument to lockstatus() is clearly bogus; small values
	like 0x34 and 0x44 are typical.

>How-To-Repeat:

	Fire up -current on this machine and wait patiently...

>Fix:
	
	Dunno. Assuming it keeps on happening I can probably
	collect more data if anyone has any ideas. However, I
	can't get a crash dump out of this machine at present;
	it wedges just after starting to dump.

>Release-Note:
>Audit-Trail:

From: Tor Egge <Tor.Egge@idi.ntnu.no>
To: rb@gid.co.uk
Cc: FreeBSD-gnats-submit@FreeBSD.ORG
Subject: Re: kern/3581: trap 12 in lockstatus()
Date: Fri, 23 May 1997 18:59:57 +0200

 I just happened to reproduce this error.
 
 -----
 Current directory is /sys/compile/SKARVEN_SMP/
 GDB is free software and you are welcome to distribute copies of it
  under certain conditions; type "show copying" to see the conditions.
 There is absolutely no warranty for GDB; type "show warranty" for details.
 GDB 4.16 (i386-unknown-freebsd), 
 Copyright 1996 Free Software Foundation, Inc...
 IdlePTD 231000
 current pcb at 2141b8
 panic: page fault
 #0  boot (howto=260) at ../../kern/kern_shutdown.c:265
 (kgdb) where
 #0  boot (howto=260) at ../../kern/kern_shutdown.c:265
 #1  0xe0117379 in panic (fmt=0xe01cd0ef "page fault")
     at ../../kern/kern_shutdown.c:393
 #2  0xe01cddf1 in trap_fatal (frame=0xe9464da4) at ../../i386/i386/trap.c:754
 #3  0xe01cd824 in trap_pfault (frame=0xe9464da4, usermode=0)
     at ../../i386/i386/trap.c:661
 #4  0xe01cd43b in trap (frame={tf_es = -381288432, tf_ds = -535756784, 
       tf_edi = -486902272, tf_esi = -474829440, tf_ebp = -381268512, 
       tf_isp = -381268532, tf_ebx = -459927680, tf_edx = 52, tf_ecx = 36, 
       tf_eax = 0, tf_trapno = 12, tf_err = 0, tf_eip = -535747344, tf_cs = 8, 
       tf_eflags = 66118, tf_esp = -381268500, tf_ss = -535102147})
     at ../../i386/i386/trap.c:319
 #5  0xe01124f0 in lockstatus (lkp=0x34) at ../../kern/kern_lock.c:139
 #6  0xe01afd3d in ufs_islocked (ap=0xe9464e04)
     at ../../ufs/ufs/ufs_vnops.c:1798
 #7  0xe0137678 in vfs_msync (mp=0xe2fa7600, flags=2) at vnode_if.h:911
 #8  0xe01380a4 in sync (p=0xe0222674, uap=0x0, retval=0x0)
     at ../../kern/vfs_syscalls.c:479
 #9  0xe0116f61 in boot (howto=256) at ../../kern/kern_shutdown.c:203
 #10 0xe0117379 in panic (fmt=0xe01cd0ef "page fault")
     at ../../kern/kern_shutdown.c:393
 #11 0xe01cddf1 in trap_fatal (frame=0xe9464ef4) at ../../i386/i386/trap.c:754
 #12 0xe01cd824 in trap_pfault (frame=0xe9464ef4, usermode=0)
     at ../../i386/i386/trap.c:661
 #13 0xe01cd43b in trap (frame={tf_es = -381288432, tf_ds = 16, 
       tf_edi = -486902272, tf_esi = -474829440, tf_ebp = -381268176, 
       tf_isp = -381268196, tf_ebx = -459927680, tf_edx = 52, tf_ecx = 36, 
       tf_eax = 0, tf_trapno = 12, tf_err = 0, tf_eip = -535747344, tf_cs = 8, 
       tf_eflags = 66118, tf_esp = -381268164, tf_ss = -535102147})
     at ../../i386/i386/trap.c:319
 #14 0xe01124f0 in lockstatus (lkp=0x34) at ../../kern/kern_lock.c:139
 #15 0xe01afd3d in ufs_islocked (ap=0xe9464f54)
     at ../../ufs/ufs/ufs_vnops.c:1798
 #16 0xe0137678 in vfs_msync (mp=0xe2fa7600, flags=2) at vnode_if.h:911
 #17 0xe01380a4 in sync (p=0xe304bc00, uap=0x0, retval=0x0)
     at ../../kern/vfs_syscalls.c:479
 #18 0xe0132b3b in vfs_update () at ../../kern/vfs_bio.c:1702
 #19 0xe010a606 in kproc_start (udata=0xe0202b40) at ../../kern/init_main.c:249
 #20 0xe01c190b in ?? ()
 (kgdb)
 [...]
 (kgdb) proc 9144
 current pcb at e94b1000
 (kgdb) where
 #0  mi_switch () at ../../kern/kern_synch.c:610
 #1  0xe0118ffd in tsleep (ident=0xe4960f80, priority=8, 
     wmesg=0xe013c0c7 "vn_lock", timo=0) at ../../kern/kern_synch.c:369
 #2  0xe013c135 in vn_lock (vp=0xe4960f80, flags=65538, p=0xe4197e00)
     at ../../kern/vfs_vnops.c:531
 #3  0xe0136701 in vputrele (vp=0xe4960f80, put=0) at ../../kern/vfs_subr.c:1165
 #4  0xe013674d in vrele (vp=0xe4960f80) at ../../kern/vfs_subr.c:1184
 #5  0xe01369bc in vclean (vp=0xe4960f80, flags=8, p=0xe4197e00)
     at ../../kern/vfs_subr.c:1383
 #6  0xe0136b9e in vgonel (vp=0xe4960f80, p=0xe4197e00)
     at ../../kern/vfs_subr.c:1537
 #7  0xe0136ac5 in vop_revoke (ap=0xe94b2f14) at ../../kern/vfs_subr.c:1467
 #8  0xe011022e in exit1 (p=0xe4197e00, rv=1) at vnode_if.h:423
 #9  0xe011862e in sigexit (p=0xe4197e00, signum=1)
     at ../../kern/kern_sig.c:1218
 #10 0xe0118412 in postsig (signum=1) at ../../kern/kern_sig.c:1125
 #11 0xe01ce1b8 in syscall (frame={tf_es = 39, tf_ds = 39, tf_edi = 0, 
       tf_esi = 0, tf_ebp = -541076416, tf_isp = -380948508, tf_ebx = 412092, 
       tf_edx = -541076400, tf_ecx = 415460, tf_eax = 4, tf_trapno = 12, 
       tf_err = 7, tf_eip = 374453, tf_cs = 31, tf_eflags = 663, 
       tf_esp = -541076440, tf_ss = 39}) at ../../i386/i386/trap.c:153
 #12 0x5b6b5 in ?? ()
 Cannot access memory at address 0xdfbfd444.
 (kgdb) up 1 
 #1  0xe0118ffd in tsleep (ident=0xe4960f80, priority=8, 
     wmesg=0xe013c0c7 "vn_lock", timo=0) at ../../kern/kern_synch.c:369
 (kgdb) up
 #2  0xe013c135 in vn_lock (vp=0xe4960f80, flags=65538, p=0xe4197e00)
     at ../../kern/vfs_vnops.c:531
 (kgdb) print vp->v_usecount
 $25 = 0
 [...]
 (kgdb) print/x *vp->v_un->vu_specinfo
 $30 = {si_hashchain = 0xe0221e5c, si_specnext = 0x0, si_flags = 0x0, 
   si_rdev = 0x500}
 (kgdb) 
 ----
 skarven:~$ ps axl -M /var/crash/vmcore.1  | grep 28620
 28620  9144 151711   0 -14  0   972    0 vn_loc DEs   p0-   0:00.00  (bash)
 28620  9159 151711 257  28  0   384    0 -      Z     p0-   0:00.00  (desclient-
 ----
 
 Here we have three errors:
 
 	1. A deadlock:
 		- vclean 
 		      1. sets VXLOCK
 		      2. Might block in VOP_LOCK due to 
 			 VOP_INACTIVE routine being in progress.
 			 During this time, other processes might
 			 call vrele, and reduce vp->v_usecount.
 		      3. Might block in VOP_CLOSE or VOP_INACTIVE.
 			 During this time, other processes might
 			 call vrele, and reduce vp->v_usecount.
 		      4. calls VOP_RECLAIM on an active vnode.
 		         Now VTOI(vp) returns a NULL pointer.
 		      5. calls vrele
 			 vrele calls vputrele
 			 If vp->v_usecount was reduced to 1 while
 			 blocking in step 2 or 3 above, it is now
 			 reduced to 0, which means that 
 			 vputrele calls vn_lock.
 			 vn_lock then waits for VXLOCK to be cleared.
 			 This never happens.
 
 	2. ufs_islocked does not check the VXLOCK flag, and 
 	   proceeds to dereference a NULL pointer when a deadlock
 	   as mentioned above has occurred.
 
 	3. ps axl does not give the correct parent process ID
 	   when working on a memory dump.
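
[Illustrative note] Error 2 can be sketched in user-space C. The
sketch shows why lockstatus() sees a small address such as 0x34 when
VTOI(vp) is NULL (the address of a member taken through a NULL base is
just the member's offset), and what a guarded check might look like.
The structure layouts, the 0x34 padding, the VXLOCK value and the
helper names bogus_lkp() and islocked_guarded() are all assumptions
made for illustration; they are not the kernel's definitions, and
returning "not locked" for a vnode under VXLOCK is only one possible
policy.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; NOT the real kernel structures. */
struct lock {
	int lk_flags;               /* non-zero means "held" in this model */
};

struct inode {
	char        i_pad[0x34];    /* assumed padding so i_lock sits at 0x34 */
	struct lock i_lock;
};

struct vnode {
	unsigned long v_flag;
	void         *v_data;       /* NULL once the vnode has been reclaimed */
};

#define VXLOCK   0x0100             /* illustrative flag value */
#define VTOI(vp) ((struct inode *)(vp)->v_data)

/*
 * The value an unguarded &VTOI(vp)->i_lock would produce when v_data is
 * NULL: simply the offset of i_lock inside struct inode.
 */
static size_t bogus_lkp(void)
{
	return offsetof(struct inode, i_lock);
}

/*
 * Guarded variant: refuse to look at the inode while the vnode is being
 * cleaned (VXLOCK set) or after its private data has been torn down.
 */
static int islocked_guarded(const struct vnode *vp)
{
	assert(vp != NULL);
	if ((vp->v_flag & VXLOCK) != 0 || vp->v_data == NULL)
		return 0;
	return VTOI(vp)->i_lock.lk_flags != 0;
}
```

With this assumed layout, bogus_lkp() yields 0x34, matching the
lkp=0x34 seen in frames #5 and #14 of the backtrace above.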
 
 Fix:
 
 	1. Avoid the deadlock. VOP_INACTIVE has been called.
 	   VOP_RECLAIM has been called. Thus, vrele should
 	   not be called. If vn_lock had not blocked, 
 	   VOP_INACTIVE (ufs_inactive) might have caused a 
 	   trap 12, since VTOI(vp) was NULL.
 
 	   What is needed is a subset of the operations performed
 	   in vrele, where
 		- all calls to locking primitives have been removed.
 		  They do not work after VOP_RECLAIM has
 		  been called.
 		- the calls to vn_lock and VOP_INACTIVE have been
 		  removed.
 
 	2. always keep the vnode in a state where VOP_ operations
 	   don't crash.
 
 	3. ... 
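
[Illustrative note] Fix 1 above asks for a subset of vrele with the
locking and the VOP_INACTIVE call removed. A minimal user-space model
of that idea might look like the following. The function name
vrele_reclaimed() and the v_on_freelist field are hypothetical, and
the model ignores the free-list and interlock handling a real kernel
would need; it is a sketch of the shape of the change, not the change
that was committed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model; field names are illustrative only. */
struct vnode {
	int v_usecount;     /* outstanding references */
	int v_on_freelist;  /* set once the last reference is dropped */
};

/*
 * Drop one reference on a vnode that has already been through
 * VOP_INACTIVE and VOP_RECLAIM. Unlike vrele, this takes no vnode lock
 * and never calls VOP_INACTIVE, so it cannot sleep waiting for VXLOCK
 * and cannot touch the already-freed per-filesystem data.
 */
static void vrele_reclaimed(struct vnode *vp)
{
	assert(vp != NULL && vp->v_usecount > 0);
	if (--vp->v_usecount > 0)
		return;             /* other references remain */
	vp->v_on_freelist = 1;      /* last reference: hand the vnode back */
}
```

Because the helper never sleeps, the step in which vputrele calls
vn_lock and waits forever on VXLOCK has no counterpart here.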
 
 - Tor Egge
State-Changed-From-To: open->feedback 
State-Changed-By: iedowse 
State-Changed-When: Mon Mar 26 13:26:40 PST 2001 
State-Changed-Why:  
Is this still a problem? I don't think I've heard of this recently, 
and a lot has changed since 3.0-CURRENT. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=3581 

From: Bob Bishop <rb@gid.co.uk>
To: freebsd-gnats-submit@FreeBSD.org, rb@gid.co.uk
Cc:  
Subject: Re: kern/3581: intermittent trap 12 in lockstatus()
Date: Sun, 13 May 2001 10:24:31 +0100

 I'd say this can safely be closed :-)
 
 
 --
 Bob Bishop              (0118) 977 4017  international code +44 118
 rb@gid.co.uk        fax (0118) 989 4254
 
 
State-Changed-From-To: feedback->closed 
State-Changed-By: dwmalone 
State-Changed-When: Sun May 13 06:29:59 PDT 2001 
State-Changed-Why:  
Safe to close, according to submitter. 

http://www.FreeBSD.org/cgi/query-pr.cgi?pr=3581 
>Unformatted:
