From mjacob@feral.com Sat Oct 23 12:55:57 1999
Return-Path: <mjacob@feral.com>
Received: from feral.com (feral.com [192.67.166.1])
	by hub.freebsd.org (Postfix) with ESMTP id 4961614C25
	for <FreeBSD-gnats-submit@freebsd.org>; Sat, 23 Oct 1999 12:55:57 -0700 (PDT)
	(envelope-from mjacob@feral.com)
Received: from quarm.feral.com (quarm [192.67.166.252])
	by feral.com (8.8.7/8.8.7) with ESMTP id MAA22966
	for <FreeBSD-gnats-submit@freebsd.org>; Sat, 23 Oct 1999 12:55:55 -0700
Received: (from mjacob@localhost)
	by quarm.feral.com (8.9.3/8.9.3) id MAA00406;
	Sat, 23 Oct 1999 12:55:56 -0700 (PDT)
	(envelope-from mjacob@mailhost.feral.com)
Message-Id: <199910231955.MAA00406@quarm.feral.com>
Date: Sat, 23 Oct 1999 12:55:56 -0700 (PDT)
From: Matthew Jacob <mjacob@feral.com>
Reply-To: mjacob@feral.com
To: FreeBSD-gnats-submit@freebsd.org
Subject: repeated arrival/departure of disks leads to panic in dscheck
X-Send-Pr-Version: 3.2

>Number:         14486
>Category:       kern
>Synopsis:       repeated arrival/departure of disks leads to panic in dscheck
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    freebsd-bugs
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sat Oct 23 13:00:00 PDT 1999
>Closed-Date:    Sat Nov 6 02:26:05 PST 1999
>Last-Modified:  Tue Jul 11 00:11:51 MDT 2000
>Originator:     Matthew Jacob
>Release:        FreeBSD 4.0-19991022-CURRENT i386
>Organization:
Feral Software
>Environment:

2xPPro x86.


>Description:

Repeated arrivals/departures of disks, sometimes with the labels changed,
yield a panic:

Stopped at      dscheck+0x53:   movl    0x10(%edx),%esi
db> t
dscheck(c3327a38,0) at dscheck+0x53
diskstrategy(c3327a38,c0ae4780,200,c0a4db80,0) at diskstrategy+0xad
readdisklabel(c0ae4780,c0a51600,c0a4db80,c0b1e8e0,c0a4db80) at readdisklabel+0x53
dsopen(c0a4db80,2000,0,c0b1e8ec,c0b1e8f0) at dsopen+0x248
diskopen(c0a4db80,1,2000,c7c77800,0) at diskopen+0xdb
spec_open(c86afe08,c86afde0,c020ecd9,c86afe08,c86afe7c) at spec_open+0x154
spec_vnoperate(c86afe08,c86afe7c,c0190ace,c86afe08,0) at spec_vnoperate+0x15
ufs_vnoperatespec(c86afe08,0,c86aff80,fffffffc,100) at ufs_vnoperatespec+0x15
vn_open(c86afed8,1,dfd,c7c77800,3) at vn_open+0x37e
open(c7c77800,c86aff80,1,1b0a,bfbfdad1) at open+0xbb
syscall(2f,2f,2f,bfbfdad1,1b0a) at syscall+0x1b1
Xsyscall() at Xsyscall+0x3a

with Supervisor page not present... This should be enough information
for the responsible designers to do the appropriate hardening.


>How-To-Repeat:
>Fix:
>Release-Note:
>Audit-Trail:

From: Bruce Evans <bde@zeta.org.au>
To: Matthew Jacob <mjacob@feral.com>
Cc: FreeBSD-gnats-submit@FreeBSD.ORG
Subject: Re: kern/14486: repeated arrival/departure of disks leads to panic
 in dscheck
Date: Fri, 29 Oct 1999 22:01:28 +1000 (EST)

 > >Description:
 > 
 > Repeated arrivals/departures of disks, sometimes with the labels changed,
 > yield a panic:
 > 
 > Stopped at      dscheck+0x53:   movl    0x10(%edx),%esi
 > db> t
 > dscheck(c3327a38,0) at dscheck+0x53
 > diskstrategy(c3327a38,c0ae4780,200,c0a4db80,0) at diskstrategy+0xad
 > readdisklabel(c0ae4780,c0a51600,c0a4db80,c0b1e8e0,c0a4db80) at readdisklabel+0x53
 > dsopen(c0a4db80,2000,0,c0b1e8ec,c0b1e8f0) at dsopen+0x248
 > diskopen(c0a4db80,1,2000,c7c77800,0) at diskopen+0xdb
 
 Rev.1.39 of scsi_da.c (actually, all revs. of subr_disk.c) seems to be
 quite buggy.  diskopen(), at least, doesn't seem to do sufficient locking.
 The old daopen() holds a lock for essentially the whole open, but
 diskopen() allows concurrent opens (and closes!).  Bad things probably
 happen if the label is changed.  Even null changes may cause problems if
 they are not atomic.
 
 Bruce
 
 
State-Changed-From-To: open->closed 
State-Changed-By: phk 
State-Changed-When: Sat Nov 6 02:26:05 PST 1999 
State-Changed-Why:  
fixed. 
>Unformatted:

I don't think this was fixed when phk says it was fixed.  I think it
wasn't fixed until version 1.20 of subr_disk.c, when I corrected the
behavior when devices left and came back.  I got exactly this same panic
before I patched the kernel.  I'd get it on a camcontrol rescan when I
added/removed disks, as well as when I inserted/removed ata drives from
my PC.  I'm pretty sure that I've fixed the problem.  See the commit log
for details.

Bruce's concerns about races are likely valid, but they are probably not
what caused this problem.
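
To illustrate the locking concern Bruce raises (this is not the fix that
went into subr_disk.c), here is a minimal user-space sketch in Python.
It assumes a hypothetical Disk class whose label disappears on
departure; the names Disk, open, depart, arrive, and stress are invented
for illustration and do not correspond to kernel functions.  Holding one
lock across the whole open means a departure cannot remove the label
between the presence check and its use, which is the kind of window
Bruce describes.

```python
import threading


class Disk:
    """Hypothetical stand-in for a disk whose label can vanish on departure."""

    def __init__(self, label):
        self._lock = threading.Lock()   # plays the role of a per-device open lock
        self._label = label             # None once the disk has departed
        self.open_count = 0

    def open(self):
        # Hold the lock across the whole open, so the label cannot be
        # torn down between the presence check and its use.
        with self._lock:
            if self._label is None:
                return False            # device gone: fail the open cleanly
            _ = self._label["sectors"]  # use of the label, safe under the lock
            self.open_count += 1
            return True

    def depart(self):
        # Departure takes the same lock, so it never interleaves with an open.
        with self._lock:
            self._label = None

    def arrive(self, label):
        with self._lock:
            self._label = label


def stress(rounds=2000):
    """Race opens against repeated arrival/departure of the same disk."""
    disk = Disk({"sectors": 1024})

    def churn():
        for i in range(rounds):
            disk.depart()
            disk.arrive({"sectors": 1024 + i})

    t = threading.Thread(target=churn)
    t.start()
    ok = sum(1 for _ in range(rounds) if disk.open())
    t.join()
    return ok, disk.open_count
```

Calling stress() returns the number of opens that succeeded and the
disk's open count; the two always match, and an open attempted while the
disk is absent fails instead of touching a missing label.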
