From nobody@FreeBSD.org  Wed Jun 16 11:37:09 2010
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 3EB53106566C
	for <freebsd-gnats-submit@FreeBSD.org>; Wed, 16 Jun 2010 11:37:09 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (www.freebsd.org [IPv6:2001:4f8:fff6::21])
	by mx1.freebsd.org (Postfix) with ESMTP id 142448FC12
	for <freebsd-gnats-submit@FreeBSD.org>; Wed, 16 Jun 2010 11:37:09 +0000 (UTC)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.14.3/8.14.3) with ESMTP id o5GBb8K1022233
	for <freebsd-gnats-submit@FreeBSD.org>; Wed, 16 Jun 2010 11:37:08 GMT
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.14.3/8.14.3/Submit) id o5GBb8Oq022232;
	Wed, 16 Jun 2010 11:37:08 GMT
	(envelope-from nobody)
Message-Id: <201006161137.o5GBb8Oq022232@www.freebsd.org>
Date: Wed, 16 Jun 2010 11:37:08 GMT
From: Michiel Leenaars <michiel.ml@nlnet.nl>
To: freebsd-gnats-submit@FreeBSD.org
Subject: Kernel panics on faulty zfs device
X-Send-Pr-Version: www-3.1
X-GNATS-Notify:

>Number:         147903
>Category:       kern
>Synopsis:       [zfs] [panic] Kernel panics on faulty zfs device
>Confidential:   no
>Severity:       serious
>Priority:       low
>Responsible:    freebsd-fs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Wed Jun 16 11:40:01 UTC 2010
>Closed-Date:    
>Last-Modified:  Mon Jun 21 03:30:35 UTC 2010
>Originator:     Michiel Leenaars
>Release:        8.0-RELEASE-p3
>Organization:
-
>Environment:
FreeBSD hostname 8.0-RELEASE-p3 FreeBSD 8.0-RELEASE-p3 #0: Tue May 25 20:54:11 UTC 2010     root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC  amd64
>Description:
Hi there,

one of the four drives in the zpool on my desktop machine apparently has some corruption. Whenever I boot with that (external) disk attached, the machine panics and reboots. It took some time to find out that with that particular drive unplugged I am able to boot the system, although the zpool is then not online. When I plug the (USB) drive back in, it keeps connecting and disconnecting, repeating these messages:

umass0:0:0:-1: Attached to scbus0
ugen4.2: <LaCie> at usbus4 (disconnected)
umass0: at uhub4, port 5, addr 2 (disconnected)
ugen4.2: <LaCie> at usbus4
umass0: <Bulk Only Interface> on usbus4
umass0:  SCSI over Bulk-Only; quirks = 0x0000

When I run zpool status, it reports, as expected, that it cannot open that disk:

  pool: tank
 state: UNAVAIL
status: One or more devices could not be opened.  There are insufficient
        replicas for the pool to continue functioning.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-3C
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        tank        UNAVAIL      0     0     0  insufficient replicas
          ad6       ONLINE       0     0     0
          ad10      ONLINE       0     0     0
          ad12      ONLINE       0     0     0
          da0       UNAVAIL      0     0     0  cannot open

The first time I cleared the errors with the disk plugged in,

$ zpool clear tank da0

ZFS tried to bring the device online and the system rebooted. In the meantime some further damage seems to have occurred, because somehow I do not have sufficient rights to clear the zpool errors, even though I am logged in as root:

cannot clear errors for tank: permission denied

So there seem to be two issues. The first is a ZFS error-handling problem that somehow manages to crash the system. The second is that the superuser is locked out of reattaching a device (hopefully for resilvering).

>How-To-Repeat:
Well, my system does it every time I reboot with this hard drive attached, but I'm not sure how others could reproduce it.
>Fix:


>Release-Note:
>Audit-Trail:

From: Michiel Leenaars <michiel.ml@nlnet.nl>
To: bug-followup@freebsd.org, michiel.ml@nlnet.nl
Cc:  
Subject: Re: misc/147903: Kernel panics on faulty zfs device
Date: Sat, 19 Jun 2010 22:53:23 +0200

 I have removed the failing disk from the LaCie USB casing and connected it to
 another USB device. The USB error has now disappeared, but when I boot without
 the disk and then attach it, I still get this kernel trap and a reboot
 when the ZFS pool is automatically started after the missing disk is found:
 
 kernel trap 12 with interrupts disabled
 
 Fatal trap 12: page fault while in kernel mode
 cpuid = 0; apic id = 00
 fault virtual address   = 0x11
 fault code              = supervisor write data, page not present
 instruction pointer     = 0x20:0xffffffff80586940
 stack pointer           = 0x28:0xffffff8000027980
 frame pointer           = 0x28:0xffffff8000027990
 code segment            = base 0x0, limit 0xfffff, type 0x1b
                         = DPL 0, pres 1, long 1, def32 0, gran 1
 processor eflags        = resume, IOPL = 0
 current process         = 11 (idle: cpu0)
 trap number             = 12
 panic: page fault
 cpuid = 0
 Uptime: 10m12s
 Cannot dump. Device not defined or unavailable
 Automatic reboot in 15 seconds - press a key on the console to abort
 Rebooting...
 cpu_reset: Stopping other CPU's
 
 Would upgrading to another FreeBSD release help?
 
 Best,
 Michiel Leenaars
Responsible-Changed-From-To: freebsd-bugs->freebsd-fs 
Responsible-Changed-By: linimon 
Responsible-Changed-When: Mon Jun 21 03:30:16 UTC 2010 
Responsible-Changed-Why:  
Over to maintainer(s). 

http://www.freebsd.org/cgi/query-pr.cgi?pr=147903 
>Unformatted:
