From nobody@FreeBSD.org  Mon Nov  9 21:00:51 2009
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id CDCAC106566B
	for <freebsd-gnats-submit@FreeBSD.org>; Mon,  9 Nov 2009 21:00:51 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (www.freebsd.org [IPv6:2001:4f8:fff6::21])
	by mx1.freebsd.org (Postfix) with ESMTP id BC8C58FC0A
	for <freebsd-gnats-submit@FreeBSD.org>; Mon,  9 Nov 2009 21:00:51 +0000 (UTC)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.14.3/8.14.3) with ESMTP id nA9L0pJt056408
	for <freebsd-gnats-submit@FreeBSD.org>; Mon, 9 Nov 2009 21:00:51 GMT
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.14.3/8.14.3/Submit) id nA9L0p3m056407;
	Mon, 9 Nov 2009 21:00:51 GMT
	(envelope-from nobody)
Message-Id: <200911092100.nA9L0p3m056407@www.freebsd.org>
Date: Mon, 9 Nov 2009 21:00:51 GMT
From: Gerrit Kühn <gerrit@pmp.uni-hannover.de>
To: freebsd-gnats-submit@FreeBSD.org
Subject: ZFS panics kernel while replaying ZIL after crash
X-Send-Pr-Version: www-3.1
X-GNATS-Notify:

>Number:         140433
>Category:       kern
>Synopsis:       [zfs] [panic] panic while replaying ZIL after crash
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    mm
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Mon Nov 09 21:10:00 UTC 2009
>Closed-Date:    Mon Sep 06 12:01:28 UTC 2010
>Last-Modified:  Mon Sep 06 12:01:28 UTC 2010
>Originator:     Gerrit Kühn
>Release:        FreeBSD 8.0-RC2
>Organization:
AEI
>Environment:
FreeBSD 8.0-RC2 on AMD64
>Description:
After a crash (triggered by starting powerd, if that matters), the zpool comes back fine, but the kernel panics when trying to mount one particular file system inside the pool. All other file systems are fine, and the properties of the broken file system can still be accessed. A picture of the crash and a ddb trace can be found here: <http://www.pmp.uni-hannover.de/test/Mitarbeiter/g_kuehn/data/zfs-panic2.jpg>
It looks like there is a problem replaying the ZIL.

Some more info about the hardware and setup:
These are 4x2.5" 400GB drives (WD4000BEVT) in a RAID-Z1 setup on a
Supermicro AOC-USAS-L8i controller (LSI chip, mpt driver) in a VIA VB8001
board (powered by a VIA Nano 1.6GHz) with 4GB of memory. The system runs off a UFS-formatted CF card and uses ZFS for data, /var and /tmp.
>How-To-Repeat:
I had a similar issue once before, when the system crashed for a different reason, so the problem is probably easy to trigger here. I have not yet recreated the pool and tried to trigger it again, so that I can still give feedback on the problem at hand.
>Fix:
Unknown to me. However, imho ZFS should not panic the kernel, even if the ZIL is corrupted. If these cases cannot be avoided 100%, something like a --discard-zil switch would be very helpful from my (user's) point of view.

>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-bugs->freebsd-fs 
Responsible-Changed-By: gavin 
Responsible-Changed-When: Mon Nov 9 21:30:36 UTC 2009 
Responsible-Changed-Why:  
Over to maintainer(s).  To submitter: can you also please give the output of 
"zdb -C"? 

http://www.freebsd.org/cgi/query-pr.cgi?pr=140433 

From: Gerrit Kühn <gerrit@weinberg2.de>
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/140433: [zfs] [panic] panic while replaying ZIL after crash
Date: Tue, 10 Nov 2009 19:40:42 +0100

 Output of "zdb -C" as requested:
 
 
 tank
     version=13
     name='tank'
     state=0
     txg=32618
     pool_guid=17523106262699816181
     hostname=''
     vdev_tree
         type='root'
         id=0
         guid=17523106262699816181
         children[0]
                 type='raidz'
                 id=0
                 guid=2668789775933362751
                 nparity=1
                 metaslab_array=14
                 metaslab_shift=33
                 ashift=9
                 asize=1600334594048
                 is_log=0
                 children[0]
                         type='disk'
                         id=0
                         guid=4872680480919708890
                         path='/dev/label/disk0'
                         whole_disk=0
                         DTL=63
                 children[1]
                         type='disk'
                         id=1
                         guid=14727435584907659484
                         path='/dev/label/disk1'
                         whole_disk=0
                         DTL=60
                 children[2]
                         type='disk'
                         id=2
                         guid=1501397252321623055
                         path='/dev/label/disk2'
                         whole_disk=0
                         DTL=62
                 children[3]
                         type='disk'
                         id=3
                         guid=15105917771654568537
                         path='/dev/label/disk3'
                         whole_disk=0
                         DTL=61

From: Gerrit Kühn <gerrit@pmp.uni-hannover.de>
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/140433: [zfs] [panic] panic while replaying ZIL after crash
Date: Wed, 10 Feb 2010 14:01:22 +0100

 For the record: I have now fixed my pool by booting OpenSolaris dev 131 and
 simply importing and exporting the pool. It now works fine again under
 FreeBSD. Since v128, OpenSolaris also has a -F(ix) option for importing
 corrupt pools. This would be a very useful feature in FreeBSD, too.
 
 
 cu
   Gerrit
Responsible-Changed-From-To: freebsd-fs->mm 
Responsible-Changed-By: mm 
Responsible-Changed-When: Sun May 16 13:00:28 UTC 2010 
Responsible-Changed-Why:  
I'll take it. 

From: Martin Matuska <mm@FreeBSD.org>
To: bug-followup@FreeBSD.org, gerrit@pmp.uni-hannover.de
Cc:  
Subject: Re: kern/140433: [zfs] [panic] panic while replaying ZIL after crash
Date: Sun, 16 May 2010 15:01:42 +0200

 There are two new patches in 8-STABLE that fix ZIL replay crashes.
 Could you try the latest 8-STABLE?
 
 Or can this PR be closed?
State-Changed-From-To: open->closed 
State-Changed-By: mm 
State-Changed-When: Mon Sep 6 12:01:26 UTC 2010 
State-Changed-Why:  
Closing on feedback timeout. 
>Unformatted:
