From nobody@FreeBSD.org  Thu Nov 26 07:18:02 2009
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id C719A106568B
	for <freebsd-gnats-submit@FreeBSD.org>; Thu, 26 Nov 2009 07:18:02 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (www.freebsd.org [IPv6:2001:4f8:fff6::21])
	by mx1.freebsd.org (Postfix) with ESMTP id B60198FC1B
	for <freebsd-gnats-submit@FreeBSD.org>; Thu, 26 Nov 2009 07:18:02 +0000 (UTC)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.14.3/8.14.3) with ESMTP id nAQ7I1cw002421
	for <freebsd-gnats-submit@FreeBSD.org>; Thu, 26 Nov 2009 07:18:01 GMT
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.14.3/8.14.3/Submit) id nAQ7I1nd002420;
	Thu, 26 Nov 2009 07:18:01 GMT
	(envelope-from nobody)
Message-Id: <200911260718.nAQ7I1nd002420@www.freebsd.org>
Date: Thu, 26 Nov 2009 07:18:01 GMT
From: Alexei Volkov <Alexei.Volkov@softlynx.ru>
To: freebsd-gnats-submit@FreeBSD.org
Subject: boot fail from zfs root while the pool resilvering
X-Send-Pr-Version: www-3.1
X-GNATS-Notify:

>Number:         140888
>Category:       kern
>Synopsis:       [zfs] boot fail from zfs root while the pool resilvering
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-fs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Thu Nov 26 07:20:01 UTC 2009
>Closed-Date:    
>Last-Modified:  Sat Nov 28 22:49:16 UTC 2009
>Originator:     Alexei Volkov
>Release:        8.0-RELEASE
>Organization:
SoftLynx
>Environment:
FreeBSD livecd8 8.0-RELEASE FreeBSD 8.0-RELEASE #0: Tue Nov 24 08:29:59 UTC 2009 root@80AMD64:/usr/obj/usr/src/sys/GENERIC  amd64
>Description:
On a system that boots directly from a ZFS mirror or raidz pool (http://wiki.freebsd.org/RootOnZFS), replacing one of the components and then suffering an accidental power failure or reboot while the resilver is running stops the boot process with the following messages:

ZFS: can only boot from disk, mirror or raidz vdevs
ZFS: inconsistent nvlist contents
ZFS: i/o error - all block copies unavailable
ZFS: can't read MOS
ZFS: unexpected object set type lld
ZFS: unexpected object set type lld

FreeBSD/i386 boot
Default: tank0:/boot/kernel/kernel
boot:
ZFS: unexpected object set type lld

FreeBSD/i386 boot
Default: tank0:/boot/kernel/kernel
boot:


>How-To-Repeat:
Install the system as described at http://wiki.freebsd.org/RootOnZFS, using a non-single-device layout, i.e. mirror or raidz.

You end up with something like:

[root@fresh-inst:~]# zpool status
  pool: tank0
 state: ONLINE
 scrub: none requested
config:

        NAME              STATE     READ WRITE CKSUM
        tank0             ONLINE       0     0     0
          raidz1          ONLINE       0     0     0
            gpt/QM00002   ONLINE       0     0     0
            gpt/SN091234  ONLINE       0     0     0

errors: No known data errors

Let's assume one of the components shows pre-fail symptoms and has to be replaced. Power off the system, replace one of the HDDs with another one, and boot back into the OS. The system still boots fine at this point.

Run zpool status to see the missing component.

[root@fresh-inst:~]# zpool status
  pool: tank0
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-2Q
 scrub: none requested
config:

        NAME              STATE     READ WRITE CKSUM
        tank0             DEGRADED     0     0     0
          raidz1          DEGRADED     0     0     0
            gpt/QM00002   UNAVAIL      0   327     0  cannot open
            gpt/SN091234  ONLINE       0     0     0

errors: No known data errors

Partition the new disk as required and get the new GPT component ready for the zpool replacement.

[root@fresh-inst:~]# gpart show -l
=>     34  8388541  ad0  GPT  (4.0G)
       34      128    1  (null)  (64K)
      162  8388413    2  SN091234  (4.0G)

=>     34  8388541  ad1  GPT  (4.0G)
       34      128    1  (null)  (64K)
      162  8388413    2  SN023432  (4.0G)
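For completeness, the partitioning step might look roughly like the following. This is only a sketch: the device name ad1, the label SN023432, and the boot-partition size are inferred from the gpart output above and will differ on other hardware.

```shell
# Sketch of preparing the replacement disk (device and label names are
# assumptions taken from the gpart output above; adjust to your system).
gpart create -s gpt ad1
gpart add -s 128 -t freebsd-boot ad1
gpart add -t freebsd-zfs -l SN023432 ad1
# Install the ZFS-aware boot code so the new disk is bootable as well.
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ad1
```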

Run the replacement command.

[root@fresh-inst:~]# zpool replace tank0 gpt/QM00002 gpt/SN023432

[root@fresh-inst:~]# zpool status
  pool: tank0
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress for 0h0m, 7.88% done, 0h4m to go
config:

        NAME                STATE     READ WRITE CKSUM
        tank0               DEGRADED     0     0     0
          raidz1            DEGRADED     0     0     0
            replacing       DEGRADED     0     0     0
              gpt/QM00002   UNAVAIL      0 2.17K     0  cannot open
              gpt/SN023432  ONLINE       0     0     0  39.5M resilvered
            gpt/SN091234    ONLINE       0     0     0  372K resilvered

errors: No known data errors

Initiate a regular reboot (an instant power failure would reproduce it just as well).

[root@fresh-inst:~]# reboot

The system fails to boot with the following messages:

Booting from Hard Disk...
ZFS: can only boot from disk, mirror or raidz vdevs
ZFS: inconsistent nvlist contents
ZFS: i/o error - all block copies unavailable
ZFS: can't read MOS
ZFS: unexpected object set type lld
ZFS: unexpected object set type lld

FreeBSD/i386 boot
Default: tank0:/boot/kernel/kernel
boot:
ZFS: unexpected object set type lld

FreeBSD/i386 boot
Default: tank0:/boot/kernel/kernel
boot:



>Fix:
As a workaround, boot from CD/DVD in Fixit mode, import the pool, and wait for the resilver to complete.
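From the Fixit shell this amounts to roughly the following sketch (the pool name tank0 is taken from the report above):

```shell
# Boot the install CD/DVD, choose Fixit, then:
zpool import -f tank0    # import the pool from the live environment
zpool status tank0       # shows "resilver in progress"
# Wait until zpool status reports the resilver has completed, then:
reboot
```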

>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-bugs->freebsd-fs 
Responsible-Changed-By: linimon 
Responsible-Changed-When: Thu Nov 26 08:08:58 UTC 2009 
Responsible-Changed-Why:  
Over to maintainer(s). 

http://www.freebsd.org/cgi/query-pr.cgi?pr=140888 

From: kot@softlynx.ru
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/140888: [zfs] boot fail from zfs root while the pool 
     resilvering
Date: Thu, 26 Nov 2009 14:02:12 +0300 (MSK)

 I found that booting keeps failing if at least one device is not ONLINE
 and the pool state is DEGRADED.
 
 For instance
 
 [root@livecd8:/]# zpool status
   pool: tank0
  state: DEGRADED
  scrub: none requested
 config:
 
         NAME                        STATE     READ WRITE CKSUM
         tank0                       DEGRADED     0     0     0
           raidz1                    DEGRADED     0     0     0
             replacing               DEGRADED     0     0     0
               12996219703647995136  UNAVAIL      0   298     0  was
 /dev/gpt/QM00002
               gpt/SN023432          ONLINE       0     0     0
             gpt/SN091234            ONLINE       0     0     0
 
 errors: No known data errors
 
 The pool is considered degraded even though gpt/QM00002 has been replaced with the new gpt/SN023432.
 
 Detaching the UNAVAIL component turns the pool back to the ONLINE state.
 
  [root@livecd8:/]# zpool detach tank0 12996219703647995136
  [root@livecd8:/]# zpool status
    pool: tank0
   state: ONLINE
   scrub: none requested
  config:
 
          NAME              STATE     READ WRITE CKSUM
          tank0             ONLINE       0     0     0
            raidz1          ONLINE       0     0     0
              gpt/SN023432  ONLINE       0     0     0
              gpt/SN091234  ONLINE       0     0     0
 
  errors: No known data errors
 
 In this state the system boots from tank0 again.
 
 It also keeps booting fine when a component is manually turned to the
 OFFLINE state, in any combination, for instance:
 
 [root@fresh-inst:~]# zpool status
   pool: tank0
  state: DEGRADED
 status: One or more devices has experienced an unrecoverable error.  An
         attempt was made to correct the error.  Applications are unaffected.
 action: Determine if the device needs to be replaced, and clear the errors
         using 'zpool clear' or replace the device with 'zpool replace'.
    see: http://www.sun.com/msg/ZFS-8000-9P
  scrub: none requested
 config:
 
         NAME              STATE     READ WRITE CKSUM
         tank0             DEGRADED     0     0     0
           raidz1          DEGRADED     0     0     0
             gpt/SN023432  ONLINE       0     0     0
             gpt/SN091234  OFFLINE      0   921     0
 
 errors: No known data errors
 
 
>Unformatted:
