From nobody@FreeBSD.org  Tue Oct 25 16:53:33 2011
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 765691065680
	for <freebsd-gnats-submit@FreeBSD.org>; Tue, 25 Oct 2011 16:53:33 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from red.freebsd.org (red.freebsd.org [IPv6:2001:4f8:fff6::22])
	by mx1.freebsd.org (Postfix) with ESMTP id 65E298FC0A
	for <freebsd-gnats-submit@FreeBSD.org>; Tue, 25 Oct 2011 16:53:33 +0000 (UTC)
Received: from red.freebsd.org (localhost [127.0.0.1])
	by red.freebsd.org (8.14.4/8.14.4) with ESMTP id p9PGrXNi073222
	for <freebsd-gnats-submit@FreeBSD.org>; Tue, 25 Oct 2011 16:53:33 GMT
	(envelope-from nobody@red.freebsd.org)
Received: (from nobody@localhost)
	by red.freebsd.org (8.14.4/8.14.4/Submit) id p9PGrW67073205;
	Tue, 25 Oct 2011 16:53:33 GMT
	(envelope-from nobody)
Message-Id: <201110251653.p9PGrW67073205@red.freebsd.org>
Date: Tue, 25 Oct 2011 16:53:33 GMT
From: Robert Millan <rmh@debian.org>
To: freebsd-gnats-submit@FreeBSD.org
Subject: [zfs] Latest 9-STABLE and 10-CURRENT fail to boot from ZFS v15 root
X-Send-Pr-Version: www-3.1
X-GNATS-Notify:

>Number:         162008
>Category:       kern
>Synopsis:       [zfs] Latest 9-STABLE and 10-CURRENT fail to boot from ZFS v15 root [regression]
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    pjd
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Tue Oct 25 17:00:19 UTC 2011
>Closed-Date:    Thu Nov 24 07:40:39 UTC 2011
>Last-Modified:  Thu Jan  5 10:00:33 UTC 2012
>Originator:     Robert Millan
>Release:        Debian GNU/kFreeBSD "sid"
>Organization:
>Environment:
see description
>Description:
With recent builds of both 9-STABLE and 10-CURRENT, the kernel is no longer
able to boot from my ZFS pool as root file system.

The on-disk pool is ZFS version 15 and was created with an 8.2 kernel.

I've bisected the problem in stable/9/sys/ and found that it first
appears with r226405 (the commit that disables debug options in GENERIC),
which is obviously just exposing the bug and not causing it.

Ironically, in head/sys/ the same problem is present but disappears
when the debug options are removed.

If I attempt to replicate the disk (by creating a new v15 pool and zfs
send/receive'ing the data), the destination ZFS pool is bootable unlike
the source one. This makes me suspect the problem has something to do
with /boot/zfs/zpool.cache.

I'm currently dd'ing the raw partition to another disk to check if the
pool can be imported/exported manually, and if "zpool upgrade" has any
effect on the problem (I don't want to risk losing the testcase). Please
let me know if there's anything else I can try.

>How-To-Repeat:

>Fix:


>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-bugs->freebsd-fs 
Responsible-Changed-By: linimon 
Responsible-Changed-When: Wed Oct 26 04:20:39 UTC 2011 
Responsible-Changed-Why:  
Over to maintainer(s). 

http://www.freebsd.org/cgi/query-pr.cgi?pr=162008 

From: Andriy Gapon <avg@FreeBSD.org>
To: bug-followup@FreeBSD.org, rmh@debian.org
Cc:  
Subject: Re: kern/162008: [zfs] Latest 9-STABLE and 10-CURRENT fail to boot
 from ZFS v15 root [regression]
Date: Wed, 26 Oct 2011 09:35:08 +0300

 Please let us know how _exactly_ your "kernel is no longer
 able to boot from my ZFS pool as root file system".
 That is, what boot stage fails and what output you see - (gpt)zfsboot,
 zfsloader, kernel, root fs mounting, something else...
 
 -- 
 Andriy Gapon

From: Robert Millan <rmh@debian.org>
To: Andriy Gapon <avg@freebsd.org>
Cc: bug-followup@freebsd.org
Subject: Re: kern/162008: [zfs] Latest 9-STABLE and 10-CURRENT fail to boot
 from ZFS v15 root [regression]
Date: Wed, 26 Oct 2011 19:34:15 +0200

 2011/10/26 Andriy Gapon <avg@freebsd.org>:
 >
 > Please let us know how _exactly_ your "kernel is no longer
 > able to boot from my ZFS pool as root file system".
 > That is, what boot stage fails and what output you see - (gpt)zfsboot,
 > zfsloader, kernel, root fs mounting, something else...
 
 I'm sorry, I thought there was no meaningful error, but on a closer look I noticed:
 
   Mounting from zfs:eeepc/root failed with error 6.
 
 Assuming this means ENXIO, could it be a race condition?
 
 -- 
 Robert Millan
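
The guess that error 6 means ENXIO can be checked against <errno.h>. The
following is a small sketch, not part of the original report: the helper
names errno_name() and describe_errno() are made up for illustration, and
only a few codes are listed.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: map a numeric error from a kernel message to its
 * symbolic errno name. Only a handful of codes are covered here.
 */
static const char *
errno_name(int code)
{
	switch (code) {
	case ENOENT:
		return ("ENOENT");
	case EIO:
		return ("EIO");
	case ENXIO:
		return ("ENXIO");
	default:
		return ("unknown");
	}
}

/* Format the number, the symbolic name and the platform's description. */
static void
describe_errno(int code, char *buf, size_t len)
{
	snprintf(buf, len, "error %d = %s (%s)", code, errno_name(code),
	    strerror(code));
}
```

Calling describe_errno(6, buf, sizeof(buf)) fills buf with the symbolic
name and whatever description the local C library gives for that code.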

From: Andriy Gapon <avg@FreeBSD.org>
To: Robert Millan <rmh@debian.org>
Cc: bug-followup@FreeBSD.org
Subject: Re: kern/162008: [zfs] Latest 9-STABLE and 10-CURRENT fail to boot
 from ZFS v15 root [regression]
Date: Sun, 30 Oct 2011 13:06:21 +0200

 on 26/10/2011 20:34 Robert Millan said the following:
 > 2011/10/26 Andriy Gapon <avg@freebsd.org>:
 >>
 >> Please let us know how _exactly_ your "kernel is no longer
 >> able to boot from my ZFS pool as root file system".
 >> That is, what boot stage fails and what output you see - (gpt)zfsboot,
 >> zfsloader, kernel, root fs mounting, something else...
 > 
 > I'm sorry, I thought there was no meaningful error, but on a closer look I noticed:
 > 
 >   Mounting from zfs:eeepc/root failed with error 6.
 > 
 > Assuming this means ENXIO, could it be a race condition?
 > 
 
 IMO, not likely.
 Please try setting vfs.zfs.debug=1 via loader.conf.
 Maybe additional debug information will make the situation clearer.
 
 -- 
 Andriy Gapon
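
A minimal sketch of the setting suggested above, assuming a stock FreeBSD
layout where the loader reads /boot/loader.conf (other boot loaders pass
kernel environment variables in their own way):

```
# /boot/loader.conf
vfs.zfs.debug="1"
```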

From: Robert Millan <rmh@debian.org>
To: Andriy Gapon <avg@freebsd.org>
Cc: bug-followup@freebsd.org
Subject: Re: kern/162008: [zfs] Latest 9-STABLE and 10-CURRENT fail to boot
 from ZFS v15 root [regression]
Date: Sat, 5 Nov 2011 00:11:34 +0100

 2011/10/30 Andriy Gapon <avg@freebsd.org>:
 > IMO, not likely.
 > Please try setting vfs.zfs.debug=1 via loader.conf.
 > Maybe additional debug information will make the situation clearer.
 
 Strangely, the system boots now, but the kernel panics as soon as "zfs
 volinit" is attempted:
 
 vdev_geom_open_by_guid:352[1]: Searching by guid [13849114725133984793].
 panic: _sx_xlock_hard: recursed on non-recursive sx spa_namespace_lock
 @ /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c:877
 
 It also drops me to a debug prompt. Backtrace:
 
 kdb_enter
 panic
 _sx_xlock_hard
 _sx_xlock
 zvol_geom_access
 g_access
 vdev_geom_open
 vdev_open
 vdev_open_children
 vdev_root_open
 vdev_open
 spa_load
 spa_load_best
 spa_open_common
 spa_get_stats
 zfs_ioc_pool_stats
 zfsdev_ioctl
 devfs_ioctl_f
 kern_ioctl
 
 (this happened with 9-STABLE, SVN r226626)
 
 -- 
 Robert Millan
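
The panic message above reports that a thread tried to take
spa_namespace_lock while already owning it. As a rough userspace analogue
(not the kernel's sx lock code), a POSIX error-checking mutex also refuses
a second acquisition by its owner, but returns EDEADLK instead of
panicking. The function name relock_result() is made up for this sketch.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>

/*
 * Take an error-checking mutex twice from the same thread and return the
 * result of the second attempt.
 */
static int
relock_result(void)
{
	pthread_mutexattr_t attr;
	pthread_mutex_t lock;
	int second;

	pthread_mutexattr_init(&attr);
	pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	pthread_mutex_init(&lock, &attr);
	pthread_mutexattr_destroy(&attr);

	pthread_mutex_lock(&lock);		/* first acquisition succeeds */
	second = pthread_mutex_lock(&lock);	/* owner locks again: rejected */

	pthread_mutex_unlock(&lock);
	pthread_mutex_destroy(&lock);
	return (second);
}
```

On a POSIX-conforming system relock_result() is expected to return
EDEADLK, which is the userspace counterpart of the recursion the kernel
detected in the backtrace.
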
State-Changed-From-To: open->patched 
State-Changed-By: pjd 
State-Changed-When: Sat Nov  5 16:29:46 UTC 2011
State-Changed-Why:  
Fix committed to HEAD. Thanks for the report! 


Responsible-Changed-From-To: freebsd-fs->pjd 
Responsible-Changed-By: pjd 
Responsible-Changed-When: Sat Nov  5 16:29:46 UTC 2011
Responsible-Changed-Why:  
I'll take this one. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=162008 

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/162008: commit references a PR
Date: Sat,  5 Nov 2011 16:29:14 +0000 (UTC)

 Author: pjd
 Date: Sat Nov  5 16:29:03 2011
 New Revision: 227110
 URL: http://svn.freebsd.org/changeset/base/227110
 
 Log:
   In zvol_open() if the spa_namespace_lock is already held, it means that
   ZFS is trying to open and taste ZVOL as its VDEV. This is not supported,
   so return an error instead of panicking on spa_namespace_lock recursion.
   
   Reported by:	Robert Millan <rmh@debian.org>
   PR:		kern/162008
   MFC after:	3 days
 
 Modified:
   head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c
 
 Modified: head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c
 ==============================================================================
 --- head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c	Sat Nov  5 16:04:57 2011	(r227109)
 +++ head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c	Sat Nov  5 16:29:03 2011	(r227110)
 @@ -875,6 +875,14 @@ zvol_open(struct g_provider *pp, int fla
  	zvol_state_t *zv;
  	int err = 0;
  
 +	if (MUTEX_HELD(&spa_namespace_lock)) {
 +		/*
 +		 * If the spa_namespace_lock is being held, it means that ZFS
 +		 * is trying to open ZVOL as its VDEV. This i not supported.
 +		 */
 +		return (EOPNOTSUPP);
 +	}
 +
  	mutex_enter(&spa_namespace_lock);
  
  	zv = pp->private;
 _______________________________________________
 svn-src-all@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/svn-src-all
 To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
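
The guard added in r227110 can be modelled in userspace. This is a sketch
under assumptions, not the kernel code: struct owned_mutex,
owned_mutex_held(), model_zvol_open() and model_pool_open() are
hypothetical names. owned_mutex_held() stands in for MUTEX_HELD(), and
model_pool_open() stands in for the pool-open path from the backtrace,
which holds the namespace lock while it opens candidate providers.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a non-recursive lock that tracks its owner. */
struct owned_mutex {
	pthread_mutex_t	lock;
	pthread_t	owner;
	bool		held;
};

static struct owned_mutex namespace_lock = {
	.lock = PTHREAD_MUTEX_INITIALIZER,
	.held = false,
};

static void
owned_mutex_enter(struct owned_mutex *m)
{
	pthread_mutex_lock(&m->lock);
	m->owner = pthread_self();
	m->held = true;
}

static void
owned_mutex_exit(struct owned_mutex *m)
{
	m->held = false;
	pthread_mutex_unlock(&m->lock);
}

/*
 * Analogue of MUTEX_HELD(): true only when the calling thread owns the
 * lock. Simplified model; the fields are read without synchronization.
 */
static bool
owned_mutex_held(struct owned_mutex *m)
{
	return (m->held && pthread_equal(m->owner, pthread_self()));
}

/* Models the patched open routine: refuse instead of recursing. */
static int
model_zvol_open(void)
{
	if (owned_mutex_held(&namespace_lock))
		return (EOPNOTSUPP);

	owned_mutex_enter(&namespace_lock);
	/* The real routine would look up and open the volume here. */
	owned_mutex_exit(&namespace_lock);
	return (0);
}

/* Models a caller that already holds the lock when it opens a volume. */
static int
model_pool_open(void)
{
	int err;

	owned_mutex_enter(&namespace_lock);
	err = model_zvol_open();
	owned_mutex_exit(&namespace_lock);
	return (err);
}
```

In this model a direct model_zvol_open() call succeeds, while the nested
call made from model_pool_open() gets EOPNOTSUPP back. Without the guard,
the nested call would block forever on the plain mutex, which is the
userspace equivalent of the recursion panic.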
 

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/162008: commit references a PR
Date: Thu, 24 Nov 2011 07:25:53 +0000 (UTC)

 Author: pjd
 Date: Thu Nov 24 07:25:43 2011
 New Revision: 227923
 URL: http://svn.freebsd.org/changeset/base/227923
 
 Log:
   MFC r227110,r227111:
   
   r227110:
   
   In zvol_open() if the spa_namespace_lock is already held, it means that
   ZFS is trying to open and taste ZVOL as its VDEV. This is not supported,
   so return an error instead of panicking on spa_namespace_lock recursion.
   
   Reported by:	Robert Millan <rmh@debian.org>
   PR:		kern/162008
   
   r227111:
   
   Correct typo in comment.
   
   Reported by:	Fabian Keil <fk@fabiankeil.de>
   
   Approved by:	re (kib)
 
 Modified:
   stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c
 Directory Properties:
   stable/9/sys/   (props changed)
   stable/9/sys/amd64/include/xen/   (props changed)
   stable/9/sys/boot/   (props changed)
   stable/9/sys/boot/i386/efi/   (props changed)
   stable/9/sys/boot/ia64/efi/   (props changed)
   stable/9/sys/boot/ia64/ski/   (props changed)
   stable/9/sys/boot/powerpc/boot1.chrp/   (props changed)
   stable/9/sys/boot/powerpc/ofw/   (props changed)
   stable/9/sys/cddl/contrib/opensolaris/   (props changed)
   stable/9/sys/conf/   (props changed)
   stable/9/sys/contrib/dev/acpica/   (props changed)
   stable/9/sys/contrib/octeon-sdk/   (props changed)
   stable/9/sys/contrib/pf/   (props changed)
   stable/9/sys/contrib/x86emu/   (props changed)
 
 Modified: stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c
 ==============================================================================
 --- stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c	Thu Nov 24 06:27:47 2011	(r227922)
 +++ stable/9/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c	Thu Nov 24 07:25:43 2011	(r227923)
 @@ -875,6 +875,14 @@ zvol_open(struct g_provider *pp, int fla
  	zvol_state_t *zv;
  	int err = 0;
  
 +	if (MUTEX_HELD(&spa_namespace_lock)) {
 +		/*
 +		 * If the spa_namespace_lock is being held, it means that ZFS
 +		 * is trying to open ZVOL as its VDEV. This is not supported.
 +		 */
 +		return (EOPNOTSUPP);
 +	}
 +
  	mutex_enter(&spa_namespace_lock);
  
  	zv = pp->private;
 

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/162008: commit references a PR
Date: Thu, 24 Nov 2011 07:39:16 +0000 (UTC)

 Author: pjd
 Date: Thu Nov 24 07:39:01 2011
 New Revision: 227927
 URL: http://svn.freebsd.org/changeset/base/227927
 
 Log:
   MFC r227110,r227111:
   
   r227110:
   
   In zvol_open() if the spa_namespace_lock is already held, it means that
   ZFS is trying to open and taste ZVOL as its VDEV. This is not supported,
   so return an error instead of panicking on spa_namespace_lock recursion.
   
   Reported by:	Robert Millan <rmh@debian.org>
   PR:		kern/162008
   
   r227111:
   
   Correct typo in comment.
   
   Reported by:	Fabian Keil <fk@fabiankeil.de>
   
   Approved by:	re (kib)
 
 Modified:
   releng/9.0/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c
 Directory Properties:
   releng/9.0/sys/   (props changed)
   releng/9.0/sys/amd64/include/xen/   (props changed)
   releng/9.0/sys/boot/   (props changed)
   releng/9.0/sys/boot/i386/efi/   (props changed)
   releng/9.0/sys/boot/ia64/efi/   (props changed)
   releng/9.0/sys/boot/ia64/ski/   (props changed)
   releng/9.0/sys/boot/powerpc/boot1.chrp/   (props changed)
   releng/9.0/sys/boot/powerpc/ofw/   (props changed)
   releng/9.0/sys/cddl/contrib/opensolaris/   (props changed)
   releng/9.0/sys/conf/   (props changed)
   releng/9.0/sys/contrib/dev/acpica/   (props changed)
   releng/9.0/sys/contrib/octeon-sdk/   (props changed)
   releng/9.0/sys/contrib/pf/   (props changed)
   releng/9.0/sys/contrib/x86emu/   (props changed)
 
 Modified: releng/9.0/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c
 ==============================================================================
 --- releng/9.0/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c	Thu Nov 24 07:37:19 2011	(r227926)
 +++ releng/9.0/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c	Thu Nov 24 07:39:01 2011	(r227927)
 @@ -875,6 +875,14 @@ zvol_open(struct g_provider *pp, int fla
  	zvol_state_t *zv;
  	int err = 0;
  
 +	if (MUTEX_HELD(&spa_namespace_lock)) {
 +		/*
 +		 * If the spa_namespace_lock is being held, it means that ZFS
 +		 * is trying to open ZVOL as its VDEV. This is not supported.
 +		 */
 +		return (EOPNOTSUPP);
 +	}
 +
  	mutex_enter(&spa_namespace_lock);
  
  	zv = pp->private;
 
State-Changed-From-To: patched->closed 
State-Changed-By: pjd 
State-Changed-When: Thu Nov 24 07:40:12 UTC 2011
State-Changed-Why:  
Fix merged to stable/9 and releng/9.0. Thanks. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=162008 

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/162008: commit references a PR
Date: Thu,  5 Jan 2012 09:51:01 +0000 (UTC)

 Author: mm
 Date: Thu Jan  5 09:50:47 2012
 New Revision: 229567
 URL: http://svn.freebsd.org/changeset/base/229567
 
 Log:
   MFC r227110, r227111:
   
   MFC r227110 (pjd) [1]:
   In zvol_open() if the spa_namespace_lock is already held, it means that
   ZFS is trying to open and taste ZVOL as its VDEV. This is not supported,
   so return an error instead of panicking on spa_namespace_lock recursion.
   
   MFC r227111 (pjd) [2]:
   Correct typo in comment.
   
   PR:		kern/162008
   Reported by:	Robert Millan <rmh@debian.org> [1]
   		Fabian Keil <fk@fabiankeil.de> [2]
 
 Modified:
   stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c
 Directory Properties:
   stable/8/sys/   (props changed)
   stable/8/sys/cddl/contrib/opensolaris/   (props changed)
 
 Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c
 ==============================================================================
 --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c	Thu Jan  5 09:39:29 2012	(r229566)
 +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c	Thu Jan  5 09:50:47 2012	(r229567)
 @@ -875,6 +875,14 @@ zvol_open(struct g_provider *pp, int fla
  	zvol_state_t *zv;
  	int err = 0;
  
 +	if (MUTEX_HELD(&spa_namespace_lock)) {
 +		/*
 +		 * If the spa_namespace_lock is being held, it means that ZFS
 +		 * is trying to open ZVOL as its VDEV. This is not supported.
 +		 */
 +		return (EOPNOTSUPP);
 +	}
 +
  	mutex_enter(&spa_namespace_lock);
  
  	zv = pp->private;
 
>Unformatted:
