From nobody@FreeBSD.org  Thu Oct 22 20:03:24 2009
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 6EB87106566B
	for <freebsd-gnats-submit@FreeBSD.org>; Thu, 22 Oct 2009 20:03:24 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (www.freebsd.org [IPv6:2001:4f8:fff6::21])
	by mx1.freebsd.org (Postfix) with ESMTP id 45B198FC1A
	for <freebsd-gnats-submit@FreeBSD.org>; Thu, 22 Oct 2009 20:03:24 +0000 (UTC)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.14.3/8.14.3) with ESMTP id n9MK3NwR004100
	for <freebsd-gnats-submit@FreeBSD.org>; Thu, 22 Oct 2009 20:03:23 GMT
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.14.3/8.14.3/Submit) id n9MK3N3m004099;
	Thu, 22 Oct 2009 20:03:23 GMT
	(envelope-from nobody)
Message-Id: <200910222003.n9MK3N3m004099@www.freebsd.org>
Date: Thu, 22 Oct 2009 20:03:23 GMT
From: Yuri <yuri@tsoft.com>
To: freebsd-gnats-submit@FreeBSD.org
Subject: geom_mbr load/unload causes system to hang
X-Send-Pr-Version: www-3.1
X-GNATS-Notify:

>Number:         139847
>Category:       kern
>Synopsis:       [geom_mbr] [patch] load/unload causes system to hang
>Confidential:   no
>Severity:       serious
>Priority:       low
>Responsible:    jh
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Thu Oct 22 20:10:01 UTC 2009
>Closed-Date:    Wed Mar 02 15:46:45 UTC 2011
>Last-Modified:  Wed Mar 02 15:46:45 UTC 2011
>Originator:     Yuri
>Release:        FreeBSD 9.0-RC2
>Organization:
n/a
>Environment:
>Description:
I ran 'kldload geom_mbr'. It then printed many messages to dmesg saying
that various partitions do not start or end on a track boundary.

A subsequent 'kldunload geom_mbr' command hung, and kill -9 did not kill it.
The system hung completely after I exited the X server.

The only special circumstance was that another, unused IDE hard drive was
attached to the system; it had been formatted by some OEM with Windows.

>How-To-Repeat:

>Fix:


>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-bugs->freebsd-geom 
Responsible-Changed-By: gavin 
Responsible-Changed-When: Thu Oct 22 21:20:59 UTC 2009 
Responsible-Changed-Why:  
Over to maintainer(s) 

http://www.freebsd.org/cgi/query-pr.cgi?pr=139847 

From: Gavin Atkinson <gavin@FreeBSD.org>
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/139847: [geom_mbr] load/unload causes system to hang
Date: Sat, 31 Oct 2009 15:28:11 +0000 (GMT)

 I can (accidentally) confirm that this bug exists. :-(
 
 FreeBSD rho.york.ac.uk 9.0-CURRENT FreeBSD 9.0-CURRENT #5: Mon Oct 19 
 20:34:46 BST 2009     root@rho.york.ac.uk:/usr/obj/usr/src/sys/RHO  i386
 
 "kldunload geom_mbr" hangs, with the backtrace below.  It hangs while 
 holding the GEOM topology lock and the kernel linker lock, which is 
 presumably the cause of the hang reported by the original submitter (which 
 I haven't witnessed yet but expect to soon...)
 
 KDB: enter: manual escape to debugger
 [thread pid 12 tid 100024 ]
 Stopped at      kdb_enter+0x3a: movl    $0,kdb_why
 db> tr 54907
 Tracing pid 54907 tid 100193 td 0xc5666690
 sched_switch(c5666690,0,104,191,7439752a,...) at sched_switch+0x418
 mi_switch(104,0,c0cb4ff6,1d6,4c,...) at mi_switch+0x200
 sleepq_switch(c5666690,0,c0cb4ff6,26e,0,...) at sleepq_switch+0x15f
 sleepq_timedwait(cac08280,4c,c0ca78f4,0,0,...) at sleepq_timedwait+0x6b
 _sleep(cac08280,0,4c,c0ca78f4,3e8,...) at _sleep+0x339
 g_waitfor_event(c08413f0,c5105090,2,0,ca656980,...) at g_waitfor_event+0x9c
 g_modevent(ca656980,1,cb8a6440,109,0,...) at g_modevent+0x14f
 module_unload(ca656980,c0cad345,274,271,c0884be6,...) at module_unload+0x43
 linker_file_unload(c62b4000,0,c0cad345,42c,cb8a3000,...) at linker_file_unload+0x15e
 kern_kldunload(c5666690,a,0,e79dcd2c,c0be0953,...) at kern_kldunload+0xd5
 kldunloadf(c5666690,e79dccf8,8,c0cb7d91,c0d9f210,...) at kldunloadf+0x2b
 syscall(e79dcd38) at syscall+0x2d3
 Xint0x80_syscall() at Xint0x80_syscall+0x20
 --- syscall (444, FreeBSD ELF32, kldunloadf), eip = 0x280d62fb, esp = 
 0xbfbfe48c, ebp = 0xbfbfecd8 ---
 db>
 db> sh alllocks
 Process 58972 (procstat) thread 0xc53ee8c0 (100089)
 shared sx sysctl lock (sysctl lock) r = 0 (0xc0e25844) locked @ 
 /usr/src/sys/kern/kern_sysctl.c:1521
 Process 54907 (kldunload) thread 0xc5666690 (100193)
 exclusive sx kernel linker (kernel linker) r = 0 (0xc0e23f98) locked @ 
 /usr/src/sys/kern/kern_linker.c:1068
 Process 2 (g_event) thread 0xc4e07000 (100011)
 exclusive sx GEOM topology (GEOM topology) r = 0 (0xc0e23508) locked @ 
 /usr/src/sys/geom/geom_event.c:185
 Process 12 (intr) thread 0xc4e07d20 (100024)
 exclusive sleep mutex Giant (Giant) r = 1 (0xc0e25130) locked @ 
 /usr/src/sys/dev/usb/usb_transfer.c:2954
 db>
 
 (Giant held by USB due to escaping to debugger via USB keyboard.  The 
 sysctl lock is held because I ran "procstat -kk" on the kldunload process, 
 which also wedged waiting for the kernel linker lock.)
 
 Gavin

From: Jaakko Heinonen <jh@FreeBSD.org>
To: Gavin Atkinson <gavin@FreeBSD.org>, yuri@tsoft.com
Cc: bug-followup@FreeBSD.org
Subject: Re: kern/139847: [geom_mbr] load/unload causes system to hang
Date: Sun, 1 Nov 2009 10:29:29 +0200

 Hi,
 
 On 2009-10-31, Gavin Atkinson wrote:
 >  "kldunload geom_mbr" hangs, with the backtrace below.  It hangs while 
 >  holding the GEOM topology lock and the kernel linker lock, which is 
 
 I have described this GEOM problem here:
 
 http://docs.freebsd.org/cgi/mid.cgi?20081216210311.GA5229
 
 (You are seeing the deadlock described in section 2.)
 
 There's also a link to a patch you could try.
 
 -- 
 Jaakko
Responsible-Changed-From-To: freebsd-geom->jh 
Responsible-Changed-By: jh 
Responsible-Changed-When: Thu Apr 29 18:25:27 UTC 2010 
Responsible-Changed-Why:  
Take. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=139847 

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/139847: commit references a PR
Date: Wed,  5 May 2010 18:53:38 +0000 (UTC)

 Author: jh
 Date: Wed May  5 18:53:24 2010
 New Revision: 207671
 URL: http://svn.freebsd.org/changeset/base/207671
 
 Log:
   Fix deadlock between GEOM class unloading and withering. Withering can't
   proceed while g_unload_class() blocks the event thread. Fix this by not
   running g_unload_class() as a GEOM event and dropping the topology lock
   when withering needs to proceed.
   
   PR:		kern/139847
   Silence on:	freebsd-geom
 
 Modified:
   head/sys/geom/geom.h
   head/sys/geom/geom_subr.c
 
 Modified: head/sys/geom/geom.h
 ==============================================================================
 --- head/sys/geom/geom.h	Wed May  5 18:22:29 2010	(r207670)
 +++ head/sys/geom/geom.h	Wed May  5 18:53:24 2010	(r207671)
 @@ -353,6 +353,9 @@ g_free(void *ptr)
  		sx_assert(&topology_lock, SX_UNLOCKED);		\
  	} while (0)
  
 +#define g_topology_sleep(chan, timo)				\
 +	sx_sleep(chan, &topology_lock, 0, "gtopol", timo)
 +
  #define DECLARE_GEOM_CLASS(class, name) 			\
  	static moduledata_t name##_mod = {			\
  		#name, g_modevent, &class			\
 
 Modified: head/sys/geom/geom_subr.c
 ==============================================================================
 --- head/sys/geom/geom_subr.c	Wed May  5 18:22:29 2010	(r207670)
 +++ head/sys/geom/geom_subr.c	Wed May  5 18:53:24 2010	(r207671)
 @@ -134,65 +134,73 @@ g_load_class(void *arg, int flag)
  	}
  }
  
 -static void
 -g_unload_class(void *arg, int flag)
 +static int
 +g_unload_class(struct g_class *mp)
  {
 -	struct g_hh00 *hh;
 -	struct g_class *mp;
  	struct g_geom *gp;
  	struct g_provider *pp;
  	struct g_consumer *cp;
  	int error;
  
 -	g_topology_assert();
 -	hh = arg;
 -	mp = hh->mp;
 -	G_VALID_CLASS(mp);
 +	g_topology_lock();
  	g_trace(G_T_TOPOLOGY, "g_unload_class(%s)", mp->name);
 -
 -	/*
 -	 * We allow unloading if we have no geoms, or a class
 -	 * method we can use to get rid of them.
 -	 */
 -	if (!LIST_EMPTY(&mp->geom) && mp->destroy_geom == NULL) {
 -		hh->error = EOPNOTSUPP;
 -		return;
 -	}
 -
 -	/* We refuse to unload if anything is open */
 +retry:
 +	G_VALID_CLASS(mp);
  	LIST_FOREACH(gp, &mp->geom, geom) {
 +		/* We refuse to unload if anything is open */
  		LIST_FOREACH(pp, &gp->provider, provider)
  			if (pp->acr || pp->acw || pp->ace) {
 -				hh->error = EBUSY;
 -				return;
 +				g_topology_unlock();
 +				return (EBUSY);
  			}
  		LIST_FOREACH(cp, &gp->consumer, consumer)
  			if (cp->acr || cp->acw || cp->ace) {
 -				hh->error = EBUSY;
 -				return;
 +				g_topology_unlock();
 +				return (EBUSY);
  			}
 +		/* If the geom is withering, wait for it to finish. */
 +		if (gp->flags & G_GEOM_WITHER) {
 +			g_topology_sleep(mp, 1);
 +			goto retry;
 +		}
 +	}
 +
 +	/*
 +	 * We allow unloading if we have no geoms, or a class
 +	 * method we can use to get rid of them.
 +	 */
 +	if (!LIST_EMPTY(&mp->geom) && mp->destroy_geom == NULL) {
 +		g_topology_unlock();
 +		return (EOPNOTSUPP);
  	}
  
  	/* Bar new entries */
  	mp->taste = NULL;
  	mp->config = NULL;
  
 -	error = 0;
 +	LIST_FOREACH(gp, &mp->geom, geom) {
 +		error = mp->destroy_geom(NULL, mp, gp);
 +		if (error != 0) {
 +			g_topology_unlock();
 +			return (error);
 +		}
 +	}
 +	/* Wait for withering to finish. */
  	for (;;) {
  		gp = LIST_FIRST(&mp->geom);
  		if (gp == NULL)
  			break;
 -		error = mp->destroy_geom(NULL, mp, gp);
 -		if (error != 0)
 -			break;
 +		KASSERT(gp->flags & G_GEOM_WITHER,
 +		   ("Non-withering geom in class %s", mp->name));
 +		g_topology_sleep(mp, 1);
  	}
 -	if (error == 0) {
 -		if (mp->fini != NULL)
 -			mp->fini(mp);
 -		LIST_REMOVE(mp, class);
 -	}
 -	hh->error = error;
 -	return;
 +	G_VALID_CLASS(mp);
 +	if (mp->fini != NULL)
 +		mp->fini(mp);
 +	LIST_REMOVE(mp, class);
 +	g_topology_unlock();
 +
 +	return (0);
  }
  
  int
 @@ -213,12 +221,12 @@ g_modevent(module_t mod, int type, void 
  		g_ignition++;
  		g_init();
  	}
 -	hh = g_malloc(sizeof *hh, M_WAITOK | M_ZERO);
 -	hh->mp = data;
  	error = EOPNOTSUPP;
  	switch (type) {
  	case MOD_LOAD:
 -		g_trace(G_T_TOPOLOGY, "g_modevent(%s, LOAD)", hh->mp->name);
 +		g_trace(G_T_TOPOLOGY, "g_modevent(%s, LOAD)", mp->name);
 +		hh = g_malloc(sizeof *hh, M_WAITOK | M_ZERO);
 +		hh->mp = mp;
  		/*
  		 * Once the system is not cold, MOD_LOAD calls will be
  		 * from the userland and the g_event thread will be able
 @@ -236,18 +244,14 @@ g_modevent(module_t mod, int type, void 
  		}
  		break;
  	case MOD_UNLOAD:
 -		g_trace(G_T_TOPOLOGY, "g_modevent(%s, UNLOAD)", hh->mp->name);
 -		error = g_waitfor_event(g_unload_class, hh, M_WAITOK, NULL);
 -		if (error == 0)
 -			error = hh->error;
 +		g_trace(G_T_TOPOLOGY, "g_modevent(%s, UNLOAD)", mp->name);
 +		DROP_GIANT();
 +		error = g_unload_class(mp);
 +		PICKUP_GIANT();
  		if (error == 0) {
 -			KASSERT(LIST_EMPTY(&hh->mp->geom),
 -			    ("Unloaded class (%s) still has geom", hh->mp->name));
 +			KASSERT(LIST_EMPTY(&mp->geom),
 +			    ("Unloaded class (%s) still has geom", mp->name));
  		}
 -		g_free(hh);
 -		break;
 -	default:
 -		g_free(hh);
  		break;
  	}
  	return (error);
 _______________________________________________
 svn-src-all@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/svn-src-all
 To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
 
State-Changed-From-To: open->patched 
State-Changed-By: jh 
State-Changed-When: Thu May 6 14:38:12 UTC 2010 
State-Changed-Why:  
Patched in head (r207671). 

http://www.freebsd.org/cgi/query-pr.cgi?pr=139847 
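
The locking pattern described in the r207671 log (do not run the unload
in the event thread, and drop the topology lock while waiting for
withering) can be illustrated with a rough userland sketch. This is not
kernel code: it assumes POSIX threads, and topology_sleep(),
event_thread(), unload_demo() and withering_geoms are hypothetical
stand-ins chosen for illustration. pthread_cond_timedwait() plays the
role of sx_sleep() because it likewise releases the lock while asleep
and reacquires it before returning.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <time.h>

/* Userland stand-ins (hypothetical names) for the kernel objects. */
static pthread_mutex_t topology_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t wither_chan = PTHREAD_COND_INITIALIZER;
static int withering_geoms;	/* geoms still waiting to be withered */

/*
 * Rough analogue of g_topology_sleep(): sleep for up to timo_ms,
 * releasing topology_lock while asleep and reacquiring it on return.
 */
static void
topology_sleep(long timo_ms)
{
	struct timespec ts;

	clock_gettime(CLOCK_REALTIME, &ts);
	ts.tv_nsec += timo_ms * 1000000L;
	ts.tv_sec += ts.tv_nsec / 1000000000L;
	ts.tv_nsec %= 1000000000L;
	(void)pthread_cond_timedwait(&wither_chan, &topology_lock, &ts);
}

/* Models the event thread: it needs topology_lock to wither a geom. */
static void *
event_thread(void *arg)
{
	struct timespec pause = { 0, 1000000L };
	int left;

	(void)arg;
	do {
		pthread_mutex_lock(&topology_lock);
		if (withering_geoms > 0)
			withering_geoms--;
		left = withering_geoms;
		pthread_mutex_unlock(&topology_lock);
		nanosleep(&pause, NULL);
	} while (left > 0);
	return (NULL);
}

/*
 * Models the fixed unload path: it runs in the caller's thread and
 * polls with the lock dropped. Returns the number of geoms left (0 on
 * success) or -1 if the helper thread could not be started.
 */
static int
unload_demo(int ngeoms)
{
	pthread_t td;
	int remaining;

	assert(ngeoms >= 0);
	pthread_mutex_lock(&topology_lock);
	withering_geoms = ngeoms;
	if (pthread_create(&td, NULL, event_thread, NULL) != 0) {
		pthread_mutex_unlock(&topology_lock);
		return (-1);
	}
	/* The lock is released inside topology_sleep(), so withering runs. */
	while (withering_geoms > 0)
		topology_sleep(1);
	remaining = withering_geoms;
	pthread_mutex_unlock(&topology_lock);
	pthread_join(td, NULL);
	return (remaining);
}
```

Built with -pthread and called as unload_demo(3) from a main(), the
sketch is expected to finish promptly. If the wait loop instead kept
topology_lock held (or ran inside event_thread() itself), the withering
step could never acquire the lock, which corresponds to the hang
reported in this PR.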

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/139847: commit references a PR
Date: Fri, 30 Jul 2010 13:25:11 +0000 (UTC)

 Author: jh
 Date: Fri Jul 30 13:23:21 2010
 New Revision: 210646
 URL: http://svn.freebsd.org/changeset/base/210646
 
 Log:
   MFC r207671:
   
   Fix deadlock between GEOM class unloading and withering. Withering can't
   proceed while g_unload_class() blocks the event thread. Fix this by not
   running g_unload_class() as a GEOM event and dropping the topology lock
   when withering needs to proceed.
   
   PR:		kern/139847
 
 Modified:
   stable/8/sys/geom/geom.h
   stable/8/sys/geom/geom_subr.c
 Directory Properties:
   stable/8/sys/   (props changed)
   stable/8/sys/amd64/include/xen/   (props changed)
   stable/8/sys/cddl/contrib/opensolaris/   (props changed)
   stable/8/sys/contrib/dev/acpica/   (props changed)
   stable/8/sys/contrib/pf/   (props changed)
   stable/8/sys/dev/xen/xenpci/   (props changed)
 
 Modified: stable/8/sys/geom/geom.h
 ==============================================================================
 --- stable/8/sys/geom/geom.h	Fri Jul 30 12:56:34 2010	(r210645)
 +++ stable/8/sys/geom/geom.h	Fri Jul 30 13:23:21 2010	(r210646)
 @@ -353,6 +353,9 @@ g_free(void *ptr)
  		sx_assert(&topology_lock, SX_UNLOCKED);		\
  	} while (0)
  
 +#define g_topology_sleep(chan, timo)				\
 +	sx_sleep(chan, &topology_lock, 0, "gtopol", timo)
 +
  #define DECLARE_GEOM_CLASS(class, name) 			\
  	static moduledata_t name##_mod = {			\
  		#name, g_modevent, &class			\
 
 Modified: stable/8/sys/geom/geom_subr.c
 ==============================================================================
 --- stable/8/sys/geom/geom_subr.c	Fri Jul 30 12:56:34 2010	(r210645)
 +++ stable/8/sys/geom/geom_subr.c	Fri Jul 30 13:23:21 2010	(r210646)
 @@ -134,65 +134,73 @@ g_load_class(void *arg, int flag)
  	}
  }
  
 -static void
 -g_unload_class(void *arg, int flag)
 +static int
 +g_unload_class(struct g_class *mp)
  {
 -	struct g_hh00 *hh;
 -	struct g_class *mp;
  	struct g_geom *gp;
  	struct g_provider *pp;
  	struct g_consumer *cp;
  	int error;
  
 -	g_topology_assert();
 -	hh = arg;
 -	mp = hh->mp;
 -	G_VALID_CLASS(mp);
 +	g_topology_lock();
  	g_trace(G_T_TOPOLOGY, "g_unload_class(%s)", mp->name);
 -
 -	/*
 -	 * We allow unloading if we have no geoms, or a class
 -	 * method we can use to get rid of them.
 -	 */
 -	if (!LIST_EMPTY(&mp->geom) && mp->destroy_geom == NULL) {
 -		hh->error = EOPNOTSUPP;
 -		return;
 -	}
 -
 -	/* We refuse to unload if anything is open */
 +retry:
 +	G_VALID_CLASS(mp);
  	LIST_FOREACH(gp, &mp->geom, geom) {
 +		/* We refuse to unload if anything is open */
  		LIST_FOREACH(pp, &gp->provider, provider)
  			if (pp->acr || pp->acw || pp->ace) {
 -				hh->error = EBUSY;
 -				return;
 +				g_topology_unlock();
 +				return (EBUSY);
  			}
  		LIST_FOREACH(cp, &gp->consumer, consumer)
  			if (cp->acr || cp->acw || cp->ace) {
 -				hh->error = EBUSY;
 -				return;
 +				g_topology_unlock();
 +				return (EBUSY);
  			}
 +		/* If the geom is withering, wait for it to finish. */
 +		if (gp->flags & G_GEOM_WITHER) {
 +			g_topology_sleep(mp, 1);
 +			goto retry;
 +		}
 +	}
 +
 +	/*
 +	 * We allow unloading if we have no geoms, or a class
 +	 * method we can use to get rid of them.
 +	 */
 +	if (!LIST_EMPTY(&mp->geom) && mp->destroy_geom == NULL) {
 +		g_topology_unlock();
 +		return (EOPNOTSUPP);
  	}
  
  	/* Bar new entries */
  	mp->taste = NULL;
  	mp->config = NULL;
  
 -	error = 0;
 +	LIST_FOREACH(gp, &mp->geom, geom) {
 +		error = mp->destroy_geom(NULL, mp, gp);
 +		if (error != 0) {
 +			g_topology_unlock();
 +			return (error);
 +		}
 +	}
 +	/* Wait for withering to finish. */
  	for (;;) {
  		gp = LIST_FIRST(&mp->geom);
  		if (gp == NULL)
  			break;
 -		error = mp->destroy_geom(NULL, mp, gp);
 -		if (error != 0)
 -			break;
 +		KASSERT(gp->flags & G_GEOM_WITHER,
 +		   ("Non-withering geom in class %s", mp->name));
 +		g_topology_sleep(mp, 1);
  	}
 -	if (error == 0) {
 -		if (mp->fini != NULL)
 -			mp->fini(mp);
 -		LIST_REMOVE(mp, class);
 -	}
 -	hh->error = error;
 -	return;
 +	G_VALID_CLASS(mp);
 +	if (mp->fini != NULL)
 +		mp->fini(mp);
 +	LIST_REMOVE(mp, class);
 +	g_topology_unlock();
 +
 +	return (0);
  }
  
  int
 @@ -213,12 +221,12 @@ g_modevent(module_t mod, int type, void 
  		g_ignition++;
  		g_init();
  	}
 -	hh = g_malloc(sizeof *hh, M_WAITOK | M_ZERO);
 -	hh->mp = data;
  	error = EOPNOTSUPP;
  	switch (type) {
  	case MOD_LOAD:
 -		g_trace(G_T_TOPOLOGY, "g_modevent(%s, LOAD)", hh->mp->name);
 +		g_trace(G_T_TOPOLOGY, "g_modevent(%s, LOAD)", mp->name);
 +		hh = g_malloc(sizeof *hh, M_WAITOK | M_ZERO);
 +		hh->mp = mp;
  		/*
  		 * Once the system is not cold, MOD_LOAD calls will be
  		 * from the userland and the g_event thread will be able
 @@ -236,18 +244,14 @@ g_modevent(module_t mod, int type, void 
  		}
  		break;
  	case MOD_UNLOAD:
 -		g_trace(G_T_TOPOLOGY, "g_modevent(%s, UNLOAD)", hh->mp->name);
 -		error = g_waitfor_event(g_unload_class, hh, M_WAITOK, NULL);
 -		if (error == 0)
 -			error = hh->error;
 +		g_trace(G_T_TOPOLOGY, "g_modevent(%s, UNLOAD)", mp->name);
 +		DROP_GIANT();
 +		error = g_unload_class(mp);
 +		PICKUP_GIANT();
  		if (error == 0) {
 -			KASSERT(LIST_EMPTY(&hh->mp->geom),
 -			    ("Unloaded class (%s) still has geom", hh->mp->name));
 +			KASSERT(LIST_EMPTY(&mp->geom),
 +			    ("Unloaded class (%s) still has geom", mp->name));
  		}
 -		g_free(hh);
 -		break;
 -	default:
 -		g_free(hh);
  		break;
  	}
  	return (error);
 
State-Changed-From-To: patched->closed 
State-Changed-By: jh 
State-Changed-When: Wed Mar 2 15:46:44 UTC 2011 
State-Changed-Why:  
Fixed in head and stable/8. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=139847 
>Unformatted:
