From cperciva@xps.daemonology.net  Wed Dec 29 05:30:44 2010
Return-Path: <cperciva@xps.daemonology.net>
Received: from mx2.freebsd.org (mx2.freebsd.org [IPv6:2001:4f8:fff6::35])
	by hub.freebsd.org (Postfix) with ESMTP id 1E7A6106564A
	for <FreeBSD-gnats-submit@freebsd.org>; Wed, 29 Dec 2010 05:30:44 +0000 (UTC)
	(envelope-from cperciva@xps.daemonology.net)
Received: from xps.daemonology.net (freefall.freebsd.org [IPv6:2001:4f8:fff6::28])
	by mx2.freebsd.org (Postfix) with SMTP id C9600177749
	for <FreeBSD-gnats-submit@freebsd.org>; Wed, 29 Dec 2010 05:30:32 +0000 (UTC)
Received: (qmail 8677 invoked by uid 1001); 29 Dec 2010 05:30:32 -0000
Message-Id: <20101229053032.8676.qmail@xps.daemonology.net>
Date: 29 Dec 2010 05:30:32 -0000
From: Colin Percival <cperciva@freebsd.org>
Reply-To: Colin Percival <cperciva@freebsd.org>
To: FreeBSD-gnats-submit@freebsd.org
Cc:
Subject: kernel panic when xen disk is detached
X-Send-Pr-Version: 3.113
X-GNATS-Notify:

>Number:         153511
>Category:       kern
>Synopsis:       [xen] [panic] kernel panic when xen disk is detached
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-xen
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Wed Dec 29 05:40:10 UTC 2010
>Closed-Date:    Thu Jan 06 23:03:37 UTC 2011
>Last-Modified:  Thu Jan 06 23:03:37 UTC 2011
>Originator:     Colin Percival
>Release:        FreeBSD 9.0-CURRENT i386/XEN
>Organization:
>Environment:

FreeBSD 9.0-CURRENT @ 2010-12-28 i386/XEN running in EC2.

>Description:

Attaching a new disk to a running FreeBSD instance works fine:

xbd2: 1024MB <Virtual Block Device> at device/vbd/2128 on xenbusb_front0
xbd2: attaching as da5
GEOM: new disk da5

but detaching it causes a panic:

Kernel page fault with the following non-sleepable locks held:
exclusive sleep mutex intr sources (intr sources) r = 0 (0xc0576b00) locked @ /usr/src/sys/i386/i386/intr_machdep.c:190

KDB: stack backtrace:
X_db_sym_numargs(c037e9b5,c0117cd0,c273aa0c,a,c273aa48,...) at X_db_sym_numargs+0x146
kdb_backtrace(be,1,ffffffff,c0536fc4,c273aa9c,...) at kdb_backtrace+0x2a
witness_display_spinlock(c038106f,c273aab0,4,1,0,...) at witness_display_spinlock+0x75
witness_warn(5,0,c03aaea4,c0893888,c03f2de0,...) at witness_warn+0x1fe
trap(c273ab34) at trap+0x16a
alltraps(c2adf000,c2d028b8,c273abcc,c031e15d,88,...) at alltraps+0x1b
unbind_from_irqhandler(88,c03e0780,c03a421c,4e1,c2ae3ef4,...) at unbind_from_irqhandler+0x21
balloon_update_driver_allowance(c2d53100,c296d060,c03bc02c,a6e,c2d53100,...) at balloon_update_driver_allowance+0x94d
device_detach(c2d53100,c2adfa80,c2adfa80,c2d53100,c273ac24,...) at device_detach+0x8c
device_delete_child(c2960400,c2d53100,c2d53100,c2960400,c,...) at device_delete_child+0x35
xenbusb_identify(0,c2990530,c038477a,c273ac60,8,...) at xenbusb_identify+0xa7
xenbusb_add_device(c03f49b0,0,c03a3a97,1c0,c273acc4,...) at xenbusb_add_device+0x604
xenbusb_attach(c2960400,1,c03802df,f1,c2987858,...) at xenbusb_attach+0x192
taskqueue_thread_enqueue(c2987840,c2987858,0,c0370aea,0,...) at taskqueue_thread_enqueue+0x12b
taskqueue_thread_loop(c0408708,c273ad28,c0376a5f,35b,c03f2de0,...) at taskqueue_thread_loop+0x67
fork_exit(c01211e0,c0408708,c273ad28) at fork_exit+0xb8
fork_trampoline() at fork_trampoline+0x8
--- trap 0, eip = 0, esp = 0xc273ad60, ebp = 0 ---

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address	= 0x0
fault code		= supervisor read, page not present
instruction pointer	= 0x21:0x0
stack pointer	        = 0x29:0xc273ab74
frame pointer	        = 0x29:0xc273ab94
code segment		= base 0x0, limit 0xf9800, type 0x1b
			= DPL 1, pres 1, def32 1, gran 1
processor eflags	= interrupt enabled, resume, IOPL = 0
current process		= 0 (thread taskq)
trap number		= 12
panic: page fault
cpuid = 0
KDB: stack backtrace:
X_db_sym_numargs(c037e9b5,c298c000,c273aa18,c01177cb,c037b930,...) at X_db_sym_numargs+0x146
kdb_backtrace(c037b930,0,c036d3a5,c273aa64,0,...) at kdb_backtrace+0x2a
panic(c036d3a5,c03aaeab,c298c1ac,1,1,...) at panic+0x117
dblfault_handler() at dblfault_handler+0x3c3
--- trap 0x17, eip = 0, esp = 0, ebp = 0 ---
Uptime: 1h43m31
Physical memory: 607 MB
Dumping 53 MB: 38 22 6
Dump complete

>How-To-Repeat:
Launch a FreeBSD instance in EC2.  Create a new EBS volume.  Attach it
to the instance.  Detach it from the instance.
>Fix:

The presence of unbind_from_irqhandler and the EIP=0 panic make me
suspect that we're doing something silly like invoking a handler
after it has been (re)set to NULL.
>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-bugs->freebsd-xen 
Responsible-Changed-By: cperciva 
Responsible-Changed-When: Wed Dec 29 05:40:49 UTC 2010 
Responsible-Changed-Why:  
Assign Xen bug to freebsd-xen list. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=153511 

From: Colin Percival <cperciva@freebsd.org>
To: bug-followup@FreeBSD.org, cperciva@freebsd.org
Cc:  
Subject: Re: kern/153511: [xen] [panic] kernel panic when xen disk is detached
Date: Wed, 29 Dec 2010 08:08:54 -0800

 This panic is happening because we don't have a .pic_disable_intr: when
 the interrupt is being torn down, pic_disable_intr gets invoked, which
 doesn't work very well when it's NULL.
 
 I have a patch which seems to fix this panic, which I'll commit later today
 or tomorrow assuming no problems turn up in further testing.
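
For illustration only, here is a minimal user-space sketch of this failure mode. It is not the kernel code: the struct layouts are simplified stand-ins, and demo_template, demo_enable_intr and demo_remove_handler are hypothetical names. A member left out of a designated initializer is zero-initialized, so the method pointer is NULL. Calling through it unguarded would jump to address 0, which is consistent with the eip = 0 in the trace.

```c
#include <assert.h>
#include <stddef.h>

struct intsrc;

/* Simplified stand-in for a PIC method table (assumption, not the real layout). */
struct pic {
	void (*pic_enable_intr)(struct intsrc *);
	void (*pic_disable_intr)(struct intsrc *);
};

/* Simplified stand-in for an interrupt source. */
struct intsrc {
	struct pic *is_pic;
	int enabled;
};

static void
demo_enable_intr(struct intsrc *isrc)
{
	isrc->enabled = 1;
}

/*
 * Only .pic_enable_intr is listed.  C zero-initializes the omitted
 * member, so .pic_disable_intr is NULL.
 */
static struct pic demo_template = {
	.pic_enable_intr = demo_enable_intr,
};

/*
 * Hypothetical teardown path.  Returns 0 if the disable method ran and
 * -1 if the method is missing.  Without the NULL check, the indirect
 * call would branch to address 0.
 */
static int
demo_remove_handler(struct intsrc *isrc)
{
	assert(isrc != NULL && isrc->is_pic != NULL);
	if (isrc->is_pic->pic_disable_intr == NULL)
		return (-1);
	isrc->is_pic->pic_disable_intr(isrc);
	return (0);
}
```

The NULL check is only there so the sketch can report the missing method instead of crashing; the fix for this panic is to supply the method, not to guard the call.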
 
 -- 
 Colin Percival
 Security Officer, FreeBSD | freebsd.org | The power to serve
 Founder / author, Tarsnap | tarsnap.com | Online backups for the truly paranoid

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/153511: commit references a PR
Date: Thu, 30 Dec 2010 01:29:04 +0000 (UTC)

 Author: cperciva
 Date: Thu Dec 30 01:28:56 2010
 New Revision: 216812
 URL: http://svn.freebsd.org/changeset/base/216812
 
 Log:
   Add xenpic_dynirq_disable_intr and set it as the .pic_disable_intr method
   for xenpic_dynirq_template.  This fixes a panic when a virtual disk is
   removed, since that results in an interrupt channel being disabled and
   NULL isn't a very good function for disabling interrupts.
   
   We should probably have a xenpic_pirq_disable_intr as well; I'm not adding
   that here because (a) I'm not sure what uses pirqs so I don't have a test
   case, and (b) the xenpic_pirq_enable_intr code is significantly more
   complex than the xenpic_dynirq_enable_intr code, so I'm not sure what
   should go into a xenpic_pirq_disable_intr routine.
   
   PR:		kern/153511
   MFC after:	3 days
 
 Modified:
   head/sys/xen/evtchn/evtchn.c
 
 Modified: head/sys/xen/evtchn/evtchn.c
 ==============================================================================
 --- head/sys/xen/evtchn/evtchn.c	Thu Dec 30 01:13:42 2010	(r216811)
 +++ head/sys/xen/evtchn/evtchn.c	Thu Dec 30 01:28:56 2010	(r216812)
 @@ -628,6 +628,7 @@ static void     xenpic_dynirq_enable_sou
  static void     xenpic_dynirq_disable_source(struct intsrc *isrc, int); 
  static void     xenpic_dynirq_eoi_source(struct intsrc *isrc); 
  static void     xenpic_dynirq_enable_intr(struct intsrc *isrc); 
 +static void     xenpic_dynirq_disable_intr(struct intsrc *isrc); 
  
  static void     xenpic_pirq_enable_source(struct intsrc *isrc); 
  static void     xenpic_pirq_disable_source(struct intsrc *isrc, int); 
 @@ -647,6 +648,7 @@ struct pic xenpic_dynirq_template  =  { 
  	.pic_disable_source	=	xenpic_dynirq_disable_source,
  	.pic_eoi_source		=	xenpic_dynirq_eoi_source, 
  	.pic_enable_intr	=	xenpic_dynirq_enable_intr, 
 +	.pic_disable_intr	=	xenpic_dynirq_disable_intr,
  	.pic_vector		=	xenpic_vector, 
  	.pic_source_pending	=	xenpic_source_pending,
  	.pic_suspend		=	xenpic_suspend, 
 @@ -716,6 +718,20 @@ xenpic_dynirq_enable_intr(struct intsrc 
  }
  
  static void 
 +xenpic_dynirq_disable_intr(struct intsrc *isrc)
 +{
 +	unsigned int irq;
 +	struct xenpic_intsrc *xp;
 +	
 +	xp = (struct xenpic_intsrc *)isrc;	
 +	mtx_lock_spin(&irq_mapping_update_lock);
 +	irq = xenpic_vector(isrc);
 +	mask_evtchn(evtchn_from_irq(irq));
 +	xp->xp_masked = 1;
 +	mtx_unlock_spin(&irq_mapping_update_lock);
 +}
 +
 +static void 
  xenpic_dynirq_eoi_source(struct intsrc *isrc)
  {
  	unsigned int irq;
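
The shape of the new routine can be sketched in user space as follows. This is a simplified model under stated assumptions, not the committed code: the 32-bit mask word, the irq-to-port table and every demo_ name are hypothetical, and the spin lock that the real routine takes (irq_mapping_update_lock) is left out because the model is single-threaded.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_NR_IRQS	32

/* Hypothetical model state: bit n set means event channel n is masked. */
static uint32_t demo_evtchn_mask;
/* Hypothetical irq -> event channel port mapping. */
static unsigned int demo_irq_to_evtchn[DEMO_NR_IRQS];

/* Stand-in for the per-source state; only the fields the sketch needs. */
struct demo_intsrc {
	unsigned int irq;
	int masked;
};

static void
demo_mask_evtchn(unsigned int port)
{
	assert(port < 32);	/* the model has a single 32-bit mask word */
	demo_evtchn_mask |= (UINT32_C(1) << port);
}

/*
 * Mirrors the order of operations in the patch: look up the event
 * channel for the irq, mask it, then record that the source is masked.
 */
static void
demo_disable_intr(struct demo_intsrc *xp)
{
	assert(xp->irq < DEMO_NR_IRQS);
	demo_mask_evtchn(demo_irq_to_evtchn[xp->irq]);
	xp->masked = 1;
}
```

For example, with irq 5 mapped to port 3, calling demo_disable_intr on that source sets bit 3 of demo_evtchn_mask and sets masked to 1.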
 
State-Changed-From-To: open->patched 
State-Changed-By: cperciva 
State-Changed-When: Thu Dec 30 01:33:29 UTC 2010 
State-Changed-Why:  
Fixed in HEAD, will MFC soon. 

State-Changed-From-To: patched->closed 
State-Changed-By: cperciva 
State-Changed-When: Thu Jan 6 23:03:21 UTC 2011 
State-Changed-Why:  
Fixed in HEAD, 8-STABLE, and 8.2-RC2. 

>Unformatted:
