From adsharma@sharma-home.net  Sun Feb  9 14:06:41 2003
Return-Path: <adsharma@sharma-home.net>
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id 89C4F37B401; Sun,  9 Feb 2003 14:06:41 -0800 (PST)
Received: from eagle.sharma-home.net (cpe-66-1-147-119.ca.sprintbbd.net [66.1.147.119])
	by mx1.FreeBSD.org (Postfix) with ESMTP
	id 721E543F85; Sun,  9 Feb 2003 14:06:40 -0800 (PST)
	(envelope-from adsharma@sharma-home.net)
Received: from astra.mirabella.net (astra.mirabella.net [192.168.1.3])
	by eagle.sharma-home.net (Postfix) with ESMTP
	id B632980D8; Sun,  9 Feb 2003 14:11:42 -0800 (PST)
Received: by astra.mirabella.net (Postfix, from userid 1001)
	id A55C02E; Sun,  9 Feb 2003 14:06:39 -0800 (PST)
Message-Id: <20030209220639.A55C02E@astra.mirabella.net>
Date: Sun,  9 Feb 2003 14:06:39 -0800 (PST)
From: Arun Sharma <adsharma@sharma-home.net>
Reply-To: Arun Sharma <adsharma@sharma-home.net>
To: FreeBSD-gnats-submit@freebsd.org
Cc: smp@freebsd.org
Subject: SMP machine hang during boot related to idle proc and sched_lock
X-Send-Pr-Version: 3.113
X-GNATS-Notify:

>Number:         48117
>Category:       kern
>Synopsis:       SMP machine hang during boot related to idle proc and sched_lock
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sun Feb 09 14:10:04 PST 2003
>Closed-Date:    Sun Mar 11 20:58:51 GMT 2007
>Last-Modified:  Sun Mar 11 20:58:51 GMT 2007
>Originator:     Arun Sharma
>Release:        FreeBSD 5.0-CURRENT i386
>Organization:
>Environment:
System: FreeBSD astra.mirabella.net 5.0-CURRENT FreeBSD 5.0-CURRENT #16: Sat Feb 8 09:08:58 PST 2003 root@astra.mirabella.net:/usr/src/sys/i386/compile/astra i386


>Description:

The machine hangs intermittently during boot on a 2-way SMP box. In some of these hangs it drops into ddb, where I collected the following:

db> show pcpu
cpuid        = 0
curthread    = 0xc0d19380: pid 46 "sh"
curpcb       = 0xcad54da0
fpcurthread  = none
idlethread   = 0xc0d18b60: pid 12 "idle: cpu0"
currentldt   = 0x28
db> tr
Debugger(c0364696,0,c036423d,cad54a64,1) at Debugger+0x55
panic(c036423d,c036426b,c0d18a80,0,cad54af8) at panic+0x11f
_mtx_lock_spin(c038b6c0,2,0,0,c1fc4dc8) at _mtx_lock_spin+0x93
hardclock_process(cad54af8,0,c02f682b,20,0) at hardclock_process+0x76
hardclock(cad54af8,c0cf239c,c0334d57,c0829000,c1fc8b28) at hardclock+0x18
clkintr(0) at clkintr+0xec
Xfastintr0() at Xfastintr0+0xba
--- interrupt, eip = 0xc01cc580, esp = 0xcad54b3c, ebp = 0xcad54b58 ---
_mtx_lock_spin(c038b6c0,0,0,0,0) at _mtx_lock_spin+0x50
vm_fault(c0d1f114,80f8000,2,8,c0d19380) at vm_fault+0x1379
trap_pfault(cad54d48,1,80f8a78,202,80f8a78) at trap_pfault+0x125
trap(2f,2f,2f,2f,80fc000) at trap+0x2a3
calltrap() at calltrap+0x5
--- trap 0xc, eip = 0x8052653, esp = 0xbfbff304, ebp = 0xbfbff308 ---
db> show pcpu 1
cpuid        = 1
curthread    = 0xc0d18a80: pid 11 "idle: cpu1"
curpcb       = 0xcad36da0
fpcurthread  = none
idlethread   = 0xc0d18a80: pid 11 "idle: cpu1"
currentldt   = 0x28
db> show msgbuf
[...]
panic: spin lock sched lock held by 0xc0d18a80 for > 5 seconds
cpuid = 0; lapic.id = 00000000

The only piece not captured above is the stack of the idle
process on cpu1 (pid 11), which was in mi_switch().

The kernel was built without INVARIANTS or WITNESS.
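
For illustration only, the panic in the msgbuf output above can be modeled in user space. The sketch below is not the kernel's _mtx_lock_spin(); it is a simplified stand-in that shows the general technique of a spin lock that records its owner and gives up after a bounded wait, so the waiter can report who holds the lock. All names (spin_mtx, spin_mtx_lock_timed, spin_mtx_unlock, max_spins) are hypothetical, and it assumes a spin count as the budget, whereas the kernel message refers to a wall-clock limit of 5 seconds.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define SPIN_UNOWNED ((uintptr_t)0)

/* Hypothetical user-space stand-in for a kernel spin mutex. */
struct spin_mtx {
	const char *name;		/* e.g. "sched lock" */
	_Atomic uintptr_t owner;	/* SPIN_UNOWNED or an opaque owner id */
};

/*
 * Try to take the lock on behalf of `self`, retrying at most `max_spins`
 * times. Returns 0 on success. On timeout returns -1 and stores the owner
 * that was last observed in *holder, which is the value a diagnostic like
 * "spin lock %s held by %p" would print.
 */
static int
spin_mtx_lock_timed(struct spin_mtx *m, uintptr_t self,
    unsigned long max_spins, uintptr_t *holder)
{
	unsigned long spins = 0;

	for (;;) {
		uintptr_t expected = SPIN_UNOWNED;

		if (atomic_compare_exchange_strong(&m->owner, &expected, self))
			return (0);
		/* On failure, `expected` holds the current owner. */
		if (++spins >= max_spins) {
			*holder = expected;
			return (-1);
		}
	}
}

/* Release the lock; only the current owner may do so. */
static void
spin_mtx_unlock(struct spin_mtx *m, uintptr_t self)
{
	assert(atomic_load(&m->owner) == self);
	atomic_store(&m->owner, SPIN_UNOWNED);
}
```

In this model, if owner 1 takes the lock and never releases it, a second caller exhausts its budget, gets -1, and sees holder == 1. That corresponds to cpu0 above reporting thread 0xc0d18a80 (the cpu1 idle thread) as the holder of sched lock.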

>How-To-Repeat:

	Boot the SMP kernel repeatedly.

>Fix:

	Unclear. We need to figure out why the idle proc (cpu1) sat in
	mi_switch(), holding sched_lock, for more than 5 seconds.

>Release-Note:
>Audit-Trail:

From: John Baldwin <jhb@FreeBSD.org>
To: freebsd-gnats-submit@FreeBSD.org, adsharma@sharma-home.net
Cc:  
Subject: Re: kern/48117: SMP machine hang during boot related to idle proc and sched_lock
Date: Wed, 23 Mar 2005 16:07:48 -0500

 Does this problem still persist in 5.3 or later?
 
 -- 
 John Baldwin <jhb@FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
 "Power Users Use the Power to Serve"  =  http://www.FreeBSD.org
State-Changed-From-To: open->closed 
State-Changed-By: remko 
State-Changed-When: Sun Mar 11 20:58:47 UTC 2007 
State-Changed-Why:  
feedback timeout 

http://www.freebsd.org/cgi/query-pr.cgi?pr=48117 
>Unformatted:
