From nobody@FreeBSD.org  Sun Mar  1 15:58:15 2009
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 5F5521065673
	for <freebsd-gnats-submit@FreeBSD.org>; Sun,  1 Mar 2009 15:58:15 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (www.freebsd.org [IPv6:2001:4f8:fff6::21])
	by mx1.freebsd.org (Postfix) with ESMTP id 4DCC08FC19
	for <freebsd-gnats-submit@FreeBSD.org>; Sun,  1 Mar 2009 15:58:15 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.14.3/8.14.3) with ESMTP id n21FwFrD031564
	for <freebsd-gnats-submit@FreeBSD.org>; Sun, 1 Mar 2009 15:58:15 GMT
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.14.3/8.14.3/Submit) id n21FwFmm031563;
	Sun, 1 Mar 2009 15:58:15 GMT
	(envelope-from nobody)
Message-Id: <200903011558.n21FwFmm031563@www.freebsd.org>
Date: Sun, 1 Mar 2009 15:58:15 GMT
From: Ethan <hsiao.ethan@gmail.com>
To: freebsd-gnats-submit@FreeBSD.org
Subject: The latest kernel causes my machine panic!
X-Send-Pr-Version: www-3.1
X-GNATS-Notify:

>Number:         132222
>Category:       kern
>Synopsis:       [panic] The latest kernel causes my machine panic!
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    rwatson
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sun Mar 01 16:00:08 UTC 2009
>Closed-Date:    Mon Mar 09 09:37:01 UTC 2009
>Last-Modified:  Mon Mar 09 09:37:01 UTC 2009
>Originator:     Ethan
>Release:        7.1-STABLE
>Organization:
>Environment:
FreeBSD dns.ethan-hsiao.idv.tw 7.1-STABLE FreeBSD 7.1-STABLE #0: Sun Mar  1 19:45:23 CST 2009     root@dns.ethan-hsiao.idv.tw:/usr/src/sys/amd64/compile/dns  amd64
>Description:
I updated my system to the latest stable source via CVS, but it *ALWAYS* crashes after I reboot it.

The kernel messages are listed below:
Sleeping thread (tid 100075, pid 486) owns a non-sleepable lock
panic: sleeping thread
cpuid = 0
Uptime: 1m1s
Physical memory: 4085 MB
Dumping 254 MB: balabalabala......

The core dumps are also provided for your reference:
http://www.ethan-hsiao.idv.tw/~hsiao/info.3
http://www.ethan-hsiao.idv.tw/~hsiao/vmcore.3
http://www.ethan-hsiao.idv.tw/~hsiao/info.4
http://www.ethan-hsiao.idv.tw/~hsiao/vmcore.4
http://www.ethan-hsiao.idv.tw/~hsiao/info.5
http://www.ethan-hsiao.idv.tw/~hsiao/vmcore.5
http://www.ethan-hsiao.idv.tw/~hsiao/info.6
http://www.ethan-hsiao.idv.tw/~hsiao/vmcore.6

I've tried stepping the source code back through CTM deltas src-7.0540 to src-7.0568.
The issue appears once the src-7.0568 delta is applied.

Please help me to resolve this issue.
Thanks a lot!
>How-To-Repeat:
Update to CTM src-7.0568
>Fix:


>Release-Note:
>Audit-Trail:
State-Changed-From-To: open->feedback 
State-Changed-By: gavin 
State-Changed-When: Sun Mar 1 18:36:56 UTC 2009 
State-Changed-Why:  
To submitter:  Sending the core file (or even providing it for download) is 
not hugely useful, as getting useful information out of it is easiest done 
on the system with the issues.  Can you please follow the instructions at 
http://www.freebsd.org/doc/en/books/developers-handbook/kerneldebug-gdb.html 
(specifically, obtaining the backtrace) and provide the result?  Please do 
all of this with the source code in /usr/src that the problem kernel was 
compiled from. 

Also, I'm not familiar with CTM or how it works.  Are you able to tell me 
what dates correspond to the source for the working and non-working kernels? 


Responsible-Changed-From-To: freebsd-bugs->gavin 
Responsible-Changed-By: gavin 
Responsible-Changed-When: Sun Mar 1 18:36:56 UTC 2009 
Responsible-Changed-Why:  
Track 

http://www.freebsd.org/cgi/query-pr.cgi?pr=132222 

From: Robert Watson <rwatson@FreeBSD.org>
To: Ethan Hsiao <hsiao.ethan@gmail.com>
Cc: Gavin Atkinson <gavin@freebsd.org>, freebsd-bugs@freebsd.org, 
    bug-followup@FreeBSD.org
Subject: Re: kern/132222: The latest kernel causes my machine panic!
Date: Sun, 8 Mar 2009 10:20:50 +0000 (GMT)

   This message is in MIME format.  The first part should be readable text,
   while the remaining parts are likely unreadable without MIME-aware tools.
 
 --621616949-1796833543-1236507651=:1340
 Content-Type: TEXT/PLAIN; charset=ISO-8859-1; format=flowed
 Content-Transfer-Encoding: 8BIT
 
 
 On Sun, 8 Mar 2009, Ethan Hsiao wrote:
 
 > I'm using PPP. This system always crashes after PPP dial-up.
 
 Hi Ethan--
 
 What's going on here is that the current thread is trying to acquire a lock 
 held by a second thread (100075) that is improperly sleeping while holding the 
 lock.  We would like to know two things: (a) which lock it is, and (b) what 
 the other thread was doing that caused it to sleep -- probably it should be 
 dropping the lock before sleeping.  To extract this information, use kgdb on a 
 vmcore matching the panic message you're looking at (it doesn't matter which, 
 as long as you use the thread id (tid) from that same panic).
 
 (a) Do a backtrace of the default thread in the vmcore using the "backtrace"
      command.  It will be operating on a lock, which we'd like to print -- most
      likely it's a mutex, in which case you can just print *mutexpointer from
      the arguments in the trace.  If you don't know how to extract this
      information, just send me the backtrace output and I can tell you how to
      do that.
 
 (b) Type in  "tid 100075" (or whatever the tid from the specific panic you're
      using is, if not the one listed in the original message), and then
      "backtrace".
 
 Follow up to this e-mail with that information, and my hope is we can fix this 
 in pretty short order.
 
 Thanks,
 
 Robert N M Watson
 Computer Laboratory
 University of Cambridge
 
 
 >
 > Thanks!
 >
 > Regards,
 > Ethan Hsiao
 >
 > 2009/3/6 Gavin Atkinson <gavin@freebsd.org>:
 >> On Tue, 2009-03-03 at 10:11 +0800, Ethan Hsiao wrote:
 >>> Hi,
 >>>
 >>> kern/132222 should be the same as kern/132215 (threads/132215).
 >>> It started happening after CTM src-7.0568 was applied.
 >>
 >> Hi,
 >>
 >> To be honest, I can't actually see how you have come to the conclusion
 >> that this PR (132222) is related to 132215.
 >>
 >> Are you using IPv6 and/or PPP? Can you please give some more details
 >> about what this system is used for?
 >>
 >> Thanks,
 >>
 >> Gavin
 >>
 >
 --621616949-1796833543-1236507651=:1340--
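
The panic condition described in the message above can be modeled in a few lines of userland C. This is only a toy sketch, not kernel code: toy_thread, toy_mutex and toy_sleeping_owner are hypothetical names, and the real kernel check involves far more state. It shows the situation being diagnosed: a thread about to block on a non-sleepable lock finds that the lock's owner is asleep.

```c
#include <assert.h>
#include <stddef.h>

/* Toy thread states; the real kernel tracks far more than this. */
enum toy_state { TOY_RUNNING, TOY_SLEEPING };

struct toy_thread {
    int tid;
    enum toy_state state;
};

/* Toy non-sleepable lock that remembers its owner (NULL when free). */
struct toy_mutex {
    struct toy_thread *owner;
};

/*
 * Called when a thread is about to block on m.  Returns the tid of an
 * owner that is asleep while still holding m (the condition reported
 * as "sleeping thread" in the panic), or -1 if blocking is safe.
 */
static int toy_sleeping_owner(const struct toy_mutex *m)
{
    assert(m != NULL);
    if (m->owner != NULL && m->owner->state == TOY_SLEEPING)
        return m->owner->tid;
    return -1;
}
```

With an owner whose tid is 100075 and whose state is TOY_SLEEPING, toy_sleeping_owner() returns 100075, which corresponds to the tid printed in the panic message. Steps (a) and (b) above identify the same two things in the real vmcore: the lock, and what its owner was doing.
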
Responsible-Changed-From-To: gavin->rwatson 
Responsible-Changed-By: rwatson 
Responsible-Changed-When: Sun Mar 8 11:25:28 UTC 2009 
Responsible-Changed-Why:  
Take ownership since I'm working on this bug (hope that's OK, Gavin). 

http://www.freebsd.org/cgi/query-pr.cgi?pr=132222 
State-Changed-From-To: feedback->analyzed 
State-Changed-By: rwatson 
State-Changed-When: Sun Mar 8 11:27:42 UTC 2009 
State-Changed-Why:  
Transition to analyzed: this panic may be the result of a missed MFC of 
r186061, which corrected a leaked lock in routing socket error handling. 
I've merged the commit to stable/7; hopefully it will appear in this PR 
shortly.  Please let me know if it resolves the problem and follow up on 
this PR with that information.  Thanks! 


http://www.freebsd.org/cgi/query-pr.cgi?pr=132222 

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/132222: commit references a PR
Date: Sun,  8 Mar 2009 11:21:09 +0000 (UTC)

 Author: rwatson
 Date: Sun Mar  8 11:20:54 2009
 New Revision: 189531
 URL: http://svn.freebsd.org/changeset/base/189531
 
 Log:
   Merge missed routing lock fix r186061 from head to stable/7:
   
     Don't leak the rnh lock on error.
   
   Original change was from thompsa.  This may correct routing-related panics
   seen by users of ppp, including the following PRs:
   
   PR:		132215, 132222, 132404
   Reported by:	Ethan <hsiao.ethan at gmail.com>
 
 Modified:
   stable/7/sys/   (props changed)
   stable/7/sys/contrib/pf/   (props changed)
   stable/7/sys/dev/ath/ath_hal/   (props changed)
   stable/7/sys/dev/cxgb/   (props changed)
   stable/7/sys/net/rtsock.c
 
 Modified: stable/7/sys/net/rtsock.c
 ==============================================================================
 --- stable/7/sys/net/rtsock.c	Sun Mar  8 11:12:23 2009	(r189530)
 +++ stable/7/sys/net/rtsock.c	Sun Mar  8 11:20:54 2009	(r189531)
 @@ -629,10 +629,10 @@ route_output(struct mbuf *m, struct sock
  				       rt->rt_ifa->ifa_addr))) {
  				RT_UNLOCK(rt);
  				RADIX_NODE_HEAD_LOCK(rnh);
 -				if ((error = rt_getifa_fib(&info,
 -				    rt->rt_fibnum)) != 0)
 -					senderr(error);
 +				error = rt_getifa_fib(&info, rt->rt_fibnum);
  				RADIX_NODE_HEAD_UNLOCK(rnh);
 +				if (error != 0)
 +					senderr(error);
  				RT_LOCK(rt);
  			}
  			if (info.rti_ifa != NULL &&
 _______________________________________________
 svn-src-all@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/svn-src-all
 To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
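
The locking change in the diff above can be illustrated outside the kernel. The following is a minimal sketch assuming a toy single-threaded lock; toy_lock, toy_lookup, update_leaky and update_fixed are hypothetical names, with toy_lookup standing in for rt_getifa_fib(). It is not the rtsock.c code, only the same control-flow shape before and after the change.

```c
#include <assert.h>
#include <errno.h>

/* Toy non-recursive lock; a hypothetical stand-in for the rnh lock. */
struct toy_lock {
    int held;
};

static void toy_lock_acquire(struct toy_lock *l)
{
    assert(!l->held);   /* acquiring a held lock would deadlock */
    l->held = 1;
}

static void toy_lock_release(struct toy_lock *l)
{
    assert(l->held);
    l->held = 0;
}

/* Hypothetical stand-in for rt_getifa_fib(): fails on request. */
static int toy_lookup(int should_fail)
{
    return should_fail ? EINVAL : 0;
}

/* Pre-fix shape: the early return on error skips the release. */
static int update_leaky(struct toy_lock *l, int should_fail)
{
    int error;

    toy_lock_acquire(l);
    if ((error = toy_lookup(should_fail)) != 0)
        return error;   /* lock is still held here */
    toy_lock_release(l);
    return 0;
}

/* Post-fix shape: release first, then act on the saved error. */
static int update_fixed(struct toy_lock *l, int should_fail)
{
    int error;

    toy_lock_acquire(l);
    error = toy_lookup(should_fail);
    toy_lock_release(l);
    return error;
}
```

Calling update_leaky() with a failing lookup returns the error but leaves held set to 1, while update_fixed() returns the same error with the lock released. In the kernel, the leaked lock is what the next thread trips over once the leaking thread goes to sleep.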
 

From: Robert Watson <rwatson@FreeBSD.org>
To: Ethan Hsiao <hsiao.ethan@gmail.com>
Cc: Gavin Atkinson <gavin@freebsd.org>, freebsd-bugs@freebsd.org, 
    bug-followup@FreeBSD.org
Subject: Re: kern/132222: The latest kernel causes my machine panic!
Date: Sun, 8 Mar 2009 11:21:52 +0000 (GMT)

 --621616949-1766678010-1236511312=:1340
 Content-Type: TEXT/PLAIN; charset=ISO-8859-1; format=flowed
 Content-Transfer-Encoding: 8BIT
 
 On Sun, 8 Mar 2009, Ethan Hsiao wrote:
 
 > I'm using PPP. This system always crashes after PPP dial-up.
 
 Hi Ethan:
 
 I've just committed r189531 to stable/7 (may take a few minutes to make it out 
 to cvsup/ctm) which merges a bug fix already present in head but not in 
 stable/7 that may (should) resolve this problem.  Could I ask you to confirm 
 that it does?
 
 Robert N M Watson
 Computer Laboratory
 University of Cambridge
 
 --621616949-1766678010-1236511312=:1340--
State-Changed-From-To: analyzed->patched 
State-Changed-By: rwatson 
State-Changed-When: Sun Mar 8 12:10:39 UTC 2009 
State-Changed-Why:  
Change r189531 is believed to correct this problem; could you confirm? 


http://www.freebsd.org/cgi/query-pr.cgi?pr=132222 

From: Ethan Hsiao <hsiao.ethan@gmail.com>
To: rwatson@freebsd.org
Cc:  
Subject: Re: kern/132222: [panic] The latest kernel causes my machine panic!
Date: Mon, 9 Mar 2009 12:01:03 +0800

 Hi,
 
 I've just patched rtsock.c.
 My test platform is working fine now.
 
 Thanks a lot for your help!
 
 Regards,
 Ethan Hsiao
 
 2009/3/8  <rwatson@freebsd.org>:
 > Synopsis: [panic] The latest kernel causes my machine panic!
 >
 > State-Changed-From-To: analyzed->patched
 > State-Changed-By: rwatson
 > State-Changed-When: Sun Mar 8 12:10:39 UTC 2009
 > State-Changed-Why:
 > Change r189531 is believed to correct this problem; could you confirm?
 >
 >
 > http://www.freebsd.org/cgi/query-pr.cgi?pr=132222
 >
 
State-Changed-From-To: patched->closed 
State-Changed-By: rwatson 
State-Changed-When: Mon Mar 9 09:36:43 UTC 2009 
State-Changed-Why:  
Close, as the problem appears resolved -- thanks for the report, and please 
follow up to the PR if it appears to recur. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=132222 
>Unformatted:
