From nobody@FreeBSD.org  Wed Apr 20 18:58:38 2005
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 781ED16A4CE
	for <freebsd-gnats-submit@FreeBSD.org>; Wed, 20 Apr 2005 18:58:38 +0000 (GMT)
Received: from www.freebsd.org (www.freebsd.org [216.136.204.117])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 59E7943D3F
	for <freebsd-gnats-submit@FreeBSD.org>; Wed, 20 Apr 2005 18:58:38 +0000 (GMT)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.13.1/8.13.1) with ESMTP id j3KIwcou023410
	for <freebsd-gnats-submit@FreeBSD.org>; Wed, 20 Apr 2005 18:58:38 GMT
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.13.1/8.13.1/Submit) id j3KIwciS023409;
	Wed, 20 Apr 2005 18:58:38 GMT
	(envelope-from nobody)
Message-Id: <200504201858.j3KIwciS023409@www.freebsd.org>
Date: Wed, 20 Apr 2005 18:58:38 GMT
From: Steve Sears <stevenjsears@yahoo.com>
To: freebsd-gnats-submit@FreeBSD.org
Subject: Debugger SMP race panic
X-Send-Pr-Version: www-2.3

>Number:         80166
>Category:       kern
>Synopsis:       [panic] Debugger SMP race panic
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    ups
>State:          suspended
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Wed Apr 20 19:00:39 GMT 2005
>Closed-Date:    
>Last-Modified:  Sun Mar 02 02:27:40 UTC 2008
>Originator:     Steve Sears
>Release:        5.3-RELEASE
>Organization:
>Environment:
FreeBSD sjs-linux 5.3-RELEASE FreeBSD 5.3-RELEASE #19: Mon Apr 18 11:28:21 EDT 2005     root@sjs-linux:/usr/src/sys-sjs/i386/compile/SJSKERN  i386

>Description:
      panic: mi_switch: did not reenter debugger

This panic occurs frequently on an SMP system after setting a
breakpoint and hitting it a few times.  I believe the cause is a
race on the way out of the debugger.

The code in kern/subr_kdb.c does this:


#ifdef SMP
        if (did_stop_cpus)
                restart_cpus(stopped_cpus);
#endif

        kdb_active--;


The panic comes from this check in mi_switch():

        /*
         * Don't perform context switches from the debugger.
         */
        if (kdb_active) {
                mtx_unlock_spin(&sched_lock);
                kdb_backtrace();
                kdb_reenter();
                panic("%s: did not reenter debugger", __func__);
        }


This is an SMP race.  The debugger exit code restarts the other CPUs
before decrementing kdb_active.  If a restarted CPU reaches the check
in mi_switch() before the decrement happens, it sees kdb_active still
nonzero and panics.
>How-To-Repeat:
      Hit the same breakpoint several times on an SMP box.
>Fix:
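      Decrement kdb_active before restarting the stopped CPUs, so
      that no restarted CPU can reach the check in mi_switch() while
      kdb_active is still nonzero.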
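
The ordering problem can be illustrated with a small user-space model.
This is only a sketch, not the kernel code: model_mi_switch(),
model_restart_cpus(), exit_restart_first(), exit_decrement_first() and
the panicked flag are hypothetical stand-ins invented for illustration.
The model assumes the worst-case schedule, in which a restarted CPU
calls mi_switch() before the debugger CPU executes its next statement,
and it assumes the decrement is visible to the other CPU by the time
that CPU resumes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-ins for kernel state; not the real kdb code. */
static int  kdb_active;   /* nonzero while a CPU is inside the debugger */
static bool panicked;     /* set when the modeled mi_switch() check fires */

/*
 * Models the check quoted from mi_switch(): a context switch attempted
 * while kdb_active is nonzero ends in panic().
 */
static void
model_mi_switch(void)
{
        if (kdb_active)
                panicked = true;
}

/*
 * Models restart_cpus() under the worst-case schedule: the released CPU
 * runs mi_switch() immediately, before the caller continues.
 */
static void
model_restart_cpus(void)
{
        model_mi_switch();
}

/* Debugger exit in the order reported in this PR.  Returns true on panic. */
bool
exit_restart_first(void)
{
        kdb_active = 1;         /* debugger has been entered */
        panicked = false;
        model_restart_cpus();   /* other CPU runs while kdb_active == 1 */
        kdb_active--;
        assert(kdb_active == 0);        /* sanity check of the model only */
        return (panicked);
}

/* Debugger exit in the proposed order.  Returns true on panic. */
bool
exit_decrement_first(void)
{
        kdb_active = 1;         /* debugger has been entered */
        panicked = false;
        kdb_active--;           /* leave the debugger state first */
        assert(kdb_active == 0);        /* sanity check of the model only */
        model_restart_cpus();   /* other CPU now sees kdb_active == 0 */
        return (panicked);
}
```

In this model exit_restart_first() returns true (the check fires) and
exit_decrement_first() returns false.  On real hardware the window is
narrow, which would be consistent with the panic appearing only after
the breakpoint has been hit several times.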
>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-bugs->ups  
Responsible-Changed-By: ups 
Responsible-Changed-When: Fri Apr 22 21:12:42 GMT 2005 
Responsible-Changed-Why:  
I am currently working on some kdb changes to make  
SMP debugging easier that will include a fix for the  
problem. 
This does NOT imply that I adopted KDB ;-) 

http://www.freebsd.org/cgi/query-pr.cgi?pr=80166 
State-Changed-From-To: open->feedback 
State-Changed-By: kmacy 
State-Changed-When: Fri Nov 16 09:06:36 UTC 2007 
State-Changed-Why:  

Your last comment implied that you were going to fix this.  Did that happen?
Thanks. 

State-Changed-From-To: feedback->suspended 
State-Changed-By: linimon 
State-Changed-When: Sun Mar 2 02:27:07 UTC 2008 
State-Changed-Why:  
Feedback wasn't received, but it sounds like the problem might still 
exist. 

>Unformatted:
