From ken@poseidon.alicorntech.com  Sat Aug 27 11:59:56 2005
Return-Path: <ken@poseidon.alicorntech.com>
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 900DD16A41F
	for <FreeBSD-gnats-submit@freebsd.org>; Sat, 27 Aug 2005 11:59:56 +0000 (GMT)
	(envelope-from ken@poseidon.alicorntech.com)
Received: from poseidon.alicorntech.com (host86-132-224-167.range86-132.btcentralplus.com [86.132.224.167])
	by mx1.FreeBSD.org (Postfix) with ESMTP id E5EAC43D45
	for <FreeBSD-gnats-submit@freebsd.org>; Sat, 27 Aug 2005 11:59:55 +0000 (GMT)
	(envelope-from ken@poseidon.alicorntech.com)
Received: from poseidon.alicorntech.com (localhost.alicorntech.com [127.0.0.1])
	by poseidon.alicorntech.com (8.13.4/8.13.4) with ESMTP id j7RBxr0u003702
	for <FreeBSD-gnats-submit@freebsd.org>; Sat, 27 Aug 2005 12:59:53 +0100 (BST)
	(envelope-from ken@poseidon.alicorntech.com)
Received: (from ken@localhost)
	by poseidon.alicorntech.com (8.13.4/8.13.1/Submit) id j7RBxqKw003701;
	Sat, 27 Aug 2005 12:59:53 +0100 (BST)
	(envelope-from ken)
Message-Id: <200508271159.j7RBxqKw003701@poseidon.alicorntech.com>
Date: Sat, 27 Aug 2005 12:59:53 +0100 (BST)
From: Pegasus McCleaft <ken@alicorntech.com>
Reply-To: Pegasus McCleaft <ken@alicorntech.com>
To: FreeBSD-gnats-submit@freebsd.org
Cc:
Subject: PREEMPTION causes instability in Alpha4000 SMP kernel
X-Send-Pr-Version: 3.113
X-GNATS-Notify:

>Number:         85346
>Category:       alpha
>Synopsis:       PREEMPTION causes instability in Alpha4000 SMP kernel
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-alpha
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sat Aug 27 12:00:43 GMT 2005
>Closed-Date:    Wed Nov 03 13:28:43 UTC 2010
>Last-Modified:  Wed Nov 03 13:28:43 UTC 2010
>Originator:     Pegasus McCleaft
>Release:        FreeBSD 6.0-BETA3 alpha
>Organization:
>Environment:
System: FreeBSD poseidon.alicorntech.com 6.0-BETA3 FreeBSD 6.0-BETA3 #4: Sat Aug 27 01:51:20 BST 2005 ken@poseidon.alicorntech.com:/usr/obj/usr/src/sys/Poseidon alpha

>Description:
	The default kernel configuration for 6.0-BETA3 has preemption
enabled. This option still seems to cause problems on my machine after a few
hours of running. The problems range from kernel traps to hard locks. When a
kernel trap is reported, the usual message is an illegal address load within
the syncer (normally 0xffff....ff).

>How-To-Repeat:
	Build and boot any kernel with PREEMPTION enabled. Generate heavy
disk I/O and Ethernet traffic, then wait for the trap message (about 1 hour).
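
One possible way to generate that load is sketched below. This is only an
illustration, not the exact workload that triggered the trap: the function
names, file path, sizes and the use of ssh (with key-based login to a second
host) are all assumptions.

```sh
#!/bin/sh
# Hypothetical load generator; names and sizes are illustrative only.

# disk_load FILE MEGABYTES PASSES
# Write FILE with dd, flush it to disk, read it back, and repeat.
disk_load() {
	file=$1 mb=$2 passes=$3
	i=0
	while [ "$i" -lt "$passes" ]
	do
		dd if=/dev/zero of="$file" bs=1048576 count="$mb" 2>/dev/null || return 1
		sync
		dd if="$file" of=/dev/null bs=1048576 2>/dev/null || return 1
		i=$((i + 1))
	done
	rm -f "$file"
}

# net_load HOST MEGABYTES
# Push zeros over the network; assumes ssh access to HOST (an assumption).
net_load() {
	dd if=/dev/zero bs=1048576 count="$2" 2>/dev/null | ssh "$1" 'cat > /dev/null'
}
```

For example, "disk_load /var/tmp/load.dat 512 1000 &" followed by
"net_load otherhost 100000" would run both loads together; the path and host
name here are placeholders.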

>Fix:

	Remove the "options PREEMPTION" line from the kernel config file and
rebuild the kernel. I found that running the script below in another xterm
keeps the machine running long enough to build the new kernel:

	# Flush dirty buffers to disk every 5 seconds.
	while true
	do
		sync
		echo -n "Sleeping.."
		sleep 5
	done
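
The config change itself can be sketched as follows. This is a minimal
sketch, not a tested procedure for this machine: the function name is
made up, the config file location is assumed to be the usual
/usr/src/sys/alpha/conf directory, and the config name "Poseidon" is taken
from the Environment line above.

```sh
#!/bin/sh
# comment_out_preemption CONFIG
# Comment out the "options PREEMPTION" line in a kernel config file.
# (Hypothetical helper; it rewrites CONFIG via a temporary ".new" file.)
comment_out_preemption() {
	conf=$1
	sed 's/^\([[:space:]]*options[[:space:]][[:space:]]*PREEMPTION\)/#\1/' \
	    "$conf" > "$conf.new" && mv "$conf.new" "$conf"
}

# Assumed usage, followed by the standard kernel rebuild targets:
#   comment_out_preemption /usr/src/sys/alpha/conf/Poseidon
#   cd /usr/src
#   make buildkernel KERNCONF=Poseidon
#   make installkernel KERNCONF=Poseidon
```

Editing the file by hand works just as well; the helper only makes the
change repeatable.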

	
>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-bugs->freebsd-alpha 
Responsible-Changed-By: kris 
Responsible-Changed-When: Wed Sep 21 06:46:32 GMT 2005 
Responsible-Changed-Why:  
Sounds alpha-specific 

http://www.freebsd.org/cgi/query-pr.cgi?pr=85346 

From: John Baldwin <jhb@FreeBSD.org>
To: bug-followup@FreeBSD.org,
 ken@alicorntech.com
Cc:  
Subject: Re: alpha/85346: PREEMPTION causes instability in Alpha4000 SMP kernel
Date: Wed, 21 Sep 2005 15:17:01 -0400

 Hmm, I still don't run with PREEMPTION enabled on my DS20 on HEAD.  In my 
 experience I haven't gotten any panics, just hard locks when I have enabled 
 PREEMPTION and SMP on Alpha.  You can try testing this patch to see if it 
 helps things at all though:
 
 --- //depot/projects/smpng/sys/alpha/alpha/interrupt.c	2005/04/14 18:55:16
 +++ //depot/user/jhb/preemption/alpha/alpha/interrupt.c	2005/04/14 19:32:16
 @@ -427,6 +427,13 @@
  	atomic_add_long(i->cntp, 1);
  
  	/*
 +	 * It seems that we need to return from an interrupt back to PAL
 +	 * on the same CPU that received the interrupt, so pin the interrupted
 +	 * thread to the current CPU until we return from the interrupt.
 +	 */
 +	sched_pin();
 +
 +	/*
  	 * Handle a fast interrupt if there is no actual thread for this
  	 * interrupt by calling the handler directly without Giant.  Note
  	 * that this means that any fast interrupt handler must be MP safe.
 @@ -435,26 +442,18 @@
  	if ((ih->ih_flags & IH_FAST) != 0) {
  		critical_enter();
  		ih->ih_handler(ih->ih_argument);
 -		/* XXX */
 -		curthread->td_owepreempt = 0;
  		critical_exit();
 -		return;
 -	}
 +	} else {
 +		if (ithd->it_disable) {
 +			CTR1(KTR_INTR,
 +			    "alpha_dispatch_intr: disabling vector 0x%x",
 +			    i->vector);
 +			ithd->it_disable(ithd->it_vector);
 +		}
  
 -	if (ithd->it_disable) {
 -		CTR1(KTR_INTR,
 -		    "alpha_dispatch_intr: disabling vector 0x%x", i->vector);
 -		ithd->it_disable(ithd->it_vector);
 +		error = ithread_schedule(ithd);
 +		KASSERT(error == 0, ("got an impossible stray interrupt"));
  	}
 -
 -	/*
 -	 * It seems that we need to return from an interrupt back to PAL
 -	 * on the same CPU that received the interrupt, so pin the interrupted
 -	 * thread to the current CPU until we return from the interrupt.
 -	 */
 -	sched_pin();
 -	error = ithread_schedule(ithd);
 -	KASSERT(error == 0, ("got an impossible stray interrupt"));
  	sched_unpin();
  }
  
 --- //depot/projects/smpng/sys/kern/kern_thread.c	2005/05/27 14:58:46
 +++ //depot/user/jhb/preemption/kern/kern_thread.c	2005/05/27 19:03:12
 @@ -955,9 +957,11 @@
  	mtx_assert(&sched_lock, MA_OWNED);
  	PROC_LOCK_ASSERT(p, MA_OWNED);
  	if (!P_SHOULDSTOP(p)) {
 +		critical_enter();
  		while ((td = TAILQ_FIRST(&p->p_suspended))) {
  			thread_unsuspend_one(td);
  		}
 +		critical_exit();
  	} else if ((P_SHOULDSTOP(p) == P_STOPPED_SINGLE) &&
  	    (p->p_numthreads == p->p_suspcount)) {
  		/*
 @@ -992,9 +996,11 @@
  	 * to continue however as this is a bad place to stop.
  	 */
  	if ((p->p_numthreads != 1) && (!P_SHOULDSTOP(p))) {
 +		critical_enter();
  		while ((td = TAILQ_FIRST(&p->p_suspended))) {
  			thread_unsuspend_one(td);
  		}
 +		critical_exit();
  	}
  	mtx_unlock_spin(&sched_lock);
  }
 --- //depot/projects/smpng/sys/kern/subr_sleepqueue.c	2005/09/15 19:40:43
 +++ //depot/user/jhb/preemption/kern/subr_sleepqueue.c	2005/09/15 20:09:55
 @@ -410,9 +410,10 @@
  	 * just return.
  	 */
  	if (td->td_sleepqueue != NULL) {
 -		MPASS(!TD_ON_SLEEPQ(td));
  		mtx_unlock_spin(&sc->sc_lock);
  		mtx_lock_spin(&sched_lock);
 +		MPASS(!TD_ON_SLEEPQ(td));
 +		MPASS(!TD_IS_SLEEPING(td));
  		return;
  	}
  
 --- //depot/projects/smpng/sys/vm/vm_glue.c	2005/05/27 14:58:46
 +++ //depot/user/jhb/preemption/vm/vm_glue.c	2005/05/27 19:03:12
 @@ -556,6 +556,7 @@
  			vm_thread_swapin(td);
  
  		PROC_LOCK(p);
 +		critical_enter();
  		mtx_lock_spin(&sched_lock);
  		p->p_sflag &= ~PS_SWAPPINGIN;
  		p->p_sflag |= PS_INMEM;
 @@ -570,6 +571,7 @@
  
  		/* Allow other threads to swap p out now. */
  		--p->p_lock;
 +		critical_exit();
  	}
  #endif /* NO_SWAPPING */
  }
 
 -- 
 John Baldwin <jhb@FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
 "Power Users Use the Power to Serve"  =  http://www.FreeBSD.org
State-Changed-From-To: open->closed 
State-Changed-By: jhb 
State-Changed-When: Wed Nov 3 13:28:05 UTC 2010 
State-Changed-Why:  
To the best of my knowledge, the source of this was Alpha specific and not 
MI.  Since development of Alpha has ceased, this will not be fixed. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=85346 
>Unformatted:
