From nobody@FreeBSD.org  Mon Jul  9 10:18:22 2001
Return-Path: <nobody@FreeBSD.org>
Received: from freefall.freebsd.org (freefall.freebsd.org [216.136.204.21])
	by hub.freebsd.org (Postfix) with ESMTP id C44FB37B401
	for <freebsd-gnats-submit@FreeBSD.org>; Mon,  9 Jul 2001 10:18:21 -0700 (PDT)
	(envelope-from nobody@FreeBSD.org)
Received: (from nobody@localhost)
	by freefall.freebsd.org (8.11.3/8.11.3) id f69HIL899370;
	Mon, 9 Jul 2001 10:18:21 -0700 (PDT)
	(envelope-from nobody)
Message-Id: <200107091718.f69HIL899370@freefall.freebsd.org>
Date: Mon, 9 Jul 2001 10:18:21 -0700 (PDT)
From: Tracy Camp <campt@miralink.com>
To: freebsd-gnats-submit@FreeBSD.org
Subject: Possible interrupt masking trouble in sys/cam/cam_xpt.c
X-Send-Pr-Version: www-1.0

>Number:         28840
>Category:       kern
>Synopsis:       [cam] Possible interrupt masking trouble in sys/cam/cam_xpt.c
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    gibbs
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Mon Jul 09 10:20:01 PDT 2001
>Closed-Date:    Tue Sep 06 19:25:07 UTC 2011
>Last-Modified:  Tue Sep 06 19:25:07 UTC 2011
>Originator:     Tracy Camp
>Release:        4.3-RELEASE
>Organization:
MiraLink Corp
>Environment:
FreeBSD 4.3-RELEASE FreeBSD 4.3-RELEASE #126: Fri Jul 6 13:37:26 EDT 2001
root@bsd-4-3.miralink.com:/usr/src/sys/compile/TEST i386
>Description:
Over a long period we have seen seemingly random crashes under FreeBSD
3.1, and now under FreeBSD 4.3, all related to the disk I/O system.  Our
application uses a custom driver with a Qlogic ISP controller operating
in target mode.  An extensive audit of our driver source turned up no
further problems.  However, sanity checks added to parts of
sys/cam/cam_xpt.c showed what appears to be queue corruption caused by
incorrect interrupt masking.  The problem only shows up under rather
heavy load.  Our driver does a fair amount of work at interrupt level,
so that may be the underlying trigger.  Nevertheless, replacing every
splsoftcam() call in sys/cam/cam_xpt.c with splcam() eliminated the
problem entirely.

Specific problems we had encountered:

devstat_end_transaction HELP!! busy_count for da2 < 0 (-1)

This was shown to always result from a devstat_end_transaction_buf
call occurring within sys/cam/scsi/scsi_da.c:dadone()

panic: xpt_run_dev_allocq: Device on queue without any work to do

This was found after a bit of testing to be related directly to the next one:

Fatal Trap 12: page fault while in kernel mode

This was occurring within xpt_run_dev_allocq and was actually due to a
NULL pointer being returned by camq_remove on the device queue.

Checks added to camq_insert and camq_remove showed that occasionally,
after a queue entry had been added but before camq_insert had finished,
the entry count would be 0 rather than the expected 1.  Particularly
convincing was a test we inserted that did something similar to this:

camq_insert(..)
{
/* near the top */
saved_entries = queue->entries;
/* later */
if(queue->entries < 1) {
printf("entries < 1 %d", queue->entries);
}
else {
>How-To-Repeat:
>Fix:
>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-bugs->gibbs 
Responsible-Changed-By: dwmalone 
Responsible-Changed-When: Mon Jul 9 14:36:52 PDT 2001 
Responsible-Changed-Why:  
Justin - this PR seems to contain some interesting info which looks 
like it applies to the generic cam code. Maybe you could take a look? 

http://www.FreeBSD.org/cgi/query-pr.cgi?pr=28840 
State-Changed-From-To: open->suspended 
State-Changed-By: linimon 
State-Changed-When: Tue Nov 29 05:59:44 GMT 2005 
State-Changed-Why:  
Mark this aging PR as suspended. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=28840 
State-Changed-From-To: suspended->closed 
State-Changed-By: gibbs 
State-Changed-When: Tue Sep 6 19:24:22 UTC 2011 
State-Changed-Why:  
CAM locking has radically changed since this was submitted.  No longer relevant. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=28840 
>Unformatted:
