From pete@rms21.rommon.net  Sat Oct 18 11:56:53 2003
Return-Path: <pete@rms21.rommon.net>
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id B9D2116A4B3
	for <FreeBSD-gnats-submit@freebsd.org>; Sat, 18 Oct 2003 11:56:53 -0700 (PDT)
Received: from rms21.rommon.net (rms21.rommon.net [193.64.42.200])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 8DB1C43FDD
	for <FreeBSD-gnats-submit@freebsd.org>; Sat, 18 Oct 2003 11:56:52 -0700 (PDT)
	(envelope-from pete@rms21.rommon.net)
Received: from rms21.rommon.net (localhost [127.0.0.1])
	by rms21.rommon.net (8.12.10/8.12.9) with ESMTP id h9IIunuX086510
	for <FreeBSD-gnats-submit@freebsd.org>; Sat, 18 Oct 2003 21:56:49 +0300 (EEST)
	(envelope-from pete@rms21.rommon.net)
Received: (from pete@localhost)
	by rms21.rommon.net (8.12.10/8.12.10/Submit) id h9IIunCu086509;
	Sat, 18 Oct 2003 21:56:49 +0300 (EEST)
	(envelope-from pete)
Message-Id: <200310181856.h9IIunCu086509@rms21.rommon.net>
Date: Sat, 18 Oct 2003 21:56:49 +0300 (EEST)
From: Petri Helenius <pete@helenius.fi>
Reply-To: Petri Helenius <pete@helenius.fi>
To: FreeBSD-gnats-submit@freebsd.org
Cc:
Subject: atacontrol rebuild always panics
X-Send-Pr-Version: 3.113
X-GNATS-Notify:

>Number:         58228
>Category:       kern
>Synopsis:       atacontrol rebuild always panics
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    sos
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sat Oct 18 12:00:27 PDT 2003
>Closed-Date:    Mon Jan 12 11:22:41 PST 2004
>Last-Modified:  Mon Jan 12 11:22:41 PST 2004
>Originator:     Petri Helenius
>Release:        FreeBSD 5.1-CURRENT i386
>Organization:
>Environment:
System: FreeBSD rms21.rommon.net 5.1-CURRENT FreeBSD 5.1-CURRENT #6: Sat Oct 4 00:50:59 EEST 2003 pete@rms21.rommon.net:/usr/src/sys/i386/compile/ROMMON-SERVER5 i386


	
>Description:
	The kernel panics whenever an atacontrol rebuild finishes, whether
	the rebuild completes successfully or fails.
Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x10
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc0163e37
stack pointer           = 0x10:0xd2120ce4
frame pointer           = 0x10:0xd2120d0c
code segment            = base 0x0, limit 0xfffff, type 0x1b
                       = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 482 (rebuilding ar0 0%)
kernel: type 12 trap, code=0
Stopped at      ar_rebuild+0x247:       movl    0x10(%eax),%edx
db> trace
ar_rebuild(c25f2000,d2120d48,c03aaf9c,314,0) at ar_rebuild+0x247
fork_exit(c0163bf0,c25f2000,d2120d48) at fork_exit+0xcf
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xd2120d7c, ebp = 0 ---
db>

This points to the line below in ata-raid.c:
         if (rdp->disks[disk].flags & AR_DF_ONLINE)
             adp = AD_SOFTC(rdp->disks[disk + rdp->width]);
         else
>How-To-Repeat:
	Use atacontrol to detach, attach, addspare and rebuild a mirror or
	raid0+1 set. To make it crash faster, detach while rebuilding.
>Fix:


>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-bugs->sos 
Responsible-Changed-By: simon 
Responsible-Changed-When: Sun Oct 19 06:08:23 PDT 2003 
Responsible-Changed-Why:  
Over to ata(4) maintainer. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=58228 
State-Changed-From-To: open->suspended 
State-Changed-By: sos 
State-Changed-When: Sat Oct 25 12:29:50 PDT 2003 
State-Changed-Why:  
The problem is known and will get fixed when ata-raid gets updated 
to take advantage of ATAng's features (WIP). 

http://www.freebsd.org/cgi/query-pr.cgi?pr=58228 

From: Doug White <dwhite@gumbysoft.com>
To: freebsd-gnats-submit@FreeBSD.org, pete@helenius.fi,
	sos@freebsd.org
Cc:  
Subject: Re: kern/58228: atacontrol rebuild always panics
Date: Thu, 4 Dec 2003 23:42:01 -0800 (PST)

 I got ahold of an Intel S845WD1 board with an onboard Promise controller to
 test this PR out with.
 
 I can reproduce the panic on detach, but that's not surprising. It'd be one
 hell of a foot-shooting maneuver since you need to use 'atacontrol
 addspare' to tee up the replacement disk then 'atacontrol rebuild' to kick
 the rebuild off. Presumably you would know where your RAIDs are so you
 don't detach them while rebuilding.  'atacontrol detach' should return
 EBUSY on channels with rebuilding disks for safety though.
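
The EBUSY suggestion above could look roughly like the sketch below. It is only an illustration under stated assumptions: the sk_* names, the flag and the structures are hypothetical stand-ins, not the real ata(4) or ata-raid definitions, and the real detach path may need locking that is not shown here.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical state bit; not a real ata-raid flag. */
#define SK_DISK_REBUILDING 0x01

/* Hypothetical per-disk record: which channel it sits on and its state. */
struct sk_disk {
    int channel;    /* ATA channel the disk is attached to */
    int flags;      /* SK_DISK_* state bits */
};

/* Hypothetical array descriptor. */
struct sk_array {
    struct sk_disk *disks;
    size_t ndisks;
};

/*
 * Return EBUSY if any disk on `channel` belongs to a running rebuild,
 * otherwise 0, meaning the detach may proceed.
 */
int
sk_detach_check(const struct sk_array *ar, int channel)
{
    size_t i;

    if (ar == NULL)
        return 0;       /* no array configured, nothing to protect */
    for (i = 0; i < ar->ndisks; i++) {
        if (ar->disks[i].channel == channel &&
            (ar->disks[i].flags & SK_DISK_REBUILDING))
            return EBUSY;
    }
    return 0;
}
```

A detach handler would call a check like this first and pass a nonzero result straight back to the caller instead of tearing the channel down.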
 
 I am currently running the test to see if a rebuild actually finishes. It
 seems odd that rebuilds would fail entirely every time and there not be a
 huge storm of protest on -current about it.  I don't quite see how a
 rebuild causes a disk device to go away and make rdp->disks walk off the
 end of the array.  If it dies on my box we'll at least get a crashdump or
 ddb that I can poke with.
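
To make the two suspected failure modes concrete (an index past the end of rdp->disks, or a slot whose device has gone away), here is a minimal defensive lookup sketch. The sk_* types are hypothetical and only mirror the shape of the expression in the PR; this is not the actual ata-raid.c code or the fix that was committed.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the driver structures. */
struct sk_softc  { int unit; };
struct sk_device { void *driver; };     /* per-disk driver softc */
struct sk_slot   { struct sk_device *device; int flags; };
struct sk_raid {
    struct sk_slot *disks;
    int total_disks;    /* number of valid entries in disks[] */
    int width;          /* disks per mirror half */
};

/*
 * Look up the softc for slot `idx`. Returns NULL instead of faulting
 * when the index is out of range or the slot has no device attached.
 */
struct sk_softc *
sk_softc_at(const struct sk_raid *rdp, int idx)
{
    if (rdp == NULL || idx < 0 || idx >= rdp->total_disks)
        return NULL;
    if (rdp->disks[idx].device == NULL)
        return NULL;
    return (struct sk_softc *)rdp->disks[idx].device->driver;
}
```

A caller in the rebuild loop would use sk_softc_at(rdp, disk + rdp->width) and abort the rebuild with an error on NULL rather than dereferencing the result.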
 
 It would help to get a description of exactly what hardware was in use,
 the status of the array prior to the rebuild, and steps taken to initiate
 the rebuild so I can attempt to recreate it. If there's a specific method
 you're using to break the mirror, that might help too; I just detached the
 channel with one of the disks on it.
 
 For the record, my test system is:
 Intel S845WD1 w/ latest BIOS
 1.4GHz P4, 512MB RAM
 2xSeagate ST340016A, each master on one of the two Promise channels
 
 -- 
 Doug White                    |  FreeBSD: The Power to Serve
 dwhite@gumbysoft.com          |  www.FreeBSD.org
State-Changed-From-To: suspended->closed 
State-Changed-By: sos 
State-Changed-When: Mon Jan 12 11:21:44 PST 2004 
State-Changed-Why:  
fixed in -current. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=58228 
>Unformatted:
 >>>>>>>>>                adp = AD_SOFTC(rdp->disks[disk]);     <<<<<<<<<<<
            if ((error = ar_rw(adp, rdp->lock_start,
                               size * DEV_BSIZE, buffer, AR_WRITE | AR_WAIT)))
                break;
 
 	
 
 There is also another variation of this panic, which I'll submit later.
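
A note on reading the trap: the faulting instruction is movl 0x10(%eax),%edx and the fault virtual address is 0x10, so %eax held 0 and the code read a member 0x10 bytes into a structure through a NULL pointer. The sketch below only illustrates that arithmetic. The sk_layout structure is made up for the example; which real structure and member are involved is not established by the trace alone.

```c
#include <stddef.h>
#include <stdint.h>

/* Made-up layout: 16 bytes of leading fields, then the member read. */
struct sk_layout {
    uint32_t pad[4];    /* offsets 0x00..0x0f */
    uint32_t member;    /* offset 0x10 */
};

/* Address touched when reading a member at `offset` through `base`. */
uintptr_t
sk_effective_address(uintptr_t base, size_t offset)
{
    return base + offset;
}
```

With a NULL base, sk_effective_address(0, offsetof(struct sk_layout, member)) evaluates to 0x10, matching the fault address in the trap.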
 
