From nobody@FreeBSD.ORG  Wed Oct 18 08:05:58 2000
Return-Path: <nobody@FreeBSD.ORG>
Received: by hub.freebsd.org (Postfix, from userid 32767)
	id 5946F37B4E5; Wed, 18 Oct 2000 08:05:56 -0700 (PDT)
Message-Id: <20001018150556.5946F37B4E5@hub.freebsd.org>
Date: Wed, 18 Oct 2000 08:05:56 -0700 (PDT)
From: rasmith@aristotle.tamu.edu
Sender: nobody@FreeBSD.ORG
To: freebsd-gnats-submit@FreeBSD.org
Subject: DMA errors during intensive disk activity on vinum volume
X-Send-Pr-Version: www-1.0

>Number:         22086
>Category:       kern
>Synopsis:       DMA errors during intensive disk activity on vinum volume
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    grog
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Wed Oct 18 08:10:01 PDT 2000
>Closed-Date:    Thu May 3 01:29:11 PDT 2001
>Last-Modified:  Thu May 03 01:33:12 PDT 2001
>Originator:     Robin Smith
>Release:        4.1.1-RELEASE #1
>Organization:
Dept. of Philosophy, Texas A&M University
>Environment:
FreeBSD test-drive.tamu.edu 4.1.1-RELEASE FreeBSD 4.1.1-RELEASE #1: Tue Oct 10 18:09:51 CDT 2000     root@test-drive.tamu.edu:/usr/src/sys/compile/TEST-DRIVE  i386
>Description:
System is a dual-PIII, 512MB, 4 AT disks (see below); a raid5 vinum
volume is mounted as /usr (striped across all four disks).  Heavy access
to a vinum raid5 device causes DMA errors; if the system does not 
recover (i.e. a sync fails after multiple attempts), a reboot is triggered.  
The initial error message is of the following form:

Oct 18 08:36:19 test-drive /kernel: ata0-slave: zero length DMA transfer attempted

This message may mention either controller.  If there is no recovery,
then either the terminal hangs indefinitely or the system restarts.

From a hang caused by 'du /usr' (which worked its way to /usr/src/games
before giving up:

Oct 18 08:36:19 test-drive /kernel: ata0-slave: zero length DMA transfer attempted
Oct 18 08:36:29 test-drive /kernel: ad1: WRITE command timeout - resetting
Oct 18 08:36:29 test-drive /kernel: ata0: resetting devices .. done
Oct 18 08:36:29 test-drive /kernel: ata0-slave: zero length DMA transfer attempted

The error message "zero length DMA transfer attempted" is found in 
/
>How-To-Repeat:
any intensive disk activity on the vinum raid5 volume triggers it.  A good
choice is 'du /usr'.

Here is some tracking information on the error message "zero length DMA transfer"

512: grep -C10 "zero length DMA transfer" /usr/src/sys/dev/ata/ata-dma.c 
ata_dmasetup(struct ata_softc *scp, int32_t device, 
             int8_t *data, int32_t count, int32_t flags)
###                                ^^^^  ARGUMENT 4
{
    struct ata_dmaentry *dmatab;
    u_int32_t dma_count, dma_base;
    int i = 0;

    if (((uintptr_t)data & 1) || (count & 1))
        return -1;
##### FOR COUNT=0
    if (!count) {
        ata_printf(scp, device, "zero length DMA transfer attempted\n");
        return -1;
    }
    
    dmatab = scp->dmatab[ATA_DEV(device)];
    dma_base = vtophys(data);
    dma_count = min(count, (PAGE_SIZE - ((uintptr_t)data & PAGE_MASK)));
    data += dma_count;
    count -= dma_count;

    while (count) {


Called by:

516: grep dmasetup /usr/src/sys/dev/ata/* |less
/usr/src/sys/dev/ata/ata-all.h:int32_t ata_dmasetup(struct ata_softc *, int32_t, int8_t *, int32_t, int32_t);
/usr/src/sys/dev/ata/ata-disk.c:            !ata_dmasetup(adp->controller, adp->unit, 
/usr/src/sys/dev/ata/ata-dma.c:ata_dmasetup(struct ata_softc *scp, int32_t device, 
/usr/src/sys/dev/ata/ata-dma.c:ata_dmasetup(struct ata_softc *scp, int32_t device,
/usr/src/sys/dev/ata/atapi-all.c:       !ata_dmasetup(atp->controller, atp->unit,

In ad_transfer(struct ad_request *request):


524: grep -C6 -n dmasetup /usr/src/sys/dev/ata/ata-disk.c
394-
395-    devstat_start_transaction(&adp->stats);
396-
397-    /* does this drive & transfer work with DMA ? */
398-    request->flags &= ~ADR_F_DMA_USED;
399-    if ((adp->controller->mode[ATA_DEV(adp->unit)] >= ATA_DMA) &&
400:        !ata_dmasetup(adp->controller, adp->unit, 
401-                      (void *)request->data, request->bytecount,
#######                                          ^^^^^^^^^ = count ####
402-                      (request->flags & ADR_F_READ))) {
403-        request->flags |= ADR_F_DMA_USED;
404-        cmd = request->flags&ADR_F_READ ? ATA_C_READ_DMA : ATA_C_WRITE_DMA;
405-        request->currentsize = request->bytecount;
406-    }

The only call to ata_dmasetup in ata-disk is from ad_transfer.

There are two calls to ad_transfer:

At line 570:

    else {
        request->bytecount -= request->currentsize;
        request->donecount += request->currentsize;
        if (request->bytecount > 0) {
            ad_transfer(request);
            return ATA_OP_CONTINUES;
        }

At line 290:

     while (request.bytecount > 0) {
            ad_transfer(&request);


Both are protected by comparisons; bytecount==0 should never happen.

root@test-drive [/usr/home/rasmith]
525: grep -C6 -n dmasetup /usr/src/sys/dev/ata/atapi-all.c
254-    if ((atp->controller->mode[ATA_DEV(atp->unit)] >= ATA_DMA) &&
255-    (request->ccb[0] == ATAPI_READ ||
256-     request->ccb[0] == ATAPI_READ_BIG ||
257-     ((request->ccb[0] == ATAPI_WRITE ||
258-       request->ccb[0] == ATAPI_WRITE_BIG) &&
259-      !(atp->controller->flags & ATA_ATAPI_DMA_RO))) &&
260:    !ata_dmasetup(atp->controller, atp->unit,
261-                  (void *)request->data, request->bytecount,
###########                                  ^^^^^^^^^^^^^ = count ####
262-                  request->flags & ATPR_F_READ)) {
263-    request->flags |= ATPR_F_DMA_USED;
264-    }
265-
266-    /* start ATAPI operation */

518: cat /var/log/mount.today 
/dev/ad0s1a     /       ufs rw  1 1
/dev/ad0s1e     /mnt/tmp        ufs rw  2 2
/dev/ad0s1f     /var    ufs rw  2 2
/dev/vinum/raid5        /usr    ufs rw  2 2
procfs  /proc   procfs rw       0 0

520: cat /var/log/dmesg.today

Copyright (c) 1992-2000 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD 4.1.1-RELEASE #1: Tue Oct 10 18:09:51 CDT 2000
    root@test-drive.tamu.edu:/usr/src/sys/compile/TEST-DRIVE
Timecounter "i8254"  frequency 1193182 Hz
CPU: Pentium III/Pentium III Xeon/Celeron (501.14-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0x683  Stepping = 3
  Features=0x383fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV
,PAT,PSE36,MMX,FXSR,XMM>
real memory  = 536870912 (524288K bytes)
avail memory = 518524928 (506372K bytes)
Programming 24 pins in IOAPIC #0
IOAPIC #0 intpin 2 -> irq 0
IOAPIC #0 intpin 16 -> irq 5
IOAPIC #0 intpin 17 -> irq 11
IOAPIC #0 intpin 18 -> irq 10
IOAPIC #0 intpin 19 -> irq 9
FreeBSD/SMP: Multiprocessor motherboard
 cpu0 (BSP): apic id:  0, version: 0x00040011, at 0xfee00000
 cpu1 (AP):  apic id:  1, version: 0x00040011, at 0xfee00000
 io0 (APIC): apic id:  2, version: 0x00170011, at 0xfec00000
Preloaded elf kernel "kernel" at 0xc041d000.
Pentium Pro MTRR support enabled
md0: Malloc disk
npx0: <math processor> on motherboard
npx0: INT 16 interface
pcib0: <Intel 82443GX host to PCI bridge> on motherboard
pci0: <PCI bus> on pcib0
pcib2: <Intel 82443GX (440 GX) PCI-PCI (AGP) bridge> at device 1.0 on pci0
pci1: <PCI bus> on pcib2
pci1: <Trident model 9750 VGA-compatible display device> at 0.0 irq 11
isab0: <Intel 82371AB PCI to ISA bridge> at device 7.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel PIIX4 ATA33 controller> port 0xffa0-0xffaf at device 7.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata1: at 0x170 irq 15 on atapci0
uhci0: <Intel 82371AB/EB (PIIX4) USB controller> port 0xef80-0xef9f irq 9 at devic
e 7.2 on pci0
usb0: <Intel 82371AB/EB (PIIX4) USB controller> on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
Timecounter "PIIX"  frequency 3579545 Hz
chip1: <Intel 82371AB Power management controller> port 0x440-0x44f at device 7.3 
on pci0
xl0: <3Com 3c905B-TX Fast Etherlink XL> port 0xec00-0xec7f mem 0xfebfef80-0xfebfef
ff irq 5 at device 15.0 on pci0
xl0: Ethernet address: 00:01:02:be:6d:85
miibus0: <MII bus> on xl0
xlphy0: <3Com internal media interface> on miibus0
xlphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
xl1: <3Com 3c905B-TX Fast Etherlink XL> port 0xe480-0xe4ff mem 0xfebfef00-0xfebfef
7f irq 11 at device 16.0 on pci0
xl1: Ethernet address: 00:01:02:be:6d:41
miibus1: <MII bus> on xl1
xlphy1: <3Com internal media interface> on miibus1
xlphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
ahc0: <Adaptec 29160 Ultra160 SCSI adapter> port 0xe800-0xe8ff mem 0xfebff000-0xfe
bfffff irq 10 at device 18.0 on pci0
aic7892: Wide Channel A, SCSI Id=7, 32/255 SCBs
pcib1: <Intel 82443GX host to AGP bridge> on motherboard
pci2: <PCI bus> on pcib1
fdc0: <NEC 72065B or clone> at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
fdc0: FIFO enabled, 8 bytes threshold
atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
atkbd0: <AT Keyboard> flags 0x1 irq 1 on atkbdc0
kbd0 at atkbd0
psm0: <PS/2 Mouse> irq 12 on atkbdc0
psm0: model MouseMan+, device ID 0
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A
sio1 at port 0x2f8-0x2ff irq 3 on isa0
sio1: type 16550A
ppc0: <Parallel port> at port 0x378-0x37f irq 7 on isa0
lip0: <PLIP network interface> on ppbus0
lpt0: <Printer> on ppbus0
lpt0: Interrupt-driven port
ppi0: <Parallel I/O> on ppbus0
APIC_IO: Testing 8254 interrupt delivery
APIC_IO: routing 8254 via IOAPIC #0 intpin 2
SMP: AP CPU #1 Launched!
ad0: 29314MB <IBM-DTLA-307030> [59560/16/63] at ata0-master using UDMA33
ad1: 29314MB <IBM-DTLA-307030> [59560/16/63] at ata0-slave using UDMA33
ad2: 29314MB <IBM-DTLA-307030> [59560/16/63] at ata1-master using UDMA33
ad3: 29314MB <IBM-DTLA-307030> [59560/16/63] at ata1-slave using UDMA33
Waiting 15 seconds for SCSI devices to settle
Mounting root from ufs:/dev/ad0s1a
WARNING: / was not properly dismounted
vinum: loaded
vinum: reading configuration from /dev/ad3s1e
vinum: updating configuration from /dev/ad2s1e
vinum: updating configuration from /dev/ad1s1e
vinum: updating configuration from /dev/ad0s1h
cd0 at ahc0 bus 0 target 4 lun 0
cd0: <TOSHIBA CD-ROM XM-6401TA 1001> Removable CD-ROM SCSI-2 device 
cd0: 20.000MB/s transfers (20.000MHz, offset 16)
cd0: cd present [328499 x 2048 byte records]


>Fix:
None known

>Release-Note:
>Audit-Trail:

From: Robin Smith <rasmith@aristotle.tamu.edu>
To: freebsd-gnats-submit@FreeBSD.org
Cc:  
Subject: Re: kern/22086: DMA errors during intensive disk activity on vinum volume
Date: Sun, 22 Oct 2000 15:24:31 -0500

 This now appears to be a bug specifically in vinum's raid-5 implementation.
 I have reconfigured the volume as raid-10 (simple concatenation+mirroring),
 and there have been no errors.  
 

From: Carlos Amengual <webmaster@healthnet-sl.es>
To: freebsd-gnats-submit@FreeBSD.org, rasmith@aristotle.tamu.edu
Cc:  
Subject: Re: kern/22086: DMA errors during intensive disk activity on vinum 
 volume
Date: Tue, 28 Nov 2000 18:52:43 +0100

 I found the same problem on a 4.2-RELEASE system:
 
 Vinum configuration:
 
 drive idea device /dev/ad0s1e
 drive ideb device /dev/ad2s1f
 drive idec device /dev/ad4s1f
 volume raid5a
   plex org raid5 512k
     sd length 0 drive idea
     sd length 0 drive ideb
     sd length 0 drive idec
 
 The error messages:
 
 Nov 27 22:24:43 vega /kernel: ata0-master: zero length DMA transfer attempted
 Nov 27 22:25:06 vega /kernel: ad0: WRITE command timeout tag=0 serv=0 - resetting
 Nov 27 22:25:06 vega /kernel: ata0: resetting devices .. done
 Nov 27 22:25:06 vega /kernel: ata0-master: zero length DMA transfer attempted
 Nov 27 22:25:06 vega /kernel: ad0: WRITE command timeout tag=0 serv=0 - resetting
 Nov 27 22:25:06 vega /kernel: ata0: resetting devices .. done
 
State-Changed-From-To: open->feedback 
State-Changed-By: grog 
State-Changed-When: Tue Nov 28 18:05:56 PST 2000 
State-Changed-Why:  
Request further action from submitter. 


Responsible-Changed-From-To: freebsd-bugs->grog 
Responsible-Changed-By: grog 
Responsible-Changed-When: Tue Nov 28 18:05:56 PST 2000 
Responsible-Changed-Why:  
Grog maintains Vinum. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=22086 
State-Changed-From-To: feedback->closed 
State-Changed-By: grog 
State-Changed-When: Thu May 3 01:29:11 PDT 2001 
State-Changed-Why:  
Timeout on feedback.  Suspect problem fixed in revisions 1.20 and 
1.18.2.2 of vinumlock.c. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=22086 
>Unformatted:

Greg Lehey, 29 November 2000

Corrected wrapped log lines.

Thanks for the code walkthrough.  Could you please do the following:

1.  Change the code in ata_dmasetup to panic instead of printing the
    error message.

2.  In /sys/dev/vinum/request.h, change the value of RQINFO_SIZE from
    128 to 1024.

3.  Build and install a new kernel and modules, both with debugging
    symbols.

4.  On rebooting, ensure that dumping is enabled.

5.  When starting Vinum, issue the command

	 debug 72

    This will cause Vinum to log what it is doing.  You can watch
    online with "vinum info"--see the man page for further details.

6.  When you get a dump, contact me directly for further information.
    If possible, the easiest would be to give me access to the dumps
    and sources directly.

Greg Lehey, 3 May 2001

Timed out on feedback.  I suspect that this problem is related to PR
21915, which was caused by a race condition in band locking.  This
problem was fixed in revisions 1.20 and 1.18.2.2 of vinumlock.c.

PR closed.
