From nobody@FreeBSD.org  Mon Feb 28 22:45:18 2005
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id E26CD16A4CE
	for <freebsd-gnats-submit@FreeBSD.org>; Mon, 28 Feb 2005 22:45:18 +0000 (GMT)
Received: from www.freebsd.org (www.freebsd.org [216.136.204.117])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 92E0443D39
	for <freebsd-gnats-submit@FreeBSD.org>; Mon, 28 Feb 2005 22:45:18 +0000 (GMT)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.13.1/8.13.1) with ESMTP id j1SMjInk073950
	for <freebsd-gnats-submit@FreeBSD.org>; Mon, 28 Feb 2005 22:45:18 GMT
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.13.1/8.13.1/Submit) id j1SMjIJB073949;
	Mon, 28 Feb 2005 22:45:18 GMT
	(envelope-from nobody)
Message-Id: <200502282245.j1SMjIJB073949@www.freebsd.org>
Date: Mon, 28 Feb 2005 22:45:18 GMT
From: Jason Hitt <jhitt25@swbell.net>
To: freebsd-gnats-submit@FreeBSD.org
Subject: WRITE_DMA UDMA ICRC errors while copying data to a disk
X-Send-Pr-Version: www-2.3

>Number:         78216
>Category:       kern
>Synopsis:       WRITE_DMA UDMA ICRC errors while copying data to a disk
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    freebsd-bugs
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Mon Feb 28 22:50:07 GMT 2005
>Closed-Date:    Thu May 24 19:09:09 GMT 2007
>Last-Modified:  Thu May 24 19:09:09 GMT 2007
>Originator:     Jason Hitt
>Release:        FreeBSD 5.3-STABLE i386
>Organization:
>Environment:
FreeBSD calandor 5.3-STABLE FreeBSD 5.3-STABLE #0: Sun Feb 13 22:01:06 CST 2005     root@calandor:/usr/obj/usr/src/sys/FILESERVER_5  i386
>Description:
    My system was configured with 4.10 using vinum with a simple mirroring setup.
    I upgraded to 5.3 and attempted to convert to gmirror.
    I removed /dev/ad2 from my vinum volume and created a gmirror volume on
        it instead (on /dev/ad2s1).
    I then successfully copied all my data from the mounts residing
        on /dev/ad0 to the mounts residing on /dev/ad2 without a single error.
    I rebooted using /dev/ad2 and reset /dev/ad0.
    Upon adding /dev/ad0s1 to the gmirror volume, I immediately began receiving
        errors of the form:
        WARNING - WRITE_DMA UDMA ICRC error (retrying request) LBA=########
    I then removed /dev/ad0s1 from the gmirror volume, attempted to create
        a new volume on it, and simply copy data from the volume on /dev/ad2s1
        to this new volume.  Again, the exact same results occurred.
    I disabled DMA via hw.ata.ata_dma in /boot/loader.conf, and everything
        immediately began working without error.
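
    A minimal sketch of how warnings of the form quoted above could be
    tallied per device from saved kernel messages.  It is written in Python
    purely for illustration.  The leading device prefix (for example
    "ad0: ") and the sample lines are assumptions, since this report quotes
    the message without a prefix and with the LBA masked.

```python
import re
from collections import Counter

# Assumed line shape:
#   "<device>: WARNING - WRITE_DMA UDMA ICRC error (retrying request) LBA=<n>"
# The "<device>: " prefix is an assumption; it is optional in the pattern.
ICRC_LINE = re.compile(
    r"(?:(?P<dev>\w+): )?WARNING - (?P<cmd>\w+) UDMA ICRC error .*LBA=(?P<lba>\d+)"
)


def tally_icrc_errors(lines):
    """Count UDMA ICRC warnings per device name ("unknown" when no prefix)."""
    counts = Counter()
    for line in lines:
        match = ICRC_LINE.search(line.strip())
        if match:
            counts[match.group("dev") or "unknown"] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample lines; the LBA values are made up.
    sample = [
        "ad0: WARNING - WRITE_DMA UDMA ICRC error (retrying request) LBA=1234567",
        "ad0: WARNING - WRITE_DMA UDMA ICRC error (retrying request) LBA=1234695",
        "GEOM_MIRROR: Device m0s1: provider ad0s1 activated.",
    ]
    print(dict(tally_icrc_errors(sample)))
```

    Feeding it the lines of a saved dmesg or /var/log/messages file would
    show whether the warnings are confined to one drive, as described here
    for ad0.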

    Two interesting points about the above process:
        1) I had no problems whatsoever copying nearly 100 gigs of data
           from /dev/ad0 to /dev/ad2 (repeatedly...I redid the copy three
           times before I decided my new setup met my desires).
        2) After attempting to add /dev/ad0 to the gmirror volume and seeing
           errors, I rebooted my PC to use a hard disk diagnostic tool.
           When the machine rebooted, the BIOS reported the first drive in
           CHS mode, not LBA mode.  Zeroing out the drive and re-fdisking
           corrected this.  Attempting to copy data to the drive caused it
           to re-occur (with the associated WRITE_DMA errors popping up
           as well).

    The only customizations I have made to the config file were to disable
    drivers I do not use (various network cards, some drive controllers...
    basically just hardware I will never own).

    I have two hard disks, each on its own 80-conductor IDE cable.

    Below is my startup dump.

    FreeBSD 5.3-STABLE #0: Sun Feb 13 22:01:06 CST 2005
    root@calandor:/usr/obj/usr/src/sys/FILESERVER_5
    Timecounter "i8254" frequency 1193182 Hz quality 0
    CPU: AMD Duron(tm) processor (798.64-MHz 686-class CPU)
    Origin = "AuthenticAMD"  Id = 0x631  Stepping = 1
    Features=0x183f9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR>
    AMD Features=0xc0440000<RSVD,AMIE,DSP,3DNow!>
    real memory  = 536805376 (511 MB)
    avail memory = 515620864 (491 MB)
    npx0: [FAST]
    npx0: <math processor> on motherboard
    npx0: INT 16 interface
    acpi0: <VIA694 AWRDACPI> on motherboard
    acpi0: Power Button (fixed)
    Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
    acpi_timer0: <24-bit timer at 3.579545MHz> port 0x4008-0x400b on acpi0
    cpu0: <ACPI CPU (3 Cx states)> on acpi0
    acpi_tz0: <Thermal Zone> on acpi0
    acpi_button0: <Power Button> on acpi0
    acpi_button1: <Sleep Button> on acpi0
    pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
    pci0: <ACPI PCI bus> on pcib0
    agp0: <VIA Generic host to PCI bridge> mem 0xe0000000-0xe1ffffff at device 0.0 on pci0
    pcib1: <PCI-PCI bridge> at device 1.0 on pci0
    pci1: <PCI bus> on pcib1
    pci1: <display, VGA> at device 0.0 (no driver attached)
    uhci0: <VIA 83C572 USB controller> port 0xd000-0xd01f irq 11 at device 16.0 on pci0
    uhci0: [GIANT-LOCKED]
    usb0: <VIA 83C572 USB controller> on uhci0
    usb0: USB revision 1.0
    uhub0: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
    uhub0: 2 ports with 2 removable, self powered
    uhci1: <VIA 83C572 USB controller> port 0xd400-0xd41f irq 3 at device 16.1 on pci0
    uhci1: [GIANT-LOCKED]
    usb1: <VIA 83C572 USB controller> on uhci1
    usb1: USB revision 1.0
    uhub1: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
    uhub1: 2 ports with 2 removable, self powered
    uhci2: <VIA 83C572 USB controller> port 0xd800-0xd81f irq 10 at device 16.2 on pci0
    uhci2: [GIANT-LOCKED]
    usb2: <VIA 83C572 USB controller> on uhci2
    usb2: USB revision 1.0
    uhub2: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
    uhub2: 2 ports with 2 removable, self powered
    pci0: <serial bus, USB> at device 16.3 (no driver attached)
    isab0: <PCI-ISA bridge> at device 17.0 on pci0
    isa0: <ISA bus> on isab0
    atapci0: <VIA 8235 UDMA133 controller> port 0xdc00-0xdc0f,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 at device 17.1 on pci0
    ata0: channel #0 on atapci0
    ata1: channel #1 on atapci0
    pci0: <multimedia, audio> at device 17.5 (no driver attached)
    vr0: <VIA VT6102 Rhine II 10/100BaseTX> port 0xe800-0xe8ff mem 0xe8001000-0xe80010ff irq 11 at device 18.0 on pci0
    miibus0: <MII bus> on vr0
    ukphy0: <Generic IEEE 802.3u media interface> on miibus0
    ukphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
    vr0: Ethernet address: 00:0d:87:b0:00:55
    fdc0: <floppy drive controller> port 0x3f7,0x3f0-0x3f5 irq 6 drq 2 on acpi0
    fdc0: [FAST]
    sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
    sio0: type 16550A
    ppc0: <ECP parallel printer port> port 0x778-0x77b,0x378-0x37f irq 7 drq 3 on acpi0
    ppc0: Generic chipset (ECP/PS2/NIBBLE) in COMPATIBLE mode
    ppc0: FIFO with 16/16/16 bytes threshold
    ppbus0: <Parallel port bus> on ppc0
    plip0: <PLIP network interface> on ppbus0
    lpt0: <Printer> on ppbus0
    lpt0: Interrupt-driven port
    ppi0: <Parallel I/O> on ppbus0
    atkbdc0: <Keyboard controller (i8042)> port 0x64,0x60 irq 1 on acpi0
    atkbd0: <AT Keyboard> irq 1 on atkbdc0
    kbd0 at atkbd0
    atkbd0: [GIANT-LOCKED]
    psm0: <PS/2 Mouse> irq 12 on atkbdc0
    psm0: [GIANT-LOCKED]
    psm0: model Generic PS/2 mouse, device ID 0
    orm0: <ISA Option ROM> at iomem 0xc0000-0xc9fff on isa0
    pmtimer0 on isa0
    sc0: <System console> at flags 0x100 on isa0
    sc0: VGA <16 virtual consoles, flags=0x300>
    sio1: configured irq 3 not in bitmap of probed irqs 0
    sio1: port may not be enabled
    vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
    Timecounter "TSC" frequency 798642802 Hz quality 800
    Timecounters tick every 10.000 msec
    ad0: 114473MB <WDC WD1200JB-00DUA3/75.13B75> [232581/16/63] at ata0-master PIO4
    ad2: 114473MB <WDC WD1200JB-00EVA0/15.05R15> [232581/16/63] at ata1-master PIO4
    GEOM_MIRROR: Device m0s1 created (id=1279703646).
    GEOM_MIRROR: Device m0s1: provider ad0s1 detected.
    GEOM_MIRROR: Device m0s1: provider ad2s1 detected.
    GEOM_MIRROR: Device m0s1: provider ad2s1 activated.
    GEOM_MIRROR: Device m0s1: provider ad0s1 activated.
    GEOM_MIRROR: Device m0s1: provider mirror/m0s1 launched.
    Mounting root from ufs:/dev/mirror/m0s1a
    Accounting enabled

>How-To-Repeat:
    Unknown if this is repeatable on any random system.  It appears to
    be an issue for many people; however, I did not see any reports of
    multiple drive configurations such as mine.  The fact that my second
    drive had no DMA issues while my first drive did may be revealing.

>Fix:
Workaround: disable DMA access by setting hw.ata.ata_dma="0" in /boot/loader.conf.
I have not yet tested DMA modes other than UDMA100, but PIO4 works flawlessly (albeit quite slowly).
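
To put "quite slowly" in rough numbers, the sketch below compares lower bounds on transfer time using the nominal peak interface rates of the modes mentioned in this report. The rates are the commonly quoted ATA figures and the helper name is hypothetical. Real throughput is lower because it is limited by the drive itself, so treat the results as bounds, not measurements.

```python
# Nominal peak interface rates in MB/s (1 MB = 10**6 bytes). Real drives are slower.
NOMINAL_MB_PER_S = {
    "PIO4": 16.7,
    "UDMA33": 33.3,
    "UDMA66": 66.7,
    "UDMA100": 100.0,
}


def min_transfer_seconds(size_mb: float, mode: str) -> float:
    """Lower bound on the time to move size_mb megabytes in the given mode."""
    return size_mb / NOMINAL_MB_PER_S[mode]


if __name__ == "__main__":
    for mode in NOMINAL_MB_PER_S:
        seconds = min_transfer_seconds(1000, mode)
        print(f"{mode}: at least {seconds:.0f} s per 1000 MB")
```

At these nominal rates, PIO4 needs roughly six times as long as UDMA100 for the same amount of data, before accounting for the extra CPU load of programmed I/O.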
>Release-Note:
>Audit-Trail:

From: "Jason Hitt" <jhitt25@swbell.net>
To: <freebsd-gnats-submit@FreeBSD.org>, <jhitt25@swbell.net>
Cc:  
Subject: Re: kern/78216: WRITE_DMA UDMA ICRC errors while copying data to a disk
Date: Mon, 28 Feb 2005 19:36:28 -0600

 I tested both UDMA33 and UDMA66 modes on ad0 and ad2.
 
 My test consisted of the following:
     - cp -R of a 1 gigabyte directory to the same gmirror volume.
     - An FTP push of 1 gigabyte of data from another host.
     - An FTP pull of 1 gigabyte of data from another host.
 
 No DMA errors occurred at either UDMA33 or UDMA66.  Attempting the same
 test in UDMA100 mode almost immediately resulted in errors on ad0 and
 caused gmirror to drop ad0 from the mirror volume.
 

From: Andrey Smagin <samspeedu@mail.ru>
To: freebsd-gnats-submit@FreeBSD.org
Cc:  
Subject: Re: kern/78216: WRITE_DMA UDMA ICRC errors while copying data to a disk
Date: Tue, 1 Mar 2005 09:45:26 +0300

 >>How-To-Repeat:
 JH>     Unknown if this is repeatable on any random system.  It appears to
 JH>     be an issue for many people, however, i did not see any reports of
 JH>     multiple drive configurations such as mine.  The fact that my second
 JH>     drive had no DMA issues while my first drive did may be revealing.
 
  I have some old hardware: one CPU, two motherboards, and 64 MB of RAM
 (iP~200; chipsets iTX and VIA).  I tested the following configuration:
 ata0 master, a 2.1 GB disk (UDMA33) that the BIOS detects normally and
 boots from; ata1 master, a WD 120 GB disk (UDMA33) for data storage.  The
 error occurred on both motherboards and destroyed data.  Disabling DMA
 resolved the problem.
  I also have another PC used as a server (iP166/VIA/64MB) with a 20 GB
 (UDMA33) disk and a 120 GB (UDMA100) disk (the first for the BIOS, the
 second for data).  On it the error occurred after the kernel loaded, but
 no important data has been destroyed so far.  Linear read speed in
 UDMA33(100) is no more than 4 MB/s with 100% CPU usage.
 
  All the disks, motherboards, and RAM modules are different, and I tried
 every possible combination of them.  The version was FreeBSD 6.0-CURRENT
 from 4 months ago; all the PCs now run FreeBSD 6.0-CURRENT from 2 weeks
 ago with DMA disabled.
 
  Another PC is more modern (Duron 1133/SIS735/1 GB RAM): ata0 master is a
 WD 120 GB, ata1 master is a Seagate 120 GB, both UDMA100.  The error has
 never occurred on it.
 
 -- 
 Best regards,
  Andrey                            mailto:samspeedu@mail.ru
 

From: Gavin Atkinson <gavin.atkinson@ury.york.ac.uk>
To: bug-followup@FreeBSD.org, jhitt25@swbell.net
Cc:  
Subject: Re: kern/78216: WRITE_DMA UDMA ICRC errors while copying data to a
	disk
Date: Thu, 24 May 2007 17:03:24 +0100

 Hi,
 
 Is this still a problem with more recent versions of FreeBSD?  If so,
 have you tried changing the cables for known-good 80-conductor cables?
 An ICRC error usually means that the data was corrupted between the
 motherboard and the drive (the I in ICRC means Interface), so the cable
 is usually the first suspect.
 
 Thanks,
 Gavin
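 
 The idea behind an interface CRC can be sketched as follows: both ends
 compute a checksum over the same data burst, and a mismatch means the data
 changed in transit (for example, on a marginal cable).  This is only an
 illustration in Python using a generic CRC-16 with polynomial 0x1021 and
 an initial value of 0xFFFF.  Those parameters and the function names are
 assumptions chosen for the example; the actual Ultra DMA CRC is computed
 in hardware by the host and the drive and differs in detail.

```python
def crc16_ccitt(data: bytes, crc: int = 0xFFFF) -> int:
    """Bitwise CRC-16, polynomial x^16 + x^12 + x^5 + 1 (0x1021), MSB first."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def transfer_ok(sent: bytes, received: bytes) -> bool:
    """Each side computes its own CRC; a mismatch signals corruption in transit."""
    return crc16_ccitt(sent) == crc16_ccitt(received)


if __name__ == "__main__":
    sector = bytes(512)                       # what the host sent
    damaged = bytes(511) + b"\x01"            # one bit flipped on the way
    print("clean transfer ok:  ", transfer_ok(sector, sector))
    print("damaged transfer ok:", transfer_ok(sector, damaged))
```

 Because the check covers the path between controller and drive, a failure
 points at that path before it points at the disk media or the filesystem.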
State-Changed-From-To: open->feedback 
State-Changed-By: linimon 
State-Changed-When: Thu May 24 18:53:11 UTC 2007 
State-Changed-Why:  
Note that submitter has been asked for feedback. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=78216 
State-Changed-From-To: feedback->closed 
State-Changed-By: linimon 
State-Changed-When: Thu May 24 19:08:39 UTC 2007 
State-Changed-Why:  
Submitter's email address bounces. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=78216 
>Unformatted:
