From hadara@bsd.ee  Mon Dec 20 16:38:38 2004
Return-Path: <hadara@bsd.ee>
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 2205916A4CE
	for <FreeBSD-gnats-submit@freebsd.org>; Mon, 20 Dec 2004 16:38:38 +0000 (GMT)
Received: from bsd.ee (bsd.ee [194.126.101.115])
	by mx1.FreeBSD.org (Postfix) with SMTP id 1352643D41
	for <FreeBSD-gnats-submit@freebsd.org>; Mon, 20 Dec 2004 16:38:37 +0000 (GMT)
	(envelope-from hadara@bsd.ee)
Received: (qmail 90712 invoked by uid 1000); 20 Dec 2004 16:39:15 -0000
Message-Id: <20041220163915.90711.qmail@bsd.ee>
Date: 20 Dec 2004 16:39:15 -0000
From: Sven Petai <hadara@bsd.ee>
Reply-To: Sven Petai <hadarai@bsd.ee>
To: FreeBSD-gnats-submit@freebsd.org
Cc: sos@freebsd.org
Subject: ATA DMA broken on PCalpha
X-Send-Pr-Version: 3.113
X-GNATS-Notify:

>Number:         75317
>Category:       alpha
>Synopsis:       [busdma] [patch] ATA DMA broken on PCalpha
>Confidential:   no
>Severity:       critical
>Priority:       medium
>Responsible:    freebsd-alpha
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Mon Dec 20 16:40:12 GMT 2004
>Closed-Date:    Mon Feb 11 19:17:41 UTC 2008
>Last-Modified:  Mon Feb 11 19:17:41 UTC 2008
>Originator:     Sven Petai
>Release:        FreeBSD 6.0-CURRENT
>Organization:
NPO BSD Estonia
>Environment:
>Description:
The machine fails to boot, with varying symptoms, after mounting root: sometimes
it hangs, sometimes it takes a machine check, etc. I traced it down to the introduction
of version 1.129 of src/sys/dev/ata/ata-dma.c, which among other changes removes the code
that split segments into page-sized chunks to work around what appears to be a
bug in busdma. That code was commented as:
"A maximum segment size was specified for bus_dma_tag_create, but
 some busdma code does not seem to honor this, so fix up if needed."

System: FreeBSD alpha 6.0-CURRENT FreeBSD 6.0-CURRENT #23: Mon Dec 20 05:07:30 EET 2004     root@alpha:/mnt/disk/obj/usr/src/sys/HADARA  alpha
The dmesg can be found at http://bsd.ee/~hadara/debug/pcalpha/kernel.txt
but the relevant details should be:
Digital AlphaPC 164LX 533 MHz, 531MHz
8192 byte page size, 1 processor.
...
ad0: 1226MB <FUJITSU M1636TAU/5045> [2491/16/63] at ata0-master WDMA2
ad1: 3052MB <SAMSUNG SV0322A/JK200-36> [11024/9/63] at ata0-slave WDMA2
...
This is an IDE-only machine.

>How-To-Repeat:
Just try to boot FreeBSD 5.3-RELEASE on similar hardware.

>Fix:
An easy but ugly workaround is to disable ATA DMA from the loader with
set hw.ata.ata_dma=0
A slightly better workaround may be the following patch, which simply reverts to
the previous behaviour:

--- sys/dev/ata/ata-dma.c.orig  Mon Dec 20 08:27:25 2004
+++ sys/dev/ata/ata-dma.c       Mon Dec 20 08:47:15 2004
@@ -198,16 +198,22 @@
 {
     struct ata_dmasetprd_args *args = xsc;
     struct ata_dma_prdentry *prd = args->dmatab;
-    int i;
+    int i,j;
+    bus_size_t cnt;
+    u_int32_t lastcount;

     if ((args->error = error))
        return;

+    lastcount = j = 0;
     for (i = 0; i < nsegs; i++) {
-       prd[i].addr = htole32(segs[i].ds_addr);
-       prd[i].count = htole32(segs[i].ds_len);
+       for (cnt = 0; cnt < segs[i].ds_len; cnt += PAGE_SIZE, j++) {
+             prd[j].addr = htole32(segs[i].ds_addr + cnt);
+             lastcount = ulmin(segs[i].ds_len - cnt, PAGE_SIZE) & 0xffff;
+             prd[j].count = htole32(lastcount);
+        }
     }
-    prd[i - 1].count |= htole32(ATA_DMA_EOT);
+    prd[j - 1].count |= htole32(lastcount | ATA_DMA_EOT);
 }

 static int


Of course, the real solution would be to find and fix the bug in the busdma code.
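
The chunking that the patch above restores can be tried outside the kernel. The following is only a user-space sketch of the same idea, not the driver code: the struct and function names are hypothetical, the 8192-byte page size is taken from the dmesg excerpt, the end-of-table flag value is an assumption, and the little-endian conversion (htole32) is left out.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 8192u        /* Alpha page size, per the dmesg excerpt */
#define SKETCH_DMA_EOT   0x80000000u  /* assumed end-of-table flag value */

/* Hypothetical stand-ins for a busdma segment and an ATA PRD entry. */
struct sketch_seg { uint32_t addr; uint32_t len; };
struct sketch_prd { uint32_t addr; uint32_t count; };

/*
 * Split every segment into chunks of at most one page and write one PRD
 * entry per chunk.  Returns the number of entries written, or 0 if the
 * table is too small (or there was nothing to transfer).
 */
static size_t
sketch_fill_prd(const struct sketch_seg *segs, size_t nsegs,
    struct sketch_prd *prd, size_t maxprd)
{
	size_t i, j = 0;
	uint32_t off, chunk;

	for (i = 0; i < nsegs; i++) {
		for (off = 0; off < segs[i].len; off += chunk) {
			chunk = segs[i].len - off;
			if (chunk > SKETCH_PAGE_SIZE)
				chunk = SKETCH_PAGE_SIZE;
			if (j == maxprd)
				return (0);
			prd[j].addr = segs[i].addr + off;
			prd[j].count = chunk & 0xffff;
			j++;
		}
	}
	/* Mark the last entry so the controller knows where the table ends. */
	if (j > 0)
		prd[j - 1].count |= SKETCH_DMA_EOT;
	return (j);
}
```

Like the patch, this cuts each segment every PAGE_SIZE bytes counted from the start of the segment, so the pieces stay within a page only if the segment itself starts on a page boundary.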
>Release-Note:
>Audit-Trail:

From: Sven Petai <hadara@bsd.ee>
To: freebsd-gnats-submit@freebsd.org
Cc:  
Subject: Re: alpha/75317: ATA DMA broken on PCalpha
Date: Wed, 22 Dec 2004 19:24:16 +0200

 I put the wrong link in the original
 bug report. The correct dmesg -v output is located at
 http://bsd.ee/~hadara/debug/pcalpha/pcalpha_panic_08.11.2004.txt
 
 
 I think I now understand what causes this bug.
 First of all, there seems to be a slight bug in Alpha's
 bus_dmamap_load(), which could be fixed with the following
 patch:
 
 --- sys/alpha/alpha/busdma_machdep.c.orig       Wed Dec 22 17:11:27 2004
 +++ sys/alpha/alpha/busdma_machdep.c    Wed Dec 22 17:13:57 2004
 @@ -581,7 +581,8 @@
                 if (sg->ds_len == 0) {
                         sg->ds_addr = paddr + alpha_XXX_dmamap_or;
                         sg->ds_len = size;
 -               } else if (paddr == nextpaddr) {
 +               } else if (paddr == nextpaddr &&
 +            (sg->ds_len + size) <= dmat->maxsegsz) {
                         sg->ds_len += size;
                 } else {
                         /* Go to the next segment */
 
 
 Without that, we could indeed return segments larger than the
 maxsegsz passed to bus_dma_tag_create().
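
The effect of the extra condition can be seen in a small user-space sketch. This is not the busdma_machdep.c code; the struct and function names are hypothetical, and it assumes each chunk handed in is itself no larger than maxsegsz.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a scatter/gather segment. */
struct sg_entry { uint64_t addr; uint64_t len; };

/*
 * Add the physical chunk (paddr, size) to the segment list.  The chunk is
 * merged into the previous segment only if it is physically contiguous
 * AND the merged length stays within maxsegsz; otherwise a new segment
 * is started.  Returns the new number of used segments, or 0 if the list
 * is full.
 */
static size_t
sg_append(struct sg_entry *sg, size_t used, size_t maxsegs,
    uint64_t paddr, uint64_t size, uint64_t maxsegsz)
{
	if (used > 0) {
		struct sg_entry *last = &sg[used - 1];

		if (paddr == last->addr + last->len &&
		    last->len + size <= maxsegsz) {
			last->len += size;
			return (used);
		}
	}
	if (used == maxsegs)
		return (0);
	sg[used].addr = paddr;
	sg[used].len = size;
	return (used + 1);
}
```

With maxsegsz set to 16384, three contiguous 8192-byte chunks end up as one 16384-byte segment followed by one 8192-byte segment. Without the length check they would be merged into a single 24576-byte segment.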
 
 Anyway, this still doesn't make things work correctly for me,
 because the real problem seems to be the Pyxis
 page-crossing bug. Basically, it comes down to corruption of
 DMA transfers larger than 8k. It didn't cause problems before,
 since we never did transfers larger than PAGE_SIZE before
 the ATA DMA change mentioned in the original report.
 There is detection code for the buggy chip in
 src/sys/alpha/pci/cia.c
 but it's a little too naive, since it assumes only DEC_ST550 can
 have it; in reality the chip seems to be used in some very early
 revisions of the 164LX (SX too?). There doesn't seem to be a
 reliable way to detect whether we have the faulty chip, since
 it was worked around in later revisions by changes made
 elsewhere.
 
 One possible easy solution would be to hack ata_dmaalloc() to
 pass PAGE_SIZE as the maximum segment size argument to
 bus_dma_tag_create() if the machine has a Pyxis chip at all,
 no matter whether it's faulty or not.
 Would that be acceptable, and if so, what is the best way to propagate
 knowledge about the existence of this chip from the cia driver to ATA?
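
A segment that is no longer than a page can still straddle a page boundary if it starts mid-page. As an illustration only, here is a user-space sketch of the stricter rule: cut a transfer at every page boundary so that no piece crosses one. The names are hypothetical, the 8192-byte page size is the Alpha value from the original report, and this is not the kernel code.

```c
#include <stddef.h>
#include <stdint.h>

#define PYXIS_SKETCH_PAGE 8192u   /* Alpha page size, per the original report */

/* Hypothetical stand-in for one scatter/gather element. */
struct dma_piece { uint64_t addr; uint64_t len; };

/*
 * Split the range [addr, addr + len) at page boundaries.  Each output
 * piece lies entirely within a single page.  Returns the number of
 * pieces written, or 0 if the output array is too small.
 */
static size_t
split_at_pages(uint64_t addr, uint64_t len, struct dma_piece *out,
    size_t maxout)
{
	size_t n = 0;

	while (len > 0) {
		/* Bytes left before the next page boundary. */
		uint64_t room = PYXIS_SKETCH_PAGE -
		    (addr & (PYXIS_SKETCH_PAGE - 1));
		uint64_t take = len < room ? len : room;

		if (n == maxout)
			return (0);
		out[n].addr = addr;
		out[n].len = take;
		n++;
		addr += take;
		len -= take;
	}
	return (n);
}
```

For example, a 10000-byte transfer starting at 0x3000 (4096 bytes into a page) is split into a 4096-byte piece at 0x3000 and a 5904-byte piece at 0x4000.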

From: Nathan Whitehorn <nathanw@uchicago.edu>
To: bug-followup@FreeBSD.org, hadarai@bsd.ee
Cc:  
Subject: Re: alpha/75317: [ata] [busdma] ATA DMA broken on PCalpha
Date: Tue, 07 Mar 2006 16:24:43 -0600

 This occurs on my 164SX running 6.1-PRERELEASE and can be reproduced on 
 6.0-RELEASE and 5.4-STABLE as well. None of the above patches fix the 
 problem on 6.0. With ATA drives in the system, and DMA on, machine 
 checks occur every few hours at best, every few minutes at worst. With 
 the same drives attached by firewire, I haven't had a single machine 
 check ever.
State-Changed-From-To: open->closed 
State-Changed-By: remko 
State-Changed-When: Mon Mar 26 20:36:44 UTC 2007 
State-Changed-Why:  
alpha is no longer supported and time will not be invested to get things 
fixed. Our apologies for our lack of attention for this issue. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=75317 
State-Changed-From-To: closed->open 
State-Changed-By: remko 
State-Changed-When: Tue Mar 27 05:28:25 UTC 2007 
State-Changed-Why:  
People are actively working on this, reopen the ticket (Thanks John, again my apologies) 

http://www.freebsd.org/cgi/query-pr.cgi?pr=75317 

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: alpha/75317: commit references a PR
Date: Tue, 27 Nov 2007 17:43:56 +0000 (UTC)

 jhb         2007-11-27 17:43:50 UTC
 
   FreeBSD src repository
 
   Modified files:        (Branch: RELENG_6)
     sys/alpha/alpha      busdma_machdep.c 
   Log:
   Cleanup the alpha bus dma code a bit and sync it up with i386.  Changes
   include:
   - Honor alignment and boundary restrictions on DMA tags by using bounce
     pages for misaligned buffers and not coalescing pages if the resulting
     segment would cross a boundary.
   - Teach the _bus_dmamap_load_buffer() helper function to use bounce pages
     when needed and change bus_dmamap_load() to use the helper function
     instead of largely duplicating it.  As a side effect, this enables bounce
     page support for the other load routines (load_mbuf(), load_mbuf_sg(),
     and load_uio()).
   
   Honoring the boundary restrictions partially helps with the Alpha ATA DMA
   problem.  More work is needed for that however (and forthcoming).
   
   PR:             alpha/75317
   Tested by:      wilko
   Approved by:    re (kensmith)
   
   Revision  Changes    Path
   1.51.2.2  +155 -158  src/sys/alpha/alpha/busdma_machdep.c
 _______________________________________________
 cvs-all@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/cvs-all
 To unsubscribe, send any mail to "cvs-all-unsubscribe@freebsd.org"
 

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: alpha/75317: commit references a PR
Date: Mon, 10 Dec 2007 20:14:23 +0000 (UTC)

 jhb         2007-12-10 20:14:16 UTC
 
   FreeBSD src repository
 
   Modified files:        (Branch: RELENG_6)
     sys/alpha/alpha      busdma_machdep.c 
     sys/alpha/include    md_var.h 
     sys/alpha/pci        cia.c 
   Log:
   - Add a workaround for the DMA bugs on some alpha chipsets that ATA DMA
     trips over often.  Specifically, in these chipsets DMA transfers that
     cross a page boundary result in data corruption.  The workaround is to
     not allow any DMA transfers for non-static DMA maps (i.e. "real"
     transfers as opposed to work areas allocated with bus_dmamem_alloc()) to
     cross a page in a single S/G element.  This behavior is enabled by
     setting 'busdma_pyxis_bug' to 1.
   - Add a new tunable 'machdep.busdma_pyxis_bug' that can be used to enable
     the workaround from the loader.  This can be used to enable it on
     chipsets where we don't automatically enable it.
   - Auto-enable the workaround for buggy PYXIS 1 chipsets supported via
     cia(4).
   
   PR:             alpha/75317
   
   Revision   Changes    Path
   1.51.2.3   +23 -6     src/sys/alpha/alpha/busdma_machdep.c
   1.23.10.1  +1 -0      src/sys/alpha/include/md_var.h
   1.44.2.1   +1 -0      src/sys/alpha/pci/cia.c

State-Changed-From-To: open->closed 
State-Changed-By: jhb 
State-Changed-When: Mon Feb 11 19:16:28 UTC 2008 
State-Changed-Why:  
This is believed to be fixed in 6.3 based on at least one positive user 
report. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=75317 
>Unformatted:
 
