From nobody@FreeBSD.org  Sun Feb  6 10:25:35 2005
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 91C5A16A4CE
	for <freebsd-gnats-submit@FreeBSD.org>; Sun,  6 Feb 2005 10:25:35 +0000 (GMT)
Received: from www.freebsd.org (www.freebsd.org [216.136.204.117])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 7405943D2F
	for <freebsd-gnats-submit@FreeBSD.org>; Sun,  6 Feb 2005 10:25:35 +0000 (GMT)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.13.1/8.13.1) with ESMTP id j16APZJF078018
	for <freebsd-gnats-submit@FreeBSD.org>; Sun, 6 Feb 2005 10:25:35 GMT
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.13.1/8.13.1/Submit) id j16APZaf078017;
	Sun, 6 Feb 2005 10:25:35 GMT
	(envelope-from nobody)
Message-Id: <200502061025.j16APZaf078017@www.freebsd.org>
Date: Sun, 6 Feb 2005 10:25:35 GMT
From: Yuri <yuri@tsoft.com>
To: freebsd-gnats-submit@FreeBSD.org
Subject: File cache gets corrupted, system randomly hangs, sometimes with disk corruption
X-Send-Pr-Version: www-2.3

>Number:         77163
>Category:       kern
>Synopsis:       File cache gets corrupted, system randomly hangs, sometimes with disk corruption
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    freebsd-bugs
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sun Feb 06 10:30:24 GMT 2005
>Closed-Date:    Sun Feb 06 20:47:29 GMT 2005
>Last-Modified:  Mon Feb  7 06:50:05 GMT 2005
>Originator:     Yuri
>Release:        5.3-RELEASE i386 on AMD64
>Organization:
NA
>Environment:
FreeBSD xxx.xxx.org 5.3-RELEASE FreeBSD 5.3-RELEASE #0: Fri Nov  5 04:19:18 UTC 2004     root@harlow.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC  i386
>Description:
My system periodically hangs, and sometimes there is disk corruption after reboot.

While the system was running, I compared a number of identical files between the hard drive and their CD copies. The comparison fails randomly (about one out of 100 files, in 560MB worth of files).
Once a file is in the cache, it fails every time the comparison is repeated. Once it is evicted from the cache, it no longer fails, but some other file will. Sometimes the bad copy is the one from the CD, sometimes the one from the HD.

The differences I've spotted in the files are runs of 1 to ~32 contiguous bytes.
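
A minimal sketch of this kind of comparison, for anyone who wants to check a similar setup. It is not the procedure actually used for this report: the function names and the chunk size are hypothetical, and it assumes both copies are mounted as ordinary directory trees with the same relative layout. It reports each mismatch as (offset, length) runs, so short contiguous runs like the ones described above are easy to spot.

```python
import os
import sys


def diff_runs(path_a, path_b, chunk_size=1 << 20):
    """Return a list of (offset, length) runs where the two files differ."""
    runs = []
    start = None  # offset where the currently open differing run began
    offset = 0    # absolute offset of the chunk being examined
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if not a and not b:
                break
            if a == b:
                # Identical chunk: close any run left open by the previous one.
                if start is not None:
                    runs.append((start, offset - start))
                    start = None
                offset += len(a)
                continue
            n = max(len(a), len(b))
            for i in range(n):
                # Bytes past the end of the shorter file count as differing.
                same = i < len(a) and i < len(b) and a[i] == b[i]
                if not same and start is None:
                    start = offset + i
                elif same and start is not None:
                    runs.append((start, offset + i - start))
                    start = None
            offset += n
    if start is not None:
        runs.append((start, offset - start))
    return runs


def compare_trees(root_a, root_b):
    """Compare every file under root_a with the same relative path in root_b."""
    mismatches = {}
    for dirpath, _dirs, files in os.walk(root_a):
        for name in files:
            path_a = os.path.join(dirpath, name)
            rel = os.path.relpath(path_a, root_a)
            path_b = os.path.join(root_b, rel)
            if not os.path.isfile(path_b):
                continue  # file has no counterpart in the other tree
            runs = diff_runs(path_a, path_b)
            if runs:
                mismatches[rel] = runs
    return mismatches


if __name__ == "__main__" and len(sys.argv) == 3:
    for rel, runs in sorted(compare_trees(sys.argv[1], sys.argv[2]).items()):
        print(rel, runs)
```

Running it several times in a row over the same two trees shows whether a given file keeps failing while it stays cached and whether the failures move to other files later.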

What's unusual about my system:
* I have a SATA RAID disk array (2 identical Maxtor 6Y120M0/YAR51HW0 disks, mirrored), which I believe became supported only in 5.3.
* I run the i386 release on AMD64 hardware.
* I have a recent NVidia card, but the problem happens even without its drivers installed.

To my mind, the fact that the copy read from the CD can also differ shows that it's not the HD hardware.
And it's not memory: I've run with each of the two 512MB memory modules separately, and it happens with both of them.

It looks like something in the kernel is doing a bad write to memory.

I know this is a tough one, but I am lost with this problem.

>How-To-Repeat:
N/A
>Fix:
N/A
>Release-Note:
>Audit-Trail:

From: Poul-Henning Kamp <phk@phk.freebsd.dk>
To: freebsd-gnats-submit@FreeBSD.org
Cc:  
Subject: Re: misc/77163
Date: Sun, 06 Feb 2005 14:11:56 +0100

 This sounds exactly like the stuff I fought for half a year.
 
 My motherboard was an Arima/Rioworks HDAMA and something on that board
 just didn't like Promise chips.  This was a bit of a problem as the
 onboard SATA channels are Promise.
 
 I've heard that recent BIOS updates should have fixed it, but I have
 not been able to check it.
 
 If you have an HDAMA motherboard and a BIOS upgrade does not fix it,
 return the board and tell them that you have the "promise data corruption
 problem" and want a board that works.
 
 -- 
 Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
 phk@FreeBSD.ORG         | TCP/IP since RFC 956
 FreeBSD committer       | BSD since 4.3-tahoe
 Never attribute to malice what can adequately be explained by incompetence.

From: David Malone <dwmalone@maths.tcd.ie>
To: Yuri <yuri@tsoft.com>
Cc: freebsd-gnats-submit@FreeBSD.org
Subject: Re: misc/77163: File cache gets corrupted, system randomly hangs, sometimes with disk corruption
Date: Sun, 6 Feb 2005 17:19:05 +0000

 On Sun, Feb 06, 2005 at 10:25:35AM +0000, Yuri wrote:
 > Difference in copy of file coming from CD to my mind is telling that it's not HD hardware.
 > And it's not memory: I've ran each of two 512MB memory cards separately -- happens on both of them.
 > 
 > Looks like someone in kernel does a bad write in the memory.
 
 We saw a problem like this once and it was the disk controller's fault:
 sometimes it wouldn't finish the DMA of data into memory.
 
 	David.

From: Yuri <yuri@tsoft.com>
To: David Malone <dwmalone@maths.tcd.ie>
Cc: freebsd-gnats-submit@FreeBSD.org
Subject: Re: misc/77163: File cache gets corrupted, system randomly hangs,
 sometimes with disk corruption
Date: Sun, 06 Feb 2005 12:26:39 -0800

 >We saw a problem like this once and it was the disk controller's fault:
 >sometimes it wouldn't finish the DMA of data into memory.
 >
 >	David.
 >  
 >
 I upgraded the BIOS and the problem seems to be gone.
 
 Just for the record: the motherboard is an MSI MS-6702; the BIOS was v.1.0,
 upgraded to v.2.0; it has a Promise SATA controller by Marvell.
 
 Thank you!
 Yuri
State-Changed-From-To: open->closed 
State-Changed-By: linimon 
State-Changed-When: Sun Feb 6 20:47:04 GMT 2005 
State-Changed-Why:  
Submitter notes that the problem went away after a BIOS upgrade. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=77163 

From: Yuri <yuri@tsoft.com>
To: freebsd-gnats-submit@FreeBSD.org, yuri@tsoft.com
Cc:  
Subject: Re: kern/77163: File cache gets corrupted, system randomly hangs,
 sometimes with disk corruption
Date: Sun, 06 Feb 2005 22:43:05 -0800

 Also for the record: the BIOS update also fixed a memory clock problem.
 The DDR400 memory was unable to run at 400, only at 300; after the upgrade
 that problem is gone as well.
 
 
 Yuri
>Unformatted:
