From nobody@FreeBSD.org  Fri Mar  7 22:15:07 2008
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id DE9C51065674
	for <freebsd-gnats-submit@FreeBSD.org>; Fri,  7 Mar 2008 22:15:07 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (www.freebsd.org [IPv6:2001:4f8:fff6::21])
	by mx1.freebsd.org (Postfix) with ESMTP id CDC0D8FC16
	for <freebsd-gnats-submit@FreeBSD.org>; Fri,  7 Mar 2008 22:15:07 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.14.2/8.14.2) with ESMTP id m27MC1Sc048841
	for <freebsd-gnats-submit@FreeBSD.org>; Fri, 7 Mar 2008 22:12:01 GMT
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.14.2/8.14.1/Submit) id m27MC1k3048840;
	Fri, 7 Mar 2008 22:12:01 GMT
	(envelope-from nobody)
Message-Id: <200803072212.m27MC1k3048840@www.freebsd.org>
Date: Fri, 7 Mar 2008 22:12:01 GMT
From: Bernard Steiner <zdbs@lif.de>
To: freebsd-gnats-submit@FreeBSD.org
Subject: data rot on disk
X-Send-Pr-Version: www-3.1
X-GNATS-Notify:

>Number:         121481
>Category:       kern
>Synopsis:       [gmirror] data rot on disk with gmirror
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-geom
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          change-request
>Submitter-Id:   current-users
>Arrival-Date:   Fri Mar 07 22:20:00 UTC 2008
>Closed-Date:    Tue Jul 07 11:07:08 UTC 2009
>Last-Modified:  Mon Jul 13 09:20:01 UTC 2009
>Originator:     Bernard Steiner
>Release:        6.3
>Organization:
>Environment:
FreeBSD grimma 6.3-STABLE FreeBSD 6.3-STABLE #16: Wed Mar  5 19:14:26 CET 2008     root@grimma.anydomain.de:/usr/obj/usr/src/sys/GRIMMA  amd64

>Description:
Setup:
I keep around ten thousand digital images from my camera on disk.
The disk in question is a geom mirror (using load balancing) of two
same size partitions (both h) on two slices (both 1) on two identical
ST3500630AS/3.AAK SATA drives connected via standard SATA cables to
the same VIA 6421 SATA150 controller.
The gmirror contains a ufs2 file system and _is_usually_mounted_read-only_.
The only time I mount the file system read-write is when I actually
copy new images to it. 
May I also point out that I
(a) changed to the amd64 platform,
(b) installed the file system mentioned above and
(c) started using GNOME which for some reason or other has crashed (or
    rendered unusable, as in destroyed console access) my system on a
    daily basis, at about the same time.
 
After copying said images to the file system, I computed md5 sums for all
of them. Now, a few weeks later, some of the md5 sums no longer match.
 
Problem:
It is obvious that some data rot is going on somewhere.
It is not yet clear when and how this data rot occurs.
The obvious suspect would be a hardware problem; alas, the
kernel does not report one.

Note that graid3 only provides EIO on parity mis-match, and gmirror does
not even provide for that.
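
For illustration only, the verify-on-read idea referred to above can be
sketched in Python as follows. This is not graid3's code: the function names
and the in-memory "blocks" are hypothetical, and the sketch only shows the
principle of XORing the data blocks, comparing the result with the stored
parity block, and failing the read with EIO when they disagree.

```python
import errno
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def verified_read(data_blocks, parity_block):
    """Return the joined data, or raise EIO if the parity does not match.

    Hypothetical helper: data_blocks stands in for what was read from the
    data components, parity_block for what was read from the parity one.
    """
    if xor_blocks(data_blocks) != parity_block:
        raise OSError(errno.EIO, "parity mismatch on read")
    return b"".join(data_blocks)
```

A plain two-way mirror has no third piece of information of this kind, so a
disagreement between the two copies can be detected by comparing them but
not attributed to one side without an external checksum.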

I have now set up a ufs2 file system on top of a gmirror built from two
geli devices, both using hmac/md5.
 
I request that geli be enhanced with a "none" encryption option.

I also request that a geom class providing RAID-6 (or double parity)
functionality, possibly with parity auto-repair, be developed.

>How-To-Repeat:

>Fix:


>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-bugs->pjd 
Responsible-Changed-By: rwatson 
Responsible-Changed-When: Sat Mar 8 12:34:07 UTC 2008 
Responsible-Changed-Why:  
Assign to Pawel, who has done a lot of work in these areas, and wrote GELI. 
FYI, ZFS has auto-checksumming of all data and integration with RAID, so 
you might want to look at its feature set as well. 


http://www.freebsd.org/cgi/query-pr.cgi?pr=121481 
Responsible-Changed-From-To: pjd->freebsd-geom 
Responsible-Changed-By: linimon 
Responsible-Changed-When: Thu May 28 22:16:48 UTC 2009 
Responsible-Changed-Why:  
pjd is not actively working on GEOM at the moment. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=121481 
State-Changed-From-To: open->closed 
State-Changed-By: ivoras 
State-Changed-When: Tue Jul 7 11:05:46 UTC 2009 
State-Changed-Why:  
Mostly irrelevant, RAID1 does not provide checksumming / data consistency 
checks that would catch bit-rot errors. (See ZFS for alternatives). 

http://www.freebsd.org/cgi/query-pr.cgi?pr=121481 

From: "Steiner, Bernard" <Bernard.Steiner@lahmeyer.de>
To: <bug-followup@FreeBSD.org>
Cc: <bernard.steiner@lahmeyer.de>
Subject: Re: kern/121481: [gmirror] data rot on disk with gmirror
Date: Sun, 12 Jul 2009 22:11:36 +0200

 That was most unhelpful.
 
 The very reason I asked for data consistency checks was because graid3
 at least seems to have -w for checking as opposed to gmirror.
 
 ZFS might be nice, but (I quote)...
 
 WARNING: ZFS is considered to be an experimental feature in FreeBSD.
 
 Time for me to move to a serious operating system, I guess.
 
 Bernard
 
 
 
 -- 
 i.A. Dipl.-Inform. Bernard Steiner
 Netzwerk- und Systemadministrator
 Phone: +49 6101 55 1280, Fax: +49 6101 55 1623
 
 Lahmeyer International GmbH
 Friedberger Strasse 173, 61118 Bad Vilbel, Deutschland/Germany
 
 Geschaeftsfuehrer/Managing Directors:
 Dr. Henning Nothdurft (Vorsitzender/President), Burkhard Neumann
 
 Firmensitz/Registered office: Bad Vilbel
 Registergericht/Registry court: Frankfurt am Main HRB 80852
 
 Internet: http://www.lahmeyer.de/
 Disclaimer: http://www.lahmeyer.de/disclaimer/
 

From: Dan Naumov <dan.naumov@gmail.com>
To: bug-followup@FreeBSD.org, zdbs@lif.de
Cc:  
Subject: Re: kern/121481: [gmirror] data rot on disk with gmirror
Date: Mon, 13 Jul 2009 11:23:32 +0300

 Bernard, while I understand your frustration, you are barking up the wrong tree.
 
 RAID offers protection against very specific kinds of disk failure and
 does not offer any kind of protection against bit rot. I want to
 emphasize that this is not a FreeBSD issue, but a RAID issue in
 general and you will run into the exact same limitations if you try RAID
 on Linux or Windows, or hardware RAID from any hardware vendor. Another
 example of a fault that a RAID mirror will NOT protect you from, or
 even warn you about, is your disk/RAID controller going berserk and
 writing garbage to the mirror or one of its member disks.
 
 If you are happy with just getting a warning when files somewhere are
 silently getting corrupted, this can easily be implemented with
 existing tools: there are plenty of checksumming utilities you can use
 to checksum your datasets, and you could set up a cron job that checks
 your files against a known hash database and mails you a list of all
 the files (if any) that have changed. When properly configured, this
 can also help with intrusion detection, as it can help detect all new
 or changed files
 on the system :)
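
As a rough sketch of the kind of check described above, assuming Python is
available on the machine: the function names, the use of SHA-256, and the
dictionary that stands in for the "known hash database" are all choices made
for this example, not part of any existing tool.

```python
import hashlib
import os


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Hash one file in chunks so large images need not fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root, algorithm="sha256"):
    """Map each file's path (relative to root) to its digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = file_digest(full, algorithm)
    return manifest


def verify(root, known, algorithm="sha256"):
    """Compare the tree under root with a previously saved manifest.

    Returns three sorted lists: changed, missing, and newly added paths.
    """
    current = build_manifest(root, algorithm)
    changed = sorted(p for p in known if p in current and current[p] != known[p])
    missing = sorted(p for p in known if p not in current)
    added = sorted(p for p in current if p not in known)
    return changed, missing, added
```

A script that saves the manifest once, later calls verify() and prints any
non-empty list could be run from cron, which normally mails a job's output
to its owner.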
 
 However, if you require not only a warning, but also automatic
 recovery and healing from such corruption, your only option is ZFS and
 if you have evaluated the state of ZFS in FreeBSD and concluded that
 it's not mature enough for your needs, then your only other option is
 Solaris.
 
 - Sincerely,
 Dan Naumov

From: "Steiner, Bernard" <Bernard.Steiner@lahmeyer.de>
To: "Dan Naumov" <dan.naumov@gmail.com>, <bug-followup@FreeBSD.org>
Cc:  
Subject: RE: kern/121481: [gmirror] data rot on disk with gmirror
Date: Mon, 13 Jul 2009 10:50:02 +0200

 Dan,
 
 > RAID offers protection against very specific kinds of disk 
 > failure and does not offer any kind of protection against bit 
 > rot. I want to emphasize that this is not a FreeBSD issue, 
 > but a RAID issue in general and you will run into exact same 
 > limitations if you try raid on Linux or Windows or hardware
 
 I was asking for -w to be implemented by gmirror, and/or graid6
 (double parity) be implemented (also with -w or even -w2 ;-)
 
 > raid from any hardware vendor. For another example of a fault 
 > that RAID mirror will NOT protect you or even warn you 
 > against, is your disk/raid controller going berserk and 
 > writing garbage to the mirror or one of its member disks.
 
 ACK. This is exactly why I want a check on the data read.
 
 > [checksumming utilities]
 
 Please explain how to do that on both sides of a gmirror.
 AFAIK, gmirror can be configured in the following ways:
 (1) always read from "primary" disk => cannot check secondary
 (2) round robin or load => read cannot be reliably reproduced
 
 Correct me if I am wrong, but this does not seem like a solution
 to my problem.
 
 > [ZFS / Solaris]
 
 I think I *like* ZFS (raidz2) and will probably go with that.
 Solaris' future is uncertain in the light of Sun's future...
 
 I think maybe I'll wait a while till the warning is edited out
 of ZFS in FreeBSD and give it another shot.
 
 Bernard
 

From: Dan Naumov <dan.naumov@gmail.com>
To: "Steiner, Bernard" <Bernard.Steiner@lahmeyer.de>
Cc: bug-followup@freebsd.org
Subject: Re: kern/121481: [gmirror] data rot on disk with gmirror
Date: Mon, 13 Jul 2009 12:16:33 +0300

 On Mon, Jul 13, 2009 at 11:50 AM, Steiner,
 Bernard<Bernard.Steiner@lahmeyer.de> wrote:
 >> [checksumming utilities]
 >
 > Please explain how to do that on both sides of a gmirror.
 > AFAIK, gmirror can be configured in the following ways:
 > (1) always read from "primary" disk => cannot check secondary
 > (2) round robin or load => read cannot be reliably reproduced
 >
 > Correct me if I am wrong, but this does not seem like a solution
 > to my problem.
 
 You have several options:
 
 Option 1 (this has the benefit of working with all balance algorithms):
 Take disk2 offline, run checksum check (so that checks are done against disk1)
 Take disk2 online, take disk1 offline, run checksum check
 
 Option 2 (for "prefer" algorithm):
 Assuming disk1 is the promoted disk, run checksum check
 Promote disk2, run checksum check
 Promote disk1 to return to original state
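
With either option, the outcome is two checksum runs, one per disk. A small
sketch of how the two results might be compared against the checksums
recorded when the files were first copied is given below. It is Python; the
function name and the dictionaries mapping file path to digest are
hypothetical, and it assumes each run really read its data from the intended
disk rather than from a cache.

```python
def compare_sides(side1, side2, reference):
    """Report which mirror side disagrees with the reference checksums.

    side1, side2: path -> digest, taken while reads came from disk1 / disk2.
    reference:    path -> digest, recorded when the files were written.
    Returns a list of (path, suspect) with suspect in
    {"disk1", "disk2", "both"}; files that match on both sides are omitted.
    """
    report = []
    for path in sorted(reference):
        good = reference[path]
        ok1 = side1.get(path) == good
        ok2 = side2.get(path) == good
        if ok1 and ok2:
            continue
        if ok1:
            suspect = "disk2"
        elif ok2:
            suspect = "disk1"
        else:
            suspect = "both"
        report.append((path, suspect))
    return report
```

A file reported as "both" did not match the reference on either disk, which
would point at corruption before or during the original write rather than at
one disk decaying on its own.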
 
 - Sincerely,
 Dan Naumov
>Unformatted:
