From milt@moth.vicor-nb.com Mon Apr 19 15:09:25 1999
Return-Path: <milt@moth.vicor-nb.com>
Received: from moth.vicor-nb.com (moth.vicor-nb.com [208.206.78.46])
	by hub.freebsd.org (Postfix) with ESMTP id 4D90515577
	for <FreeBSD-gnats-submit@freebsd.org>; Mon, 19 Apr 1999 15:09:11 -0700 (PDT)
	(envelope-from milt@moth.vicor-nb.com)
Received: (from milt@localhost)
	by moth.vicor-nb.com (8.8.5/8.8.5) id PAA05235;
	Mon, 19 Apr 1999 15:08:24 -0700 (PDT)
Message-Id: <199904192208.PAA05235@moth.vicor-nb.com>
Date: Mon, 19 Apr 1999 15:08:24 -0700 (PDT)
From: milt <milt@moth.vicor-nb.com>
To: FreeBSD-gnats-submit@freebsd.org
Cc: daver@moth.vicor-nb.com, jpl@moth.vicor-nb.com
Subject: Invalid files on disk after fsync

>Number:         11226
>Category:       kern
>Synopsis:       Invalid files on disk after fsync
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    alfred
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Mon Apr 19 15:10:00 PDT 1999
>Closed-Date:    Mon Jun 26 11:34:51 PDT 2000
>Last-Modified:  Mon Jun 26 11:36:19 PDT 2000
>Originator:     milt@vicor-nb.com
>Release:        FreeBSD 2.2.6-STABLE
>Organization:
vicor
>Environment:

   We are running a very busy server which includes scsi drives and lots of nfs
   mounts.  df and dmesg are:

52. ? df
Filesystem              1K-blocks     Used    Avail Capacity  Mounted on
/dev/sd0s1a                 59471    21463    33251    39%    /
/dev/sd0s1f               8202004  2540061  5005783    34%    /usr
/dev/sd0s1e                 98479     4681    85920     5%    /var
/dev/sd1c                 8621381        1  7931670     0%    /disk2
procfs                          4        4        0   100%    /proc
opshome:/usr/opshome      8241012  4715979  2865753    62%    /usr/opshome
oos0a:/raid1             60821938 10291873 45664310    18%    /raid1
oos0a:/raid2             60821938 11946475 44009708    21%    /raid2
oos0a:/raid3             60821938    73528 55882655     0%    /raid3
oos0a:/usr/env1/data/lb   1843066   847170   848451    50%    /usr/env1/data/lb
oos0a:/usr/env5/data/lb   1843066   847170   848451    50%    /usr/env5/data/lb
oos0a:/usr/env3/data/lb   1843066   847170   848451    50%    /usr/env3/data/lb
oos0a:/usr/env4/data/lb   1843066   847170   848451    50%    /usr/env4/data/lb


Copyright (c) 1992-1998 FreeBSD Inc.
Copyright (c) 1982, 1986, 1989, 1991, 1993
        The Regents of the University of California.  All rights reserved.

FreeBSD 2.2.6-STABLE #0: Tue Apr 28 15:08:30 PDT 1998
    root@ipm0.lbxrich.vicor-nb.com:/disk2/src.patched/sys/compile/WSS
CPU: Pentium (199.43-MHz 586-class CPU)
  Origin = "GenuineIntel"  Id = 0x544  Stepping=4
  Features=0x8001bf<FPU,VME,DE,PSE,TSC,MSR,MCE,CX8,MMX>
real memory  = 134217728 (131072K bytes)
avail memory = 129515520 (126480K bytes)
Probing for devices on PCI bus 0:
chip0 <Intel 82439TX PCI cache memory controller> rev 1 on pci0:0:0
chip1 <Intel 82371AB PCI-ISA bridge> rev 1 on pci0:7:0
chip2 <Intel 82371AB IDE interface> rev 1 on pci0:7:1
chip3 <Intel 82371AB USB interface> rev 1 int d irq 11 on pci0:7:2
chip4 <Intel 82371AB Power management controller> rev 1 on pci0:7:3
ahc0 <Adaptec 2940 SCSI host adapter> rev 0 int a irq 10 on pci0:9:0
ahc0: aic7870 Single Channel, SCSI Id=7, 16 SCBs
ahc0 waiting for scsi devices to settle
ahc0: target 0 Tagged Queuing Device
(ahc0:0:0): "SEAGATE ST19171N 0023" type 0 fixed SCSI 2
sd0(ahc0:0:0): Direct-Access 8683MB (17783112 512 byte sectors)
ahc0: target 1 Tagged Queuing Device
(ahc0:1:0): "SEAGATE ST19171N 0023" type 0 fixed SCSI 2
sd1(ahc0:1:0): Direct-Access 8683MB (17783112 512 byte sectors)
(ahc0:3:0): "HP C1533A A708" type 1 removable SCSI 2
st0(ahc0:3:0): Sequential-Access density code 0x24, variable blocks, write-enabled
fxp0 <Intel EtherExpress Pro 10/100B Ethernet> rev 1 int a irq 9 on pci0:10:0
fxp0: Ethernet address 00:a0:c9:27:93:fd
vga0 <VGA-compatible display device> rev 0 int a irq 12 on pci0:11:0
Probing for devices on the ISA bus:
sc0 at 0x60-0x6f irq 1 on motherboard
sc0: VGA color <16 virtual consoles, flags=0x0>
sio0 at 0x3f8-0x3ff irq 4 on isa
sio0: type 16550A
sio1 at 0x2f8-0x2ff irq 3 on isa
sio1: type 16550A
psm0 not found at 0x60
fdc0 at 0x3f0-0x3f7 irq 6 drq 2 on isa
fdc0: FIFO enabled, 8 bytes threshold
fd0: 1.44MB 3.5in
npx0 on motherboard
npx0: INT 16 interface
Intel Pentium F00F detected, installing workaround


>Description:

What we notice is invalid file contents.  The sequence is:

1. Our Qft utility writes a file by using a open,write/write/...,fsync,close
   sequence:
      int outFd = open (inPath, O_WRONLY | O_CREAT | O_TRUNC, 0664);
      if (outFd <= 0)
         ReportError ("Unable to open " << inPath << " for writing")

      while (...
      ... receive 8192 bytes from tcp
      if (write (fd, fileBuffer, sizeof (fileBuffer)) != sizeof (fileBuffer))
         ReportError ("Error Writing File " << inPath);

      ... the last write may be less than 8192 bytes.

      if (fsync (outFd)) ReportError ("fsync failure on file " << inPath)
      close (outFd);

2. Some program (maybe a backup utility, maybe Qft) reads the file and receives
   valid contents.

3. Some program reads the file and receives invalid contents.

   *. At this point, the data on disk is permanently invalid.

   *. The problem consists of 8192 wrong bytes which start on a page boundary in
      the middle of the file.  The first time this came up, the bad data
      consisted of two short files which had existed some weeks prior to the
      problem.  The second time this came up, the bad data looked like part of
      an executeable and could have been old or new.

   *. The file date/time modified (from ls -l) remains the time of the initial
      file write from step 1.

THAT IS: we write a file, we read it and find it good, we read it and find it
invalid even though the date hasn't changed.

Probably you will think I am making this up, confused, or incompetent.  Sounds
pretty phenomenal.  Tell me how one could accidentally change the file contents
without changing the access date?

Also, the valid read is happening within 5 to 20 minutes of the initial file
write and may well be seeing data which is in cache but never makes it to
disk.  (The invalid read is happening hours or days later.)


>How-To-Repeat:

   I wish I could.  I spent all week end trying.  The cause is going to have to
   include some unusual interaction or other or I would have duplicated it by
   now.

   We have seen two incidents one month apart on two different hosts at the
   same site (our busiest site).

>Fix:
    
   I wish I knew.  Fortunately it doesn't come up often.

>Release-Note:
>Audit-Trail:

From: "Ted Mittelstaedt" <tedm@toybox.placo.com>
To: <freebsd-gnats-submit@freebsd.org>, <milt@moth.vicor-nb.com>
Cc:  
Subject: Re: kern/11226: Invalid files on disk after fsync
Date: Sun, 12 Sep 1999 00:20:39 -0700

 Hi All,
 
   I was struck by the similarities between this and what happens with our
 news server.
 
   Once every 2 or 3 months I come in to find the server rebooted itself
 during the night - and I'm sure it's due to a SCSI error because the news
 spool is always scrambled.  Usually, multiple invocations of fsck will
 eventually clear the garbage out but I've had to regenerate the spools once
 because of it.
 
   The only one time that I have ever seen anything in the message log
 regarding this is a rather cryptic log entry:
 
 Aug 11 06:12:59 herald /kernel: biodone: buffer already done
 
 This was made seconds before the system rebooted itself during one of those
 times.
 
 The server has done up to 2 million articles in a day and I'm sure that it's
 during the peak times that this has happened.
 
 I suspect there is some sequencing bug or other within the Adaptec SCSI
 driver - of course I'm running with all go-fast options turned on including
 async mounts on the spools.  Output of dmesg follows:
 
 # dmesg
 Copyright (c) 1992-1998 FreeBSD Inc.
 Copyright (c) 1982, 1986, 1989, 1991, 1993
         The Regents of the University of California.  All rights reserved.
 
 FreeBSD 2.2.8-RELEASE #0: Mon Jan 25 00:50:14 PST 1999
     tedm@herald.ipinc.net:/usr/src/sys/compile/HERALD
 CPU: Pentium/P54C (166.19-MHz 586-class CPU)
   Origin = "GenuineIntel"  Id = 0x52c  Stepping=12
   Features=0x1bf<FPU,VME,DE,PSE,TSC,MSR,MCE,CX8>
 real memory  = 134217728 (131072K bytes)
 avail memory = 129216512 (126188K bytes)
 Probing for devices on PCI bus 0:
 chip0 <Intel 82439> rev 3 on pci0:0:0
 ahc0 <Adaptec 2940A Ultra SCSI host adapter> rev 1 int a irq 9 on pci0:2:0
 ahc0: aic7860 Single Channel, SCSI Id=7, 3/8 SCBs
 ahc0 waiting for scsi devices to settle
 ahc0: target 0 Tagged Queuing Device
 (ahc0:0:0): "SEAGATE ST51080N 0958" type 0 fixed SCSI 2
 sd0(ahc0:0:0): Direct-Access 1030MB (2109840 512 byte sectors)
 ahc1 <Adaptec 2940 Ultra SCSI host adapter> rev 1 int a irq 9 on pci0:3:0
 ahc1: aic7880 Wide Channel, SCSI Id=7, 16/255 SCBs
 ahc1 waiting for scsi devices to settle
 Sending SDTR!!
 ahc1: target 4 Tagged Queuing Device
 (ahc1:4:0): "IBM OEM DCHS09F 2222" type 0 fixed SCSI 2
 sd1(ahc1:4:0): Direct-Access 8689MB (17796077 512 byte sectors)
 Sending SDTR!!
 ahc1: target 5 Tagged Queuing Device
 (ahc1:5:0): "IBM OEM DCHS09F 2222" type 0 fixed SCSI 2
 sd2(ahc1:5:0): Direct-Access 8689MB (17796077 512 byte sectors)
 Sending SDTR!!
 ahc1: target 6 Tagged Queuing Device
 (ahc1:6:0): "IBM OEM DCHS09F 2222" type 0 fixed SCSI 2
 sd3(ahc1:6:0): Direct-Access 8689MB (17796077 512 byte sectors)
 ahc2 <Adaptec 2940 Ultra SCSI host adapter> rev 1 int a irq 10 on pci0:4:0
 ahc2: aic7880 Wide Channel, SCSI Id=7, 16/255 SCBs
 ahc2 waiting for scsi devices to settle
 ahc2: target 0 Tagged Queuing Device
 (ahc2:0:0): "SEAGATE ST39102LW 0005" type 0 fixed SCSI 2
 sd4(ahc2:0:0): Direct-Access 8683MB (17783240 512 byte sectors)
 ahc2: target 1 Tagged Queuing Device
 (ahc2:1:0): "SEAGATE ST39102LW 0005" type 0 fixed SCSI 2
 sd5(ahc2:1:0): Direct-Access 8683MB (17783240 512 byte sectors)
 ahc2: target 2 Tagged Queuing Device
 (ahc2:2:0): "SEAGATE ST39102LW 0005" type 0 fixed SCSI 2
 sd6(ahc2:2:0): Direct-Access 8683MB (17783240 512 byte sectors)
 chip1 <Intel 82371SB PCI-ISA bridge> rev 1 on pci0:7:0
 chip2 <Intel 82371SB IDE interface> rev 0 on pci0:7:1
 vga0 <VGA-compatible display device> rev 0 on pci0:15:0
 Probing for devices on the ISA bus:
 sc0 at 0x60-0x6f irq 1 on motherboard
 sc0: VGA color <16 virtual consoles, flags=0x0>
 ed0 at 0x300-0x31f irq 5 maddr 0xdc000 msize 16384 on isa
 ed0: address 00:00:c0:1e:eb:23, type WD8013W (16 bit)
 sio0 at 0x3f8-0x3ff irq 4 on isa
 sio0: type 16550A
 lpt0 at 0x378-0x37f irq 7 on isa
 lpt0: Interrupt-driven port
 lp0: TCP/IP capable interface
 psm0 at 0x60-0x64 irq 12 on motherboard
 psm0: model Generic PS/2 mouse, device ID 0
 fdc0 at 0x3f0-0x3f7 irq 6 drq 2 on isa
 fdc0: FIFO enabled, 8 bytes threshold
 fd0: 1.44MB 3.5in
 wdc0 at 0x1f0-0x1f7 irq 14 on isa
 wdc0: unit 0 (atapi): <685A/8.4D>, removable, dma, iordy
 wcd0: 1171KB/sec, 120KB cache, audio play, 255 volume levels, ejectable tray
 wcd0: no disc inside, unlocked
 npx0 flags 0x1 on motherboard
 npx0: INT 16 interface
 Intel Pentium F00F detected, installing workaround
 ccd0-1: Concatenated disk drivers
 
 Ted
 
 

From: milt@vicor-nb.com (milt)
To: freebsd-gnats-submit@freebsd.org, milt@moth.vicor-nb.com,
	tedm@toybox.placo.com
Cc:  
Subject: Re: kern/11226: Invalid files on disk after fsync
Date: Sun, 12 Sep 1999 14:10:10 -0700 (PDT)

 HIYA
 
 >  I was struck by the similarities between this and what happens with our
 > news server.
 >  
 >    Once every 2 or 3 months I come in to find the server rebooted itself
 >  during the night - and I'm sure it's due to a SCSI error because the news
 >  spool is always scrambled.  Usually, multiple invocations of fsck will
 >  eventually clear the garbage out but I've had to regenerate the spools once
 >  because of it.
 >  
 >    The only one time that I have ever seen anything in the message log
 >  regarding this is a rather cryptic log entry:
 >  
 >  Aug 11 06:12:59 herald /kernel: biodone: buffer already done
 >  
 >  This was made seconds before the system rebooted itself during one of those
 >  times.
 >  
 >  The server has done up to 2 million articles in a day and I'm sure that it's
 >  during the peak times that this has happened.
 >  
 >  I suspect there is some sequencing bug or other within the Adaptec SCSI
 >  driver - of course I'm running with all go-fast options turned on including
 >  async mounts on the spools.  Output of dmesg follows:
 
 This problem has been one of the most frustrating I have encountered.   I
 think, Ted, that you are the only one in the world who ever really
 believed my description.
 
 What I don't see in the current problem report is that we eventually
 realized that the problem was always accompanied by an invalid disk address
 in an inode.  We found that the disk fragment listed for file address
 0xA000 for one or more files had one or both of the bits in 0x18000 cleared.
 
 That still got us nowhere.   It did explain why fsck was going nuts and
 destroying all sorts of things when this happened.
 
 Two things make this frustrating:
 
 1. I tried for weeks to make it happen in a test environment.   Got a few
    failures, but never in any controlled manner so I was never able to
    reproduce the failure reliably.
 
 2. We managed to get out of the soup without understanding or fixing the
    problem.  (I hate when that happens!)
 
    a. We now mount all file systems with noatime.  Since most of our
       accesses are reads, noatime cuts the inode writes down quite a bit and
 	  this change reduced the problem from about 6 a week to about 1 every
 	  two weeks.  (I don't have very good numbers on the frequency, the 
 	  problems were spread out over 20 different hosts in 3 cities.)
 
 	  (It's rw,noatime in fstab - see man mount)   It's one of the speedup
 	  things so probably you already have it.
 
    b. The problem has dissappeared.  We haven't had trouble in the past
       coupla months.  I can't prove why, but I think its because we are
 	  using NFS less these days.  (Some perl scripts running on the
 	  offensive hosts are now using non-NFS methods to get data from other
 	  hosts - NFS traffic is reduced but not yet eliminated.)
 
 You sure have my sympathy.  I bet it ain't fun watching files dissappear as
 you run fsck multiple times on your 50+ gig system!
 
State-Changed-From-To: open->feedback 
State-Changed-By: alfred 
State-Changed-When: Tue Jun 6 03:05:16 PDT 2000 
State-Changed-Why:  
Very old PR, waiting for feedback if 4.0 addresses the issue. 


Responsible-Changed-From-To: freebsd-bugs->alfred 
Responsible-Changed-By: alfred 
Responsible-Changed-When: Tue Jun 6 03:05:16 PDT 2000 
Responsible-Changed-Why:  
Cleaning PRs, would like feedback. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=11226 
State-Changed-From-To: feedback->closed 
State-Changed-By: alfred 
State-Changed-When: Mon Jun 26 11:34:51 PDT 2000 
State-Changed-Why:  
no feedback and no recent problems reported that are the same. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=11226 
>Unformatted:
 Reply-To: milt@vicor-nb.com
 X-send-pr-version: 3.2
 
