From nobody@FreeBSD.org  Sun Oct 18 12:36:31 2009
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id C9E7F106566C
	for <freebsd-gnats-submit@FreeBSD.org>; Sun, 18 Oct 2009 12:36:31 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (www.freebsd.org [IPv6:2001:4f8:fff6::21])
	by mx1.freebsd.org (Postfix) with ESMTP id B7A818FC08
	for <freebsd-gnats-submit@FreeBSD.org>; Sun, 18 Oct 2009 12:36:31 +0000 (UTC)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.14.3/8.14.3) with ESMTP id n9ICaVJI021013
	for <freebsd-gnats-submit@FreeBSD.org>; Sun, 18 Oct 2009 12:36:31 GMT
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.14.3/8.14.3/Submit) id n9ICaVAd021012;
	Sun, 18 Oct 2009 12:36:31 GMT
	(envelope-from nobody)
Message-Id: <200910181236.n9ICaVAd021012@www.freebsd.org>
Date: Sun, 18 Oct 2009 12:36:31 GMT
From: Alexander Best <alexbestms@math.uni-muenster.de>
To: freebsd-gnats-submit@FreeBSD.org
Subject: all mounted fs don't get synced during reboot/shutdown with >= 1 mounted inaccessible device
X-Send-Pr-Version: www-3.1
X-GNATS-Notify:

>Number:         139718
>Category:       kern
>Synopsis:       [reboot] all mounted fs don't get synced during reboot/shutdown with >= 1 mounted inaccessible device
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    trasz
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sun Oct 18 12:40:01 UTC 2009
>Closed-Date:    
>Last-Modified:  Wed Sep 22 23:49:23 UTC 2010
>Originator:     Alexander Best
>Release:        9.0-CURRENT
>Organization:
>Environment:
FreeBSD otaku 9.0-CURRENT FreeBSD 9.0-CURRENT #0 r197914: Sat Oct 10 02:58:19 CEST 2009     root@otaku:/usr/obj/usr/src/sys/ARUNDEL  i386
>Description:
when the system is being shut down or rebooted and a mounted device is no longer accessible, none of the other mounted file systems are synced correctly, so they are all marked dirty. this also happens if the inaccessible device was mounted read-only.

the reboot/shutdown sequence hangs after the message "All buffers synced.". after a reset all previously mounted storage devices need to be fsck'ed.

see this thread for further info: http://lists.freebsd.org/pipermail/freebsd-current/2009-October/012679.html

Matthias Andree described the problem like this:

"1. If the device for one file system is gone, why would I mark *other* file
systems dirty? There is no reason to do so.

2. If a file system was mounted read-only, and its device is removed, there are
by definition ZERO dirty buffers that we need to synch on shutdown, so why does
the premature unplug-readonly-before-unmount spoil the shutdown?"
>How-To-Repeat:
1. mount a removable device (e.g. a usb stick); preferably use -r to prevent
data loss
2. unplug the device (without unmounting it)
3. `shutdown -r now`
>Fix:


>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-bugs->freebsd-fs 
Responsible-Changed-By: linimon 
Responsible-Changed-When: Sun Oct 18 16:34:44 UTC 2009 
Responsible-Changed-Why:  
Over to maintainer(s). 

http://www.freebsd.org/cgi/query-pr.cgi?pr=139718 
Responsible-Changed-From-To: freebsd-fs->trasz 
Responsible-Changed-By: trasz 
Responsible-Changed-When: Sun Oct 18 18:42:21 UTC 2009 
Responsible-Changed-Why:  
I'll take it. 


From: Alexander Best <alexbestms@math.uni-muenster.de>
To: <bug-followup@FreeBSD.org>
Cc:  
Subject: Re: kern/139718: [reboot] all mounted fs don't get synced during
 reboot/shutdown with >= 1 mounted inaccessible device
Date: Sat, 07 Nov 2009 18:54:57 +0100 (CET)

 this problem is currently being worked on by Edward Tomasz Napierala. running
 r199016 i'm not able to reproduce the problem any longer. the issue seems to
 have been (partially, maybe entirely) resolved by commits r19887[3-7].
 
 i don't think trasz@ has finished committing all the necessary changes to HEAD
 yet, but it seems this pr will get fixed entirely (in HEAD) in the next few
 days.
 
 changes will be mfc'ed to 8-stable, 7-stable and maybe 8.0-release (if re@
 approves the changes). don't know how hard it'll be to merge them into
 6-stable.
 
 so anyone running current please test the recent changes committed by @trasz.
 everybody else running <= 8-stable stay tuned for the fixes to get committed
 to those branches.
 
 might be a good idea to set this pr into analysed or even patched state.
 
 cheers.
 alex

From: Alexander Best <alexbestms@wwu.de>
To: <bug-followup@FreeBSD.org>
Cc:  
Subject: Re: kern/139718: [reboot] all mounted fs don't get synced during
 reboot/shutdown with >= 1 mounted inaccessible device
Date: Mon, 09 Nov 2009 21:45:05 +0100 (CET)

 issue has been partly solved. detaching a usb device in a clean state and
 unmounting it afterwards works.
 
 however neither `umount` nor `umount -f` work in this case:
 
 1) when a read/write to a usb device fails, as in this example:
 g_vfs_done():da0[READ(offset=311132160, length=65536)]error = 5
 the device becomes completely inaccessible.
 
 2) when calling umount this also fails with a similar error:
 g_vfs_done():da0[READ(offset=16384, length=4096)]error = 5
 since the device is still present, umount tries to write metadata to it, but
 fails. removing the device and then trying to issue umount fails too.
 
 3) in this situation the problem described in this pr still occurs leaving all
 mounted devices tagged dirty after a reboot.
 
 alex
State-Changed-From-To: open->analyzed 
State-Changed-By: linimon 
State-Changed-When: Tue Nov 10 08:15:00 UTC 2009 
State-Changed-Why:  
The issue is partially solved (see Audit-Trail). 


From: Edward Tomasz Napierała <trasz@FreeBSD.org>
To: Alexander Best <alexbestms@wwu.de>
Cc: bug-followup@FreeBSD.org
Subject: Re: kern/139718: [reboot] all mounted fs don't get synced during reboot/shutdown with >= 1 mounted inaccessible device
Date: Mon, 16 Nov 2009 11:29:52 +0100

 On 2009-11-09 at 21:50, Alexander Best wrote:
 > The following reply was made to PR kern/139718; it has been noted by GNATS.
 >
 > From: Alexander Best <alexbestms@wwu.de>
 > To: <bug-followup@FreeBSD.org>
 > Cc:
 > Subject: Re: kern/139718: [reboot] all mounted fs don't get synced during
 > reboot/shutdown with >= 1 mounted inaccessible device
 > Date: Mon, 09 Nov 2009 21:45:05 +0100 (CET)
 >
 > issue has been partly solved. detaching a usb device in a clean state and
 > unmounting it afterwards works.
 
 I was not able to reproduce this.  I don't think anything has changed in
 this regard for the last six months - removing the device and unmounting it
 afterwards just worked in my tests.
 
 > however neither `umount` nor `umount -f` work in this case:
 >
 > 1) when a read/write to a usb device fail like in this example:
 > g_vfs_done():da0[READ(offset=311132160, length=65536)]error = 5
 > the device becomes completely inaccessible.
 >
 > 2) when calling umount this also fails with a similar error:
 > g_vfs_done():da0[READ(offset=16384, length=4096)]error = 5 since the device is
 > still present umount tries to write metadata to it, but fails. removing the
 > device and then trying to issue umount fails too.
 
 Now, this changes everything.  The device is supposed to fail with error 6,
 which is ENXIO, "Device not configured".  This, however, fails with EIO,
 "Input/output error".  If the device never returns ENXIO, then GEOM didn't
 detach it - in other words, from the system point of view, it still exists.
 
 What do you do to make it fail this way?
 
 > 3) in this situation the problem described in this pr still occurs leaving all
 > mounted devices tagged dirty after a reboot.
 
 I think it's a separate problem - it seems that, for some reason, we don't
 unmount filesystems properly if one of them fails.
 
 --
 If you cut off my head, what would I say?  Me and my head, or me and my body?
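 
 The distinction drawn above between error 5 (EIO) and error 6 (ENXIO) can
 be checked mechanically against the kernel messages quoted in this PR.
 What follows is only a minimal sketch in Python, not part of any FreeBSD
 tool: the function name parse_g_vfs_done, the regular expression, and the
 "device_gone" field are assumptions made for illustration, and the pattern
 is derived solely from the g_vfs_done() lines shown in this report.

```python
import errno
import re

# Matches lines of the form quoted in this PR, for example:
#   g_vfs_done():da0[READ(offset=16384, length=4096)]error = 5
# The pattern is an assumption based only on those quoted lines.
G_VFS_DONE = re.compile(
    r"g_vfs_done\(\):(?P<provider>[^\[]+)\["
    r"(?P<op>[A-Z]+)\(offset=(?P<offset>\d+),\s*length=(?P<length>\d+)\)\]"
    r"error = (?P<error>\d+)"
)


def parse_g_vfs_done(line):
    """Return the fields of a g_vfs_done() message, or None if no match."""
    match = G_VFS_DONE.search(line)
    if match is None:
        return None
    code = int(match.group("error"))
    return {
        "provider": match.group("provider"),
        "op": match.group("op"),
        "offset": int(match.group("offset")),
        "length": int(match.group("length")),
        "error": code,
        # Symbolic name of the errno value, e.g. "EIO" for 5.
        "errno_name": errno.errorcode.get(code, "unknown"),
        # Hypothetical helper flag: True only for ENXIO, the code that,
        # according to the reply above, a detached device should return.
        "device_gone": code == errno.ENXIO,
    }


if __name__ == "__main__":
    sample = "g_vfs_done():da0[READ(offset=311132160, length=65536)]error = 5"
    print(parse_g_vfs_done(sample))
```

 Run against the lines quoted in this PR, every message carries error 5, so
 "device_gone" stays false, which matches the observation that GEOM never
 detached the device.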
 

From: Alexander Best <alexbestms@wwu.de>
To: Edward Tomasz Napierała <trasz@FreeBSD.org>
Cc: <bug-followup@FreeBSD.org>
Subject: Re: kern/139718: [reboot] all mounted fs don't get synced during
 reboot/shutdown with >= 1 mounted inaccessible device
Date: Mon, 16 Nov 2009 13:52:58 +0100 (CET)

 Edward Tomasz Napierała wrote on 2009-11-16:
 > On 2009-11-09 at 21:50, Alexander Best wrote:
 > > The following reply was made to PR kern/139718; it has been noted
 > > by GNATS.
 
 > > From: Alexander Best <alexbestms@wwu.de>
 > > To: <bug-followup@FreeBSD.org>
 > > Cc:
 > > Subject: Re: kern/139718: [reboot] all mounted fs don't get synced
 > >  during
 > > reboot/shutdown with >= 1 mounted inaccessible device
 > > Date: Mon, 09 Nov 2009 21:45:05 +0100 (CET)
 
 > > issue has been partly solved. detaching a usb device in a clean
 > > state and
 > > unmounting it afterwards works.
 
 > I was not able to reproduce this.  I don't think anything has changed
 > in this regard
 > for the last six months - removing the device and unmounting it
 > afterwards just worked
 > in my tests.
 
 oh i see. to be honest i haven't tried this procedure for a long time. i just
 remember that it wasn't possible to unmount inaccessible devices, but that
 might have been a long time ago. i thought this might be related to the
 changes i mentioned, but apparently i was wrong. ;)
 
 > > however neither `umount` nor `umount -f` work in this case:
 
 > > 1) when a read/write to a usb device fail like in this example:
 > > g_vfs_done():da0[READ(offset=311132160, length=65536)]error = 5
 > > the device becomes completely inaccessible.
 
 > > 2) when calling umount this also fails with a similar error:
 > > g_vfs_done():da0[READ(offset=16384, length=4096)]error = 5 since
 > > the device is
 > > still present umount tries to write metadata to it, but fails.
 > > removing the
 > > device and then trying to issue umount fails too.
 
 > Now, this changes everything.  The device is supposed to fail with
 > error 6, which
 > is ENXIO, "Device not configured".  This, however, fails with EIO,
 > "Input/output error".
 > If the device never returns ENXIO, then GEOM didn't detach it - in
 > other words,
 > from the system point of view, it still exists.
 
 > What do you do to make it fail this way?
 
 i'm not sure this is reproducible in an easy manner. the problem appears with
 only a single umass device. writing large amounts of data to it triggers this
 problem. the problem however doesn't seem to be related to the device itself,
 but the usb2 stack. i ran several health scans under windows and they reported
 no problem with the device.
 
 maybe you could find the exact function returning EIO and replace
 return(errno) with return(EIO) so the problem gets triggered with every umass
 device. just a thought though.
 
 > > 3) in this situation the problem described in this pr still occurs
 > >    leaving all
 > > mounted devices tagged dirty after a reboot.
 
 > I think it's a separate problem - it seems that, for some reason, we
 > don't unmount
 > filesystems properly if one of them fails.
 
 there was a discussion at some point to introduce a -F switch to umount which
 should "really" umount devices forcefully. the -f switch doesn't seem to have
 any effect in regard to this issue. i don't even know what its purpose is.
 
 > --
 > If you cut off my head, what would I say?  Me and my head, or me and
 > my body?

From: Alexander Best <alexbestms@wwu.de>
To: <bug-followup@FreeBSD.org>,
 <trasz@FreeBSD.org>
Cc:  
Subject: Re: kern/139718: [reboot] all mounted fs don't get synced during
 reboot/shutdown with >= 1 mounted inaccessible device
Date: Wed, 25 Nov 2009 22:06:30 +0100 (CET)

 have there been any recent developments concerning this problem? because after
 some i/o errors:
 
 g_vfs_done():label/usb[WRITE(offset=26853376, length=16384)]error = 5
 g_vfs_done():label/usb[READ(offset=435863552, length=36864)]error = 5
 vnode_pager_getpages: I/O read error
 vm_fault: pager read error, pid 2171 (cp)
 g_vfs_done():label/usb[WRITE(offset=26853376, length=16384)]error = 5
 
 i was able to unmount the device without any problems.
 
 alex

From: Alexander Best <alexbestms@wwu.de>
To: <bug-followup@FreeBSD.org>
Cc: Edward Tomasz Napierała <trasz@FreeBSD.org>
Subject: Re: kern/139718: [reboot] all mounted fs don't get synced during
 reboot/shutdown with >= 1 mounted inaccessible device
Date: Mon, 01 Mar 2010 03:49:21 +0100 (CET)

 i believe this pr can be closed. i'm no longer able to produce a system hang
 during reboot.
 
 if the device becomes inaccessible i still get these errors:
 
 Mar  1 03:16:38 otaku kernel: g_vfs_done():label/usb[WRITE(offset=671744,
 length=4096)]error = 5
 Mar  1 03:16:38 otaku kernel: g_vfs_done():label/usb[WRITE(offset=1667072,
 length=4096)]error = 5
 Mar  1 03:16:38 otaku kernel: g_vfs_done():label/usb[WRITE(offset=1671168,
 length=4096)]error = 5
 Mar  1 03:16:38 otaku kernel: g_vfs_done():label/usb[WRITE(offset=2603204608,
 length=16384)]error = 5
 Mar  1 03:16:38 otaku kernel: g_vfs_done():label/usb[WRITE(offset=2677309440,
 length=49152)]error = 5
 Mar  1 03:16:38 otaku kernel: g_vfs_done():label/usb[WRITE(offset=2684502016,
 length=65536)]error = 5
 Mar  1 03:16:38 otaku kernel: g_vfs_done():label/usb[WRITE(offset=2694135808,
 length=32768)]error = 5
 Mar  1 03:16:38 otaku kernel: g_vfs_done():label/usb[WRITE(offset=512,
 length=512)]error = 5
 Mar  1 03:16:38 otaku kernel: g_vfs_done():label/usb[WRITE(offset=667648,
 length=4096)]error = 5
 
 if i try to do an umount it fails with EAGAIN (which is odd, because
 unmount(2) doesn't mention EAGAIN).
 
 when removing the device i get these warnings:
 
 Mar  1 03:16:42 otaku kernel: Device usb went missing before all of the data
 could be written to it; expect data loss.
 Mar  1 03:16:58 otaku kernel: deget(): pcbmap returned 6
 Mar  1 03:16:58 otaku last message repeated 2 times
 
 but i'm now able to umount the device properly.
 
 i tried rebooting after seeing the WRITE errors with the device
 
 1) removed and unmounted
 2) removed but still mounted and
 3) with it being still attached
 
 in all cases the kernel manages to sync all buffers and reboot properly.
 
 alex
State-Changed-From-To: analyzed->open 
State-Changed-By: arundel 
State-Changed-When: Wed Sep 22 23:12:04 UTC 2010 
State-Changed-Why:  
It seems this issue still exists. Quite regularly during reboot/shutdown FreeBSD 
is unable to sync all vnodes and times out. Syncing buffers then prints a 
repeating count that never reaches zero. No timeout gets hit, and only a 
physical reset brings the system back up, with all mounted devices 
marked dirty. 

>Unformatted:
