From nobody@FreeBSD.org  Wed Jan 25 08:56:26 2012
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 4B561106566C
	for <freebsd-gnats-submit@FreeBSD.org>; Wed, 25 Jan 2012 08:56:26 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from red.freebsd.org (red.freebsd.org [IPv6:2001:4f8:fff6::22])
	by mx1.freebsd.org (Postfix) with ESMTP id 1AD808FC0C
	for <freebsd-gnats-submit@FreeBSD.org>; Wed, 25 Jan 2012 08:56:26 +0000 (UTC)
Received: from red.freebsd.org (localhost [127.0.0.1])
	by red.freebsd.org (8.14.4/8.14.4) with ESMTP id q0P8uPmd095362
	for <freebsd-gnats-submit@FreeBSD.org>; Wed, 25 Jan 2012 08:56:25 GMT
	(envelope-from nobody@red.freebsd.org)
Received: (from nobody@localhost)
	by red.freebsd.org (8.14.4/8.14.4/Submit) id q0P8uPAc095361;
	Wed, 25 Jan 2012 08:56:25 GMT
	(envelope-from nobody)
Message-Id: <201201250856.q0P8uPAc095361@red.freebsd.org>
Date: Wed, 25 Jan 2012 08:56:25 GMT
From: "Eugene M. Zheganin" <eugene@zhegan.in>
To: freebsd-gnats-submit@FreeBSD.org
Subject: fsck -B panics on particular data inconsistency
X-Send-Pr-Version: www-3.1
X-GNATS-Notify:

>Number:         164472
>Category:       kern
>Synopsis:       [ufs] fsck -B panics on particular data inconsistency
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-fs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Wed Jan 25 09:00:29 UTC 2012
>Closed-Date:    
>Last-Modified:  Wed Feb  1 06:30:15 UTC 2012
>Originator:     Eugene M. Zheganin
>Release:        8.2-RELEASE
>Organization:
RealService LLC
>Environment:
FreeBSD elf.hq.norma.perm.ru 8.2-RELEASE FreeBSD 8.2-RELEASE #1: Thu May  5 19:14:23 YEKST 2011     emz@ns.hq.norma.perm.ru:/usr/obj/usr/src/sys/ELF  i386
>Description:
fsck -B panics on particular inconsistencies.
I have a machine that sometimes locks up (probably due to some other bug), and when it runs fsck -B after a reset, the system panics.

Since this happens quite often, last time I created an image of the partition that makes FreeBSD panic.

Unfortunately, this machine doesn't run a debug kernel. There is also a bug where FreeBSD reboots immediately upon a key press at the screen that says 'Automatic reboot in 15 seconds, press any key to abort' (still unreported for some reason, although lots of people confirm its existence), so I was (and still am) unable to capture the panic screen.

But since I have an image, anyone can easily reproduce this panic.
This is 100% reproducible: I've done it 5 times on a test machine and got a panic each time. Unfortunately, this machine is too old to build a debug kernel in a reasonable amount of time (I really think anyone will download this image faster than I can build a debug kernel).

So... here comes the image in case someone is interested.

Attention: FreeBSD panics only when fsck is run with -B. An ordinary fsck run doesn't panic and successfully resolves all the filesystem errors.
>How-To-Repeat:
Get the image from http://tech.norma.perm.ru/files/var.dsk (sorry, this link is only a couple of megabits; my really broadband link is served by the server from my previous report with the buggy pf route-to/reply-to, so I'm using this old server). Mount it read-write (I didn't test with an unmounted or read-only image), like this:

mdconfig -a -t vnode -f var.dsk
mount /dev/md0 /mnt/panic

Run fsck (since it's a partition image, you need to specify the fsck for the filesystem type manually):

fsck_4.2bsd -B /dev/md0
<here it panics>
>Fix:
Run fsck without -B.

>Release-Note:
>Audit-Trail:

From: "Eugene M. Zheganin" <eugene@zhegan.in>
To: bug-followup@FreeBSD.org, eugene@zhegan.in
Cc:  
Subject: Re: misc/164472: fsck -B panics on particular data inconsistency
Date: Wed, 25 Jan 2012 15:21:48 +0600

 P.S. I forgot to mention that this image is 2048 megs in size, so I'm 
 sorry if you cannot afford to download it.

From: "Eugene M. Zheganin" <eugene@zhegan.in>
To: bug-followup@FreeBSD.org, eugene@zhegan.in
Cc:  
Subject: Re: misc/164472: fsck -B panics on particular data inconsistency
Date: Wed, 25 Jan 2012 15:48:16 +0600

 Yeah, my bad. I compressed it with xz and redirected the web server to it. 
 Now it's about 400 megs. The old URL still works.

From: Mihail Timofeev <9267096@gmail.com>
To: bug-followup@FreeBSD.org, eugene@zhegan.in
Cc:  
Subject: Re: misc/164472: fsck -B panics on particular data inconsistency
Date: Wed, 25 Jan 2012 19:56:57 +0700

 Confirmed, the bug reproduces on FreeBSD 9.0-STABLE #2 r230483: Tue Jan
 24 00:48:28 NOVT 2012     root@bsd.loc:/usr/obj/usr/src/sys/GENERIC
 i386
Responsible-Changed-From-To: freebsd-bugs->freebsd-fs 
Responsible-Changed-By: linimon 
Responsible-Changed-When: Mon Jan 30 04:13:00 UTC 2012 
Responsible-Changed-Why:  
reclassify. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=164472 

From: Kostik Belousov <kostikbel@gmail.com>
To: bug-followup@FreeBSD.org, eugene@zhegan.in
Cc:  
Subject: Re: kern/164472: [ufs] fsck -B panics on particular data inconsistency
Date: Mon, 30 Jan 2012 07:30:04 +0200

 You failed to mention which panic you got. Was it 'dup alloc'? A
 backtrace would also be useful.
 
 If it was indeed 'dup alloc', then neither fsck nor snapshots are to
 blame. Your filesystem is in an inconsistent state that requires a
 full fsck to recover. It must not be mounted until it is
 repaired.
 
 Somewhat more interesting is how the fs got into this state.

From: "Eugene M. Zheganin" <emz@norma.perm.ru>
To: bug-followup@FreeBSD.org, eugene@zhegan.in
Cc:  
Subject: Re: kern/164472: [ufs] fsck -B panics on particular data inconsistency
Date: Mon, 30 Jan 2012 13:50:35 +0600

 This state can be reached (and sometimes is) after a server 
 hang and/or reset.
 I cannot agree that running fsck -B should lead to a panic, because there is 
 no way to tell a filesystem that can be repaired with fsck -B from one 
 that cannot. After all, this is what bgfsck is 
 for.

From: Kirk McKusick <mckusick@mckusick.com>
To: eugene@zhegan.in
Cc: bug-followup@FreeBSD.org, freebsd-fs@FreeBSD.org,
        Kostik Belousov <kostikbel@gmail.com>
Subject: Re: kern/164472: [ufs] fsck -B panics on particular data inconsistency 
Date: Tue, 31 Jan 2012 22:26:43 -0800

 > From: Kostik Belousov <kostikbel@gmail.com>
 > To: bug-followup@FreeBSD.org, eugene@zhegan.in
 > Cc:  
 > Subject: Re: kern/164472: [ufs] fsck -B panics on particular data inconsistency
 > Date: Mon, 30 Jan 2012 07:30:04 +0200
 > 
 >  You failed to mention which panic you got. Was it 'dup alloc'? A
 >  backtrace would also be useful.
 > 
 >  If it was indeed 'dup alloc', then neither fsck nor snapshots are to
 >  blame. Your filesystem is in an inconsistent state that requires a
 >  full fsck to recover. It must not be mounted until it is
 >  repaired.
 > 
 >  Somewhat more interesting is how the fs got into this state.
 
 Thanks for your report and in particular a small file image that
 demonstrates the problem. I have been able to reproduce your panic
 reliably on my test machine.
 
 Running a normal fsck on the image does indeed show that the filesystem
 has corruption that is unexpected on a filesystem running with soft
 updates. So, in the end, if the background fsck were able to run, it
 would fail and notify the system that it needed to be checked by a
 full fsck. But as you have aptly demonstrated, the background fsck
 crashes the system as it tries to take a snapshot of the filesystem
 on which to run its check.
 
 The crash occurs because, in taking a snapshot, the filesystem
 needs to allocate an inode for the snapshot. As it turns out, the
 inode that it tries to allocate is marked free in the inode map but
 is in fact already allocated, which leads to the panic.
 
 I am still mulling over how to resolve this problem, but have not
 yet come up with a solution. I am looking for one that effectively
 lets the snapshot fail rather than crash the system, so that
 fsck -B can then fail gracefully and lead to the full fsck that
 is needed in this case.
 
 	Kirk McKusick
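
 The failure mode described in the last message, and the kind of
 graceful failure it asks for, can be sketched with a toy model. This
 is only an illustration under stated assumptions: it is not the UFS
 code and not a proposed patch. Every name here (toy_fs, toy_inode,
 toy_inode_alloc, toy_snapshot_create, needs_full_fsck) is
 hypothetical, and the 8-inode bitmap stands in for the real inode
 map. The allocator trusts the map to find a candidate, cross-checks
 the inode itself, and returns an error instead of halting when the
 two disagree.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NINODES 8

/* Hypothetical stand-ins; these are not the real UFS structures. */
struct toy_inode {
    uint16_t mode;                 /* nonzero means the inode is in use */
};

struct toy_fs {
    uint8_t inode_used_map;        /* bit i set => inode i marked allocated */
    struct toy_inode inodes[TOY_NINODES];
    int needs_full_fsck;           /* set when on-disk state is inconsistent */
};

static int toy_map_is_free(const struct toy_fs *fs, int ino)
{
    return (fs->inode_used_map & (1u << ino)) == 0;
}

/*
 * Allocate the first inode the map reports as free.
 * Returns 0 on success, ENOSPC if the map is full, or EIO if the map
 * says "free" while the inode itself is already in use. The EIO path
 * is where a panic-on-inconsistency allocator would stop the system.
 */
int toy_inode_alloc(struct toy_fs *fs, uint16_t mode, int *inop)
{
    /* Caller bugs are still asserted; only on-disk data is treated as untrusted. */
    assert(fs != NULL && inop != NULL && mode != 0);

    for (int ino = 0; ino < TOY_NINODES; ino++) {
        if (!toy_map_is_free(fs, ino))
            continue;
        if (fs->inodes[ino].mode != 0) {
            /* Map and inode disagree: flag the filesystem and fail. */
            fs->needs_full_fsck = 1;
            return EIO;
        }
        fs->inode_used_map |= (uint8_t)(1u << ino);
        fs->inodes[ino].mode = mode;
        *inop = ino;
        return 0;
    }
    return ENOSPC;
}

/* A snapshot needs an inode; pass any allocation failure up to the caller. */
int toy_snapshot_create(struct toy_fs *fs)
{
    int ino;

    return toy_inode_alloc(fs, 0100400, &ino);
}
```

 In this model, a caller playing the role of fsck -B would see the
 nonzero return from toy_snapshot_create, give up on the background
 check, and leave needs_full_fsck set so that a full fsck is run.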
>Unformatted:
