From yar@comp.chem.msu.su  Sat May 27 10:38:17 2006
Return-Path: <yar@comp.chem.msu.su>
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id D846716A4AB
	for <FreeBSD-gnats-submit@freebsd.org>; Sat, 27 May 2006 10:38:17 +0000 (UTC)
	(envelope-from yar@comp.chem.msu.su)
Received: from comp.chem.msu.su (comp.chem.msu.su [158.250.32.97])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 90D8F43D46
	for <FreeBSD-gnats-submit@freebsd.org>; Sat, 27 May 2006 10:38:07 +0000 (GMT)
	(envelope-from yar@comp.chem.msu.su)
Received: from comp.chem.msu.su (localhost [127.0.0.1])
	by comp.chem.msu.su (8.13.4/8.13.3) with ESMTP id k4RAbtUL063347
	for <FreeBSD-gnats-submit@freebsd.org>; Sat, 27 May 2006 14:37:55 +0400 (MSD)
	(envelope-from yar@comp.chem.msu.su)
Received: (from yar@localhost)
	by comp.chem.msu.su (8.13.4/8.13.3/Submit) id k4RAbtYw063346;
	Sat, 27 May 2006 14:37:55 +0400 (MSD)
	(envelope-from yar)
Message-Id: <200605271037.k4RAbtYw063346@comp.chem.msu.su>
Date: Sat, 27 May 2006 14:37:55 +0400 (MSD)
From: Yar Tikhiy <yar@comp.chem.msu.su>
Reply-To: Yar Tikhiy <yar@comp.chem.msu.su>
To: FreeBSD-gnats-submit@freebsd.org
Cc:
Subject: loader corrupts other files when rewriting nextboot.conf
X-Send-Pr-Version: 3.113
X-GNATS-Notify:

>Number:         98005
>Category:       bin
>Synopsis:       loader corrupts other files when rewriting nextboot.conf
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    iedowse
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sat May 27 10:40:17 GMT 2006
>Closed-Date:    Mon Jun 26 01:45:38 GMT 2006
>Last-Modified:  Mon Jun 26 01:45:38 GMT 2006
>Originator:     Yar Tikhiy
>Release:        FreeBSD 7.0-CURRENT i386
>Organization:
None
>Environment:
System: FreeBSD  7.0-CURRENT FreeBSD 7.0-CURRENT #1: Thu May 25
19:55:44 UTC 2006     root@:/usr/obj/usr/src/sys/GENERIC  i386

>Description:
	When nextboot is in effect, loader(8) writes the modified
	contents of the nextboot.conf file, or whatever nextboot_conf
	is set to, to an incorrect location on the disk.  The
	location overwritten is in the block after the one actually
	belonging to the nextboot.conf file.

	This problem is likely to be caused by an off-by-one bug in
	the stand-alone FS access library used by loader(8).

>How-To-Repeat:

### Here's an example.  Booting in the following environment:

# cat /boot/loader.conf
beastie_disable="YES"
nextboot_conf="/root/foo"
# cat /root/foo
nextboot_enable="YES"
kernel="${kernel}"
kernel_options="${kernel_options}"

### After the reboot, /boot/kernel/kernel.symbols appears damaged:

# /root/ckroot
378c378
< MD5 (/boot/kernel/kernel.symbols) = 98d1d52fed7985df0712618de7db8e03
---
> MD5 (/boot/kernel/kernel.symbols) = b42770ba4b010e8aed83ca89440ae79b

### Inspecting its contents.  Note the `nextboot_enable="NO" ' string.
### It apparently came from rewrite_nextboot_file in support.4th.

# hd /boot/kernel/kernel.symbols | grep -A8 nextboot_enable
000a0000  6e 65 78 74 62 6f 6f 74  5f 65 6e 61 62 6c 65 3d  |nextboot_enable=|
000a0010  22 4e 4f 22 20 0a 6b 65  72 6e 65 6c 3d 22 24 7b  |"NO" .kernel="${|
000a0020  6b 65 72 6e 65 6c 7d 22  0a 6b 65 72 6e 65 6c 5f  |kernel}".kernel_|
000a0030  6f 70 74 69 6f 6e 73 3d  22 24 7b 6b 65 72 6e 65  |options="${kerne|
000a0040  6c 5f 6f 70 74 69 6f 6e  73 7d 22 0a 00 00 00 00  |l_options}".....|
000a0050  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000a0800  00 00 02 23 0c 0a 20 5a  00 00 05 80 30 01 00 00  |...#.. Z....0...|
000a0810  02 23 10 00 1a 4b 0e 00  00 01 4c 00 00 00 19 30  |.#...K....L....0|

### Let's find out the files' inodes:

# ls -if /root/foo /boot/kernel/kernel.symbols
490 /root/foo                    36 /boot/kernel/kernel.symbols

### Let's see how the files' blocks are laid out:

# fsdb -r /dev/ad0s3a
** /dev/ad0s3a (NO WRITE)
Examining file system `/dev/ad0s3a'
Last Mounted on /
current inode: directory
I=2 MODE=40755 SIZE=512
        MTIME=May 25 20:46:43 2006 [0 nsec]
        CTIME=May 25 20:46:43 2006 [0 nsec]
        ATIME=May 27 12:21:18 2006 [0 nsec]
OWNER=root GRP=wheel LINKCNT=20 FLAGS=0 BLKCNT=4 GEN=58ef642f
fsdb (inum: 2)> inode 490
current inode: regular file
I=490 MODE=100644 SIZE=76
        MTIME=May 27 12:15:23 2006 [0 nsec]
        CTIME=May 27 12:15:23 2006 [0 nsec]
        ATIME=May 27 12:21:37 2006 [0 nsec]
OWNER=root GRP=wheel LINKCNT=1 FLAGS=0 BLKCNT=4 GEN=6f069864
fsdb (inum: 490)> blocks
Blocks for inode 490:
Direct blocks:
6551 (1 frag)
fsdb (inum: 490)> inode 36
current inode: regular file
I=36 MODE=100555 SIZE=18022637
        MTIME=May 25 19:55:57 2006 [0 nsec]
        CTIME=May 25 20:22:34 2006 [0 nsec]
        ATIME=May 27 12:21:19 2006 [0 nsec]
OWNER=root GRP=wheel LINKCNT=1 FLAGS=0 BLKCNT=89c0 GEN=3c3ab2a9
fsdb (inum: 36)> blocks
Blocks for inode 36:
Direct blocks:
6104, 6112, 6120, 6128, 6136, 6144, 6152, 6160, 6200, 6208, 6216, 6224
Indirect blocks:
6232, 6240, 6248, 6256, 6264, 6272, 6280, 6288, 6296, 6304, 6312, 6320, 6328,
6336, 6344, 6352, 6360, 6368, 6376, 6384, 6392, 6400, 6408, 6416, 6424, 6432,
6440, 6448, 6552, 6560, 6568, 6576, 6584, 6592, 6600, 6608, 6616, 6624, 6632,
            ^^^^
...

### Note that block 6552 belongs to the damaged file while block 6551
### contains /root/foo, which plays the role of a nextboot_conf file.
### BTW, these are fragment addresses, so 6551 and 6552 are adjacent
### fragments on the disk.
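
### A minimal sketch of the fragment arithmetic behind that remark,
### assuming 8 fragments per block (the "frag 8" value in the dumpfs
### output further down).  The helper names are hypothetical and are
### not taken from FFS or libstand.

```c
#include <stdint.h>

/* Hypothetical helpers; the names are not from FFS or libstand. */

/* First fragment address of the full block that contains `frag`. */
uint64_t
frag_block_start(uint64_t frag, uint32_t frags_per_block)
{
	return (frag - (frag % frags_per_block));
}

/* Position of `frag` inside its block, 0 .. frags_per_block - 1. */
uint32_t
frag_index_in_block(uint64_t frag, uint32_t frags_per_block)
{
	return ((uint32_t)(frag % frags_per_block));
}
```

### With these, fragment 6551 is the last (8th) fragment of the block
### starting at 6544, and 6552 is the first fragment of the next block.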

### Now we need to know the FS parameters to do some easy math.

# dumpfs /dev/ad0s3a | head -20
magic   19540119 (UFS2) time    Sat May 27 12:21:51 2006
superblock location     65536   id      [ 4475a06c d0014a51 ]
ncg     4       size    262144  blocks  253815
bsize   16384   shift   14      mask    0xffffc000
fsize   2048    shift   11      mask    0xfffff800
frag    8       shift   3       fsbtodb 2
minfree 8%      optim   time    symlinklen 120
maxbsize 16384  maxbpg  2048    maxcontig 8     contigsumsize 8
nbfree  20648   ndir    49      nifree  63361   nffree  1229
bpg     8193    fpg     65544   ipg     16448
nindir  2048    inopb   64      maxfilesize     140806241583103
sbsize  2048    cgsize  12288   csaddr  2112    cssize  2048
sblkno  40      cblkno  48      iblkno  56      dblkno  2112
cgrotor 0       fmod    0       ronly   0       clean   1
avgfpdir 64     avgfilesize 16384
flags   none
fsmnt   /
volname         swuid   0
...

### Aha, bsize is 16384.  According to fsdb, 40 blocks (12 direct plus
### the first 28 indirect) precede block 6552 in the damaged file.
### Therefore the file offset of block 6552 is 40*16384 = 0xa0000,
### which is exactly the offset of the damaged part (see above).
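
### The same arithmetic as a small sketch.  The function name is
### hypothetical; it only assumes that every preceding block of the
### file is a full block of bsize bytes.

```c
#include <stdint.h>

/*
 * Byte offset within a file of the block that is preceded by
 * `nblocks_before` full blocks of `bsize` bytes each.
 * Hypothetical helper, not part of any FreeBSD library.
 */
uint64_t
file_offset_of_block(uint64_t nblocks_before, uint32_t bsize)
{
	return (nblocks_before * (uint64_t)bsize);
}
```

### file_offset_of_block(40, 16384) gives 655360, i.e. 0xa0000, the
### offset at which hd shows the stray nextboot data.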

### Let's see the raw disk contents.  Keep in mind that FFS block
### addresses are expressed in fragments, so the "bs" value here
### is fsize (2048), not bsize.

# dd bs=2048 iseek=6551 count=2 if=/dev/ad0s3a | hd
00000000  6e 65 78 74 62 6f 6f 74  5f 65 6e 61 62 6c 65 3d  |nextboot_enable=|
00000010  22 59 45 53 22 0a 6b 65  72 6e 65 6c 3d 22 24 7b  |"YES".kernel="${|
00000020  6b 65 72 6e 65 6c 7d 22  0a 6b 65 72 6e 65 6c 5f  |kernel}".kernel_|
00000030  6f 70 74 69 6f 6e 73 3d  22 24 7b 6b 65 72 6e 65  |options="${kerne|
00000040  6c 5f 6f 70 74 69 6f 6e  73 7d 22 0a 00 00 00 00  |l_options}".....|
00000050  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000800  6e 65 78 74 62 6f 6f 74  5f 65 6e 61 62 6c 65 3d  |nextboot_enable=|
00000810  22 4e 4f 22 20 0a 6b 65  72 6e 65 6c 3d 22 24 7b  |"NO" .kernel="${|
00000820  6b 65 72 6e 65 6c 7d 22  0a 6b 65 72 6e 65 6c 5f  |kernel}".kernel_|
00000830  6f 70 74 69 6f 6e 73 3d  22 24 7b 6b 65 72 6e 65  |options="${kerne|
00000840  6c 5f 6f 70 74 69 6f 6e  73 7d 22 0a 00 00 00 00  |l_options}".....|
00000850  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
2+0 records in
2+0 records out
4096 bytes transferred in 0.011375 secs (360082 bytes/sec)

### We can see clearly that the loader wrote the modified data
### to the 2048-byte fragment following the correct one.
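
### A sketch of how the fragment addresses map to the raw device,
### using fsize 2048 and fsbtodb 2 from the dumpfs output above.
### The function names are hypothetical; the shift mirrors what the
### fsbtodb value means (fragment address to 512-byte disk sectors).

```c
#include <stdint.h>

/* Byte offset of a fragment from the start of the partition. */
uint64_t
frag_to_byte_offset(uint64_t frag, uint32_t fsize)
{
	return (frag * (uint64_t)fsize);
}

/* 512-byte sector number of a fragment, relative to the partition. */
uint64_t
frag_to_sector(uint64_t frag, unsigned int fsbtodb_shift)
{
	return (frag << fsbtodb_shift);
}
```

### Fragment 6551 starts at sector 26204 and fragment 6552 at sector
### 26208, so the misplaced write landed exactly 4 sectors too far.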

### And of course, the FS metadata aren't damaged:

# fsck -f /
** /dev/ad0s3a
** Last Mounted on /
** Root file system
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
2429 files, 87402 used, 166413 free (1229 frags, 20648 blocks, 0.5% fragmentation)

>Fix:
>Release-Note:
>Audit-Trail:

From: Ian Dowse <iedowse@iedowse.com>
To: Yar Tikhiy <yar@comp.chem.msu.su>
Cc: FreeBSD-gnats-submit@FreeBSD.org
Subject: Re: bin/98005: loader corrupts other files when rewriting nextboot.conf 
Date: Sat, 27 May 2006 13:05:01 +0100

 In message <200605271037.k4RAbtYw063346@comp.chem.msu.su>, Yar Tikhiy writes:
 >	When nextboot is in effect, loader(8) writes the modified
 >	contents of the nextboot.conf file, or whatever nextboot_conf
 >	is set to, to an incorrect location on the disk.  The
 >	location overwritten is in the block after the one actually
 >	belonging to the nextboot.conf file.
 >
 >	This problem is likely to be caused by an off-by-one bug in
 >	the stand-alone FS access library used by loader(8).
 
 You could try the following (I haven't tested it), but it's pretty
 obvious how the bug happened if you compare bd_write() with the
 bd_read() function that it was copied from. Looks like the author
 of bd_write() was more interested in writing a little song in the
 comments than writing to the correct part of the disk ;-)
 
 The bug probably wasn't noticed originally because it only affected
 the LBA access case.
 
 Ian
 
 Index: i386/libi386/biosdisk.c
 ===================================================================
 RCS file: /dump/FreeBSD-CVS/src/sys/boot/i386/libi386/biosdisk.c,v
 retrieving revision 1.46
 diff -u -r1.46 biosdisk.c
 --- i386/libi386/biosdisk.c	19 Dec 2005 09:00:11 -0000	1.46
 +++ i386/libi386/biosdisk.c	27 May 2006 11:53:34 -0000
 @@ -1037,9 +1037,6 @@
  	*/
  	if (bbuf != NULL)
  	    bcopy(p, breg, x * BIOSDISK_SECSIZE);
 -	p += (x * BIOSDISK_SECSIZE);
 -	dblk += x;
 -	resid -= x;
  
  	/* Loop retrying the operation a couple of times.  The BIOS may also retry. */
  	for (retry = 0; retry < 3; retry++) {
 @@ -1103,6 +1100,9 @@
  	if (result) {
  	    return(-1);
  	}
 +	p += (x * BIOSDISK_SECSIZE);
 +	dblk += x;
 +	resid -= x;
      }
  	
  /*    hexdump(dest, (blks * BIOSDISK_SECSIZE)); */
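
The effect of the reordering in the patch above can be illustrated
with a toy model.  This is only a sketch under stated assumptions: the
in-memory "disk", the names toydisk, toy_write, TOY_SECSIZE and
TOY_MAXREQ, and the 4-sector request cap are all hypothetical and are
not the loader's code.  It shows only that advancing dblk before
issuing the request makes the data land x sectors too far.

```c
#include <stddef.h>
#include <string.h>

#define TOY_SECSIZE 512	/* same sector size as BIOSDISK_SECSIZE */
#define TOY_MAXREQ  4	/* hypothetical cap on sectors per request */

/* Hypothetical in-memory stand-in for a disk; not the loader's types. */
struct toydisk {
	unsigned char *data;	/* nsect * TOY_SECSIZE bytes */
	size_t nsect;
};

/*
 * Write blks sectors from src starting at sector dblk.
 * advance_first != 0 mimics the pre-patch ordering, where the
 * cursors move before the request is issued.
 */
int
toy_write(struct toydisk *d, size_t dblk, size_t blks,
    const unsigned char *src, int advance_first)
{
	const unsigned char *p = src;
	size_t resid = blks;

	while (resid > 0) {
		size_t x = resid < TOY_MAXREQ ? resid : TOY_MAXREQ;
		const unsigned char *staged = p;	/* data for this request */

		if (advance_first) {	/* pre-patch ordering */
			p += x * TOY_SECSIZE;
			dblk += x;
			resid -= x;
		}
		if (dblk + x > d->nsect)
			return (-1);
		/* The "BIOS call": x sectors land at the current dblk. */
		memcpy(d->data + dblk * TOY_SECSIZE, staged, x * TOY_SECSIZE);
		if (!advance_first) {	/* patched ordering */
			p += x * TOY_SECSIZE;
			dblk += x;
			resid -= x;
		}
	}
	return (0);
}
```

With advance_first set, a 4-sector write aimed at sector 4 ends up in
sectors 8 to 11.  That is the same shape as the corruption in the
report, where one 2048-byte fragment (4 sectors) was written one
fragment too far.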

From: Yar Tikhiy <yar@comp.chem.msu.su>
To: Ian Dowse <iedowse@iedowse.com>
Cc: FreeBSD-gnats-submit@FreeBSD.org
Subject: Re: bin/98005: loader corrupts other files when rewriting nextboot.conf
Date: Sun, 28 May 2006 11:03:49 +0400

 On Sat, May 27, 2006 at 01:05:01PM +0100, Ian Dowse wrote:
 > In message <200605271037.k4RAbtYw063346@comp.chem.msu.su>, Yar Tikhiy writes:
 > >	When nextboot is in effect, loader(8) writes the modified
 > >	contents of the nextboot.conf file, or whatever nextboot_conf
 > >	is set to, to an incorrect location on the disk.  The
 > >	location overwritten is in the block after the one actually
 > >	belonging to the nextboot.conf file.
 > >
 > >	This problem is likely to be caused by an off-by-one bug in
 > >	the stand-alone FS access library used by loader(8).
 > 
 > You could try the following (I haven't tested it), but it's pretty
 > obvious how the bug happened if you compare bd_write() with the
 > bd_read() function that it was copied from. Looks like the author
 > of bd_write() was more interested in writing a little song in the
 > comments than writing to the correct part of the disk ;-)
 
 Your patch seems to work, thanks!  I'll give it more thorough testing
 today by using nextboot routinely.
 
 > The bug probably wasn't noticed originally because it only affected
 > the LBA access case.
 
 Indeed, my FreeBSD slice lies above the 1023rd cylinder.
 
 -- 
 Yar

From: Yar Tikhiy <yar@comp.chem.msu.su>
To: Ian Dowse <iedowse@iedowse.com>
Cc: FreeBSD-gnats-submit@FreeBSD.org
Subject: Re: bin/98005: loader corrupts other files when rewriting nextboot.conf
Date: Wed, 31 May 2006 11:36:28 +0400

 On Sat, May 27, 2006 at 01:05:01PM +0100, Ian Dowse wrote:
 > In message <200605271037.k4RAbtYw063346@comp.chem.msu.su>, Yar Tikhiy writes:
 > >	When nextboot is in effect, loader(8) writes the modified
 > >	contents of the nextboot.conf file, or whatever nextboot_conf
 > >	is set to, to an incorrect location on the disk.  The
 > >	location overwritten is in the block after the one actually
 > >	belonging to the nextboot.conf file.
 > >
 > >	This problem is likely to be caused by an off-by-one bug in
 > >	the stand-alone FS access library used by loader(8).
 > 
 > You could try the following (I haven't tested it), but it's pretty
 > obvious how the bug happened if you compare bd_write() with the
 > bd_read() function that it was copied from. Looks like the author
 > of bd_write() was more interested in writing a little song in the
 > comments than writing to the correct part of the disk ;-)
 [...]
 
 I've been using nextboot with the patched loader lately and have
 not seen any corruption.  The patch apparently works.  Thank you
 very much!  Are you going to commit it now?
 
 -- 
 Yar
State-Changed-From-To: open->patched 
State-Changed-By: iedowse 
State-Changed-When: Wed May 31 09:08:56 UTC 2006 
State-Changed-Why:  

Fixed in revision 1.47 of biosdisk.c 


Responsible-Changed-From-To: freebsd-bugs->iedowse 
Responsible-Changed-By: iedowse 
Responsible-Changed-When: Wed May 31 09:08:56 UTC 2006 
Responsible-Changed-Why:  
MFC reminder 

http://www.freebsd.org/cgi/query-pr.cgi?pr=98005 
State-Changed-From-To: patched->closed 
State-Changed-By: iedowse 
State-Changed-When: Mon Jun 26 01:44:56 UTC 2006 
State-Changed-Why:  
This has now been MFC'd to RELENG_6. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=98005 
>Unformatted:
