From acid@thunderbird.dg.net.ua  Mon Feb  4 07:33:08 2002
Return-Path: <acid@thunderbird.dg.net.ua>
Received: from thunderbird.dg.net.ua (thunderbird.dg.net.ua [213.186.192.11])
	by hub.freebsd.org (Postfix) with ESMTP id A494637B43A
	for <FreeBSD-gnats-submit@freebsd.org>; Mon,  4 Feb 2002 07:33:03 -0800 (PST)
Received: (from root@localhost)
	by thunderbird.dg.net.ua (8.11.6/8.11.6) id g14FWvu08490;
	Mon, 4 Feb 2002 17:32:57 +0200 (EET)
	(envelope-from acid)
Message-Id: <200202041532.g14FWvu08490@thunderbird.dg.net.ua>
Date: Mon, 4 Feb 2002 17:32:57 +0200 (EET)
From: Michael Vasilenko <acid@dg.net.ua>
To: FreeBSD-gnats-submit@freebsd.org
Cc:
Subject: kernel panic in ufsdirhash_lookup
X-Send-Pr-Version: 3.113
X-GNATS-Notify:

>Number:         34613
>Category:       kern
>Synopsis:       kernel panic in ufsdirhash_lookup
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    freebsd-bugs
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Mon Feb 04 07:40:00 PST 2002
>Closed-Date:    Thu Feb 7 14:32:02 PST 2002
>Last-Modified:  Thu Feb 07 14:33:01 PST 2002
>Originator:     Michael Vasilenko
>Release:        FreeBSD 4.5-STABLE i386
>Organization:
DG Ltd. ISP
>Environment:
System: FreeBSD thunderbird.dg.net.ua 4.5-STABLE FreeBSD 4.5-STABLE #5: Mon Feb 4 16:25:03 EET 2002 root@thunderbird.dg.net.ua:/var/obj/usr/src/sys/BIRD i386

>Description:

GNU gdb 4.18
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-unknown-freebsd"...
IdlePTD at phsyical address 0x00393000
initial pcb at physical address 0x002fc5c0
panicstr: from debugger
panic messages:
---
panic: ufsdirhash_lookup: bad offset in hash array
panic: from debugger
Uptime: 30m12s

dumping to dev #ad/0x20001, offset 543232
dump ata0: resetting devices .. done
255 254 253 252 251 250 249 248 247 246 245 244 243 242 241 240 239 238 237 236 235 234 233 232 231 230 229 228 227 226 225 224 223 222 221 220 219 218 217 216 215 214 213 212 211 210 209 208 207 206 205 204 203 202 201 200 199 198 197 196 195 194 193 192 191 190 189 188 187 186 185 184 183 182 181 180 179 178 177 176 175 174 173 172 171 170 169 168 167 166 165 164 163 162 161 160 159 158 157 156 155 154 153 152 151 150 149 148 147 146 145 144 143 142 141 140 139 138 137 136 135 134 133 132 131 130 129 128 127 126 125 124 123 122 121 120 119 118 117 116 115 114 113 112 111 110 109 108 107 106 105 104 103 102 101 100 99 98 97 96 95 94 93 92 91 90 89 88 87 86 85 84 83 82 81 80 79 78 77 76 75 74 73 72 71 70 69 68 67 66 65 64 63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 
---
#0  dumpsys () at /usr/src/sys/kern/kern_shutdown.c:485
485		if (dumping++) {
(kgdb) where
#0  dumpsys () at /usr/src/sys/kern/kern_shutdown.c:485
#1  0xc01587c4 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:314
#2  0xc0158bd9 in panic (fmt=0xc0295ce4 "from debugger")
    at /usr/src/sys/kern/kern_shutdown.c:593
#3  0xc012bc6d in db_panic (addr=-1071168587, have_addr=0, count=-1, 
    modif=0xcdb95ab4 "") at /usr/src/sys/ddb/db_command.c:435
#4  0xc012bc0b in db_command (last_cmdp=0xc02c8f24, cmd_table=0xc02c8d64, 
    aux_cmd_tablep=0xc02f5f78) at /usr/src/sys/ddb/db_command.c:333
#5  0xc012bcd2 in db_command_loop () at /usr/src/sys/ddb/db_command.c:457
#6  0xc012de83 in db_trap (type=3, code=0) at /usr/src/sys/ddb/db_trap.c:71
#7  0xc0274154 in kdb_trap (type=3, code=0, regs=0xcdb95bbc)
    at /usr/src/sys/i386/i386/db_interface.c:158
#8  0xc0280b84 in trap (frame={tf_fs = 16, tf_es = -1006567408, tf_ds = 16, 
      tf_edi = -843284608, tf_esi = 256, tf_ebp = -843490300, tf_isp = -843490328, 
      tf_ebx = -1070918784, tf_edx = -1070872209, tf_ecx = 0, tf_eax = 18, 
      tf_trapno = 3, tf_err = 0, tf_eip = -1071168587, tf_cs = 8, tf_eflags = 582, 
      tf_esp = -1070872225, tf_ss = -1071007909}) at /usr/src/sys/i386/i386/trap.c:574
#9  0xc02743b5 in Debugger (msg=0xc029b75b "panic") at machine/cpufunc.h:67
#10 0xc0158bd0 in panic (fmt=0xc02b1380 "ufsdirhash_lookup: bad offset in hash array")
    at /usr/src/sys/kern/kern_shutdown.c:591
#11 0xc023d1ef in ufsdirhash_lookup (ip=0xc1390900, 
    name=0xcdae7c0a "univ-month.png.meta", namelen=19, offp=0xc1390964, 
    bpp=0xcdb95cc8, prevoffp=0x0) at /usr/src/sys/ufs/ufs/ufs_dirhash.c:359
#12 0xc023762f in ufs_lookup (ap=0xcdb95d24) at /usr/src/sys/ufs/ufs/ufs_lookup.c:212
#13 0xc023ca21 in ufs_vnoperate (ap=0xcdb95d24)
    at /usr/src/sys/ufs/ufs/ufs_vnops.c:2423
#14 0xc018235a in vfs_cache_lookup (ap=0xcdb95d7c) at vnode_if.h:77
#15 0xc023ca21 in ufs_vnoperate (ap=0xcdb95d7c)
    at /usr/src/sys/ufs/ufs/ufs_vnops.c:2423
#16 0xc0185339 in lookup (ndp=0xcdb95ed4) at vnode_if.h:52
#17 0xc0184e24 in namei (ndp=0xcdb95ed4) at /usr/src/sys/kern/vfs_lookup.c:153
#18 0xc018d82a in vn_open (ndp=0xcdb95ed4, fmode=1538, cmode=420)
    at /usr/src/sys/kern/vfs_vnops.c:99
#19 0xc01899b0 in open (p=0xcbf92a40, uap=0xcdb95f80)
    at /usr/src/sys/kern/vfs_syscalls.c:999
#20 0xc02814a1 in syscall2 (frame={tf_fs = 47, tf_es = 47, tf_ds = -1078001617, 
      tf_edi = 8, tf_esi = 672999120, tf_ebp = -1077937452, tf_isp = -843489324, 
      tf_ebx = 672925284, tf_edx = 672999120, tf_ecx = 14, tf_eax = 5, 
      tf_trapno = 12, tf_err = 2, tf_eip = 672834804, tf_cs = 31, tf_eflags = 663, 
      tf_esp = -1077937496, tf_ss = 47}) at /usr/src/sys/i386/i386/trap.c:1157
#21 0xc0274f45 in Xint0x80_syscall ()
#22 0x28082c6f in ?? ()
#23 0x2808729b in ?? ()
#24 0x2807e11d in ?? ()
#25 0x280e7ef0 in ?? ()
#26 0x8048e75 in ?? ()
#27 0x8048d61 in ?? ()
(kgdb) up 11
#11 0xc023d1ef in ufsdirhash_lookup (ip=0xc1390900, 
    name=0xcdae7c0a "univ-month.png.meta", namelen=19, offp=0xc1390964, 
    bpp=0xcdb95cc8, prevoffp=0x0) at /usr/src/sys/ufs/ufs/ufs_dirhash.c:359
359				panic("ufsdirhash_lookup: bad offset in hash array");
(kgdb) list
354		    slot = WRAPINCR(slot, dh->dh_hlen)) {
355			if (offset == DIRHASH_DEL)
356				continue;
357	
358			if (offset < 0 || offset >= ip->i_size)
359				panic("ufsdirhash_lookup: bad offset in hash array");
360			if ((offset & ~bmask) != blkoff) {
361				if (bp != NULL)
362					brelse(bp);
363				blkoff = offset & ~bmask;
(kgdb) print offset
$1 = 33571596
(kgdb) print ip
$2 = (struct inode *) 0xc1390900
(kgdb) print *ip->i_size
There is no member named i_size.
(kgdb) q

>How-To-Repeat:
	don't know

>Fix:
	none

>Release-Note:
>Audit-Trail:

From: David Malone <dwmalone@maths.tcd.ie>
To: Michael Vasilenko <acid@dg.net.ua>
Cc: FreeBSD-gnats-submit@freebsd.org
Subject: Re: kern/34613: kernel panic in ufsdirhash_lookup
Date: Mon, 4 Feb 2002 16:32:13 +0000

 On Mon, Feb 04, 2002 at 05:32:57PM +0200, Michael Vasilenko wrote:
 > panic: ufsdirhash_lookup: bad offset in hash array
 > panic: from debugger
 > Uptime: 30m12s
 
 Is this the same machine that you're seeing the panics on when
 it is heavily loaded?
 
 	David.

From: Michael Vasilenko <acid@dg.net.ua>
To: David Malone <dwmalone@maths.tcd.ie>
Cc: <FreeBSD-gnats-submit@freebsd.org>
Subject: Re: kern/34613: kernel panic in ufsdirhash_lookup
Date: Mon, 4 Feb 2002 18:35:52 +0200 (EET)

 On Mon, 4 Feb 2002, David Malone wrote:
 
 DM> On Mon, Feb 04, 2002 at 05:32:57PM +0200, Michael Vasilenko wrote:
 DM> > panic: ufsdirhash_lookup: bad offset in hash array
 DM> > panic: from debugger
 DM> > Uptime: 30m12s
 DM>
 DM> Is this the same machine that you're seeing the panics on when
 DM> it is heavily loaded?
 
 yes
 
 DM> 	David.
 DM>
 
 -- 
 Michael Vasilenko
 

From: David Malone <dwmalone@maths.tcd.ie>
To: Michael Vasilenko <acid@dg.net.ua>
Cc: FreeBSD-gnats-submit@freebsd.org
Subject: Re: kern/34613: kernel panic in ufsdirhash_lookup
Date: Mon, 4 Feb 2002 16:39:21 +0000

 On Mon, Feb 04, 2002 at 05:32:57PM +0200, Michael Vasilenko wrote:
 > (kgdb) print offset
 > $1 = 33571596
 > (kgdb) print ip
 > $2 = (struct inode *) 0xc1390900
 > (kgdb) print *ip->i_size
 
 Try "print *ip->i_din.di_size".
 
 > There is no member named i_size.
 > (kgdb) q
 
 Actually, the offset 33571596 in hex is 0x200430c. I wonder if that
 is a top bit which got set in the offset by accident. Do you know
 which directory it would have been searching for "univ-month.png.meta"
 in? Could you "ls -ld" it?
 
 	David.

From: Michael Vasilenko <acid@dg.net.ua>
To: David Malone <dwmalone@maths.tcd.ie>
Cc: <FreeBSD-gnats-submit@freebsd.org>
Subject: Re: kern/34613: kernel panic in ufsdirhash_lookup
Date: Mon, 4 Feb 2002 18:49:33 +0200 (EET)

 DM> On Mon, Feb 04, 2002 at 05:32:57PM +0200, Michael Vasilenko wrote:
 DM> > (kgdb) print offset
 DM> > $1 = 33571596
 DM> > (kgdb) print ip
 DM> > $2 = (struct inode *) 0xc1390900
 DM> > (kgdb) print *ip->i_size
 DM>
 DM> Try "print *ip->i_din.di_size".
 
 (kgdb) print *ip->i_din.di_size
 Cannot access memory at address 0x17e00.
 
 DM> > (kgdb) q
 DM>
 DM> Actually, the offset 33571596 in hex is 0x200430c. I wonder if that
 DM> is a top bit which got set in the offset by accident. Do you know
 DM> which directory it would have been searching for "univ-month.png.meta"
 DM> in? Could you "ls -ld" it?
 
 thunderbird:/var/mrtg# ls -l univ-month.png.meta
 -rw-r--r--  1 root  www  39 Feb  4 17:12 univ-month.png.meta
 thunderbird:/var/mrtg# ls -ld
 drwxr-xr-x  2 root  www  97792 Feb  4 18:42 .
 
 -- 
 Michael Vasilenko
 

From: David Malone <dwmalone@maths.tcd.ie>
To: Michael Vasilenko <acid@dg.net.ua>
Cc: FreeBSD-gnats-submit@freebsd.org
Subject: Re: kern/34613: kernel panic in ufsdirhash_lookup 
Date: Mon, 04 Feb 2002 16:59:29 +0000

 > DM> Actually, the offset 33571596 in hex is 0x200430c. I wonder if that
 > DM> is a top bit which got set in the offset by accident. Do you know
 > DM> which directory it would have been searching for "univ-month.png.meta"
 > DM> in? Could you "ls -ld" it?
 
 > thunderbird:/var/mrtg# ls -l univ-month.png.meta
 > -rw-r--r--  1 root  www  39 Feb  4 17:12 univ-month.png.meta
 > thunderbird:/var/mrtg# ls -ld
 > drwxr-xr-x  2 root  www  97792 Feb  4 18:42 .
 
 OK - If I remove the top bit from 0x200430c, then I get 17164
 decimal, which looks like a valid offset in that directory. My guess
 is that something is causing random bit flips in memory and this is
 the cause of the crashes. Could you post the output of "hd -s 17100
 -l 128 ." in that directory, that way we can see if 17164 is the
 right offset.
 
 If it does look like random bit flips then we'll have to figure out
 if it is a software or hardware problem.
 
 	David.
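
Note on the arithmetic above: the following is a minimal C sketch, not taken from the kernel source, that reproduces the figures in this exchange. The helper names clear_bit, highest_set_bit and offset_in_range are hypothetical. offset_in_range only mirrors the bounds check shown at ufs_dirhash.c line 358, and the directory size 97792 from the "ls -ld" output is assumed to stand in for ip->i_size.

```c
#include <assert.h>
#include <stdint.h>

/* Return v with bit number `bit` cleared (bit 0 is the least significant). */
static uint32_t clear_bit(uint32_t v, unsigned bit)
{
    assert(bit < 32);
    return v & ~(UINT32_C(1) << bit);
}

/* Index of the highest set bit in v, or -1 if v is zero. */
static int highest_set_bit(uint32_t v)
{
    int idx = -1;

    while (v != 0) {
        v >>= 1;
        idx++;
    }
    return idx;
}

/*
 * Hypothetical stand-in for the check quoted from ufs_dirhash.c:358,
 * "offset < 0 || offset >= ip->i_size", inverted to report validity.
 */
static int offset_in_range(int64_t offset, int64_t dir_size)
{
    return offset >= 0 && offset < dir_size;
}
```

With these helpers, highest_set_bit(33571596) is 25, clear_bit(33571596, 25) is 17164 (0x430c), and only the cleared value passes offset_in_range against 97792. Note also that 97792 is 0x17e00, the address kgdb reported it could not access, which would be consistent with di_size holding the directory size and the leading "*" in the print command dereferencing it.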

From: Michael Vasilenko <acid@dg.net.ua>
To: David Malone <dwmalone@maths.tcd.ie>
Cc: <FreeBSD-gnats-submit@freebsd.org>
Subject: Re: kern/34613: kernel panic in ufsdirhash_lookup 
Date: Mon, 4 Feb 2002 19:26:51 +0200 (EET)

 On Mon, 4 Feb 2002, David Malone wrote:
 
 DM> > DM> Actually, the offset 33571596 in hex is 0x200430c. I wonder if that
 DM> > DM> is a top bit which got set in the offset by accident. Do you know
 DM> > DM> which directory it would have been searching for "univ-month.png.meta"
 DM> > DM> in? Could you "ls -ld" it?
 DM>
 DM> > thunderbird:/var/mrtg# ls -l univ-month.png.meta
 DM> > -rw-r--r--  1 root  www  39 Feb  4 17:12 univ-month.png.meta
 DM> > thunderbird:/var/mrtg# ls -ld
 DM> > drwxr-xr-x  2 root  www  97792 Feb  4 18:42 .
 DM>
 DM> OK - If I remove the top bit from 0x200430c, then I get 17164
 DM> decimal, which looks like a valid offset in that directory. My guess
 DM> is that something is causing random bit flips in memory and this is
 DM> the cause of the crashes. Could you post the output of "hd -s 17100
 DM> -l 128 ." in that directory, that way we can see if 17164 is the
 DM> right offset.
 DM>
 DM> If it does look like random bit flips then we'll have to figure out
 DM> if it is a software or hardware problem.
 
 000042cc  72 2e 70 6e 67 00 00 00  21 60 1b 00 1c 00 08 11  |r.png...!`......|
 000042dc  75 6e 69 76 2d 64 61 79  2e 70 6e 67 2e 6d 65 74  |univ-day.png.met|
 000042ec  61 00 35 cc 22 60 1b 00  1c 00 08 12 75 6e 69 76  |a.5."`......univ|
 000042fc  2d 77 65 65 6b 2e 70 6e  67 2e 6d 65 74 61 00 cc  |-week.png.meta..|
 0000430c  23 60 1b 00 1c 00 08 13  75 6e 69 76 2d 6d 6f 6e  |#`......univ-mon|
 0000431c  74 68 2e 70 6e 67 2e 6d  65 74 61 00 24 60 1b 00  |th.png.meta.$`..|
 0000432c  1c 00 08 12 75 6e 69 76  2d 79 65 61 72 2e 70 6e  |....univ-year.pn|
 0000433c  67 2e 6d 65 74 61 00 cc  26 60 1b 00 18 00 08 0e  |g.meta..&`......|
 0000434c
 
 -- 
 Michael Vasilenko
 

From: David Malone <dwmalone@maths.tcd.ie>
To: Michael Vasilenko <acid@dg.net.ua>
Cc: FreeBSD-gnats-submit@freebsd.org
Subject: Re: kern/34613: kernel panic in ufsdirhash_lookup 
Date: Mon, 04 Feb 2002 17:39:00 +0000

 > DM> If it does look like random bit flips then we'll have to figure out
 > DM> if it is a software or hardware problem.
 
 > 000042cc  72 2e 70 6e 67 00 00 00  21 60 1b 00 1c 00 08 11  |r.png...!`......|
 > 000042dc  75 6e 69 76 2d 64 61 79  2e 70 6e 67 2e 6d 65 74  |univ-day.png.met|
 > 000042ec  61 00 35 cc 22 60 1b 00  1c 00 08 12 75 6e 69 76  |a.5."`......univ|
 > 000042fc  2d 77 65 65 6b 2e 70 6e  67 2e 6d 65 74 61 00 cc  |-week.png.meta..|
 > 0000430c  23 60 1b 00 1c 00 08 13  75 6e 69 76 2d 6d 6f 6e  |#`......univ-mon|
 > 0000431c  74 68 2e 70 6e 67 2e 6d  65 74 61 00 24 60 1b 00  |th.png.meta.$`..|
 
 Definitely a bit flip. See the line which starts 0000430c: there are
 4 bytes for the inode number, 2 bytes for the record length, 1 for
 the file type and 1 for the name length and then the name dirhash
 was looking for. We're almost certainly looking at a case of dirhash
 writing the correct offset in and then something corrupting it by
 flipping a bit.
 
 The other panic you reported is probably caused by the same sort
 of corruption.
 
 	http://www.FreeBSD.org/cgi/query-pr.cgi?pr=34605
 
 Is this new hardware, or was it successfully in use before you
 installed 4.5? If it was successfully in use before, then what
 version of FreeBSD were you using before?
 
 	David.
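
Note on the entry layout described above: the following is a minimal C sketch that decodes the 28 bytes starting at 0000430c in the posted hex dump, using the field order given in the message (4-byte inode number, 2-byte record length, 1-byte file type, 1-byte name length, then the name). The struct dirent_view type and the parse_dirent function are hypothetical names for illustration and are not the kernel's own definitions. Little-endian byte order is assumed because the machine is i386.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical decoded view of one on-disk directory entry. */
struct dirent_view {
    uint32_t ino;       /* inode number */
    uint16_t reclen;    /* length of this record in bytes */
    uint8_t  type;      /* file type */
    uint8_t  namlen;    /* length of the name, without the NUL */
    char     name[256]; /* NUL-terminated copy of the name */
};

/* Decode one little-endian entry from buf; returns 0 on success, -1 if short. */
static int parse_dirent(const uint8_t *buf, size_t len, struct dirent_view *out)
{
    assert(buf != NULL && out != NULL);
    if (len < 8)
        return -1;
    out->ino = (uint32_t)buf[0] | (uint32_t)buf[1] << 8 |
               (uint32_t)buf[2] << 16 | (uint32_t)buf[3] << 24;
    out->reclen = (uint16_t)(buf[4] | buf[5] << 8);
    out->type = buf[6];
    out->namlen = buf[7];
    if ((size_t)8 + out->namlen > len)
        return -1;
    memcpy(out->name, buf + 8, out->namlen);
    out->name[out->namlen] = '\0';
    return 0;
}

/* Bytes copied from the hex dump lines starting 0000430c and 0000431c. */
static const uint8_t entry_at_430c[] = {
    0x23, 0x60, 0x1b, 0x00, 0x1c, 0x00, 0x08, 0x13,
    0x75, 0x6e, 0x69, 0x76, 0x2d, 0x6d, 0x6f, 0x6e,
    0x74, 0x68, 0x2e, 0x70, 0x6e, 0x67, 0x2e, 0x6d,
    0x65, 0x74, 0x61, 0x00
};
```

Decoding entry_at_430c this way gives inode 0x001b6023, record length 28, file type 8, name length 19 and the name "univ-month.png.meta". The name and length agree with the name and namelen arguments in frame #11 of the backtrace, which supports 17164 (0x430c) being the offset dirhash originally stored.
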
State-Changed-From-To: open->feedback 
State-Changed-By: dwmalone 
State-Changed-When: Thu Feb 7 08:45:55 PST 2002 
State-Changed-Why:  
Waiting to hear from Michael to see if he reckons we've got hardware 
or software corruption. 

http://www.FreeBSD.org/cgi/query-pr.cgi?pr=34613 

From: Michael Vasilenko <acid@dg.net.ua>
To: David Malone <dwmalone@maths.tcd.ie>
Cc: <FreeBSD-gnats-submit@freebsd.org>
Subject: Re: kern/34613: kernel panic in ufsdirhash_lookup 
Date: Fri, 8 Feb 2002 00:18:55 +0200 (EET)

 On Mon, 4 Feb 2002, David Malone wrote:
 
 DM> > DM> If it does look like random bit flips then we'll have to figure out
 DM> > DM> if it is a software or hardware problem.
 DM>
 DM> > 000042cc  72 2e 70 6e 67 00 00 00  21 60 1b 00 1c 00 08 11  |r.png...!`......|
 DM> > 000042dc  75 6e 69 76 2d 64 61 79  2e 70 6e 67 2e 6d 65 74  |univ-day.png.met|
 DM> > 000042ec  61 00 35 cc 22 60 1b 00  1c 00 08 12 75 6e 69 76  |a.5."`......univ|
 DM> > 000042fc  2d 77 65 65 6b 2e 70 6e  67 2e 6d 65 74 61 00 cc  |-week.png.meta..|
 DM> > 0000430c  23 60 1b 00 1c 00 08 13  75 6e 69 76 2d 6d 6f 6e  |#`......univ-mon|
 DM> > 0000431c  74 68 2e 70 6e 67 2e 6d  65 74 61 00 24 60 1b 00  |th.png.meta.$`..|
 DM>
 DM> Definitely a bit flip. See the line which starts 0000430c: there are
 DM> 4 bytes for the inode number, 2 bytes for the record length, 1 for
 DM> the file type and 1 for the name length and then the name dirhash
 DM> was looking for. We're almost certainly looking at a case of dirhash
 DM> writing the correct offset in and then something corrupting it by
 DM> flipping a bit.
 
 Thank you, it really was a hardware problem (bad memory chip); you may
 close my second case too.
 
 -- 
 Michael Vasilenko
 
State-Changed-From-To: feedback->closed 
State-Changed-By: dwmalone 
State-Changed-When: Thu Feb 7 14:32:02 PST 2002 
State-Changed-Why:  
Submitter confirms that it is a hardware problem - bad memory seemingly. 

http://www.FreeBSD.org/cgi/query-pr.cgi?pr=34613 
>Unformatted:
