From nobody@FreeBSD.ORG  Tue Jun 20 13:19:57 2000
Return-Path: <nobody@FreeBSD.ORG>
Received: by hub.freebsd.org (Postfix, from userid 32767)
	id B066837BE0C; Tue, 20 Jun 2000 13:19:57 -0700 (PDT)
Message-Id: <20000620201957.B066837BE0C@hub.freebsd.org>
Date: Tue, 20 Jun 2000 13:19:57 -0700 (PDT)
From: krentel@dreamscape.com
Sender: nobody@FreeBSD.ORG
To: freebsd-gnats-submit@FreeBSD.org
Subject: Panic running linux binary on ext2fs
X-Send-Pr-Version: www-1.0

>Number:         19407
>Category:       kern
>Synopsis:       Panic running linux binary on ext2fs
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Tue Jun 20 13:20:00 PDT 2000
>Closed-Date:    Sun Nov 12 20:46:31 PST 2000
>Last-Modified:  Sun Nov 12 20:51:31 PST 2000
>Originator:     Mark W. Krentel
>Release:        FreeBSD 4.0-STABLE as of May 20
>Organization:
Home
>Environment:
FreeBSD 4.0-STABLE as of May 20

>Description:
I get a panic when running linux binaries directly from an ext2fs
partition.  They run fine on UFS, but on ext2fs they panic.  I'm 
using linux_base-6.1 and FreeBSD 4.0 as of May 20.

I can reboot, login as root, mount the Linux partition, load the linux
module, login as user, cd to /mnt/bin (Linux's /bin), run ./ls, cd out
of /mnt, unmount /mnt, and then it panics.  I've asked about this on
-emulation and the answer was "works fine for me."  But it happens so
easily and reliably for me that I'm just baffled about what I'm doing
differently.

The problem is not new.  I've seen it with linux_base-5.2 and FreeBSD
3.2, and now with linux_base-6.1 and FreeBSD 4.0.  It happens with
Slackware 7, RedHat 6.0 and 6.1 binaries, and on ext2 file systems
with revision 0 and 1, and with blocksize 1024, 2048 and 4096.

I know everyone says, "it's not hardware", but really, it's not the
hardware. :-)  The machine has always been very reliable: I can do
multiple buildworlds, this is the only crash I've had in two years,
and I can produce the panic on demand.  And it's not bad sectors on
the disk; I've tried the Linux partition on two disks and it panics
on both.

The machine is a PPro/166, ASUS mobo, 64 meg parity memory, Adaptec
2940 controller.  The kernel is GENERIC plus EXT2FS and IPFIREWALL,
and minus a bunch of devices I don't have.  Kernel and world built
with "-O -pipe".

The panics differ a bit between 3.4 and 4.0.  In 3.4, the Linux
./ls panics immediately, usually with ext2_readdir somewhere on the
stack, and I often get a "lockmgr: not exclusive lock holder" error.
In 4.0, ./ls returns, but the output is corrupt (too few files), and
then unmounting /mnt produces the panic.  So, 4.0 is more recent, but
I think the 3.4 stack traces are more informative.

Here is a stack trace from 3.4.

% gdb -k
GNU gdb 4.18
Copyright 1998 Free Software Foundation, Inc.
This GDB was configured as "i386-unknown-freebsd".
(kgdb) symbol-file kernel.debug
Reading symbols from kernel.debug...done.
(kgdb) exec-file kernel.1
(kgdb) core-file vmcore.1
IdlePTD 2768896
initial pcb at 23b078
panicstr: page fault
panic messages:
---
Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0xc08d7000
fault code              = supervisor write, page not present
instruction pointer     = 0x8:0xc012fe83
stack pointer           = 0x10:0xc4583bb8
frame pointer           = 0x10:0xc4583d34
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 272 (ls)
interrupt mask          = 
trap number             = 12
panic: page fault

syncing disks... done

dumping to dev 401, offset 196608
dump 64 63 62 ... 3 2 1 
---
#0  boot (howto=256) at ../../kern/kern_shutdown.c:285
285                     dumppcb.pcb_cr3 = rcr3();
(kgdb) bt
#0  boot (howto=256) at ../../kern/kern_shutdown.c:285
#1  0xc0144f90 in at_shutdown (
    function=0xc021ec86 <__set_sysinit_set_sym_memdev_sys_init+1050>, 
    arg=0xc4534700, queue=-1000850564) at ../../kern/kern_shutdown.c:446
#2  0xc01f6e3d in trap_fatal (frame=0xc4583b7c, eva=3230494720)
    at ../../i386/i386/trap.c:942
#3  0xc01f6b1b in trap_pfault (frame=0xc4583b7c, usermode=0, eva=3230494720)
    at ../../i386/i386/trap.c:835
#4  0xc01f67be in trap (frame={tf_es = -2147483632, tf_ds = -1064501232, tf_edi = 0, 
      tf_esi = -1064479919, tf_ebp = -1000850124, tf_isp = -1000850524, 
      tf_ebx = -1064479920, tf_edx = 848, tf_ecx = 0, tf_eax = -1064472576, 
      tf_trapno = 12, tf_err = 2, tf_eip = -1072497021, tf_cs = 8, tf_eflags = 66118, 
      tf_esp = -1000760896, tf_ss = -1000850008}) at ../../i386/i386/trap.c:437
#5  0xc012fe83 in ext2_readdir (ap=0xc4583d8c) at ../../gnu/ext2fs/ext2_lookup.c:238
#6  0xc08ba4ab in ?? ()
#7  0xc01f707f in syscall (frame={tf_es = 39, tf_ds = 39, tf_edi = 3, 
      tf_esi = 134578064, tf_ebp = -1077945160, tf_isp = -1000849436, tf_ebx = 3, 
      tf_edx = 849, tf_ecx = -1077946080, tf_eax = 141, tf_trapno = 12, tf_err = 2, 
      tf_eip = 672099533, tf_cs = 31, tf_eflags = 642, tf_esp = -1077946084, 
      tf_ss = 39}) at ../../i386/i386/trap.c:1100
#8  0xc01ed5dc in Xint0x80_syscall ()
#9  0x280f696c in ?? ()
#10 0x804a6d5 in ?? ()
#11 0x8049594 in ?? ()
#12 0x280811eb in ?? ()
(kgdb) fr 4
#4  0xc01f67be in trap (frame={tf_es = -2147483632, tf_ds = -1064501232, tf_edi = 0, 
      tf_esi = -1064479919, tf_ebp = -1000850124, tf_isp = -1000850524, 
      tf_ebx = -1064479920, tf_edx = 848, tf_ecx = 0, tf_eax = -1064472576, 
      tf_trapno = 12, tf_err = 2, tf_eip = -1072497021, tf_cs = 8, tf_eflags = 66118, 
      tf_esp = -1000760896, tf_ss = -1000850008}) at ../../i386/i386/trap.c:437
437                             (void) trap_pfault(&frame, FALSE, eva);
(kgdb) info fr
Stack level 4, frame at 0xc4583b74:
 eip = 0xc01f67be in trap (../../i386/i386/trap.c:437); saved eip 0xc012fe83
 called by frame at 0xc4583d34, caller of frame at 0xc4583b48
 source language c.
 Arglist at 0xc4583b74, args: frame={tf_es = -2147483632, tf_ds = -1064501232, 
      tf_edi = 0, tf_esi = -1064479919, tf_ebp = -1000850124, tf_isp = -1000850524, 
      tf_ebx = -1064479920, tf_edx = 848, tf_ecx = 0, tf_eax = -1064472576, 
      tf_trapno = 12, tf_err = 2, tf_eip = -1072497021, tf_cs = 8, tf_eflags = 66118, 
      tf_esp = -1000760896, tf_ss = -1000850008}
 Locals at 0xc4583b74, Previous frame's sp is 0x0
 Saved registers:
  ebx at 0xc4583b5c, ebp at 0xc4583b74, esi at 0xc4583b60, edi at 0xc4583b64,
  eip at 0xc4583b78
(kgdb) info loc
p = (struct proc *) 0xc4534700
sticks = 1024
i = 0
ucode = 0
type = 12
code = 0
eva = 3230494720
(kgdb) up
#5  0xc012fe83 in ext2_readdir (ap=0xc4583d8c) at ../../gnu/ext2fs/ext2_lookup.c:238
238                                     *cookiep++ = (u_long) off;
(kgdb) info fr
Stack level 5, frame at 0xc4583d34:
 eip = 0xc012fe83 in ext2_readdir (../../gnu/ext2fs/ext2_lookup.c:238); 
    saved eip 0xc08ba4ab
 called by frame at 0xc4583f60, caller of frame at 0xc4583b74
 source language c.
 Arglist at 0xc4583d34, args: ap=0xc4583d8c
 Locals at 0xc4583d34, Previous frame's sp is 0x0
 Saved registers:
  ebx at 0xc4583bb8, ebp at 0xc4583d34, esi at 0xc4583bbc, edi at 0xc4583bc0,
  eip at 0xc4583d38
(kgdb) info loc
cookies = (long unsigned int *) 0xc08d6e00
cookiep = (long unsigned int *) 0x0
off = 848
uio = (struct uio *) 0xc4583f40
count = -2147483648
error = 0
edp = (struct ext2_dir_entry_2 *) 0xc08d5351
dp = (struct ext2_dir_entry_2 *) 0x80000000
ncookies = 56
dstdp = {d_fileno = 26226, d_reclen = 20, d_type = 0 '\000', d_namlen = 9 '\t', 
  d_name = "linuxconf\000\000\000 ... " ...}
auio = {uio_iov = 0xc4583c04, uio_iovcnt = 1, uio_offset = 849, uio_resid = 0, 
  uio_segflg = UIO_SYSSPACE, uio_rw = UIO_READ, uio_procp = 0xc4534700}
aiov = {iov_base = 0xc08d5351 "", iov_len = 0}
dirbuf = 0xc08d5000 "<F9>e"
readcnt = 0
startoffset = 0
(kgdb) up
#6  0xc08ba4ab in ?? ()
(kgdb) info fr
Stack level 6, frame at 0xc4583f60:
 eip = 0xc08ba4ab; saved eip 0xc01f707f
 called by frame at 0xc4583fb4, caller of frame at 0xc4583d34
 Arglist at 0xc4583f60, args: 
 Locals at 0xc4583f60, Previous frame's sp is 0x0
 Saved registers:
  ebp at 0xc4583f60, eip at 0xc4583f64
(kgdb) up
#7  0xc01f707f in syscall (frame={tf_es = 39, tf_ds = 39, tf_edi = 3, 
      tf_esi = 134578064, tf_ebp = -1077945160, tf_isp = -1000849436, tf_ebx = 3, 
      tf_edx = 849, tf_ecx = -1077946080, tf_eax = 141, tf_trapno = 12, tf_err = 2, 
      tf_eip = 672099533, tf_cs = 31, tf_eflags = 642, tf_esp = -1077946084, 
      tf_ss = 39}) at ../../i386/i386/trap.c:1100
1100            error = (*callp->sy_call)(p, args);
(kgdb) info fr
Stack level 7, frame at 0xc4583fb4:
 eip = 0xc01f707f in syscall (../../i386/i386/trap.c:1100); saved eip 0xc01ed5dc
 called by frame at 0xbfbfdcb8, caller of frame at 0xc4583f60
 source language c.
 Arglist at 0xc4583fb4, args: frame={tf_es = 39, tf_ds = 39, tf_edi = 3, 
      tf_esi = 134578064, tf_ebp = -1077945160, tf_isp = -1000849436, tf_ebx = 3, 
      tf_edx = 849, tf_ecx = -1077946080, tf_eax = 141, tf_trapno = 12, tf_err = 2, 
      tf_eip = 672099533, tf_cs = 31, tf_eflags = 642, tf_esp = -1077946084, 
      tf_ss = 39}
 Locals at 0xc4583fb4, Previous frame's sp is 0x0
 Saved registers:
  ebx at 0xc4583f70, ebp at 0xc4583fb4, esi at 0xc4583f74, edi at 0xc4583f78,
  eip at 0xc4583fb8
(kgdb) info loc
params = 0x0
callp = (struct sysent *) 0xc4583da8
p = (struct proc *) 0xc45999c0
sticks = 3
error = 3
args = {3, -1077946080, 849, 134578064, 3, 134579140, 3, 0}
code = 141
(kgdb) up
#8  0xc01ed5dc in Xint0x80_syscall ()
(kgdb) info fr
Stack level 8, frame at 0xbfbfdcb8:
 eip = 0xc01ed5dc in Xint0x80_syscall; saved eip 0x280f696c
 (FRAMELESS), called by frame at 0xbfbfdccc, caller of frame at 0xc4583fb4
 Arglist at 0xbfbfdcb8, args: 
 Locals at 0xbfbfdcb8, Previous frame's sp is 0x0
 Saved registers:
  ebp at 0xbfbfdcb8, eip at 0xbfbfdcbc
(kgdb)  
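
A note for reading the trace above: gdb prints the trap frame fields
and eva as decimal integers, some of them negative, while the panic
message uses hex.  The following is a minimal sketch (the helper names
reg_to_u32 and show_reg are made up for illustration) that converts
such a value back to the 32-bit address it represents, so that tf_eip
and eva can be matched against the "instruction pointer" and "fault
virtual address" lines.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: reinterpret a register value, as printed by gdb
 * in decimal (possibly negative), as the unsigned 32-bit address it
 * encodes.
 */
uint32_t reg_to_u32(long long printed)
{
    /* Conversion to an unsigned type is defined modulo 2^32. */
    return (uint32_t)printed;
}

/* Print one value in the hex form used by the panic message. */
void show_reg(const char *name, long long printed)
{
    printf("%-8s = 0x%08x\n", name, (unsigned)reg_to_u32(printed));
}
```

For the 3.4 trace, reg_to_u32(-1072497021) is 0xc012fe83, the address
of the faulting instruction in ext2_readdir, and
reg_to_u32(3230494720LL) is 0xc08d7000, the fault virtual address.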


And here is a stack trace from 4.0.

% gdb -k
GNU gdb 4.18
Copyright 1998 Free Software Foundation, Inc.
This GDB was configured as "i386-unknown-freebsd".
(kgdb) symbol-file kernel.debug
Reading symbols from kernel.debug...done.
(kgdb) exec-file kernel.5
(kgdb) core-file vmcore.5
IdlePTD 2953216
initial pcb at 265020
panicstr: page fault
panic messages:
---
Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x1e4
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc016faac
stack pointer           = 0x10:0xc61f2e90
frame pointer           = 0x10:0xc61f2e9c
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 237 (umount)
interrupt mask          = none
trap number             = 12
panic: page fault

syncing disks... 1 
done
Uptime: 6m46s

dumping to dev #da/1, offset 196608
dump 64 63 62 ... 3 2 1 
---
#0  boot (howto=256) at ../../kern/kern_shutdown.c:302
302                     dumppcb.pcb_cr3 = rcr3();
(kgdb) bt
#0  boot (howto=256) at ../../kern/kern_shutdown.c:302
#1  0xc014ad50 in poweroff_wait (junk=0xc023f5cf, howto=-976397344)
    at ../../kern/kern_shutdown.c:552
#2  0xc02108d1 in trap_fatal (frame=0xc61f2e50, eva=484)
    at ../../i386/i386/trap.c:927
#3  0xc02105a9 in trap_pfault (frame=0xc61f2e50, usermode=0, eva=484)
    at ../../i386/i386/trap.c:820
#4  0xc02101a7 in trap (frame={tf_fs = 16, tf_es = -971112432, tf_ds = -976420848, 
      tf_edi = -1064273920, tf_esi = 484, tf_ebp = -971034980, tf_isp = -971035012, 
      tf_ebx = -1064570884, tf_edx = 484, tf_ecx = -976397344, tf_eax = -971091648, 
      tf_trapno = 12, tf_err = 0, tf_eip = -1072235860, tf_cs = 8, 
      tf_eflags = 66054, tf_esp = -1064273920, tf_ss = -971078016})
    at ../../i386/i386/trap.c:426
#5  0xc016faac in cache_purgevfs (mp=0xc0907800) at ../../kern/vfs_cache.c:403
#6  0xc0176349 in dounmount (mp=0xc0907800, flags=0, p=0xc5cd5be0)
    at ../../kern/vfs_syscalls.c:482
#7  0xc01762d9 in unmount (p=0xc5cd5be0, uap=0xc61f2f80)
    at ../../kern/vfs_syscalls.c:456
#8  0xc0210b7d in syscall2 (frame={tf_fs = 47, tf_es = 47, tf_ds = 47, 
      tf_edi = 134640005, tf_esi = 134706237, tf_ebp = -1077937140, 
      tf_isp = -971034668, tf_ebx = 0, tf_edx = 0, tf_ecx = 3, tf_eax = 22, 
      tf_trapno = 12, tf_err = 2, tf_eip = 134522392, tf_cs = 31, tf_eflags = 647, 
      tf_esp = -1077938288, tf_ss = 47}) at ../../i386/i386/trap.c:1126
#9  0xc0205b76 in Xint0x80_syscall ()
#10 0x8048402 in ?? ()
#11 0x80480f9 in ?? ()
(kgdb) fr 4
#4  0xc02101a7 in trap (frame={tf_fs = 16, tf_es = -971112432, tf_ds = -976420848, 
      tf_edi = -1064273920, tf_esi = 484, tf_ebp = -971034980, tf_isp = -971035012, 
      tf_ebx = -1064570884, tf_edx = 484, tf_ecx = -976397344, tf_eax = -971091648, 
      tf_trapno = 12, tf_err = 0, tf_eip = -1072235860, tf_cs = 8, 
      tf_eflags = 66054, tf_esp = -1064273920, tf_ss = -971078016})
    at ../../i386/i386/trap.c:426
426                             (void) trap_pfault(&frame, FALSE, eva);
(kgdb) info fr
Stack level 4, frame at 0xc61f2e48:
 eip = 0xc02101a7 in trap (../../i386/i386/trap.c:426); saved eip 0xc016faac
 called by frame at 0xc61f2e9c, caller of frame at 0xc61f2e08
 source language c.
 Arglist at 0xc61f2e48, args: frame={tf_fs = 16, tf_es = -971112432, 
      tf_ds = -976420848, tf_edi = -1064273920, tf_esi = 484, tf_ebp = -971034980, 
      tf_isp = -971035012, tf_ebx = -1064570884, tf_edx = 484, tf_ecx = -976397344, 
      tf_eax = -971091648, tf_trapno = 12, tf_err = 0, tf_eip = -1072235860, 
      tf_cs = 8, tf_eflags = 66054, tf_esp = -1064273920, tf_ss = -971078016}
 Locals at 0xc61f2e48, Previous frame's sp is 0x0
 Saved registers:
  ebx at 0xc61f2e1c, ebp at 0xc61f2e48, esi at 0xc61f2e20, edi at 0xc61f2e24,
  eip at 0xc61f2e4c
(kgdb) info loc
p = (struct proc *) 0xc5cd5be0
sticks = 14275995756443556832
i = 0
ucode = 0
type = 12
code = 0
eva = 484
(kgdb) up
#5  0xc016faac in cache_purgevfs (mp=0xc0907800) at ../../kern/vfs_cache.c:403
403                     for (ncp = LIST_FIRST(ncpp); ncp != 0; ncp = nnp) {
(kgdb) info fr
Stack level 5, frame at 0xc61f2e9c:
 eip = 0xc016faac in cache_purgevfs (../../kern/vfs_cache.c:403); 
    saved eip 0xc0176349
 called by frame at 0xc61f2ec0, caller of frame at 0xc61f2e48
 source language c.
 Arglist at 0xc61f2e9c, args: mp=0xc0907800
 Locals at 0xc61f2e9c, Previous frame's sp is 0x0
 Saved registers:
  ebx at 0xc61f2e90, ebp at 0xc61f2e9c, esi at 0xc61f2e94, edi at 0xc61f2e98,
  eip at 0xc61f2ea0
(kgdb) info loc
mp = (struct mount *) 0xc0907800
ncpp = (struct nchashhead *) 0x0
ncp = (struct namecache *) 0x0
nnp = (struct namecache *) 0x1e4
(kgdb) up
#6  0xc0176349 in dounmount (mp=0xc0907800, flags=0, p=0xc5cd5be0)
    at ../../kern/vfs_syscalls.c:482
482             cache_purgevfs(mp);     /* remove cache entries for this file sys */
(kgdb) info fr
Stack level 6, frame at 0xc61f2ec0:
 eip = 0xc0176349 in dounmount (../../kern/vfs_syscalls.c:482); saved eip 0xc01762d9
 called by frame at 0xc61f2f2c, caller of frame at 0xc61f2e9c
 source language c.
 Arglist at 0xc61f2ec0, args: mp=0xc0907800, flags=0, p=0xc5cd5be0
 Locals at 0xc61f2ec0, Previous frame's sp is 0x0
 Saved registers:
  ebx at 0xc61f2eb0, ebp at 0xc61f2ec0, esi at 0xc61f2eb4, edi at 0xc61f2eb8,
  eip at 0xc61f2ec4
(kgdb) info loc
mp = (struct mount *) 0xc0907800
p = (struct proc *) 0xc5cd5be0
coveredvp = (struct vnode *) 0x0
error = -971078016
async_flag = 0
(kgdb) up
#7  0xc01762d9 in unmount (p=0xc5cd5be0, uap=0xc61f2f80)
    at ../../kern/vfs_syscalls.c:456
456             return (dounmount(mp, SCARG(uap, flags), p));
(kgdb) info fr
Stack level 7, frame at 0xc61f2f2c:
 eip = 0xc01762d9 in unmount (../../kern/vfs_syscalls.c:456); saved eip 0xc0210b7d
 called by frame at 0xc61f2fa0, caller of frame at 0xc61f2ec0
 source language c.
 Arglist at 0xc61f2f2c, args: p=0xc5cd5be0, uap=0xc61f2f80
 Locals at 0xc61f2f2c, Previous frame's sp is 0x0
 Saved registers:
  ebx at 0xc61f2ed8, ebp at 0xc61f2f2c, esi at 0xc61f2edc, edi at 0xc61f2ee0,
  eip at 0xc61f2f30
(kgdb) info loc
vp = (struct vnode *) 0xc61e8680
mp = (struct mount *) 0xc0907800
error = 0
nd = {ni_dirp = 0xbfbffd73 "/mnt", ni_segflg = UIO_USERSPACE, ni_startdir = 0x0, 
  ni_rootdir = 0xc5cd2e00, ni_topdir = 0x0, ni_vp = 0xc61e8680, 
  ni_dvp = 0xc5cd2e00, ni_pathlen = 1, ni_next = 0xc5ce7404 "", ni_loopcnt = 0, 
  ni_cnd = {cn_nameiop = 0, cn_flags = 49220, cn_proc = 0xc5cd5be0, 
    cn_cred = 0xc0955900, cn_pnbuf = 0xc5ce7400 "", cn_nameptr = 0xc5ce7401 "p<CE><C5>", 
    cn_namelen = 3, cn_consume = 0}}
(kgdb) up
#8  0xc0210b7d in syscall2 (frame={tf_fs = 47, tf_es = 47, tf_ds = 47, 
      tf_edi = 134640005, tf_esi = 134706237, tf_ebp = -1077937140, 
      tf_isp = -971034668, tf_ebx = 0, tf_edx = 0, tf_ecx = 3, tf_eax = 22, 
      tf_trapno = 12, tf_err = 2, tf_eip = 134522392, tf_cs = 31, tf_eflags = 647, 
      tf_esp = -1077938288, tf_ss = 47}) at ../../i386/i386/trap.c:1126
1126            error = (*callp->sy_call)(p, args);
(kgdb) info fr
Stack level 8, frame at 0xc61f2fa0:
 eip = 0xc0210b7d in syscall2 (../../i386/i386/trap.c:1126); saved eip 0xc0205b76
 called by frame at 0xbfbffc0c, caller of frame at 0xc61f2f2c
 source language c.
 Arglist at 0xc61f2fa0, args: frame={tf_fs = 47, tf_es = 47, tf_ds = 47, 
      tf_edi = 134640005, tf_esi = 134706237, tf_ebp = -1077937140, 
      tf_isp = -971034668, tf_ebx = 0, tf_edx = 0, tf_ecx = 3, tf_eax = 22, 
      tf_trapno = 12, tf_err = 2, tf_eip = 134522392, tf_cs = 31, tf_eflags = 647, 
      tf_esp = -1077938288, tf_ss = 47}
 Locals at 0xc61f2fa0, Previous frame's sp is 0x0
 Saved registers:
  ebx at 0xc61f2f3c, ebp at 0xc61f2fa0, esi at 0xc61f2f40, edi at 0xc61f2f44,
  eip at 0xc61f2fa4
(kgdb) info loc
params = 0xbfbff794 "s<FD><BF><BF>"
i = 0
callp = (struct sysent *) 0xc024c0f4
p = (struct proc *) 0xc5cd5be0
sticks = 0
error = 0
narg = 2
args = {-1077936781, 0, 0, 0, 0, 0, 0, 0}
have_mplock = 1
code = 22
(kgdb) up
#9  0xc0205b76 in Xint0x80_syscall ()
(kgdb) info fr
Stack level 9, frame at 0xbfbffc0c:
 eip = 0xc0205b76 in Xint0x80_syscall; saved eip 0x8048402
 (FRAMELESS), called by frame at 0xbfbffc5c, caller of frame at 0xc61f2fa0
 Arglist at 0xbfbffc0c, args: 
 Locals at 0xbfbffc0c, Previous frame's sp is 0x0
 Saved registers:
  ebp at 0xbfbffc0c, eip at 0xbfbffc10
(kgdb)  


>How-To-Repeat:
On my system in 4.0, I can reboot, login as root, mount the Linux
partition, load the linux module, switch to an alternate console and
login as user, cd to /mnt/bin (Linux's /bin), run ./ls, cd out of
/mnt, switch back to root and unmount /mnt, and then it panics.

And it happens with Slackware 7, RedHat 6.0 and 6.1 binaries, and on
ext2 file systems with revision 0 and 1, and with blocksize 1024, 2048
and 4096.

>Fix:


>Release-Note:
>Audit-Trail:

From: "Mark W. Krentel" <krentel@dreamscape.com>
To: freebsd-gnats-submit@FreeBSD.ORG
Cc:  
Subject: Re: kern/19407: Panic running linux binary on ext2fs
Date: Mon, 26 Jun 2000 15:12:02 -0400 (EDT)

 It was suggested on -emulation that this may be a branding issue.
 So, I copied the Linux binaries, branded them and ran the branded
 version (still on ext2fs).  And I get the same panic.
 
 --Mark
 

From: "Mark W. Krentel" <krentel@dreamscape.com>
To: freebsd-gnats-submit@FreeBSD.ORG
Cc:  
Subject: Re: kern/19407: Panic running linux binary on ext2fs
Date: Fri, 7 Jul 2000 01:53:55 -0400 (EDT)

 I've run some more experiments and I've narrowed the problem somewhat.
 
 Using the Slackware 7 live file system, I tar-copied /cdrom/live/bin
 onto ufs and ext2fs partitions.  Then I ran Slackware's ls from ufs,
 cdrom and ext2fs and listed directories on ufs, cdrom and ext2fs.
 Sometimes it worked ok, sometimes the output of ls was corrupt (too
 few files), and the pattern is quite clear.
 
                         directory listed on 
     binary on        ufs       cdrom       ext2fs
       ufs            ok       corrupt      corrupt
      cdrom           ok       corrupt      corrupt
      ext2fs          ok       corrupt      corrupt
 
 I also updated libncurses.so.5.0 and installed emacs's libexec and
 share files and repeated the above test with dired from emacs.  I got
 the same results, except that the corrupt directory listings were
 slightly different between ls and emacs.  For example, in one
 directory on ext2fs that actually has 77 files, ls reported 71 files,
 but dired listed only 29.  But they always either both worked or both
 had too few files.  And sometimes the bottom row panics, but not this
 time.
 
 For example, this is Slackware's ls (on ufs) listing a directory on
 ext2fs that actually has 89 files.
 
    % ./ls /mnt/bin
    awk    chmod         cp  gawk        keys;^  mkdir   mv     sed   touch
    bash   chown         dd  gawk-3.0.4  ln      mknod   rm     sh
    chgrp  consolechars  df  igawk       ls      mktemp  rmdir  sync
 
 And the same Linux ls listing a cdrom directory with 801 files.
 It comes up 792 files short.
 
    % ./ls /cdrom/live/usr/bin  
    00_TRANS.TBL  a2p  aafire  aainfo  aasavefont  aatest  aclocal  addr  addr2line
 
 So, apparently the Linux ls is having trouble reading non-ufs file
 systems.  And I noticed that dired was unable to do path completion.
 I typed /cdrom/li and hit tab, and emacs complained that there was no
 completion, probably because there is no /compat/linux/cdrom/li*.  But
 there is /cdrom/live/bin and dired listed it, although incorrectly.
 
 I'll take a wild guess and say that the Linuxulator opens a file or
 directory and gets an error, but it doesn't notice the error and
 proceeds blindly along.  Maybe where it chooses between lookups in
 /compat/linux or /.  But that's a wild guess.
 
 --Mark
 

From: "Mark W. Krentel" <krentel@dreamscape.com>
To: freebsd-gnats-submit@FreeBSD.ORG
Cc:  
Subject: Re: kern/19407: Panic running linux binary on ext2fs
Date: Sat, 8 Jul 2000 15:34:35 -0400 (EDT)

 I can repeat the above results with the simple readdir program
 (opendir followed by a loop of readdir).  Under Linux emulation,
 readdir is prematurely returning NULL on non-ufs file systems.
 
 --Mark
 

From: Bruce Evans <bde@zeta.org.au>
To: "Mark W. Krentel" <krentel@dreamscape.com>
Cc: freebsd-gnats-submit@FreeBSD.ORG
Subject: Re: kern/19407: Panic running linux binary on ext2fs
Date: Mon, 24 Jul 2000 13:02:40 +1000 (EST)

 On Thu, 6 Jul 2000, Mark W. Krentel wrote:
 
 >  I've run some more experiments and I've narrowed the problem somewhat.
 >  
 >  Using the Slackware 7 live file system, I tar-copied /cdrom/live/bin
 >  onto ufs and ext2fs partitions.  Then I ran Slackware's ls from ufs,
 >  cdrom and ext2fs and listed directories on ufs, cdrom and ext2fs.
 >  Sometimes it worked ok, sometimes the output of ls was corrupt (too
 >  few files), and the pattern is quite clear.
 >  
 >                          directory listed on 
 >      binary on        ufs       cdrom       ext2fs
 >        ufs            ok       corrupt      corrupt
 >       cdrom           ok       corrupt      corrupt
 >       ext2fs          ok       corrupt      corrupt
 
 I found some of the problems using these hints.  There were 2 serious bugs
 in ext2_readdir(): writing far beyond the end of the cookie buffer, and
 reading a little beyond the end of the directory buffer.
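 
 As a rough check of the cookie overrun against the 3.4 trace in the
 description: the locals in frame 5 show cookies = 0xc08d6e00 and
 ncookies = 56, and the fault is a supervisor write at 0xc08d7000.
 The sketch below works out how far past the end of the buffer that
 is.  It assumes 4-byte cookies (u_long on i386); the function name is
 made up for illustration.
 
```c
#include <stdint.h>

/*
 * Hypothetical helper: number of bytes between the end of a cookie
 * buffer and a faulting address.  Addresses are 32-bit kernel virtual
 * addresses as seen on i386; cookie_size is assumed to be
 * sizeof(u_long) == 4 there.
 */
uint32_t bytes_past_end(uint32_t base, uint32_t ncookies,
    uint32_t cookie_size, uint32_t fault_addr)
{
    uint32_t end = base + ncookies * cookie_size;  /* one past last cookie */

    return (fault_addr - end);
}
```
 
 bytes_past_end(0xc08d6e00, 56, 4, 0xc08d7000) is 288: the buffer ends
 at 0xc08d6ee0 and the faulting store was 72 cookies beyond it, at the
 first address of the next page.  That is consistent with the loop
 writing past the buffer until it reached a page that was not present,
 though the trace alone does not prove it.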
 
 There don't seem to be any problems with the Linuxulator.  It just asks
 ext2_readdir() for cookies.  Then cookie processing is usually fatal.
 Similarly for readdir() on an nfs-mounted ext2fs filesystem.
 
 Overrunning the directory buffer can cause panics and wrong results from
 readdir(3) even for native binaries, but this problem doesn't usually occur
 for native binaries because they use an adequate buffer size (4K).  Linux
 binaries trigger the bug by using a too-small buffer size (512 bytes).
 This size makes Linux's ls (an old (1997) RedHat version) take about 4
 times as much system time as FreeBSD's ls even on ufs filesystems.
 getdirentries(2) claims that the correct size is given by stat(2), but 
 Linux's ls apparently doesn't know this, and in any case the correct
 size is a little larger than the filesystem blocksize for ext2fs, since
 ext2_readdir() expands some directory entries.
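 
 The count adjustment in the patch below can be tried in user space.
 This is a sketch only: adjust_count is a made-up name, and dirblksiz
 stands in for the filesystem block size, which is assumed to be a
 power of two.
 
```c
#include <assert.h>

/*
 * Sketch of the read-size adjustment.  Trim the request so that
 * offset + count ends on a block boundary; if nothing is left, read up
 * to one full block instead of giving up.
 */
long adjust_count(long long offset, long count, long dirblksiz)
{
    /* The mask trick only works for power-of-two block sizes. */
    assert(dirblksiz > 0 && (dirblksiz & (dirblksiz - 1)) == 0);

    count -= (long)((offset + count) & (dirblksiz - 1));
    if (count <= 0)
        count += dirblksiz;
    return (count);
}
```
 
 With a 1024-byte block size, a 512-byte request at offset 0 becomes a
 1024-byte read (adjust_count(0, 512, 1024) == 1024), while a 4096-byte
 request at offset 0 is left alone.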
 
 Try these fixes:
 
 Index: ext2_lookup.c
 ===================================================================
 RCS file: /home/ncvs/src/sys/gnu/ext2fs/ext2_lookup.c,v
 retrieving revision 1.24
 diff -c -2 -r1.24 ext2_lookup.c
 *** ext2_lookup.c	2000/05/05 09:57:57	1.24
 --- ext2_lookup.c	2000/07/24 02:09:03
 ***************
 *** 153,166 ****
   	struct iovec aiov;
   	caddr_t dirbuf;
   	int readcnt;
 ! 	u_quad_t startoffset = uio->uio_offset;
   
 !         count = uio->uio_resid;		/* legyenek boldogok akik akarnak ... */
 !         uio->uio_resid = count;
 !         uio->uio_iov->iov_len = count;
 ! 
 ! #if 0
 ! printf("ext2_readdir called uio->uio_offset %d uio->uio_resid %d count %d \n", 
 ! 	(int)uio->uio_offset, (int)uio->uio_resid, (int)count);
   #endif
   
 --- 152,175 ----
   	struct iovec aiov;
   	caddr_t dirbuf;
 + 	int DIRBLKSIZ = VTOI(ap->a_vp)->i_e2fs->s_blocksize;
   	int readcnt;
 ! 	off_t startoffset = uio->uio_offset;
   
 ! 	count = uio->uio_resid;
 ! 	/*
 ! 	 * Avoid complications for partial directory entries by adjusting
 ! 	 * the i/o to end at a block boundary.  Don't give up (like ufs
 ! 	 * does) if the initial adjustment gives a negative count, since
 ! 	 * many callers don't supply a large enough buffer.  The correct
 ! 	 * size is a little larger than DIRBLKSIZ to allow for expansion
 ! 	 * of directory entries, but some callers just use 512.
 ! 	 */
 ! 	count -= (uio->uio_offset + count) & (DIRBLKSIZ -1);
 ! 	if (count <= 0)
 ! 		count += DIRBLKSIZ;
 ! 
 ! #ifdef EXT2FS_DEBUG
 ! 	printf("ext2_readdir: uio_offset = %lld, uio_resid = %d, count = %d\n", 
 ! 	    uio->uio_offset, uio->uio_resid, count);
   #endif
   
 ***************
 *** 168,171 ****
 --- 177,181 ----
   	auio.uio_iov = &aiov;
   	auio.uio_iovcnt = 1;
 + 	auio.uio_resid = count;
   	auio.uio_segflg = UIO_SYSSPACE;
   	aiov.iov_len = count;
 ***************
 *** 226,231 ****
   
   		if (!error && ap->a_ncookies != NULL) {
 ! 			u_long *cookies;
 ! 			u_long *cookiep;
   			off_t off;
   
 --- 236,240 ----
   
   		if (!error && ap->a_ncookies != NULL) {
 ! 			u_long *cookiep, *cookies, *ecookies;
   			off_t off;
   
 ***************
 *** 235,240 ****
   			       M_WAITOK);
   			off = startoffset;
 ! 			for (dp = (struct ext2_dir_entry_2 *)dirbuf, cookiep = cookies;
 ! 			     dp < edp;
   			     dp = (struct ext2_dir_entry_2 *)((caddr_t) dp + dp->rec_len)) {
   				off += dp->rec_len;
 --- 244,250 ----
   			       M_WAITOK);
   			off = startoffset;
 ! 			for (dp = (struct ext2_dir_entry_2 *)dirbuf,
 ! 			     cookiep = cookies, ecookies = cookies + ncookies;
 ! 			     cookiep < ecookies;
   			     dp = (struct ext2_dir_entry_2 *)((caddr_t) dp + dp->rec_len)) {
   				off += dp->rec_len;
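 
 The last hunk bounds the cookie loop by the number of cookies
 allocated rather than by the end of the directory data.  A user-space
 sketch of that idea follows.  The record layout and all names here
 are simplified stand-ins, not the kernel structures, and unlike the
 patch the sketch also stops at the end of the data.
 
```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for the fixed header of a directory record. */
struct rec_hdr {
    uint32_t inode;
    uint16_t rec_len;   /* total length of this record in bytes */
    uint8_t  name_len;
    uint8_t  file_type;
};

/*
 * Store the offset following each record into cookies[], writing at
 * most ncookies entries.  Returns the number of cookies written.
 */
size_t fill_cookies(const unsigned char *buf, size_t buflen, long start,
    unsigned long *cookies, size_t ncookies)
{
    size_t pos = 0, n = 0;
    long off = start;

    while (n < ncookies && buflen - pos >= sizeof(struct rec_hdr)) {
        struct rec_hdr h;

        memcpy(&h, buf + pos, sizeof(h));
        /* Reject records that are too short or run past the data. */
        if (h.rec_len < sizeof(h) || h.rec_len > buflen - pos)
            break;
        off += h.rec_len;
        cookies[n++] = (unsigned long)off;
        pos += h.rec_len;
    }
    return (n);
}

/* Build nrec records of 12 bytes each; returns the number of bytes used. */
size_t make_records(unsigned char *buf, size_t nrec)
{
    size_t i;

    for (i = 0; i < nrec; i++) {
        struct rec_hdr h = { (uint32_t)(i + 1), 12, 1, 0 };

        memset(buf + i * 12, 0, 12);
        memcpy(buf + i * 12, &h, sizeof(h));
    }
    return (nrec * 12);
}
```
 
 With four 12-byte records and room for only two cookies, fill_cookies
 stops after two entries and leaves the rest of the array untouched.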
 
 Bruce
 
 

From: "Mark W. Krentel" <krentel@dreamscape.com>
To: bde@zeta.org.au
Cc: freebsd-gnats-submit@FreeBSD.ORG
Subject: Re: kern/19407: Panic running linux binary on ext2fs
Date: Tue, 25 Jul 2000 02:32:04 -0400 (EDT)

 > I found some of the problems using these hints.  There were 2 serious bugs
 > in ext2_readdir(): writing far beyond the end of the cookie buffer, and
 > reading a little beyond the end of the directory buffer.
 
 Thanks for looking at the PR!  I tried the patch, but unfortunately
 it didn't make any difference.
 
 Are you able to reproduce the bug?  I can produce it with just the
 simple readdir program (see below).  Readdir prematurely returns NULL
 on both ext2fs and cdrom partitions and thus lists too few files.
 That is, I can produce the bug without even using an ext2fs partition.
 
 > Overrunning the directory buffer can cause panics and wrong results from
 > readdir(3) even for native binaries, but this problem doesn't usually occur
 > for native binaries because they use an adequate buffer size (4K).  Linux
 > binaries trigger the bug by using a too-small buffer size (512 bytes).
 
 What buffers?  Are they something a user program has control over, or
 are they buried within library routines?
 
 I tried bypassing readdir by using open and read on the directory.  I
 wrote a simple hex dump program and compiled it in RH 6.1.  But Linux
 wouldn't run it; read on a directory returned EISDIR (Is a directory).
 Ironically, the Linuxulator did run the program, and read returned the
 entire directory.  So, I guess that narrows the problem to something
 in the readdir library between the levels of read and readdir.
 
 When 4.1 is released, I plan to cvsup to 4.1-R and redo these tests
 more thoroughly.  Maybe your patch is enough to prevent the panic, and
 maybe the readdir problem is a separate bug.  I'll let you know.
 
 --Mark
 
 ----------
 
 /*
  * List directory contents with opendir and readdir.
  * Basically the same as "ls -1af".
  */
 
 #include <sys/types.h>
 #include <dirent.h>
 #include <stdio.h>
 #include <stdlib.h>	/* for exit() */
 
 void my_err(char *mesg)
 {
   printf("Error: %s\n", mesg);
   exit(1);
 }
 
 int main(int argc, char **argv)
 {
   DIR  *dp;
   struct dirent  *de;
   int   n;
 
   if ( argc < 2 ) my_err("missing directory");
 
   if ( (dp = opendir(argv[1])) == NULL )
     my_err("unable to open directory");
 
   n = 0;
   while ( (de = readdir(dp)) != NULL )
     {
       printf("%s\n", de->d_name);
       n++;
     }
 
   printf("Total: %d files\n", n);
 
   closedir(dp);
 
   return 0;
 }
 

From: Bruce Evans <bde@zeta.org.au>
To: "Mark W. Krentel" <krentel@dreamscape.com>
Cc: freebsd-gnats-submit@FreeBSD.ORG
Subject: Re: kern/19407: Panic running linux binary on ext2fs
Date: Thu, 27 Jul 2000 13:09:57 +1000 (EST)

 On Tue, 25 Jul 2000, Mark W. Krentel wrote:
 
 > > I found some of the problems using these hints.  There were 2 serious bugs
 > > in ext2_readdir(): writing far beyond the end of the cookie buffer, and
 > > reading a little beyond the end of the directory buffer.
 > 
 > Thanks for looking at the PR!  I tried the patch, but unfortunately
 > it didn't make any difference.
 > 
 > Are you able to reproduce the bug?  I can produce it with just the
 
 Only the panic.
 
 > simple readdir program (see below).  Readdir prematurely returns NULL
 > on both ext2fs and cdrom partitions and thus lists too few files.
 > That is, I can produce the bug without even using an ext2fs partition.
 
 I didn't try the program, but linux-ls -R works right on a linux partition
 and on a cdrom here.
 
 > > Overrunning the directory buffer can cause panics and wrong results from
 > > readdir(3) even for native binaries, but this problem doesn't usually occur
 > > for native binaries because they use an adequate buffer size (4K).  Linux
 > > binaries trigger the bug by using a too-small buffer size (512 bytes).
 > 
 > What buffers?  Are they something a user program has control over, or
 > are they buried within library routines?
 
 Mostly user buffers in readdir(3), but the Linuxulator and nfs use too-small
 buffers or a too-small rounding up in some cases.
 
 > I tried bypassing readdir by using open and read on the directory.  I
 > wrote a simple hex dump program and compiled it in RH 6.1.  But Linux
 > wouldn't run it; read on a directory returned EISDIR (Is a directory).
 > Ironically, the Linuxulator did run the program, and read returned the
 > entire directory.  So, I guess that narrows the problem to something
 > in the readdir library between the levels of read and readdir.
 
 readdir(3) doesn't use read(2) under either FreeBSD or Linux.  It
 can't, because not all file systems have read(2)'able directories
 (under Linux, no file systems have read(2)'able directories).  Under
 FreeBSD, readdir(3) is a simple wrapper around getdirentries(2), and
 the bug is probably in the latter.
 
 
 Bruce
 
 

From: "Mark W. Krentel" <krentel@dreamscape.com>
To: bde@zeta.org.au
Cc: freebsd-gnats-submit@FreeBSD.ORG
Subject: Re: kern/19407: Panic running linux binary on ext2fs
Date: Mon, 7 Aug 2000 15:12:55 -0400 (EDT)

 Ok, I've cvsup'd to 4.1-R, applied the patch, rebuilt world and kernel,
 and done more tests.  I beat on it with "ls -lR" and "find | xargs ls",
 I ran emacs, xv, xsnow, xboard, all from the ext2 partition, and I've
 been unable to induce a panic.  Then, I removed the patch, reran the
 tests and got a panic almost immediately.  Finally, I put the patch
 back in, beat on it some more and no panic.  So, I'm satisfied that
 you've identified the cause of the panic and that your patch fixes it.
 Good job!
 
 And remember that I'm running 4.1, so I have rev 1.21 of ext2_lookup.c
 (your patch was for rev 1.24).  From looking at the RCS diffs, I don't
 think it's a problem, but you would know better than me.
 
 But I'm still seeing the problem where ls or readdir returns too few
 files.  So this must be a separate problem.  And it happens on both
 ext2 and cdrom partitions, so maybe it's in the Linuxulator.  I'll try
 looking at the source code.  I guess it's the Linux readdir(3) library
 calling getdents(2) in the Linuxulator, is that right?
 
 > I didn't try the program, but linux-ls -R works right on a linux partition
 > and on a cdrom here.
 
 Again, I can't figure out what I'm doing differently.  Do you have a
 machine with a local ext2 partition?  What version of Linux do you
 have?  You're running -current with the linux_base-6.1 port?
 
 I searched the open PRs and found a few more involving panics on ext2
 partitions.  One stands out as being very similar.
 
   PR i386/15074 -- Two different panics when running Linux binaries on Athlon
 
 Three more may be related, or maybe not.
 
   PR kern/10581 -- Kernel panic while using find on an ext2 filesystem.
   PR kern/10594 -- EXT2FS mount problems
   PR gnu/15892  -- NFS-exported ext2 file system makes Linux crash
 
 P.S. What does "legyenek boldogok akik akarnak" mean?  It didn't rot13
 to anything meaningful.
 
 --Mark
 

From: Sheldon Hearn <sheldonh@uunet.co.za>
To: bde@FreeBSD.org
Cc: freebsd-gnats-submit@FreeBSD.org
Subject: Re: kern/19407: Panic running linux binary on ext2fs 
Date: Tue, 08 Aug 2000 11:47:57 +0200

 Hi Bruce,
 
 Given the feedback and the origin of the patch, can I assign the PR to
 you? :-)
 
 Ciao,
 Sheldon.
 
State-Changed-From-To: open->feedback 
State-Changed-By: bde 
State-Changed-When: Fri Oct 27 03:16:37 PDT 2000 
State-Changed-Why:  
My patch has been applied to -current, RELENG_4 and RELENG_3, but there 
is still a problem, possibly at the Linuxulator level. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=19407 

From: "Mark W. Krentel" <krentel@dreamscape.com>
To: marcel@cup.hp.com, bde@zeta.org.au
Cc: freebsd-gnats-submit@FreeBSD.ORG
Subject: Re: kern/19407: Panic running linux binary on ext2fs
Date: Sun, 12 Nov 2000 18:33:56 -0500 (EST)

 Ok, I've upgraded to 4.2-RC (as of Nov 8) which includes both Bruce's
 patch and Marcel's patch to src/sys/compat/linux/linux_file.c for the
 getdents problem.  I've tried the Linux ls, readdir and dired (emacs)
 on ext2 and cdrom partitions and they all work.  I also get identical
 results with the Linux and FreeBSD ls -R run over large hierarchies.
 So, I'm satisfied that Marcel's patch fixes the remaining problem and
 that this PR should now be closed.  Good job!
 
 I think these patches may also fix PR i386/15074 and PR gnu/15892, if
 someone wants to take another look at them.
 
 --Mark
 
State-Changed-From-To: feedback->closed 
State-Changed-By: marcel 
State-Changed-When: Sun Nov 12 20:46:31 PST 2000 
State-Changed-Why:  
This PR described two problems. Both have been resolved. 
Thanks to Mark for his patience and contribution! 


>Unformatted:
