From Stefan.Esser@otelo.de  Mon Dec 10 11:37:36 2001
Return-Path: <Stefan.Esser@otelo.de>
Received: from mail.otelo.de (mail.otelo.de [195.50.151.2])
	by hub.freebsd.org (Postfix) with ESMTP
	id C83BF37B416; Mon, 10 Dec 2001 11:37:34 -0800 (PST)
Message-Id: <20011210193720.DF0CB1510@c4007380.vebacom.de>
Date: Mon, 10 Dec 2001 20:37:20 +0100 (CET)
From: Stefan Esser <se@freebsd.org>
Reply-To: Stefan.Esser@Arcor.Net
To: FreeBSD-gnats-submit@freebsd.org
Cc: <se@freebsd.org>, stefan.esser@o-tel-o.de
Subject: Reproducable PANIC in -stable and .current 

>Number:         32681
>Category:       kern
>Synopsis:       Reproducable PANIC in -stable and .current
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    des
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Mon Dec 10 11:40:01 PST 2001
>Closed-Date:    Thu Jan 3 02:04:51 PST 2002
>Last-Modified:  Thu Jan 03 02:06:15 PST 2002
>Originator:     Stefan Esser
>Release:        FreeBSD 4.4-STABLE i386 / FreeBSD 5.0-CURRENT i386
>Organization:
o.tel.o GmbH
>Environment:
System: FreeBSD current-host 5.0-CURRENT FreeBSD 5.0-CURRENT #1: Mon Dec 10 18:00:17 CET 2001     se@current-host:/usr/src/sys/i386/compile/TYAN  i386
	FreeBSD stable-host 4.4-STABLE FreeBSD 4.4-STABLE #6: Wed Dec  5 14:29:02 CET 2001     se@stable-host:/usr/src/sys/compile/TYAN  i386

>Description:
I've been suffering from panics during a security scan of a large company LAN. 
With nessus and 20 simultanous addresses being scanned, I get panics within 
minutes, which forced me to lower the parallelism to 5 addresses at a time (the 
whole network will be finished in about 4 years, at the current pace ;-)

Since the operational system doing these scans is two hours of travel away, I 
decided to check, whether the same problem exists in -current on identical 
hardware at my location: it's there!

Since I don't have access to the console of the remote -stable box, I did all 
my testing on a similar box running -current. 
The symptoms are identical: Trap 12 within minutes of running the latest nessus-devel (set to test 10 addresses in parallel).

I've tried to identify the location of the fstatfs() call in nessusd, but did not
find a single occurence.


The source of problem appears to be a value of NULL resulting from:

	 mp = ((struct vnode *)fp->f_data)->v_mount;

which is dereferenced in the next progam line ...


System log of the -current system (no SCSI drives, additional i4B, else identical 
to the -stable system):

Copyright (c) 1992-2001 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD 5.0-CURRENT #1: Mon Dec 10 18:00:17 CET 2001
    se@server:/usr/src/sys/i386/compile/TYAN
Preloaded elf kernel "/boot/kernel/kernel" at 0xc049d000.
Timecounter "i8254"  frequency 1193182 Hz
CPU: Pentium II/Pentium II Xeon/Celeron (400.91-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0x652  Stepping = 2
  Features=0x183fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR>
real memory  = 134217728 (131072K bytes)
avail memory = 125427712 (122488K bytes)
Programming 24 pins in IOAPIC #0
IOAPIC #0 intpin 2 -> irq 0
IOAPIC #0 intpin 16 -> irq 10
IOAPIC #0 intpin 18 -> irq 9
IOAPIC #0 intpin 19 -> irq 11
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
 cpu0 (BSP): apic id:  0, version: 0x00040011, at 0xfee00000
 cpu1 (AP):  apic id:  1, version: 0x00040011, at 0xfee00000
 io0 (APIC): apic id:  2, version: 0x00170011, at 0xfec00000
Pentium Pro MTRR support enabled
npx0: <math processor> on motherboard
npx0: INT 16 interface
pcib0: <Intel 82443GX host to PCI bridge> at pcibus 0 on motherboard
pci0: <PCI bus> on pcib0
agp0: <Intel 82443GX host to PCI bridge> mem 0xf4000000-0xf7ffffff at device 0.0 on pci0
pcib1: <PCI-PCI bridge> at device 1.0 on pci0
pci1: <PCI bus> on pcib1
pci1: <display, VGA> at device 0.0 (no driver attached)
isab0: <PCI-ISA bridge> at device 7.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel PIIX4 ATA33 controller> port 0xffa0-0xffaf at device 7.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata1: at 0x170 irq 15 on atapci0
pci0: <serial bus, USB> at device 7.2 (no driver attached)
intpm0: <Intel 82371AB Power management controller> port 0x440-0x44f irq 9 at device 7.3 on pci0
intpm0: I/O mapped 440
intpm0: intr IRQ 9 enabled revision 0
smbus0: <System Management Bus> on intsmb0
smb0: <SMBus general purpose I/O> on smbus0
intpm0: PM I/O mapped 400
pcib2: <PCI-PCI bridge> at device 16.0 on pci0
pci2: <PCI bus> on pcib2
fxp0: <Intel Pro 10/100B/100+ Ethernet> port 0xdf80-0xdf9f mem 0xfe600000-0xfe6fffff,0xf37ff000-0xf37fffff irq 9 at device 6.0 on pci2
fxp0: Ethernet address 00:90:27:14:c2:0d
inphy0: <i82555 10/100 media interface> on miibus0
inphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
isic0: <ELSA MicroLink ISDN/PCI> port 0xdff4-0xdff7,0xdc00-0xdc7f mem 0xfe4fff80-0xfe4
fffff irq 11 at device 7.0 on pci2
isic0: passive stack unit 0
fxp1: <Intel Pro 10/100B/100+ Ethernet> port 0xef40-0xef5f mem 0xfea00000-0xfeafffff,0xfb8ff000-0xfb8fffff irq 11 at device 17.0 on pci0
fxp1: Ethernet address 00:e0:81:10:27:2a
inphy1: <i82555 10/100 media interface> on miibus1
inphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
ahc0: <Adaptec aic7895 Ultra SCSI adapter> port 0xe400-0xe4ff mem 0xfebfe000-0xfebfefff irq 10 at device 18.0 on pci0
aic7895C: Ultra Wide Channel A, SCSI Id=7, 32/255 SCBs
ahc1: <Adaptec aic7895 Ultra SCSI adapter> port 0xe800-0xe8ff mem 0xfebff000-0xfebfffff irq 10 at device 18.1 on pci0
aic7895C: Ultra Wide Channel B, SCSI Id=7, 32/255 SCBs
ata-: ata0 already exists, skipping it
ata-: ata1 already exists, skipping it
sc-: sc0 already exists, skipping it
vga-: vga0 already exists, skipping it
orm0: <Option ROMs> at iomem 0xc0000-0xc7fff,0xc8000-0xc87ff on isa0
atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
atkbd0: <AT Keyboard> flags 0x1 irq 1 on atkbdc0
psm0: <PS/2 Mouse> irq 12 on atkbdc0
psm0: model Generic PS/2 mouse, device ID 0
fdc0: <NEC 72065B or clone> at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
fdc0: FIFO enabled, 8 bytes threshold
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
ppc0: <Parallel port> at port 0x378-0x37f irq 7 on isa0
ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/8 bytes threshold
lpt0: <Printer> on ppbus0
lpt0: Interrupt-driven port
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A
sio1 at port 0x2f8-0x2ff irq 3 on isa0
sio1: type 16550A
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
sbc0: <Creative ViBRA16X> at port 0x220-0x22f,0x330-0x331,0x388-0x38b irq 5 drq 1,3 on isa0
pcm0: <SB16 DSP 4.16 (ViBRA16X)> on sbc0
unknown: <PNP0303> can't assign resources
unknown: <PNP0c02> can't assign resources
unknown: <PNP0501> can't assign resources
unknown: <PNP0501> can't assign resources
unknown: <PNP0401> can't assign resources
unknown: <PNP0700> can't assign resources
unknown: <PNP0f13> can't assign resources
APIC_IO: Testing 8254 interrupt delivery
APIC_IO: routing 8254 via IOAPIC #0 intpin 2
i4brbch: 4 raw B channel access device(s) attached
i4bipr: 4 IP over raw HDLC ISDN device(s) attached (VJ header compression)
i4bctl: ISDN system control port attached
IPv6 packet filtering initialized, logging limited to 100 packets/entry
i4btrc: 4 ISDN trace device(s) attached
DUMMYNET initialized (011031)
i4b: ISDN call control device attached
IP packet filtering initialized, divert disabled, rule-based forwarding enabled, default to deny, logging limited to 100 packets/entry by default
i4btel: 2 ISDN telephony interface device(s) attached
IPsec: Initialized Security Association Processing.
ad0: 12970MB <Maxtor 91360D8> [26353/16/63] at ata0-master UDMA33
Waiting 2 seconds for SCSI devices to settle
Mounting root from ufs:/dev/ad0s1a
SMP: AP CPU #1 Launched!


IdlePTD 4964352
initial pcb at 3c7920
panicstr: from debugger
panic messages:
---
Fatal trap 12: page fault while in kernel mode
cpuid = 1; lapic.id = 01000000
fault virtual address   = 0x8
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc0226c3f
stack pointer           = 0x10:0xd25b9be4
frame pointer           = 0x10:0xd25b9cf4
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 369 (nessusd)
kernel trap 12 with interrupts disabled


Fatal trap 12: page fault while in kernel mode
cpuid = 1; lapic.id = 01000000
fault virtual address   = 0xc0226c3
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc02e92f0
stack pointer           = 0x10:0xd25b99bc
frame pointer           = 0x10:0xd25b99c0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = resume, IOPL = 0
current process         = 369 (nessusd)


Fatal trap 12: page fault while in kernel mode
cpuid = 1; lapic.id = 01000000
fault virtual address   = 0x5708438b
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc02e92f0
stack pointer           = 0x10:0xd25b99bc
frame pointer           = 0x10:0xd25b99c0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 369 (nessusd)

(kgdb) where
#0  dumpsys () at ../../../kern/kern_shutdown.c:492
#1  0xc01e8b7b in boot (howto=260) at ../../../kern/kern_shutdown.c:335
#2  0xc01e901d in panic (fmt=0xc031dfaa "from debugger")
    at ../../../kern/kern_shutdown.c:634
#3  0xc014679d in db_panic (addr=-1071485889, have_addr=0, count=-1,
    modif=0xd25b9a58 "") at ../../../ddb/db_command.c:452
#4  0xc014673b in db_command (last_cmdp=0xc036f798, cmd_table=0xc036f5b8,
    aux_cmd_tablep=0xc0365e6c, aux_cmd_tablep_end=0xc0365e70)
    at ../../../ddb/db_command.c:348
#5  0xc0146807 in db_command_loop () at ../../../ddb/db_command.c:474
#6  0xc0148ba3 in db_trap (type=12, code=0) at ../../../ddb/db_trap.c:72
#7  0xc02e91b2 in kdb_trap (type=12, code=0, regs=0xd25b9ba4)
    at ../../../i386/i386/db_interface.c:167
#8  0xc02fb6ec in trap_fatal (frame=0xd25b9ba4, eva=8)
    at ../../../i386/i386/trap.c:945
#9  0xc02fb435 in trap_pfault (frame=0xd25b9ba4, usermode=0, eva=8)
    at ../../../i386/i386/trap.c:864
#10 0xc02fae43 in trap (frame={tf_fs = 24, tf_es = 16, tf_ds = 16,
      tf_edi = -767445244, tf_esi = 88, tf_ebp = -765747980, tf_isp = -765748272,
      tf_ebx = 0, tf_edx = -1005341568, tf_ecx = 4, tf_eax = -764857728,
      tf_trapno = 12, tf_err = 0, tf_eip = -1071485889, tf_cs = 8,
      tf_eflags = 66118, tf_esp = -767445504, tf_ss = 2})
    at ../../../i386/i386/trap.c:416
#11 0xc0226c3f in fstatfs4 (td=0xd241b704, uap=0xd25b9d20)
    at ../../../kern/vfs_syscalls.c:798
#12 0xc02fbb61 in syscall (frame={tf_fs = 134610991, tf_es = 47,
      tf_ds = -1078001617, tf_edi = 134656288, tf_esi = 136399488,
      tf_ebp = -1077938528, tf_isp = -765747852, tf_ebx = 672379112, tf_edx = 13,
      tf_ecx = 672455364, tf_eax = 158, tf_trapno = 0, tf_err = 2,
      tf_eip = 671985175, tf_cs = 31, tf_eflags = 663, tf_esp = -1077938956,
      tf_ss = 47}) at ../../../i386/i386/trap.c:1140


/sys/kern/vfs_syscalls:
	        if ((error = getvnode(td->td_proc->p_fd, SCARG(uap, fd), &fp)) != 0)
	                return (error);
	        mp = ((struct vnode *)fp->f_data)->v_mount;
798:	        sp = &mp->mnt_stat;


0xc0226c24 <fstatfs+32>:        call   0xc022a864 <getvnode>
0xc0226c29 <fstatfs+37>:        add    $0xc,%esp
0xc0226c2c <fstatfs+40>:        test   %eax,%eax
0xc0226c2e <fstatfs+42>:        jne    0xc0226cae <fstatfs+170>
0xc0226c30 <fstatfs+44>:        mov    0xfffffefc(%ebp),%eax
0xc0226c36 <fstatfs+50>:        mov    0x30(%eax),%eax
0xc0226c39 <fstatfs+53>:        mov    0x14(%eax),%ebx
0xc0226c3c <fstatfs+56>:        lea    0x58(%ebx),%esi
0xc0226c3f <fstatfs+59>:        mov    0x8(%ebx),%eax		<== TRAP 12
0xc0226c42 <fstatfs+62>:        push   %edi
0xc0226c43 <fstatfs+63>:        push   %esi
0xc0226c44 <fstatfs+64>:        push   %ebx
0xc0226c45 <fstatfs+65>:        mov    0x14(%eax),%eax
0xc0226c48 <fstatfs+68>:        call   *%eax


(kgdb) info reg
eax            0x0      0
ecx            0x0      0
edx            0x0      0
ebx            0x0      0
esp            0xd25b99d8       0xd25b99d8
ebp            0xd25b9cf4       0xd25b9cf4
esi            0x58     88
edi            0xd241b704       -767445244
eip            0xc02ea278       0xc02ea278
eflags         0x0      0
cs             0x0      0
ss             0x0      0
ds             0x0      0
es             0x0      0
fs             0x0      0
gs             0x2f     47


>How-To-Repeat:
	
Run nessus configured to scan 20 hosts at a time. Allow a few minutes for
delivery of a trap 12 ;-)

>Fix:
No idea. Locking issue (vnode information becomes stale while being accessed) ???
	
>Release-Note:
>Audit-Trail:

From: Stefan.Esser@o-tel-o.de
To: freebsd-gnats-submit@FreeBSD.org
Cc: Stefan.Esser@Arcor.Net, se@freebsd.org
Subject: Re: kern/32681: Reproducable PANIC in -stable
Date: Tue, 11 Dec 2001 16:00:45 +0100

 Further debugging reveals, that the panic is caused by PROCFS (on my -c=
 urrent
 box running either an non-SMP or SMP kernel).
 
 In order to check for the presence of some process %PID%, nessusd perfo=
 rms an
 opendir("/proc/%PID%"), which can lead to a panic in fstafs(), if the
 corresponding process just finishes just before the call to fstatfs ...=
 
 
  /*
   * First method : attempt to open /proc/<pid>
   * (we first check that we can open /proc/ourpid because
   * we may run on some weird system)
   */
  sprintf(procname, "/proc/%d", getpid());
  dir =3D opendir(procname);
  if(dir)                /* The /proc trick worked for us */
  {
  closedir(dir);
  sprintf(procname, "/proc/%d", pid);
  dir =3D opendir(procname);
  if(dir){
         closedir(dir);
         return 1;
         }
  else return 0;
  }
 
 Truss output caused by opendir("/proc/%PID%"). Seems that either the ar=
 gument
 printing does not work for the specific system calls (or there is some =
 other
 problem), but the opendir call had been issued with a valid directory n=
 ame (see
 debugging output below):
 
 stat("D=F2=BF=BFqyp(=A8=ACr(=FFt",0xbfbff004)                =3D 0 (0x0=
 )
 open("qyp(=A8=ACr(=FFt",4,027757771104)                =3D 13 (0xd)
 fstat(13,0xbfbff004)                             =3D 0 (0x0)
 fcntl(0xd,0x2,0x1)                               =3D 0 (0x0)
 fstatfs(0xd,0xbfbfef04)                          =3D 0 (0x0)
 
 Debugging output:
 
 905      sprintf(procname, "/proc/%d", getpid());
 (gdb)
 getpid()                                 =3D 724 (0x2d4)
 906      dir =3D opendir(procname);
 (gdb)
 907      if(dir)        /* The /proc trick worked for us */
 (gdb) p procname
 $7 =3D "/proc/724\000"
 (gdb) s
 909      closedir(dir);
 (gdb)
 close(13)                                =3D 0 (0x0)
 910      sprintf(procname, "/proc/%d", pid);
 =
 
 
Responsible-Changed-From-To: freebsd-bugs->des 
Responsible-Changed-By: sheldonh 
Responsible-Changed-When: Tue Dec 11 07:20:30 PST 2001 
Responsible-Changed-Why:  
Fortunately, we have a procfs maintainer. :-) 

http://www.FreeBSD.org/cgi/query-pr.cgi?pr=32681 

From: Stefan.Esser@o-tel-o.de
To: freebsd-gnats-submit@FreeBSD.org
Cc: se@freebsd.org
Subject: Re: kern/32681: Reproducable PANIC in -stable and .current
Date: Tue, 11 Dec 2001 16:35:33 +0100

 The following patch works for me on -current:
 
 Index: vfs_syscalls.c
 ===================================================================
 RCS file: /usr/cvs/src/sys/kern/vfs_syscalls.c,v
 retrieving revision 1.215
 diff -u -3 -r1.215 vfs_syscalls.c
 --- vfs_syscalls.c  5 Nov 2001 10:33:45 -0000      1.215
 +++ vfs_syscalls.c  11 Dec 2001 15:07:18 -0000
 @@ -795,6 +795,8 @@
      if ((error = getvnode(td->td_proc->p_fd, SCARG(uap, fd), &fp)) != 0)
           return (error);
      mp = ((struct vnode *)fp->f_data)->v_mount;
 +    if (mp == NULL)
 +         return (ENOENT);
      sp = &mp->mnt_stat;
      error = VFS_STATFS(mp, sp, td);
      if (error)
 
 

From: Dag-Erling Smorgrav <des@ofug.org>
To: Stefan.Esser@o-tel-o.de
Cc: freebsd-gnats-submit@FreeBSD.org, fs@freebsd.org
Subject: Re: kern/32681: Reproducable PANIC in -stable
Date: 11 Dec 2001 18:30:30 +0100

 Stefan.Esser@o-tel-o.de writes:
 > In order to check for the presence of some process %PID%, nessusd performs an
 > opendir("/proc/%PID%"), which can lead to a panic in fstafs(), if the
 > corresponding process just finishes just before the call to fstatfs ...
 
 Ah, OK, I see.  First, if nessus needs to check for the existence of a
 process with a given PID, it should kill(pid, 0) instead of relying on
 procfs.  Second, the problem seems to be insufficient locking -
 possibly in pseudofs, but equally possibly in the VFS system.  When a
 process terminates, pseudofs automatically reclaims all vnodes
 associated with it, which leads to an obvious race condition which can
 only be avoided through proper locking.
 
 Looking at the pseudofs code, it doesn't seem to be at fault.  When
 the process exits, pfs_exit() removes all vnodes associated with it
 from pseudofs' vnode cache, and vgone()s them.
 
 I'm starting to think that the race might be in namei(), actually -
 since the namei() call from fstatfs() succeeds, it means namei() found
 a vnode, but that vnode's v_mount is NULL, which means it's been
 reclaimed, which can't happen if it's referenced (can it?), and
 namei() is supposed to reference the vnode before it returns it.  My
 bet is that there is a small race between namei() finding a vnode and
 referencing it, but I'm getting out of my depth.  Does anybody else
 have any ideas?
 
 DES
 -- 
 Dag-Erling Smorgrav - des@ofug.org
State-Changed-From-To: open->closed 
State-Changed-By: se 
State-Changed-When: Thu Jan 3 02:04:51 PST 2002 
State-Changed-Why:  
Fix provided in PR has been commited (with EBADF instead of ENOENT). 

http://www.FreeBSD.org/cgi/query-pr.cgi?pr=32681 
>Unformatted:
