From nobody@FreeBSD.ORG  Sat Sep  9 07:00:39 2000
Return-Path: <nobody@FreeBSD.ORG>
Received: by hub.freebsd.org (Postfix, from userid 32767)
	id 91D4337B424; Sat,  9 Sep 2000 07:00:39 -0700 (PDT)
Message-Id: <20000909140039.91D4337B424@hub.freebsd.org>
Date: Sat,  9 Sep 2000 07:00:39 -0700 (PDT)
From: dl@leo.org
Sender: nobody@FreeBSD.ORG
To: freebsd-gnats-submit@FreeBSD.org
Subject: multiple crashes while using vinum
X-Send-Pr-Version: www-1.0

>Number:         21148
>Category:       kern
>Synopsis:       multiple crashes while using vinum
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    grog
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sat Sep 09 07:10:00 PDT 2000
>Closed-Date:    Mon Jan 1 14:35:55 PST 2001
>Last-Modified:  Wed Jan  3 20:10:02 PST 2001
>Originator:     Daniel Lang
>Release:        4.1-STABLE
>Organization:
TU Muenchen
>Environment:
FreeBSD atleo4.leo.org 4.1-STABLE FreeBSD 4.1-STABLE #0: Fri Sep  8 10:24:40 CEST 2000     root@atleo2.leo.org:/usr/obj/usr/src/sys/ATLEO4  i386

>Description:
The machine crashed repeatedly after a vinum raid5 was set up
and used heavily.

Hardware: Dell Poweredge 6100/200 4xPPro SMP machine, with 3
Adaptec SCSI controllers and one Promise Fasttrack ATA100 IDE
controller... see dmesg:

dmesg output:
Copyright (c) 1992-2000 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD 4.1-STABLE #0: Fri Sep  8 10:24:40 CEST 2000
    root@atleo2.leo.org:/usr/obj/usr/src/sys/ATLEO4
Timecounter "i8254"  frequency 1193182 Hz
CPU: Pentium Pro (198.95-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0x619  Stepping = 9
  Features=0xfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV>
real memory  = 536870912 (524288K bytes)
avail memory = 518316032 (506168K bytes)
Programming 16 pins in IOAPIC #0
IOAPIC #0 intpin 2 -> irq 0
FreeBSD/SMP: Multiprocessor motherboard
 cpu0 (BSP): apic id:  0, version: 0x00040011, at 0xfec08000
 cpu1 (AP):  apic id:  4, version: 0x00040011, at 0xfec08000
 cpu2 (AP):  apic id:  1, version: 0x00040011, at 0xfec08000
 cpu3 (AP):  apic id:  2, version: 0x00040011, at 0xfec08000
 io0 (APIC): apic id: 14, version: 0x000f0011, at 0xfec00000
Preloaded elf kernel "kernel" at 0xc0401000.
Pentium Pro MTRR support enabled
md0: Malloc disk
npx0: <math processor> on motherboard
npx0: INT 16 interface
pcib0: <Intel 82454KX/GX (Orion) host to PCI bridge> on motherboard
pci0: <PCI bus> on pcib0
fxp0: <Intel Pro 10/100B/100+ Ethernet> port 0xff80-0xff9f mem 0xfe900000-0xfe9fffff,0xfe2ff000-0xfe2fffff irq 10 at device 11.0 on pci0
fxp0: Ethernet address 00:a0:c9:99:47:2c
ahc0: <Adaptec 2940 Ultra SCSI adapter> port 0xfc00-0xfcff mem 0xfeaff000-0xfeafffff irq 11 at device 12.0 on pci0
ahc0: aic7880 Wide Channel A, SCSI Id=7, 16/255 SCBs
isab0: <Intel 82375EB PCI-EISA bridge> at device 14.0 on pci0
eisa0: <EISA bus> on isab0
mainboard0: <INT31c0 (System Board)> on eisa0 slot 0
isa0: <ISA bus> on isab0
chip0: <> mem 0xfffffc00-0xffffffff,0xfffffc00-0xffffffff,0xfffffc00-0xffffffff,0xfffffc00-0xffffffff,0xfffffc00-0xffffffff,0xfec01000-0xfec013ff at device 15.0 on pci0
chip1: <Intel 82453KX/GX (Orion) PCI memory controller> at device 20.0 on pci0
pcib1: <Intel 82454KX/GX (Orion) host to PCI bridge> on motherboard
pci1: <PCI bus> on pcib1
ahc1: <Adaptec aic7880 Ultra SCSI adapter> port 0xec00-0xecff mem 0xfe1ff000-0xfe1fffff irq 5 at device 11.0 on pci1
ahc1: Using left over BIOS settings
ahc1: aic7880 Wide Channel A, SCSI Id=7, 16/255 SCBs
ahc2: <Adaptec aic7880 Ultra SCSI adapter> port 0xe800-0xe8ff mem 0xfe1fe000-0xfe1fefff irq 5 at device 12.0 on pci1
ahc2: aic7880 Wide Channel A, SCSI Id=7, 16/255 SCBs
ahc2: Host Adapter Bios disabled.  Using default SCSI device parameters
atapci0: <Promise ATA100 controller> port 0xe480-0xe4bf,0xe4f0-0xe4f3,0xe4e8-0xe4ef,0xe4f4-0xe4f7,0xe4f8-0xe4ff mem 0xfe1a0000-0xfe1bffff irq 9 at device 13.0 on pci1
ata2: at 0xe4f8 on atapci0
ata3: at 0xe4e8 on atapci0
fdc0: <NEC 72065B or clone> at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
fdc0: FIFO enabled, 8 bytes threshold
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
atkbd0: <AT Keyboard> flags 0x1 irq 1 on atkbdc0
kbd0 at atkbd0
psm0: <PS/2 Mouse> irq 12 on atkbdc0
psm0: model Generic PS/2 mouse, device ID 0
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <12 virtual consoles, flags=0x100>
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A, console
sio1 at port 0x2f8-0x2ff irq 3 on isa0
sio1: type 16550A
ppc0: parallel port not found.
APIC_IO: routing 8254 via IOAPIC #0 intpin 2
IP packet filtering initialized, divert enabled, rule-based forwarding enabled, default to accept, logging limited to 100 packets/entry by default
IPv6 packet filtering initialized, default to accept, logging limited to 100 packets/entry
IPsec: Initialized Security Association Processing.
IP Filter: v3.4.8 initialized.  Default = pass all, Logging = enabled
SMP: AP CPU #1 Launched!
SMP: AP CPU #2 Launched!
SMP: AP CPU #3 Launched!
ad0: 73308MB <IBM-DTLA-307075> [148945/16/63] at ata2-master using UDMA100
ad1: 73308MB <IBM-DTLA-307075> [148945/16/63] at ata2-slave using UDMA100
ad2: 73308MB <IBM-DTLA-307075> [148945/16/63] at ata3-master using UDMA100
ad3: 73308MB <IBM-DTLA-307075> [148945/16/63] at ata3-slave using UDMA100
Waiting 3 seconds for SCSI devices to settle
pt0 at ahc1 bus 0 target 6 lun 0
pt0: <DELL 6UW BACKPLANE 7> Fixed Processor SCSI-2 device 
pt0: 3.300MB/s transfers
sa0 at ahc2 bus 0 target 6 lun 0
sa0: <ARCHIVE Python 29987-XXX 5.AM> Removable Sequential Access SCSI-2 device 
sa0: 4.545MB/s transfers (4.545MHz, offset 15)
ses0 at ahc1 bus 0 target 6 lun 0
ses0: <DELL 6UW BACKPLANE 7> Fixed Processor SCSI-2 device 
ses0: 3.300MB/s transfers
ses0: SAF-TE Compliant Device
da2 at ahc1 bus 0 target 2 lun 0
da2: <SEAGATE ST19171W 2224> Fixed Direct Access SCSI-2 device 
da2: 40.000MB/s transfers (20.000MHz, offset 8, 16bit), Tagged Queueing Enabled
da2: 8683MB (17783112 512 byte sectors: 64H 32S/T 8683C)
da3 at ahc1 bus 0 target 3 lun 0
da3: <SEAGATE ST19171W 2224> Fixed Direct Access SCSI-2 device 
da3: 40.000MB/s transfers (20.000MHz, offset 8, 16bit), Tagged Queueing Enabled
da3: 8683MB (17783112 512 byte sectors: 64H 32S/T 8683C)
da0 at ahc1 bus 0 target 0 lun 0
da0: <SEAGATE ST34572WC 0784> Fixed Direct Access SCSI-2 device 
da0: 40.000MB/s transfers (20.000MHz, offset 8, 16bit), Tagged Queueing Enabled
da0: 4095MB (8388315 512 byte sectors: 64H 32S/T 4095C)
da1 at ahc1 bus 0 target 1 lun 0
da1: <SEAGATE ST34572WC 0784> Fixed Direct Access SCSI-2 device 
da1: 40.000MB/s transfers (20.000MHz, offset 8, 16bit), Tagged Queueing Enabled
da1: 4095MB (8388315 512 byte sectors: 64H 32S/T 4095C)
ch0 at ahc2 bus 0 target 6 lun 1
ch0: <ARCHIVE Python 29987-XXX 5.AM> Removable Changer SCSI-2 device 
ch0: 4.545MB/s transfers (4.545MHz, offset 15)
ch0: 0 slots, 1 drive, 1 picker, 0 portals
Mounting root from ufs:/dev/da0s1a
WARNING: / was not properly dismounted
vinum: loaded
vinum: reading configuration from /dev/ad3s1e
vinum: updating configuration from /dev/ad2s1e
vinum: updating configuration from /dev/ad1s1e
vinum: updating configuration from /dev/ad0s1e
cd0 at ahc2 bus 0 target 5 lun 0
cd0: <NEC CD-ROM DRIVE:464 1.05> Removable CD-ROM SCSI-2 device 
cd0: 20.000MB/s transfers (20.000MHz, offset 15)
cd0: Attempt to query device size failed: NOT READY, Medium not present

Kernel Config file:

machine         i386
#cpu            I386_CPU
#cpu            I486_CPU
#cpu            I586_CPU
cpu             I686_CPU
ident           ATLEO4
maxusers        256

makeoptions     DEBUG=-g                #Build kernel with gdb(1) debug symbols

options         INET                    #InterNETworking
options         INET6                   #IPv6 communications protocols
options         IPSEC                   #IP security
options         IPSEC_ESP               #IP security (crypto; define w/ IPSEC)
options         IPSEC_DEBUG             #debug for IP security
options         MROUTING
options         IPFIREWALL              #firewall
options         IPFIREWALL_VERBOSE      #print information about
                                        # dropped packets
options         IPFIREWALL_FORWARD      #enable transparent proxy support
options         IPFIREWALL_VERBOSE_LIMIT=100    #limit verbosity
options         IPFIREWALL_DEFAULT_TO_ACCEPT    #allow everything by default
options         IPV6FIREWALL            #firewall for IPv6
options         IPV6FIREWALL_VERBOSE
options         IPV6FIREWALL_VERBOSE_LIMIT=100
options         IPV6FIREWALL_DEFAULT_TO_ACCEPT
options         IPDIVERT                #divert sockets
options         IPFILTER                #ipfilter support
options         IPFILTER_LOG            #ipfilter logging
options         IPSTEALTH               #support for stealth forwarding
options         TCPDEBUG
#options         TCP_DROP_SYNFIN         #drop TCP packets with SYN+FIN
options         TCP_RESTRICT_RST        #restrict emission of TCP RST
options         NETATALK                #Appletalk protocol


options         FFS                     #Berkeley Fast Filesystem
options         FFS_ROOT                #FFS usable as root device [keep this!]
options         SOFTUPDATES             #Enable FFS soft updates support
options         MFS                     #Memory Filesystem
options         MD_ROOT                 #MD is a potential root device
options         NFS                     #Network Filesystem
options         NFS_ROOT                #NFS usable as root device, NFS required

options         COMPAT_43               #Compatible with BSD 4.3 [KEEP THIS!]
options         SCSI_DELAY=3000 #Delay (in ms) before probing SCSI
options         UCONSOLE                #Allow users to grab the console
options         USERCONFIG              #boot -c editor
options         VISUAL_USERCONFIG       #visual boot -c editor
options         KTRACE                  #ktrace(1) support
options         SYSVSHM                 #SYSV-style shared memory
options         SYSVMSG                 #SYSV-style message queues
options         SYSVSEM                 #SYSV-style semaphores
options         P1003_1B                #Posix P1003_1B real-time extensions
options         _KPOSIX_PRIORITY_SCHEDULING
options         ICMP_BANDLIM            #Rate limit bad replies
options         KBD_INSTALL_CDEV        # install a CDEV entry in /dev
options         NETGRAPH

# To make an SMP kernel, the next two are needed
options         SMP                     # Symmetric MultiProcessor Kernel
options         APIC_IO                 # Symmetric (APIC) I/O
# Optionally these may need tweaked, (defaults shown):
options         NCPU=4                  # number of CPUs
options         NBUS=3                  # number of busses
options         NAPIC=1                 # number of IO APICs
options         NINTR=24                # number of INTs

device          isa
device          eisa
device          pci

# Floppy drives
device          fdc0    at isa? port IO_FD1 irq 6 drq 2
device          fd0     at fdc0 drive 0
device          fd1     at fdc0 drive 1

# ATA and ATAPI devices
#device         ata0    at isa? port IO_WD1 irq 14
#device         ata1    at isa? port IO_WD2 irq 15
device          ata
device          atadisk                 # ATA disk drives
device          atapicd                 # ATAPI CDROM drives
device          atapifd                 # ATAPI floppy drives
device          atapist                 # ATAPI tape drives
#options        ATA_STATIC_ID           #Static device numbering
options         ATA_ENABLE_ATAPI_DMA    #Enable DMA on ATAPI devices


# SCSI Controllers
#device         ahb             # EISA AHA1742 family
device          ahc0            # AHA2940 and onboard AIC7xxx devices
device          ahc1            # AHA2940 and onboard AIC7xxx devices
device          ahc2            # AHA2940 and onboard AIC7xxx devices

# SCSI peripherals
device          scbus           # SCSI bus (required)
device          da              # Direct Access (disks)
device          sa              # Sequential Access (tape etc)
device          ch              # SCSI media changers
device          cd              # CD
device          pass            # Passthrough device (direct SCSI access)
device          pt              # SCSI processor type
device          ses             # SCSI SES/SAF-TE driver

# disks
# the first ahc0 is the external controller, which we use as last bus
# the first internal ahc1 is the first we use with the SCA disks
# the second internal ahc2 has the CD-ROM and the Archive Python
device          scbus0 at ahc1
device          scbus1 at ahc2
device          scbus2 at ahc0
device          da0 at scbus0 target 0
device          da1 at scbus0 target 1
device          da2 at scbus0 target 2
device          da3 at scbus0 target 3

# atkbdc0 controls both the keyboard and the PS/2 mouse
device          atkbdc0 at isa? port IO_KBD
device          atkbd0  at atkbdc? irq 1 flags 0x1
device          psm0    at atkbdc? irq 12

device          vga0    at isa?

# splash screen/screen saver
pseudo-device   splash

# syscons is the default console driver, resembling an SCO console
device          sc0     at isa? flags 0x100
options         MAXCONS=12              # number of virtual consoles
options         SC_NORM_ATTR="(FG_LIGHTGREY|BG_BLACK)"
options         SC_NORM_REV_ATTR="(FG_YELLOW|BG_GREEN)"
options         SC_KERNEL_CONS_ATTR="(FG_WHITE|BG_BLUE)"
options         SC_KERNEL_CONS_REV_ATTR="(FG_BLACK|BG_RED)"

# Floating point support - do not disable.
device          npx0    at nexus? port IO_NPX irq 13

# Power management support (see LINT for more options)
device          apm0    at nexus? disable flags 0x20 # Advanced Power Management

# PCCARD (PCMCIA) support

# Serial (COM) ports
device          sio0    at isa? port IO_COM1 flags 0x10 irq 4
device          sio1    at isa? port IO_COM2 irq 3
device          sio2    at isa? disable port IO_COM3 irq 5
device          sio3    at isa? disable port IO_COM4 irq 9

# Parallel port
device          ppc0    at isa? irq 7
device          ppbus           # Parallel port bus (required)
device          lpt             # Printer
device          plip            # TCP/IP over parallel
device          ppi             # Parallel port interface device
#device         vpo             # Requires scbus and da

# PCI Ethernet NICs.
device          de              # DEC/Intel DC21x4x (``Tulip'')
device          fxp             # Intel EtherExpress PRO/100B (82557, 82558)
device          tx              # SMC 9432TX (83c170 ``EPIC'')
device          vx              # 3Com 3c590, 3c595 (``Vortex'')
device          wx              # Intel Gigabit Ethernet Card (``Wiseman'')

# PCI Ethernet NICs that use the common MII bus controller code.
device          miibus          # MII bus support
device          dc              # DEC/Intel 21143 and various workalikes
device          rl              # RealTek 8129/8139
device          sf              # Adaptec AIC-6915 (``Starfire'')
device          sis             # Silicon Integrated Systems SiS 900/SiS 7016
device          ste             # Sundance ST201 (D-Link DFE-550TX)
device          tl              # Texas Instruments ThunderLAN
device          vr              # VIA Rhine, Rhine II
device          wb              # Winbond W89C840F
device          xl              # 3Com 3c90x (``Boomerang'', ``Cyclone'')

# ISA Ethernet NICs.

# Pseudo devices - the number indicates how many units to allocate.
pseudo-device   loop            # Network loopback
pseudo-device   ether           # Ethernet support
pseudo-device   sl      1       # Kernel SLIP
pseudo-device   ppp     1       # Kernel PPP
pseudo-device   tun             # Packet tunnel.
pseudo-device   pty     256     # Pseudo-ttys (telnet etc)
pseudo-device   md              # Memory "disks"
pseudo-device   gif     4       # IPv6 and IPv4 tunneling
pseudo-device   faith   1       # IPv6-to-IPv4 relaying (translation)
pseudo-device   vn
pseudo-device   snp     4

# The `bpf' pseudo-device enables the Berkeley Packet Filter.
# Be aware of the administrative consequences of enabling this!
pseudo-device   bpf             #Berkeley packet filter

# USB support
device          uhci            # UHCI PCI->USB interface
device          ohci            # OHCI PCI->USB interface
device          usb             # USB Bus (required)
device          ugen            # Generic
device          uhid            # "Human Interface Devices"
device          ukbd            # Keyboard
device          ulpt            # Printer
device          umass           # Disks/Mass storage - Requires scbus and da
device          ums             # Mouse
# USB Ethernet, requires mii
device          aue             # ADMtek USB ethernet
device          cue             # CATC USB ethernet
device          kue             # Kawasaki LSI USB ethernet

VINUM statements according to instructions on www.vinumvm.org:

Problem: Repeated crashes (kernel panics) during heavy disk access on a vinum
         device.
FreeBSD: 4.1-STABLE, no changes to the sources

Vinum list: one raid5 volume from 4 ATA drives

atleo4:/usr/src#vinum list
4 drives:
D d1                    State: up       Device /dev/ad0s1e      Avail: 0/73304 MB (0%)
D d2                    State: up       Device /dev/ad1s1e      Avail: 0/73304 MB (0%)
D d3                    State: up       Device /dev/ad2s1e      Avail: 0/73304 MB (0%)
D d4                    State: up       Device /dev/ad3s1e      Avail: 0/73304 MB (0%)
 
1 volumes:
V leoata                State: up       Plexes:       1 Size:        214 GB

1 plexes:
P leoata.p0          R5 State: up       Subdisks:     4 Size:        214 GB

4 subdisks:
S leoata.p0.s0          State: up       PO:        0  B Size:         71 GB
S leoata.p0.s1          State: up       PO:      512 kB Size:         71 GB
S leoata.p0.s2          State: up       PO:     1024 kB Size:         71 GB
S leoata.p0.s3          State: up       PO:     1536 kB Size:         71 GB 


The history file reflects the creation of the volume,
which didn't cause any problems:

History file in: /var/log/vinum_history (not /var/tmp !):
[..]
 6 Sep 2000 17:41:13.473942 *** vinum started ***
 6 Sep 2000 17:41:13.475950 create -v vinum.init.leoata
drive d1 device /dev/ad0e
drive d2 device /dev/ad1e
drive d3 device /dev/ad2e
drive d4 device /dev/ad3e
volume leoata
  plex org raid5 512k
    sd length 150127097s drive d1
    sd length 150127097s drive d2
    sd length 150127097s drive d3
    sd length 150127097s drive d4
 6 Sep 2000 17:41:13.491734 *** Created devices ***
[..]
 6 Sep 2000 17:50:55.914542 *** vinum started ***
 6 Sep 2000 17:50:55.916405 init -w leoata.p0
[..]

/var/log/messages from the same period:
[..]
Sep  6 17:41:13 atleo4 /kernel: vinum: drive d1 is up
Sep  6 17:41:13 atleo4 /kernel: vinum: drive d2 is up
Sep  6 17:41:13 atleo4 /kernel: vinum: drive d3 is up
Sep  6 17:41:13 atleo4 /kernel: vinum: drive d4 is up
Sep  6 17:41:13 atleo4 /kernel: vinum: removing 1515 blocks of partial stripe at the end of leoata.p0
Sep  6 17:50:55 atleo4 /kernel: vinum: leoata.p0.s2 is initializing by force
Sep  6 17:50:55 atleo4 /kernel: vinum: leoata.p0 is initializing
Sep  6 17:50:55 atleo4 /kernel: vinum: leoata.p0.s0 is initializing by force
Sep  6 17:50:56 atleo4 /kernel: vinum: leoata.p0.s1 is initializing by force
Sep  6 17:50:56 atleo4 /kernel: vinum: leoata.p0.s3 is initializing by force
[..]
Sep  6 21:08:09 atleo4 /kernel: vinum: leoata.p0.s0 is initialized by force
Sep  6 21:08:10 atleo4 /kernel: vinum: leoata.p0.s0 is initialized
Sep  6 21:08:10 atleo4 /kernel: vinum: leoata.p0.s1 is initialized by force
Sep  6 21:08:10 atleo4 /kernel: vinum: leoata.p0.s1 is initialized
Sep  6 21:08:32 atleo4 /kernel: vinum: leoata.p0.s2 is initialized by force
Sep  6 21:08:32 atleo4 /kernel: vinum: leoata.p0.s2 is initialized
Sep  6 21:08:32 atleo4 /kernel: vinum: leoata.p0.s3 is initialized by force
Sep  6 21:08:32 atleo4 /kernel: vinum: leoata.p0.s0 is up
Sep  6 21:08:32 atleo4 /kernel: vinum: leoata.p0.s1 is up
Sep  6 21:08:32 atleo4 /kernel: vinum: leoata.p0.s2 is up
Sep  6 21:08:32 atleo4 /kernel: vinum: leoata.p0.s3 is up
Sep  6 21:08:32 atleo4 /kernel: vinum: leoata.p0 is up
Sep  6 21:08:32 atleo4 /kernel: vinum: leoata is up
Sep  6 21:08:32 atleo4 /kernel: vinum: leoata.p0.s3 is up
[..]
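
The sizes in the listing and the "removing 1515 blocks" message can be
cross-checked against the create statements. The following is only a
sketch: it assumes that each subdisk is rounded down to a whole number
of 512k stripes and that one subdisk's worth of space holds parity. The
function name and constants are illustrative and are not part of vinum.

```python
SECTOR_BYTES = 512                            # "s" suffix: 512-byte sectors
STRIPE_SECTORS = 512 * 1024 // SECTOR_BYTES   # "plex org raid5 512k"
SUBDISK_SECTORS = 150127097                   # "sd length 150127097s"
SUBDISKS = 4                                  # d1 .. d4


def raid5_geometry(subdisk_sectors, subdisks, stripe_sectors):
    """Return (usable data sectors, trimmed data sectors) for one plex.

    Assumption: the leftover part of each subdisk that does not fill a
    whole stripe is dropped, and parity costs one subdisk's capacity.
    """
    data_disks = subdisks - 1
    full_stripes, leftover = divmod(subdisk_sectors, stripe_sectors)
    usable = full_stripes * stripe_sectors * data_disks
    trimmed = leftover * data_disks
    return usable, trimmed


usable, trimmed = raid5_geometry(SUBDISK_SECTORS, SUBDISKS, STRIPE_SECTORS)
print("trimmed data sectors:", trimmed)
print("volume size: %.1f GB" % (usable * SECTOR_BYTES / 2**30))
print("subdisk size: %.1f GB" % (SUBDISK_SECTORS * SECTOR_BYTES / 2**30))
```

Under these assumptions the trimmed count comes out as 1515, the same
number the kernel logged, and the sizes agree with the 214 GB volume
and 71 GB subdisks shown by vinum list (taking GB as 2^30 bytes).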

newfs, mount, etc. worked.

Crash analysis: 4 crashes total within two days!!
The machine did not crash before vinum was used on it.

I'm pretty sure that the modules and kernel are compiled
with debugging symbols, that is, configured with -g (CONFIGARGS= -g) and
with makeoptions DEBUG=-g in the kernel config.

atleo4:/var/crash#file /modules/vinum.ko
/modules/vinum.ko: ELF 32-bit LSB shared object, Intel 80386, version 1 (FreeBSD), not stripped

atleo4:/var/crash#file kernel.1
kernel.1: ELF 32-bit LSB executable, Intel 80386, version 1 (FreeBSD), dynamically linked, not stripped
atleo4:/var/crash#file kernel.2
kernel.2: ELF 32-bit LSB executable, Intel 80386, version 1 (FreeBSD), dynamically linked, not stripped
atleo4:/var/crash#file kernel.3
kernel.3: ELF 32-bit LSB executable, Intel 80386, version 1 (FreeBSD), dynamically linked, not stripped
atleo4:/var/crash#file kernel.4
kernel.4: ELF 32-bit LSB executable, Intel 80386, version 1 (FreeBSD), dynamically linked, not stripped

But I don't seem to get a proper analysis with your .gdbinit.* files,
and gdb says: no debugging symbols found ???

Maybe there is something I missed, but what ???

However...

Crash 1:
atleo4:/var/crash#gdb -k kernel.1 vmcore.1
GNU gdb 4.18
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-unknown-freebsd"...
(no debugging symbols found)...
SMP 4 cpus
IdlePTD 4284416
initial pcb at 3608e0
panicstr: page fault
panic messages:
---
Fatal trap 12: page fault while in kernel mode
mp_lock = 00000002; cpuid = 0; lapic.id = 00000000
fault virtual address   = 0x0
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc23266ca
stack pointer           = 0x10:0xff806f00
frame pointer           = 0x10:0xff806f1c
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = Idle
interrupt mask          = bio  <- SMP: XXX
trap number             = 12
panic: page fault
mp_lock = 00000002; cpuid = 0; lapic.id = 00000000
boot() called on cpu#0

syncing disks...

Fatal trap 12: page fault while in kernel mode
mp_lock = 00000003; cpuid = 0; lapic.id = 00000000
fault virtual address   = 0x30
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc0273971
stack pointer           = 0x10:0xff806d20
frame pointer           = 0x10:0xff806d24
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = Idle
interrupt mask          = bio  <- SMP: XXX
trap number             = 12
panic: page fault
mp_lock = 00000003; cpuid = 0; lapic.id = 00000000
boot() called on cpu#0
Uptime: 1h18m17s

dumping to dev #da/0x20001, offset 1048576
dump 512 ...
---
#0  0xc016b6b8 in boot ()
.gdbinit:4: Error in sourced command file:
Attempt to extract a component of a value that is not a structure.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This may be because of missing debugging symbols ??
Stacktrace:
(kgdb) bt
#0  0xc016b6b8 in boot ()
#1  0xc016ba70 in poweroff_wait ()
#2  0xc02d9baf in trap_fatal ()
#3  0xc02d9845 in trap_pfault ()
#4  0xc02d93df in trap ()
#5  0xc0273971 in acquire_lock ()
#6  0xc0277660 in softdep_update_inodeblock ()
#7  0xc0272c5d in ffs_update ()
#8  0xc027a931 in ffs_sync ()
#9  0xc01993f3 in sync ()
#10 0xc016b48b in boot ()
#11 0xc016ba70 in poweroff_wait ()
#12 0xc02d9baf in trap_fatal ()
#13 0xc02d9845 in trap_pfault ()
#14 0xc02d93df in trap ()
#15 0xc23266ca in ?? ()
#16 0xc019136b in biodone ()
#17 0xc02af030 in ad_interrupt ()
#18 0xc02ab3e6 in ata_intr ()
#19 0xc02e202d in intr_mux ()

Crash 2:
[..]
SMP 4 cpus
IdlePTD 4284416
initial pcb at 3608e0
panicstr: page fault
panic messages:
---
Fatal trap 12: page fault while in kernel mode
mp_lock = 00000002; cpuid = 0; lapic.id = 00000000
fault virtual address   = 0xc3608010
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc232a112
stack pointer           = 0x10:0xff806ee8
frame pointer           = 0x10:0xff806ef0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = Idle
interrupt mask          = bio  <- SMP: XXX
trap number             = 12
panic: page fault
mp_lock = 00000002; cpuid = 0; lapic.id = 00000000
boot() called on cpu#0

syncing disks...

Fatal trap 12: page fault while in kernel mode
mp_lock = 00000003; cpuid = 0; lapic.id = 00000000
fault virtual address   = 0x30
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc0273971
stack pointer           = 0x10:0xff806d08
frame pointer           = 0x10:0xff806d0c
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = Idle
interrupt mask          = bio  <- SMP: XXX
trap number             = 12
panic: page fault
mp_lock = 00000003; cpuid = 0; lapic.id = 00000000
boot() called on cpu#0
Uptime: 14h5m17s
[..]

#0  0xc016b6b8 in boot ()
#1  0xc016ba70 in poweroff_wait ()
#2  0xc02d9baf in trap_fatal ()
#3  0xc02d9845 in trap_pfault ()
#4  0xc02d93df in trap ()
#5  0xc0273971 in acquire_lock ()
#6  0xc0277660 in softdep_update_inodeblock ()
#7  0xc0272c5d in ffs_update ()
#8  0xc027a931 in ffs_sync ()
#9  0xc01993f3 in sync ()
#10 0xc016b48b in boot ()
#11 0xc016ba70 in poweroff_wait ()
#12 0xc02d9baf in trap_fatal ()
#13 0xc02d9845 in trap_pfault ()
#14 0xc02d93df in trap ()
#15 0xc232a112 in ?? ()
#16 0xc2326bfc in ?? ()
#17 0xc019136b in biodone ()
#18 0xc02af030 in ad_interrupt ()
#19 0xc02ab3e6 in ata_intr ()
#20 0xc02e202d in intr_mux ()

Crash 3: This one is different ...

SMP 4 cpus
IdlePTD 4272128
initial pcb at 360920
panicstr: ffs_valloc: dup alloc
panic messages:
---
panic: ffs_valloc: dup alloc
mp_lock = 00000001; cpuid = 0; lapic.id = 00000000
boot() called on cpu#0

syncing disks... 166 38 19 5 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
giving up on 2 buffers
Uptime: 11h42m59s
[..]
#0  0xc016b6bc in boot ()
#1  0xc016ba74 in poweroff_wait ()
#2  0xc0270030 in ffs_valloc ()
#3  0xc02817ca in ufs_mkdir ()
#4  0xc02827d5 in ufs_vnoperate ()
#5  0xc019c28a in mkdir ()
#6  0xc02d9f09 in syscall2 ()
#7  0xc02c845b in Xint0x80_syscall ()
#8  0x804efc7 in ?? ()
#9  0x80494fd in ?? ()
[..]

Crash 4:

SMP 4 cpus
IdlePTD 4272128
initial pcb at 360920
panicstr: page fault
panic messages:
---
Fatal trap 12: page fault while in kernel mode
mp_lock = 03000002; cpuid = 3; lapic.id = 02000000
fault virtual address   = 0xc32c9010
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc232a112
stack pointer           = 0x10:0xff81bee8
frame pointer           = 0x10:0xff81bef0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = Idle
interrupt mask          = bio  <- SMP: XXX
trap number             = 12
panic: page fault
mp_lock = 03000002; cpuid = 3; lapic.id = 02000000
boot() called on cpu#3

syncing disks...

Fatal trap 12: page fault while in kernel mode
mp_lock = 03000003; cpuid = 3; lapic.id = 02000000
fault virtual address   = 0x30
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc027397d
stack pointer           = 0x10:0xff81bd00
frame pointer           = 0x10:0xff81bd04
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = Idle
interrupt mask          = bio  <- SMP: XXX
trap number             = 12
panic: page fault
mp_lock = 03000003; cpuid = 3; lapic.id = 02000000
boot() called on cpu#3
Uptime: 3h23m29s
[..]
(kgdb) bt
#0  0xc016b6bc in boot ()
#1  0xc016ba74 in poweroff_wait ()
#2  0xc02d9bdf in trap_fatal ()
#3  0xc02d9875 in trap_pfault ()
#4  0xc02d940f in trap ()
#5  0xc027397d in acquire_lock ()
#6  0xc0277b52 in softdep_fsync_mountdev ()
#7  0xc027bc9a in ffs_fsync ()
#8  0xc027a9c6 in ffs_sync ()
#9  0xc01993e7 in sync ()
#10 0xc016b48f in boot ()
#11 0xc016ba74 in poweroff_wait ()
#12 0xc02d9bdf in trap_fatal ()
#13 0xc02d9875 in trap_pfault ()
#14 0xc02d940f in trap ()
#15 0xc232a112 in ?? ()
#16 0xc2326bfc in ?? ()
#17 0xc019135f in biodone ()
#18 0xc02af068 in ad_interrupt ()
#19 0xc02ab41e in ata_intr ()
#20 0xc02e205d in intr_mux ()
[..]
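
To compare the page-fault panics side by side, the first fault address
and instruction pointer of each dump can be pulled out of the panic
messages. This is a small sketch, not an existing tool; the helper name
is made up, and the sample lines are copied from crashes 1, 2 and 4
above.

```python
import re

# Matches the two fields of interest in a "Fatal trap 12" message.
FIELD_RE = re.compile(
    r"^(fault virtual address|instruction pointer)\s*=\s*(\S+)", re.M)


def first_fault(panic_text):
    """Return the fields of the first trap in a panic message.

    Later traps (the ones raised while syncing disks) are ignored,
    because setdefault keeps only the first value seen per field.
    """
    fields = {}
    for name, value in FIELD_RE.findall(panic_text):
        fields.setdefault(name, value)
    return fields


# First trap of each page-fault crash, as quoted in this report.
crashes = {
    1: "fault virtual address   = 0x0\n"
       "instruction pointer     = 0x8:0xc23266ca\n",
    2: "fault virtual address   = 0xc3608010\n"
       "instruction pointer     = 0x8:0xc232a112\n",
    4: "fault virtual address   = 0xc32c9010\n"
       "instruction pointer     = 0x8:0xc232a112\n",
}

for number, text in sorted(crashes.items()):
    info = first_fault(text)
    print(number, info["instruction pointer"], info["fault virtual address"])
```

Crashes 2 and 4 fault at the same instruction pointer with different
fault addresses, while crash 1 faults at a different address in the
same 0xc232xxxx range.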

Of course this could be an ATA problem, but I already had
two crashes in a previous configuration while trying to
set up a stripe with two SCSI disks.
A detailed description of these previous problems was
sent to Greg Lehey <grog@lemis.com> on August 16, 2000.






>How-To-Repeat:

Tricky, since this is a rather unique hardware configuration.
On this configuration it seems to be sufficient to
transfer huge amounts of data to the vinum device
(around 100GB have been transferred in total, interrupted
by the crashes; the largest portion during a single uptime may be
around 50GB). The data was transferred via NFS.

The filesystem uses SOFTUPDATES. The first crash
corrupted it severely, so that fsck had to
be run manually (producing lots of 'unexpected softupdates
inconsistency' errors). But I guess that's just a side effect.
>Fix:
Nope.

>Release-Note:
>Audit-Trail:
State-Changed-From-To: open->feedback 
State-Changed-By: grog 
State-Changed-When: Sat Sep 9 17:41:25 PDT 2000 
State-Changed-Why:  
Submitter did not supply the required information. 


Responsible-Changed-From-To: freebsd-bugs->grog 
Responsible-Changed-By: grog 
Responsible-Changed-When: Sat Sep 9 17:41:25 PDT 2000 
Responsible-Changed-Why:  
grog supports Vinum. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=21148 

From: Daniel Lang <dl@leo.org>
To: grog@FreeBSD.org
Cc: freebsd-bugs@FreeBSD.org, freebsd-gnats-submit@freebsd.org
Subject: Re: kern/21148: multiple crashes while using vinum
Date: Mon, 11 Sep 2000 13:54:21 +0000

 Ok,
 
 Some more information; unfortunately there is still no backtrace with
 "vinum debug" in the calling frame. I will try to build vinum
 statically into the kernel, maybe this could help...
 
 So, I can now reproduce panics in a deterministic way.
 The machine repeatedly crashed during periodic daily,
 and I could track it down to a simple:
 
 find /leo/.mntpts/2 -xdev -type f \( -perm -u+x -or -perm -g+x -or -perm -o+x \) \( -perm -u+s -or -perm -g+s \) -print0
 
 (with /leo/.mntpts/2 being the mountpoint of the vinum volume).
 
 The panic can also be triggered by just executing
 find /leo/.mntpts/2 -xdev -type f -print
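
 For anyone who wants to drive the same kind of traversal from a script,
 the following is a rough Python approximation of the simpler find
 command. It is only a sketch under assumptions: the function name is
 made up, it merely mimics the per-entry lstat and directory reads that
 find performs, and it has not been verified to trigger the panic.

```python
import os
import stat


def walk_regular_files(root):
    """Yield regular files below root without crossing mount points.

    Roughly comparable to: find ROOT -xdev -type f -print
    """
    root_dev = os.lstat(root).st_dev
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune directories that live on another device (-xdev).
        dirnames[:] = [
            d for d in dirnames
            if os.lstat(os.path.join(dirpath, d)).st_dev == root_dev
        ]
        for name in filenames:
            path = os.path.join(dirpath, name)
            # -type f: regular files only, so symlinks and devices are skipped.
            if stat.S_ISREG(os.lstat(path).st_mode):
                yield path
```

 Calling it as "for path in walk_regular_files('/leo/.mntpts/2'): print(path)"
 would list the same kind of entries as the find invocation above.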
 
 That may not help much, but it's something more...
 
 Daniel
 -- 
 IRCnet: Mr-Spock         - ceterum censeo Microsoftinem esse delendam -  
 *Daniel Lang * dl@leo.org * +49 89 289 25735 * http://www.leo.org/~dl/*
 

From: Daniel Lang <dl@leo.org>
To: grog@FreeBSD.org
Cc: freebsd-bugs@FreeBSD.org, freebsd-gnats-submit@freebsd.org
Subject: Re: kern/21148: multiple crashes while using vinum
Date: Mon, 18 Sep 2000 10:52:52 +0000

 Hi,
 
 to further trace the problem, as the crash dumps did not seem
 to produce any usable stack traces, I hooked the box up to a
 remote debugging session with DDB/GDB.
 (No problem to panic the box, as described).
 
 As far as I can tell, the crash did not happen inside the
 vinum module. This may be why your .gdbinit scripts
 don't seem to apply, I guess.
 The crash happened inside the ata driver, but it seems that a
 formerly valid pointer is overwritten somehow, so that it contains
 garbage, which leads to the crash
 (I assume that the 'struct ata_softc' is corrupted).
 
 This probably comes from a buffer overrun somewhere else that
 writes into already allocated memory. Unfortunately such errors
 are very difficult to trace (at least in my experience).
 Since the error only appears on the system with a vinum RAID-5,
 and only (but reproducibly) while accessing this filesystem,
 I (possibly naively) assume it must be a problem with vinum.
 
 Unfortunately I guess we are stuck here, since I am not able
 to produce more data that could help with the problem.
 
 However I would grant access to the machine and the debugger,
 if someone would like to inspect the situation personally.
 
 Best regards,
  Daniel Lang
 -- 
 IRCnet: Mr-Spock              - Truth lies in the eye of the beholder - 
 *Daniel Lang * dl@leo.org * +49 89 289 25735 * http://www.leo.org/~dl/*
 

From: Andy Newman <andy@silverbrook.com.au>
To: freebsd-gnats-submit@FreeBSD.org, dl@leo.org
Cc:  
Subject: Re: kern/21148: multiple crashes while using vinum
Date: Wed, 20 Sep 2000 14:12:59 +1100

 I think I'm suffering from this too.  I'm getting reliable panics on a vinum
 RAID 5 config over SCSI drives. Striped or mirror configs are fine. At
 first I was attributing this to problems with an Adaptec 29160 but it
 doesn't appear to be the culprit.
 
 A little inspection of the crash dump (backtrace etc... follows) shows
 problems with the request structure in complete_rqe().  I have two
 dumps, both show different problems but both associated with request
 completion.
 
 BTW, I tried the gdb macros from the module source directory but they
 produce error messages.  Hopefully the crash dump appears sane
 (I've done lots of kernel work on other systems but none on FreeBSD);
 following pointers makes it appear so.
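
Note on the trap frames in the backtraces of this report: kgdb prints
the register fields (tf_eip, tf_esp, ...) as signed decimal integers,
while the panic messages give the instruction pointer in hex. A minimal
sketch of the conversion, assuming 32-bit i386 registers; the helper
name frame_value_to_addr is made up for illustration:

```c
#include <stdint.h>

/*
 * Reinterpret a trap-frame field, printed by kgdb as a signed 32-bit
 * decimal, as the unsigned 32-bit address it holds on i386.
 * The signed-to-unsigned conversion is defined modulo 2^32 in C.
 * (Hypothetical helper, not part of gdb or the kernel.)
 */
static uint32_t frame_value_to_addr(int32_t value)
{
    return (uint32_t)value;
}
```

For example, frame_value_to_addr(-1072387308) gives 0xc014ab14, the
instruction pointer of the first fault in vmcore.1, and -1072385865
gives 0xc014b0b7, the faulting instruction in complete_rqe in vmcore.0.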
 
 
 Script started on Wed Sep 20 13:57:39 2000
 bash-2.04# gdb -k /sys/compile/yoda/kernel.debug vmcore.1
 GNU gdb 4.18
 Copyright 1998 Free Software Foundation, Inc.
 GDB is free software, covered by the GNU General Public License, and you
 are
 welcome to change it and/or distribute copies of it under certain
 conditions.
 Type "show copying" to see the conditions.
 There is absolutely no warranty for GDB.  Type "show warranty" for
 details.
 This GDB was configured as "i386-unknown-freebsd"...
 IdlePTD 4296704
 initial pcb at 2e7a40
 panicstr: page fault
 panic messages:
 ---
 Fatal trap 12: page fault while in kernel mode
 fault virtual address	= 0x4
 fault code		= supervisor read, page not present
 instruction pointer	= 0x8:0xc014ab14
 stack pointer	        = 0x10:0xc02c0e94
 frame pointer	        = 0x10:0xc02c0eb0
 code segment		= base 0x0, limit 0xfffff, type 0x1b
 			= DPL 0, pres 1, def32 1, gran 1
 processor eflags	= interrupt enabled, resume, IOPL = 0
 current process		= Idle
 interrupt mask		= bio 
 trap number		= 12
 panic: page fault
 
 syncing disks... 
 
 Fatal trap 12: page fault while in kernel mode
 fault virtual address	= 0x30
 fault code		= supervisor read, page not present
 instruction pointer	= 0x8:0xc0229aa0
 stack pointer	        = 0x10:0xc02c0cc8
 frame pointer	        = 0x10:0xc02c0ccc
 code segment		= base 0x0, limit 0xfffff, type 0x1b
 			= DPL 0, pres 1, def32 1, gran 1
 processor eflags	= interrupt enabled, resume, IOPL = 0
 current process		= Idle
 interrupt mask		= bio 
 trap number		= 12
 panic: page fault
 Uptime: 2h12m16s
 
 
 Fatal trap 12: page fault while in kernel mode
 fault virtual address	= 0x0
 fault code		= supervisor read, page not present
 instruction pointer	= 0x8:0xc014ab1a
 stack pointer	        = 0x10:0xc02c05a8
 frame pointer	        = 0x10:0xc02c05c4
 code segment		= base 0x0, limit 0xfffff, type 0x1b
 			= DPL 0, pres 1, def32 1, gran 1
 processor eflags	= interrupt enabled, resume, IOPL = 0
 current process		= Idle
 interrupt mask		= bio cam 
 trap number		= 12
 panic: page fault
 Uptime: 2h12m16s
 
 
 Fatal trap 12: page fault while in kernel mode
 fault virtual address	= 0x4
 fault code		= supervisor read, page not present
 instruction pointer	= 0x8:0xc014ab14
 stack pointer	        = 0x10:0xc02bfe88
 frame pointer	        = 0x10:0xc02bfea4
 code segment		= base 0x0, limit 0xfffff, type 0x1b
 			= DPL 0, pres 1, def32 1, gran 1
 processor eflags	= interrupt enabled, resume, IOPL = 0
 current process		= Idle
 interrupt mask		= bio cam 
 trap number		= 12
 panic: page fault
 Uptime: 2h12m16s
 
 dumping to dev #da/0x20001, offset 1048704
 dump 511 510 509 508 507 506 505 504 503 502 501 500 499 498 497 496 495
 494 493 492 491 490 489 488 487 486 485 484 483 482 481 480 479 478 477
 476 475 474 473 472 471 470 469 468 467 466 465 464 463 462 461 460 459
 458 457 456 455 454 453 452 451 450 449 448 447 446 445 444 443 442 441
 440 439 438 437 436 435 434 433 432 431 430 429 428 427 426 425 424 423
 422 421 420 419 418 417 416 415 414 413 412 411 410 409 408 407 406 405
 404 403 402 401 400 399 398 397 396 395 394 393 392 391 390 389 388 387
 386 385 384 383 382 381 380 379 378 377 376 375 374 373 372 371 370 369
 368 367 366 365 364 363 362 361 360 359 358 357 356 355 354 353 352 351
 350 349 348 347 346 345 344 343 342 341 340 339 338 337 336 335 334 333
 332 331 330 329 328 327 326 325 324 323 322 321 320 319 318 317 316 315
 314 313 312 311 310 309 308 307 306 305 304 303 302 301 300 299 298 297
 296 295 294 293 292 291 290 289 288 287 286 285 284 283 282 281 280 279
 278 277 276 275 274 273 272 271 270 269 268 267 266 265 264 263 262 261
 260 259 258 257 256 255 254 253 252 251 250 249 248 247 246 245 244 243
 242 241 240 239 238 237 236 235 234 233 232 231 230 229 228 227 226 225
 224 223 222 221 220 219 218 217 216 215 214 213 212 211 210 209 208 207
 206 205 204 203 202 201 200 199 198 197 196 195 194 193 192 191 190 189
 188 187 186 185 184 183 182 181 180 179 178 177 176 175 174 173 172 171
 170 169 168 167 166 165 164 163 162 161 160 159 158 157 156 155 154 153
 152 151 150 149 148 147 146 145 144 143 142 141 140 139 138 137 136 135
 134 133 132 131 130 129 128 127 126 125 124 123 122 121 120 119 118 117
 116 115 114 113 112 111 110 109 108 107 106 105 104 103 102 101 100 99
 98 97 96 95 94 93 92 91 90 89 88 87 86 85 84 83 82 81 80 79 78 77 76 75
 74 73 72 71 70 69 68 67 66 65 64 63 62 61 60 59 58 57 56 55 54 53 52 51
 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27
 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 
 ---
 #0  boot (howto=260) at ../../kern/kern_shutdown.c:302
 302			dumppcb.pcb_cr3 = rcr3();
 (kgdb) where
 #0  boot (howto=260) at ../../kern/kern_shutdown.c:302
 #1  0xc01689c4 in poweroff_wait (junk=0xc02ba0cf, howto=0) at
 ../../kern/kern_shutdown.c:552
 #2  0xc027a78d in trap_fatal (frame=0xc02bfe48, eva=4) at
 ../../i386/i386/trap.c:951
 #3  0xc027a465 in trap_pfault (frame=0xc02bfe48, usermode=0, eva=4) at
 ../../i386/i386/trap.c:844
 #4  0xc027a04b in trap (frame={tf_fs = -1070923760, tf_es = -1072562160,
 tf_ds = -1072562160, tf_edi = -1017374328, 
       tf_esi = -1043326976, tf_ebp = -1070858588, tf_isp = -1070858636,
 tf_ebx = -1017374328, tf_edx = 6865984, 
       tf_ecx = -1043326816, tf_eax = 0, tf_trapno = 12, tf_err = 0,
 tf_eip = -1072387308, tf_cs = 8, tf_eflags = 66178, 
       tf_esp = -1017374328, tf_ss = -1043326976}) at
 ../../i386/i386/trap.c:443
 #5  0xc014ab14 in complete_rqe (bp=0xc35c1988) at
 ../../dev/vinum/vinuminterrupt.c:72
 #6  0xc018d82f in biodone (bp=0xc35c1988) at ../../kern/vfs_bio.c:2637
 #7  0xc012b7fd in dadone (periph=0xc1cec200, done_ccb=0xc1cfb800) at
 ../../cam/scsi/scsi_da.c:1262
 #8  0xc012771f in camisr (queue=0xc02e5830) at ../../cam/cam_xpt.c:6323
 #9  0xc0127531 in swi_cambio () at ../../cam/cam_xpt.c:6226
 #10 0xc0124930 in xpt_polled_action (start_ccb=0xc02c0238) at
 ../../cam/cam_xpt.c:3393
 #11 0xc012bcc5 in dashutdown (arg=0x0, howto=260) at
 ../../cam/scsi/scsi_da.c:1554
 #12 0xc0168610 in boot (howto=260) at ../../kern/kern_shutdown.c:297
 #13 0xc01689c4 in poweroff_wait (junk=0xc02ba0cf, howto=0) at
 ../../kern/kern_shutdown.c:552
 #14 0xc027a78d in trap_fatal (frame=0xc02c0568, eva=0) at
 ../../i386/i386/trap.c:951
 #15 0xc027a465 in trap_pfault (frame=0xc02c0568, usermode=0, eva=0) at
 ../../i386/i386/trap.c:844
 #16 0xc027a04b in trap (frame={tf_fs = -1070858224, tf_es = -1072562160,
 tf_ds = -1072562160, tf_edi = -1017372280, 
       tf_esi = -1043326976, tf_ebp = -1070856764, tf_isp = -1070856812,
 tf_ebx = -1017372280, tf_edx = 0, 
       tf_ecx = -1043326816, tf_eax = -1017372672, tf_trapno = 12, tf_err
 = 0, tf_eip = -1072387302, tf_cs = 8, 
       tf_eflags = 66182, tf_esp = -1017372280, tf_ss = -1043326976}) at
 ../../i386/i386/trap.c:443
 #17 0xc014ab1a in complete_rqe (bp=0xc35c2188) at
 ../../dev/vinum/vinuminterrupt.c:73
 #18 0xc018d82f in biodone (bp=0xc35c2188) at ../../kern/vfs_bio.c:2637
 #19 0xc012b7fd in dadone (periph=0xc1cec200, done_ccb=0xc1eb1400) at
 ../../cam/scsi/scsi_da.c:1262
 #20 0xc012771f in camisr (queue=0xc02e5830) at ../../cam/cam_xpt.c:6323
 #21 0xc0127531 in swi_cambio () at ../../cam/cam_xpt.c:6226
 #22 0xc0124930 in xpt_polled_action (start_ccb=0xc02c0958) at
 ../../cam/cam_xpt.c:3393
 #23 0xc012bcc5 in dashutdown (arg=0x0, howto=260) at
 ../../cam/scsi/scsi_da.c:1554
 #24 0xc0168610 in boot (howto=260) at ../../kern/kern_shutdown.c:297
 #25 0xc01689c4 in poweroff_wait (junk=0xc02ba0cf, howto=0) at
 ../../kern/kern_shutdown.c:552
 #26 0xc027a78d in trap_fatal (frame=0xc02c0c88, eva=48) at
 ../../i386/i386/trap.c:951
 #27 0xc027a465 in trap_pfault (frame=0xc02c0c88, usermode=0, eva=48) at
 ../../i386/i386/trap.c:844
 #28 0xc027a04b in trap (frame={tf_fs = 16, tf_es = 16, tf_ds =
 -1072300016, tf_edi = 0, tf_esi = -1069900608, 
       tf_ebp = -1070854964, tf_isp = -1070854988, tf_ebx = -1070752356,
 tf_edx = 6865984, tf_ecx = 12, tf_eax = 0, 
       tf_trapno = 12, tf_err = 0, tf_eip = -1071474016, tf_cs = 8,
 tf_eflags = 66050, tf_esp = 0, tf_ss = -1070854936})
     at ../../i386/i386/trap.c:443
 #29 0xc0229aa0 in acquire_lock (lk=0xc02d9d9c) at
 ../../ufs/ffs/ffs_softdep.c:265
 #30 0xc022dc62 in softdep_fsync_mountdev (vp=0xd4b08c00) at
 ../../ufs/ffs/ffs_softdep.c:3788
 #31 0xc0231d16 in ffs_fsync (ap=0xc02c0d40) at
 ../../ufs/ffs/ffs_vnops.c:134
 #32 0xc0230a36 in ffs_sync (mp=0xc1cfee00, waitfor=2, cred=0xc1441680,
 p=0xc03a9cc0) at vnode_if.h:537
 #33 0xc01953f7 in sync (p=0xc03a9cc0, uap=0x0) at
 ../../kern/vfs_syscalls.c:544
 #34 0xc0168413 in boot (howto=256) at ../../kern/kern_shutdown.c:224
 #35 0xc01689c4 in poweroff_wait (junk=0xc02ba0cf, howto=0) at
 ../../kern/kern_shutdown.c:552
 #36 0xc027a78d in trap_fatal (frame=0xc02c0e54, eva=4) at
 ../../i386/i386/trap.c:951
 #37 0xc027a465 in trap_pfault (frame=0xc02c0e54, usermode=0, eva=4) at
 ../../i386/i386/trap.c:844
 #38 0xc027a04b in trap (frame={tf_fs = -1070858224, tf_es = -1072562160,
 tf_ds = -1072562160, tf_edi = -1008691832, 
       tf_esi = -1043326976, tf_ebp = -1070854480, tf_isp = -1070854528,
 tf_ebx = -1008691832, tf_edx = 6865984, 
       tf_ecx = -1043326816, tf_eax = 0, tf_trapno = 12, tf_err = 0,
 tf_eip = -1072387308, tf_cs = 8, tf_eflags = 66182, 
       tf_esp = -1008691832, tf_ss = -1043326976}) at
 ../../i386/i386/trap.c:443
 #39 0xc014ab14 in complete_rqe (bp=0xc3e09588) at
 ../../dev/vinum/vinuminterrupt.c:72
 #40 0xc018d82f in biodone (bp=0xc3e09588) at ../../kern/vfs_bio.c:2637
 #41 0xc012b7fd in dadone (periph=0xc1cec200, done_ccb=0xc1e9b800) at
 ../../cam/scsi/scsi_da.c:1262
 #42 0xc012771f in camisr (queue=0xc02e5830) at ../../cam/cam_xpt.c:6323
 #43 0xc0127531 in swi_cambio () at ../../cam/cam_xpt.c:6226
 #44 0xc0270db0 in splz_swi ()
 (kgdb) up
 #1  0xc01689c4 in poweroff_wait (junk=0xc02ba0cf, howto=0) at
 ../../kern/kern_shutdown.c:552
 552		boot(bootopt);
 (kgdb) 
 #2  0xc027a78d in trap_fatal (frame=0xc02bfe48, eva=4) at
 ../../i386/i386/trap.c:951
 951			panic(trap_msg[type]);
 (kgdb) 
 #3  0xc027a465 in trap_pfault (frame=0xc02bfe48, usermode=0, eva=4) at
 ../../i386/i386/trap.c:844
 844			trap_fatal(frame, eva);
 (kgdb) 
 #4  0xc027a04b in trap (frame={tf_fs = -1070923760, tf_es = -1072562160,
 tf_ds = -1072562160, tf_edi = -1017374328, 
       tf_esi = -1043326976, tf_ebp = -1070858588, tf_isp = -1070858636,
 tf_ebx = -1017374328, tf_edx = 6865984, 
       tf_ecx = -1043326816, tf_eax = 0, tf_trapno = 12, tf_err = 0,
 tf_eip = -1072387308, tf_cs = 8, tf_eflags = 66178, 
       tf_esp = -1017374328, tf_ss = -1043326976}) at
 ../../i386/i386/trap.c:443
 443				(void) trap_pfault(&frame, FALSE, eva);
 (kgdb) 
 #5  0xc014ab14 in complete_rqe (bp=0xc35c1988) at
 ../../dev/vinum/vinuminterrupt.c:72
 72	    rqg = rqe->rqg;					    /* and the request group */
 (kgdb) list 65,75
 65	    struct rqelement *rqe;
 66	    struct request *rq;
 67	    struct rqgroup *rqg;
 68	    struct buf *ubp;					    /* user buffer */
 69	    struct drive *drive;
 70	
 71	    rqe = (struct rqelement *) bp;			    /* point to the element
 element that completed */
 72	    rqg = rqe->rqg;					    /* and the request group */
 73	    rq = rqg->rq;					    /* and the complete request */
 74	    ubp = rq->bp;					    /* user buffer */
 75	
 (kgdb) set print pretty
 (kgdb) p *rqe
 $1 = {
   b = {
     b_hash = {
       le_next = 0x0, 
       le_prev = 0x0
     }, 
     b_vnbufs = {
       tqe_next = 0x0, 
       tqe_prev = 0x0
     }, 
     b_freelist = {
       tqe_next = 0x0, 
       tqe_prev = 0x0
     }, 
     b_act = {
       tqe_next = 0xc35c0020, 
       tqe_prev = 0x0
     }, 
     b_flags = 516, 
     b_qindex = 0, 
     b_xflags = 0 '\000', 
     b_lock = {
       lk_interlock = {
         lock_data = 0
       }, 
       lk_flags = 1024, 
       lk_sharecount = 0, 
       lk_waitcount = 0, 
       lk_exclusivecount = 1, 
       lk_prio = 20, 
       lk_wmesg = 0xc029c064 "bufwait", 
       lk_timo = 0, 
       lk_lockholder = 5
     }, 
     b_error = 0, 
     b_bufsize = 0, 
     b_bcount = 16384, 
     b_resid = 0, 
     b_dev = 0x0, 
     b_data = 0xccb1e000 "A\002", 
     b_kvabase = 0x0, 
     b_kvasize = 0, 
     b_lblkno = 0, 
     b_blkno = 15051625, 
     b_offset = 0, 
     b_iodone = 0xc014aafc <complete_rqe>, 
     b_iodone_chain = 0x0, 
     b_vp = 0x0, 
     b_dirtyoff = 0, 
     b_dirtyend = 0, 
     b_rcred = 0xffffffff, 
     b_wcred = 0xffffffff, 
     b_pblkno = 0, 
     b_saveaddr = 0x0, 
     b_driver1 = 0x0, 
     b_driver2 = 0x0, 
     b_caller1 = 0x0, 
     b_caller2 = 0x0, 
     b_pager = {
       pg_spc = 0x0, 
       pg_reqpage = 0
     }, 
     b_cluster = {
       cluster_head = {
         tqh_first = 0x0, 
         tqh_last = 0x0
       }, 
       cluster_entry = {
         tqe_next = 0x0, 
         tqe_prev = 0x0
       }
     }, 
     b_pages = {0x0 <repeats 32 times>}, 
     b_npages = 0, 
     b_dep = {
       lh_first = 0x0
     }, 
     b_chain = {
       parent = 0x0, 
       count = 0
     }
   }, 
   rqg = 0x0, 
   sdoffset = 15051360, 
   useroffset = 0, 
   dataoffset = 0, 
   groupoffset = 0, 
   datalen = 32, 
   grouplen = 0, 
   buflen = 0, 
   flags = 0, 
   sdno = 1, 
   driveno = 1
 }
 (kgdb) quit
 bash-2.04# gdb -k /sys/compile/yoda/kernel.debug vmcore.0
 GNU gdb 4.18
 Copyright 1998 Free Software Foundation, Inc.
 GDB is free software, covered by the GNU General Public License, and you
 are
 welcome to change it and/or distribute copies of it under certain
 conditions.
 Type "show copying" to see the conditions.
 There is absolutely no warranty for GDB.  Type "show warranty" for
 details.
 This GDB was configured as "i386-unknown-freebsd"...
 IdlePTD 4296704
 initial pcb at 2e7a40
 panicstr: page fault
 panic messages:
 ---
 Fatal trap 12: page fault while in kernel mode
 fault virtual address	= 0x54
 fault code		= supervisor write, page not present
 instruction pointer	= 0x8:0xc014b0b7
 stack pointer	        = 0x10:0xc02c0e94
 frame pointer	        = 0x10:0xc02c0eb0
 code segment		= base 0x0, limit 0xfffff, type 0x1b
 			= DPL 0, pres 1, def32 1, gran 1
 processor eflags	= interrupt enabled, resume, IOPL = 0
 current process		= Idle
 interrupt mask		= bio 
 trap number		= 12
 panic: page fault
 
 syncing disks... 
 
 Fatal trap 12: page fault while in kernel mode
 fault virtual address	= 0x30
 fault code		= supervisor read, page not present
 instruction pointer	= 0x8:0xc0229aa0
 stack pointer	        = 0x10:0xc02c0cc8
 frame pointer	        = 0x10:0xc02c0ccc
 code segment		= base 0x0, limit 0xfffff, type 0x1b
 			= DPL 0, pres 1, def32 1, gran 1
 processor eflags	= interrupt enabled, resume, IOPL = 0
 current process		= Idle
 interrupt mask		= bio 
 trap number		= 12
 panic: page fault
 Uptime: 3h14m14s
 
 
 Fatal trap 12: page fault while in kernel mode
 fault virtual address	= 0x54
 fault code		= supervisor write, page not present
 instruction pointer	= 0x8:0xc014b0b7
 stack pointer	        = 0x10:0xc02c05a8
 frame pointer	        = 0x10:0xc02c05c4
 code segment		= base 0x0, limit 0xfffff, type 0x1b
 			= DPL 0, pres 1, def32 1, gran 1
 processor eflags	= interrupt enabled, resume, IOPL = 0
 current process		= Idle
 interrupt mask		= bio cam 
 trap number		= 12
 panic: page fault
 Uptime: 3h14m14s
 
 
 Fatal trap 12: page fault while in kernel mode
 fault virtual address	= 0x54
 fault code		= supervisor write, page not present
 instruction pointer	= 0x8:0xc014b0b7
 stack pointer	        = 0x10:0xc02bfe88
 frame pointer	        = 0x10:0xc02bfea4
 code segment		= base 0x0, limit 0xfffff, type 0x1b
 			= DPL 0, pres 1, def32 1, gran 1
 processor eflags	= interrupt enabled, resume, IOPL = 0
 current process		= Idle
 interrupt mask		= bio cam 
 trap number		= 12
 panic: page fault
 Uptime: 3h14m14s
 
 
 Fatal trap 12: page fault while in kernel mode
 fault virtual address	= 0x54
 fault code		= supervisor write, page not present
 instruction pointer	= 0x8:0xc014b0b7
 stack pointer	        = 0x10:0xc02bf768
 frame pointer	        = 0x10:0xc02bf784
 code segment		= base 0x0, limit 0xfffff, type 0x1b
 			= DPL 0, pres 1, def32 1, gran 1
 processor eflags	= interrupt enabled, resume, IOPL = 0
 current process		= Idle
 interrupt mask		= bio cam 
 trap number		= 12
 panic: page fault
 Uptime: 3h14m15s
 
 
 Fatal trap 12: page fault while in kernel mode
 fault virtual address	= 0x54
 fault code		= supervisor write, page not present
 instruction pointer	= 0x8:0xc014b0b7
 stack pointer	        = 0x10:0xc02bf048
 frame pointer	        = 0x10:0xc02bf064
 code segment		= base 0x0, limit 0xfffff, type 0x1b
 			= DPL 0, pres 1, def32 1, gran 1
 processor eflags	= interrupt enabled, resume, IOPL = 0
 current process		= Idle
 interrupt mask		= bio cam 
 trap number		= 12
 panic: page fault
 Uptime: 3h14m15s
 
 dumping to dev #da/0x20001, offset 1048704
 dump 511 510 509 508 507 506 505 504 503 502 501 500 499 498 497 496 495
 494 493 492 491 490 489 488 487 486 485 484 483 482 481 480 479 478 477
 476 475 474 473 472 471 470 469 468 467 466 465 464 463 462 461 460 459
 458 457 456 455 454 453 452 451 450 449 448 447 446 445 444 443 442 441
 440 439 438 437 436 435 434 433 432 431 430 429 428 427 426 425 424 423
 422 421 420 419 418 417 416 415 414 413 412 411 410 409 408 407 406 405
 404 403 402 401 400 399 398 397 396 395 394 393 392 391 390 389 388 387
 386 385 384 383 382 381 380 379 378 377 376 375 374 373 372 371 370 369
 368 367 366 365 364 363 362 361 360 359 358 357 356 355 354 353 352 351
 350 349 348 347 346 345 344 343 342 341 340 339 338 337 336 335 334 333
 332 331 330 329 328 327 326 325 324 323 322 321 320 319 318 317 316 315
 314 313 312 311 310 309 308 307 306 305 304 303 302 301 300 299 298 297
 296 295 294 293 292 291 290 289 288 287 286 285 284 283 282 281 280 279
 278 277 276 275 274 273 272 271 270 269 268 267 266 265 264 263 262 261
 260 259 258 257 256 255 254 253 252 251 250 249 248 247 246 245 244 243
 242 241 240 239 238 237 236 235 234 233 232 231 230 229 228 227 226 225
 224 223 222 221 220 219 218 217 216 215 214 213 212 211 210 209 208 207
 206 205 204 203 202 201 200 199 198 197 196 195 194 193 192 191 190 189
 188 187 186 185 184 183 182 181 180 179 178 177 176 175 174 173 172 171
 170 169 168 167 166 165 164 163 162 161 160 159 158 157 156 155 154 153
 152 151 150 149 148 147 146 145 144 143 142 141 140 139 138 137 136 135
 134 133 132 131 130 129 128 127 126 125 124 123 122 121 120 119 118 117
 116 115 114 113 112 111 110 109 108 107 106 105 104 103 102 101 100 99
 98 97 96 95 94 93 92 91 90 89 88 87 86 85 84 83 82 81 80 79 78 77 76 75
 74 73 72 71 70 69 68 67 66 65 64 63 62 61 60 59 58 57 56 55 54 53 52 51
 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27
 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 
 ---
 #0  boot (howto=260) at ../../kern/kern_shutdown.c:302
 302			dumppcb.pcb_cr3 = rcr3();
 (kgdb) where
 #0  boot (howto=260) at ../../kern/kern_shutdown.c:302
 #1  0xc01689c4 in poweroff_wait (junk=0xc02ba0cf, howto=0) at
 ../../kern/kern_shutdown.c:552
 #2  0xc027a78d in trap_fatal (frame=0xc02bf008, eva=84) at
 ../../i386/i386/trap.c:951
 #3  0xc027a465 in trap_pfault (frame=0xc02bf008, usermode=0, eva=84) at
 ../../i386/i386/trap.c:844
 #4  0xc027a04b in trap (frame={tf_fs = -1070923760, tf_es = -1072300016,
 tf_ds = 7077904, tf_edi = -1038983800, 
       tf_esi = -1038984192, tf_ebp = -1070862236, tf_isp = -1070862284,
 tf_ebx = -1018077120, tf_edx = 0, 
       tf_ecx = 199687681, tf_eax = -7128129, tf_trapno = 12, tf_err = 2,
 tf_eip = -1072385865, tf_cs = 8, 
       tf_eflags = 66118, tf_esp = -1038983800, tf_ss = -1043326976}) at
 ../../i386/i386/trap.c:443
 #5  0xc014b0b7 in complete_rqe (bp=0xc2125d88) at
 ../../dev/vinum/vinuminterrupt.c:192
 #6  0xc018d82f in biodone (bp=0xc2125d88) at ../../kern/vfs_bio.c:2637
 #7  0xc012b7fd in dadone (periph=0xc1cec200, done_ccb=0xc1e1d000) at
 ../../cam/scsi/scsi_da.c:1262
 #8  0xc012771f in camisr (queue=0xc02e5830) at ../../cam/cam_xpt.c:6323
 #9  0xc0127531 in swi_cambio () at ../../cam/cam_xpt.c:6226
 #10 0xc0124930 in xpt_polled_action (start_ccb=0xc02bf3f8) at
 ../../cam/cam_xpt.c:3393
 #11 0xc012bcc5 in dashutdown (arg=0x0, howto=260) at
 ../../cam/scsi/scsi_da.c:1554
 #12 0xc0168610 in boot (howto=260) at ../../kern/kern_shutdown.c:297
 #13 0xc01689c4 in poweroff_wait (junk=0xc02ba0cf, howto=0) at
 ../../kern/kern_shutdown.c:552
 #14 0xc027a78d in trap_fatal (frame=0xc02bf728, eva=84) at
 ../../i386/i386/trap.c:951
 #15 0xc027a465 in trap_pfault (frame=0xc02bf728, usermode=0, eva=84) at
 ../../i386/i386/trap.c:844
 #16 0xc027a04b in trap (frame={tf_fs = -1070923760, tf_es = -1072300016,
 tf_ds = 7077904, tf_edi = -1038249592, 
       tf_esi = -1038249984, tf_ebp = -1070860412, tf_isp = -1070860460,
 tf_ebx = -1018076544, tf_edx = 0, 
       tf_ecx = 199097857, tf_eax = -7128129, tf_trapno = 12, tf_err = 2,
 tf_eip = -1072385865, tf_cs = 8, 
       tf_eflags = 66118, tf_esp = -1038249592, tf_ss = -1043326976}) at
 ../../i386/i386/trap.c:443
 #17 0xc014b0b7 in complete_rqe (bp=0xc21d9188) at
 ../../dev/vinum/vinuminterrupt.c:192
 #18 0xc018d82f in biodone (bp=0xc21d9188) at ../../kern/vfs_bio.c:2637
 #19 0xc012b7fd in dadone (periph=0xc1cec200, done_ccb=0xc1ec5400) at
 ../../cam/scsi/scsi_da.c:1262
 #20 0xc012771f in camisr (queue=0xc02e5830) at ../../cam/cam_xpt.c:6323
 #21 0xc0127531 in swi_cambio () at ../../cam/cam_xpt.c:6226
 #22 0xc0124930 in xpt_polled_action (start_ccb=0xc02bfb18) at
 ../../cam/cam_xpt.c:3393
 #23 0xc012bcc5 in dashutdown (arg=0x0, howto=260) at
 ../../cam/scsi/scsi_da.c:1554
 #24 0xc0168610 in boot (howto=260) at ../../kern/kern_shutdown.c:297
 #25 0xc01689c4 in poweroff_wait (junk=0xc02ba0cf, howto=0) at
 ../../kern/kern_shutdown.c:552
 #26 0xc027a78d in trap_fatal (frame=0xc02bfe48, eva=84) at
 ../../i386/i386/trap.c:951
 #27 0xc027a465 in trap_pfault (frame=0xc02bfe48, usermode=0, eva=84) at
 ../../i386/i386/trap.c:844
 #28 0xc027a04b in trap (frame={tf_fs = -1070923760, tf_es = -1072300016,
 tf_ds = 7077904, tf_edi = -1040079480, 
       tf_esi = -1040079872, tf_ebp = -1070858588, tf_isp = -1070858636,
 tf_ebx = -1018076352, tf_edx = 0, 
       tf_ecx = 198901249, tf_eax = -7128129, tf_trapno = 12, tf_err = 2,
 tf_eip = -1072385865, tf_cs = 8, 
       tf_eflags = 66118, tf_esp = -1040079480, tf_ss = -1043326976}) at
 ../../i386/i386/trap.c:443
 #29 0xc014b0b7 in complete_rqe (bp=0xc201a588) at
 ../../dev/vinum/vinuminterrupt.c:192
 #30 0xc018d82f in biodone (bp=0xc201a588) at ../../kern/vfs_bio.c:2637
 #31 0xc012b7fd in dadone (periph=0xc1cec200, done_ccb=0xc1ec5c00) at
 ../../cam/scsi/scsi_da.c:1262
 #32 0xc012771f in camisr (queue=0xc02e5830) at ../../cam/cam_xpt.c:6323
 #33 0xc0127531 in swi_cambio () at ../../cam/cam_xpt.c:6226
 #34 0xc0124930 in xpt_polled_action (start_ccb=0xc02c0238) at
 ../../cam/cam_xpt.c:3393
 #35 0xc012bcc5 in dashutdown (arg=0x0, howto=260) at
 ../../cam/scsi/scsi_da.c:1554
 #36 0xc0168610 in boot (howto=260) at ../../kern/kern_shutdown.c:297
 #37 0xc01689c4 in poweroff_wait (junk=0xc02ba0cf, howto=0) at
 ../../kern/kern_shutdown.c:552
 #38 0xc027a78d in trap_fatal (frame=0xc02c0568, eva=84) at
 ../../i386/i386/trap.c:951
 #39 0xc027a465 in trap_pfault (frame=0xc02c0568, usermode=0, eva=84) at
 ../../i386/i386/trap.c:844
 #40 0xc027a04b in trap (frame={tf_fs = -1070858224, tf_es = -1072300016,
 tf_ds = 7077904, tf_edi = -1040436856, 
       tf_esi = -1040437248, tf_ebp = -1070856764, tf_isp = -1070856812,
 tf_ebx = -1018076928, tf_edx = 0, 
       tf_ecx = 199491073, tf_eax = -7128129, tf_trapno = 12, tf_err = 2,
 tf_eip = -1072385865, tf_cs = 8, 
       tf_eflags = 66118, tf_esp = -1040436856, tf_ss = -1043326976}) at
 ../../i386/i386/trap.c:443
 #41 0xc014b0b7 in complete_rqe (bp=0xc1fc3188) at
 ../../dev/vinum/vinuminterrupt.c:192
 #42 0xc018d82f in biodone (bp=0xc1fc3188) at ../../kern/vfs_bio.c:2637
 #43 0xc012b7fd in dadone (periph=0xc1cec200, done_ccb=0xc212dc00) at
 ../../cam/scsi/scsi_da.c:1262
 #44 0xc012771f in camisr (queue=0xc02e5830) at ../../cam/cam_xpt.c:6323
 #45 0xc0127531 in swi_cambio () at ../../cam/cam_xpt.c:6226
 #46 0xc0124930 in xpt_polled_action (start_ccb=0xc02c0958) at
 ../../cam/cam_xpt.c:3393
 #47 0xc012bcc5 in dashutdown (arg=0x0, howto=260) at
 ../../cam/scsi/scsi_da.c:1554
 #48 0xc0168610 in boot (howto=260) at ../../kern/kern_shutdown.c:297
 #49 0xc01689c4 in poweroff_wait (junk=0xc02ba0cf, howto=0) at
 ../../kern/kern_shutdown.c:552
 #50 0xc027a78d in trap_fatal (frame=0xc02c0c88, eva=48) at
 ../../i386/i386/trap.c:951
 #51 0xc027a465 in trap_pfault (frame=0xc02c0c88, usermode=0, eva=48) at
 ../../i386/i386/trap.c:844
 #52 0xc027a04b in trap (frame={tf_fs = 16, tf_es = 16, tf_ds =
 -1072300016, tf_edi = 0, tf_esi = -1069900608, 
       tf_ebp = -1070854964, tf_isp = -1070854988, tf_ebx = -1070752356,
 tf_edx = 6865984, tf_ecx = 12, tf_eax = 0, 
       tf_trapno = 12, tf_err = 0, tf_eip = -1071474016, tf_cs = 8,
 tf_eflags = 66050, tf_esp = 0, tf_ss = -1070854936})
     at ../../i386/i386/trap.c:443
 #53 0xc0229aa0 in acquire_lock (lk=0xc02d9d9c) at
 ../../ufs/ffs/ffs_softdep.c:265
 #54 0xc022dc62 in softdep_fsync_mountdev (vp=0xd4b08540) at
 ../../ufs/ffs/ffs_softdep.c:3788
 #55 0xc0231d16 in ffs_fsync (ap=0xc02c0d40) at
 ../../ufs/ffs/ffs_vnops.c:134
 #56 0xc0230a36 in ffs_sync (mp=0xc1cfe200, waitfor=2, cred=0xc1441680,
 p=0xc03a9cc0) at vnode_if.h:537
 #57 0xc01953f7 in sync (p=0xc03a9cc0, uap=0x0) at
 ../../kern/vfs_syscalls.c:544
 #58 0xc0168413 in boot (howto=256) at ../../kern/kern_shutdown.c:224
 #59 0xc01689c4 in poweroff_wait (junk=0xc02ba0cf, howto=0) at
 ../../kern/kern_shutdown.c:552
 #60 0xc027a78d in trap_fatal (frame=0xc02c0e54, eva=84) at
 ../../i386/i386/trap.c:951
 #61 0xc027a465 in trap_pfault (frame=0xc02c0e54, usermode=0, eva=84) at
 ../../i386/i386/trap.c:844
 #62 0xc027a04b in trap (frame={tf_fs = -1070858224, tf_es = -1072300016,
 tf_ds = 6815760, tf_edi = -1038613112, 
       tf_esi = -1038613504, tf_ebp = -1070854480, tf_isp = -1070854528,
 tf_ebx = -1018076736, tf_edx = 0, 
       tf_ecx = 199294465, tf_eax = -6865985, tf_trapno = 12, tf_err = 2,
 tf_eip = -1072385865, tf_cs = 8, 
       tf_eflags = 66118, tf_esp = -1038613112, tf_ss = -1043326976}) at
 ../../i386/i386/trap.c:443
 #63 0xc014b0b7 in complete_rqe (bp=0xc2180588) at
 ../../dev/vinum/vinuminterrupt.c:192
 #64 0xc018d82f in biodone (bp=0xc2180588) at ../../kern/vfs_bio.c:2637
 #65 0xc012b7fd in dadone (periph=0xc1cec200, done_ccb=0xc2028800) at
 ../../cam/scsi/scsi_da.c:1262
 #66 0xc012771f in camisr (queue=0xc02e5830) at ../../cam/cam_xpt.c:6323
 #67 0xc0127531 in swi_cambio () at ../../cam/cam_xpt.c:6226
 #68 0xc0270db0 in splz_swi ()
 (kgdb) up
 #1  0xc01689c4 in poweroff_wait (junk=0xc02ba0cf, howto=0) at
 ../../kern/kern_shutdown.c:552
 552		boot(bootopt);
 (kgdb) 
 #2  0xc027a78d in trap_fatal (frame=0xc02bf008, eva=84) at
 ../../i386/i386/trap.c:951
 951			panic(trap_msg[type]);
 (kgdb) 
 #3  0xc027a465 in trap_pfault (frame=0xc02bf008, usermode=0, eva=84) at
 ../../i386/i386/trap.c:844
 844			trap_fatal(frame, eva);
 (kgdb) 
 #4  0xc027a04b in trap (frame={tf_fs = -1070923760, tf_es = -1072300016,
 tf_ds = 7077904, tf_edi = -1038983800, 
       tf_esi = -1038984192, tf_ebp = -1070862236, tf_isp = -1070862284,
 tf_ebx = -1018077120, tf_edx = 0, 
       tf_ecx = 199687681, tf_eax = -7128129, tf_trapno = 12, tf_err = 2,
 tf_eip = -1072385865, tf_cs = 8, 
       tf_eflags = 66118, tf_esp = -1038983800, tf_ss = -1043326976}) at
 ../../i386/i386/trap.c:443
 443				(void) trap_pfault(&frame, FALSE, eva);
 (kgdb) 
 #5  0xc014b0b7 in complete_rqe (bp=0xc2125d88) at
 ../../dev/vinum/vinuminterrupt.c:192
 192		    ubp->b_resid = 0;				    /* completed our transfer */
 (kgdb) list 185,195
 185		if (rq->error) {				    /* did we have an error? */
 186		    if (rq->isplex) {				    /* plex operation, */
 187			ubp->b_flags |= B_ERROR;		    /* yes, propagate to user */
 188			ubp->b_error = rq->error;
 189		    } else					    /* try to recover */
 190			queue_daemon_request(daemonrq_ioerror, (union daemoninfo) rq); /*
 let the daemon complete */
 191		} else {
 192		    ubp->b_resid = 0;				    /* completed our transfer */
 193		    if (rq->isplex == 0)			    /* volume request, */
 194			VOL[rq->volplex.volno].active--;	    /* another request finished
 */
 195		    biodone(ubp);				    /* top level buffer completed */
 (kgdb) p ubp
 $1 = (struct buf *) 0x0
 (kgdb) list 160,185
 160	    } else if ((rqg->flags & (XFR_NORMAL_WRITE |
 XFR_DEGRADED_WRITE)) /* RAID 4/5 group write operation  */
 161	    &&(rqg->active == 1))				    /* and this is the last active
 request */
 162		complete_raid5_write(rqe);
 163	    /*
 164	     * This is the earliest place where we can be
 165	     * sure that the request has really finished,
 166	     * since complete_raid5_write can issue new
 167	     * requests.
 168	     */
 169	    rqg->active--;					    /* this request now finished */
 170	    if (rqg->active == 0) {				    /* request group finished, */
 171		rq->active--;					    /* one less */
 172		if (rqg->lock) {				    /* got a lock? */
 173		    unlockrange(rqg->plexno, rqg->lock);	    /* yes, free it */
 174		    rqg->lock = 0;
 175		}
 176	    }
 177	    if (rq->active == 0) {				    /* request finished, */
 178	#if VINUMDEBUG
 179		if (debug & DEBUG_RESID) {
 180		    if (ubp->b_resid != 0)			    /* still something to transfer? */
 181			Debugger("resid");
 182		}
 183	#endif
 184	
 185		if (rq->error) {				    /* did we have an error? */
 (kgdb) p rqg
 $2 = (struct rqgroup *) 0xc2125c00
 (kgdb) set print pretty
 (kgdb) p *rqg
 $4 = {
   next = 0x0, 
   rq = 0xc3516040, 
   count = 2, 
   active = 0, 
   plexno = 0, 
   badsdno = 0, 
   flags = 0, 
   lock = 0x0, 
   lockbase = 199687680, 
   rqe = 0xc2125c20
 }
 (kgdb) p *rq
 $5 = {
   bp = 0x0, 
   flags = 0, 
   volplex = {
     volno = 0, 
     plexno = 0
   }, 
   error = 0, 
   sdno = 0, 
   isplex = 0, 
   active = 0, 
   rqg = 0x0, 
   lrqg = 0xc2125c00, 
   next = 0x0
 }
 (kgdb) quit
 bash-2.04# exit
 
 Script done on Wed Sep 20 13:59:48 2000
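
The transcript above shows frame 5 faulting at line 192 with `ubp` equal to NULL, and `p *rq` shows the request almost entirely zeroed (`bp = 0x0`). The following is a minimal sketch in C of that failure mode, not the vinum code itself: the structures and the function `complete_request` are hypothetical simplifications, and the assumption that `ubp` is loaded from `rq->bp` is inferred from the dump rather than shown in the listing. It illustrates how a guard would turn the page fault into a detectable error.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct buf {
    long b_resid;   /* bytes still to transfer */
    int  b_done;    /* set once the transfer has completed */
};

struct request {
    struct buf *bp;     /* user buffer header (assumed source of ubp) */
    int error;
    int active;
};

/*
 * Complete a request. Returns 0 on success, or -1 if the request's
 * buffer pointer is NULL, which is what the zeroed request in the
 * dump would look like.
 */
int complete_request(struct request *rq)
{
    assert(rq != NULL);

    struct buf *ubp = rq->bp;
    if (ubp == NULL)            /* header was overwritten or never set */
        return -1;

    ubp->b_resid = 0;           /* completed our transfer */
    ubp->b_done = 1;
    return 0;
}
```

A guard like this only reports the corruption; it does not explain which code zeroed the request in the first place.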
 
State-Changed-From-To: feedback->closed 
State-Changed-By: grog 
State-Changed-When: Mon Jan 1 14:35:55 PST 2001 
State-Changed-Why:  
No feedback from submitter. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=21148 

From: Daniel Lang <dl@leo.org>
To: grog@FreeBSD.org
Cc: Andy Newman <andy@silverbrook.com.au>,
	Roman Shterenzon <roman@jamus.xpert.com>, Daniel Lang <dl@leo.org>,
	freebsd-gnats-submit@freebsd.org, freebsd-stable@freebsd.org
Subject: Re: kern/21148: multiple crashes while using vinum
Date: Wed, 3 Jan 2001 14:52:35 +0000

 Dear Greg, Andy, Roman,
 
 grog@FreeBSD.org wrote on Mon, Jan 01, 2001 at 11:41:19PM +0000:
 > Synopsis: multiple crashes while using vinum
 [..]
 > State-Changed-Why: 
 > No feedback from submitter.
 > 
 > http://www.freebsd.org/cgi/query-pr.cgi?pr=21148
 
 Well, I've sent you stack-traces, with (and alas as well without)
 debugging symbols, I am perfectly aware of your instruction page
 about debugging vinum, and not an ignorant moron, who complains
 without reading. Unfortunately you don't seem to trust me
 or other people in this matter.
 
 If you look at my stack-traces again you will notice, that no
 stack-frame is part of the vinum module, so your .gdb-debugging
 scripts cannot apply.
 
 The reason is, that _some code_ writes into unallocated memory,
 in my case overwriting a data-structure of an ata-request
 with a few zero bytes, causing the panic. The stack trace
 allows me to trace the problem back to this point, but not
 further. I later experienced a similar problem on a 
 scsi-only system.
 
 The reason why I filed this PR under 'vinum' is that it only
 occurred on boxes using vinum, and was perfectly reproducible
 via simple operations like a 'find /vinum/file/system -print'
 on a larger and moderately filled vinum filesystem.
 Perfectly reproducible means: each night, periodic daily
 caused the panic (traceable to the find call in /etc/security,
 finding files with setuid bits).
 
 As far as I know, the only way to trace this writing into
 unallocated/otherallocated memory resp. buffer overrun
 would be to set a watchpoint to the overwritten data-structure
 within the kernel-debugger. My stack-traces showed that this
 memory region stays the same on the same machine with the
 same kernel (although I can't tell how reliable this is).
 My experiences with kernel code and kernel-debugging with
 ddb are very limited. So is my time (I know this applies
 to anyone). Therefore I ceased spending time to set up
 remote-gdb sessions and sending you stack traces trying to be
 helpful, since you obviously didn't seem to be interested.
 
 I further decided not to use vinum any more. We spent some
 cash on a few hardware RAIDs, and the boxes run smooth now,
 since.
 
 I am just writing this to state:
  a) I did respond to your requests, trying to be as helpful as
     I could. You could blame me for not knowing or willing to 
     learn how to set up a ddb/gdb session using watchpoints
      and waiting for the next crash in an environment that should
     be productive (and now is).
  b) I still believe, that there is a problem somewhere in the
     vinum code (probably within raid5 routines, since a mirror
     setup worked fine).
 
 And in fact, I wouldn't have bothered if there weren't any
 other people like Roman Shterenzon and  Andy Newman,
 who seem to have the same problems.
 
 Best regards,
  Daniel Lang
 
 P.S.: I don't use vinum anymore, nor can I take my boxes
       out of production. The debugging kernels and crash-dumps
       are no longer present, sorry.
 -- 
 IRCnet: Mr-Spock     - Der Schatten von Hasenfuss ist ziemlich dunkel -  
 *Daniel Lang * dl@leo.org * +49 89 289 25735 * http://www.leo.org/~dl/*
 

From: Marko Cuk <cuk@nu.cuk.nu>
To: Daniel Lang <dl@leo.org>
Cc: grog@FreeBSD.org, Andy Newman <andy@silverbrook.com.au>,
	Roman Shterenzon <roman@jamus.xpert.com>,
	freebsd-gnats-submit@freebsd.org, freebsd-stable@freebsd.org
Subject: Re: kern/21148: multiple crashes while using vinum
Date: Wed, 03 Jan 2001 21:51:52 +0100

 I had the same problems on IDE and SCSI configuration on 4.0, 4.1, 4.1.1 .
 I spent cash on buying 2 x 40Gig IDE Wd's and put them into mirror.
 That was a few months ago.
 
 I had crashes also with bridging, but Bosko Milekic and Thomas fixed the
 code with a little help and bridge works superb in 4.2 STABLE.
 
 So I was still interested in RAID5 and also very curious about Vinum in 4.2
 so I decided to test it with the same disks (old SCSI disks, but good)
 and, believe it or not, it works, even under heavy load.
 It is strange, but I didn't change anything. Controller was the same
 Adaptec, cables were the same, disks also. Bsd was 4.2 STABLE.
 
 Please try it on 4.2 STABLE and report.
 
 I must admit that I am not a FreeBSD hacker, I don't know much about
 debugging and I was very unhelpful to Grog, but he is very sure that Vinum
 works and I don't like his way of thinking about that. I also set up an
 account for him on that machine, to check and debug problems there and
 repair them, but he wasn't interested in solving such problems.
 He blamed me for config or some other mistake.
 
 Vinum and RAID5 under 4.1 is not stable. That's all.
 
 Cuk
 
 

From: Greg Lehey <grog@lemis.com>
To: Daniel Lang <dl@leo.org>
Cc: Andy Newman <andy@silverbrook.com.au>,
	Roman Shterenzon <roman@jamus.xpert.com>,
	freebsd-gnats-submit@freebsd.org, freebsd-stable@freebsd.org
Subject: Re: kern/21148: multiple crashes while using vinum
Date: Thu, 4 Jan 2001 10:54:28 +1030

 On Wednesday,  3 January 2001 at 14:52:35 +0000, Daniel Lang wrote:
 > Dear Greg, Andy, Roman,
 >
 > grog@FreeBSD.org wrote on Mon, Jan 01, 2001 at 11:41:19PM +0000:
 >> Synopsis: multiple crashes while using vinum
 > [..]
 >> State-Changed-Why:
 >> No feedback from submitter.
 >>
 >> http://www.freebsd.org/cgi/query-pr.cgi?pr=21148
 >
 > Well, I've sent you stack-traces, with (and alas as well without)
 > debugging symbols, I am perfectly aware of your instruction page
 > about debugging vinum, and not an ignorant moron, who complains
 > without reading. Unfortunately you don't seem to trust me
 > or other people in this matter.
 
 As my closing message says, the reason I closed the PR was:
 
 >> No feedback from submitter.
 
 I sent you a message on 10 September 2000 asking for additional
 information.  I received none.  There's no reason to get all upset
 now, or make claims about my intentions.  This was just a dead PR, and
 you've made it clear, both before and now, that you have no intention
 of following up on it.  This is not a question of "ignorant morons" or
 "trust".
 
 > The reason is, that _some code_ writes into unallocated memory, in
 > my case overwriting a data-structure of an ata-request with a few
 > zero bytes, causing the panic. The stack trace allows me to trace
 > the problem back to this point, but not further. I later experienced
 > a similar problem on a scsi-only system.
 
 Yes, this looks very much like the other issues.  But you must
 understand that there's nothing I can do without further information.
 
 > The reason why I filed this PR under 'vinum' is that it only
 > occurred on boxes using vinum, and was perfectly reproducible via simple
 > operations like a 'find /vinum/file/system -print' on a larger and
 > moderately filled vinum filesystem.  Perfectly reproducible means:
 > each night, periodic daily caused the panic (traceable to the find
 > call in /etc/security, finding files with setuid bits).
 >
 > As far as I know, the only way to trace this writing into
 > unallocated/otherallocated memory resp. buffer overrun
 > would be to set a watchpoint to the overwritten data-structure
 > within the kernel-debugger.
 
 The trouble with that is that this only happens when the system is
 very active, and there are thousands of potential buffer headers which
 could be trashed.  I do have a trace facility within Vinum, but even
 with that it's difficult to figure out what's going on.
 
 > My stack-traces showed that this memory region stays the same on the
 > same machine with the same kernel (although I can't tell how
 > reliable this is).
 
 If you mean that the same part of the buffer header gets smashed every
 time, yes, this is reliably reproducible (well, in other words, when
 it happens (at random), it happens in the same place every time).  It
 may mean that Vinum is doing it, but as far as I can tell it's always
 6 words being zeroed out, and I don't do that anywhere in Vinum.  The
 other possibility, which I consider most likely, is that the data
 structures accidentally get freed and used by some other driver (or,
 possibly, that some other driver freed them first and then continued
 using them).  This would explain the observed correlation with the fxp
 driver.
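
To illustrate the premature-free hypothesis described above, here is a minimal sketch in C. It is a user-space simulation, not kernel code: the fixed pool, `struct hdr`, `pool_alloc`, `pool_free` and `stale_pointer_demo` are all hypothetical names invented for this example. It shows how a header that is freed too early can be handed to a second owner, who zeroes it while the first owner still holds a pointer to it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define POOL_SLOTS 4

/* Hypothetical request header; the real kernel structures differ. */
struct hdr {
    void *bp;       /* pointer the first owner relies on */
    int   active;
};

static struct hdr pool[POOL_SLOTS];
static int in_use[POOL_SLOTS];

/* Hand out the first free slot, or NULL if the pool is exhausted. */
struct hdr *pool_alloc(void)
{
    for (int i = 0; i < POOL_SLOTS; i++) {
        if (!in_use[i]) {
            in_use[i] = 1;
            return &pool[i];
        }
    }
    return NULL;
}

/* Mark a slot free; the memory itself is left untouched. */
void pool_free(struct hdr *h)
{
    ptrdiff_t i = h - pool;
    assert(i >= 0 && i < POOL_SLOTS);
    in_use[i] = 0;
}

/*
 * Returns 1 if the first owner's header was zeroed by a second owner
 * that received the same slot after a premature free.
 */
int stale_pointer_demo(void)
{
    static int user_buffer;

    struct hdr *a = pool_alloc();   /* first owner sets up its header */
    assert(a != NULL);
    a->bp = &user_buffer;
    a->active = 1;

    pool_free(a);                   /* premature free: 'a' is still held */

    struct hdr *b = pool_alloc();   /* second owner gets the same slot */
    assert(b != NULL);
    memset(b, 0, sizeof *b);        /* ...and initializes "its" header */

    int clobbered = (a == b) && (a->bp == NULL);
    pool_free(b);
    return clobbered;
}
```

In this simulation the first owner later finds `bp` set to NULL even though it never cleared the field itself, which matches the pattern of a few zeroed words appearing in a structure that the suspected code does not zero.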
 
 > My experiences with kernel code and kernel-debugging with
 > ddb are very limited. So is my time (I know this applies
 > to anyone). Therefore I ceased spending time to set up
 > remote-gdb sessions and sending you stack traces trying to be
 > helpful, since you obviously didn't seem to be interested.
 >
 > I further decided not to use vinum any more. We spent some
 > cash on a few hardware RAIDs, and the boxes run smooth now,
 > since.
 >
 > I am just writing this to state:
 >  a) I did respond to your requests, trying to be as helpful as
 >     I could.
 
 Well, I sent you a message on 10 September 2000, asking for additional
 information.  You didn't send it to me.
 
 >      You could blame me for not knowing or willing to learn how to
 >     set up a ddb/gdb session using watchpoints and waiting for the
 >     next crash in an environment that should be productive (and now
 >     is).
 
 No, I wouldn't do that.
 
 >  b) I still believe, that there is a problem somewhere in the
 >     vinum code (probably within raid5 routines, since a mirror
 >     setup worked fine).
 
 Correct.  I have no doubt about it.  But some bugs are difficult to
 find, and I need help.
 
 Greg
 --
 Finger grog@lemis.com for PGP public key
 See complete headers for address and phone numbers
 

From: Greg Lehey <grog@lemis.com>
To: Marko Cuk <cuk@nu.cuk.nu>
Cc: Daniel Lang <dl@leo.org>, Andy Newman <andy@silverbrook.com.au>,
	Roman Shterenzon <roman@jamus.xpert.com>,
	freebsd-gnats-submit@freebsd.org, freebsd-stable@freebsd.org
Subject: Re: kern/21148: multiple crashes while using vinum
Date: Thu, 4 Jan 2001 11:09:40 +1030

 On Wednesday,  3 January 2001 at 21:51:52 +0100, Marko Cuk wrote:
 > I had the same problems on IDE and SCSI configuration on 4.0, 4.1, 4.1.1 .
 > I spent cash on buying 2 x 40Gig IDE Wd's and put them into mirror.
 > That was a few months ago.
 >
 > I had crashes also with bridging, but Bosko Milekic and Thomas fixed the
 > code with a little help and bridge works superb in 4.2 STABLE.
 >
 > So I was still interested in RAID5 and also very curious about Vinum in 4.2
 > so I decided to test it with the same disks (old SCSI disks, but good)
 > and, believe it or not, it works, even under heavy load.
 > It is strange, but I didn't change anything. Controller was the same
 > Adaptec, cables were the same, disks also. Bsd was 4.2 STABLE.
 >
 > Please try it on 4.2 STABLE and report.
 >
 > I must admit that I am not a FreeBSD hacker, I don't know much
 > about debugging and I was very unhelpful to Grog, but he is very
 > sure that Vinum works and I don't like his way of thinking about
 > that.
 
 I'm not sure what you're saying here.  I've made it clear (even in the
 man pages) that there are some problems with RAID-5.  What else (apart
 from fix the problem :-) do you want me to do?
 
 > I also set up an account for him on that machine, to check and debug
 > problems on that machine and repair them, but he wasn't interested in
 > solving such problems.  He blamed me for config or some other
 > mistake.
 
 I don't have any record of this.  Did you use some other name?  The
 last exchange we had (27 November 2000), you didn't want to submit the
 information I asked for, I said I wouldn't be able to help you much
 without it, and you said you would go back and get the information.  I
 don't see anything about being offered an account on the machine,
 which indeed would have been of assistance.  But basically, all I
 needed at that point was a (preferably unmutilated) copy of the
 information I asked for.  Based on what you supplied, I don't even
 know if you had a panic or just a freeze.
 
 > Vinum and RAID5 under 4.1 is not stable. That's all.
 
 Ah, but there's the problem.  There are no changes between 4.1 and
 4.2.  In some cases, we run into these problems, but for the most part
 it just works.
 
 Greg
 --
 Finger grog@lemis.com for PGP public key
 See complete headers for address and phone numbers
 
 
>Unformatted:
Greg Lehey, 10 September 2000

As you say, you have reported this to me before.  I've pointed you to
the instructions at www.vinumvm.org, but you still haven't given me
the information I need.  Instead you give me information like dmesg
and config files, about which I say:

  Please don't supply the following information unless I ask for it: 

    The output of the vinum printconfig command. 
    Your Vinum configuration file, unless your problem is that you
    can't start Vinum at all. 
    Your dmesg output. 
    Your kernel configuration file. 
    Processor dumps. 

At the very least, I need a valid backtrace from your dump.  To do
that, you need to follow the instructions given at vinumvm.org or in
vinum(4).  Until I get that, I can't do anything.

Greg Lehey, 2 January 2001

This PR has now been in feedback state for over three months.  Since I
don't have any information to go on, I'm closing it.  If you can find
time to give me the information I ask for, please enter a new PR.
