From langd@informatik.tu-muenchen.de  Mon Jun 28 10:55:06 2004
Return-Path: <langd@informatik.tu-muenchen.de>
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id AB96A16A4CE
	for <FreeBSD-gnats-submit@freebsd.org>; Mon, 28 Jun 2004 10:55:06 +0000 (GMT)
Received: from mailout1.informatik.tu-muenchen.de (mailout1.informatik.tu-muenchen.de [131.159.0.18])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 9C75543D3F
	for <FreeBSD-gnats-submit@freebsd.org>; Mon, 28 Jun 2004 10:55:05 +0000 (GMT)
	(envelope-from langd@informatik.tu-muenchen.de)
Message-Id: <20040628105441.D988628448@atrbg11.informatik.tu-muenchen.de>
Date: Mon, 28 Jun 2004 12:54:41 +0200 (CEST)
From: Daniel Lang <dl@leo.org>
Reply-To: Daniel Lang <dl@leo.org>
To: FreeBSD-gnats-submit@freebsd.org
Cc:
Subject: panic - acquiring duplicate lock of same type: "sleepq chain"
X-Send-Pr-Version: 3.113
X-GNATS-Notify:

>Number:         68442
>Category:       kern
>Synopsis:       [panic] acquiring duplicate lock of same type: "sleepq chain"
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Mon Jun 28 11:00:36 GMT 2004
>Closed-Date:    Fri Nov 16 04:43:46 UTC 2007
>Last-Modified:  Fri Nov 16 04:43:46 UTC 2007
>Originator:     Daniel Lang
>Release:        FreeBSD 5.2-CURRENT i386
>Organization:
LEO
>Environment:
System: FreeBSD  5.2-CURRENT FreeBSD 5.2-CURRENT #3: Mon Jun 28 11:05:52 CEST 2004     root@:/usr/obj/usr/src/sys/PAE-ATLEO6  i386

Hardware: DELL PowerEdge 2650, 2x2.4GHz Xeon, HTT DISABLED (I thought HTT
may be the problem, so I disabled it).

dmesg:
[..]
Copyright (c) 1992-2004 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD 5.2-CURRENT #3: Mon Jun 28 11:05:52 CEST 2004
    root@:/usr/obj/usr/src/sys/PAE-ATLEO6
WARNING: WITNESS option enabled, expect reduced performance.
Preloaded elf kernel "/boot/kernel/kernel" at 0xa0622000.
ACPI APIC Table: <DELL   PE2650  >
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) XEON(TM) CPU 2.40GHz (2392.26-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0xf24  Stepping = 4
  Features=0x3febfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM>
real memory  = 6442450944 (6144 MB)
avail memory = 6174314496 (5888 MB)
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  6
ioapic0: Changing APIC ID to 8
ioapic1: Changing APIC ID to 9
ioapic2: Changing APIC ID to 10
MADT: Forcing active-low polarity and level trigger for SCI
ioapic0 <Version 1.1> irqs 0-15 on motherboard
ioapic1 <Version 1.1> irqs 16-31 on motherboard
ioapic2 <Version 1.1> irqs 32-47 on motherboard
random: <entropy source, Software, Yarrow>
Pentium Pro MTRR support enabled
acpi0: <DELL PE2650> on motherboard
acpi0: [GIANT-LOCKED]
acpi0: Power Button (fixed)
Timecounter "ACPI-safe" frequency 3579545 Hz quality 1000
pcibios: BIOS version 2.10
acpi_timer0: <32-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
cpu0: <ACPI CPU> on acpi0
cpu1: <ACPI CPU> on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
pcib0: slot 15 INTA is routed to irq 5
pci0: <unknown> at device 4.0 (no driver attached)
pci0: <unknown> at device 4.1 (no driver attached)
cbb0: <PCI-CardBus Bridge> port 0xecf4-0xecf7 irq 27 at device 4.2 on pci0
cbb0: failed: rid 0x10 is ioport, requested 3
cbb0: Could not map register memory
device_attach: cbb0 attach returned 12
pci0: <display, VGA> at device 14.0 (no driver attached)
atapci0: <ServerWorks CSB5 UDMA100 controller> port 0x8b0-0x8bf,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 at device 15.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata1: simplex device, DMA on primary only
ata1: at 0x170 irq 15 on atapci0
pci0: <serial bus, USB> at device 15.2 (no driver attached)
isab0: <PCI-ISA bridge> at device 15.3 on pci0
isa0: <ISA bus> on isab0
pcib1: <ACPI Host-PCI bridge> on acpi0
pci4: <ACPI PCI bus> on pcib1
pcib2: <ACPI PCI-PCI bridge> at device 8.0 on pci4
pci5: <ACPI PCI bus> on pcib2
ahc0: <Adaptec aic7899 Ultra160 SCSI adapter> port 0xac00-0xacff mem 0xfc8ff000-0xfc8fffff irq 30 at device 6.0 on pci5
ahc0: [GIANT-LOCKED]
aic7899: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs
ahc1: <Adaptec aic7899 Ultra160 SCSI adapter> port 0xa800-0xa8ff mem 0xfc8fe000-0xfc8fefff irq 31 at device 6.1 on pci5
ahc1: [GIANT-LOCKED]
aic7899: Ultra160 Wide Channel B, SCSI Id=7, 32/253 SCBs
pcib3: <ACPI Host-PCI bridge> on acpi0
pci3: <ACPI PCI bus> on pcib3
bge0: <Broadcom BCM5701 Gigabit Ethernet, ASIC rev. 0x105> mem 0xfcb10000-0xfcb1ffff irq 28 at device 6.0 on pci3
miibus0: <MII bus> on bge0
brgphy0: <BCM5701 10/100/1000baseTX PHY> on miibus0
brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
bge0: Ethernet address: 00:06:5b:3f:36:a8
bge0: [GIANT-LOCKED]
bge1: <Broadcom BCM5701 Gigabit Ethernet, ASIC rev. 0x105> mem 0xfcb00000-0xfcb0ffff irq 29 at device 8.0 on pci3
miibus1: <MII bus> on bge1
brgphy1: <BCM5701 10/100/1000baseTX PHY> on miibus1
brgphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
bge1: Ethernet address: 00:06:5b:3f:36:a9
bge1: [GIANT-LOCKED]
pcib4: <ACPI Host-PCI bridge> on acpi0
pci2: <ACPI PCI bus> on pcib4
em0: <Intel(R) PRO/1000 Network Connection, Version - 1.7.25> mem 0xfcd20000-0xfcd2ffff,0xfcd00000-0xfcd1ffff irq 24 at device 6.0 on pci2
em0: [GIANT-LOCKED]
em0: Ethernet address: 00:03:47:df:26:4d
em0:  Speed:1000 Mbps  Duplex:Full
pcib5: <ACPI Host-PCI bridge> on acpi0
pci1: <ACPI PCI bus> on pcib5
ahc2: <Adaptec 29160 Ultra160 SCSI adapter> port 0xdc00-0xdcff mem 0xfcf01000-0xfcf01fff irq 16 at device 6.0 on pci1
ahc2: [GIANT-LOCKED]
aic7892: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs
ahc3: <Adaptec 29160 Ultra160 SCSI adapter> port 0xd800-0xd8ff mem 0xfcf00000-0xfcf00fff irq 20 at device 8.0 on pci1
ahc3: [GIANT-LOCKED]
aic7892: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs
fdc0: <Enhanced floppy controller (i82077, NE72065 or clone)> port 0x3f7,0x3f0-0x3f5 irq 6 drq 2 on acpi0
isa_dmainit(2, 1024) failed
fdc0: FIFO enabled, 8 bytes threshold
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
atkbdc0: <Keyboard controller (i8042)> port 0x64,0x60 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
sio0 port 0x3f8-0x3ff irq 4 on acpi0
sio0: type 16550A, console
sio1 port 0x2f8-0x2ff irq 3 on acpi0
sio1: type 16550A
npx0: [FAST]
npx0: <math processor> on motherboard
npx0: INT 16 interface
orm0: <Option ROMs> at iomem 0xec000-0xeffff,0xc8000-0xcdfff,0xc0000-0xc7fff on isa0
pmtimer0 on isa0
ppc0: parallel port not found.
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x100>
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Timecounters tick every 10.000 msec
acd0: CDROM <SAMSUNG CD-ROM SN-124> at ata0-master PIO4
Waiting 3 seconds for SCSI devices to settle
ses0 at ahc0 bus 0 target 6 lun 0
ses0: <PE/PV 1x5 SCSI BP 0.25> Fixed Processor SCSI-2 device 
ses0: 3.300MB/s transfers
ses0: SAF-TE Compliant Device
da0 at ahc0 bus 0 target 0 lun 0
da0: <SEAGATE ST336706LC 010A> Fixed Direct Access SCSI-3 device 
da0: 160.000MB/s transfers (80.000MHz, offset 63, 16bit), Tagged Queueing Enabled
da0: 35003MB (71687370 512 byte sectors: 255H 63S/T 4462C)
da1 at ahc0 bus 0 target 1 lun 0
da1: <SEAGATE ST336706LC 8A03> Fixed Direct Access SCSI-3 device 
da1: 160.000MB/s transfers (80.000MHz, offset 63, 16bit), Tagged Queueing Enabled
da1: 34732MB (71132959 512 byte sectors: 255H 63S/T 4427C)
da2 at ahc0 bus 0 target 2 lun 0
da2: <FUJITSU MAS3367NC 0104> Fixed Direct Access SCSI-3 device 
da2: 160.000MB/s transfers (80.000MHz, offset 127, 16bit), Tagged Queueing Enabled
da2: 35068MB (71819496 512 byte sectors: 255H 63S/T 4470C)
da3 at ahc2 bus 0 target 0 lun 0
da3: <FX-1600U 3-R 0001> Fixed Direct Access SCSI-3 device 
da3: 160.000MB/s transfers (80.000MHz, offset 31, 16bit), Tagged Queueing Enabled
da3: 204800MB (419430400 512 byte sectors: 255H 63S/T 26108C)
da10 at ahc3 bus 0 target 14 lun 0
da10: <G-Force RI > Fixed Direct Access SCSI-2 device 
da10: 80.000MB/s transfers (40.000MHz, offset 31, 16bit), Tagged Queueing Enabled
da10: 439840MB (900792320 512 byte sectors: 255H 63S/T 56071C)
da4 at ahc2 bus 0 target 0 lun 1
da4: <FX-1600U 3-R 0001> Fixed Direct Access SCSI-3 device 
da4: 160.000MB/s transfers (80.000MHz, offset 31, 16bit), Tagged Queueing Enabled
da4: 204800MB (419430400 512 byte sectors: 255H 63S/T 26108C)
da5 at ahc2 bus 0 target 0 lun 2
da5: <FX-1600U 3-R 0001> Fixed Direct Access SCSI-3 device 
da5: 160.000MB/s transfers (80.000MHz, offset 31, 16bit), Tagged Queueing Enabled
da5: 277016MB (567328768 512 byte sectors: 255H 63S/T 35314C)
da6 at ahc2 bus 0 target 0 lun 3
da6: <FX-1600U 3-R 0001> Fixed Direct Access SCSI-3 device 
da6: 160.000MB/s transfers (80.000MHz, offset 31, 16bit), Tagged Queueing Enabled
da6: 286720MB (587202560 512 byte sectors: 255H 63S/T 36551C)
da7 at ahc2 bus 0 target 0 lun 4
da7: <FX-1600U 3-R 0001> Fixed Direct Access SCSI-3 device 
da7: 160.000MB/s transfers (80.000MHz, offset 31, 16bit), Tagged Queueing Enabled
da7: 286720MB (587202560 512 byte sectors: 255H 63S/T 36551C)
da8 at ahc2 bus 0 target 0 lun 5
da8: <FX-1600U 3-R 0001> Fixed Direct Access SCSI-3 device 
da8: 160.000MB/s transfers (80.000MHz, offset 31, 16bit), Tagged Queueing Enabled
da8: 286720MB (587202560 512 byte sectors: 255H 63S/T 36551C)
da9 at ahc2 bus 0 target 0 lun 6
da9: <FX-1600U 3-R 0001> Fixed Direct Access SCSI-3 device 
da9: 160.000MB/s transfers (80.000MHz, offset 31, 16bit), Tagged Queueing Enabled
da9: 284504MB (582664192 512 byte sectors: 255H 63S/T 36269C)
SMP: AP CPU #1 Launched!
Mounting root from ufs:/dev/da0s1a
WARNING: / was not properly dismounted

KERNEL CONFIG (two files, uses PAE)

PAE-ATLEO6:
----------
#
# PAE -- Generic kernel configuration file for FreeBSD/i386 PAE
#
# $FreeBSD: src/sys/i386/conf/PAE,v 1.8 2003/11/03 22:49:19 jhb Exp $

include ATLEO6

ident		PAE-ATLEO6

# To make a PAE kernel, the next option is needed
options		PAE			# Physical Address Extensions Kernel

options		KVA_PAGES=768

options		COMPAT_LINUX
options         DDB_NOKLDSYM


#options		ADAPTIVE_MUTEXES
# Compile acpi in statically since the module isn't built properly.  Most
# machines which support large amounts of memory require acpi.
device		acpi

# Don't build modules with this kernel config, since they are not built with
# the correct options headers.
makeoptions	NO_MODULES=yes

# What follows is a list of drivers that are normally in GENERIC, but either
# don't work or are untested with PAE.  Be very careful before enabling any
# of these drivers.  Drivers which use DMA and don't handle 64 bit physical
# address properly may cause data corruption when used in a machine with more
# than 4 gigabytes of memory.

nodevice	ahb
nodevice	amd
nodevice	isp
nodevice	sym
nodevice	trm

nodevice	adv
nodevice	adw
nodevice	aha
nodevice	aic
nodevice	bt

nodevice	ncv
nodevice	nsp
nodevice	stg

nodevice	asr
nodevice	dpt
nodevice	iir
nodevice	mly

nodevice	amr
nodevice	ida
nodevice	mlx
nodevice	pst

nodevice	agp

nodevice	de
nodevice	txp
nodevice	vx

nodevice	dc
nodevice	pcn
nodevice	rl
nodevice	sf
nodevice	sis
nodevice	ste
nodevice	tl
nodevice	tx
nodevice	vr
nodevice	wb

nodevice	cs
nodevice	ed
nodevice	ex
nodevice	ep
nodevice	fe
nodevice	ie
nodevice	lnc
nodevice	sn
nodevice	xe

nodevice	wlan
nodevice	an
nodevice	awi
nodevice	wi

nodevice	uhci
nodevice	ohci
nodevice	usb
nodevice	ugen
nodevice	uhid
nodevice	ukbd
nodevice	ulpt
nodevice	umass
nodevice	ums
nodevice	urio
nodevice	uscanner
nodevice	aue
nodevice	axe
nodevice	cue
nodevice	kue
----

ATLEO6 (regular config file):

------
#
# ATLEO6
#

machine		i386
#cpu		I486_CPU
#cpu		I586_CPU
cpu		I686_CPU
ident		ATLEO6

# To statically compile in device wiring instead of /boot/device.hints
#hints		"GENERIC.hints"		# Default places to look for devices.

makeoptions	DEBUG=-g		# Build kernel with gdb(1) debug symbols

options 	SCHED_4BSD		# 4BSD scheduler
options 	INET			# InterNETworking
options 	INET6			# IPv6 communications protocols
#options	NETATALK		# appletalk stuff
options 	FFS			# Berkeley Fast Filesystem
options 	SOFTUPDATES		# Enable FFS soft updates support
options 	UFS_ACL			# Support for access control lists
options 	UFS_DIRHASH		# Improve performance on big directories
options 	MD_ROOT			# MD is a potential root device
options 	NFSCLIENT		# Network Filesystem Client
options 	NFSSERVER		# Network Filesystem Server
options 	NFS_ROOT		# NFS usable as /, requires NFSCLIENT
options 	MSDOSFS			# MSDOS Filesystem
options 	CD9660			# ISO 9660 Filesystem
options 	PROCFS			# Process filesystem (requires PSEUDOFS)
options 	PSEUDOFS		# Pseudo-filesystem framework
options 	COMPAT_43		# Compatible with BSD 4.3 [KEEP THIS!]
options 	COMPAT_FREEBSD4		# Compatible with FreeBSD4
options 	SCSI_DELAY=3000	# Delay (in ms) before probing SCSI
options 	KTRACE			# ktrace(1) support
options 	SYSVSHM			# SYSV-style shared memory
options 	SYSVMSG			# SYSV-style message queues
options 	SYSVSEM			# SYSV-style semaphores
options 	_KPOSIX_PRIORITY_SCHEDULING # POSIX P1003_1B real-time extensions
options 	KBD_INSTALL_CDEV	# install a CDEV entry in /dev
options 	AHC_REG_PRETTY_PRINT	# Print register bitfields in debug
					# output.  Adds ~128k to driver.
options 	AHD_REG_PRETTY_PRINT	# Print register bitfields in debug
					# output.  Adds ~215k to driver.
options 	PFIL_HOOKS		# pfil(9) framework
#options	ICMP_BANDLIM		#Rate limit bad replies

# Debugging for use in -current
options 	DDB			# Enable the kernel debugger
#options 	INVARIANTS		# Enable calls of extra sanity checking
#options 	INVARIANT_SUPPORT	# Extra sanity checks of internal structures, required by INVARIANTS
options 	WITNESS			# Enable checks to detect deadlocks and cycles
#options 	WITNESS_SKIPSPIN	# Don't run witness on spinlocks for speed
options		DDB_TRACE
options		DDB_NUMSYM

# To make an SMP kernel, the next two are needed
options 	SMP		# Symmetric MultiProcessor Kernel
device		apic		# I/O APIC

device		isa
device		eisa
device		pci

# Floppy drives
device		fdc

# ATA and ATAPI devices
device		ata
device		atadisk		# ATA disk drives
device		ataraid		# ATA RAID drives
device		atapicd		# ATAPI CDROM drives
device		atapifd		# ATAPI floppy drives
device		atapist		# ATAPI tape drives
options 	ATA_STATIC_ID	# Static device numbering

# SCSI Controllers
device		ahc		# AHA2940 and onboard AIC7xxx devices
device		ahd		# AHA39320/29320 and onboard AIC79xx devices
options		AHC_ALLOW_MEMIO

# SCSI peripherals
device		scbus		# SCSI bus (required for SCSI)
device		ch		# SCSI media changers
device		da		# Direct Access (disks)
device		sa		# Sequential Access (tape etc)
device		cd		# CD
device		pass		# Passthrough device (direct SCSI access)
device		ses		# SCSI Environmental Services (and SAF-TE)

# wired?

# RAID controllers interfaced to the SCSI subsystem

# RAID controllers

# atkbdc0 controls both the keyboard and the PS/2 mouse
device		atkbdc		# AT keyboard controller
device		atkbd		# AT keyboard
device		psm		# PS/2 mouse

device		vga		# VGA video card driver

device		splash		# Splash screen and screen saver support

# syscons is the default console driver, resembling an SCO console
device		sc
options		SC_NORM_ATTR="(FG_LIGHTGREY|BG_BLACK)"
options		SC_NORM_REV_ATTR="(FG_YELLOW|BG_GREEN)"
options		SC_KERNEL_CONS_ATTR="(FG_WHITE|BG_BLUE)"
options		SC_KERNEL_CONS_REV_ATTR="(FG_BLACK|BG_RED)"

# Enable this for the pcvt (VT220 compatible) console driver

#
device		agp		# support several AGP chipsets

# Floating point support - do not disable.
device		npx

# Power management support (see NOTES for more options)
#device		apm
# Add suspend/resume support for the i8254.
device		pmtimer

# PCCARD (PCMCIA) support
# PCMCIA and cardbus bridge support
device		cbb		# cardbus (yenta) bridge
#device		pcic		# ExCA ISA and PCI bridges
device		pccard		# PC Card (16-bit) bus
device		cardbus		# CardBus (32-bit) bus

# SMbus and I2C
#device	smbus	# Bus support, required for smb below.
#device	intpm
#device	alpm
#device	ichsmb
#device	viapm
#device	amdpm
#device	nfpm
#device	smb

#device	iicbus	# Bus support, required for ic/iic/iicsmb below.
#device	iicbb
#device	ic
#device	iic
#device	iicsmb	# smb over i2c bridge


# Serial (COM) ports
device		sio		# 8250, 16[45]50 based serial ports
options		CONSPEED=19200

# Parallel port
device		ppc
device		ppbus		# Parallel port bus (required)
device		lpt		# Printer
device		plip		# TCP/IP over parallel
device		ppi		# Parallel port interface device
#device		vpo		# Requires scbus and da

# If you've got a "dumb" serial or parallel PCI card that is
# supported by the puc(4) glue driver, uncomment the following
# line to enable it (connects to the sio and/or ppc drivers):
#device         puc

# PCI Ethernet NICs.
device		de		# DEC/Intel DC21x4x (``Tulip'')
device		em		# Intel PRO/1000 adapter Gigabit Ethernet Card
device		txp		# 3Com 3cR990 (``Typhoon'')
device		vx		# 3Com 3c590, 3c595 (``Vortex'')

# PCI Ethernet NICs that use the common MII bus controller code.
# NOTE: Be sure to keep the 'device miibus' line in order to use these NICs!
device		miibus		# MII bus support
device		bfe		# Broadcom BCM440x 10/100 ethernet
device		bge		# Broadcom BCM570xx Gigabit Ethernet
device		dc		# DEC/Intel 21143 and various workalikes
device		fxp		# Intel EtherExpress PRO/100B (82557, 82558)
device		xl		# 3Com 3c90x (``Boomerang'', ``Cyclone'')

# Pseudo devices - the number indicates how many units to allocate.
device		random		# Entropy device
device		loop		# Network loopback
device		ether		# Ethernet support
device		sl		# Kernel SLIP
device		ppp		# Kernel PPP
device		tun		# Packet tunnel.
device		pty		# Pseudo-ttys (telnet etc)
device		md		# Memory "disks"
device		gif		# IPv6 and IPv4 tunneling
device		faith		# IPv6-to-IPv4 relaying (translation)

# The `bpf' device enables the Berkeley Packet Filter.
# Be aware of the administrative consequences of enabling this!
device		bpf		# Berkeley packet filter

# USB support
---------

Things tried to solve or work around the problem;
none of them helped:

SCHED_ULE vs. SCHED_4BSD 
ADAPTIVE_MUTEXES (on/off)
Hyperthreading enabled/disabled in the BIOS

>Description:

The machine hangs frequently. After adding DDB and various WITNESS
options, I get the following:

acquiring duplicate lock of same type: "sleepq chain"
 1st Giant @ /usr/src/sys/kern/uipc_syscalls.c:1735
  2nd sleepq chain @ /usr/src/sys/kern/subr_sleepqueue.c:193
Stack backtrace:
backtrace(a0367684,a0534930,a0534930,a05031e4,a05578d4) at 0xa034b25e = backtra2
witness_checkorder(a0530784,9,a04ce113,c1,140) at 0xa03666c4 = witness_checkord4
_mtx_lock_spin_flags(a0530784,0,a04ce113,c1,a8888e70) at 0xa0344096 = _mtx_locke
sleepq_lookup(a8a50860,a8a683dc,0,a04ce113,145) at 0xa03639d6 = sleepq_lookup+0e
sleepq_catch_signals(a8a50860,0,100,0,a04d0a46) at 0xa0363c69 = sleepq_catch_sid
msleep(a8a50860,a8a50830,158,a04d089f,0) at 0xa0350953 = msleep+0x233
sbwait(a8a5081c,0,4eb,0,0) at 0xa037ced3 = sbwait+0x33
do_sendfile(a8888e70,cf0f8d14,0,cf0f8d40,a049070b) at 0xa0380e21 = do_sendfile+5
sendfile(a8888e70,cf0f8d14,8,d3,202) at 0xa03801c0 = sendfile+0x10
syscall(40df002f,9f7f002f,9f7f002f,8,0) at 0xa049070b = syscall+0x217
Xint0x80_syscall() at 0xa047e88f = Xint0x80_syscall+0x1f
--- syscall (393), eip = 0x2812123b, esp = 0x9f7fde4c, ebp = 0x9f7fdea8 ---
Jun 28 11:34:35 atleo6 ftpd[2810]: getsockname (/usr/libexec/ftpd): Socket opert
Jun 28 11:36:12 atleo6 ftpd[2870]: getsockname (/usr/libexec/ftpd): Socket opert
Jun 28 11:38:05 atleo6 ftpd[2910]: getsockname (/usr/libexec/ftpd): Socket opert
Jun 28 11:40:57 atleo6 ftpd[2993]: getsockname (/usr/libexec/ftpd): Socket opert
Jun 28 11:43:02 atleo6 ftpd[3036]: getsockname (/usr/libexec/ftpd): Socket opert
Jun 28 11:43:02 atleo6 ftpd[3037]: getsockname (/usr/libexec/ftpd): Socket opert
Jun 28 11:43:02 atleo6 ftpd[3043]: getsockname (/usr/libexec/ftpd): Socket opert

Fatal double fault:
eip = 0xa0367274
esp = 0xcc331000
ebp = 0xcc33100c
cpuid = 1; apic id = 06
panic: double fault
cpuid = 1;
Stack backtrace:
backtrace(100,a7d7f3f0,0,0,0) at 0xa034b25e = backtrace+0x12
panic(a04e403e,a04e41c0,6,0,0) at 0xa034b37e = panic+0x11e
dblfault_handler(a051b7ec,1,60,a7d2c000,a0529590) at 0xa04904f2 = dblfault_handa
Debugger("panic")
spin lock witness lock held by 0xa7d7f3f0 for > 5 seconds

After this, the machine is locked up and I don't get a ddb prompt.
I need to reset the machine. No crash dump, of course.


The machine has PAE enabled and 6GB of RAM. I will now try to
boot a non-PAE kernel. The problems started with the PAE stuff :-/



>How-To-Repeat:
 - 
>Fix:
 - 

>Release-Note:
>Audit-Trail:

From: Daniel Lang <dl@leo.org>
To: freebsd-gnats-submit@FreeBSD.org, dl@leo.org
Cc:  
Subject: Re: kern/68442: panic - acquiring duplicate lock of same type: "sleepq chain"
Date: Mon, 28 Jun 2004 22:24:34 +0200

 Ok,
 
 the problem also occurs with:
 
 - non PAE kernel (use the same kernel config without the PAE
   part, which includes the other
 
 - completely changed RAM, which was known to work before
   (I did a system upgrade in between)
 
 I could get some more data and a ddb stacktrace.
 
 I could obtain a dump, but could not analyse it with gdb:
 /var was too full, so the dump was not saved. :(
 
 I usually get no ddb prompt and no crash dump, so this was
 a rare occasion :((
 
 Here is what I got:
 [..]
 
 login: lock order reversal
  1st 0x80746940 sio (sio) @ /usr/src/sys/dev/sio/sio.c:3220
  2nd 0x807105b4 sleepq chain (sleepq chain) @ /usr/src/sys/kern/subr_sleepqueue3
 Stack backtrace:
 backtrace(ffffffff,80713568,80713590,806e31a4,807365dc) at 0x8051e012 = backtra2
 witness_checkorder(807105b4,9,8069fdd9,c1,1310) at 0x80539428 = witness_checkor4
 _mtx_lock_spin_flags(807105b4,0,8069fdd9,c1,83fee000) at 0x80516e4a = _mtx_locke
 sleepq_lookup(86ea7ad8,86ea7aa8,100,86ea7aa8,7d7) at 0x8053673a = sleepq_lookupe
 msleep(86ea7ad8,86ea7aa8,158,806a2582,0) at 0x8052357a = msleep+0xa6
 sbwait(86ea7a94,0,1235,0,0) at 0x8054fbaf = sbwait+0x33
 do_sendfile(83f9b540,ab0a9d14,0,ab0a9d40,80665343) at 0x80553afd = do_sendfile+5
 sendfile(83f9b540,ab0a9d14,8,a9,202) at 0x80552e9c = sendfile+0x10
 syscall(2f,2f,7fbf002f,8,0) at 0x80665343 = syscall+0x217
 Xint0x80_syscall() at 0x8065467f = Xint0x80_syscall+0x1f
 --- syscall (393), eip = 0x2812123b, esp = 0x7fbfde4c, ebp = 0x7fbfdea8 ---
 
 
 Fatal trap 12: page fault while in kernel mode
 cpuid = 1; apic id = 06
 fault virtual address   = 0x34
 fault code              = supervisor read, page not present
 instruction pointer     = 0x8:0x805390fa
 stack pointer           = 0x10:0xaae8eb1c
 frame pointer           = 0x10:0xaae8eb40
 code segment            = base 0x0, limit 0xfffff, type 0x1b
                         = DPL 0, pres 1, def32 1, gran 1
 processor eflags        = interrupt enabled, resume, IOPL = 0
 current process         = 6196 (ftpd)
 kernel: type 12 trap, code=0
 Stopped at      0x805390fa = witness_checkorder+0x176:  movl    0x34(%eax),%edx
 db>
 db> trace
 witness_checkorder(8071021c,9,8069fdd9,c1,f78) at 0x805390fa = witness_checkord6
 _mtx_lock_spin_flags(8071021c,0,8069fdd9,c1,83e78e70) at 0x80516e4a = _mtx_locke
 sleepq_lookup(87a86370,86a5a904,0,8069fdd9,145) at 0x8053673a = sleepq_lookup+0e
 sleepq_catch_signals(87a86370,0,100,0,806a2729) at 0x805369cd = sleepq_catch_sid
 msleep(87a86370,87a86340,158,806a2582,0) at 0x80523707 = msleep+0x233
 sbwait(87a8632c,0,1ac6,0,0) at 0x8054fbaf = sbwait+0x33
 do_sendfile(83e78e70,aae8ed14,0,aae8ed40,80665343) at 0x80553afd = do_sendfile+5
 sendfile(83e78e70,aae8ed14,8,9,202) at 0x80552e9c = sendfile+0x10
 syscall(7fbf002f,7fbf002f,7fbf002f,7,0) at 0x80665343 = syscall+0x217
 Xint0x80_syscall() at 0x8065467f = Xint0x80_syscall+0x1f
 --- syscall (393, FreeBSD ELF32, sendfile), eip = 0x2812123b, esp = 0x7fbfde4c,-
 
 
 Hope this helps.
 
 Cheers,
  Daniel
 -- 
 IRCnet: Mr-Spock   - signs of absurd developments in the net community:
 #42:   - "Wurstbrot gehoert m.E. zum Fruehstuecks-botnet von Cartoon" -
  Daniel Lang * dl@leo.org * +49 89 289 18532 * http://www.leo.org/~dl/
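
A quick arithmetic check on the trap output above, written as a minimal
Python sketch rather than anything taken from the report. It assumes the
fault was raised by the `movl 0x34(%eax),%edx` instruction that ddb stopped
at; the function names `base_register` and `function_start` are made up for
illustration. If that assumption holds, %eax was 0 at the time of the fault,
which would be consistent with a NULL pointer being dereferenced at member
offset 0x34 inside witness_checkorder. Which structure that pointer refers
to cannot be told from the report.

```python
# Values copied from the ddb output in the message above.
FAULT_VA = 0x34            # "fault virtual address = 0x34"
DISPLACEMENT = 0x34        # displacement in the operand "0x34(%eax)"
STOPPED_AT = 0x805390fa    # "Stopped at 0x805390fa"
OFFSET_IN_FUNC = 0x176     # "witness_checkorder+0x176"


def base_register(fault_va: int, displacement: int) -> int:
    """Base register value for which disp(%reg) addresses fault_va (32-bit)."""
    return (fault_va - displacement) & 0xFFFFFFFF


def function_start(addr: int, offset: int) -> int:
    """Start address of a function, given an address of the form func+offset."""
    return addr - offset


eax = base_register(FAULT_VA, DISPLACEMENT)
start = function_start(STOPPED_AT, OFFSET_IN_FUNC)

print(f"%eax at fault time (assumed base register): {eax:#x}")
print(f"witness_checkorder start address: {start:#x}")
```

The same subtraction can be applied to any `symbol+offset` pair in the
traces, as long as the address and the symbol come from the same kernel build.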

From: Daniel Lang <dl@leo.org>
To: freebsd-gnats-submit@FreeBSD.org
Cc: dl@leo.org
Subject: Re: kern/68442: panic - acquiring duplicate lock of same type: "sleepq chain"
Date: Mon, 28 Jun 2004 22:32:09 +0200

 Another panic message, which hangs the machine, no ddb prompt,
 no dump, no reboot:
 
 [..]
 Fatal trap 12: page fault while in kernel mode
 cpuid = 1; apic id = 06
 fault virtual address   = 0x34
 fault code              = supervisor read, page not present
 instruction pointer     = 0x8:0x8053938f
 stack pointer           = 0x10:0xaade3ab0
 frame pointer           = 0x10:0xaade3ad4
 code segment            = base 0x0, limit 0xfffff, type 0x1b
                         = DPL 0, pres 1, def32 1, gran 1
 processor eflags        = resume, IOPL = 0
 current process         = 913 (cvsupd)
 [..]
 
 Cheers,
  Daniel
 -- 
 IRCnet: Mr-Spock     - Cool people don't move, they just hang around. -  
 Daniel Lang * dl@leo.org * ++49 89 289 18532  * http://www.leo.org/~dl/

From: Daniel Lang <dl@leo.org>
To: freebsd-gnats-submit@FreeBSD.org
Cc: dl@leo.org, bzeeb+freebsd+lor@zabbadoz.net,
	freebsd-current@freebsd.org
Subject: Re: kern/68442: panic - acquiring duplicate lock of same type: "sleepq chain"
Date: Tue, 29 Jun 2004 17:39:21 +0200

 Hi again,
 
 Ok, here are some LORs that occurred today on the machine
 before it panicked and wedged. The LORs are not yet listed
 on the LOR page, so I cc: Bjoern.
 
 The panic and subsequent system wedge did not happen
 immediately after the LORs; the system continued to
 run for a while, but only for a short while after the second LOR.
 
 Important INFO: I cvsuped and built new world and kernel
 (in single user mode, the system appears to be able to do
 some work in single user) of _today_. I hoped the problem might
 go away. I also removed KVA_PAGES=512 from the kernel config,
 so the default KVA_PAGES applies. It apparently did not help.
 
 LOR1:
 [..]
 login: lock order reversal
  1st 0xc070a0c0 sched lock (sched lock) @ /usr/src/sys/kern/kern_proc.c:672
  2nd 0xc0745d40 sio (sio) @ /usr/src/sys/dev/sio/sio.c:3205
 Stack backtrace:
 backtrace(ffffffff,c0712948,c0712b00,c06e25c4,c07358dc) at 0xc051e066 = backtra2
 witness_checkorder(c0745d40,9,c06b1e34,c85,3f8) at 0xc05393c4 = witness_checkor4
 _mtx_lock_spin_flags(c0745d40,0,c06b1e34,c85,c0712498) at 0xc0516e9e = _mtx_loce
 siocnputc(c06f9c40,63) at 0xc064375f = siocnputc+0x6b
 cnputc(63) at 0xc05483ed = cnputc+0x4d
 putchar(63,e5bd87e0) at 0xc053387a = putchar+0x92
 kvprintf(c069d236,c05337e8,e5bd87e0,a,e5bd8800) at 0xc0533a87 = kvprintf+0x77
 printf(c069d236,32cf1,0,32ce6,0,dd8,c9115828) at 0xc0533763 = printf+0x43
 calcru(c91156e0,e5bd8ad0,e5bd8ad8,0,e5bd8a10) at 0xc051cb9e = calcru+0x1f2
 fill_kinfo_thread(c436e930,e5bd88fc,e5bd8b98,c05194b6,c91156e0) at 0xc0518f2a =6
 fill_kinfo_proc(c91156e0,e5bd88fc,dd8,288,0) at 0xc0518c01 = fill_kinfo_proc+0x1
 sysctl_out_proc(c91156e0,e5bd8c08,4,0,4) at 0xc05194b6 = sysctl_out_proc+0x32
 sysctl_kern_proc(c06e2c20,e5bd8c90,0,e5bd8c08,c06e2c20) at 0xc05199d8 = sysctl_4
 sysctl_root(0,e5bd8c84,3,e5bd8c08,ca65fe70) at 0xc052519f = sysctl_root+0x10f
 userland_sysctl(ca65fe70,e5bd8c84,3,0,bfbfe47c) at 0xc052535c = userland_sysctlc
 __sysctl(ca65fe70,e5bd8d14,6,3,296) at 0xc052521d = __sysctl+0x71
 syscall(2f,2f,2f,3,0) at 0xc0664713 = syscall+0x217
 Xint0x80_syscall() at 0xc06539ff = Xint0x80_syscall+0x1f
 --- syscall (202), eip = 0x280dd05b, esp = 0xbfbfe40c, ebp = 0xbfbfe448 ---
 [..]
 
 LOR2:
 [..]
 acquiring duplicate lock of same type: "sleepq chain"
  1st Giant @ /usr/src/sys/kern/uipc_syscalls.c:1735
  2nd sleepq chain @ /usr/src/sys/kern/subr_sleepqueue.c:223
 Stack backtrace:
 backtrace(c053a384,c0712970,c0712970,c06e25c4,c07359bc) at 0xc051e066 = backtra2
 witness_checkorder(c070ea1c,9,c069f25e,df,398) at 0xc05393c4 = witness_checkord4
 _mtx_lock_spin_flags(c070ea1c,0,c069f25e,df,c3f96150) at 0xc0516e9e = _mtx_locke
 sleepq_lookup(c9f29724,c8cc0abc,0,c069f25e,16b) at 0xc053675e = sleepq_lookup+0e
 sleepq_catch_signals(c9f29724,0,100,0,c06a1bac) at 0xc05369f1 = sleepq_catch_sid
 msleep(c9f29724,c9f296f4,158,c06a1a05,0) at 0xc052375b = msleep+0x233
 sbwait(c9f296e0,0,51d,0,0) at 0xc054f90f = sbwait+0x33
 do_sendfile(c3f96150,e5400d14,0,e5400d40,c0664713) at 0xc055385d = do_sendfile+5
 sendfile(c3f96150,e5400d14,8,8,202) at 0xc0552bfc = sendfile+0x10
 syscall(806002f,2819002f,bfbf002f,8,0) at 0xc0664713 = syscall+0x217
 Xint0x80_syscall() at 0xc06539ff = Xint0x80_syscall+0x1f
 --- syscall (393), eip = 0x2812123b, esp = 0xbfbfde4c, ebp = 0xbfbfdea8 ---
 [..]
 
 And here is the panic message:
 [..]
 Fatal trap 12: page fault while in kernel mode
 cpuid = 1; apic id = 06
 fault virtual address   = 0x34
 fault code              = supervisor read, page not present
 instruction pointer     = 0x8:0xc053932b
 stack pointer           = 0x10:0xe53d8ab0
 frame pointer           = 0x10:0xe53d8ad4
 code segment            = base 0x0, limit 0xfffff, type 0x1b
                         = DPL 0, pres 1, def32 1, gran 1
 processor eflags        = resume, IOPL = 0
 current process         = 2550 (cvsupd)
 [..]
 
 No ddb prompt, no crash-dump, no reboot. I need to go and
 reset the thing (now for the dozenth time :-/).
 
 Thanks and best regards,
  Daniel
 -- 
 IRCnet: Mr-Spock     - Cool people don't move, they just hang around. -  
 Daniel Lang * dl@leo.org * ++49 89 289 18532  * http://www.leo.org/~dl/
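
For readers unfamiliar with the term, the idea behind a lock order reversal
report can be illustrated with a toy tracker. This Python sketch is not how
WITNESS is implemented; it only remembers the order in which pairs of locks
were taken and complains when it later sees the opposite order. The class
name `LockOrderChecker` is invented, and the earlier "sio before sched lock"
code path is a hypothetical assumption used to set up the reversal; only the
second sequence mirrors LOR1.

```python
class LockOrderChecker:
    """Toy lock-order tracker for illustration; NOT the WITNESS implementation."""

    def __init__(self):
        self.seen = set()   # (earlier, later) lock-name pairs observed so far
        self.held = []      # names of locks currently held, in acquisition order

    def acquire(self, name):
        """Record an acquisition and return any reversal warnings it triggers."""
        warnings = []
        for held_name in self.held:
            if (name, held_name) in self.seen:
                # The opposite order was seen before: report a reversal.
                warnings.append(
                    f"lock order reversal: 1st {held_name}, 2nd {name}"
                )
            else:
                self.seen.add((held_name, name))
        self.held.append(name)
        return warnings

    def release(self, name):
        self.held.remove(name)


chk = LockOrderChecker()

# Hypothetical earlier code path: "sio" is taken before "sched lock".
chk.acquire("sio")
chk.acquire("sched lock")
chk.release("sched lock")
chk.release("sio")

# Order as reported in LOR1: "sched lock" first, then "sio".
chk.acquire("sched lock")
report = chk.acquire("sio")
print(report)
```

A checker of this kind only flags an ordering that could deadlock; it does
not by itself show that a deadlock or a panic actually happened.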

From: John Baldwin <jhb@FreeBSD.org>
To: freebsd-current@FreeBSD.org
Cc: Daniel Lang <dl@leo.org>, freebsd-gnats-submit@FreeBSD.org,
	bzeeb+freebsd+lor@zabbadoz.net
Subject: Re: kern/68442: panic - acquiring duplicate lock of same type: "sleepq chain"
Date: Tue, 29 Jun 2004 12:00:12 -0400

 On Tuesday 29 June 2004 11:39 am, Daniel Lang wrote:
 > Hi again,
 >
 > Ok, here are some LORs that occurred today on the machine
 > before it panicked and wedged. The LORs are not yet listed
 > on the LOR page, so I cc: Bjoern.
 >
 > The panic and subsequent system wedge did not happen
 > immediately after the LORs; the system continued to
 > run for a while, but only for a short while after the second LOR.
 >
 > Important INFO: I cvsuped and built new world and kernel
 > (in single user mode, the system appears to be able to do
 > some work in single user) of _today_. I hoped the problem might
 > go away. I also removed KVA_PAGES=512 from the kernel config,
 > so the default KVA_PAGES applies. It apparently did not help.
 >
 > LOR1:
 > [..]
 > login: lock order reversal
 >  1st 0xc070a0c0 sched lock (sched lock) @ /usr/src/sys/kern/kern_proc.c:672
 >  2nd 0xc0745d40 sio (sio) @ /usr/src/sys/dev/sio/sio.c:3205
 > Stack backtrace:
 > backtrace(ffffffff,c0712948,c0712b00,c06e25c4,c07358dc) at 0xc051e066 = backtra2
 > witness_checkorder(c0745d40,9,c06b1e34,c85,3f8) at 0xc05393c4 = witness_checkor4
 > _mtx_lock_spin_flags(c0745d40,0,c06b1e34,c85,c0712498) at 0xc0516e9e = _mtx_loce
 > siocnputc(c06f9c40,63) at 0xc064375f = siocnputc+0x6b
 > cnputc(63) at 0xc05483ed = cnputc+0x4d
 > putchar(63,e5bd87e0) at 0xc053387a = putchar+0x92
 > kvprintf(c069d236,c05337e8,e5bd87e0,a,e5bd8800) at 0xc0533a87 = kvprintf+0x77
 > printf(c069d236,32cf1,0,32ce6,0,dd8,c9115828) at 0xc0533763 = printf+0x43
 > calcru(c91156e0,e5bd8ad0,e5bd8ad8,0,e5bd8a10) at 0xc051cb9e = calcru+0x1f2
 > fill_kinfo_thread(c436e930,e5bd88fc,e5bd8b98,c05194b6,c91156e0) at 0xc0518f2a =6
 > fill_kinfo_proc(c91156e0,e5bd88fc,dd8,288,0) at 0xc0518c01 = fill_kinfo_proc+0x1
 > sysctl_out_proc(c91156e0,e5bd8c08,4,0,4) at 0xc05194b6 = sysctl_out_proc+0x32
 > sysctl_kern_proc(c06e2c20,e5bd8c90,0,e5bd8c08,c06e2c20) at 0xc05199d8 = sysctl_4
 > sysctl_root(0,e5bd8c84,3,e5bd8c08,ca65fe70) at 0xc052519f = sysctl_root+0x10f
 > userland_sysctl(ca65fe70,e5bd8c84,3,0,bfbfe47c) at 0xc052535c = userland_sysctlc
 > __sysctl(ca65fe70,e5bd8d14,6,3,296) at 0xc052521d = __sysctl+0x71
 > syscall(2f,2f,2f,3,0) at 0xc0664713 = syscall+0x217
 > Xint0x80_syscall() at 0xc06539ff = Xint0x80_syscall+0x1f
 > --- syscall (202), eip = 0x280dd05b, esp = 0xbfbfe40c, ebp = 0xbfbfe448 ---
 > [..]
 >
 > LOR2:
 > [..]
 > acquiring duplicate lock of same type: "sleepq chain"
 >  1st Giant @ /usr/src/sys/kern/uipc_syscalls.c:1735
 >  2nd sleepq chain @ /usr/src/sys/kern/subr_sleepqueue.c:223
 
 This message makes no sense at all as they aren't the same type.
 
 > And here the panic message:
 > [..]
 > Fatal trap 12: page fault while in kernel mode
 > cpuid = 1; apic id = 06
 > fault virtual address   = 0x34
 > fault code              = supervisor read, page not present
 > instruction pointer     = 0x8:0xc053932b
 > stack pointer           = 0x10:0xe53d8ab0
 > frame pointer           = 0x10:0xe53d8ad4
 > code segment            = base 0x0, limit 0xfffff, type 0x1b
 >                         = DPL 0, pres 1, def32 1, gran 1
 > processor eflags        = resume, IOPL = 0
 > current process         = 2550 (cvsupd)
 > [..]
 >
 > No ddb prompt, no crash-dump, no reboot. I need to go and
 > reset the thing (now for the dozenth time :-/).
 
 Can you pop up gdb -k on the kernel.debug and do 'l *0xc053932b'?
 
 -- 
 John Baldwin <jhb@FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
 "Power Users Use the Power to Serve"  =  http://www.FreeBSD.org

From: Daniel Lang <dl@leo.org>
To: John Baldwin <jhb@FreeBSD.org>, freebsd-gnats-submit@FreeBSD.org
Cc: freebsd-current@FreeBSD.org, Bruce Evans <bde@zeta.org.au>,
	Colin Percival <colin.percival@wadham.ox.ac.uk>
Subject: Re: kern/68442: panic - acquiring duplicate lock of same type: "sleepq chain"
Date: Thu, 1 Jul 2004 17:32:21 +0200

 Hi,
 
 John Baldwin wrote on Tue, Jun 29, 2004 at 02:53:34PM -0400:
 [..]
 > Well, it may be a hint sadly enough.  The fact that it thinks Giant is a spin 
 > lock means that witness is confused, and this panic may be further such 
 > confusion.  One possibility is that somehow the sleep queue chain mutexes 
 > have been corrupted.
 [..]
 
 First: the system locked up also with a UP kernel, so maybe this
 is not entirely related to SMP locking? Or maybe locks are now
 used in a non-smp environment as well. I guess so since the advent
 of KSE? Anyway...
 
 I have applied Bruce Evans' sio.c patch and now I did get a 'ddb'
 prompt after the machine panicked. I could call doadump()
 and this time there is enough space in /var.
 The following applies again to the SMP kernel but no
 PAE stuff.
 
 Here is the panic message and ddb trace:
 [..]
 Fatal trap 12: page fault while in kernel mode
 cpuid = 1; apic id = 01
 fault virtual address   = 0x34
 fault code              = supervisor read, page not present
 instruction pointer     = 0x8:0xc0539096
 stack pointer           = 0x10:0xe5502ab0
 frame pointer           = 0x10:0xe5502ad4
 code segment            = base 0x0, limit 0xfffff, type 0x1b
                         = DPL 0, pres 1, def32 1, gran 1
 processor eflags        = interrupt enabled, resume, IOPL = 0
 current process         = 9176 (cvs)
 kernel: type 12 trap, code=0
 Stopped at      0xc0539096 = witness_checkorder+0x176:  movl    0x34(%eax),%edx
 
 db> trace
 witness_checkorder(c07100a4,9,c069f530,19b,500) at 0xc0539096 = witness_checkor6
 _mtx_lock_spin_flags(c07100a4,0,c069f530,19b,c070a0e0) at 0xc0516e9e = _mtx_loce
 turnstile_lookup(c070a0e0,c070a0e0,3bc,c06a390d,e5502b54) at 0xc05382ae = turnse
 _mtx_lock_sleep(c070a0e0,0,c06a390d,3bc) at 0xc051701e = _mtx_lock_sleep+0x66
 _mtx_lock_flags(c070a0e0,0,c06a390d,3bc,c0711c30) at 0xc0516e0f = _mtx_lock_fla7
 kern_open(c403bd20,81ef440,0,1,1b6) at 0xc056873f = kern_open+0xc7
 open(c403bd20,e5502d14,3,ab,296) at 0xc0568674 = open+0x18
 syscall(2f,2835002f,bfbf002f,4,283502e0) at 0xc0664703 = syscall+0x217
 Xint0x80_syscall() at 0xc06539ef = Xint0x80_syscall+0x1f
 --- syscall (5, FreeBSD ELF32, open), eip = 0x282c9e1b, esp = 0xbfbfe0fc, ebp =-
 [..]
 
 Please note, this time, there was no LOR happening!
 
 Here is a log of my sort of fruitless gdb session.
 Since there was no LOR, I am not sure if the sleepq_chain is
 related to the panic? 
 
 However, the panic is obviously triggered inside the witness
 code, because *lock_list = 0x0 in line 749, although a few lines
 above (line 707) the list is checked for being empty, just as
 Colin has already pointed out from the first trace I could
 get. Between line 707 and 749 there is no obvious modification
 to this list. I am not sure what 'find_instance()' does.
 So maybe another thread on another CPU has modified the lock list
 in the meantime? Is this possible?
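
A note on the fault address: the trap reports a fault virtual address of 0x34, and the faulting instruction is movl 0x34(%eax),%edx. That is what a read of (*lock_list)->ll_count would look like if *lock_list were NULL and ll_count sat 0x34 bytes into struct lock_list_entry. The sketch below only checks that arithmetic under an assumed i386 layout. The field order, the fields ll_next, li_file, li_line and li_flags, the value LOCK_NCHILDREN = 3 and the *32 type names are assumptions made for illustration; they are not copied from subr_witness.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Model i386 pointers as 32-bit integers so the layout is the same
 * on whatever machine compiles this sketch. */
typedef uint32_t ptr32_t;

#define LOCK_NCHILDREN 3	/* assumption, see the note above */

/* Hypothetical stand-in for one held-lock record (16 bytes on i386). */
struct lock_instance32 {
	ptr32_t  li_lock;	/* the lock_object that is held */
	ptr32_t  li_file;	/* assumed: file of the acquisition */
	int32_t  li_line;	/* assumed: line of the acquisition */
	uint32_t li_flags;	/* assumed: per-instance flags */
};

/* Hypothetical stand-in for one node of the per-CPU/per-thread lock list. */
struct lock_list_entry32 {
	ptr32_t  ll_next;				/* assumed: next node */
	struct lock_instance32 ll_children[LOCK_NCHILDREN];
	uint32_t ll_count;				/* entries in use */
};

/* Address touched when ->ll_count is read through a pointer whose
 * value is 'base'. With base == 0 this is the fault address. */
static inline uint32_t
ll_count_address(uint32_t base)
{
	return base + (uint32_t)offsetof(struct lock_list_entry32, ll_count);
}

/* Compile-time check of the arithmetic: 4 + 3 * 16 = 52 = 0x34. */
_Static_assert(offsetof(struct lock_list_entry32, ll_count) == 0x34,
    "ll_count is expected at offset 0x34 under the assumed layout");
```

If the assumed layout is right, ll_count_address(0) is 0x34, which would be consistent with the list head being NULL at line 749 rather than with a corrupted mutex.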
 
 Anyway, here is my gdb session. I poked around in the mutexes
 in the 'turnstile_chain' without finding anything obviously
 wrong.
 
 [..]
 -rw-r--r--   1 root  wheel           5 Feb 23 20:42 minfree
 -rw-------   1 root  wheel  2147352576 Jul  1 16:58 vmcore.1
 atleo6:/var/crash#gdb6 -k kernel.1 vmcore.1 
 GNU gdb 20040615 [GDB v6.x for FreeBSD]
 Copyright 2004 Free Software Foundation, Inc.
 GDB is free software, covered by the GNU General Public License, and you are
 welcome to change it and/or distribute copies of it under certain conditions.
 Type "show copying" to see the conditions.
 There is absolutely no warranty for GDB.  Type "show warranty" for details.
 This GDB was configured as "i386-portbld-freebsd5.2"...
 panic messages:
 ---
 Fatal trap 12: page fault while in kernel mode
 cpuid = 1; apic id = 01
 fault virtual address	= 0x34
 fault code		= supervisor read, page not present
 instruction pointer	= 0x8:0xc0539096
 stack pointer	        = 0x10:0xe5502ab0
 frame pointer	        = 0x10:0xe5502ad4
 code segment		= base 0x0, limit 0xfffff, type 0x1b
 			= DPL 0, pres 1, def32 1, gran 1
 processor eflags	= interrupt enabled, resume, IOPL = 0
 current process		= 9176 (cvs)
 kernel: type 12 trap, code=0
 Dumping 2047 MB
  16 32 48 64 80 96 112 128 144 160 176 192 208 224 240 256 272 288 304 320 336 352 368 384 400 416 432 448 464 480 496 512 528 544 560 576 592 608 624 640 656 672 688 704 720 736 752 768 784 800 816 832 848 864 880 896 912 928 944 960 976 992 1008 1024 1040 1056 1072 1088 1104 1120 1136 1152 1168 1184 1200 1216 1232 1248 1264 1280 1296 1312 1328 1344 1360 1376 1392 1408 1424 1440 1456 1472 1488 1504 1520 1536 1552 1568 1584 1600 1616 1632 1648 1664 1680 1696 1712 1728 1744 1760 1776 1792 1808 1824 1840 1856 1872 1888 1904 1920 1936 1952 1968 1984 2000 2016 2032
 ---
 #0  doadump () at /usr/src/sys/kern/kern_shutdown.c:236
 236		dumping++;
 doadump () at /usr/src/sys/kern/kern_shutdown.c:236
 236		dumping++;
 (kgdb) bt
 #0  doadump () at /usr/src/sys/kern/kern_shutdown.c:236
 #1  0xc0451f2e in db_fncall (dummy1=0, dummy2=0, dummy3=-1066115808, 
     dummy4=0xe55028ec "\b)P\032oQ ]tF") at /usr/src/sys/ddb/db_command.c:551
 #2  0xc0451d3c in db_command (last_cmdp=0xc0702150, cmd_table=0x0, aux_cmd_tablep=0xc06ba454, 
     aux_cmd_tablep_end=0xc06ba46c) at /usr/src/sys/ddb/db_command.c:348
 #3  0xc0451e14 in db_command_loop () at /usr/src/sys/ddb/db_command.c:475
 #4  0xc0454551 in db_trap (type=12, code=0) at /usr/src/sys/ddb/db_trap.c:73
 #5  0xc06523fd in kdb_trap (type=12, code=0, regs=0xe5502a70)
     at /usr/src/sys/i386/i386/db_interface.c:159
 #6  0xc0664433 in trap_fatal (frame=0xe5502a70, eva=52) at /usr/src/sys/i386/i386/trap.c:810
 #7  0xc0664177 in trap_pfault (frame=0xe5502a70, usermode=0, eva=52) at /usr/src/sys/i386/i386/trap.c:733
 #8  0xc0663e19 in trap (frame=
       {tf_fs = -1066336232, tf_es = -931856368, tf_ds = 16, tf_edi = 9, tf_esi = -1066336092, tf_ebp = -447730988, tf_isp = -447731044, tf_ebx = -4194252, tf_edx = -1, tf_ecx = 0, tf_eax = 0, tf_trapno = 12, tf_err = 0, tf_eip = -1068265322, tf_cs = 8, tf_eflags = 66118, tf_esp = -447731012, tf_ss = -1068404931})
     at /usr/src/sys/i386/i386/trap.c:420
 #9  0xc065399a in calltrap () at /usr/src/sys/i386/i386/exception.s:140
 #10 0xc0710018 in turnstile_chains ()
 #11 0xc8750010 in ?? ()
 #12 0x00000010 in ?? ()
 #13 0x00000009 in ?? ()
 #14 0xc07100a4 in turnstile_chains ()
 #15 0xe5502ad4 in ?? ()
 #16 0xe5502a9c in ?? ()
 #17 0xffc00034 in ?? ()
 #18 0xffffffff in ?? ()
 #19 0x00000000 in ?? ()
 #20 0x00000000 in ?? ()
 #21 0x0000000c in ?? ()
 #22 0x00000000 in ?? ()
 #23 0xc0539096 in witness_checkorder (lock=0xc07100a4, flags=9, 
     file=0xc069f530 "/usr/src/sys/kern/subr_turnstile.c", line=411)
     at /usr/src/sys/kern/subr_witness.c:749
 #24 0xc0516e9e in _mtx_lock_spin_flags (m=0xc07100a4, opts=0, 
     file=0xc069f530 "/usr/src/sys/kern/subr_turnstile.c", line=411) at /usr/src/sys/kern/kern_mutex.c:354
 #25 0xc05382ae in turnstile_lookup (lock=0xc070a0e0) at /usr/src/sys/kern/subr_turnstile.c:411
 #26 0xc051701e in _mtx_lock_sleep (m=0xc070a0e0, opts=0, 
     file=0xc06a390d "/usr/src/sys/kern/vfs_syscalls.c", line=956) at /usr/src/sys/kern/kern_mutex.c:458
 #27 0xc0516e0f in _mtx_lock_flags (m=0xc070a0e0, opts=0, 
     file=0xc06a390d "/usr/src/sys/kern/vfs_syscalls.c", line=956) at /usr/src/sys/kern/kern_mutex.c:252
 #28 0xc056873f in kern_open (td=0xc403bd20, path=0x0, pathseg=UIO_USERSPACE, flags=1, mode=438)
     at /usr/src/sys/kern/vfs_syscalls.c:956
 #29 0xc0568674 in open (td=0xc403bd20, uap=0x0) at /usr/src/sys/kern/vfs_syscalls.c:926
 #30 0xc0664703 in syscall (frame=
       {tf_fs = 47, tf_es = 674562095, tf_ds = -1078001617, tf_edi = 4, tf_esi = 674562784, tf_ebp = -1077944024, tf_isp = -447730316, tf_ebx = 674484940, tf_edx = 0, tf_ecx = 0, tf_eax = 5, tf_trapno = 12, tf_err = 2, tf_eip = 674012699, tf_cs = 31, tf_eflags = 662, tf_esp = -1077944068, tf_ss = 47})
     at /usr/src/sys/i386/i386/trap.c:1004
 #31 0xc06539ef in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:201
 #32 0x0000002f in ?? ()
 #33 0x2835002f in ?? ()
 #34 0xbfbf002f in ?? ()
 #35 0x00000004 in ?? ()
 #36 0x283502e0 in ?? ()
 #37 0xbfbfe128 in ?? ()
 #38 0xe5502d74 in ?? ()
 #39 0x2833d2cc in ?? ()
 #40 0x00000000 in ?? ()
 #41 0x00000000 in ?? ()
 #42 0x00000005 in ?? ()
 #43 0x0000000c in ?? ()
 #44 0x00000002 in ?? ()
 ---Type <return> to continue, or q <return> to quit---up 23
 #45 0x282c9e1b in ?? ()
 #46 0x0000001f in ?? ()
 #47 0x00000296 in ?? ()
 #48 0xbfbfe0fc in ?? ()
 #49 0x0000002f in ?? ()
 #50 0x00000000 in ?? ()
 #51 0x00000000 in ?? ()
 #52 0x0804ad90 in ?? ()
 #53 0x00000000 in ?? ()
 #54 0x67ff8000 in ?? ()
 #55 0xc8755dc0 in ?? ()
 #56 0xc8755e6c in ?? ()
 #57 0xe55029c4 in ?? ()
 #58 0xe55029b4 in ?? ()
 #59 0xc403bd20 in ?? ()
 #60 0xc052bd30 in sched_switch (td=0x2833d2cc) at /usr/src/sys/kern/sched_4bsd.c:676
 Previous frame inner to this frame (corrupt stack?)
 (kgdb) up 23
 #23 0xc0539096 in witness_checkorder (lock=0xc07100a4, flags=9, 
     file=0xc069f530 "/usr/src/sys/kern/subr_turnstile.c", line=411)
     at /usr/src/sys/kern/subr_witness.c:749
 749		lock1 = &(*lock_list)->ll_children[(*lock_list)->ll_count - 1];
 (kgdb) l
 744		/*
 745		 * Check for duplicate locks of the same type.  Note that we only
 746		 * have to check for this on the last lock we just acquired.  Any
 747		 * other cases will be caught as lock order violations.
 748		 */
 749		lock1 = &(*lock_list)->ll_children[(*lock_list)->ll_count - 1];
 750		w1 = lock1->li_lock->lo_witness;
 751		if (w1 == w) {
 752			if (w->w_same_squawked || (lock->lo_flags & LO_DUPOK))
 753				return;
 (kgdb) p lock_list
 $2 = (struct lock_list_entry *) 0x0
 (kgdb) up
 #24 0xc0516e9e in _mtx_lock_spin_flags (m=0xc07100a4, opts=0, 
     file=0xc069f530 "/usr/src/sys/kern/subr_turnstile.c", line=411) at /usr/src/sys/kern/kern_mutex.c:354
 354		WITNESS_CHECKORDER(&m->mtx_object, opts | LOP_NEWORDER | LOP_EXCLUSIVE,
 (kgdb) l
 349	
 350		MPASS(curthread != NULL);
 351		KASSERT(m->mtx_object.lo_class == &lock_class_mtx_spin,
 352		    ("mtx_lock_spin() of sleep mutex %s @ %s:%d",
 353		    m->mtx_object.lo_name, file, line));
 354		WITNESS_CHECKORDER(&m->mtx_object, opts | LOP_NEWORDER | LOP_EXCLUSIVE,
 355		    file, line);
 356	#if defined(SMP) || LOCK_DEBUG > 0 || 1
 357		_get_spin_lock(m, curthread, opts, file, line);
 358	#else
 (kgdb) p m
 $3 = (struct mtx *) 0xc07100a4
 (kgdb) p *m
 $4 = {mtx_object = {lo_class = 0xc06e25a4, lo_name = 0xc069f553 "turnstile chain", 
     lo_type = 0xc069f553 "turnstile chain", lo_flags = 196608, lo_list = {tqe_next = 0xc07100cc, 
       tqe_prev = 0xc071008c}, lo_witness = 0xc0712900}, mtx_lock = 4, mtx_recurse = 0}
 (kgdb) p m->mtx_object
 $5 = {lo_class = 0xc06e25a4, lo_name = 0xc069f553 "turnstile chain", 
   lo_type = 0xc069f553 "turnstile chain", lo_flags = 196608, lo_list = {tqe_next = 0xc07100cc, 
     tqe_prev = 0xc071008c}, lo_witness = 0xc0712900}
 (kgdb) p m->mtx_object->lo_class
 $6 = (struct lock_class *) 0xc06e25a4
 (kgdb) p *m->mtx_object->lo_class
 $7 = {lc_name = 0xc069cce2 "spin mutex", lc_flags = 10}
 (kgdb) p m->mtx_object->lo_list
 $12 = {tqe_next = 0xc07100cc, tqe_prev = 0xc071008c}
 (kgdb) p m->mtx_object->lo_list->tqe_next
 $13 = (struct lock_object *) 0xc07100cc
 (kgdb) p *m->mtx_object->lo_list->tqe_next
 $14 = {lo_class = 0xc06e25a4, lo_name = 0xc069f553 "turnstile chain", 
   lo_type = 0xc069f553 "turnstile chain", lo_flags = 196608, lo_list = {tqe_next = 0xc07100f4, 
     tqe_prev = 0xc07100b4}, lo_witness = 0xc0712900}
 (kgdb) p *m->mtx_object->lo_list->tqe_next->lo_list->tqe_next
 $15 = {lo_class = 0xc06e25a4, lo_name = 0xc069f553 "turnstile chain", 
   lo_type = 0xc069f553 "turnstile chain", lo_flags = 196608, lo_list = {tqe_next = 0xc071011c, 
     tqe_prev = 0xc07100dc}, lo_witness = 0xc0712900}
 (kgdb) p opts
 $16 = 0
 (kgdb) p file
 $17 = 0xc069f530 "/usr/src/sys/kern/subr_turnstile.c"
 (kgdb) p line
 $18 = 411
 (kgdb) p *m->mtx_object->lo_list->tqe_next->lo_list->tqe_next->lo_list->tqe_next
 $22 = {lo_class = 0xc06e25a4, lo_name = 0xc069f553 "turnstile chain", 
   lo_type = 0xc069f553 "turnstile chain", lo_flags = 196608, lo_list = {tqe_next = 0xc0710144, 
     tqe_prev = 0xc0710104}, lo_witness = 0xc0712900}
 (kgdb) p *m->mtx_object->lo_list->tqe_next->lo_list->tqe_next
 $23 = {lo_class = 0xc06e25a4, lo_name = 0xc069f553 "turnstile chain", 
   lo_type = 0xc069f553 "turnstile chain", lo_flags = 196608, lo_list = {tqe_next = 0xc071011c, 
     tqe_prev = 0xc07100dc}, lo_witness = 0xc0712900}
 (kgdb) p *m->mtx_object->lo_list->tqe_next->lo_list->tqe_next->lo_list->tqe_next->lo_list->tqe_next
 $24 = {lo_class = 0xc06e25a4, lo_name = 0xc069f553 "turnstile chain", 
   lo_type = 0xc069f553 "turnstile chain", lo_flags = 196608, lo_list = {tqe_next = 0xc071016c, 
     tqe_prev = 0xc071012c}, lo_witness = 0xc0712900}
 (kgdb) p *m->mtx_object->lo_list->tqe_next->lo_list->tqe_next->lo_list->tqe_next->lo_list->tqe_next->lo_class
 $25 = {lc_name = 0xc069cce2 "spin mutex", lc_flags = 10}
 (kgdb) p m->mtx_object->lo_list->tqe_next->lo_list->tqe_next->lo_list->tqe_next->lo_list->tqe_next->lo_list->tqe_next->lo_list
 $27 = {tqe_next = 0xc0710194, tqe_prev = 0xc0710154}
 (kgdb) quit
 Script done on Thu Jul  1 17:19:36 2004
 [..]
 
 Cheers,
  Daniel
 -- 
 IRCnet: Mr-Spock         - ceterum censeo Microsoftinem esse delendam -  
  Daniel Lang * dl@leo.org * +49 89 289 18532 * http://www.leo.org/~dl/
State-Changed-From-To: open->closed 
State-Changed-By: kmacy 
State-Changed-When: Fri Nov 16 04:43:09 UTC 2007 
State-Changed-Why:  

Have not seen this in years and the relevant code has changed substantially. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=68442 
>Unformatted:
