From phil@nixil.net  Wed Dec 29 09:41:52 2004
Return-Path: <phil@nixil.net>
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 83FF516A4CE
	for <FreeBSD-gnats-submit@freebsd.org>; Wed, 29 Dec 2004 09:41:52 +0000 (GMT)
Received: from nixil.net (nixil.net [161.58.222.1])
	by mx1.FreeBSD.org (Postfix) with ESMTP id E7BDA43D46
	for <FreeBSD-gnats-submit@freebsd.org>; Wed, 29 Dec 2004 09:41:51 +0000 (GMT)
	(envelope-from phil@nixil.net)
Received: from nixil.net (localhost [127.0.0.1])
	by nixil.net (8.13.1/8.13.1) with ESMTP id iBT9fpuG028243
	for <FreeBSD-gnats-submit@freebsd.org>; Wed, 29 Dec 2004 02:41:51 -0700 (MST)
Received: (from phil@localhost)
	by nixil.net (8.13.1/8.13.1/Submit) id iBT9fpG8028242;
	Wed, 29 Dec 2004 02:41:51 -0700 (MST)
Message-Id: <200412290941.iBT9fpG8028242@nixil.net>
Date: Wed, 29 Dec 2004 02:41:51 -0700 (MST)
From: Phil Oleson <oz@nixil.net>
Reply-To: Phil Oleson <oz@nixil.net>
To: FreeBSD-gnats-submit@freebsd.org
Cc:
Subject: 5.3 kernel crash
X-Send-Pr-Version: 3.113
X-GNATS-Notify:

>Number:         75603
>Category:       kern
>Synopsis:       5.3 kernel crash
>Confidential:   no
>Severity:       serious
>Priority:       low
>Responsible:    freebsd-scsi
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Wed Dec 29 09:50:19 GMT 2004
>Closed-Date:    Thu Apr 14 03:54:09 GMT 2005
>Last-Modified:  Thu Apr 14 03:54:09 GMT 2005
>Originator:     Phil Oleson
>Release:        FreeBSD 5.3-STABLE i386
>Organization:
N/A
>Environment:
System: FreeBSD demigorgon 5.3-STABLE FreeBSD 5.3-STABLE #0: Tue Dec 28 18:41:32 MST 2004 root@demigorgon:/usr/src/sys/i386/compile/demigorgon i386


>Description:
	This crash occurs on 5.3-RELEASE as well as on the current RELENG_5 branch.
The system has a BT-958 host controller (BIOS v4.96I, host adapter firmware v5.07B).
The disk that causes the crash is a Seagate ST118202LW, firmware v06.
If I take the disk off the bus, the system boots fine. When the disk is
initialized at boot, the kernel crashes (console log and trace below).
The same hardware setup worked with 4.X.

-----------------------------------------------------------------------------------
Connected to Comport 1
KDB: debugger backends: ddb
KDB: current backend: ddb
Copyright (c) 1992-2004 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 5.3-STABLE #0: Tue Dec 28 20:38:42 MST 2004
root@demigorgon:/usr/src/sys/i386/compile/demigorgon
MPTable: <DELL Opti GX270 >
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Pentium(R) 4 CPU 2.80GHz (2793.19-MHz 686-class CPU)
Origin = "GenuineIntel" Id = 0xf29 Stepping = 9
Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
Hyperthreading: 2 logical CPUs
real memory = 527892480 (503 MB)
avail memory = 507027456 (483 MB)
ioapic0: Changing APIC ID to 2
ioapic0: Assuming intbase of 0
ioapic0 <Version 2.0> irqs 0-23 on motherboard
npx0: [FAST]
npx0: <math processor> on motherboard
npx0: INT 16 interface
pcib0: <MPTable Host-PCI bridge> pcibus 0 on motherboard
pci0: <PCI bus> on pcib0
agp0: <Intel 82865G (865G GMCH) SVGA controller> port 0xed98-0xed9f mem 0xfeb80000-0xfebfffff,0xe8000000-0xefffffff irq 16 at device 2.0 on pci0
agp0: detected 8060k stolen memory
agp0: aperture size is 128M
uhci0: <Intel 82801EB (ICH5) USB controller USB-A> port 0xff80-0xff9f irq 16 at device 29.0 on pci0
uhci0: [GIANT-LOCKED]
usb0: <Intel 82801EB (ICH5) USB controller USB-A> on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1: <Intel 82801EB (ICH5) USB controller USB-B> port 0xff60-0xff7f irq 19 at device 29.1 on pci0
uhci1: [GIANT-LOCKED]
usb1: <Intel 82801EB (ICH5) USB controller USB-B> on uhci1
usb1: USB revision 1.0
uhub1: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhci2: <Intel 82801EB (ICH5) USB controller USB-C> port 0xff40-0xff5f irq 18 at device 29.2 on pci0
uhci2: [GIANT-LOCKED]
usb2: <Intel 82801EB (ICH5) USB controller USB-C> on uhci2
usb2: USB revision 1.0
uhub2: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub2: 2 ports with 2 removable, self powered
uhci3: <Intel 82801EB (ICH5) USB controller USB-D> port 0xff20-0xff3f irq 16 at device 29.3 on pci0
uhci3: [GIANT-LOCKED]
usb3: <Intel 82801EB (ICH5) USB controller USB-D> on uhci3
usb3: USB revision 1.0
uhub3: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub3: 2 ports with 2 removable, self powered
pci0: <serial bus, USB> at device 29.7 (no driver attached)
pcib1: <MPTable PCI-PCI bridge> at device 30.0 on pci0
pci1: <PCI bus> on pcib1
bt0: <Buslogic Multi-Master SCSI Host Adapter> port 0xdf3c-0xdf3f mem 0xfe9df000-0xfe9dffff irq 16 at device 7.0 on pci1
bt0: BT-958 FW Rev. 5.07B Ultra Wide SCSI Host Adapter, SCSI ID 7, 192 CCBs
bt0: [GIANT-LOCKED]
em0: <Intel(R) PRO/1000 Network Connection, Version - 1.7.35> port 0xdf40-0xdf7f mem 0xfe9e0000-0xfe9fffff irq 18 at device 12.0 on pci1
em0: Ethernet address: 00:0d:56:95:40:72
em0: Speed:N/A Duplex:N/A
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel ICH5 UDMA100 controller> port 0xffa0-0xffaf,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 irq 18 at device 31.1 on pci0
ata0: channel #0 on atapci0
ata1: channel #1 on atapci0
atapci1: <Intel ICH5 SATA150 controller> port 0xfea0-0xfeaf,0xfe30-0xfe33,0xfe20-0xfe27,0xfe10-0xfe13,0xfe00-0xfe07 irq 18 at device 31.2 on pci0
ata2: channel #0 on atapci1
ata3: channel #1 on atapci1
pci0: <serial bus, SMBus> at device 31.3 (no driver attached)
pcm0: <Intel ICH5 (82801EB)> port 0xedc0-0xedff,0xee00-0xeeff mem 0xfeb7f900-0xfeb7f9ff,0xfeb7fa00-0xfeb7fbff irq 17 at device 31.5 on pci0
pcm0: [GIANT-LOCKED]
pcm0: <Analog Devices AD1981B AC97 Codec>
cpu0 on motherboard
orm0: <ISA Option ROMs> at iomem 0xd6800-0xd7fff,0xd5000-0xd67ff,0xce800-0xd4fff,0xca800-0xce7ff,0xc0000-0xca7ff on isa0
pmtimer0 on isa0
atkbdc0: <Keyboard controller (i8042)> at port 0x64,0x60 on isa0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
psm0: <PS/2 Mouse> irq 12 on atkbdc0
psm0: [GIANT-LOCKED]
psm0: model IntelliMouse, device ID 3
fdc0: <Enhanced floppy controller> at port 0x3f0-0x3f5 irq 6 drq 2 on isa0
fdc0: [FAST]
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
ppc0: <Parallel port> at port 0x378-0x37f irq 7 on isa0
ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/8 bytes threshold
ppbus0: <Parallel port bus> on ppc0
lpt0: <Printer> on ppbus0
lpt0: Interrupt-driven port
ppi0: <Parallel I/O> on ppbus0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x100>
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A, console
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
unknown: <PNP0501> can't assign resources (port)
unknown: <PNP0401> can't assign resources (port)
unknown: <PNP0700> can't assign resources (port)
unknown: <PNP0f13> can't assign resources (irq)
unknown: <PNP0303> can't assign resources (port)
Timecounter "TSC" frequency 2793190664 Hz quality 800
Timecounters tick every 10.000 msec
ad0: 76293MB <WDC WD800BB-75FRA0/77.07W77> [155009/16/63] at ata0-master UDMA100
acd0: CDROM <Lite-On LTN486S 48x Max/YDS6> at ata1-master UDMA33
Waiting 15 seconds for SCSI devices to settle
da0 at bt0 bus 0 target 0 lun 0
da0: <SEAGATE ST118202LW 0006> Fixed Direct Access SCSI-2 device
Fatal trap 18: integer divide fault while in kernel mode
instruction pointer    = 0x8:0xc043c40b
stack pointer          = 0x10:0xd4f9881c
frame pointer          = 0x10:0xd4f9882c
code segment           = base 0x0, limit 0xfffff, type 0x1b
                       = DPL 0, pres 1, def32 1, gran 1
processor eflags       = interrupt enabled, resume, IOPL = 0
current process        = 41 (swi3: cambio)
[thread 100042]
Stopped at scsi_calc_syncsrate+0x53: divl %ecx,%eax
db> trace
scsi_calc_syncsrate(0,0,a,0,a) at scsi_calc_syncsrate+0x53
xpt_announce_periph(c1b75400,d4f98c74,3d672000,4,200) at xpt_announce_periph+0xe0
dadone(c1b75400,c1b95c00) at dadone+0x618
camisr(c072b320) at camisr+0x1f1
ithread_loop(c193c900,d4f98d48) at ithread_loop+0x151
fork_exit(c0513690,c193c900,d4f98d48) at fork_exit+0x74
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xd4f98d7c, ebp = 0 ---
db>
>How-To-Repeat:
	Boot with the drive listed above attached to the SCSI host controller listed above.
>Fix:
.	



>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-bugs->freebsd-scsi 
Responsible-Changed-By: kris 
Responsible-Changed-When: Wed Jan 5 01:01:30 GMT 2005 
Responsible-Changed-Why:  
Assign to SCSI mailing list for evaluation 

http://www.freebsd.org/cgi/query-pr.cgi?pr=75603 

From: DJ Gregor <dgregor@interhack.com>
To: freebsd-gnats-submit@FreeBSD.org, oz@nixil.net
Cc:  
Subject: Re: kern/75603: 5.3 kernel crash
Date: Tue, 5 Apr 2005 20:18:41 -0400

 I've had the same problem with 5.3-RELEASE when I connected a Hitachi 
 IC35L073UCDY10 (this is Ultra320) to a Qlogic ISP1080 card.  Here is 
 what I get when booting in verbose mode (note: this was copied by hand, 
 so there might be errors):
 
 ...
 pass0 at isp0 bus 0 target 0 lun 0
 pass0: <IBM IC35L073UCDY10-0 S27F> Fixed Direct Access SCSI-3 device
 pass0: Serial Number E6W7735C
 
 Fatal trap 18: integer divide fault while in kernel mode
 instruction pointer     = 0x8:0xc044b9cf
 stack pointer           = 0x10:0xe5184644
 frame pointer           = 0x10:0xe5184654
 code segment            = base 0x0, limit 0xffff, type 0x1b
                          = DPL 0, pres 1, def32 1, gran 1
 processor eflags        = interrupt enabled, resume, IOPL = 0
 current process         = 33 (swi3: cambio)
 trap number             = 18
 panic: integer divide fault
 Uptime: 20s
 Cannot dump. No dump device defined.
 Shutting down ACPI
 
 
 I'm installing FreeBSD to another drive, and I'll see what data I can 
 gather with a normal kernel.  I'll also see if I can try other drives 
 and another controller, although my time to test is a bit limited at 
 this point.  Please let me know if there is anything I ought to be 
 looking at or keeping my eye out for.
 
 
 	Thanks,
 	- djg
 

From: Phil Oleson <oz@nixil.net>
To: DJ Gregor <dgregor@interhack.com>
Cc: freebsd-gnats-submit@FreeBSD.org
Subject: Re: kern/75603: 5.3 kernel crash
Date: Tue, 05 Apr 2005 18:40:20 -0600

 I did try to debug this some, though I got distracted by work-related
 tasks.  In my opinion, scsi_calc_syncsrate() can divide by zero, whereas
 the old ifdef'd code explicitly avoided this possibility.  The problem
 seems to be in the bt.c code below, which sets sync_period.  The 4.x tree
 called scsi_calc_syncparam(), which is safer in this regard, but that call
 is ifdef'd out of the 5.x+ tree, and I don't think the new code is doing
 the right thing.  But this is as far as I got.
 
     Phil.
 
 bt.c:
 <snip>
 #ifdef  CAM_NEW_TRAN_CODE
         cts->protocol = PROTO_SCSI;
         cts->protocol_version = SCSI_REV_2;
         cts->transport = XPORT_SPI;
         cts->transport_version = 2;
 
         spi->sync_period = sync_period;
         spi->valid |= CTS_SPI_VALID_SYNC_RATE;
         spi->sync_offset = sync_offset;
         spi->valid |= CTS_SPI_VALID_SYNC_OFFSET;
 
         spi->valid |= CTS_SPI_VALID_BUS_WIDTH;
         spi->bus_width = bus_width;
 
         if (cts->ccb_h.target_lun != CAM_LUN_WILDCARD) {
                 scsi->valid = CTS_SCSI_VALID_TQ;
                 spi->valid |= CTS_SPI_VALID_DISC;
         } else
                 scsi->valid = 0;
 
 #else
         /* Convert ns value to standard SCSI sync rate */
         if (cts->sync_offset != 0)
                 cts->sync_period = scsi_calc_syncparam(sync_period);
         else
                 cts->sync_period = 0;
         cts->sync_offset = sync_offset;
         cts->bus_width = MSG_EXT_WDTR_BUS_8_BIT;
 
         cts->valid = CCB_TRANS_SYNC_RATE_VALID
                    | CCB_TRANS_SYNC_OFFSET_VALID
                    | CCB_TRANS_BUS_WIDTH_VALID;
 
 #endif
 </snip>
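
The difference between the two branches quoted above can be sketched in
standalone C. This is only an illustration, not code from bt.c or from CAM:
struct xfer_settings and report_sync_khz() are hypothetical names, and the
sketch assumes the plain "4 ns per period-factor unit" conversion, ignoring
any special-case factors for the faster rates. The point is that a caller
which treats a zero offset or a zero period as asynchronous never reaches
the division.

```c
#include <stdint.h>

/* Hypothetical stand-in for the negotiated transfer settings. */
struct xfer_settings {
	uint8_t sync_period;	/* SCSI period factor; 0 = not negotiated */
	uint8_t sync_offset;	/* 0 = asynchronous transfers */
};

/*
 * Return the synchronous rate in kHz, or 0 for asynchronous.
 * The divide is only reached when both fields are non-zero.
 * Assumes period in ns = factor * 4 (no exception table).
 */
uint32_t
report_sync_khz(const struct xfer_settings *xs)
{
	if (xs->sync_offset == 0 || xs->sync_period == 0)
		return (0);
	/* kHz = 1,000,000 / period in ns */
	return (1000000U / ((uint32_t)xs->sync_period * 4U));
}
```

With this shape a device that never negotiated a sync rate, like the one in
the trace, would simply be reported as asynchronous.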
 
 
State-Changed-From-To: open->closed 
State-Changed-By: mjacob 
State-Changed-When: Thu Apr 14 03:53:34 GMT 2005 
State-Changed-Why:  
Put in some guard code so we don't divide by zero. 
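
For readers of this PR, a guard of this kind might look roughly like the
sketch below. It is not the committed change: sync_rate_khz() and
ASYNC_FALLBACK_KHZ are hypothetical names, the fallback value is a
placeholder, and the conversion assumes 4 ns per period-factor unit. The
trace in the Description shows scsi_calc_syncsrate() apparently being
entered with a first argument of 0, which is the case the early return
covers.

```c
#include <stdint.h>

/* Placeholder rate to report when no sync period was negotiated (assumed). */
#define ASYNC_FALLBACK_KHZ	3300U

/*
 * Convert a SCSI period factor to a rate in kHz.
 * A factor of 0 would make the divisor 0, so return a nominal
 * value instead of faulting.
 */
uint32_t
sync_rate_khz(uint8_t period_factor)
{
	if (period_factor == 0)
		return (ASYNC_FALLBACK_KHZ);
	/* period in 0.1 ns units = factor * 4 * 10 */
	return (10000000U / ((uint32_t)period_factor * 4U * 10U));
}
```

Guarding inside the conversion routine protects every caller at once, at the
cost of returning a made-up rate for a period that should never be zero.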

http://www.freebsd.org/cgi/query-pr.cgi?pr=75603 
>Unformatted:
