From doranj@ucsu.Colorado.EDU Sat Jun 19 11:34:39 1999
Return-Path: <doranj@ucsu.Colorado.EDU>
Received: from ucsu.Colorado.EDU (ucsu.Colorado.EDU [128.138.129.83])
	by hub.freebsd.org (Postfix) with ESMTP id C10F314F24
	for <FreeBSD-gnats-submit@freebsd.org>; Sat, 19 Jun 1999 11:34:36 -0700 (PDT)
	(envelope-from doranj@ucsu.Colorado.EDU)
Received: (from doranj@localhost)
	by ucsu.Colorado.EDU (8.9.3/8.9.3/ITS-5.0/standard) id MAA02840
	for FreeBSD-gnats-submit@freebsd.org; Sat, 19 Jun 1999 12:34:35 -0600 (MDT)
Message-Id: <199906191834.MAA02840@ucsu.Colorado.EDU>
Date: Sat, 19 Jun 1999 12:34:35 -0600 (MDT)
From: Jonathon Doran <doranj@Colorado.EDU>
Sender: doranj@ucsu.Colorado.EDU
To: FreeBSD-gnats-submit@freebsd.org
Subject: SMP kernel reboots without panic message

>Number:         12299
>Category:       i386
>Synopsis:       SMP kernel reboots without panic message
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    gnats-admin
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sat Jun 19 11:40:01 PDT 1999
>Closed-Date:    Tue Jun 22 11:54:27 PDT 1999
>Last-Modified:  Wed Oct 26 06:14:16 GMT 2005
>Originator:     
>Release:        
>Organization:
>Environment:
>Description:
 Submitter-Id:   current-users
 Originator:     doranj@colorado.edu
 Organization:   University of Colorado
 Confidential:   no
 Synopsis:       SMP kernel reboots (not panic) periodically
 Severity:       serious
 Priority:       high
 Category:       kern
 Release:        FreeBSD 3.2-RELEASE i386
 Class:          sw-bug
 Environment: 
 
 Tyan Tomcat 2, 2x P133, 64M
 
 CPU: Pentium/P54C (586-class CPU)
 Origin = "GenuineIntel"  Id = 0x52c  Stepping=1
 Features=0x3bf<FPU,VME,DE,PSE,TSC,MSR,MCE,CX8,APIC>
 real memory  = 67108864 (65536K bytes)
 avail memory = 62144512 (60688K bytes)
 Programming 24 pins in IOAPIC #0
 FreeBSD/SMP: Multiprocessor motherboard
 cpu0 (BSP): apic id:  0, version: 0x00030010, at 0xfee00000
 cpu1 (AP):  apic id:  1, version: 0x00030010, at 0xfee00000
 io0 (APIC): apic id:  2, version: 0x00170011, at 0xfec00000
 Preloaded elf kernel "kernel" at 0xc02f9000.
 Probing for devices on PCI bus 0:
 <Intel 82439HX PCI cache memory controller> rev 0x01 on pci0.0.0
 <Intel 82371SB PCI to ISA bridge> rev 0x00 on pci0.7.0
 ahc0: <Adaptec 2930 Ultra2 SCSI adapter> rev 0x00 int a irq 19 on pci0.17.0
 ahc0: aic7890/91 Wide Channel A, SCSI Id=7, 16/255 SCBs
 xl0: <3Com 3c905-TX Fast Etherlink XL> rev 0x00 int a irq 18 on pci0.18.0
 xl0: Ethernet address: 00:60:08:08:ba:48
 xl0: autoneg complete, link status good (half-duplex, 10Mbps)
 vga0: <Matrox MGA 2064W graphics accelerator> rev 0x01 int a irq 16 on pci0.20.0
 
 ...
 APIC_IO: Testing 8254 interrupt delivery
 APIC_IO: Broken MP table detected: 8254 is not connected to IO APIC int pin 2
 APIC_IO: routing 8254 via 8259 on pin 0
 Waiting 15 seconds for SCSI devices to settle
 SMP: AP CPU #1 Launched!
 
 Description: 
 
 I've run an SMP kernel on this machine since 2.2? (mid 1996) without
 incident.  I can also run 3.2 GENERIC on this machine (using only 1 CPU)
 without incident.
 
 When I run the SMP kernel, the machine will reboot periodically -- about
 once per 24 hours.  Nothing is logged.  The machine acts as if the reset
 switch were pressed.  
 
 One time this occurred I was in a different room configuring apache.
 I went to reload a page and the server didn't respond.  I investigated and 
 found that the SMP machine had rebooted.
 
 Another time I was sitting at the console playing freeciv when the reset
 occurred.  I went from the X11 desktop to a BIOS memory check.
 
 I shut down the machine and thought about it for a while.  The only hardware
 change I had made to this box was swapping an Adaptec 2940 for a 2930U2,
 and I added an 18G U2 drive and a spare CDROM.  I booted the GENERIC kernel
 and have been running for a week without incident.  I am thinking this is
 not related to the new hardware, or GENERIC would have trouble.  And I was
 happily running SMP under 2.2.6 (prior to the hardware addition and the
 required upgrade to 3.2 to get driver support for my 2930U2).
 
 How-To-Repeat: 
 Boot an SMP kernel, and wait.
 
 Fix: 
 	I have no idea.  I have quite a bit of industry kernel and device 
 driver experience and am more than willing to debug this if someone would
 give me some pointers.  As it stands, there isn't much of a place to start.
 
 I'd also like to point out that I've heard several similar reports on
 freebsd-questions.  I'd encourage everyone not to be too quick to dismiss
 this as a hardware problem.
 
>How-To-Repeat:
>Fix:
>Release-Note:
>Audit-Trail:

From: Poul-Henning Kamp <phk@critter.freebsd.dk>
To: Jonathon Doran <doranj@Colorado.EDU>
Cc: FreeBSD-gnats-submit@FreeBSD.org
Subject: Re: pending/12299: SMP kernel reboots without panic message 
Date: Sat, 19 Jun 1999 20:47:11 +0200

 Jonathon, 
 
 "Hardware" would be my initial reaction.
 
 Try wrapping your computer in a blanket; if your problem is thermal,
 this is sure to trigger it.
 
 Check all fans.
 
 Check all voltages on your power supply.  Monitor the +12V rail while
 the disks spin up; if it dives below 10.8V at any time, the supply is
 too weak to carry the load (use a 'scope to see any transients).
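 
 The 10.8V floor is 10% below the nominal 12V.  If the readings are
 logged rather than watched live, a check along these lines could flag
 the dips.  This is only a sketch: the helper name find_sags and the
 sample readings are invented for illustration and are not from the
 report.

```python
# Hypothetical helper: flag logged +12V readings that fall below a floor.
FLOOR_12V = 10.8  # threshold quoted in the reply above, in volts


def find_sags(samples, floor=FLOOR_12V):
    """Return (index, volts) for every sample strictly below the floor."""
    return [(i, v) for i, v in enumerate(samples) if v < floor]


if __name__ == "__main__":
    # Made-up readings taken during disk spin-up, in volts.
    readings = [12.05, 11.90, 11.20, 10.70, 11.40, 11.95]
    for index, volts in find_sags(readings):
        print(f"sample {index}: {volts:.2f} V is below {FLOOR_12V} V")
```

 A multimeter sampled by hand will miss short transients, which is why
 the reply suggests a 'scope for those.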
 
 --
 Poul-Henning Kamp             FreeBSD coreteam member
 phk@FreeBSD.ORG               "Real hackers run -current on their laptop."
 FreeBSD -- It will take a long time before progress goes too far!
 
State-Changed-From-To: open->closed 
State-Changed-By: phk 
State-Changed-When: Tue Jun 22 11:54:27 PDT 1999 
State-Changed-Why:  
Thermal problems confirmed.  The addition of two new drives pushed 
the machine over the edge (and into the sauna). 
>Unformatted:
