From nobody@FreeBSD.org  Tue Jan 15 03:20:39 2008
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id B0D6816A417
	for <freebsd-gnats-submit@FreeBSD.org>; Tue, 15 Jan 2008 03:20:39 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (www.freebsd.org [IPv6:2001:4f8:fff6::21])
	by mx1.freebsd.org (Postfix) with ESMTP id A2D8013C457
	for <freebsd-gnats-submit@FreeBSD.org>; Tue, 15 Jan 2008 03:20:39 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.14.2/8.14.2) with ESMTP id m0F3JN2i049575
	for <freebsd-gnats-submit@FreeBSD.org>; Tue, 15 Jan 2008 03:19:23 GMT
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.14.2/8.14.1/Submit) id m0F3JNxf049573;
	Tue, 15 Jan 2008 03:19:23 GMT
	(envelope-from nobody)
Message-Id: <200801150319.m0F3JNxf049573@www.freebsd.org>
Date: Tue, 15 Jan 2008 03:19:23 GMT
From: Leo Bicknell <bicknell@ufp.org>
To: freebsd-gnats-submit@FreeBSD.org
Subject: apic_hpet0 probe causes divide by zero kernel panic
X-Send-Pr-Version: www-3.1
X-GNATS-Notify:

>Number:         119675
>Category:       kern
>Synopsis:       [acpi] apic_hpet0 probe causes divide by zero kernel panic
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    jhb
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Tue Jan 15 03:30:02 UTC 2008
>Closed-Date:    Thu Mar 13 16:23:46 UTC 2008
>Last-Modified:  Thu Mar 13 16:23:46 UTC 2008
>Originator:     Leo Bicknell
>Release:        7.0-RC1
>Organization:
>Environment:
Reproduce with 7.0-RC1 i386 or amd64 install disks.
>Description:
When probing apic_hpet0 the kernel divides by zero and panics, preventing
boot.  Disabling ACPI disables the probe and is therefore a workaround, but
it also disables other features (e.g. the second core).

I noticed this was documented in
http://unix.derkeiler.com/Mailing-Lists/FreeBSD/stable/2007-05/msg00704.html
(regarding a 6.x version) and not solved.  The code of interest is in
/usr/src/sys/dev/acpica/acpi_hpet.c:

	/* Read basic statistics about the timer. */
	val = bus_read_4(sc->mem_res, HPET_OFFSET_PERIOD);
	freq = (1000000000000000LL + val / 2) / val;
	if (bootverbose) {
		val = bus_read_4(sc->mem_res, HPET_OFFSET_INFO);
		device_printf(dev,
		    "vend: 0x%x rev: 0x%x num: %d hz: %jd opts:%s%s\n",
		    val >> 16, val & 0xff, (val >> 18) & 0xf, freq,
		    ((val >> 15) & 1) ? " leg_route" : "",
		    ((val >> 13) & 1) ? " count_size" : "");
	}

On this particular machine (Gigabyte GA-54SLI-S4 Motherboard w/Athlon
64 x2 4200) val = 0, so the second line causes a divide by zero error.
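
To spell out the arithmetic: the period register reports the tick length
in femtoseconds, so the expression above is 10^15 divided by val, rounded
to the nearest Hz.  A minimal userland sketch follows; the helper name
hpet_period_to_hz is hypothetical and not part of the driver, and
returning 0 for a zero period is just one way to signal the invalid case.

```c
#include <stdint.h>

/* Femtoseconds per second; the HPET period register is in femtoseconds. */
#define FS_PER_SEC 1000000000000000ULL

/*
 * Hypothetical helper (not from acpi_hpet.c): convert a period in
 * femtoseconds to a frequency in Hz, rounded to nearest.  A zero
 * period is reported as 0 Hz instead of being used as a divisor.
 */
static uint64_t
hpet_period_to_hz(uint32_t period_fs)
{
	if (period_fs == 0)
		return (0);
	return ((FS_PER_SEC + period_fs / 2) / period_fs);
}
```

For example, a period of 100000000 fs (100 ns) gives 10000000 Hz, while
the 0 read back on this board yields 0, which a caller could treat as
"do not attach".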

On this machine the motherboard BIOS lets me set the "high precision
timer" to enabled or disabled, and to 32 bit or 64 bit mode.  I have tried
all 4 combinations, and all produce the same result.

I can use the hardware causing this problem as a test case for at least
a week or two.  If someone can get back to me relatively quickly, I can
perform more tests and/or supply other info fairly promptly.

>How-To-Repeat:
From looking at the CVS diff, it appears that booting any i386 or amd64
image on this hardware will cause the problem.  Based on Google searches,
the problem is not common but does affect more than one chipset and
motherboard.

>Fix:
Commenting out the division and setting the freq value by hand "fixes"
the problem in that the machine will boot, but I suspect this leaves a
completely broken hpet0 device.  Disabling ACPI disables the probe,
allowing boot, but means none of the other ACPI goodness is available.

>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-bugs->freebsd-acpi 
Responsible-Changed-By: linimon 
Responsible-Changed-When: Tue Jan 15 04:01:31 UTC 2008 
Responsible-Changed-Why:  
Over to maintainer(s). 

http://www.freebsd.org/cgi/query-pr.cgi?pr=119675 

From: John Baldwin <jhb@FreeBSD.org>
To: bug-followup@freebsd.org,
 bicknell@ufp.org
Cc:  
Subject: Re: kern/119675: [acpi] apic_hpet0 probe causes divide by zero kernel panic
Date: Tue, 15 Jan 2008 10:13:07 -0500

 You can try the patch below.  It fixes a couple of places where we don't
 honor the spec (we don't shut it off in S1 and S2 as required and we don't
 preserve reserved bits in the global configuration register).  It also
 fails the attach if the period is zero which should fix your panic and
 just leave you with no HPET.
 
 Index: acpi_hpet.c
 ===================================================================
 RCS file: /host/cvs/usr/cvs/src/sys/dev/acpica/acpi_hpet.c,v
 retrieving revision 1.12
 diff -u -r1.12 acpi_hpet.c
 --- acpi_hpet.c	9 Oct 2007 07:48:07 -0000	1.12
 +++ acpi_hpet.c	15 Jan 2008 14:53:21 -0000
 @@ -82,6 +82,24 @@
  	return (bus_read_4(sc->mem_res, HPET_OFFSET_VALUE));
  }
  
 +static void
 +hpet_enable(struct acpi_hpet_softc *sc)
 +{
 +	uint32_t val;
 +	
 +	val = bus_read_4(sc->mem_res, HPET_OFFSET_ENABLE);
 +	bus_write_4(sc->mem_res, HPET_OFFSET_ENABLE, val | 1);
 +}
 +
 +static void
 +hpet_disable(struct acpi_hpet_softc *sc)
 +{
 +	uint32_t val;
 +	
 +	val = bus_read_4(sc->mem_res, HPET_OFFSET_ENABLE);
 +	bus_write_4(sc->mem_res, HPET_OFFSET_ENABLE, val & ~1);
 +}
 +
  /* Discover the HPET via the ACPI table of the same name. */
  static void 
  acpi_hpet_identify(driver_t *driver, device_t parent)
 @@ -166,10 +184,16 @@
  	}
  
  	/* Be sure timer is enabled. */
 -	bus_write_4(sc->mem_res, HPET_OFFSET_ENABLE, 1);
 +	hpet_enable(sc);
  
  	/* Read basic statistics about the timer. */
  	val = bus_read_4(sc->mem_res, HPET_OFFSET_PERIOD);
 +	if (val == 0) {
 +		device_printf(dev, "invalid period\n");
 +		hpet_disable(sc);
 +		bus_free_resource(dev, SYS_RES_MEMORY, sc->mem_res);
 +	}
 +
  	freq = (1000000000000000LL + val / 2) / val;
  	if (bootverbose) {
  		val = bus_read_4(sc->mem_res, HPET_OFFSET_INFO);
 @@ -192,7 +216,7 @@
  	val2 = bus_read_4(sc->mem_res, HPET_OFFSET_VALUE);
  	if (val == val2) {
  		device_printf(dev, "HPET never increments, disabling\n");
 -		bus_write_4(sc->mem_res, HPET_OFFSET_ENABLE, 0);
 +		hpet_disable(sc);
  		bus_free_resource(dev, SYS_RES_MEMORY, sc->mem_res);
  		return (ENXIO);
  	}
 @@ -214,13 +238,29 @@
  }
  
  static int
 +acpi_hpet_suspend(device_t dev)
 +{
 +	struct acpi_hpet_softc *sc;
 +
 +	/*
 +	 * Disable the timer during suspend.  The timer will not lose
 +	 * its state in S1 or S2, but we are required to disable
 +	 * it.
 +	 */
 +	sc = device_get_softc(dev);
 +	hpet_disable(sc);
 +
 +	return (0);
 +}
 +
 +static int
  acpi_hpet_resume(device_t dev)
  {
  	struct acpi_hpet_softc *sc;
  
  	/* Re-enable the timer after a resume to keep the clock advancing. */
  	sc = device_get_softc(dev);
 -	bus_write_4(sc->mem_res, HPET_OFFSET_ENABLE, 1);
 +	hpet_enable(sc);
  
  	return (0);
  }
 @@ -260,6 +300,7 @@
  	DEVMETHOD(device_probe, acpi_hpet_probe),
  	DEVMETHOD(device_attach, acpi_hpet_attach),
  	DEVMETHOD(device_detach, acpi_hpet_detach),
 +	DEVMETHOD(device_suspend, acpi_hpet_suspend),
  	DEVMETHOD(device_resume, acpi_hpet_resume),
  
  	{0, 0}
 
 -- 
 John Baldwin
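
The reserved-bits change in the patch above can be seen in isolation with
a small userland sketch.  It models the configuration register as a plain
integer; the macro and function names are hypothetical and no bus access
is performed.

```c
#include <stdint.h>

/* Hypothetical name for bit 0 (overall enable) of the config register. */
#define HPET_CFG_ENABLE 0x1u

/* Set the enable bit and leave every other bit as it was read. */
static uint32_t
hpet_cfg_enable(uint32_t reg)
{
	return (reg | HPET_CFG_ENABLE);
}

/* Clear the enable bit and leave every other bit as it was read. */
static uint32_t
hpet_cfg_disable(uint32_t reg)
{
	return (reg & ~HPET_CFG_ENABLE);
}
```

A blind write of 1 or 0, as in the lines the patch removes, would instead
overwrite whatever the upper bits held; the read/modify/write form only
touches bit 0.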

From: Leo Bicknell <bicknell@ufp.org>
To: John Baldwin <jhb@FreeBSD.org>
Cc: bug-followup@FreeBSD.org
Subject: Re: kern/119675: [acpi] apic_hpet0 probe causes divide by zero kernel panic
Date: Tue, 15 Jan 2008 11:33:14 -0500

 In a message written on Tue, Jan 15, 2008 at 10:13:07AM -0500, John Baldwin wrote:
 > You can try the patch below.  It fixes a couple of places where we don't
 > honor the spec (we don't shut it off in S1 and S2 as required and we don't
 > preserve reserved bits in the global configuration register).  It also
 > fails the attach if the period is zero which should fix your panic and
 > just leave you with no HPET.
 
 Good news and bad news.
 
 With the patch "invalid period" is printed out, so I believe it's
 correctly detecting the hpet0 issue.
 
 However, I immediately get an "integer divide fault while in kernel
 mode" panic and the boot still fails.  I tried with boot -v and the
 message is right after the "invalid period", so I'm not quite sure
 what's causing it.
 
 Any recommendations, other than setting up a kernel debugger to see where
 it's coming from?
 
 -- 
        Leo Bicknell - bicknell@ufp.org - CCIE 3440
         PGP keys at http://www.ufp.org/~bicknell/
 Read TMBG List - tmbg-list-request@tmbg.org, www.tmbg.org

From: Nate Lawson <nate@root.org>
To: Leo Bicknell <bicknell@ufp.org>
Cc: bug-followup@FreeBSD.org
Subject: Re: kern/119675: [acpi] apic_hpet0 probe causes divide by zero kernel panic
Date: Tue, 15 Jan 2008 08:48:14 -0800

 Leo Bicknell wrote:
 >  In a message written on Tue, Jan 15, 2008 at 10:13:07AM -0500, John Baldwin wrote:
 >  > You can try the patch below.  It fixes a couple of places where we don't
 >  > honor the spec (we don't shut it off in S1 and S2 as required and we don't
 >  > preserve reserved bits in the global configuration register).  It also
 >  > fails the attach if the period is zero which should fix your panic and
 >  > just leave you with no HPET.
 >  
 >  Good news and bad news.
 >  
 >  With the patch "invalid period" is printed out, so I believe it's
 >  correctly detecting the hpet0 issue.
 >  
 >  However, I immediately get an "integer divide fault while in kernel
 >  mode" panic and the boot still fails.  I tried with boot -v and the
 >  message is right after the "invalid period", so I'm not quite sure
 >  what's causing it.
 >  
 >  Any recommendations, other than setting up a kernel debugger to see where
 >  it's coming from?
 
 John's patch should be committed anyway.  However, it's possible your
 panic is happening in non-acpi code.  There's no way to narrow it down
 perfectly without enabling the debugger (options DDB etc.) and typing
 "trace" after it panics.
 
 However, a boot verbose (boot -v) may give a little more information.
 
 -- 
 Nate
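
One possible explanation for the remaining fault, judging only from the
hunk as posted in this thread: the new val == 0 block prints the message
and frees the resource but has no return, so control falls through to the
freq division with val still zero.  That would match "invalid period"
being followed immediately by the divide fault.  The commit log further
down reports +45 lines against the 44 added by the posted patch, which is
at least consistent with one extra line.  The sketch below is a userland
illustration of the intended control flow with hypothetical names; it is
not the committed code, which is not shown in this PR.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the attach path.  On a zero period it
 * returns ENXIO before the division is reached; on success it stores
 * the rounded frequency in *freq_out and returns 0.
 */
static int
attach_sketch(uint32_t period_fs, uint64_t *freq_out)
{
	if (period_fs == 0) {
		/* The driver would disable the timer and free resources here. */
		return (ENXIO);
	}
	*freq_out = (1000000000000000ULL + period_fs / 2) / period_fs;
	return (0);
}
```
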
State-Changed-From-To: open->patched 
State-Changed-By: jhb 
State-Changed-When: Tue Jan 15 18:50:59 UTC 2008 
State-Changed-Why:  
Fix committed to HEAD. 


Responsible-Changed-From-To: freebsd-acpi->jhb 
Responsible-Changed-By: jhb 
Responsible-Changed-When: Tue Jan 15 18:50:59 UTC 2008 
Responsible-Changed-Why:  
Fix committed to HEAD. 


From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/119675: commit references a PR
Date: Tue, 15 Jan 2008 18:50:54 +0000 (UTC)

 jhb         2008-01-15 18:50:48 UTC
 
   FreeBSD src repository
 
   Modified files:
     sys/dev/acpica       acpi_hpet.c 
   Log:
   Fix a few minor issues based on a bug report and reading over the HPET
   spec:
   - Use read/modify/write cycles to enable and disable the HPET instead of
     writing 0 to reserved bits.
   - Shutdown the HPET during suspend as encouraged by the spec.
   - Fail to attach to an HPET with a period of zero.
   
   MFC after:      1 week
   PR:             kern/119675 [3]
   Reported by:    Leo Bicknell | bicknell ufp.org
   
   Revision  Changes    Path
   1.13      +45 -3     src/sys/dev/acpica/acpi_hpet.c
 
State-Changed-From-To: patched->closed 
State-Changed-By: jhb 
State-Changed-When: Thu Mar 13 16:23:31 UTC 2008 
State-Changed-Why:  
Fix MFC'd to 6.x, 7.x, and 7.0. 

>Unformatted:
