From nobody@FreeBSD.org  Sun Nov 14 20:11:19 2010
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 5EA92106564A
	for <freebsd-gnats-submit@FreeBSD.org>; Sun, 14 Nov 2010 20:11:19 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (www.freebsd.org [IPv6:2001:4f8:fff6::21])
	by mx1.freebsd.org (Postfix) with ESMTP id 4CFCB8FC18
	for <freebsd-gnats-submit@FreeBSD.org>; Sun, 14 Nov 2010 20:11:19 +0000 (UTC)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.14.3/8.14.3) with ESMTP id oAEKBIIh018833
	for <freebsd-gnats-submit@FreeBSD.org>; Sun, 14 Nov 2010 20:11:18 GMT
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.14.3/8.14.3/Submit) id oAEKBIAH018826;
	Sun, 14 Nov 2010 20:11:18 GMT
	(envelope-from nobody)
Message-Id: <201011142011.oAEKBIAH018826@www.freebsd.org>
Date: Sun, 14 Nov 2010 20:11:18 GMT
From: Loic Pefferkorn <loic-freebsd@loicp.eu>
To: freebsd-gnats-submit@FreeBSD.org
Subject: [patch] Kernel panic when hw.ciss.expose_hidden_physical is set
X-Send-Pr-Version: www-3.1
X-GNATS-Notify:

>Number:         152250
>Category:       kern
>Synopsis:       [acpi] [patch] Kernel panic when hw.ciss.expose_hidden_physical is set
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    sbruno
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sun Nov 14 20:20:08 UTC 2010
>Closed-Date:    Thu Apr 04 15:13:22 UTC 2013
>Last-Modified:  Thu Apr 04 15:13:22 UTC 2013
>Originator:     Loic Pefferkorn
>Release:        7.2-RELEASE
>Organization:
>Environment:
FreeBSD squeak.estat 7.2-STABLE FreeBSD 7.2-STABLE #5: Sun Nov 14 20:35:21 CET 2010     root@squeak.estat:/usr/obj/usr/src/sys/GENERIC  amd64
>Description:
HP ProLiant DL360 G6 server with an HP StorageWorks MSL4048 Tape Library

# grep ciss /boot/loader.conf 
hw.ciss.expose_hidden_physical=1


When the tunable hw.ciss.expose_hidden_physical is set at boot time, the kernel panics:

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address	= 0x8
fault code		= supervisor read data, page not present
instruction pointer	= 0x8:0xffffffff80201686
stack pointer	        = 0x10:0xffffff807c6ab930
frame pointer	        = 0x10:0x400
code segment		= base 0x0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags	= interrupt enabled, resume, IOPL = 0
current process		= 77 (sysctl)
trap number		= 12
panic: page fault
cpuid = 0
Uptime: 6s
Physical memory: 4073 MB
Dumping 1230 MB:

Backtrace from the core dump:

(kgdb) bt
#0  doadump () at pcpu.h:195
#1  0x0000000000000004 in ?? ()
#2  0xffffffff8054cff9 in boot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:418
#3  0xffffffff8054d402 in panic (fmt=0x104 <Address 0x104 out of bounds>)
    at /usr/src/sys/kern/kern_shutdown.c:574
#4  0xffffffff80812563 in trap_fatal (frame=0xffffff0003eb4390, eva=Variable "eva" is not available.
)
    at /usr/src/sys/amd64/amd64/trap.c:756
#5  0xffffffff80812935 in trap_pfault (frame=0xffffff807c6ab880, usermode=0)
    at /usr/src/sys/amd64/amd64/trap.c:672
#6  0xffffffff80813274 in trap (frame=0xffffff807c6ab880)
    at /usr/src/sys/amd64/amd64/trap.c:443
#7  0xffffffff807fd2ce in calltrap ()
    at /usr/src/sys/amd64/amd64/exception.S:218
#8  0xffffffff80201686 in acpi_child_pnpinfo_str_method (cbdev=Variable "cbdev" is not available.
)
    at /usr/src/sys/dev/acpica/acpi.c:850
#9  0xffffffff805753c9 in device_sysctl_handler (oidp=Variable "oidp" is not available.
)
    at /usr/src/sys/kern/subr_bus.c:260
#10 0xffffffff8055654f in sysctl_root (oidp=Variable "oidp" is not available.
)
    at /usr/src/sys/kern/kern_sysctl.c:1419
#11 0xffffffff805578c5 in userland_sysctl (td=0x0, name=0xffffff807c6abac0, 
    namelen=4, old=0x0, oldlenp=Variable "oldlenp" is not available.
) at /usr/src/sys/kern/kern_sysctl.c:1522
#12 0xffffffff80557ad2 in __sysctl (td=0xffffff0003eb4390, 
    uap=0xffffff807c6abbf0) at /usr/src/sys/kern/kern_sysctl.c:1449
#13 0xffffffff80812bb7 in syscall (frame=0xffffff807c6abc80)
    at /usr/src/sys/amd64/amd64/trap.c:899
#14 0xffffffff807fd4db in Xfast_syscall ()
    at /usr/src/sys/amd64/amd64/exception.S:339
#15 0x0000000800719cac in ?? ()
Previous frame inner to this frame (corrupt stack?)

Faulty instruction:
(kgdb) x/i 0xffffffff80201686
0xffffffff80201686 <acpi_child_pnpinfo_str_method+70>:  mov    0x8(%rbx),%edx

>How-To-Repeat:
With the same hardware, put hw.ciss.expose_hidden_physical=1 in loader.conf and reboot.
>Fix:
The last function called before the trap is acpi_child_pnpinfo_str_method in sys/dev/acpica/acpi.c:

static int
acpi_child_pnpinfo_str_method(device_t cbdev, device_t child, char *buf,
    size_t buflen)
{
    ACPI_BUFFER adbuf = {ACPI_ALLOCATE_BUFFER, NULL};
    ACPI_DEVICE_INFO *adinfo;
    struct acpi_device *dinfo = device_get_ivars(child);
    char *end;
    int error;

    error = AcpiGetObjectInfo(dinfo->ad_handle, &adbuf);
    adinfo = (ACPI_DEVICE_INFO *) adbuf.Pointer;
    if (error)
        snprintf(buf, buflen, "unknown");
    else
        snprintf(buf, buflen, "_HID=%s _UID=%lu",
                 (adinfo->Valid & ACPI_VALID_HID) ?
                 adinfo->HardwareId.Value : "none",
                 (adinfo->Valid & ACPI_VALID_UID) ?
                 strtoul(adinfo->UniqueId.Value, &end, 10) : 0);
    if (adinfo)
        AcpiOsFree(adinfo);

    return (0);
}

buf is filled in according to the value of "error".

I found adbuf.Pointer set to 0x0 while "error" was zero. The snprintf call therefore dereferences the adinfo struct with 0x0 as its base address, which is consistent with the fault virtual address 0x8 reported in the trap.
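
A minimal sketch of why a NULL adinfo produces a fault at address 0x8 rather than 0x0. The struct below is a hypothetical stand-in, not the real ACPI_DEVICE_INFO layout; it only assumes that the member read by the faulting instruction (mov 0x8(%rbx),%edx) is a 32-bit field located 8 bytes into the structure.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the start of the structure that adinfo
 * points to. The real layout lives in the ACPICA headers and is not
 * reproduced here; the only assumption is a 32-bit member at offset 8.
 */
struct fake_device_info {
    uint32_t type;   /* offset 0 */
    uint32_t name;   /* offset 4 */
    uint32_t valid;  /* offset 8 */
};

/* Address the CPU would touch when reading ->valid through 'base'. */
static uintptr_t
valid_field_address(uintptr_t base)
{
    return (base + offsetof(struct fake_device_info, valid));
}
```

With base set to 0 (a NULL adinfo), valid_field_address returns 0x8, the same value as the fault virtual address in the trap output.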

So "error" is not set correctly. The reason is in AcpiGetObjectInfo, in sys/contrib/dev/acpica/nsxfname.c:

Node = AcpiNsMapHandleToNode (Handle);
if (!Node)
{
    (void) AcpiUtReleaseMutex (ACPI_MTX_NAMESPACE);
    goto Cleanup;
}
(...)
Cleanup:
    ACPI_FREE (Info);
    if (CidList)
    {
        ACPI_FREE (CidList);
    }
    return (Status);

If AcpiNsMapHandleToNode fails, the mutex is released and control jumps to Cleanup:, which returns Status without updating it.
Status therefore still holds the value from the earlier AcpiUtAcquireMutex call, i.e. success, which is wrong.

Setting Status to AE_BAD_PARAMETER before jumping to Cleanup fixes the issue (I found that AE_BAD_PARAMETER is used elsewhere in the kernel in similar flows when AcpiNsMapHandleToNode is called).
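
The stale-status pattern can be reproduced outside the kernel. The following is only a sketch: every name in it (sketch_status, sketch_node, sketch_map_handle, get_info_buggy, get_info_fixed) is hypothetical and merely mimics the control flow described above; it is not ACPICA code.

```c
#include <stddef.h>

/* Hypothetical status codes standing in for ACPI_STATUS values. */
enum sketch_status { SKETCH_OK = 0, SKETCH_BAD_PARAMETER = 1 };

struct sketch_node { int id; };

static struct sketch_node known_node = { 42 };

/* Hypothetical lookup standing in for AcpiNsMapHandleToNode. */
static struct sketch_node *
sketch_map_handle(int handle)
{
    return (handle == known_node.id ? &known_node : NULL);
}

/* Buggy flow: a failed lookup jumps to cleanup with a stale success. */
static enum sketch_status
get_info_buggy(int handle, struct sketch_node **out)
{
    /* Stands for the result of an earlier step that succeeded. */
    enum sketch_status status = SKETCH_OK;
    struct sketch_node *node;

    *out = NULL;
    node = sketch_map_handle(handle);
    if (node == NULL)
        goto cleanup;           /* status is still SKETCH_OK here */
    *out = node;
cleanup:
    return (status);
}

/* Fixed flow: record the failure before jumping to cleanup. */
static enum sketch_status
get_info_fixed(int handle, struct sketch_node **out)
{
    enum sketch_status status = SKETCH_OK;
    struct sketch_node *node;

    *out = NULL;
    node = sketch_map_handle(handle);
    if (node == NULL) {
        status = SKETCH_BAD_PARAMETER;
        goto cleanup;
    }
    *out = node;
cleanup:
    return (status);
}
```

For an unknown handle, get_info_buggy reports success while leaving *out NULL, so a caller that trusts the return value dereferences NULL, as acpi_child_pnpinfo_str_method does. get_info_fixed reports the failure, and the caller takes its error branch instead.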

7.0 through 7.3 are affected; a patch is attached.

Hope I'm right :)

Patch attached with submission follows:

--- src/sys/contrib/dev/acpica/nsxfname.c.orig  2010-11-14 20:51:57.000000000 +0100
+++ src/sys/contrib/dev/acpica/nsxfname.c       2010-11-14 20:50:46.000000000 +0100
@@ -361,6 +361,7 @@
     if (!Node)
     {
         (void) AcpiUtReleaseMutex (ACPI_MTX_NAMESPACE);
+        Status = AE_BAD_PARAMETER;
         goto Cleanup;
     }
 


>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-bugs->freebsd-scsi 
Responsible-Changed-By: linimon 
Responsible-Changed-When: Mon Nov 15 13:15:23 UTC 2010 
Responsible-Changed-Why:  
Over to maintainer(s). 

http://www.freebsd.org/cgi/query-pr.cgi?pr=152250 

From: =?ISO-8859-1?Q?Lo=EFc_Pefferkorn?= <loic-freebsd@loicp.eu>
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/152250: [ciss] [patch] Kernel panic when hw.ciss.expose_hidden_physical
 is set
Date: Thu, 10 Mar 2011 21:04:24 +0100

 Hello,
 
 Is any information missing or unclear?
 If my report is not usable, please tell me how to improve it :)
 
 Regards,
 Loïc
Responsible-Changed-From-To: freebsd-scsi->sbruno 
Responsible-Changed-By: sbruno 
Responsible-Changed-When: Sat Jan 12 01:24:52 UTC 2013 
Responsible-Changed-Why:  
Taking ticket as this is in my universe ish 

From: Sean Bruno <seanbru@yahoo-inc.com>
To: bug-followup@FreeBSD.org, loic-freebsd@loicp.eu
Cc:  
Subject: Re: kern/152250: [ciss] [patch] Kernel panic when
 hw.ciss.expose_hidden_physical is set
Date: Fri, 11 Jan 2013 17:27:59 -0800

 This looks correct to me, but the ticket is misfiled as "ciss" and not
 "acpi" because of gnats.
 
 Sean
 

From: =?ISO-8859-1?Q?Lo=EFc_Pefferkorn?= <loic-freebsd@loicp.eu>
To: bug-followup@FreeBSD.org, loic-freebsd@loicp.eu
Cc:  
Subject: Re: kern/152250: [acpi] [patch] Kernel panic when hw.ciss.expose_hidden_physical
 is set
Date: Tue, 02 Apr 2013 23:32:31 +0200

 Hello Sean,
 
 Category has been changed to acpi.
 
 Cheers,
 Loic

From: Sean Bruno <seanwbruno@gmail.com>
To: bug-followup@FreeBSD.org, loic-freebsd@loicp.eu
Cc:  
Subject: Re: kern/152250: [acpi] [patch] Kernel panic when
 hw.ciss.expose_hidden_physical is set
Date: Wed, 03 Apr 2013 14:48:24 -0700

 This is only applicable to stable/7 so I'll go ahead and commit this as
 it is appropriate.
 
 stable/8 and newer use different code paths and this problem does not
 exist.
 
 Sean
 

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/152250: commit references a PR
Date: Wed,  3 Apr 2013 23:11:33 +0000 (UTC)

 Author: sbruno
 Date: Wed Apr  3 23:11:15 2013
 New Revision: 249073
 URL: http://svnweb.freebsd.org/changeset/base/249073
 
 Log:
   Resolve kernel panic that occurs on callback from sysctl when setting
   hw.ciss.expose_hidden_physical=1 on a HP ProLiant DL360 G6 (and possibly
   others) due to mishandling of error value in acpica on stable/7
   
   Note that this is a direct commit as this code has been fixed in stable/8
   (8.4 included) and higher release for quite some time.
   
   PR:	kern/152250
   Submitted by:	Loic Pefferkorn <loic-freebsd@loicp.eu>
   Reviewed by:	avg@
 
 Modified:
   stable/7/sys/contrib/dev/acpica/nsxfname.c
 
 Modified: stable/7/sys/contrib/dev/acpica/nsxfname.c
 ==============================================================================
 --- stable/7/sys/contrib/dev/acpica/nsxfname.c	Wed Apr  3 22:37:40 2013	(r249072)
 +++ stable/7/sys/contrib/dev/acpica/nsxfname.c	Wed Apr  3 23:11:15 2013	(r249073)
 @@ -361,6 +361,7 @@ AcpiGetObjectInfo (
      if (!Node)
      {
          (void) AcpiUtReleaseMutex (ACPI_MTX_NAMESPACE);
 +        Status = AE_BAD_PARAMETER;
          goto Cleanup;
      }
  
 
State-Changed-From-To: open->closed 
State-Changed-By: sbruno 
State-Changed-When: Thu Apr 4 15:12:13 UTC 2013 
State-Changed-Why:  
Ticket is resolved on stable/7 as of svn r249073 and does not apply to any 
currently supported branch. 
>Unformatted:
