From nobody@FreeBSD.org  Sun Jul 27 13:43:58 2008
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 9EDCE106564A
	for <freebsd-gnats-submit@FreeBSD.org>; Sun, 27 Jul 2008 13:43:58 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (www.freebsd.org [IPv6:2001:4f8:fff6::21])
	by mx1.freebsd.org (Postfix) with ESMTP id 925B78FC14
	for <freebsd-gnats-submit@FreeBSD.org>; Sun, 27 Jul 2008 13:43:58 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.14.2/8.14.2) with ESMTP id m6RDhw0t078201
	for <freebsd-gnats-submit@FreeBSD.org>; Sun, 27 Jul 2008 13:43:58 GMT
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.14.2/8.14.1/Submit) id m6RDhwXQ078199;
	Sun, 27 Jul 2008 13:43:58 GMT
	(envelope-from nobody)
Message-Id: <200807271343.m6RDhwXQ078199@www.freebsd.org>
Date: Sun, 27 Jul 2008 13:43:58 GMT
From: Kirk Strauser <kirk@strauser.com>
To: freebsd-gnats-submit@FreeBSD.org
Subject: Installing java/diablo-jdk15 crashes an amd64 machine
X-Send-Pr-Version: www-3.1
X-GNATS-Notify:

>Number:         126002
>Category:       kern
>Synopsis:       [panic] Installing java/diablo-jdk15 crashes an amd64 machine
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    jhb
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sun Jul 27 13:50:01 UTC 2008
>Closed-Date:    Fri Aug 01 18:19:37 UTC 2008
>Last-Modified:  Fri Aug 01 18:19:37 UTC 2008
>Originator:     Kirk Strauser
>Release:        7.0-STABLE from 2008-07-25
>Organization:
The Strauser Group
>Environment:
FreeBSD kanga.honeypot.net 7.0-STABLE FreeBSD 7.0-STABLE #0: Fri Jul 25 22:27:10 CDT 2008     root@kanga.honeypot.net:/usr/obj/usr/src/sys/KANGA  amd64

>Description:
When installing the java/diablo-jdk15 port, my amd64 machine crashes with:

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address  = 0x0
fault code             = supervisor write data, page not present
[...]
current process        = 52615 (zsh)
trap number            = 12
panic: page fault
cpuid = 1

This server isn't near a keyboard, so I literally took a picture of the screen and transcribed it.  The snapshot and/or more details are available if needed.

My kernel config is very minimal:

include         GENERIC
ident           KANGA
option          PMAP_SHPGPERPROC=400

In make.conf, I set CPUTYPE?=core2 and no other compiler flags.  Basically, I did a fresh install of FreeBSD 7.0 and csup'ed /usr/src to get my re(4) NICs working.

As this is a replacement for an older x86 system, I'd moved /var/db/pkg, installed portupgrade, and used portupgrade -fa to reinstall every port that'd been on the old system.  I did *not* move over any libraries or binaries from the old system.  Every executable on this system was originally built on it.
>How-To-Repeat:
cd /usr/ports/java/diablo-jdk15; make install

If TZUPDATE is set, then the port gets to "Updating time zones..." and then hangs.  If TZUPDATE is unset, it gets further.  I accidentally closed the shell window that showed where it hung, but I can reproduce this if needed.
>Fix:


>Release-Note:
>Audit-Trail:

From: Kris Kennaway <kris@FreeBSD.org>
To: Kirk Strauser <kirk@strauser.com>
Cc: freebsd-gnats-submit@FreeBSD.org
Subject: Re: kern/126002: Installing java/diablo-jdk15 crashes an amd64 machine
Date: Sun, 27 Jul 2008 16:29:30 +0200

 Kirk Strauser wrote:
 > Fatal trap 12: page fault while in kernel mode
 > cpuid = 1; apic id = 01
 > fault virtual address  = 0x0
 > fault code             = supervisor write data, page not present
 > [...]
 > current process        = 52615 (zsh)
 > trap number            = 12
 > panic: page fault
 > cpuid = 1
 > 
 > This server isn't near a keyboard, so I literally took a picture of the screen and transcribed it.  The snapshot and/or more details are available if needed.
 
 We need the backtrace.
 
 Kris
 

From: Kirk Strauser <kirk@strauser.com>
To: Kris Kennaway <kris@FreeBSD.org>, freebsd-gnats-submit@FreeBSD.org
Cc:  
Subject: Re: kern/126002: Installing java/diablo-jdk15 crashes an amd64 machine
Date: Sun, 27 Jul 2008 10:09:45 -0500

 Kris Kennaway wrote:
 
 > We need the backtrace.
 
 That might be problematic.  The panic screen ends with "Dumping 399 MB:" 
 but never moves past that point.  I'll see what I can coerce from it.
 -- 
 Kirk Strauser

From: Kris Kennaway <kris@FreeBSD.org>
To: Kirk Strauser <kirk@strauser.com>
Cc: freebsd-gnats-submit@FreeBSD.org
Subject: Re: kern/126002: Installing java/diablo-jdk15 crashes an amd64 machine
Date: Sun, 27 Jul 2008 17:36:12 +0200

 Kirk Strauser wrote:
 > Kris Kennaway wrote:
 > 
 >> We need the backtrace.
 > 
 > That might be problematic.  The panic screen ends with "Dumping 399 MB:" 
 > but never moves past that point.  I'll see what I can coerce from it.
 
 OK, if you can replicate the problem then try with DDB enabled instead 
 of KDB_UNATTENDED.  If you cannot replicate it and have no backtrace 
 there is unfortunately nothing we can do with this PR.
 
 Kris
State-Changed-From-To: open->feedback 
State-Changed-By: gavin 
State-Changed-When: Sun Jul 27 15:55:56 UTC 2008 
State-Changed-Why:  
Also, if you still have the camera image and can transcribe the 
information you missed out, that may be of use.  Take the value 
of "instruction pointer" and use: 
addr2line -e /boot/kernel/kernel.symbols 0x<address> 
This might give enough information to begin to guess what 
caused this.  If you can recreate the panic, though, Kris's 
suggestion of recompiling with DDB and obtaining a backtrace 
is best. 
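
A minimal sketch of the lookup described above, assuming the transcribed 
panic line has the usual "segment:offset" form.  The sample value is the 
instruction pointer reported later in this PR, and the variable names are 
illustrative only.  addr2line wants just the offset after the colon, and it 
only gives a meaningful answer against the kernel.symbols file that matches 
the kernel that panicked, so that call is left commented out. 

```shell
#!/bin/sh
# Sample line as printed in a panic message (value taken from this PR's later trace).
line='instruction pointer	= 0x8:0xffffffff8044ea20'

# Strip everything up to and including the last colon, leaving only the offset.
addr=${line##*:}
echo "$addr"

# On the affected machine, with matching symbols installed (not run here):
# addr2line -e /boot/kernel/kernel.symbols "$addr"
```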

http://www.freebsd.org/cgi/query-pr.cgi?pr=126002 

From: Kirk Strauser <kirk@strauser.com>
To: Kris Kennaway <kris@FreeBSD.org>
Cc: freebsd-gnats-submit@FreeBSD.org
Subject: Re: kern/126002: Installing java/diablo-jdk15 crashes an amd64 machine
Date: Sun, 27 Jul 2008 15:47:56 -0500

 Here's what I came up with:
 
 $ sudo kgdb kernel.debug /var/tmp/vmcore.0
 GNU gdb 6.1.1 [FreeBSD]
 Copyright 2004 Free Software Foundation, Inc.
 GDB is free software, covered by the GNU General Public License, and you are
 welcome to change it and/or distribute copies of it under certain conditions.
 Type "show copying" to see the conditions.
 There is absolutely no warranty for GDB.  Type "show warranty" for details.
 This GDB was configured as "amd64-marcel-freebsd"...
 
 Unread portion of the kernel message buffer:
 
 
 Fatal trap 12: page fault while in kernel mode
 cpuid = 0; apic id = 00
 fault virtual address	= 0x8
 fault code		= supervisor write data, page not present
 instruction pointer	= 0x8:0xffffffff8044ea20
 stack pointer	        = 0x10:0xffffffffafe239d0
 frame pointer	        = 0x10:0xffffff002350ea20
 code segment		= base 0x0, limit 0xfffff, type 0x1b
 			= DPL 0, pres 1, long 1, def32 0, gran 1
 processor eflags	= interrupt enabled, resume, IOPL = 0
 current process		= 2558 (make)
 trap number		= 12
 panic: page fault
 cpuid = 0
 Uptime: 28m24s
 Physical memory: 2034 MB
 Dumping 324 MB: 309 293 277 261 245 229 213 197 181 165 149 133 117 101 85 69 53 37 21 5
 
 Reading symbols from /boot/kernel/tmpfs.ko...Reading symbols from /boot/kernel/tmpfs.ko.symbols...done.
 done.
 Loaded symbols for /boot/kernel/tmpfs.ko
 Reading symbols from /boot/kernel/pflog.ko...Reading symbols from /boot/kernel/pflog.ko.symbols...done.
 done.
 Loaded symbols for /boot/kernel/pflog.ko
 Reading symbols from /boot/kernel/pf.ko...Reading symbols from /boot/kernel/pf.ko.symbols...done.
 done.
 Loaded symbols for /boot/kernel/pf.ko
 Reading symbols from /boot/kernel/linux.ko...Reading symbols from /boot/kernel/linux.ko.symbols...done.
 done.
 Loaded symbols for /boot/kernel/linux.ko
 Reading symbols from /boot/kernel/nullfs.ko...Reading symbols from /boot/kernel/nullfs.ko.symbols...done.
 done.
 Loaded symbols for /boot/kernel/nullfs.ko
 Reading symbols from /boot/kernel/fdescfs.ko...Reading symbols from /boot/kernel/fdescfs.ko.symbols...done.
 done.
 Loaded symbols for /boot/kernel/fdescfs.ko
 Reading symbols from /boot/kernel/accf_http.ko...Reading symbols from /boot/kernel/accf_http.ko.symbols...done.
 done.
 Loaded symbols for /boot/kernel/accf_http.ko
 #0  doadump () at pcpu.h:194
 194		__asm __volatile("movq %%gs:0,%0" : "=r" (td));
 (kgdb) list *0xffffffff8044ea20
 0xffffffff8044ea20 is in cpuset_rel (atomic.h:166).
 161	 */
 162	static __inline u_int
 163	atomic_fetchadd_int(volatile u_int *p, u_int v)
 164	{
 165	
 166		__asm __volatile(
 167		"	" MPLOCKED "		"
 168		"	xaddl	%0, %1 ;	"
 169		"# atomic_fetchadd_int"
 170		: "+r" (v),			/* 0 (result) */
 (kgdb) backtrace
 #0  doadump () at pcpu.h:194
 #1  0x0000000000000004 in ?? ()
 #2  0xffffffff8047e591 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418
 #3  0xffffffff8047e9c2 in panic (fmt=0x104 <Address 0x104 out of bounds>) at /usr/src/sys/kern/kern_shutdown.c:572
 #4  0xffffffff8072ca4a in trap_fatal (frame=0xffffff0003fdf000, eva=18446742974791723232) at /usr/src/sys/amd64/amd64/trap.c:724
 #5  0xffffffff8072cdf1 in trap_pfault (frame=0xffffffffafe23920, usermode=0) at /usr/src/sys/amd64/amd64/trap.c:641
 #6  0xffffffff8072d6af in trap (frame=0xffffffffafe23920) at /usr/src/sys/amd64/amd64/trap.c:410
 #7  0xffffffff807142ee in calltrap () at /usr/src/sys/amd64/amd64/exception.S:169
 #8  0xffffffff8044ea20 in cpuset_rel (set=0x0) at atomic.h:166
 #9  0xffffffff8048aa6c in thread_free (td=0xffffff00237a4360) at /usr/src/sys/kern/kern_thread.c:344
 #10 0xffffffff8048ab51 in thread_reap () at /usr/src/sys/kern/kern_thread.c:308
 #11 0xffffffff8045da4b in kern_wait (td=0xffffffffafe23b3c, pid=Variable "pid" is not available.
 ) at /usr/src/sys/kern/kern_exit.c:787
 #12 0xffffffff8045e039 in wait4 (td=Variable "td" is not available.
 ) at /usr/src/sys/kern/kern_exit.c:654
 #13 0xffffffff8072d05c in syscall (frame=0xffffffffafe23c70) at /usr/src/sys/amd64/amd64/trap.c:852
 #14 0xffffffff807144fb in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:290
 #15 0x000000000041ee7c in ?? ()
 Previous frame inner to this frame (corrupt stack?)
 (kgdb) quit
 
 Is this enough information?  I'm a kgdb newbie and not too sure what  
 is needed.  I'll happily provide anything else you need.
 
 
 Side note 1 ----
 
 I was not able to save a crash dump to either of these drives:
 
 ad0: 715403MB <Hitachi HDS721075KLA330 GK8OA70M> at ata0-master SATA300
 ad8: 286167MB <WDC WD3000JB-00KFA0 08.05J08> at ata4-master UDMA100
 
 It would start writing, but each 16MB chunk seemed to be exponentially  
 slower than the last, so that after 5 or 6 chunks it was effectively  
 hung.   I ended up dumping to a USB keychain drive.
 
 
 Side note 2 ----
 
 The dumpon(8) man page says:
 
       The dumpon utility will refuse to enable a dump device which is smaller
       than the total amount of physical memory as reported by the hw.physmem
       sysctl(8) variable.
 
 I made a crash dump of a 2GB RAM system to a 512MB flash drive which  
 dumpon happily enabled.
State-Changed-From-To: feedback->open 
State-Changed-By: kris 
State-Changed-When: Sun Jul 27 21:04:02 UTC 2008 
State-Changed-Why:  
This looks like a bug in the recent cpuset MFC 


Responsible-Changed-From-To: freebsd-bugs->jhb 
Responsible-Changed-By: kris 
Responsible-Changed-When: Sun Jul 27 21:04:02 UTC 2008 
Responsible-Changed-Why:  
This looks like a bug in the recent cpuset MFC 

From: Kris Kennaway <kris@FreeBSD.org>
To: Kirk Strauser <kirk@strauser.com>
Cc: freebsd-gnats-submit@FreeBSD.org
Subject: Re: kern/126002: Installing java/diablo-jdk15 crashes an amd64 machine
Date: Sun, 27 Jul 2008 23:04:00 +0200

 Kirk Strauser wrote:
 
 > (kgdb) backtrace
 > #0  doadump () at pcpu.h:194
 > #1  0x0000000000000004 in ?? ()
 > #2  0xffffffff8047e591 in boot (howto=260) at 
 > /usr/src/sys/kern/kern_shutdown.c:418
 > #3  0xffffffff8047e9c2 in panic (fmt=0x104 <Address 0x104 out of 
 > bounds>) at /usr/src/sys/kern/kern_shutdown.c:572
 > #4  0xffffffff8072ca4a in trap_fatal (frame=0xffffff0003fdf000, 
 > eva=18446742974791723232) at /usr/src/sys/amd64/amd64/trap.c:724
 > #5  0xffffffff8072cdf1 in trap_pfault (frame=0xffffffffafe23920, 
 > usermode=0) at /usr/src/sys/amd64/amd64/trap.c:641
 > #6  0xffffffff8072d6af in trap (frame=0xffffffffafe23920) at 
 > /usr/src/sys/amd64/amd64/trap.c:410
 > #7  0xffffffff807142ee in calltrap () at 
 > /usr/src/sys/amd64/amd64/exception.S:169
 > #8  0xffffffff8044ea20 in cpuset_rel (set=0x0) at atomic.h:166
 > #9  0xffffffff8048aa6c in thread_free (td=0xffffff00237a4360) at 
 > /usr/src/sys/kern/kern_thread.c:344
 > #10 0xffffffff8048ab51 in thread_reap () at 
 > /usr/src/sys/kern/kern_thread.c:308
 > #11 0xffffffff8045da4b in kern_wait (td=0xffffffffafe23b3c, pid=Variable 
 > "pid" is not available.
 > ) at /usr/src/sys/kern/kern_exit.c:787
 > #12 0xffffffff8045e039 in wait4 (td=Variable "td" is not available.
 > ) at /usr/src/sys/kern/kern_exit.c:654
 > #13 0xffffffff8072d05c in syscall (frame=0xffffffffafe23c70) at 
 > /usr/src/sys/amd64/amd64/trap.c:852
 > #14 0xffffffff807144fb in Xfast_syscall () at 
 > /usr/src/sys/amd64/amd64/exception.S:290
 > #15 0x000000000041ee7c in ?? ()
 > Previous frame inner to this frame (corrupt stack?)
 > (kgdb) quit
 
 Thanks, that is great.  It at least tells me whose bug it is :)
 
 Kris
State-Changed-From-To: open->closed 
State-Changed-By: jhb 
State-Changed-When: Fri Aug 1 18:19:26 UTC 2008 
State-Changed-Why:  
Fixed by kib@. 
>Unformatted:
