From nobody@FreeBSD.org  Mon Apr 15 21:47:33 2013
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1])
	by hub.freebsd.org (Postfix) with ESMTP id CC1E9AC6
	for <freebsd-gnats-submit@FreeBSD.org>; Mon, 15 Apr 2013 21:47:33 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from red.freebsd.org (red.freebsd.org [IPv6:2001:4f8:fff6::22])
	by mx1.freebsd.org (Postfix) with ESMTP id BC7C91CFC
	for <freebsd-gnats-submit@FreeBSD.org>; Mon, 15 Apr 2013 21:47:33 +0000 (UTC)
Received: from red.freebsd.org (localhost [127.0.0.1])
	by red.freebsd.org (8.14.5/8.14.5) with ESMTP id r3FLlV6H054206
	for <freebsd-gnats-submit@FreeBSD.org>; Mon, 15 Apr 2013 21:47:31 GMT
	(envelope-from nobody@red.freebsd.org)
Received: (from nobody@localhost)
	by red.freebsd.org (8.14.5/8.14.5/Submit) id r3FLlVjh054205;
	Mon, 15 Apr 2013 21:47:31 GMT
	(envelope-from nobody)
Message-Id: <201304152147.r3FLlVjh054205@red.freebsd.org>
Date: Mon, 15 Apr 2013 21:47:31 GMT
From: Joe Holden <joe@rewt.org.uk>
To: freebsd-gnats-submit@FreeBSD.org
Subject: kernel stack overflow panic on mips64, EdgeRouter Lite
X-Send-Pr-Version: www-3.1
X-GNATS-Notify:

>Number:         177876
>Category:       kern
>Synopsis:       [mips] kernel stack overflow panic on mips64, EdgeRouter Lite
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-mips
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Mon Apr 15 21:50:00 UTC 2013
>Closed-Date:    
>Last-Modified:  Wed May 14 14:50:00 UTC 2014
>Originator:     Joe Holden
>Release:        FreeBSD 10.0-CURRENT #4 r249363
>Organization:
Pseudo Networks Limited
>Environment:
FreeBSD erl1 10.0-CURRENT FreeBSD 10.0-CURRENT #4 r249363: Mon Apr 15 07:55:44 BST 2013     root@abby.lhr1.as41113.net:/usr/obj/mips.mips64/usr/src/sys/OCTEON-ERL  mips
>Description:
Under heavy CPU load, after an hour or so, FreeBSD panics and reboots with the following:

panic: kernel stack overflow - trapframe at 0xffffffff80835eb0
cpuid = 0

The panic is identical every time and only happens under heavy CPU load, which may well occur in the field.

Ideally I'd like someone to advise me on how to debug the problem; failing that, it should be easy enough to reproduce.

It may well be a hardware issue as these boxes do run very hot, bordering on thermal design limits.
>How-To-Repeat:
Start compiling a port.
>Fix:


>Release-Note:
>Audit-Trail:

From: Joe Holden <joe@rewt.org.uk>
To: bug-followup@FreeBSD.org, joe@rewt.org.uk
Cc:  
Subject: Re: misc/177876: kernel stack overflow panic on mips64, EdgeRouter
 Lite
Date: Mon, 15 Apr 2013 23:42:19 +0100

 After rebuilding the kernel with more debugging enabled and bringing my tree up to
 HEAD, the kernel consistently drops to the db> prompt on boot:
 
 FreeBSD 10.0-CURRENT #5 r249529: Mon Apr 15 23:28:04 BST 2013
     root@abby.lhr1.as41113.net:/usr/obj/mips.mips64/usr/src/sys/OCTEON-ERL mips
 gcc version 4.2.1 20070831 patched [FreeBSD]
 cpu:0-Trap cause = 2 (TLB miss (load or instr. fetch) - kernel mode)
 [ thread pid 0 tid 0 ]
 Stopped at      0xffffffff80268bdc:     lb      v0,0(s2)
 db> bt
 Tracing pid 0 tid 0 td 0xffffffff808a25f0
 ffffffff8067fd98+40 (?,?,?,?) ra ffffffff8013ba24 sp 98000000009305b0 sz 16
 ffffffff8013b898+18c (0,?,ffffffff,?) ra ffffffff8013b068 sp 98000000009305c0 sz 48
 ffffffff8013abe0+488 (?,?,?,?) ra ffffffff8013b334 sp 98000000009305f0 sz 192
 ffffffff8013b240+f4 (?,?,?,?) ra ffffffff8013ed88 sp 98000000009306b0 sz 16
 ffffffff8013ebe0+1a8 (?,?,?,?) ra ffffffff8031e180 sp 98000000009306c0 sz 816
 ffffffff8031dfe0+1a0 (?,?,?,?) ra ffffffff80696238 sp 98000000009309f0 sz 64
 trap+18a0 (?,?,?,?) ra ffffffff8068104c sp 9800000000930a30 sz 288
 MipsKernGenException+15c (0,bef000,ffffffff,3) ra ffffffff80268bdc sp 9800000000930b50 sz 368
 ffffffff80268b70+6c (?,?,?,?) ra ffffffff80248754 sp 9800000000930cc0 sz 48
 ffffffff80248548+20c (?,?,?,?) ra ffffffff80100134 sp 9800000000930cf0 sz 32
 ffffffff80100080+b4 (?,?,?,?) ra 0 sp 9800000000930d10 sz 0
 pid 0
 db>
 
 Not sure if that means anything to anyone!
 
 Cheers,
 Joe
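
A note on reading the backtrace above: each frame line ends with the stack pointer ("sp") and the frame size in bytes ("sz"), so the stack depth at the time of the trap can be added up directly. The following is a minimal Python sketch of that arithmetic, assuming the frame lines are copied from the trace with the mail-wrapped lines rejoined. The regular expression and the helper names (parse_frames, stack_usage) are illustrative and not part of any FreeBSD tool. Note that this trace comes from the TLB miss at boot, not from the "kernel stack overflow" panic, so the total only shows how to read the numbers.

```python
import re

# One ddb frame line looks like:
#   trap+18a0 (?,?,?,?) ra ffffffff8068104c sp 9800000000930a30 sz 288
FRAME_RE = re.compile(
    r"^\s*(?P<func>\S+)\s+\((?P<args>[^)]*)\)\s+ra\s+(?P<ra>[0-9a-f]+)"
    r"\s+sp\s+(?P<sp>[0-9a-f]+)\s+sz\s+(?P<sz>\d+)\s*$"
)

# Frame lines from the backtrace in this PR, wrapped lines rejoined.
BACKTRACE = """\
ffffffff8067fd98+40 (?,?,?,?) ra ffffffff8013ba24 sp 98000000009305b0 sz 16
ffffffff8013b898+18c (0,?,ffffffff,?) ra ffffffff8013b068 sp 98000000009305c0 sz 48
ffffffff8013abe0+488 (?,?,?,?) ra ffffffff8013b334 sp 98000000009305f0 sz 192
ffffffff8013b240+f4 (?,?,?,?) ra ffffffff8013ed88 sp 98000000009306b0 sz 16
ffffffff8013ebe0+1a8 (?,?,?,?) ra ffffffff8031e180 sp 98000000009306c0 sz 816
ffffffff8031dfe0+1a0 (?,?,?,?) ra ffffffff80696238 sp 98000000009309f0 sz 64
trap+18a0 (?,?,?,?) ra ffffffff8068104c sp 9800000000930a30 sz 288
MipsKernGenException+15c (0,bef000,ffffffff,3) ra ffffffff80268bdc sp 9800000000930b50 sz 368
ffffffff80268b70+6c (?,?,?,?) ra ffffffff80248754 sp 9800000000930cc0 sz 48
ffffffff80248548+20c (?,?,?,?) ra ffffffff80100134 sp 9800000000930cf0 sz 32
ffffffff80100080+b4 (?,?,?,?) ra 0 sp 9800000000930d10 sz 0
"""


def parse_frames(text):
    """Return (function, sp, sz) tuples for every frame line in text."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.match(line)
        if m:
            frames.append((m.group("func"), int(m.group("sp"), 16), int(m.group("sz"))))
    return frames


def stack_usage(frames):
    """Return (sum of frame sizes, distance between innermost and outermost sp)."""
    total = sum(sz for _, _, sz in frames)
    span = frames[-1][1] - frames[0][1]
    return total, span


frames = parse_frames(BACKTRACE)
total, span = stack_usage(frames)
print(f"{len(frames)} frames, {total} bytes by sz, {span} bytes by sp span")
# prints "11 frames, 1888 bytes by sz, 1888 bytes by sp span"
```

The two figures agree, which is a quick sanity check that no frame line was lost when the trace was copied. The largest single frame in this trace is 816 bytes.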
Responsible-Changed-From-To: freebsd-bugs->freebsd-mips 
Responsible-Changed-By: linimon 
Responsible-Changed-When: Sun Apr 21 19:29:06 UTC 2013 
Responsible-Changed-Why:  
Over to maintainer(s). 

http://www.freebsd.org/cgi/query-pr.cgi?pr=177876 

From: Joe Holden <joe@rewt.org.uk>
To: bug-followup@FreeBSD.org, joe@rewt.org.uk
Cc:  
Subject: Re: kern/177876: [mips] kernel stack overflow panic on mips64, EdgeRouter
 Lite
Date: Mon, 22 Apr 2013 03:58:00 +0100

 So the TLB miss problem was fixed by Warner, but since about then the
 following happens when booting (from either NFS or USB) with a completely
 fresh world and src tree and no special make options or optimisations:
 
 Kernel config: http://sprunge.us/EVjO
 
 Trying to mount root from nfs: []...
 NFS ROOT: 172.16.8.3:/nfs/bsd/fbsd/erl
 warning: no time-of-day clock registered, system time will not be set accurately
 warning: no time-of-day clock registered, system time will not be set accurately
 start_init: trying /sbin/init
 Cannot map anonymous memory
 Out of memory
 Enter full pathname of shell or RETURN for /bin/sh:
 Cannot map anonymous memory
 Out of memory
 Cannot map anonymous memory
 Out of memory
 Enter full pathname of shell or RETURN for /bin/sh:
 
 Usual procedure to cross-build from amd64:
 
 make buildworld buildkernel KERNCONF=OCTEON-ERL TARGET=mips \
     TARGET_ARCH=mips64 TARGET_CPUTYPE=octeon \
     WITHOUT_MODULES="cxgbe mwlfw mwl ralfw ral runfw run"
 
 src.conf just contains NO_FSCHG=

From: Stacey Son <sson@me.com>
To: bug-followup@FreeBSD.org, joe@rewt.org.uk
Cc:  
Subject: Re: kern/177876: [mips] kernel stack overflow panic on mips64,
 EdgeRouter Lite
Date: Wed, 14 May 2014 10:41:13 -0400

 The following set of patches increases the kernel thread stack size to 16K by using a 16K page size for just the kernel stack. Unlike my previous patch set, it doesn't require additional wired TLB entries. I have been using this patch set for a few months on my ERL, with an NFS mount, for 'buildworld' and for port building and have not seen the 'kernel stack overflow' panic. It does add a bit of MIPS64-dependent code in the VM layer. Maybe this should be moved to the pmap layer at some point.
 
 The patch set:
 
 http://people.freebsd.org/~sson/mips/kstack/kstack_large_page_1.diff
 http://people.freebsd.org/~sson/mips/kstack/kstack_large_page_2.diff
 http://people.freebsd.org/~sson/mips/kstack/kstack_large_page_3.diff
 
 or one large patch:
 
 http://people.freebsd.org/~sson/mips/kstack/kstack_large_page.diff
 
 To enable it, add "option KSTACK_LARGE_PAGE" to the kernel conf file.
 
 -stacey. (sson@)
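
The page arithmetic behind the patch description above can be sketched as follows. This is only an illustration in Python: the 16K stack and 16K page size come from the message, while the 4K base page size is an assumption that this PR does not state, and pages_needed is a hypothetical helper name.

```python
def pages_needed(stack_bytes, page_bytes):
    """Number of pages of page_bytes needed to cover stack_bytes (ceiling division)."""
    return -(-stack_bytes // page_bytes)


KSTACK_BYTES = 16 * 1024      # stack size stated in the message above
LARGE_PAGE_BYTES = 16 * 1024  # page size stated in the message above
BASE_PAGE_BYTES = 4 * 1024    # ASSUMPTION: base page size, not stated in this PR

base_pages = pages_needed(KSTACK_BYTES, BASE_PAGE_BYTES)
large_pages = pages_needed(KSTACK_BYTES, LARGE_PAGE_BYTES)
print(f"{base_pages} base pages vs {large_pages} large page")
```

Under that assumption the same stack is covered by one large page instead of several base pages, which is consistent with the statement that no additional wired TLB entries are needed.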
>Unformatted:
