From jdc@parodius.com  Mon Nov 26 02:29:18 2007
Return-Path: <jdc@parodius.com>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 927F216A417
	for <FreeBSD-gnats-submit@freebsd.org>; Mon, 26 Nov 2007 02:29:18 +0000 (UTC)
	(envelope-from jdc@parodius.com)
Received: from mx01.sc1.parodius.com (mx01.sc1.parodius.com [72.20.106.3])
	by mx1.freebsd.org (Postfix) with ESMTP id 8555513C448
	for <FreeBSD-gnats-submit@freebsd.org>; Mon, 26 Nov 2007 02:29:18 +0000 (UTC)
	(envelope-from jdc@parodius.com)
Received: by mx01.sc1.parodius.com (Postfix, from userid 1000)
	id DBBF11CC07B; Sun, 25 Nov 2007 18:29:14 -0800 (PST)
Message-Id: <20071126022914.DBBF11CC07B@mx01.sc1.parodius.com>
Date: Sun, 25 Nov 2007 18:29:14 -0800 (PST)
From: Jeremy Chadwick <koitsu@FreeBSD.org>
Reply-To: Jeremy Chadwick <koitsu@FreeBSD.org>
To: FreeBSD-gnats-submit@freebsd.org
Cc:
Subject: savecore never finding kernel core dumps (rcorder problem)
X-Send-Pr-Version: 3.113
X-GNATS-Notify:

>Number:         118255
>Category:       conf
>Synopsis:       savecore never finding kernel core dumps (rcorder problem)
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-rc
>State:          feedback
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Mon Nov 26 02:30:01 UTC 2007
>Closed-Date:    
>Last-Modified:  Thu Oct 25 19:36:16 UTC 2012
>Originator:     Jeremy Chadwick
>Release:        FreeBSD 6.3-PRERELEASE i386
>Organization:
>Environment:
System: FreeBSD eos.sc1.parodius.com 6.3-PRERELEASE FreeBSD 6.3-PRERELEASE #0: Wed Nov 7 13:14:24 PST 2007 root@eos.sc1.parodius.com:/usr/obj/usr/src/sys/EOS i386
>Description:
	One of our production systems has begun panicking for reasons unknown;
	we're in the process of figuring out why that's happening.  Separately,
	none of the resulting kernel core dumps (which are written to disk when
	doing "panic" from DDB) are being saved to /var/crash when savecore runs.

	Details of our configuration and what actually happens were posted to
	freebsd-stable.  It shows that a kernel core dump is indeed written to the
	correct device (/dev/ad0s1b), but savecore never detects the cores:

	http://lists.freebsd.org/pipermail/freebsd-stable/2007-November/038069.html
	http://lists.freebsd.org/pipermail/freebsd-stable/2007-November/038569.html

	I believe the problem is that /etc/rc.d/swap1 (which does `swapon -a`) is
	being called _before_ /etc/rc.d/savecore, thus overwriting any core dumps
	that exist.  See the second URL above for some additional details.

	I'm marking this serious/medium because being able to retrieve vmcore
	images after a kernel panic is important.  :-)
>How-To-Repeat:
	Set dumpdev and dumpdir in /etc/rc.conf, panic the system from DDB, and
	check /var/crash after the reboot: no vmcore has been saved.
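
	For illustration, the /etc/rc.conf lines involved would look roughly like
	the following.  The device and directory are the ones named in the
	description; they are specific to this machine and would need adjusting
	elsewhere:

```shell
dumpdev="/dev/ad0s1b"   # swap device the kernel writes the dump to
dumpdir="/var/crash"    # directory where savecore(8) should place vmcore files
```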
>Fix:
	I believe the issue can be fixed by adjusting some of the rcorder(8) values
	so that savecore gets run *before* swap1. I'm not familiar with what needs to
	be changed to make this work.
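
	As a starting point, here is a minimal sketch for inspecting the current
	ordering.  It assumes a stock /etc/rc.d layout; the helper names rc_deps
	and rc_position are made up for illustration, while rcorder(8) and grep(1)
	do the actual work.  It only reports the order and does not change it:

```shell
# Show the rcorder(8) dependency headers of the given rc.d scripts.
rc_deps() {
    grep -E '^# (PROVIDE|REQUIRE|BEFORE|KEYWORD):' "$@"
}

# Print where swap1 and savecore fall in the order computed by rcorder(8).
# The number grep prints before each path is its position; lower runs earlier.
rc_position() {
    rcorder /etc/rc.d/* 2>/dev/null | grep -n -E '/(swap1|savecore)$'
}
```

	Running `rc_deps /etc/rc.d/swap1 /etc/rc.d/savecore` lists the keywords
	that would have to change, and `rc_position` shows which of the two
	scripts currently runs first.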

>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-bugs->freebsd-rc 
Responsible-Changed-By: remko 
Responsible-Changed-When: Mon Nov 26 06:22:07 UTC 2007 
Responsible-Changed-Why:  
reassign to rc team 

http://www.freebsd.org/cgi/query-pr.cgi?pr=118255 

From: Antony Mawer <gnats@mawer.org>
To: bug-followup@FreeBSD.org, koitsu@FreeBSD.org
Cc:  
Subject: Re: conf/118255: savecore never finding kernel core dumps (rcorder
 problem)
Date: Fri, 30 Nov 2007 12:30:17 +1100

 There seems to be conflicting information about what constitutes the 
 correct behaviour here. The original 4.4BSD "Unix System Manager's 
 Manual (SMM)", found here:
 
      http://docs.freebsd.org/44doc/smm/02.config/paper-6.html
 
 indicates the following (found under the "System dumps" heading):
 
      - Kernel dumps write from the end of swap and work backwards
      - The kernel uses swap from the front and works forward
      - This way it reduces the chance of swapping overwriting the dump
         during the boot process until savecore is run
 
 This somewhat more recent posting suggests that this is still the case:
 
 http://unix.derkeiler.com/Mailing-Lists/FreeBSD/stable/2005-11/0703.html
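 
 To make the claimed layout concrete, here is a toy model, assuming the
 behaviour really is as the SMM describes (swap handed out from the front
 of the device, dump occupying the tail).  The function name and the sizes
 are hypothetical and say nothing about what the kernel actually does:

```shell
# Toy model: succeed (exit 0) while swap usage has not reached the dump.
# Usage: dump_survives SWAP_MB DUMP_MB USED_MB
dump_survives() {
    # The dump occupies the last DUMP_MB of the device, so it stays intact
    # as long as USED_MB fits in the SWAP_MB - DUMP_MB in front of it.
    [ "$3" -le $(( $1 - $2 )) ]
}
```

 For example, `dump_survives 2048 1024 512` succeeds, while
 `dump_survives 2048 1024 1025` fails because swap usage would run into
 the dump.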
 
 However the FreeBSD Developers' Handbook suggests a behaviour that does 
 not match the current reality:
 
 http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug.html#EXTRACT-DUMP
 
 Can anyone speak with more authority on this...?
 
 --Antony

From: Jeremy Chadwick <koitsu@FreeBSD.org>
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: conf/118255: savecore never finding kernel core dumps (rcorder
	problem)
Date: Sat, 1 Dec 2007 02:55:36 -0800

 This is great information; thanks for providing it! I found it quite
 educational.  The documentation Antony provided seems to indicate two
 things:
 
 1) That savecore(8) should really be run before swapon(8) -- I don't see
 any indication that swap needs to be made available prior to mounting
 filesystems, which Doug B. stated was a necessity.
 
 2) That regardless of Item #1, savecore(8) should be working (assuming
 that kernel dumps are still written from the end of the swap device
 towards the front, i.e. backwards), and that swapon(8) shouldn't be
 overwriting kernel dumps.
 
 I haven't tried changing the rcorder of the /etc/rc.d scripts in
 question to see if it works.  My gut feeling says it probably will, but
 I'm not sure of the implications.
 
 Doug, can you provide some comments/insight here when time permits?
 
 -- 
 | Jeremy Chadwick                                    jdc at parodius.com |
 | Parodius Networking                           http://www.parodius.com/ |
 | UNIX Systems Administrator                      Mountain View, CA, USA |
 | Making life hard for others since 1977.                  PGP: 4BD6C0CB |
 
State-Changed-From-To: open->feedback 
State-Changed-By: crees 
State-Changed-When: Thu Oct 25 19:36:15 UTC 2012 
State-Changed-Why:  
This indicates to me that your swap device is possibly too small to fit 
the coredump and the fsck results.  How big is it? 

http://www.freebsd.org/cgi/query-pr.cgi?pr=118255 
>Unformatted:
