From newton@atdot.dotat.org  Sun Sep 13 23:28:37 1998
Received: from atdot.dotat.org (atdot.dotat.org [203.23.150.35])
          by hub.freebsd.org (8.8.8/8.8.8) with ESMTP id XAA11246
          for <FreeBSD-gnats-submit@freebsd.org>; Sun, 13 Sep 1998 23:28:34 -0700 (PDT)
          (envelope-from newton@atdot.dotat.org)
Received: (from newton@localhost) by atdot.dotat.org (8.8.8/8.7) id PAA05701; Mon, 14 Sep 1998 15:58:18 +0930 (CST)
Message-Id: <199809140628.PAA05701@atdot.dotat.org>
Date: Mon, 14 Sep 1998 15:58:18 +0930 (CST)
From: Mark Newton <newton@atdot.dotat.org>
Reply-To: newton@atdot.dotat.org
To: FreeBSD-gnats-submit@freebsd.org
Subject: sendmail, inetd barf after a week or so of uptime
X-Send-Pr-Version: 3.2

>Number:         7925
>Category:       kern
>Synopsis:       sendmail, inetd SIGSEGV after forking after "enough" days of uptime
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    freebsd-bugs
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sun Sep 13 23:30:01 PDT 1998
>Closed-Date:    Fri Jul 23 14:27:22 PDT 1999
>Last-Modified:  Fri Jul 23 14:29:36 PDT 1999
>Originator:     Mark Newton
>Release:        FreeBSD 3.0-980524-SNAP i386
>Organization:
>Environment:

	Intel P200
	64Mbytes RAM
	aic7880 UltraSCSI
	Tulip 10Mbit/sec ethernet

>Description:

	After "n" days of uptime, long-running daemons which fork often
	(inetd and sendmail in particular, but sshd has done it once or
	twice) start to dump core shortly after forking, i.e. they
	accept a connection, then the child process immediately dies,
	shutting down the connection before any data is transferred.

	Killing and restarting the offending daemon temporarily fixes
	the problem, although it usually only takes a day or so to 
	recur.  Rebooting solves the problem for another week or so.
	Daemons which fork more often appear to fail more often too (so
	sendmail failures are more common than inetd failures, for instance).

	Each crash is accompanied by a message on the console:

	pid 5358 (sendmail), uid 0: exited on signal 10

	Are we dragging bogus pages from memory-mapped files into the cache?

	I might have PR'ed something about this before, but I can't find
	it in the database so perhaps I'm imagining it.


>How-To-Repeat:

	Reboot.  Accept moderate amounts of mail.  Wait.
	Can't narrow it down to a particular cause, unfortunately, although
	it does seem to be more common after running apps which use more 
	memory.  Perhaps, however, my perceptions are being colored by 
	my suspicion that we're mapping (dirty?) pages from the page cache
	where we shouldn't be.

>Fix:
	
	Reboot weekly :-(

>Release-Note:
>Audit-Trail:

From: Mark Newton <newton@atdot.dotat.org>
To: FreeBSD-gnats-submit@FreeBSD.ORG, freebsd-bugs@FreeBSD.ORG
Cc: 
Subject: Re: kern/7925: sendmail, inetd barf after a week or so of uptime
Date: Mon, 21 Dec 1998 19:46:05 +1030 (CST)

 FreeBSD-gnats-submit@FreeBSD.ORG wrote:
 
  > >Category:       kern
  > >Responsible:    freebsd-bugs
  > >Synopsis:       sendmail, inetd SIGSEGV after forking after "enough" days of uptime
  > >Arrival-Date:   Sun Sep 13 23:30:01 PDT 1998
 
 Further info:  Many repeats of this bug indicate that it occurs
 after the kernel prints "swap_pager: suggest more swap space: 128 MB"
 messages.  No reboots are required until that point.
 
 I've added another disk to the machine with a 64 Mbyte swap partition
 since then and the problem hasn't bitten at all.
 
 pstat -s never indicated that I was running out of swap even when
 that message was being printed, so I treated it as a "something to give
 attention to one day" rather than an "emergency! your system will need a
 reboot any minute now" kind of message.
 
 The "suggest more swap" message used to come up fairly reliably if I
 had two users running KDE and/or Netscape :-(
 
 The message comes up if we've never seen it before and if
 vm_swap_size < btodb(cnt.v_page_count * PAGE_SIZE).  Where on earth
 is "cnt" declared?  A macro somewhere?  hmm.
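
 To make that condition concrete, here is a minimal userland sketch of
 the check as described above.  It is not the kernel source: the 512-byte
 disk block and 4 KB page sizes are assumptions for i386, and
 sketch_btodb(), struct vm_counters and should_warn() are hypothetical
 stand-ins for the kernel's btodb(), "cnt" and the code that prints the
 message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed i386 values: 512-byte disk blocks, 4 KB pages. */
#define SKETCH_DEV_BSIZE 512u
#define SKETCH_PAGE_SIZE 4096u

/* Stand-in for the kernel's btodb(): convert bytes to disk blocks. */
static uint64_t
sketch_btodb(uint64_t bytes)
{
	return bytes / SKETCH_DEV_BSIZE;
}

/* Hypothetical stand-in for the kernel's global VM counters ("cnt"). */
struct vm_counters {
	uint64_t v_page_count;	/* physical pages managed by the VM system */
};

/*
 * Return true only the first time vm_swap_size (in disk blocks) is
 * smaller than physical memory expressed in disk blocks.  *warned
 * records that the message has already been issued once.
 */
static bool
should_warn(uint64_t vm_swap_size, const struct vm_counters *cnt, bool *warned)
{
	if (*warned)
		return false;
	if (vm_swap_size >= sketch_btodb(cnt->v_page_count * SKETCH_PAGE_SIZE))
		return false;
	*warned = true;
	return true;
}
```

 Under those assumptions a 64 Mbyte machine has 16384 pages, so the
 threshold works out to 131072 disk blocks, i.e. 64 Mbytes of swap.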
 
 Anyway, hopefully this lends more info that can be used to fix this
 rather long-standing bug (if it isn't fixed already; the PR is still
 open, so I'm gathering it isn't).
 
 --------------------------------------------------------------------
 I tried an internal modem,                    newton@atdot.dotat.org
      but it hurt when I walked.                          Mark Newton
 ----- Voice: +61-4-1958-3414 ------------- Fax: +61-8-83034403 -----
State-Changed-From-To: open->feedback 
State-Changed-By: sheldonh 
State-Changed-When: Wed Jul 21 12:46:02 PDT 1999 
State-Changed-Why:  
Mark, could you see whether you still have this problem in 3.2-STABLE 
or 4.0-CURRENT?  I've just MFC'd a number of recent changes to  
STABLE's inetd, and a goodly number of bugs have been fixed this year. 
;-) 
State-Changed-From-To: feedback->closed 
State-Changed-By: sheldonh 
State-Changed-When: Fri Jul 23 14:27:22 PDT 1999 
State-Changed-Why:  
Fixed in rev 1.105 of swap_pager.c 1998/12/29. 
>Unformatted:
