From nobody@FreeBSD.org  Wed Feb 20 10:53:59 2002
Return-Path: <nobody@FreeBSD.org>
Received: from freefall.freebsd.org (freefall.FreeBSD.org [216.136.204.21])
	by hub.freebsd.org (Postfix) with ESMTP id C053437B417
	for <freebsd-gnats-submit@FreeBSD.org>; Wed, 20 Feb 2002 10:53:58 -0800 (PST)
Received: (from nobody@localhost)
	by freefall.freebsd.org (8.11.6/8.11.6) id g1KIrwl19976;
	Wed, 20 Feb 2002 10:53:58 -0800 (PST)
	(envelope-from nobody)
Message-Id: <200202201853.g1KIrwl19976@freefall.freebsd.org>
Date: Wed, 20 Feb 2002 10:53:58 -0800 (PST)
From: Eric Anderson <anderson@centtech.com>
To: freebsd-gnats-submit@FreeBSD.org
Subject: High NFSD load in FreeBSD 4.5R
X-Send-Pr-Version: www-1.0

>Number:         35151
>Category:       misc
>Synopsis:       High NFSD load in FreeBSD 4.5R
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    freebsd-bugs
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Wed Feb 20 11:00:01 PST 2002
>Closed-Date:    Sun Dec 08 09:15:11 PST 2002
>Last-Modified:  Sun Dec 08 09:15:11 PST 2002
>Originator:     Eric Anderson
>Release:        FreeBSD 4.5R (from ISO)
>Organization:
Centaur Technology
>Environment:
FreeBSD tesla.centtech.com 4.5-RELEASE FreeBSD 4.5-RELEASE #0: Mon Feb 18 09:54:43 CST 2002     root@tesla.centtech.com:/usr/obj/usr/src/sys/TESLA  i386
>Description:
      I have a FreeBSD 4.5R NFS server, and nfsd is slamming the CPU for no
apparent reason.  I have Linux (RedHat 6.2, 7.2) and Solaris (2.7, 2.6) clients
accessing it via NFS.  It is part of an NIS domain.  This machine ran
FreeBSD 4.4R with NO TROUBLES until I reinstalled and put 4.5R on it.  I
cannot get the load down at all, and some clients even time out while
talking to this server.  Here is a clipped portion of the top output:
----------------
last pid:   328;  load averages:  1.00,  1.02,  1.00   up 18+23:27:18  09:07:20
53 processes:  3 running, 50 sleeping
CPU states: 18.3% user,  0.0% nice, 80.5% system,  1.2% interrupt,  0.0% idle
Mem: 21M Active, 747M Inact, 182M Wired, 52M Cache, 112M Buf, 1664K Free
Swap: 2048M Total, 164K Used, 2048M Free

  PID USERNAME   PRI NICE  SIZE    RES STATE    TIME   WCPU    CPU COMMAND
30358 root        61   0   364K   160K RUN    165.4H 94.43% 94.43% nfsd
30359 root         2   0   356K   152K RUN     32:39  2.93%  2.93% nfsd
30360 root         2   0   356K   152K nfsd     0:49  0.00%  0.00% nfsd
  129 root         2   0  2512K  1524K select   0:38  0.00%  0.00% sendmail
   80 root         2   0  1300K   748K select   0:35  0.00%  0.00% ntpd
23865 root         2   0  1868K  1032K select   0:21  0.00%  0.00% nmbd
   87 root         2   0   720K   508K select   0:14  0.00%  0.00% mountd
30361 root         2   0   356K   152K nfsd     0:12  0.00%  0.00% nfsd
   82 daemon       2   0   960K   560K select   0:11  0.00%  0.00% portmap
30362 root         2   0   356K   152K nfsd     0:07  0.00%  0.00% nfsd
  115 root         2   0  1160K   756K select   0:07  0.00%  0.00% amd
-----------------

Here is some other info about the box:
The machine has dual NICs and is serving data on both ports.
in rc.conf:
nfs_reserved_port_only="YES"
nfs_server_enable="YES"
nfs_server_flags="-h 10.177.178.51 -h 10.177.176.40 -u -t -n 20"

and in sysctl.conf:
vfs.nfs.gatherdelay=0
vfs.nfs.async=1
vfs.vmiodirenable=1
kern.ipc.maxsockbuf=2097152
kern.ipc.somaxconn=8192
kern.ipc.maxsockets=16424
kern.maxfiles=65536
kern.maxfilesperproc=32768
net.inet.tcp.rfc1323=1
net.inet.tcp.delayed_ack=0
net.inet.tcp.sendspace=65535
net.inet.tcp.recvspace=65535
net.inet.udp.recvspace=65535
net.inet.udp.maxdgram=57344
net.local.stream.recvspace=65535
net.local.stream.sendspace=65535
net.inet.ip.forwarding=1


>How-To-Repeat:
      Build an NFS server using FreeBSD 4.5R.  The machine should have 2 NICs and export filesystems to be NFS-mounted (automounted) by Linux/FreeBSD/Solaris boxes.  The nfsd load immediately jumps up.
>Fix:
    None!  
>Release-Note:
>Audit-Trail:

From: Alexander Haderer <alexander.haderer@charite.de>
To: freebsd-gnats-submit@FreeBSD.org, anderson@centtech.com
Cc: dirk.emmel@charite.de
Subject: Re: misc/35151: High NFSD load in FreeBSD 4.5R
Date: Tue, 26 Mar 2002 19:14:05 +0100

 hello,
 
 we had the same problem with 4.5R. The NFS server exports a 450 GB vinum
 RAID to other clients which usually do write access via NFS (read access is
 local). When the filesystem has become more and more full (70% ...) one or
 more nfsd's eat up the CPU when writing to the NFS server: After NFS
 writing of some megabytes (dirs, hundreds of files) via NFS, top shows nfsd
 with >100% WCPU, system above 70% and a load average > 10 with very slow
 reactions of the machine (obviously). The usual load is somewhere between
 0.1 .. 2.0 resulting from some httpds.
 
 The high load went back to acceptable values when we disabled the
 softupdates for the exported filesystem: tunefs -n disable /dev/vinum/raid.
 
 If this is a bug or a feature: I don't know.
 
 Alexander
 -- 
 Alexander Haderer             Charité Berlin - Germany
 

From: Alexander Haderer <alexander.haderer@charite.de>
To: freebsd-gnats-submit@FreeBSD.org, anderson@centtech.com
Cc:  
Subject: Re: misc/35151: High NFSD load in FreeBSD 4.5R
Date: Thu, 28 Mar 2002 11:17:00 +0100

 Sorry, I was wrong!
 
 Disabling soft updates only delays the point at which nfsd starts eating
 CPU time. After longer investigation we found something else.
 
 Please see http://www.freebsd.org/cgi/query-pr.cgi?pr=36381
 for details.
 
 Alexander
 -- 
 Alexander Haderer             Charité Berlin - Germany
 

From: Eric Anderson <anderson@centtech.com>
To: freebsd-gnats-submit@freebsd.org, anderson@centtech.com
Cc:  
Subject: Re: misc/35151: High NFSD load in FreeBSD 4.5R
Date: Wed, 28 Aug 2002 17:09:34 -0500

 Just an update - this problem is still occurring in 4.6-RELEASE and
 4.6.2-RELEASE.
 
 I have narrowed the problem down: when Solaris NFS clients (automounters) try
 to communicate with the NFS server, the server goes into high load.
 This only occurs on FreeBSD NFS servers with more than one interface, both of
 which can be reached by the Solaris machine, with the nfsd process bound to
 both (using the -h option).  Apparently, the Solaris machines keep sending
 a packet (maybe malformed?) to the FreeBSD fileserver every x seconds (x is
 less than 30), and this keeps the NFS fileserver loaded.
 
 Eric
 
 -- 
 ------------------------------------------------------------------
 Eric Anderson	   Systems Administrator      Centaur Technology
 The moon may be smaller than Earth, but it's further away.
 ------------------------------------------------------------------
 

From: Eric Anderson <anderson@centtech.com>
To: freebsd-gnats-submit@freebsd.org, anderson@centtech.com
Cc:  
Subject: Re: misc/35151: High NFSD load in FreeBSD 4.5R
Date: Tue, 15 Oct 2002 08:19:24 -0500

 By removing the -h ip.ip.ip.ip options, nfsd works fine on multiple NICs.
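 
As an illustration of the workaround above, here is a minimal sketch that takes the nfs_server_flags value from the original report and drops each "-h <address>" pair, leaving the other options unchanged. The variable names flags and stripped are made up for this example. It only rewrites the string; it does not touch rc.conf or restart nfsd, and this report does not establish whether running without -h is appropriate on every multi-homed server.

```shell
#!/bin/sh
# Flags exactly as given in the rc.conf excerpt of the original report.
flags="-h 10.177.178.51 -h 10.177.176.40 -u -t -n 20"

# Remove every "-h <address>" pair and the spaces that follow it.
# Basic regular expressions only, so this works with both BSD and GNU sed.
stripped=$(printf '%s\n' "$flags" | sed 's/-h  *[^ ][^ ]* *//g')

# Print the resulting rc.conf line for review.
printf 'nfs_server_flags="%s"\n' "$stripped"
```

The printed line can be compared against the existing rc.conf entry before anything is changed by hand.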
 
 The man page indicates that you should use -h, and that UDP NFS won't 
 work properly without it - which is not true from what I have seen so far.
 
 Why would the -h option break nfsd?
 
 
 Eric
 
 
 -- 
 ------------------------------------------------------------------
 Eric Anderson	   Systems Administrator      Centaur Technology
 Skydiving - safer than the stock market.
 ------------------------------------------------------------------
 

From: Eric Anderson <anderson@centtech.com>
To: freebsd-gnats-submit@freebsd.org, anderson@centtech.com
Cc:  
Subject: Re: misc/35151: High NFSD load in FreeBSD 4.5R
Date: Mon, 18 Nov 2002 13:18:52 -0600

 I guess no one knows the answer to my previous question on this, so this 
 PR can be closed.
 
 Eric
 
 -- 
 ------------------------------------------------------------------
 Eric Anderson	   Systems Administrator      Centaur Technology
 Beware the fury of a patient man.
 ------------------------------------------------------------------
 
State-Changed-From-To: open->closed 
State-Changed-By: iedowse 
State-Changed-When: Sun Dec 8 09:11:35 PST 2002 
State-Changed-Why:  

Submitter says that this can be closed. I'm not sure what was causing 
nfsd to spin, but it's possible that this has been fixed in -CURRENT, 
as there were some bugs fixed there involving the -h option and TCP 
sockets. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=35151 
>Unformatted:
