From uspoerlein@gmail.com  Wed Aug  6 16:58:17 2008
Return-Path: <uspoerlein@gmail.com>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 12DAC106567D;
	Wed,  6 Aug 2008 16:58:17 +0000 (UTC)
	(envelope-from uspoerlein@gmail.com)
Received: from acme.spoerlein.net (cl-43.dus-01.de.sixxs.net [IPv6:2a01:198:200:2a::2])
	by mx1.freebsd.org (Postfix) with ESMTP id 946F38FC1F;
	Wed,  6 Aug 2008 16:58:16 +0000 (UTC)
	(envelope-from uspoerlein@gmail.com)
Received: from coyote.spoerlein.net (e180144192.adsl.alicedsl.de [85.180.144.192])
	by acme.spoerlein.net (8.14.2/8.14.2) with ESMTP id m76GwDhW053528
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK);
	Wed, 6 Aug 2008 18:58:14 +0200 (CEST)
	(envelope-from uspoerlein@gmail.com)
Received: from coyote.spoerlein.net (localhost [127.0.0.1])
	by coyote.spoerlein.net (8.14.2/8.14.2) with ESMTP id m76GwC1i047536
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Wed, 6 Aug 2008 18:58:12 +0200 (CEST)
	(envelope-from uqs@coyote.spoerlein.net)
Received: (from uqs@localhost)
	by coyote.spoerlein.net (8.14.2/8.14.2/Submit) id m76GwCec047535;
	Wed, 6 Aug 2008 18:58:12 +0200 (CEST)
	(envelope-from uqs)
Message-Id: <200808061658.m76GwCec047535@coyote.spoerlein.net>
Date: Wed, 6 Aug 2008 18:58:12 +0200 (CEST)
From: Ulrich Spörlein <uspoerlein@gmail.com>
To: FreeBSD-gnats-submit@freebsd.org
Cc: <rwatson@freebsd.org>
Subject: bsnmpd: UNIX socket leak on 6.3 when using Hostres-MIB
X-Send-Pr-Version: 3.113
X-GNATS-Notify:

>Number:         126307
>Category:       bin
>Synopsis:       bsnmpd(1): UNIX socket leak on 6.3 when using Hostres-MIB [regression]
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    syrinx
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Wed Aug 06 17:00:06 UTC 2008
>Closed-Date:    Tue Jan 12 12:03:39 UTC 2010
>Last-Modified:  Fri Nov 12 20:52:16 UTC 2010
>Originator:     Ulrich Spoerlein
>Release:        FreeBSD 6.3-STABLE i386
>Organization:
>Environment:
	
>Description:
I think this is a kernel bug, as I have seen this happen with nss_ldap, too.
The process's number of open files keeps growing until it eventually runs out
of file descriptors or brings the system down.

I had been running bsnmpd on my root server for several months without
problems. I updated the system on July 22nd because of the BIND issue (the
previous kernel was from March 14), and since then I have had to restart
bsnmpd periodically to avoid resource starvation.

Our production servers are still running on a kernel from early May, and they
too are not seeing the socket leak when running bsnmpd. I cannot reproduce the
problem on 7-STABLE either, so I'm sure it has been introduced to RELENG_6
during the last 3 months.

(The nss_ldap problem is happening on 6.1 and 6.2, too, so it is probably not
related).

>How-To-Repeat:
Use the sample snmpd.config and activate the mibII module and the hostres
module. Start bsnmpd.

Check open files:

# lsof -p `pgrep bsnmpd` | tail -5
bsnmpd  46945 root   10u  VCHR      0,101      0t0     101  () (like character special /dev/mdctl)
bsnmpd  46945 root   11r  VCHR        0,8      0t0       8  () (like character special /dev/null)
bsnmpd  46945 root   12r  VCHR        0,8      0t0       8  () (like character special /dev/null)
bsnmpd  46945 root   13u  IPv4 0xc364c168      0t0     UDP localhost:snmp
bsnmpd  46945 root   14u  unix 0xc4303b20      0t0         /var/run/snmpd.sock

Run several snmpwalks:

# snmpwalk -v2c -c public localhost > /dev/null
# snmpwalk -v2c -c public localhost > /dev/null
# snmpwalk -v2c -c public localhost > /dev/null
# snmpwalk -v2c -c public localhost > /dev/null

and check open files again:
# lsof -p `pgrep bsnmpd` | tail -5
bsnmpd  46945 root   36u  unix 0xc38129bc      0t0         ->0xc3698de8
bsnmpd  46945 root   37u  unix 0xc3ad42c8      0t0         ->0xc3aeab20
bsnmpd  46945 root   38u  unix 0xc38122c8      0t0         ->0xc3c7cc84
bsnmpd  46945 root   39u  unix 0xc46e742c      0t0         ->0xc62366f4
bsnmpd  46945 root   40u  unix 0xc3bd6000      0t0         ->0xc3c6b6f4

Continue until your system dies, as bsnmpd is running with root privileges :/
Another server of mine has had bsnmpd running for 14 hours now, with SNMP
polling every five minutes. It is now at 184 open unix sockets.
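
The manual check above can be wrapped in a small helper. This is only a
minimal sketch and not part of the original report: the function names
count_unix_sockets and watch_bsnmpd and the default interval are made up for
illustration, and it assumes lsof's default column layout (TYPE in the fifth
field, as in the output shown above) and a single running bsnmpd instance.

```sh
#!/bin/sh
# Sketch only: helper names and the default interval are illustrative.

# Count lines on stdin whose TYPE column (5th field of default lsof
# output) is "unix".
count_unix_sockets() {
    awk '$5 == "unix" { n++ } END { print n + 0 }'
}

# Print a timestamp and the number of unix sockets held by bsnmpd
# every $1 seconds (default 60) until interrupted.
# Assumes exactly one bsnmpd process is running.
watch_bsnmpd() {
    interval=${1:-60}
    while :; do
        pid=$(pgrep bsnmpd) || { echo "bsnmpd is not running" >&2; return 1; }
        printf '%s  %s\n' "$(date)" "$(lsof -p "$pid" | count_unix_sockets)"
        sleep "$interval"
    done
}

# Usage (call by hand; nothing runs when this file is sourced):
#   watch_bsnmpd 300
```

Matching on the TYPE column rather than grepping for "unix" anywhere in the
line avoids counting unrelated entries whose name merely contains that string.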

>Fix:

	


>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-bugs->syrinx 
Responsible-Changed-By: rwatson 
Responsible-Changed-When: Wed Feb 11 16:56:56 UTC 2009 
Responsible-Changed-Why:  
Assign to maintainer. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=126307 

From: Ulrich Spörlein <uspoerlein@gmail.com>
To: bug-followup@FreeBSD.org
Cc: rwatson@FreeBSD.org
Subject: Re: bin/126307: bsnmpd(1): UNIX socket leak on 6.3 when using
	Hostres-MIB [regression]
Date: Sat, 21 Mar 2009 17:42:49 +0100

 Hi Robert,
 
 as you were calling out for regressions wrt. socket stuff for the 7.2
 RELEASE, I have now 3 RELENG_7 machines (none older than 3 weeks) and
 some of them still show the above mentioned problem. When running
 bsnmpd, it will quickly leak unix sockets when under load.
 
 You can easily try this by having a look at /etc/snmpd.config, then
 start bsnmpd, have net-snmp installed and do something like this:
 
 % while :; do echo -n `date` " " $(lsof -p $(pgrep bsnmpd)|grep -c unix);
 > for i in `jot 5`; do snmpwalk -v2c -c public localhost >/dev/null;done;
 > echo; done
 
 I will update all three machines to latest RELENG_7 after the weekend
 and try to find a pattern.
 
 Cheers,
 Ulrich Spörlein

From: Robert Watson <rwatson@FreeBSD.org>
To: Ulrich Spörlein <uspoerlein@gmail.com>
Cc: bug-followup@FreeBSD.org
Subject: Re: bin/126307: bsnmpd(1): UNIX socket leak on 6.3 when using
 Hostres-MIB [regression]
Date: Sat, 21 Mar 2009 16:51:51 +0000 (GMT)

 On Sat, 21 Mar 2009, Ulrich Spörlein wrote:
 
 > as you were calling out for regressions wrt. socket stuff for the 7.2 
 > RELEASE, I have now 3 RELENG_7 machines (none older than 3 weeks) and some 
 > of them still show the above mentioned problem. When running bsnmpd, it will 
 > quickly leak unix sockets when under load.
 >
 > You can easily try this by having a look at /etc/snmpd.config, then start 
 > bsnmpd, have net-snmp installed and do something like this:
 >
 > % while :; do echo -n `date` " " $(lsof -p $(pgrep bsnmpd)|grep -c unix);
 >> for i in `jot 5`; do snmpwalk -v2c -c public localhost >/dev/null;done; 
 >> echo; done
 >
 > I will update all three machines to latest RELENG_7 after the weekend and 
 > try to find a pattern.
 
 Hi Ulrich--
 
 Thanks for the update (and specifically, the confirmation that it happens on 
 recent 7.x).  Could I ask you to let me know whether, when you kill snmpd, the 
 resources are all released properly?  I.e., if you look at 
 kern.ipc.numopensockets before doing the above, and then again after the above 
 + kill of bsnmpd, is it about the same?
 
 Robert N M Watson
 Computer Laboratory
 University of Cambridge

From: Shteryana Shopova <syrinx@FreeBSD.org>
To: bug-followup@FreeBSD.org, uspoerlein@gmail.com
Cc:  
Subject: Re: bin/126307: bsnmpd(1): UNIX socket leak on 6.3 when using 
	Hostres-MIB [regression]
Date: Sat, 21 Mar 2009 20:02:33 +0200

 Hi Ulrich,
 
 I tried to reproduce the PR on a cleanly installed 6.4 system without
 success. Can you please provide further information about your system -
 e.g. your kernel config file, the processes you have running on the
 system, any custom patches you may have installed, or anything else
 you think might be useful to help us reproduce the problem. Thanks!
 
 cheers,
 Shteryana
 
 On Sat, Mar 21, 2009 at 7:00 PM, Ulrich Spörlein <uspoerlein@gmail.com> wrote:
 > The following reply was made to PR bin/126307; it has been noted by GNATS.
 >
 > From: Ulrich Spörlein <uspoerlein@gmail.com>
 > To: bug-followup@FreeBSD.org
 > Cc: rwatson@FreeBSD.org
 > Subject: Re: bin/126307: bsnmpd(1): UNIX socket leak on 6.3 when using
 >         Hostres-MIB [regression]
 > Date: Sat, 21 Mar 2009 17:42:49 +0100
 >
 >  Hi Robert,
 >
 >  as you were calling out for regressions wrt. socket stuff for the 7.2
 >  RELEASE, I have now 3 RELENG_7 machines (none older than 3 weeks) and
 >  some of them still show the above mentioned problem. When running
 >  bsnmpd, it will quickly leak unix sockets when under load.
 >
 >  You can easily try this by having a look at /etc/snmpd.config, then
 >  start bsnmpd, have net-snmp installed and do something like this:
 >
 >  % while :; do echo -n `date` " " $(lsof -p $(pgrep bsnmpd)|grep -c unix);
 >  > for i in `jot 5`; do snmpwalk -v2c -c public localhost >/dev/null;done;
 >  > echo; done
 >
 >  I will update all three machines to latest RELENG_7 after the weekend
 >  and try to find a pattern.
 >
 >  Cheers,
 >  Ulrich Spörlein
 >

From: Ulrich Spörlein <uspoerlein@gmail.com>
To: Shteryana Shopova <syrinx@FreeBSD.org>
Cc: bug-followup@FreeBSD.org
Subject: Re: bin/126307: bsnmpd(1): UNIX socket leak on 6.3 when using
	Hostres-MIB [regression]
Date: Sun, 22 Mar 2009 13:00:52 +0100

 Hi Shteryana,
 
 I found an old friend that I can blame for the socket leak. Too bad I
 didn't check this earlier. Short story: it happens when you are walking
 the hostres MIB, use NSS via LDAP, and have nss_ldap configured to talk
 to the LDAP server via a UNIX socket. I reported this problem before, and
 it thus has nothing to do with bsnmpd per se. See, e.g.,
 http://unix.derkeiler.com/Mailing-Lists/FreeBSD/stable/2007-06/msg00209.html
 
 But I'm curious, why is snmp_hostres doing stat(2) calls that require a
 NSS round trip?
 
 Cheers,
 Ulrich Spörlein

From: Ulrich Spörlein <uspoerlein@gmail.com>
To: Robert Watson <rwatson@FreeBSD.org>
Cc:  
Subject: Re: bin/126307: bsnmpd(1): UNIX socket leak on 6.3 when using
	Hostres-MIB [regression]
Date: Sat, 21 Mar 2009 18:07:14 +0100

 On Sat, 21.03.2009 at 16:51:51 +0000, Robert Watson wrote:
 > 
 > On Sat, 21 Mar 2009, Ulrich Spörlein wrote:
 > 
 > > as you were calling out for regressions wrt. socket stuff for the 7.2 
 > > RELEASE, I have now 3 RELENG_7 machines (none older than 3 weeks) and some 
 > > of them still show the above mentioned problem. When running bsnmpd, it will 
 > > quickly leak unix sockets when under load.
 > >
 > > You can easily try this by having a look at /etc/snmpd.config, then start 
 > > bsnmpd, have net-snmp installed and do something like this:
 > >
 > > % while :; do echo -n `date` " " $(lsof -p $(pgrep bsnmpd)|grep -c unix);
 > >> for i in `jot 5`; do snmpwalk -v2c -c public localhost >/dev/null;done; 
 > >> echo; done
 > >
 > > I will update all three machines to latest RELENG_7 after the weekend and 
 > > try to find a pattern.
 > 
 > Hi Ulrich--
 > 
 > Thanks for the update (and specifically, the confirmation that it happens on 
 > recent 7.x).  Could I ask you to let me know whether, when you kill snmpd, the 
 > resources are all released properly?  I.e., if you look at 
 > kern.ipc.numopensockets before doing the above, and then again after the above 
 > + kill of bsnmpd, is it about the same?
 
 Hi Robert,
 
 this is what I get, the last line is just after a bsnmpd restart
 
 Sat Mar 21 17:58:10 CET 2009   1194   62
 Sat Mar 21 17:59:19 CET 2009   1408   116
 Sat Mar 21 18:00:34 CET 2009   1541   182
 Sat Mar 21 18:01:42 CET 2009   1650   237
 Sat Mar 21 18:02:47 CET 2009   1760   292
 Sat Mar 21 18:04:03 CET 2009   1206   14
 
 So as you can see, the open files count bounced back to ~1200, so this is
 not a kernel leak but a benign userland leak. Nevertheless, it seems that it
 was/is dependent on the kernel version. Did some semantics change?
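
The before/after comparison can be expressed as a small check. This is a
minimal sketch and not part of the original exchange: the helper name
counters_recovered and the default slack of 50 are arbitrary choices made for
illustration. The sample values are the first counter column of the first and
last rows of the table above.

```sh
#!/bin/sh
# Sketch only: helper name and tolerance are illustrative.

# Succeed if counter value $2 (taken after restarting the daemon) is
# within $3 (default 50) of the baseline $1 taken before the test.
counters_recovered() {
    before=$1 after=$2 slack=${3:-50}
    [ $((after - before)) -le "$slack" ]
}

# Values from the first and last rows of the table above.
if counters_recovered 1194 1206; then
    echo "released on restart: userland leak"
else
    echo "still held after restart: possible kernel leak"
fi
```

Some slack is needed because unrelated processes open and close files between
the two samples, so the counter rarely returns to exactly the same value.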
 
 Perhaps you could have a quick look at the bsnmpd code to see if there's
 a socket handling bug/leak?
 
 (Man, I wish valgrind was working on recent FreeBSD versions)
 
 Cheers,
 Ulrich Spörlein
 -- 
 None are more hopelessly enslaved than those who falsely believe they are free
 -- Johann Wolfgang von Goethe
 

From: Ulrich Spörlein <uspoerlein@gmail.com>
To: bug-followup@FreeBSD.org, Shteryana Shopova <syrinx@FreeBSD.org>
Cc:  
Subject: Re: bin/126307: bsnmpd(1): UNIX socket leak on 6.3 when using
 Hostres-MIB [regression]
Date: Tue, 12 Jan 2010 12:03:29 +0100

 Please close this PR. The problem here is nss_ldap, which has a socket
 leak when using domain sockets. It won't happen when configured to use
 TCP to talk to slapd.
 
 nss_ldapd (a fork of nss_ldap) does not have these problems due to its
 redesign.
 
 I'm still curious why bsnmpd is calling stat(2) that often, but I guess
 it's the installed software MIBs that require getting file timestamps.
 
 Regards,
 Uli
State-Changed-From-To: open->closed  
State-Changed-By: syrinx 
State-Changed-When: Tue Jan 12 12:02:07 UTC 2010 
State-Changed-Why:  
Close PR as requested by submitter. The socket leak was caused 
by nss_ldap, not bsnmpd. Thanks for reporting this. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=126307 
>Unformatted:
