From jdunn@aquezada.com  Tue Jan  6 04:54:17 2009
Message-Id: <1231217645.22423.2.camel@jupiter.acf.aquezada.com>
Date: Mon, 05 Jan 2009 23:54:05 -0500
From: "Julian C. Dunn" <jdunn@aquezada.com>
To: FreeBSD-gnats-submit@freebsd.org
Subject: bsnmpd snmp_hostres.so always returns 100% CPU utilization

>Number:         130222
>Category:       kern
>Synopsis:       bsnmpd snmp_hostres.so always returns 100% CPU
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    uqs
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Tue Jan 06 05:00:10 UTC 2009
>Closed-Date:    Tue Nov 23 21:43:17 UTC 2010
>Last-Modified:  Tue Nov 23 21:43:17 UTC 2010
>Originator:     Julian C. Dunn
>Release:        FreeBSD 7.1-STABLE i386
>Organization:
>Environment:
System: FreeBSD aphrodite.acf.aquezada.com 7.1-STABLE FreeBSD 7.1-STABLE
#12: Mon Jan 5 22:11:06 EST 2009
jdunn@aphrodite.acf.aquezada.com:/usr/obj/usr/src/sys/APHRODITE i386


>Description:

On FreeBSD 7.1 with the ULE scheduler, bsnmpd always returns 100% CPU
utilization.

There was a thread about this here:

http://kerneltrap.org/mailarchive/freebsd-current/2008/5/7/1747854/thread

but the problem still occurs.

>How-To-Repeat:

Enable the HOST-RESOURCES-MIB module in snmpd.config, then enable and
start bsnmpd.

Then:

aphrodite:/etc$ snmpwalk -v2c -c public localhost|grep hrProcessorLoad
HOST-RESOURCES-MIB::hrProcessorLoad.7 = INTEGER: 100

>Fix:

>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-i386->harti 
Responsible-Changed-By: remko 
Responsible-Changed-When: Tue Jan 6 09:08:16 UTC 2009 
Responsible-Changed-Why:  
Over to maintainer 

http://www.freebsd.org/cgi/query-pr.cgi?pr=130222 

From: Ulrich =?utf-8?B?U3DDtnJsZWlu?= <uqs@spoerlein.net>
To: "Julian C. Dunn" <jdunn@aquezada.com>
Cc: bug-followup@FreeBSD.org
Subject: Re: kern/130222: bsnmpd snmp_hostres.so always returns 100% CPU
Date: Wed, 2 Jun 2010 17:24:27 +0200

 --G4iJoqBmSsgzjUCe
 Content-Type: text/plain; charset=us-ascii
 Content-Disposition: inline
 
 On Mon, 05.01.2009 at 23:54:05 -0500, Julian C. Dunn wrote:
 > On FreeBSD 7.1 with the ULE scheduler, bsnmpd always returns 100% CPU
 > utilization.
 
 Hi Julian,
 
 can you please try the attached patch? It is against 8-STABLE but should
 also apply to 7-STABLE if you're still using it.
 
 It is not ready for commit, but should do the right thing regardless of
 the scheduler involved.
 
 Uli
 
 --G4iJoqBmSsgzjUCe
 Content-Type: text/x-diff; charset=us-ascii
 Content-Disposition: attachment; filename="bsnmpd.diff"
 
 Index: usr.sbin/bsnmpd/modules/snmp_hostres/hostres_processor_tbl.c
 ===================================================================
 --- usr.sbin/bsnmpd/modules/snmp_hostres/hostres_processor_tbl.c	(revision 208628)
 +++ usr.sbin/bsnmpd/modules/snmp_hostres/hostres_processor_tbl.c	(working copy)
 @@ -63,6 +63,7 @@
  
  	/* the samples from the last minute, as required by MIB */
  	double		samples[MAX_CPU_SAMPLES];
 +	long		states[MAX_CPU_SAMPLES][CPUSTATES];
  
  	/* current sample to fill in next time, must be < MAX_CPU_SAMPLES */
  	uint32_t	cur_sample_idx;
 @@ -112,6 +113,43 @@
  	return ((int)floor((double)sum/(double)e->sample_cnt));
  }
  
 +static int
 +get_avg_usage(struct processor_entry *e)
 +{
 +	u_int i, oldest;
 +	long delta = 0;
 +	double load = 0.0;
 +
 +	assert(e != NULL);
 +
 +	/* Need two samples to perform delta calculation */
 +	if (e->sample_cnt <= 1)
 +		return (0);
 +
 +	/* oldest usable index */
 +	if (e->sample_cnt == MAX_CPU_SAMPLES)
 +		oldest = (e->cur_sample_idx + 1) % MAX_CPU_SAMPLES;
 +	else
 +		oldest = 0;
 +
 +	/* FIXME handle wrap around */
 +	for (i = 0; i < CPUSTATES; i++) {
 +		delta += e->states[e->cur_sample_idx][i];
 +		delta -= e->states[oldest][i];
 +	}
 +	if (delta == 0)
 +		return 0;
 +
 +	/* XXX idle time is in the last index always?!? */
 +	load = (double)(e->states[e->cur_sample_idx][CPUSTATES-1] -
 +	    e->states[oldest][CPUSTATES-1]) / delta;
 +	load = 100 - (load*100);
 +	HRDBG("CPU no. %d delta ticks %ld pct usage %.2f", e->cpu_no,
 +	    delta, load);
 +
 +	return (floor(load));
 +}
 +
  /*
   * Stolen from /usr/src/bin/ps/print.c. The idle process should never
   * be swapped out :-)
 @@ -132,11 +170,15 @@
   * Save a new sample
   */
  static void
 -save_sample(struct processor_entry *e, struct kinfo_proc *kp)
 +save_sample(struct processor_entry *e, struct kinfo_proc *kp, long *cp_times)
  {
 +	int i;
  
 +	for (i = 0; cp_times != NULL && i < CPUSTATES; i++)
 +		e->states[e->cur_sample_idx][i] = cp_times[i];
 +
  	e->samples[e->cur_sample_idx] = 100.0 - processor_getpcpu(kp);
 -	e->load = get_avg_load(e);
 +	e->load = get_avg_usage(e);
  	e->cur_sample_idx = (e->cur_sample_idx + 1) % MAX_CPU_SAMPLES;
  
  	if (++e->sample_cnt > MAX_CPU_SAMPLES)
 @@ -241,8 +283,6 @@
  		entry->idle_pid = kp->ki_pid;
  		HRDBG("CPU no. %d with SNMP index=%d has idle PID %d",
  		    entry->cpu_no, entry->index, entry->idle_pid);
 -
 -		save_sample(entry, kp);
  	}
  }
  
 @@ -386,12 +426,22 @@
  refresh_processor_tbl(void)
  {
  	struct processor_entry *entry;
 -	int need_pids;
 +	int need_pids, nproc;
  	struct kinfo_proc *plist;
 -	int nproc;
 +	size_t size;
  
  	processor_refill_tbl();
  
 +	long pcpu_cp_times[hw_ncpu * CPUSTATES];
 +	memset(pcpu_cp_times, 0, sizeof(pcpu_cp_times));
 +
 +	size = hw_ncpu * CPUSTATES * sizeof(long);
 +	/* FIXME: assert entry->ncpu <= hw_ncpu <= length of cp_times */
 +	if (sysctlbyname("kern.cp_times", pcpu_cp_times, &size, NULL, 0) == -1) {
 +		syslog(LOG_ERR, "hrProcessorTable: sysctl(kern.cp_times) failed");
 +		return;
 +	}
 +
  	need_pids = 0;
  	TAILQ_FOREACH(entry, &processor_tbl, link) {
  		if (entry->idle_pid <= 0) {
 @@ -410,7 +460,7 @@
  			need_pids = 1;
  			continue;
  		}
 -		save_sample(entry, plist);
 +		save_sample(entry, plist, &pcpu_cp_times[entry->cpu_no * CPUSTATES]);
  	}
  
  	if (need_pids == 1)
 Index: usr.sbin/bsnmpd/modules/snmp_hostres/Makefile
 ===================================================================
 --- usr.sbin/bsnmpd/modules/snmp_hostres/Makefile	(revision 208628)
 +++ usr.sbin/bsnmpd/modules/snmp_hostres/Makefile	(working copy)
 @@ -48,7 +48,8 @@
  	printcap.c
  
  #Not having NDEBUG defined will enable assertions and a lot of output on stderr
 -CFLAGS+= -DNDEBUG -I${LPRSRC}
 +WARNS?=	1
 +CFLAGS+= -I${LPRSRC}
  XSYM=	host hrStorageOther hrStorageRam hrStorageVirtualMemory \
  	hrStorageFixedDisk hrStorageRemovableDisk hrStorageFloppyDisk \
  	hrStorageCompactDisc hrStorageRamDisk hrStorageFlashMemory \
 
 --G4iJoqBmSsgzjUCe--
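
The calculation that the patch above adds in get_avg_usage() can be looked at in isolation: sum the per-state tick deltas between two samples, then report the share that was not idle. The following is a minimal sketch under stated assumptions, not the code from the patch. NSTATES, IDLE_STATE and usage_percent() are hypothetical names, and the five-state layout with idle in the last slot is assumed to match FreeBSD's CPUSTATES and CP_IDLE.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed to mirror FreeBSD's CPUSTATES layout: five states, idle last. */
#define NSTATES		5
#define IDLE_STATE	(NSTATES - 1)

/*
 * Percent CPU usage between two cumulative tick samples.
 * Returns 0 when no ticks elapsed, which avoids dividing by zero.
 */
static int
usage_percent(const long older[NSTATES], const long newer[NSTATES])
{
	long total = 0, idle;
	int i;

	assert(older != NULL && newer != NULL);

	/* Total ticks spent in any state during the interval. */
	for (i = 0; i < NSTATES; i++)
		total += newer[i] - older[i];
	if (total <= 0)
		return (0);

	/* Usage is whatever share of the interval was not idle. */
	idle = newer[IDLE_STATE] - older[IDLE_STATE];
	return ((int)(100.0 - 100.0 * (double)idle / (double)total));
}
```

Because the counters are cumulative, the result depends only on the two samples, not on which scheduler produced them.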

From: Julian Dunn <jdunn@aquezada.com>
To: =?ISO-8859-1?Q?Ulrich_Sp=F6rlein?= <uqs@spoerlein.net>
Cc: bug-followup@FreeBSD.org
Subject: Re: kern/130222: bsnmpd snmp_hostres.so always returns 100% CPU
Date: Sat, 03 Jul 2010 23:34:43 -0400

 On 06/02/2010 11:24 AM, Ulrich Spörlein wrote:
 > On Mon, 05.01.2009 at 23:54:05 -0500, Julian C. Dunn wrote:
 >> On FreeBSD 7.1 with the ULE scheduler, bsnmpd always returns 100% CPU
 >> utilization.
 > 
 > Hi Julian,
 > 
 > can you please try the attached patch? It is against 8-STABLE but should
 > also apply to 7-STABLE if you're still using it.
 > 
 > It is not ready for commit, but should do the right thing regardless of
 > the scheduler involved.
 
 Hi Uli,
 
 I'm using 8.1-RC1 now. I applied your patch (verifying, beforehand, that
 the problem still exists). I now see piles of this in /var/log/messages:
 
 Jul  3 23:32:05 fbsdvbox snmpd[1814]: hrProcessorTable:
 sysctl(kern.cp_times) failed
 Jul  3 23:32:50 fbsdvbox last message repeated 7 times
 
 and now, the value "0" is returned no matter what the system load:
 
 demeter:~$ snmpwalk -mALL -c public -v2c 192.168.5.109|grep hrProcessorLoad
 HOST-RESOURCES-MIB::hrProcessorLoad.55 = INTEGER: 0
 
 - Julian

From: Ulrich =?utf-8?B?U3DDtnJsZWlu?= <uqs@spoerlein.net>
To: Julian Dunn <jdunn@aquezada.com>
Cc: bug-followup@FreeBSD.org
Subject: Re: kern/130222: bsnmpd snmp_hostres.so always returns 100% CPU
Date: Sun, 4 Jul 2010 21:18:00 +0100

 --liOOAslEiF7prFVr
 Content-Type: text/plain; charset=us-ascii
 Content-Disposition: inline
 
 Hi Julian,
 
 thanks for testing the patch! It was too naive however and I already
 came up with a more robust patch that also has been successfully tested
 by Gustau. It would be super if you could give it a try as well.
 
 Cheers,
 Uli
 
 --liOOAslEiF7prFVr
 Content-Type: text/x-diff; charset=us-ascii
 Content-Disposition: attachment; filename="bsnmpd.diff"
 
 diff --git a/usr.sbin/bsnmpd/modules/snmp_hostres/Makefile b/usr.sbin/bsnmpd/modules/snmp_hostres/Makefile
 index 2922f45..7fa8e77 100644
 --- a/usr.sbin/bsnmpd/modules/snmp_hostres/Makefile
 +++ b/usr.sbin/bsnmpd/modules/snmp_hostres/Makefile
 @@ -48,7 +48,8 @@ SRCS=	hostres_begemot.c		\
  	printcap.c
  
  #Not having NDEBUG defined will enable assertions and a lot of output on stderr
 -CFLAGS+= -DNDEBUG -I${LPRSRC}
 +WARNS?=	1
 +CFLAGS+= -I${LPRSRC}
  XSYM=	host hrStorageOther hrStorageRam hrStorageVirtualMemory \
  	hrStorageFixedDisk hrStorageRemovableDisk hrStorageFloppyDisk \
  	hrStorageCompactDisc hrStorageRamDisk hrStorageFlashMemory \
 diff --git a/usr.sbin/bsnmpd/modules/snmp_hostres/hostres_processor_tbl.c b/usr.sbin/bsnmpd/modules/snmp_hostres/hostres_processor_tbl.c
 index 33f7b2d..1d8070b 100644
 --- a/usr.sbin/bsnmpd/modules/snmp_hostres/hostres_processor_tbl.c
 +++ b/usr.sbin/bsnmpd/modules/snmp_hostres/hostres_processor_tbl.c
 @@ -63,6 +63,7 @@ struct processor_entry {
  
  	/* the samples from the last minute, as required by MIB */
  	double		samples[MAX_CPU_SAMPLES];
 +	long		states[MAX_CPU_SAMPLES][CPUSTATES];
  
  	/* current sample to fill in next time, must be < MAX_CPU_SAMPLES */
  	uint32_t	cur_sample_idx;
 @@ -85,6 +86,8 @@ static int hw_ncpu;
  /* sysctlbyname(kern.{ccpu,fscale}) */
  static fixpt_t ccpu;
  static int fscale;
 +static int cpmib[2];
 +static size_t cplen;
  
  /* tick of PDU where we have refreshed the processor table last */
  static uint64_t proctbl_tick;
 @@ -112,6 +115,43 @@ get_avg_load(struct processor_entry *e)
  	return ((int)floor((double)sum/(double)e->sample_cnt));
  }
  
 +static int
 +get_avg_usage(struct processor_entry *e)
 +{
 +	u_int i, oldest;
 +	long delta = 0;
 +	double load = 0.0;
 +
 +	assert(e != NULL);
 +
 +	/* Need two samples to perform delta calculation */
 +	if (e->sample_cnt <= 1)
 +		return (0);
 +
 +	/* oldest usable index */
 +	if (e->sample_cnt == MAX_CPU_SAMPLES)
 +		oldest = (e->cur_sample_idx + 1) % MAX_CPU_SAMPLES;
 +	else
 +		oldest = 0;
 +
 +	/* FIXME handle wrap around */
 +	for (i = 0; i < CPUSTATES; i++) {
 +		delta += e->states[e->cur_sample_idx][i];
 +		delta -= e->states[oldest][i];
 +	}
 +	if (delta == 0)
 +		return 0;
 +
 +	/* XXX idle time is in the last index always?!? */
 +	load = (double)(e->states[e->cur_sample_idx][CPUSTATES-1] -
 +	    e->states[oldest][CPUSTATES-1]) / delta;
 +	load = 100 - (load*100);
 +	HRDBG("CPU no. %d delta ticks %ld pct usage %.2f", e->cpu_no,
 +	    delta, load);
 +
 +	return (floor(load));
 +}
 +
  /*
   * Stolen from /usr/src/bin/ps/print.c. The idle process should never
   * be swapped out :-)
 @@ -132,11 +172,15 @@ processor_getpcpu(struct kinfo_proc *ki_p)
   * Save a new sample
   */
  static void
 -save_sample(struct processor_entry *e, struct kinfo_proc *kp)
 +save_sample(struct processor_entry *e, struct kinfo_proc *kp, long *cp_times)
  {
 +	int i;
 +
 +	for (i = 0; cp_times != NULL && i < CPUSTATES; i++)
 +		e->states[e->cur_sample_idx][i] = cp_times[i];
  
  	e->samples[e->cur_sample_idx] = 100.0 - processor_getpcpu(kp);
 -	e->load = get_avg_load(e);
 +	e->load = get_avg_usage(e);
  	e->cur_sample_idx = (e->cur_sample_idx + 1) % MAX_CPU_SAMPLES;
  
  	if (++e->sample_cnt > MAX_CPU_SAMPLES)
 @@ -241,8 +285,6 @@ processor_get_pids(void)
  		entry->idle_pid = kp->ki_pid;
  		HRDBG("CPU no. %d with SNMP index=%d has idle PID %d",
  		    entry->cpu_no, entry->index, entry->idle_pid);
 -
 -		save_sample(entry, kp);
  	}
  }
  
 @@ -256,6 +298,7 @@ create_proc_table(void)
  	struct device_map_entry *map;
  	struct processor_entry *entry;
  	int cpu_no;
 +	size_t len;
  
  	detected_processor_count = 0;
  
 @@ -285,6 +328,20 @@ create_proc_table(void)
  
  	HRDBG("%d CPUs detected", detected_processor_count);
  
 +	len = 2;
 +	if (sysctlnametomib("kern.cp_times", cpmib, &len)) {
 +		syslog(LOG_ERR, "hrProcessorTable: sysctlnametomib(kern.cp_times) failed");
 +		cpmib[0] = 0;
 +		cpmib[1] = 0;
 +		cplen = 0;
 +	} else if (sysctl(cpmib, 2, NULL, &len, NULL, 0)) {
 +		syslog(LOG_ERR, "hrProcessorTable: sysctl(kern.cp_times) length query failed");
 +		cplen = 0;
 +	} else {
 +		cplen = len / sizeof(long);
 +	}
 +	HRDBG("%zu entries for kern.cp_times", cplen);
 +
  	processor_get_pids();
  }
  
 @@ -386,12 +443,22 @@ static void
  refresh_processor_tbl(void)
  {
  	struct processor_entry *entry;
 -	int need_pids;
 +	int need_pids, nproc;
  	struct kinfo_proc *plist;
 -	int nproc;
 +	size_t size;
  
  	processor_refill_tbl();
  
 +	long pcpu_cp_times[cplen];
 +	memset(pcpu_cp_times, 0, sizeof(pcpu_cp_times));
 +
 +	size = cplen * sizeof(long);
 +	/* FIXME: assert entry->ncpu <= hw_ncpu <= cplen */
 +	if (sysctl(cpmib, 2, pcpu_cp_times, &size, NULL, 0) == -1) {
 +		syslog(LOG_ERR, "hrProcessorTable: sysctl(kern.cp_times) failed");
 +		return;
 +	}
 +
  	need_pids = 0;
  	TAILQ_FOREACH(entry, &processor_tbl, link) {
  		if (entry->idle_pid <= 0) {
 @@ -410,7 +477,7 @@ refresh_processor_tbl(void)
  			need_pids = 1;
  			continue;
  		}
 -		save_sample(entry, plist);
 +		save_sample(entry, plist, &pcpu_cp_times[entry->cpu_no * CPUSTATES]);
  	}
  
  	if (need_pids == 1)
 
 --liOOAslEiF7prFVr--
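
The FIXME in refresh_processor_tbl() above notes that nothing checks whether a CPU number fits inside the kern.cp_times array before it is indexed. One way to make that check explicit is sketched below. It is an illustration only and not part of the patch: cpu_slice() and NSTATES are hypothetical names, and NSTATES is assumed to equal FreeBSD's CPUSTATES.

```c
#include <assert.h>
#include <stddef.h>

#define NSTATES	5	/* assumed to match FreeBSD's CPUSTATES */

/*
 * Return the NSTATES counters for cpu_no inside a flat array holding
 * nlongs entries, or NULL when that CPU lies outside the array.
 */
static const long *
cpu_slice(const long *times, size_t nlongs, unsigned int cpu_no)
{
	size_t start = (size_t)cpu_no * NSTATES;

	/* Written so that the comparison itself cannot underflow. */
	if (times == NULL || nlongs < NSTATES || start > nlongs - NSTATES)
		return (NULL);
	return (&times[start]);
}
```

Since save_sample() in the patch already skips the copy when it is handed a NULL pointer, a helper of this shape could feed it directly.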

From: Julian Dunn <jdunn@aquezada.com>
To: =?ISO-8859-1?Q?Ulrich_Sp=F6rlein?= <uqs@spoerlein.net>
Cc: bug-followup@FreeBSD.org
Subject: Re: kern/130222: bsnmpd snmp_hostres.so always returns 100% CPU
Date: Sun, 04 Jul 2010 21:17:40 -0400

 On 07/04/2010 04:18 PM, Ulrich Spörlein wrote:
 > Hi Julian,
 > 
 > thanks for testing the patch! It was too naive however and I already
 > came up with a more robust patch that also has been successfully tested
 > by Gustau. It would be super if you could give it a try as well.
 > 
 > Cheers,
 > Uli
 
 Hi Uli,
 
 This one seems to work much better. I tried exercising the system in
 various ways after applying the patch and the CPU utilization numbers
 looked meaningful to me. Thanks!
 
 - Julian
Responsible-Changed-From-To: harti->uqs 
Responsible-Changed-By: uqs 
Responsible-Changed-When: Mon Aug 2 12:00:04 UTC 2010 
Responsible-Changed-Why:  
Grab PR, as I have a working patch ... 

http://www.freebsd.org/cgi/query-pr.cgi?pr=130222 

From: Oleg Gawriloff <barzog@telecom.by>
To: bug-followup@FreeBSD.org, jdunn@aquezada.com
Cc:  
Subject: Re: kern/130222: bsnmpd snmp_hostres.so always returns 100% CPU
Date: Thu, 26 Aug 2010 17:50:06 +0300

 This is a cryptographically signed message in MIME format.
 
 --------------ms040704090909040708040006
 Content-Type: text/plain; charset=UTF-8; format=flowed
 
   Tested on 7.3-p1:
 FreeBSD eagle-cl1.telecom.by 7.3-RELEASE-p1 FreeBSD 7.3-RELEASE-p1 #0:
 Wed Jun 30 00:23:27 EEST 2010
 root@eagle-cl1.telecom.by:/usr/obj/usr/src/sys/EAGLE-CL1  amd64
 Works as expected:
 HOST-RESOURCES-MIB::hrProcessorLoad.3 = INTEGER: 19
 HOST-RESOURCES-MIB::hrProcessorLoad.6 = INTEGER: 18
 HOST-RESOURCES-MIB::hrProcessorLoad.8 = INTEGER: 15
 HOST-RESOURCES-MIB::hrProcessorLoad.10 = INTEGER: 13
 HOST-RESOURCES-MIB::hrProcessorLoad.12 = INTEGER: 14
 HOST-RESOURCES-MIB::hrProcessorLoad.14 = INTEGER: 11
 HOST-RESOURCES-MIB::hrProcessorLoad.16 = INTEGER: 10
 HOST-RESOURCES-MIB::hrProcessorLoad.18 = INTEGER: 10
 
 --
 Signed, Oleg Gawriloff.
 
 
 
 --------------ms040704090909040708040006--

From: Ulrich =?utf-8?B?U3DDtnJsZWlu?= <uqs@spoerlein.net>
To: bug-followup@FreeBSD.org, Julian Dunn <jdunn@aquezada.com>
Cc: Oleg Gawriloff <barzog@telecom.by>,
        Gustau =?utf-8?B?UMOpcmV6?= <gperez@entel.upc.edu>
Subject: Re: kern/130222: bsnmpd snmp_hostres.so always returns 100% CPU
Date: Sat, 23 Oct 2010 19:39:59 +0200

 --Q59ABw34pTSIagmi
 Content-Type: text/plain; charset=us-ascii
 Content-Disposition: inline
 
 Sorry folks for the long delay, I now have a camera-ready patch, that
 I'm going to commit and MFC soon, as long as I get some positive testing
 feedback. It works fine for me on -STABLE and -CURRENT.
 
 Simply drop the attached file into
 /usr/src/usr.sbin/bsnmpd/modules/snmp_hostres replacing the existing
 file. This should work fine on 9.x and 8.x. Then
 
 # cd /usr/src/usr.sbin/bsnmpd
 # make clean; make obj; make depend
 # make && make install
 # /etc/rc.d/bsnmpd restart
 
 And see if you get meaningful results over an extended period.
 NB: the module will always return zero CPU load for the first 15s after
 startup.
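
The zero reading right after startup follows from the delta calculation needing two stored samples. The ring-buffer bookkeeping behind that can be sketched as below. This is a hedged illustration rather than the attached file: struct ring, ring_init(), ring_push() and ring_oldest() are hypothetical names, and NSAMPLES = 4 is an assumed stand-in for MAX_CPU_SAMPLES, whose real value is not shown in this thread.

```c
#include <assert.h>

#define NSAMPLES	4	/* hypothetical stand-in for MAX_CPU_SAMPLES */

struct ring {
	int cur;	/* slot of the newest sample, -1 when empty */
	int cnt;	/* number of valid samples, capped at NSAMPLES */
};

static void
ring_init(struct ring *r)
{
	r->cur = -1;
	r->cnt = 0;
}

/* Advance to the next slot, wrapping around, and return its index. */
static int
ring_push(struct ring *r)
{
	r->cur = (r->cur + 1) % NSAMPLES;
	if (r->cnt < NSAMPLES)
		r->cnt++;
	return (r->cur);
}

/* Slot of the oldest sample, or -1 while fewer than two are stored. */
static int
ring_oldest(const struct ring *r)
{
	if (r->cnt <= 1)
		return (-1);
	/* Once full, the slot after the newest one holds the oldest. */
	return (r->cnt == NSAMPLES ? (r->cur + 1) % NSAMPLES : 0);
}
```

Until ring_oldest() returns a valid slot there is no interval to measure, so a caller would report a load of zero.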
 
 --Q59ABw34pTSIagmi
 Content-Type: text/x-csrc; charset=us-ascii
 Content-Disposition: attachment; filename="hostres_processor_tbl.c"
 
 /*-
  * Copyright (c) 2005-2006 The FreeBSD Project
  * All rights reserved.
  *
  * Author: Victor Cruceru <soc-victor@freebsd.org>
  *
  * Redistribution of this software and documentation and use in source and
  * binary forms, with or without modification, are permitted provided that
  * the following conditions are met:
  *
  * 1. Redistributions of source code or documentation must retain the above
  *    copyright notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  *
  * $FreeBSD$
  */
 
 /*
  * Host Resources MIB for SNMPd. Implementation for hrProcessorTable
  */
 
 #include <sys/param.h>
 #include <sys/sysctl.h>
 #include <sys/user.h>
 
 #include <assert.h>
 #include <errno.h>
 #include <math.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 #include <syslog.h>
 
 #include "hostres_snmp.h"
 #include "hostres_oid.h"
 #include "hostres_tree.h"
 
 /*
  * This structure is used to hold a SNMP table entry
  * for HOST-RESOURCES-MIB's hrProcessorTable.
  * Note that the index is external, being allocated and maintained
  * by the hrDeviceTable code.
  */
 struct processor_entry {
 	int32_t		index;
 	const struct asn_oid *frwId;
 	int32_t		load;		/* average cpu usage */
 	int32_t		sample_cnt;	/* number of usage samples */
 	int32_t		cur_sample_idx;	/* current valid sample */
 	TAILQ_ENTRY(processor_entry) link;
 	u_char		cpu_no;		/* which cpu, counted from 0 */
 
 	/* the samples from the last minute, as required by MIB */
 	double		samples[MAX_CPU_SAMPLES];
 	long		states[MAX_CPU_SAMPLES][CPUSTATES];
 };
 TAILQ_HEAD(processor_tbl, processor_entry);
 
 /* the head of the list with hrProcessorTable's entries */
 static struct processor_tbl processor_tbl =
     TAILQ_HEAD_INITIALIZER(processor_tbl);
 
 /* number of processors in dev tbl */
 static int32_t detected_processor_count;
 
 /* sysctlbyname(hw.ncpu) */
 static int hw_ncpu;
 
 /* sysctlbyname(kern.cp_times) */
 static int cpmib[2];
 static size_t cplen;
 
 /* periodic timer used to get cpu load stats */
 static void *cpus_load_timer;
 
 /**
  * Returns the CPU usage of a given processor entry.
  *
  * It needs at least two cp_times "tick" samples to calculate a delta and
  * thus, the usage over the sampling period.
  */
 static int
 get_avg_load(struct processor_entry *e)
 {
 	u_int i, oldest;
 	long delta = 0;
 	double usage = 0.0;
 
 	assert(e != NULL);
 
 	/* Need two samples to perform delta calculation. */
 	if (e->sample_cnt <= 1)
 		return (0);
 
 	/* Oldest usable index, we wrap around. */
 	if (e->sample_cnt == MAX_CPU_SAMPLES)
 		oldest = (e->cur_sample_idx + 1) % MAX_CPU_SAMPLES;
 	else
 		oldest = 0;
 
 	/* Sum delta for all states. */
 	for (i = 0; i < CPUSTATES; i++) {
 		delta += e->states[e->cur_sample_idx][i];
 		delta -= e->states[oldest][i];
 	}
 	if (delta == 0)
 		return (0);
 
 	/* Take idle time from the last element and convert to
 	 * percent usage by contrasting with total ticks delta. */
 	usage = (double)(e->states[e->cur_sample_idx][CPUSTATES-1] -
 	    e->states[oldest][CPUSTATES-1]) / delta;
 	usage = 100 - (usage * 100);
 	HRDBG("CPU no. %d, delta ticks %ld, pct usage %.2f", e->cpu_no,
 	    delta, usage);
 
 	return ((int)(usage));
 }
 
 /**
  * Save a new sample to proc entry and get the average usage.
  *
  * Samples are stored in a ringbuffer from 0..(MAX_CPU_SAMPLES-1)
  */
 static void
 save_sample(struct processor_entry *e, long *cp_times)
 {
 	int i;
 
 	e->cur_sample_idx = (e->cur_sample_idx + 1) % MAX_CPU_SAMPLES;
 	for (i = 0; cp_times != NULL && i < CPUSTATES; i++)
 		e->states[e->cur_sample_idx][i] = cp_times[i];
 
 	e->sample_cnt++;
 	if (e->sample_cnt > MAX_CPU_SAMPLES)
 		e->sample_cnt = MAX_CPU_SAMPLES;
 
 	HRDBG("sample count for CPU no. %d went to %d", e->cpu_no, e->sample_cnt);
 	e->load = get_avg_load(e);
 
 }
 
 /**
  * Create a new entry in the processor table.
  */
 static struct processor_entry *
 proc_create_entry(u_int cpu_no, struct device_map_entry *map)
 {
 	struct device_entry *dev;
 	struct processor_entry *entry;
 	char name[128];
 
 	/*
 	 * If there is no map entry create one by creating a device table
 	 * entry.
 	 */
 	if (map == NULL) {
 		snprintf(name, sizeof(name), "cpu%u", cpu_no);
 		if ((dev = device_entry_create(name, "", "")) == NULL)
 			return (NULL);
 		dev->flags |= HR_DEVICE_IMMUTABLE;
 		STAILQ_FOREACH(map, &device_map, link)
 			if (strcmp(map->name_key, name) == 0)
 				break;
 		if (map == NULL)
 			abort();
 	}
 
 	if ((entry = malloc(sizeof(*entry))) == NULL) {
 		syslog(LOG_ERR, "hrProcessorTable: %s malloc "
 		    "failed: %m", __func__);
 		return (NULL);
 	}
 	memset(entry, 0, sizeof(*entry));
 
 	entry->index = map->hrIndex;
 	entry->load = 0;
 	entry->sample_cnt = 0;
 	entry->cur_sample_idx = -1;
 	entry->cpu_no = (u_char)cpu_no;
 	entry->frwId = &oid_zeroDotZero; /* unknown id FIXME */
 
 	INSERT_OBJECT_INT(entry, &processor_tbl);
 
 	HRDBG("CPU %d added with SNMP index=%d",
 	    entry->cpu_no, entry->index);
 
 	return (entry);
 }
 
 /**
  * Scan the device map table for CPUs and create an entry in the
  * processor table for each CPU.
  *
  * Make sure that the number of processors announced by the kernel hw.ncpu
  * is equal to the number of processors we have found in the device table.
  */
 static void
 create_proc_table(void)
 {
 	struct device_map_entry *map;
 	struct processor_entry *entry;
 	int cpu_no;
 	size_t len;
 
 	detected_processor_count = 0;
 
 	/*
 	 * Because hrProcessorTable depends on hrDeviceTable,
 	 * the device detection must be performed at this point.
 	 * If not, no entries will be present in the hrProcessor Table.
 	 *
 	 * On non-ACPI systems the processors are not in the device table,
 	 * so insert them after checking hw.ncpu.
 	 */
 	STAILQ_FOREACH(map, &device_map, link)
 		if (strncmp(map->name_key, "cpu", strlen("cpu")) == 0 &&
 		    strstr(map->location_key, ".CPU") != NULL) {
 			if (sscanf(map->name_key,"cpu%d", &cpu_no) != 1) {
 				syslog(LOG_ERR, "hrProcessorTable: Failed to "
 				    "get cpu no. from device named '%s'",
 				    map->name_key);
 				continue;
 			}
 
 			if ((entry = proc_create_entry(cpu_no, map)) == NULL)
 				continue;
 
 			detected_processor_count++;
 		}
 
 	len = sizeof(hw_ncpu);
 	if (sysctlbyname("hw.ncpu", &hw_ncpu, &len, NULL, 0) == -1 ||
 	    len != sizeof(hw_ncpu)) {
 		syslog(LOG_ERR, "hrProcessorTable: sysctl(hw.ncpu) failed");
 		hw_ncpu = 0;
 	}
 
 	HRDBG("%d CPUs detected via device table; hw.ncpu is %d",
 	    detected_processor_count, hw_ncpu);
 
 	/* XXX Can happen on non-ACPI systems? Create entries by hand. */
 	for (; detected_processor_count < hw_ncpu; detected_processor_count++)
 		proc_create_entry(detected_processor_count, NULL);
 
 	len = 2;
 	if (sysctlnametomib("kern.cp_times", cpmib, &len)) {
 		syslog(LOG_ERR, "hrProcessorTable: sysctlnametomib(kern.cp_times) failed");
 		cpmib[0] = 0;
 		cpmib[1] = 0;
 		cplen = 0;
 	} else if (sysctl(cpmib, 2, NULL, &len, NULL, 0)) {
 		syslog(LOG_ERR, "hrProcessorTable: sysctl(kern.cp_times) length query failed");
 		cplen = 0;
 	} else {
 		cplen = len / sizeof(long);
 	}
 	HRDBG("%zu entries for kern.cp_times", cplen);
 
 }
 
 /**
  * Free the processor table
  */
 static void
 free_proc_table(void)
 {
 	struct processor_entry *n1;
 
 	while ((n1 = TAILQ_FIRST(&processor_tbl)) != NULL) {
 		TAILQ_REMOVE(&processor_tbl, n1, link);
 		free(n1);
 		detected_processor_count--;
 	}
 
 	assert(detected_processor_count == 0);
 	detected_processor_count = 0;
 }
 
 /**
  * Refresh all values in the processor table. We call this once for
  * every PDU that accesses the table.
  */
 static void
 refresh_processor_tbl(void)
 {
 	struct processor_entry *entry;
 	size_t size;
 
 	long pcpu_cp_times[cplen];
 	memset(pcpu_cp_times, 0, sizeof(pcpu_cp_times));
 
 	size = cplen * sizeof(long);
 	if (sysctl(cpmib, 2, pcpu_cp_times, &size, NULL, 0) == -1 &&
 	    !(errno == ENOMEM && size >= cplen * sizeof(long))) {
 		syslog(LOG_ERR, "hrProcessorTable: sysctl(kern.cp_times) failed");
 		return;
 	}
 
 	TAILQ_FOREACH(entry, &processor_tbl, link) {
 		assert(hr_kd != NULL);
 		save_sample(entry, &pcpu_cp_times[entry->cpu_no * CPUSTATES]);
 	}
 
 }
 
 /**
  * This function is called MAX_CPU_SAMPLES times per minute to collect the
  * CPU load.
  */
 static void
 get_cpus_samples(void *arg __unused)
 {
 
 	HRDBG("[%llu] ENTER", (unsigned long long)get_ticks());
 	refresh_processor_tbl();
 	HRDBG("[%llu] EXIT", (unsigned long long)get_ticks());
 }
 
 /**
  * Called to start this table. We need to start the periodic idle
  * time collection.
  */
 void
 start_processor_tbl(struct lmodule *mod)
 {
 
 	/*
 	 * Start the cpu stats collector
 	 * The semantics of timer_start parameters is in "SNMP ticks";
 	 * we have 100 "SNMP ticks" per second, thus we are trying below
 	 * to get MAX_CPU_SAMPLES per minute
 	 */
 	cpus_load_timer = timer_start_repeat(100, 100 * 60 / MAX_CPU_SAMPLES,
 	    get_cpus_samples, NULL, mod);
 }
 
 /**
  * Init the things for hrProcessorTable.
  * Scan the device table for processor entries.
  */
 void
 init_processor_tbl(void)
 {
 
 	/* create the initial processor table */
 	create_proc_table();
 	/* and get first samples */
 	refresh_processor_tbl();
 }
 
 /**
  * Finalization routine for hrProcessorTable.
  * It destroys the lists and frees any allocated heap memory.
  */
 void
 fini_processor_tbl(void)
 {
 
 	if (cpus_load_timer != NULL) {
 		timer_stop(cpus_load_timer);
 		cpus_load_timer = NULL;
 	}
 
 	free_proc_table();
 }
 
 /**
  * Access routine for the processor table.
  */
 int
 op_hrProcessorTable(struct snmp_context *ctx __unused,
     struct snmp_value *value, u_int sub, u_int iidx __unused,
     enum snmp_op curr_op)
 {
 	struct processor_entry *entry;
 
 	switch (curr_op) {
 
 	case SNMP_OP_GETNEXT:
 		if ((entry = NEXT_OBJECT_INT(&processor_tbl,
 		    &value->var, sub)) == NULL)
 			return (SNMP_ERR_NOSUCHNAME);
 		value->var.len = sub + 1;
 		value->var.subs[sub] = entry->index;
 		goto get;
 
 	case SNMP_OP_GET:
 		if ((entry = FIND_OBJECT_INT(&processor_tbl,
 		    &value->var, sub)) == NULL)
 			return (SNMP_ERR_NOSUCHNAME);
 		goto get;
 
 	case SNMP_OP_SET:
 		if ((entry = FIND_OBJECT_INT(&processor_tbl,
 		    &value->var, sub)) == NULL)
 			return (SNMP_ERR_NO_CREATION);
 		return (SNMP_ERR_NOT_WRITEABLE);
 
 	case SNMP_OP_ROLLBACK:
 	case SNMP_OP_COMMIT:
 		abort();
 	}
 	abort();
 
   get:
 	switch (value->var.subs[sub - 1]) {
 
 	case LEAF_hrProcessorFrwID:
 		assert(entry->frwId != NULL);
 		value->v.oid = *entry->frwId;
 		return (SNMP_ERR_NOERROR);
 
 	case LEAF_hrProcessorLoad:
 		value->v.integer = entry->load;
 		return (SNMP_ERR_NOERROR);
 	}
 	abort();
 }
 
 --Q59ABw34pTSIagmi--

From: =?ISO-8859-1?Q?Gustau_P=E9rez?= <gperez@entel.upc.edu>
To: =?ISO-8859-1?Q?Ulrich_Sp=F6rlein?= <uqs@spoerlein.net>
Cc: bug-followup@FreeBSD.org, Julian Dunn <jdunn@aquezada.com>,
        Oleg Gawriloff <barzog@telecom.by>
Subject: Re: kern/130222: bsnmpd snmp_hostres.so always returns 100% CPU
Date: Sat, 23 Oct 2010 21:16:53 +0200

 On 23/10/10 19:39, Ulrich Spörlein wrote:
 > Sorry folks for the long delay, I now have a camera-ready patch, that
 > I'm going to commit and MFC soon, as long as I get some positive testing
 > feedback. It works fine for me on -STABLE and -CURRENT.
 >
 > Simply drop the attached file into
 > /usr/src/usr.sbin/bsnmpd/modules/snmp_hostres replacing the existing
 > file. This should work fine on 9.x and 8.x. Then
 >
 > # cd /usr/src/usr.sbin/bsnmpd
 > # make clean; make obj; make depend
 > # make && make install
 > # /etc/rc.d/bsnmpd restart
 >
 > And see if you get meaningful results over an extended period.
 > NB: the module will always return zero CPU load for the first 15s after
 > startup.
 
     Thanks! I'm testing it on my home server and it seems to work (at
 least a few snmpget queries return reasonable values).
 
     My home system has a very light workload (some mail, some
 Transmission and aMule transfers, some Samba, etc.). bsnmpd seemed a
 little pessimistic at the beginning, but once it had enough samples it
 converged on the values I was seeing in top. I guess the initial
 difference was because I checked the hrProcessorLoad values with:
 
             snmpwalk ... | grep hrProcessorLoad
 
     which itself seems to have raised the CPU usage.
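
The warm-up behaviour described above (nothing reported until a second sample exists, then a window that grows until the buffer is full) can be sketched in isolation. This is only an illustration modelled on the ring-buffer logic in the attached file, not the module's code: `struct ring`, `ring_init`, `ring_push`, `ring_load` and the capacity `RB_SAMPLES = 4` are hypothetical names and values, and each sample is reduced to two cumulative counters (total ticks and idle ticks).

```c
#include <assert.h>
#include <stddef.h>

enum { RB_SAMPLES = 4 };	/* assumed capacity; stands in for MAX_CPU_SAMPLES */

struct ring {
	long	total[RB_SAMPLES];	/* cumulative total ticks per sample */
	long	idle[RB_SAMPLES];	/* cumulative idle ticks per sample */
	int	cnt;			/* valid samples, capped at RB_SAMPLES */
	int	cur;			/* index of newest sample, -1 when empty */
};

static void
ring_init(struct ring *r)
{
	assert(r != NULL);
	r->cnt = 0;
	r->cur = -1;
}

/* Store a new pair of cumulative counters, overwriting the oldest slot. */
static void
ring_push(struct ring *r, long total, long idle)
{
	r->cur = (r->cur + 1) % RB_SAMPLES;
	r->total[r->cur] = total;
	r->idle[r->cur] = idle;
	if (r->cnt < RB_SAMPLES)
		r->cnt++;
}

/* Percent non-idle between the oldest and newest stored samples. */
static int
ring_load(const struct ring *r)
{
	int oldest;
	long dt, di;

	/* A delta needs two samples. */
	if (r->cnt <= 1)
		return (0);

	/* Until the buffer is full, slot 0 is the oldest; afterwards it
	 * is the slot that will be overwritten next. */
	oldest = (r->cnt == RB_SAMPLES) ? (r->cur + 1) % RB_SAMPLES : 0;

	dt = r->total[r->cur] - r->total[oldest];
	di = r->idle[r->cur] - r->idle[oldest];
	if (dt <= 0)
		return (0);
	return ((int)(100.0 - 100.0 * (double)di / (double)dt));
}
```

Feeding this sketch one busy interval followed by idle ones shows the reported value starting at 0, jumping to the busy figure once a second sample exists, and then decaying as idle samples dilute it until the busy interval drops out of the buffer. With four slots, the oldest and newest samples are three sampling intervals apart.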
 
     Anyhow, next Monday I'll test with some n-core systems I have in
 production. I'll report my results, but I think it will work
 flawlessly.
 
     Thanks for your work.
 
     Best regards,
 
     Gustau

From: "Julian C. Dunn" <jdunn@aquezada.com>
To: =?ISO-8859-1?Q?Ulrich_Sp=F6rlein?= <uqs@spoerlein.net>
Cc: bug-followup@FreeBSD.org, Oleg Gawriloff <barzog@telecom.by>, 
 =?ISO-8859-1?Q?Gustau_P=E9rez?= <gperez@entel.upc.edu>
Subject: Re: kern/130222: bsnmpd snmp_hostres.so always returns 100% CPU
Date: Tue, 26 Oct 2010 08:54:11 -0400

 On 10/23/2010 01:39 PM, Ulrich Spörlein wrote:
 > Sorry folks for the long delay, I now have a camera-ready patch, that
 > I'm going to commit and MFC soon, as long as I get some positive testing
 > feedback. It works fine for me on -STABLE and -CURRENT.
 > 
 > Simply drop the attached file into
 > /usr/src/usr.sbin/bsnmpd/modules/snmp_hostres replacing the existing
 > file. This should work fine on 9.x and 8.x. Then
 > 
 > # cd /usr/src/usr.sbin/bsnmpd
 > # make clean; make obj; make depend
 > # make && make install
 > # /etc/rc.d/bsnmpd restart
 > 
 > And see if you get meaningful results over an extended period.
 > NB: the module will always return zero CPU load for the first 15s after
 > startup.
 
 It seems fine to me. I did as instructed, then I generated some load
 with "make buildkernel". The numbers look reasonable.
 
 thanks!
 
 - Julian
 
 -- 
 [ Julian C. Dunn <jdunn@aquezada.com>          * Sorry, I'm    ]
 [ WWW: http://www.aquezada.com/staff/julian    * only Web 1.0  ]
 [ gopher://sdf.lonestar.org/11/users/keymaker  * compliant!    ]
 [ PGP: 91B3 7A9D 683C 7C16 715F 442C 6065 D533 FDC2 05B9       ]

From: =?ISO-8859-1?Q?Gustau_P=E9rez?= <gperez@entel.upc.edu>
To: =?ISO-8859-1?Q?Ulrich_Sp=F6rlein?= <uqs@spoerlein.net>
Cc: bug-followup@FreeBSD.org, Julian Dunn <jdunn@aquezada.com>,
        Oleg Gawriloff <barzog@telecom.by>
Subject: Re: kern/130222: bsnmpd snmp_hostres.so always returns 100% CPU
Date: Wed, 27 Oct 2010 07:41:50 +0200

 On 23/10/10 19:39, Ulrich Spörlein wrote:
 > Sorry folks for the long delay, I now have a camera-ready patch, that
 > I'm going to commit and MFC soon, as long as I get some positive testing
 > feedback. It works fine for me on -STABLE and -CURRENT.
 >
 > Simply drop the attached file into
 > /usr/src/usr.sbin/bsnmpd/modules/snmp_hostres replacing the existing
 > file. This should work fine on 9.x and 8.x. Then
 >
 > # cd /usr/src/usr.sbin/bsnmpd
 > # make clean; make obj; make depend
 > # make && make install
 > # /etc/rc.d/bsnmpd restart
 >
 > And see if you get meaningful results over an extended period.
 > NB: the module will always return zero CPU load for the first 15s after
 > startup.
     Hi Ulrich,
 
     I've been trying it for a few days on some production servers, all
 with 8 cores available. I've been tracking the activity of all their
 cores with Cacti and the results look good to me. So it seems fine for
 me too, and I would commit it to head if I were a committer ;)
 
     Thanks !
 
     Gus

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/130222: commit references a PR
Date: Thu, 28 Oct 2010 20:18:32 +0000 (UTC)

 Author: uqs
 Date: Thu Oct 28 20:18:26 2010
 New Revision: 214489
 URL: http://svn.freebsd.org/changeset/base/214489
 
 Log:
   Fix CPU load reporting independent of scheduler used.
   
   - Sample CPU usage data from kern.cp_times, this makes for a far more
     accurate and scheduler independent algorithm.
   - Rip out the process list scraping that is no longer required.
   - Don't update CPU usage sampling on every request, but every 15s
     instead. This makes it impossible for an attacker to hide the CPU load
     by triggering 4 samplings in short succession when the system is idle.
   - After reaching the steady-state, the system will always report the
     average CPU load of the last 60 sampled seconds.
   - Untangling of call graph.
   
   PR:		kern/130222
   Tested by:	Julian Dunn <jdunn@aquezada.com>
   		Gustau Pérez <gperez@entel.upc.edu>
   		Jürgen Weiß <weiss@uni-mainz.de>
   MFC after:	2 weeks
   
   I'm unsure if some MIB standard states this must be the load average
   for, eg. 300s, it looks like net-snmp isn't even bothering to implement
   the CPU load reporting at all.
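
The delta calculation described in the log can be illustrated on its own, without any sysctl calls. The following is a minimal sketch, assuming five cumulative tick counters per CPU with the idle counter last (which mirrors how the patch indexes `CPUSTATES-1`); `SKETCH_CPUSTATES`, `SKETCH_CP_IDLE` and `usage_pct` are hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: five states per CPU, idle stored last. */
enum { SKETCH_CPUSTATES = 5, SKETCH_CP_IDLE = 4 };

/*
 * Return the percentage of non-idle ticks between two cumulative
 * snapshots of one CPU's counters, or 0 if no ticks elapsed.
 */
static int
usage_pct(const long *older, const long *newer)
{
	long total = 0, idle;
	int i;

	assert(older != NULL && newer != NULL);

	/* Total elapsed ticks is the sum of the deltas of all states. */
	for (i = 0; i < SKETCH_CPUSTATES; i++)
		total += newer[i] - older[i];
	if (total <= 0)
		return (0);

	/* Usage is whatever share of those ticks was not idle. */
	idle = newer[SKETCH_CP_IDLE] - older[SKETCH_CP_IDLE];
	return ((int)(100.0 - 100.0 * (double)idle / (double)total));
}
```

Because the counters are cumulative, the result depends only on the two snapshots being compared, so polling the SNMP agent more often does not change which interval is measured.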
 
 Modified:
   head/usr.sbin/bsnmpd/modules/snmp_hostres/hostres_processor_tbl.c
 
 Modified: head/usr.sbin/bsnmpd/modules/snmp_hostres/hostres_processor_tbl.c
 ==============================================================================
 --- head/usr.sbin/bsnmpd/modules/snmp_hostres/hostres_processor_tbl.c	Thu Oct 28 19:18:54 2010	(r214488)
 +++ head/usr.sbin/bsnmpd/modules/snmp_hostres/hostres_processor_tbl.c	Thu Oct 28 20:18:26 2010	(r214489)
 @@ -56,19 +56,15 @@
  struct processor_entry {
  	int32_t		index;
  	const struct asn_oid *frwId;
 -	int32_t		load;
 +	int32_t		load;		/* average cpu usage */
 +	int32_t		sample_cnt;	/* number of usage samples */
 +	int32_t		cur_sample_idx;	/* current valid sample */
  	TAILQ_ENTRY(processor_entry) link;
  	u_char		cpu_no;		/* which cpu, counted from 0 */
 -	pid_t		idle_pid;	/* PID of idle process for this CPU */
  
  	/* the samples from the last minute, as required by MIB */
  	double		samples[MAX_CPU_SAMPLES];
 -
 -	/* current sample to fill in next time, must be < MAX_CPU_SAMPLES */
 -	uint32_t	cur_sample_idx;
 -
 -	/* number of useful samples */
 -	uint32_t	sample_cnt;
 +	long		states[MAX_CPU_SAMPLES][CPUSTATES];
  };
  TAILQ_HEAD(processor_tbl, processor_entry);
  
 @@ -82,65 +78,78 @@ static int32_t detected_processor_count;
  /* sysctlbyname(hw.ncpu) */
  static int hw_ncpu;
  
 -/* sysctlbyname(kern.{ccpu,fscale}) */
 -static fixpt_t ccpu;
 -static int fscale;
 -
 -/* tick of PDU where we have refreshed the processor table last */
 -static uint64_t proctbl_tick;
 +/* sysctlbyname(kern.cp_times) */
 +static int cpmib[2];
 +static size_t cplen;
  
  /* periodic timer used to get cpu load stats */
  static void *cpus_load_timer;
  
 -/*
 - * Average the samples. The entire algorithm seems to be wrong XXX.
 +/**
 + * Returns the CPU usage of a given processor entry.
 + *
 + * It needs at least two cp_times "tick" samples to calculate a delta and
 + * thus, the usage over the sampling period.
   */
  static int
  get_avg_load(struct processor_entry *e)
  {
 -	u_int i;
 -	double sum = 0.0;
 +	u_int i, oldest;
 +	long delta = 0;
 +	double usage = 0.0;
  
  	assert(e != NULL);
  
 -	if (e->sample_cnt == 0)
 +	/* Need two samples to perform delta calculation. */
 +	if (e->sample_cnt <= 1)
  		return (0);
  
 -	for (i = 0; i < e->sample_cnt; i++)
 -		sum += e->samples[i];
 -
 -	return ((int)floor((double)sum/(double)e->sample_cnt));
 -}
 -
 -/*
 - * Stolen from /usr/src/bin/ps/print.c. The idle process should never
 - * be swapped out :-)
 - */
 -static double
 -processor_getpcpu(struct kinfo_proc *ki_p)
 -{
 -
 -	if (ccpu == 0 || fscale == 0)
 -		return (0.0);
 -
 -#define	fxtofl(fixpt) ((double)(fixpt) / fscale)
 -	return (100.0 * fxtofl(ki_p->ki_pctcpu) /
 -	    (1.0 - exp(ki_p->ki_swtime * log(fxtofl(ccpu)))));
 +	/* Oldest usable index, we wrap around. */
 +	if (e->sample_cnt == MAX_CPU_SAMPLES)
 +		oldest = (e->cur_sample_idx + 1) % MAX_CPU_SAMPLES;
 +	else
 +		oldest = 0;
 +
 +	/* Sum delta for all states. */
 +	for (i = 0; i < CPUSTATES; i++) {
 +		delta += e->states[e->cur_sample_idx][i];
 +		delta -= e->states[oldest][i];
 +	}
 +	if (delta == 0)
 +		return 0;
 +
 +	/* Take idle time from the last element and convert to
 +	 * percent usage by contrasting with total ticks delta. */
 +	usage = (double)(e->states[e->cur_sample_idx][CPUSTATES-1] -
 +	    e->states[oldest][CPUSTATES-1]) / delta;
 +	usage = 100 - (usage * 100);
 +	HRDBG("CPU no. %d, delta ticks %ld, pct usage %.2f", e->cpu_no,
 +	    delta, usage);
 +
 +	return ((int)(usage));
  }
  
  /**
 - * Save a new sample
 + * Save a new sample to proc entry and get the average usage.
 + *
 + * Samples are stored in a ringbuffer from 0..(MAX_CPU_SAMPLES-1)
   */
  static void
 -save_sample(struct processor_entry *e, struct kinfo_proc *kp)
 +save_sample(struct processor_entry *e, long *cp_times)
  {
 +	int i;
  
 -	e->samples[e->cur_sample_idx] = 100.0 - processor_getpcpu(kp);
 -	e->load = get_avg_load(e);
  	e->cur_sample_idx = (e->cur_sample_idx + 1) % MAX_CPU_SAMPLES;
 +	for (i = 0; cp_times != NULL && i < CPUSTATES; i++)
 +		e->states[e->cur_sample_idx][i] = cp_times[i];
  
 -	if (++e->sample_cnt > MAX_CPU_SAMPLES)
 +	e->sample_cnt++;
 +	if (e->sample_cnt > MAX_CPU_SAMPLES)
  		e->sample_cnt = MAX_CPU_SAMPLES;
 +
 +	HRDBG("sample count for CPU no. %d went to %d", e->cpu_no, e->sample_cnt);
 +	e->load = get_avg_load(e);
 +
  }
  
  /**
 @@ -178,8 +187,9 @@ proc_create_entry(u_int cpu_no, struct d
  
  	entry->index = map->hrIndex;
  	entry->load = 0;
 +	entry->sample_cnt = 0;
 +	entry->cur_sample_idx = -1;
  	entry->cpu_no = (u_char)cpu_no;
 -	entry->idle_pid = 0;
  	entry->frwId = &oid_zeroDotZero; /* unknown id FIXME */
  
  	INSERT_OBJECT_INT(entry, &processor_tbl);
 @@ -191,64 +201,11 @@ proc_create_entry(u_int cpu_no, struct d
  }
  
  /**
 - * Get the PIDs for the idle processes of the CPUs.
 - */
 -static void
 -processor_get_pids(void)
 -{
 -	struct kinfo_proc *plist, *kp;
 -	int i;
 -	int nproc;
 -	int cpu;
 -	int nchars;
 -	struct processor_entry *entry;
 -
 -	plist = kvm_getprocs(hr_kd, KERN_PROC_ALL, 0, &nproc);
 -	if (plist == NULL || nproc < 0) {
 -		syslog(LOG_ERR, "hrProcessor: kvm_getprocs() failed: %m");
 -		return;
 -	}
 -
 -	for (i = 0, kp = plist; i < nproc; i++, kp++) {
 -		if (!IS_KERNPROC(kp))
 -			continue;
 -
 -		if (strcmp(kp->ki_comm, "idle") == 0) {
 -			/* single processor system */
 -			cpu = 0;
 -		} else if (sscanf(kp->ki_comm, "idle: cpu%d%n", &cpu, &nchars)
 -		    == 1 && (u_int)nchars == strlen(kp->ki_comm)) {
 -			/* MP system */
 -		} else
 -			/* not an idle process */
 -			continue;
 -
 -		HRDBG("'%s' proc with pid %d is on CPU #%d (last on #%d)",
 -		    kp->ki_comm, kp->ki_pid, kp->ki_oncpu, kp->ki_lastcpu);
 -
 -		TAILQ_FOREACH(entry, &processor_tbl, link)
 -			if (entry->cpu_no == kp->ki_lastcpu)
 -				break;
 -
 -		if (entry == NULL) {
 -			/* create entry on non-ACPI systems */
 -			if ((entry = proc_create_entry(cpu, NULL)) == NULL)
 -				continue;
 -
 -			detected_processor_count++;
 -		}
 -
 -		entry->idle_pid = kp->ki_pid;
 -		HRDBG("CPU no. %d with SNMP index=%d has idle PID %d",
 -		    entry->cpu_no, entry->index, entry->idle_pid);
 -
 -		save_sample(entry, kp);
 -	}
 -}
 -
 -/**
   * Scan the device map table for CPUs and create an entry into the
 - * processor table for each CPU. Then fetch the idle PIDs for all CPUs.
 + * processor table for each CPU.
 + *
 + * Make sure that the number of processors announced by the kernel hw.ncpu
 + * is equal to the number of processors we have found in the device table.
   */
  static void
  create_proc_table(void)
 @@ -256,6 +213,7 @@ create_proc_table(void)
  	struct device_map_entry *map;
  	struct processor_entry *entry;
  	int cpu_no;
 +	size_t len;
  
  	detected_processor_count = 0;
  
 @@ -265,7 +223,7 @@ create_proc_table(void)
  	 * If not, no entries will be present in the hrProcessor Table.
  	 *
  	 * For non-ACPI system the processors are not in the device table,
 -	 * therefor insert them when getting the idle pids. XXX
 +	 * therefore insert them after checking hw.ncpu.
  	 */
  	STAILQ_FOREACH(map, &device_map, link)
  		if (strncmp(map->name_key, "cpu", strlen("cpu")) == 0 &&
 @@ -283,9 +241,34 @@ create_proc_table(void)
  			detected_processor_count++;
  		}
  
 -	HRDBG("%d CPUs detected", detected_processor_count);
 +	len = sizeof(hw_ncpu);
 +	if (sysctlbyname("hw.ncpu", &hw_ncpu, &len, NULL, 0) == -1 ||
 +	    len != sizeof(hw_ncpu)) {
 +		syslog(LOG_ERR, "hrProcessorTable: sysctl(hw.ncpu) failed");
 +		hw_ncpu = 0;
 +	}
 +
 +	HRDBG("%d CPUs detected via device table; hw.ncpu is %d",
 +	    detected_processor_count, hw_ncpu);
 +
 +	/* XXX Can happen on non-ACPI systems? Create entries by hand. */
 +	for (; detected_processor_count < hw_ncpu; detected_processor_count++)
 +		proc_create_entry(detected_processor_count, NULL);
 +
 +	len = 2;
 +	if (sysctlnametomib("kern.cp_times", cpmib, &len)) {
 +		syslog(LOG_ERR, "hrProcessorTable: sysctlnametomib(kern.cp_times) failed");
 +		cpmib[0] = 0;
 +		cpmib[1] = 0;
 +		cplen = 0;
 +	} else if (sysctl(cpmib, 2, NULL, &len, NULL, 0)) {
 +		syslog(LOG_ERR, "hrProcessorTable: sysctl(kern.cp_times) length query failed");
 +		cplen = 0;
 +	} else {
 +		cplen = len / sizeof(long);
 +	}
 +	HRDBG("%zu entries for kern.cp_times", cplen);
  
 -	processor_get_pids();
  }
  
  /**
 @@ -307,78 +290,6 @@ free_proc_table(void)
  }
  
  /**
 - * Init the things for hrProcessorTable.
 - * Scan the device table for processor entries.
 - */
 -void
 -init_processor_tbl(void)
 -{
 -	size_t len;
 -
 -	/* get various parameters from the kernel */
 -	len = sizeof(ccpu);
 -	if (sysctlbyname("kern.ccpu", &ccpu, &len, NULL, 0) == -1) {
 -		syslog(LOG_ERR, "hrProcessorTable: sysctl(kern.ccpu) failed");
 -		ccpu = 0;
 -	}
 -
 -	len = sizeof(fscale);
 -	if (sysctlbyname("kern.fscale", &fscale, &len, NULL, 0) == -1) {
 -		syslog(LOG_ERR, "hrProcessorTable: sysctl(kern.fscale) failed");
 -		fscale = 0;
 -	}
 -
 -	/* create the initial processor table */
 -	create_proc_table();
 -}
 -
 -/**
 - * Finalization routine for hrProcessorTable.
 - * It destroys the lists and frees any allocated heap memory.
 - */
 -void
 -fini_processor_tbl(void)
 -{
 -
 -	if (cpus_load_timer != NULL) {
 -		timer_stop(cpus_load_timer);
 -		cpus_load_timer = NULL;
 -	}
 -
 -	free_proc_table();
 -}
 -
 -/**
 - * Make sure that the number of processors announced by the kernel hw.ncpu
 - * is equal to the number of processors we have found in the device table.
 - * If they differ rescan the device table.
 - */
 -static void
 -processor_refill_tbl(void)
 -{
 -
 -	HRDBG("hw_ncpu=%d detected_processor_count=%d", hw_ncpu,
 -	    detected_processor_count);
 -
 -	if (hw_ncpu <= 0) {
 -		size_t size = sizeof(hw_ncpu);
 -
 -		if (sysctlbyname("hw.ncpu", &hw_ncpu, &size, NULL, 0) == -1 ||
 -		    size != sizeof(hw_ncpu)) {
 -			syslog(LOG_ERR, "hrProcessorTable: "
 -			    "sysctl(hw.ncpu) failed: %m");
 -			hw_ncpu = 0;
 -			return;
 -		}
 -	}
 -
 -	if (hw_ncpu != detected_processor_count) {
 -		free_proc_table();
 -		create_proc_table();
 -	}
 -}
 -
 -/**
   * Refresh all values in the processor table. We call this once for
   * every PDU that accesses the table.
   */
 @@ -386,37 +297,23 @@ static void
  refresh_processor_tbl(void)
  {
  	struct processor_entry *entry;
 -	int need_pids;
 -	struct kinfo_proc *plist;
 -	int nproc;
 +	size_t size;
  
 -	processor_refill_tbl();
 +	long pcpu_cp_times[cplen];
 +	memset(pcpu_cp_times, 0, sizeof(pcpu_cp_times));
  
 -	need_pids = 0;
 -	TAILQ_FOREACH(entry, &processor_tbl, link) {
 -		if (entry->idle_pid <= 0) {
 -			need_pids = 1;
 -			continue;
 -		}
 +	size = cplen * sizeof(long);
 +	if (sysctl(cpmib, 2, pcpu_cp_times, &size, NULL, 0) == -1 &&
 +	    !(errno == ENOMEM && size >= cplen * sizeof(long))) {
 +		syslog(LOG_ERR, "hrProcessorTable: sysctl(kern.cp_times) failed");
 +		return;
 +	}
  
 +	TAILQ_FOREACH(entry, &processor_tbl, link) {
  		assert(hr_kd != NULL);
 -
 -		plist = kvm_getprocs(hr_kd, KERN_PROC_PID,
 -		    entry->idle_pid, &nproc);
 -		if (plist == NULL || nproc != 1) {
 -			syslog(LOG_ERR, "%s: missing item with "
 -			    "PID = %d for CPU #%d\n ", __func__,
 -			    entry->idle_pid, entry->cpu_no);
 -			need_pids = 1;
 -			continue;
 -		}
 -		save_sample(entry, plist);
 +		save_sample(entry, &pcpu_cp_times[entry->cpu_no * CPUSTATES]);
  	}
  
 -	if (need_pids == 1)
 -		processor_get_pids();
 -
 -	proctbl_tick = this_tick;
  }
  
  /**
 @@ -451,6 +348,36 @@ start_processor_tbl(struct lmodule *mod)
  }
  
  /**
 + * Init the things for hrProcessorTable.
 + * Scan the device table for processor entries.
 + */
 +void
 +init_processor_tbl(void)
 +{
 +
 +	/* create the initial processor table */
 +	create_proc_table();
 +	/* and get first samples */
 +	refresh_processor_tbl();
 +}
 +
 +/**
 + * Finalization routine for hrProcessorTable.
 + * It destroys the lists and frees any allocated heap memory.
 + */
 +void
 +fini_processor_tbl(void)
 +{
 +
 +	if (cpus_load_timer != NULL) {
 +		timer_stop(cpus_load_timer);
 +		cpus_load_timer = NULL;
 +	}
 +
 +	free_proc_table();
 +}
 +
 +/**
   * Access routine for the processor table.
   */
  int
 @@ -460,9 +387,6 @@ op_hrProcessorTable(struct snmp_context 
  {
  	struct processor_entry *entry;
  
 -	if (this_tick != proctbl_tick)
 -		refresh_processor_tbl();
 -
  	switch (curr_op) {
  
  	case SNMP_OP_GETNEXT:
 
State-Changed-From-To: open->patched 
State-Changed-By: uqs 
State-Changed-When: Sat Oct 30 14:48:14 UTC 2010 
State-Changed-Why:  
Patched in head, keep around for MFC 

http://www.freebsd.org/cgi/query-pr.cgi?pr=130222 
State-Changed-From-To: patched->closed 
State-Changed-By: uqs 
State-Changed-When: Tue Nov 23 21:37:10 UTC 2010 
State-Changed-Why:  
Merged to stable branches. 

>Unformatted:
 utilization
