From leres@ee.lbl.gov  Wed Feb 21 23:45:18 2007
Return-Path: <leres@ee.lbl.gov>
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 4B68116A400
	for <freebsd-gnats-submit@freebsd.org>; Wed, 21 Feb 2007 23:45:18 +0000 (UTC)
	(envelope-from leres@ee.lbl.gov)
Received: from fun.ee.lbl.gov (fun.ee.lbl.gov [131.243.1.81])
	by mx1.freebsd.org (Postfix) with ESMTP id 2E25913C494
	for <freebsd-gnats-submit@freebsd.org>; Wed, 21 Feb 2007 23:45:17 +0000 (UTC)
	(envelope-from leres@ee.lbl.gov)
Received: from fun.ee.lbl.gov (localhost [127.0.0.1])
	by fun.ee.lbl.gov (8.14.0/8.14.0) with ESMTP id l1LNjH7e083972
	for <freebsd-gnats-submit@freebsd.org>; Wed, 21 Feb 2007 15:45:17 -0800 (PST)
Received: from fun.ee.lbl.gov (leres@localhost)
	by fun.ee.lbl.gov (8.14.0/8.14.0/Submit) with ESMTP id l1LNjH4u083969
	for <freebsd-gnats-submit@freebsd.org>; Wed, 21 Feb 2007 15:45:17 -0800 (PST)
Message-Id: <200702212345.l1LNjH4u083969@fun.ee.lbl.gov>
Date: Wed, 21 Feb 2007 15:45:17 -0800
From: Craig Leres <leres@ee.lbl.gov>
To: freebsd-gnats-submit@freebsd.org
Subject: [PATCH] top shows at least 50% idle when hyperthreading is disabled
X-Send-Pr-Version: 3.113
X-GNATS-Notify:

>Number:         109413
>Category:       bin
>Synopsis:       [patch] top(1) shows at least 50% idle when hyperthreading is disabled
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    jhb
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Wed Feb 21 23:50:08 GMT 2007
>Closed-Date:    Thu Jul 07 19:07:45 UTC 2011
>Last-Modified:  Thu Jul 07 19:07:45 UTC 2011
>Originator:     Craig Leres
>Release:        FreeBSD 6.2-RELEASE i386
>Organization:
Lawrence Berkeley National Laboratory
>Environment:
	% uname -a
	FreeBSD fun.ee.lbl.gov 6.2-RELEASE FreeBSD 6.2-RELEASE #1: Mon Jan 15 11:31:32 PST 2007     leres@fox.ee.lbl.gov:/usr/src/6.2-RELEASE/sys/i386/compile/LBLSMP  i386

>Description:
	If the hardware and kernel support hyperthreading but
	hyperthreading is disabled (which is the default), top
	always reports at least 50% idle time even when all CPUs
	are completely busy.

>How-To-Repeat:
	Ensure hyperthreading is disabled:

	    # sysctl machdep.hyperthreading_allowed=0
	    machdep.hyperthreading_allowed: 0 -> 0

	Note that if sysctl says "unknown oid" then your test system
	does not support hyperthreading and you need to find a
	different system that does.

	Launch top and then start some cpu bound processes; observe
	that idle never goes below 50%.

	Another way to see this effect is to look at how many ticks
	are tallied in the kern.cp_time vector in one second.

	Here's a single processor Pentium III:

	    CPU: Intel(R) Pentium(R) III CPU family      1266MHz (1266.07-MHz 686-class CPU)
	      Origin = "GenuineIntel"  Id = 0x6b1  Stepping = 1
	      Features=0x383fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE>
	    real memory  = 2147418112 (2047 MB)
	    avail memory = 2096291840 (1999 MB)
	    [...]
	    cpu0: <ACPI CPU> on acpi0

	It sees about 133 ticks in a second:

	    % sysctl kern.cp_time ; sleep 1 ; sysctl kern.cp_time
	    kern.cp_time: 386 142 8685 210 9844275
	    kern.cp_time: 386 142 8686 210 9844408

	Here's a dual processor Xeon system with hyperthreading (4
	logical CPUs):

	    CPU: Intel(R) Xeon(TM) CPU 3.20GHz (3200.13-MHz 686-class CPU)
	      Origin = "GenuineIntel"  Id = 0xf34  Stepping = 4
	      Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
	      Features2=0x441d<SSE3,RSVD2,MON,DS_CPL,CNTX-ID,<b14>>
	      AMD Features=0x20000000<LM>
	      Logical CPUs per core: 2
	    real memory  = 2146893824 (2047 MB)
	    avail memory = 2095759360 (1998 MB)
	    FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
	     cpu0 (BSP): APIC ID:  0
	     cpu1 (AP): APIC ID:  1
	     cpu2 (AP): APIC ID:  6
	     cpu3 (AP): APIC ID:  7

	It sees about 534 ticks in a second:

	    % sysctl kern.cp_time ; sleep 1 ; sysctl kern.cp_time
	    kern.cp_time: 31410710 498316 11249090 2058151 1597290413
	    kern.cp_time: 31410710 498316 11249091 2058151 1597290946

	So with 4 times as many logical processors, we see 4 times
	as many ticks. What's interesting is that the number of
	ticks per second is independent of the setting of
	hyperthreading_allowed. But when hyperthreading is disabled,
	two of the processors can only contribute to CP_IDLE ticks
	while the other two can contribute to all types of ticks.
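The tick arithmetic above can be reproduced with a small helper. This is only an illustrative sketch, not part of the submitted patch: the names `ticks_elapsed`, `idle_permille`, `NSTATES` and `IDX_IDLE` are hypothetical, and the code assumes the five kern.cp_time values are ordered user, nice, sys, intr, idle (the CP_USER through CP_IDLE order).

```c
#include <stddef.h>

/* kern.cp_time holds one counter per CPU state: user, nice, sys, intr, idle. */
#define NSTATES		5
#define IDX_IDLE	4	/* assumed position of the idle counter */

/* Sum the per-state tick deltas between two kern.cp_time samples. */
static long
ticks_elapsed(const long before[NSTATES], const long after[NSTATES])
{
	long total = 0;
	size_t i;

	for (i = 0; i < NSTATES; i++)
		total += after[i] - before[i];
	return (total);
}

/* Idle share of the interval in tenths of a percent (1000 == 100.0%). */
static long
idle_permille(const long before[NSTATES], const long after[NSTATES])
{
	long total = ticks_elapsed(before, after);

	if (total <= 0)
		return (0);
	return ((after[IDX_IDLE] - before[IDX_IDLE]) * 1000 / total);
}
```

Fed the two Xeon samples above, `ticks_elapsed()` returns 534 (1 sys tick plus 533 idle ticks). With two of the four logical CPUs halted, at least half of those ticks are always idle ticks, so `idle_permille()` cannot drop below 500 no matter how busy the other two CPUs are.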

>Fix:
	The appended patch to usr.bin/top/machine.c checks to see
	if hyperthreading is possible but disabled and if so,
	subtracts ticks from cp_time[CP_IDLE].
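As a standalone illustration of that correction (a sketch, not the patch itself): the function below applies the same "subtract half of the total from idle" rule to a tick vector. The names `discount_halted_idle`, `NSTATES` and `IDX_IDLE` are hypothetical, and the code assumes that exactly half of the logical CPUs are halted and that the idle counter is the last of five states.

```c
/* kern.cp_time holds one counter per CPU state: user, nice, sys, intr, idle. */
#define NSTATES		5
#define IDX_IDLE	4	/* assumed position of the idle counter */

/*
 * Remove the idle ticks contributed by halted logical CPUs, assuming
 * that exactly half of all ticks come from them.  The result is
 * clamped at zero in case the assumption does not hold exactly.
 */
static void
discount_halted_idle(long cp[NSTATES])
{
	long total = 0;
	int i;

	for (i = 0; i < NSTATES; i++)
		total += cp[i];
	cp[IDX_IDLE] -= total / 2;
	if (cp[IDX_IDLE] < 0)
		cp[IDX_IDLE] = 0;
}
```

For example, a vector of 266 user ticks and 266 idle ticks (two busy CPUs, two halted ones) ends up with 0 idle ticks, so the busy CPUs account for 100% of the remaining total instead of 50%.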

===================================================================
RCS file: RCS/machine.c,v
retrieving revision 1.1
diff -c -r1.1 machine.c
*** machine.c	2007/02/21 22:56:13	1.1
--- machine.c	2007/02/21 23:07:52
***************
*** 150,155 ****
--- 150,156 ----
  static long cp_time[CPUSTATES];
  static long cp_old[CPUSTATES];
  static long cp_diff[CPUSTATES];
+ static int hyperthread_hack;
  
  /* these are for detailing the process states */
  
***************
*** 226,240 ****
  machine_init(struct statics *statics)
  {
  	int pagesize;
! 	size_t modelen;
  	struct passwd *pw;
  
! 	modelen = sizeof(smpmode);
! 	if ((sysctlbyname("machdep.smp_active", &smpmode, &modelen, NULL, 0) < 0 &&
! 		sysctlbyname("kern.smp.active", &smpmode, &modelen, NULL, 0) < 0) ||
! 	    modelen != sizeof(smpmode))
  		smpmode = 0;
  
  	while ((pw = getpwent()) != NULL) {
  		if (strlen(pw->pw_name) > namelength)
  			namelength = strlen(pw->pw_name);
--- 227,250 ----
  machine_init(struct statics *statics)
  {
  	int pagesize;
! 	size_t len;
! 	int hyperthreading_allowed;
  	struct passwd *pw;
  
! 	len = sizeof(smpmode);
! 	if ((sysctlbyname("machdep.smp_active", &smpmode, &len, NULL, 0) < 0 &&
! 		sysctlbyname("kern.smp.active", &smpmode, &len, NULL, 0) < 0) ||
! 	    len != sizeof(smpmode))
  		smpmode = 0;
  
+ 	/* Determine if hyperthreading is supported but disabled */
+ 	len = sizeof(hyperthreading_allowed);
+ 	if (sysctlbyname("machdep.hyperthreading_allowed",
+ 	    &hyperthreading_allowed, &len, NULL, 0) >= 0 &&
+ 	    len == sizeof(hyperthreading_allowed) &&
+ 	    !hyperthreading_allowed)
+ 	    ++hyperthread_hack;
+ 
  	while ((pw = getpwent()) != NULL) {
  		if (strlen(pw->pw_name) > namelength)
  			namelength = strlen(pw->pw_name);
***************
*** 330,335 ****
--- 340,360 ----
  	GETSYSCTL("kern.cp_time", cp_time);
  	GETSYSCTL("vm.loadavg", sysload);
  	GETSYSCTL("kern.lastpid", lastpid);
+ 
+ 	/*
+ 	 * If hyperthreading is supported but disabled, at most 1/2
+ 	 * of the total ticks can be non-idle. To avoid having idle
+ 	 * always displayed as at least 50%, subtract out the phantom
+ 	 * idle ticks.
+ 	 */
+ 	if (hyperthread_hack) {
+ 		total = 0;
+ 		for (i = 0; i < CPUSTATES; ++i)
+ 			total += cp_time[i];
+ 		cp_time[CP_IDLE] -= total / 2;
+ 		if (cp_time[CP_IDLE] < 0)
+ 			cp_time[CP_IDLE] = 0;
+ 	}
  
  	/* convert load averages to doubles */
  	for (i = 0; i < 3; i++)
>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-bugs->jhb 
Responsible-Changed-By: rwatson 
Responsible-Changed-When: Mon Jun 25 11:43:34 UTC 2007 
Responsible-Changed-Why:  
Assign this PR to John Baldwin as he has been involved in development in 
these areas and may be able to comment on the best way to fix this.  My 
intuition is that we should think about changing the kernel measurement 
and reporting bits rather than top, as other tools will otherwise remain 
incorrect even if top is fixed. 


http://www.freebsd.org/cgi/query-pr.cgi?pr=109413 

From: John Baldwin <john@baldwin.cx>
To: bug-followup@freebsd.org, leres@ee.lbl.gov
Cc:  
Subject: Re: bin/109413: [PATCH] top(1) shows at least 50% idle when hyperthreading is disabled
Date: Mon, 25 Jun 2007 14:38:26 -0400

 Yes, the place to fix this is in the kernel, not in userland applications.  We 
 have a hack at work for 4.x at least to address this, but a more proper fix 
 is needed and requires us to have good knowledge in the kernel of online vs 
 offline CPUs.
 
 --- //depot/vendor/freebsd_4/src/sys/kern/kern_clock.c	2003/08/22 15:39:19
 +++ //depot/yahoo/ybsd_4/src/sys/kern/kern_clock.c	2005/02/01 08:02:41
 @@ -376,6 +375,11 @@
  	}
  }
  
 +#ifdef SMP
 +/* XXXHACK */
 +extern int hlt_cpus_mask;
 +#endif
 +
  /*
   * Statistics clock.  Grab profile sample, and if divider reaches 0,
   * do process and kernel statistics.  Most of the statistics are only
 @@ -450,6 +450,15 @@
  		 * so that we know how much of its real time was spent
  		 * in ``non-process'' (i.e., interrupt) work.
  		 */
 +#ifdef SMP
 +		/*
 +		 * XXXHACK: If this is a halted CPU, then don't count it
 +		 * in the statistics.
 +		 */
 +		if (hlt_cpus_mask & 1 << cpuid)
 +			p = NULL;
 +		else {
 +#endif
  		p = curproc;
  		if (CLKF_INTR(frame)) {
  			if (p != NULL)
 @@ -460,6 +469,9 @@
  			cp_time[CP_SYS]++;
  		} else
  			cp_time[CP_IDLE]++;
 +#ifdef SMP
 +		}
 +#endif
  	}
  	pscnt = psdiv;
  
 
 -- 
 John Baldwin
State-Changed-From-To: open->closed 
State-Changed-By: jhb 
State-Changed-When: Thu Jul 7 19:06:49 UTC 2011 
State-Changed-Why:  
HEAD no longer allows dynamic disabling of HTT which "fixes" this. 
Eventually we will have true online/offline CPU support and we will 
ensure top(1) works properly with offline CPUs once that happens. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=109413 
>Unformatted:
