From nobody@FreeBSD.org  Thu Mar 20 09:47:50 2014
Message-Id: <201403200947.s2K9looW036224@cgiserv.freebsd.org>
Date: Thu, 20 Mar 2014 09:47:50 GMT
From: Alok Kataria <akataria@vmware.com>
To: freebsd-gnats-submit@FreeBSD.org
Subject: TSC frequency under virtualization environment is slightly off in overcommitted situations.
X-Send-Pr-Version: www-3.1
X-GNATS-Notify:

>Number:         187783
>Category:       kern
>Synopsis:       TSC frequency under virtualization environment is slightly off in overcommitted situations.
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    jkim
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Thu Mar 20 09:50:00 UTC 2014
>Closed-Date:    
>Last-Modified:  Tue May 13 22:50:00 UTC 2014
>Originator:     Alok Kataria
>Release:        FreeBSD 8.3/8.4
>Organization:
VMware
>Environment:
>Description:
We have seen cases where the TSC frequency that the kernel calibrates is
slightly off from the actual TSC frequency when the FreeBSD kernel is
running in a virtualized environment. The problem is amplified when
the host is overcommitted. We have seen errors as high as 500 ppm in
the calculated TSC frequency.
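
To put the 500 ppm figure in perspective, here is a minimal sketch of
the arithmetic. The function names and the sample frequencies are
hypothetical and only illustrate the calculation; they are not taken
from the kernel.

```c
#include <stdint.h>

/*
 * Relative error of a calibrated frequency against the true one,
 * expressed in parts per million (hypothetical helper).
 */
double
freq_error_ppm(uint64_t calibrated_hz, uint64_t actual_hz)
{
	double diff = (double)calibrated_hz - (double)actual_hz;

	return (diff / (double)actual_hz * 1e6);
}

/*
 * Wall-clock drift, in seconds, accumulated over one day (86400 s)
 * by a clock whose frequency is off by the given ppm error.
 */
double
drift_seconds_per_day(double ppm)
{
	return (ppm * 1e-6 * 86400.0);
}
```

For example, a 2.4 GHz TSC calibrated as 2,401,200,000 Hz is off by
500 ppm, which corresponds to roughly 43.2 seconds of drift per day if
nothing else corrects the clock.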

To solve this problem, FreeBSD 9.x has an updated kernel which, instead
of calibrating the TSC frequency, queries the hypervisor for that
information.
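
As a rough illustration of what querying the hypervisor involves, the
sketch below decodes the values such a query returns. It assumes the
convention that CPUID leaf 0x40000000 returns the vendor signature in
EBX, ECX and EDX, and that a VMware-style timing leaf reports the TSC
frequency in kHz. The helper names hv_signature and hv_khz_to_hz are
hypothetical, and the register values are passed in as plain integers
so the code does not depend on any kernel interface.

```c
#include <stdint.h>
#include <string.h>

#define	HV_SIG_LEN	12

/*
 * Assemble the 12-byte hypervisor vendor signature from the register
 * values of CPUID leaf 0x40000000.  regs[] holds EAX, EBX, ECX, EDX in
 * that order; EAX (the highest supported leaf) is not part of the
 * signature.
 */
void
hv_signature(const uint32_t regs[4], char sig[HV_SIG_LEN + 1])
{
	memcpy(sig + 0, &regs[1], 4);	/* EBX */
	memcpy(sig + 4, &regs[2], 4);	/* ECX */
	memcpy(sig + 8, &regs[3], 4);	/* EDX */
	sig[HV_SIG_LEN] = '\0';
}

/*
 * Convert a frequency reported in kHz to Hz.  The value is widened to
 * 64 bits before multiplying so that frequencies above roughly
 * 4.29 GHz do not wrap around in 32-bit arithmetic.
 */
uint64_t
hv_khz_to_hz(uint32_t khz)
{
	return ((uint64_t)khz * 1000);
}
```

A caller would compare the resulting string against "VMwareVMware"
before trusting the reported frequency, and fall back to calibration
otherwise.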

FreeBSD 8.4 appears to be supported for another year (until June 2015),
so I think we should backport that fix to the next 8.4 kernel update as
well. Since the change has been well tested on 9.x and later, it should
be low risk, IMO.
>How-To-Repeat:

>Fix:
The original fix, which needs to be backported, can be found here:

http://lists.freebsd.org/pipermail/svn-src-head/2011-April/026998.html
>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-bugs->jkim 
Responsible-Changed-By: linimon 
Responsible-Changed-When: Wed Apr 16 01:22:24 UTC 2014 
Responsible-Changed-Why:  
over to committer of r221214. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=187783 

From: Jung-uk Kim <jkim@FreeBSD.org>
To: bug-followup@FreeBSD.org, akataria@vmware.com
Cc:  
Subject: Re: kern/187783: TSC frequency under virtualization environment is
 slightly off in overcommitted situations.
Date: Tue, 13 May 2014 18:49:33 -0400

 This is a multi-part message in MIME format.
 --------------000502090603070805030500
 Content-Type: text/plain; charset=UTF-8
 Content-Transfer-Encoding: 7bit
 
 -----BEGIN PGP SIGNED MESSAGE-----
 Hash: SHA1
 
 Please try the attached patch (not tested).
 
 Jung-uk Kim
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (FreeBSD)
 
 iQEcBAEBAgAGBQJTcqF9AAoJEHyflib82/FGodYH/2JYkb1CarhJYryNL6uUb6CR
 sv60THBAokw2e+muwVgkl5AFtguXMZsfAEJdT1zGzHFPbUH4jZpHNcRIQG4TJAqz
 dfMfnRQsrl94Nv8w0jKLX6HU9wsKh2xopfoEGzpgxJFN+zzKCA57aOaEr03GO9Qx
 ltQckrRgEmFsR4gLXxMVs3kRFsMDXNhuRFJWW1NK8t6YlK/sUzRTpaniOq0iCZqw
 J9x9ziN3I7jfUaC+5xMilwBa8Pw5OFc0luKsuZiKMPDK6+mbDaszfPoIojzSmn0p
 l1uQY0tkraL5N9S8KQq8ZF6C0kLiaPXvYKNYR9daPjuFMYtWYN63EycP+76p9PI=
 =EiQS
 -----END PGP SIGNATURE-----
 
 --------------000502090603070805030500
 Content-Type: text/x-patch;
  name="tsc.diff"
 Content-Transfer-Encoding: 7bit
 Content-Disposition: attachment;
  filename="tsc.diff"
 
 Index: sys/amd64/amd64/tsc.c
 ===================================================================
 --- sys/amd64/amd64/tsc.c	(revision 265982)
 +++ sys/amd64/amd64/tsc.c	(working copy)
 @@ -32,6 +32,7 @@ __FBSDID("$FreeBSD$");
  #include <sys/param.h>
  #include <sys/bus.h>
  #include <sys/cpu.h>
 +#include <sys/limits.h>
  #include <sys/malloc.h>
  #include <sys/systm.h>
  #include <sys/sysctl.h>
 @@ -78,11 +79,95 @@ static struct timecounter tsc_timecounter = {
  	800,			/* quality (adjusted in code) */
  };
  
 -void
 -init_TSC(void)
 +#define	VMW_HVMAGIC		0x564d5868
 +#define	VMW_HVPORT		0x5658
 +#define	VMW_HVCMD_GETVERSION	10
 +#define	VMW_HVCMD_GETHZ		45
 +
 +static __inline void
 +vmware_hvcall(u_int cmd, u_int *p)
  {
 +
 +	__asm __volatile("inl %w3, %0"
 +	: "=a" (p[0]), "=b" (p[1]), "=c" (p[2]), "=d" (p[3])
 +	: "0" (VMW_HVMAGIC), "1" (UINT_MAX), "2" (cmd), "3" (VMW_HVPORT)
 +	: "memory");
 +}
 +
 +static int
 +tsc_freq_vmware(void)
 +{
 +	char hv_sig[13];
 +	u_int regs[4];
 +	char *p;
 +	u_int hv_high;
 +	int i;
 +
 +	/*
 +	 * [RFC] CPUID usage for interaction between Hypervisors and Linux.
 +	 * http://lkml.org/lkml/2008/10/1/246
 +	 *
 +	 * KB1009458: Mechanisms to determine if software is running in
 +	 * a VMware virtual machine
 +	 * http://kb.vmware.com/kb/1009458
 +	 */
 +	hv_high = 0;
 +	if ((cpu_feature2 & CPUID2_HV) != 0) {
 +		do_cpuid(0x40000000, regs);
 +		hv_high = regs[0];
 +		for (i = 1, p = hv_sig; i < 4; i++, p += sizeof(regs) / 4)
 +			memcpy(p, &regs[i], sizeof(regs[i]));
 +		*p = '\0';
 +		if (bootverbose) {
 +			/*
 +			 * HV vendor	ID string
 +			 * ------------+--------------
 +			 * KVM		"KVMKVMKVM"
 +			 * Microsoft	"Microsoft Hv"
 +			 * VMware	"VMwareVMware"
 +			 * Xen		"XenVMMXenVMM"
 +			 */
 +			printf("Hypervisor: Origin = \"%s\"\n", hv_sig);
 +		}
 +		if (strncmp(hv_sig, "VMwareVMware", 12) != 0)
 +			return (0);
 +	} else {
 +		p = getenv("smbios.system.serial");
 +		if (p == NULL)
 +			return (0);
 +		if (strncmp(p, "VMware-", 7) != 0 &&
 +		    strncmp(p, "VMW", 3) != 0) {
 +			freeenv(p);
 +			return (0);
 +		}
 +		freeenv(p);
 +		vmware_hvcall(VMW_HVCMD_GETVERSION, regs);
 +		if (regs[1] != VMW_HVMAGIC)
 +			return (0);
 +	}
 +	if (hv_high >= 0x40000010) {
 +		do_cpuid(0x40000010, regs);
 +		tsc_freq = regs[0] * 1000;
 +	} else {
 +		vmware_hvcall(VMW_HVCMD_GETHZ, regs);
 +		if (regs[1] != UINT_MAX)
 +			tsc_freq = regs[0] | ((uint64_t)regs[1] << 32);
 +	}
 +	tsc_is_invariant = 1;
 +#ifdef SMP
 +	smp_tsc = 1;	/* XXX */
 +#endif
 +	return (1);
 +}
 +
 +static void
 +probe_tsc_freq(void)
 +{
  	u_int64_t tscval[2];
  
 +	if (tsc_freq_vmware())
 +		return;
 +
  	if (bootverbose)
  	        printf("Calibrating TSC clock ... ");
  
 @@ -93,7 +178,14 @@ static struct timecounter tsc_timecounter = {
  	tsc_freq = tscval[1] - tscval[0];
  	if (bootverbose)
  		printf("TSC clock: %lu Hz\n", tsc_freq);
 +}
  
 +void
 +init_TSC(void)
 +{
 +
 +	probe_tsc_freq();
 +
  	/*
  	 * Inform CPU accounting about our boot-time clock rate.  Once the
  	 * system is finished booting, we will get the real max clock rate
 Index: sys/i386/i386/tsc.c
 ===================================================================
 --- sys/i386/i386/tsc.c	(revision 265982)
 +++ sys/i386/i386/tsc.c	(working copy)
 @@ -32,6 +32,7 @@ __FBSDID("$FreeBSD$");
  #include <sys/param.h>
  #include <sys/bus.h>
  #include <sys/cpu.h>
 +#include <sys/limits.h>
  #include <sys/malloc.h>
  #include <sys/systm.h>
  #include <sys/sysctl.h>
 @@ -79,11 +80,95 @@ static struct timecounter tsc_timecounter = {
  	800,			/* quality (adjusted in code) */
  };
  
 -void
 -init_TSC(void)
 +#define	VMW_HVMAGIC		0x564d5868
 +#define	VMW_HVPORT		0x5658
 +#define	VMW_HVCMD_GETVERSION	10
 +#define	VMW_HVCMD_GETHZ		45
 +
 +static __inline void
 +vmware_hvcall(u_int cmd, u_int *p)
  {
 +
 +	__asm __volatile("inl %w3, %0"
 +	: "=a" (p[0]), "=b" (p[1]), "=c" (p[2]), "=d" (p[3])
 +	: "0" (VMW_HVMAGIC), "1" (UINT_MAX), "2" (cmd), "3" (VMW_HVPORT)
 +	: "memory");
 +}
 +
 +static int
 +tsc_freq_vmware(void)
 +{
 +	char hv_sig[13];
 +	u_int regs[4];
 +	char *p;
 +	u_int hv_high;
 +	int i;
 +
 +	/*
 +	 * [RFC] CPUID usage for interaction between Hypervisors and Linux.
 +	 * http://lkml.org/lkml/2008/10/1/246
 +	 *
 +	 * KB1009458: Mechanisms to determine if software is running in
 +	 * a VMware virtual machine
 +	 * http://kb.vmware.com/kb/1009458
 +	 */
 +	hv_high = 0;
 +	if ((cpu_feature2 & CPUID2_HV) != 0) {
 +		do_cpuid(0x40000000, regs);
 +		hv_high = regs[0];
 +		for (i = 1, p = hv_sig; i < 4; i++, p += sizeof(regs) / 4)
 +			memcpy(p, &regs[i], sizeof(regs[i]));
 +		*p = '\0';
 +		if (bootverbose) {
 +			/*
 +			 * HV vendor	ID string
 +			 * ------------+--------------
 +			 * KVM		"KVMKVMKVM"
 +			 * Microsoft	"Microsoft Hv"
 +			 * VMware	"VMwareVMware"
 +			 * Xen		"XenVMMXenVMM"
 +			 */
 +			printf("Hypervisor: Origin = \"%s\"\n", hv_sig);
 +		}
 +		if (strncmp(hv_sig, "VMwareVMware", 12) != 0)
 +			return (0);
 +	} else {
 +		p = getenv("smbios.system.serial");
 +		if (p == NULL)
 +			return (0);
 +		if (strncmp(p, "VMware-", 7) != 0 &&
 +		    strncmp(p, "VMW", 3) != 0) {
 +			freeenv(p);
 +			return (0);
 +		}
 +		freeenv(p);
 +		vmware_hvcall(VMW_HVCMD_GETVERSION, regs);
 +		if (regs[1] != VMW_HVMAGIC)
 +			return (0);
 +	}
 +	if (hv_high >= 0x40000010) {
 +		do_cpuid(0x40000010, regs);
 +		tsc_freq = regs[0] * 1000;
 +	} else {
 +		vmware_hvcall(VMW_HVCMD_GETHZ, regs);
 +		if (regs[1] != UINT_MAX)
 +			tsc_freq = regs[0] | ((uint64_t)regs[1] << 32);
 +	}
 +	tsc_is_invariant = 1;
 +#ifdef SMP
 +	smp_tsc = 1;	/* XXX */
 +#endif
 +	return (1);
 +}
 +
 +static void
 +probe_tsc_freq(void)
 +{
  	u_int64_t tscval[2];
  
 +	if (tsc_freq_vmware())
 +		return;
 +
  	if (cpu_feature & CPUID_TSC)
  		tsc_present = 1;
  	else
 @@ -102,7 +187,14 @@ static struct timecounter tsc_timecounter = {
  	tsc_freq = tscval[1] - tscval[0];
  	if (bootverbose)
  		printf("TSC clock: %ju Hz\n", (intmax_t)tsc_freq);
 +}
  
 +void
 +init_TSC(void)
 +{
 +
 +	probe_tsc_freq();
 +
  	/*
  	 * Inform CPU accounting about our boot-time clock rate.  Once the
  	 * system is finished booting, we will get the real max clock rate
 
 --------------000502090603070805030500--
>Unformatted:
