From nobody@FreeBSD.org  Mon Nov 21 22:29:42 2011
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 0A015106566C
	for <freebsd-gnats-submit@FreeBSD.org>; Mon, 21 Nov 2011 22:29:42 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from red.freebsd.org (red.freebsd.org [IPv6:2001:4f8:fff6::22])
	by mx1.freebsd.org (Postfix) with ESMTP id E44D18FC08
	for <freebsd-gnats-submit@FreeBSD.org>; Mon, 21 Nov 2011 22:29:41 +0000 (UTC)
Received: from red.freebsd.org (localhost [127.0.0.1])
	by red.freebsd.org (8.14.4/8.14.4) with ESMTP id pALMTfaD050061
	for <freebsd-gnats-submit@FreeBSD.org>; Mon, 21 Nov 2011 22:29:41 GMT
	(envelope-from nobody@red.freebsd.org)
Received: (from nobody@localhost)
	by red.freebsd.org (8.14.4/8.14.4/Submit) id pALMTfXv050060;
	Mon, 21 Nov 2011 22:29:41 GMT
	(envelope-from nobody)
Message-Id: <201111212229.pALMTfXv050060@red.freebsd.org>
Date: Mon, 21 Nov 2011 22:29:41 GMT
From: Adam McDougall <mcdouga9@egr.msu.edu>
To: freebsd-gnats-submit@FreeBSD.org
Subject: [PATCH] vm_kmem_size miscalculated due to int type overflow sometimes
X-Send-Pr-Version: www-3.1
X-GNATS-Notify:

>Number:         162741
>Category:       kern
>Synopsis:       [PATCH] vm_kmem_size miscalculated due to int type overflow sometimes
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-bugs
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Mon Nov 21 22:30:10 UTC 2011
>Closed-Date:    Sat Jan 21 07:59:16 UTC 2012
>Last-Modified:  Sat Jan 21 08:00:18 UTC 2012
>Originator:     Adam McDougall
>Release:        8 and 9
>Organization:
>Environment:
FreeBSD ike2.egr.msu.edu 9.0-RC1 FreeBSD 9.0-RC1 #3: Mon Nov 21 12:29:54 EST 2011     root@ike2:/usr/obj/usr/src/sys/AMD64-9  amd64

>Description:
Filesystem advancements in FreeBSD over the last few years have greatly increased the need for a large vm.kmem_size.  I've been raising mine to improve stability and performance, using a rule of thumb of setting kmem_size to twice the amount of physical RAM, or a little above 2x.  As equipment has improved and FreeBSD has allowed vm.kmem_size to exceed 2G, I've run into a problem when increasing it: the value I provided in loader.conf would sometimes be overridden with a much smaller number.  This became woefully evident on computers with considerably more than 4G of RAM.  I had a local hack in place for years, but finally got some help tracking down the true cause, which is the following check in sys/kern/kern_malloc.c:

         * to something sane. Be careful to not overflow the 32bit
         * ints while doing the check.
         */
        if (((vm_kmem_size / 2) / PAGE_SIZE) > cnt.v_page_count)
                vm_kmem_size = 2 * cnt.v_page_count * PAGE_SIZE;

The problem is that cnt.v_page_count is defined as a u_int while vm_kmem_size is a u_long, so the adjustment is computed in 32-bit arithmetic and overflows before the result is stored in vm_kmem_size.  This happens whenever the vm.kmem_size I provide in loader.conf is more than 2x physmem, because that is when the adjustment runs.  The warning in the comment applies to the adjustment and not just to the check.  Possible fixes are to replace cnt.v_page_count with mem_size or with ((u_long) cnt.v_page_count).  mem_size is defined earlier in the file as a u_long with the same value as cnt.v_page_count.  The same fix is needed in 8.x (and probably earlier), but the line numbers are different.  I would really appreciate it if this or an equivalent fix could be committed so that I don't need to remember to patch my source trees before building.
>How-To-Repeat:
Set vm.kmem_size="17G" in loader.conf on a computer with 8G of RAM, then check the sysctl after boot.  It will have been reduced to 3.5G, or to some other value much smaller than the 2x physmem that the check is supposed to enforce.  I believe setting vm.kmem_size to anything larger than 2x physmem should reproduce this issue.
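
The arithmetic behind those numbers can be checked with a minimal C sketch.  This is an illustration only, not kernel code: clamp_old and clamp_new are hypothetical names, PAGE_SIZE is assumed to be 4096, uint32_t and uint64_t stand in for u_int and the 64-bit u_long on amd64, and the page count of 2031616 (8G minus an assumed 256M that is not counted) is picked only to show how a result of 3.5G can arise.

```c
#include <stdint.h>

#define PAGE_SIZE 4096u	/* assumption: 4 KiB pages, as on amd64 */

/*
 * Old adjustment: every operand is at most 32 bits wide, so the
 * product wraps modulo 2^32 before it is widened for the caller.
 */
static uint64_t
clamp_old(uint32_t page_count)
{
	return (2 * page_count * PAGE_SIZE);
}

/*
 * Patched adjustment: the page count is held in a 64-bit variable
 * (like mem_size), so the whole product is computed in 64 bits.
 */
static uint64_t
clamp_new(uint64_t mem_size)
{
	return (2 * mem_size * PAGE_SIZE);
}
```

With the assumed page count, clamp_old(2031616) yields 3758096384 (3.5G), while clamp_new(2031616) yields 16642998272 (15.5G), which is the intended 2x physmem limit.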

>Fix:


Patch attached with submission follows:

--- sys/kern/kern_malloc.c.orig	2011-11-21 12:19:25.712591472 -0500
+++ sys/kern/kern_malloc.c	2011-11-21 17:25:11.831042640 -0500
@@ -704,10 +704,10 @@
 	 * Limit kmem virtual size to twice the physical memory.
 	 * This allows for kmem map sparseness, but limits the size
 	 * to something sane. Be careful to not overflow the 32bit
-	 * ints while doing the check.
+	 * ints while doing the check or the adjustment.
 	 */
 	if (((vm_kmem_size / 2) / PAGE_SIZE) > cnt.v_page_count)
-		vm_kmem_size = 2 * cnt.v_page_count * PAGE_SIZE;
+		vm_kmem_size = 2 * mem_size * PAGE_SIZE;
 
 #ifdef DEBUG_MEMGUARD
 	tmp = memguard_fudge(vm_kmem_size, vm_kmem_size_max);


>Release-Note:
>Audit-Trail:

From: Bruce Evans <brde@optusnet.com.au>
To: Adam McDougall <mcdouga9@egr.msu.edu>
Cc: freebsd-gnats-submit@FreeBSD.org, freebsd-bugs@FreeBSD.org
Subject: Re: kern/162741: [PATCH] vm_kmem_size miscalculated due to int type
 overflow sometimes
Date: Wed, 23 Nov 2011 03:48:43 +1100 (EST)

 On Mon, 21 Nov 2011, Adam McDougall wrote:
 
 >> Description:
 
 > [Misformatted lines deleted]
 
 >> Fix:
 >
 > Patch attached with submission follows:
 >
 > --- sys/kern/kern_malloc.c.orig	2011-11-21 12:19:25.712591472 -0500
 > +++ sys/kern/kern_malloc.c	2011-11-21 17:25:11.831042640 -0500
 > @@ -704,10 +704,10 @@
 > 	 * Limit kmem virtual size to twice the physical memory.
 > 	 * This allows for kmem map sparseness, but limits the size
 > 	 * to something sane. Be careful to not overflow the 32bit
 > -	 * ints while doing the check.
 > +	 * ints while doing the check or the adjustment.
 > 	 */
 > 	if (((vm_kmem_size / 2) / PAGE_SIZE) > cnt.v_page_count)
 > -		vm_kmem_size = 2 * cnt.v_page_count * PAGE_SIZE;
 > +		vm_kmem_size = 2 * mem_size * PAGE_SIZE;
 >
 > #ifdef DEBUG_MEMGUARD
 > 	tmp = memguard_fudge(vm_kmem_size, vm_kmem_size_max);
 
 cnt.v_page_count should probably be spelled as mem_size in the check too.
 
 The limit is still garbage for 32-bit systems.  32-bit systems can
 easily have 2-4GB of physical memory.  i386 with PAE can have much
 more.  Overflow can't occur in (2 * cnt.v_page_count * PAGE_SIZE)
 since the original vm_kmem_size is limited to 4G-1 by u_long bogusly
 being 32 bits on all supported 32-bit systems.  But the user can
 misconfigure things so that the original vm_kmem_size is only slightly
 less than 4G.  Then there cannot be that much kva.  But when there is
 >= 2G physical, clamping kva to <= 2*physical has no effect.
 
 VM_KMEM_SIZE_MAX or vm.kmem_size would have to be misconfigured for
 vm_kmem_size to be impossibly large.  This means that the above code
 usually has no effect on 32-bit systems.
 
 Bruce
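
The 32-bit case described above can be illustrated with a short C sketch, which is not part of the original follow-up.  It assumes 4 KiB pages and models the 32-bit u_long with uint32_t; clamp_applies_32 is a hypothetical name for the check as it appears in the patched code.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

/*
 * The clamp check with u_long modelled as a 32-bit type.
 * vm_kmem_size cannot exceed 4G-1 here, so the left-hand side is
 * never larger than 524287 pages.
 */
static bool
clamp_applies_32(uint32_t vm_kmem_size, uint32_t mem_size)
{
	return (vm_kmem_size / 2 / PAGE_SIZE > mem_size);
}
```

With 2G of physical memory (524288 pages) the check is false even for the largest representable vm_kmem_size, so the clamp never applies; with 1G (262144 pages) it can still trigger.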

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/162741: commit references a PR
Date: Wed,  7 Dec 2011 07:03:23 +0000 (UTC)

 Author: alc
 Date: Wed Dec  7 07:03:14 2011
 New Revision: 228317
 URL: http://svn.freebsd.org/changeset/base/228317
 
 Log:
   Eliminate the possibility of 32-bit arithmetic overflow in the calculation
   of vm_kmem_size that may occur if the system administrator has specified a
   vm.vm_kmem_size tunable value that exceeds the hard cap.
   
   PR:		162741
   Submitted by:	Adam McDougall
   Reviewed by:	bde@
   MFC after:	3 weeks
 
 Modified:
   head/sys/kern/kern_malloc.c
 
 Modified: head/sys/kern/kern_malloc.c
 ==============================================================================
 --- head/sys/kern/kern_malloc.c	Wed Dec  7 00:22:34 2011	(r228316)
 +++ head/sys/kern/kern_malloc.c	Wed Dec  7 07:03:14 2011	(r228317)
 @@ -740,11 +740,11 @@ kmeminit(void *dummy)
  	/*
  	 * Limit kmem virtual size to twice the physical memory.
  	 * This allows for kmem map sparseness, but limits the size
 -	 * to something sane. Be careful to not overflow the 32bit
 -	 * ints while doing the check.
 +	 * to something sane.  Be careful to not overflow the 32bit
 +	 * ints while doing the check or the adjustment.
  	 */
 -	if (((vm_kmem_size / 2) / PAGE_SIZE) > cnt.v_page_count)
 -		vm_kmem_size = 2 * cnt.v_page_count * PAGE_SIZE;
 +	if (vm_kmem_size / 2 / PAGE_SIZE > mem_size)
 +		vm_kmem_size = 2 * mem_size * PAGE_SIZE;
  
  #ifdef DEBUG_MEMGUARD
  	tmp = memguard_fudge(vm_kmem_size, vm_kmem_size_max);
 _______________________________________________
 svn-src-all@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/svn-src-all
 To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
 
State-Changed-From-To: open->patched 
State-Changed-By: alc 
State-Changed-When: Wed Dec 7 07:13:08 UTC 2011 
State-Changed-Why:  
The patch has been applied to HEAD. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=162741 

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/162741: commit references a PR
Date: Sat, 21 Jan 2012 05:03:19 +0000 (UTC)

 Author: alc
 Date: Sat Jan 21 05:03:10 2012
 New Revision: 230418
 URL: http://svn.freebsd.org/changeset/base/230418
 
 Log:
   MFC r226163, r228317, and r228324
     Fix the handling of an empty kmem map by sysctl_kmem_map_free().
   
     Eliminate the possibility of 32-bit arithmetic overflow in the
     calculation of vm_kmem_size that may occur if the system
     administrator has specified a vm.vm_kmem_size tunable value that
     exceeds the hard cap.
   
     Eliminate stale numbers from a comment.
   
   PR:		162741
 
 Modified:
   stable/9/sys/kern/kern_malloc.c
 Directory Properties:
   stable/9/sys/   (props changed)
   stable/9/sys/amd64/include/xen/   (props changed)
   stable/9/sys/boot/   (props changed)
   stable/9/sys/boot/i386/efi/   (props changed)
   stable/9/sys/boot/ia64/efi/   (props changed)
   stable/9/sys/boot/ia64/ski/   (props changed)
   stable/9/sys/boot/powerpc/boot1.chrp/   (props changed)
   stable/9/sys/boot/powerpc/ofw/   (props changed)
   stable/9/sys/cddl/contrib/opensolaris/   (props changed)
   stable/9/sys/conf/   (props changed)
   stable/9/sys/contrib/dev/acpica/   (props changed)
   stable/9/sys/contrib/octeon-sdk/   (props changed)
   stable/9/sys/contrib/pf/   (props changed)
   stable/9/sys/contrib/x86emu/   (props changed)
 
 Modified: stable/9/sys/kern/kern_malloc.c
 ==============================================================================
 --- stable/9/sys/kern/kern_malloc.c	Sat Jan 21 04:24:19 2012	(r230417)
 +++ stable/9/sys/kern/kern_malloc.c	Sat Jan 21 05:03:10 2012	(r230418)
 @@ -265,8 +265,8 @@ sysctl_kmem_map_free(SYSCTL_HANDLER_ARGS
  	u_long size;
  
  	vm_map_lock_read(kmem_map);
 -	size = kmem_map->root != NULL ?
 -	    kmem_map->root->max_free : kmem_map->size;
 +	size = kmem_map->root != NULL ? kmem_map->root->max_free :
 +	    kmem_map->max_offset - kmem_map->min_offset;
  	vm_map_unlock_read(kmem_map);
  	return (sysctl_handle_long(oidp, &size, 0, req));
  }
 @@ -661,12 +661,9 @@ kmeminit(void *dummy)
  
  	/*
  	 * Try to auto-tune the kernel memory size, so that it is
 -	 * more applicable for a wider range of machine sizes.
 -	 * On an X86, a VM_KMEM_SIZE_SCALE value of 4 is good, while
 -	 * a VM_KMEM_SIZE of 12MB is a fair compromise.  The
 +	 * more applicable for a wider range of machine sizes.  The
  	 * VM_KMEM_SIZE_MAX is dependent on the maximum KVA space
 -	 * available, and on an X86 with a total KVA space of 256MB,
 -	 * try to keep VM_KMEM_SIZE_MAX at 80MB or below.
 +	 * available.
  	 *
  	 * Note that the kmem_map is also used by the zone allocator,
  	 * so make sure that there is enough space.
 @@ -703,11 +700,11 @@ kmeminit(void *dummy)
  	/*
  	 * Limit kmem virtual size to twice the physical memory.
  	 * This allows for kmem map sparseness, but limits the size
 -	 * to something sane. Be careful to not overflow the 32bit
 -	 * ints while doing the check.
 +	 * to something sane.  Be careful to not overflow the 32bit
 +	 * ints while doing the check or the adjustment.
  	 */
 -	if (((vm_kmem_size / 2) / PAGE_SIZE) > cnt.v_page_count)
 -		vm_kmem_size = 2 * cnt.v_page_count * PAGE_SIZE;
 +	if (vm_kmem_size / 2 / PAGE_SIZE > mem_size)
 +		vm_kmem_size = 2 * mem_size * PAGE_SIZE;
  
  #ifdef DEBUG_MEMGUARD
  	tmp = memguard_fudge(vm_kmem_size, vm_kmem_size_max);
 

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/162741: commit references a PR
Date: Sat, 21 Jan 2012 07:22:05 +0000 (UTC)

 Author: alc
 Date: Sat Jan 21 07:21:44 2012
 New Revision: 230419
 URL: http://svn.freebsd.org/changeset/base/230419
 
 Log:
   MFC r226163, r228317, and r228324
     Fix the handling of an empty kmem map by sysctl_kmem_map_free().
   
     Eliminate the possibility of 32-bit arithmetic overflow in the
     calculation of vm_kmem_size that may occur if the system
     administrator has specified a vm.vm_kmem_size tunable value that
     exceeds the hard cap.
   
     Eliminate stale numbers from a comment.
   
   PR:		162741
 
 Modified:
   stable/8/sys/kern/kern_malloc.c
 Directory Properties:
   stable/8/sys/   (props changed)
   stable/8/sys/amd64/include/xen/   (props changed)
   stable/8/sys/cddl/contrib/opensolaris/   (props changed)
   stable/8/sys/contrib/dev/acpica/   (props changed)
   stable/8/sys/contrib/pf/   (props changed)
 
 Modified: stable/8/sys/kern/kern_malloc.c
 ==============================================================================
 --- stable/8/sys/kern/kern_malloc.c	Sat Jan 21 05:03:10 2012	(r230418)
 +++ stable/8/sys/kern/kern_malloc.c	Sat Jan 21 07:21:44 2012	(r230419)
 @@ -258,8 +258,8 @@ sysctl_kmem_map_free(SYSCTL_HANDLER_ARGS
  	u_long size;
  
  	vm_map_lock_read(kmem_map);
 -	size = kmem_map->root != NULL ?
 -	    kmem_map->root->max_free : kmem_map->size;
 +	size = kmem_map->root != NULL ? kmem_map->root->max_free :
 +	    kmem_map->max_offset - kmem_map->min_offset;
  	vm_map_unlock_read(kmem_map);
  	return (sysctl_handle_long(oidp, &size, 0, req));
  }
 @@ -595,12 +595,9 @@ kmeminit(void *dummy)
  
  	/*
  	 * Try to auto-tune the kernel memory size, so that it is
 -	 * more applicable for a wider range of machine sizes.
 -	 * On an X86, a VM_KMEM_SIZE_SCALE value of 4 is good, while
 -	 * a VM_KMEM_SIZE of 12MB is a fair compromise.  The
 +	 * more applicable for a wider range of machine sizes.  The
  	 * VM_KMEM_SIZE_MAX is dependent on the maximum KVA space
 -	 * available, and on an X86 with a total KVA space of 256MB,
 -	 * try to keep VM_KMEM_SIZE_MAX at 80MB or below.
 +	 * available.
  	 *
  	 * Note that the kmem_map is also used by the zone allocator,
  	 * so make sure that there is enough space.
 @@ -637,11 +634,11 @@ kmeminit(void *dummy)
  	/*
  	 * Limit kmem virtual size to twice the physical memory.
  	 * This allows for kmem map sparseness, but limits the size
 -	 * to something sane. Be careful to not overflow the 32bit
 -	 * ints while doing the check.
 +	 * to something sane.  Be careful to not overflow the 32bit
 +	 * ints while doing the check or the adjustment.
  	 */
 -	if (((vm_kmem_size / 2) / PAGE_SIZE) > cnt.v_page_count)
 -		vm_kmem_size = 2 * cnt.v_page_count * PAGE_SIZE;
 +	if (vm_kmem_size / 2 / PAGE_SIZE > mem_size)
 +		vm_kmem_size = 2 * mem_size * PAGE_SIZE;
  
  	/*
  	 * Tune settings based on the kmem map's size at this time.
 
State-Changed-From-To: patched->closed 
State-Changed-By: alc 
State-Changed-When: Sat Jan 21 07:58:11 UTC 2012 
State-Changed-Why:  
All active branches have been fixed. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=162741 

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/162741: commit references a PR
Date: Sat, 21 Jan 2012 07:57:38 +0000 (UTC)

 Author: alc
 Date: Sat Jan 21 07:57:27 2012
 New Revision: 230420
 URL: http://svn.freebsd.org/changeset/base/230420
 
 Log:
   MFC r226163, r228317, and r228324
     Fix the handling of an empty kmem map by sysctl_kmem_map_free().
   
     Eliminate the possibility of 32-bit arithmetic overflow in the
     calculation of vm_kmem_size that may occur if the system
     administrator has specified a vm.vm_kmem_size tunable value that
     exceeds the hard cap.
   
     Eliminate stale numbers from a comment.
   
   PR:		162741
 
 Modified:
   stable/7/sys/kern/kern_malloc.c
 Directory Properties:
   stable/7/sys/   (props changed)
   stable/7/sys/cddl/contrib/opensolaris/   (props changed)
   stable/7/sys/contrib/dev/acpica/   (props changed)
   stable/7/sys/contrib/pf/   (props changed)
 
 Modified: stable/7/sys/kern/kern_malloc.c
 ==============================================================================
 --- stable/7/sys/kern/kern_malloc.c	Sat Jan 21 07:21:44 2012	(r230419)
 +++ stable/7/sys/kern/kern_malloc.c	Sat Jan 21 07:57:27 2012	(r230420)
 @@ -258,8 +258,8 @@ sysctl_kmem_map_free(SYSCTL_HANDLER_ARGS
  	u_long size;
  
  	vm_map_lock_read(kmem_map);
 -	size = kmem_map->root != NULL ?
 -	    kmem_map->root->max_free : kmem_map->size;
 +	size = kmem_map->root != NULL ? kmem_map->root->max_free :
 +	    kmem_map->max_offset - kmem_map->min_offset;
  	vm_map_unlock_read(kmem_map);
  	return (sysctl_handle_long(oidp, &size, 0, req));
  }
 @@ -594,12 +594,9 @@ kmeminit(void *dummy)
  
  	/*
  	 * Try to auto-tune the kernel memory size, so that it is
 -	 * more applicable for a wider range of machine sizes.
 -	 * On an X86, a VM_KMEM_SIZE_SCALE value of 4 is good, while
 -	 * a VM_KMEM_SIZE of 12MB is a fair compromise.  The
 +	 * more applicable for a wider range of machine sizes.  The
  	 * VM_KMEM_SIZE_MAX is dependent on the maximum KVA space
 -	 * available, and on an X86 with a total KVA space of 256MB,
 -	 * try to keep VM_KMEM_SIZE_MAX at 80MB or below.
 +	 * available.
  	 *
  	 * Note that the kmem_map is also used by the zone allocator,
  	 * so make sure that there is enough space.
 @@ -640,11 +637,11 @@ kmeminit(void *dummy)
  	/*
  	 * Limit kmem virtual size to twice the physical memory.
  	 * This allows for kmem map sparseness, but limits the size
 -	 * to something sane. Be careful to not overflow the 32bit
 -	 * ints while doing the check.
 +	 * to something sane.  Be careful to not overflow the 32bit
 +	 * ints while doing the check or the adjustment.
  	 */
 -	if (((vm_kmem_size / 2) / PAGE_SIZE) > cnt.v_page_count)
 -		vm_kmem_size = 2 * cnt.v_page_count * PAGE_SIZE;
 +	if (vm_kmem_size / 2 / PAGE_SIZE > mem_size)
 +		vm_kmem_size = 2 * mem_size * PAGE_SIZE;
  
  	/*
  	 * Tune settings based on the kmem map's size at this time.
 
>Unformatted:
