From nobody@FreeBSD.org  Fri Apr 27 22:11:57 2007
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 8165E16A404
	for <freebsd-gnats-submit@FreeBSD.org>; Fri, 27 Apr 2007 22:11:57 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (www.freebsd.org [69.147.83.33])
	by mx1.freebsd.org (Postfix) with ESMTP id 635C013C44B
	for <freebsd-gnats-submit@FreeBSD.org>; Fri, 27 Apr 2007 22:11:57 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.13.1/8.13.1) with ESMTP id l3RMBucH001197
	for <freebsd-gnats-submit@FreeBSD.org>; Fri, 27 Apr 2007 22:11:56 GMT
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.13.1/8.13.1/Submit) id l3RM6tED000314;
	Fri, 27 Apr 2007 22:06:55 GMT
	(envelope-from nobody)
Message-Id: <200704272206.l3RM6tED000314@www.freebsd.org>
Date: Fri, 27 Apr 2007 22:06:55 GMT
From: Nathan Whitehorn <nathanw@ginger.rh.uchicago.edu>
To: freebsd-gnats-submit@FreeBSD.org
Subject: [PATCH] VM handles systems with sparse physical memory poorly
X-Send-Pr-Version: www-3.0

>Number:         112194
>Category:       kern
>Synopsis:       [vm] [patch] VM handles systems with sparse physical memory poorly
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    alc
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Fri Apr 27 22:20:04 GMT 2007
>Closed-Date:    Mon May 14 05:57:29 GMT 2007
>Last-Modified:  Mon May 14 05:57:29 GMT 2007
>Originator:     Nathan Whitehorn
>Release:        7-CURRENT
>Organization:
University of Chicago
>Environment:
>Description:
On systems with sparse physical memory (UltraSPARC IIIi systems, for
instance, map each DIMM to an 8 GB boundary and each CPU's RAM to a
64 GB boundary), the VM system sizes vm_page_array to cover the whole
span between the lowest and highest physical addresses, so it allocates
an enormous number of entries for RAM that isn't there. On UltraSPARC
IIIi in particular, this array is larger than the largest contiguous
region of physical memory, which causes an early kernel panic. I
believe this is also a problem on IA64.
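
To make the scale concrete, here is a rough sketch of the arithmetic.
The DIMM layout (four 512 MB DIMMs, two per CPU), the 120-byte entry
size, and every ex_/EX_ name are illustrative assumptions, not values
measured on the machine in this report. Only the 8 GB and 64 GB
boundaries come from the description above.

```c
#include <assert.h>
#include <stdint.h>

#define EX_PAGE_SIZE    8192ULL   /* sparc64 base page size */
#define EX_VM_PAGE_SIZE 120ULL    /* assumed sizeof(struct vm_page) */
#define EX_GB           (1ULL << 30)
#define EX_MB           (1ULL << 20)

/* Hypothetical phys_avail-style list: start/end pairs, zero-terminated. */
static const uint64_t ex_phys_avail[] = {
    0 * EX_GB,  0 * EX_GB + 512 * EX_MB,   /* CPU 0, DIMM 0 */
    8 * EX_GB,  8 * EX_GB + 512 * EX_MB,   /* CPU 0, DIMM 1 */
    64 * EX_GB, 64 * EX_GB + 512 * EX_MB,  /* CPU 1, DIMM 0 */
    72 * EX_GB, 72 * EX_GB + 512 * EX_MB,  /* CPU 1, DIMM 1 */
    0, 0
};

/* Entries needed when the array spans lowest..highest address. */
static uint64_t
dense_entries(const uint64_t *pa)
{
    uint64_t low, high;
    int i;

    assert(pa[1] != 0);            /* need at least one segment */
    low = pa[0];
    high = pa[1];
    for (i = 0; pa[i + 1] != 0; i += 2) {
        if (pa[i] < low)
            low = pa[i];
        if (pa[i + 1] > high)
            high = pa[i + 1];
    }
    return ((high - low) / EX_PAGE_SIZE);
}

/* Entries needed when only RAM that exists is counted. */
static uint64_t
sparse_entries(const uint64_t *pa)
{
    uint64_t n = 0;
    int i;

    for (i = 0; pa[i + 1] != 0; i += 2)
        n += (pa[i + 1] - pa[i]) / EX_PAGE_SIZE;
    return (n);
}
```

Under these assumptions the dense layout needs 9,502,720 entries
(roughly 1.06 GB of array) while the sparse one needs 262,144 (30 MB).
The dense array would not fit in any single 512 MB segment.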

The attached patch allocates entries only for physical RAM that
exists. The price is that PHYS_TO_VM_PAGE slows down in proportion to
the number of entries in phys_avail. Because phys_avail should be large
only on systems with very discontiguous physical memory (where this
approach matters most), and PHYS_TO_VM_PAGE is called rarely, I think
this is a price worth paying.
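
The lookup this implies can be sketched in userland C as follows. This
is a minimal illustration, not the patch itself: ex_phys_to_index,
ex_atop and EX_PAGE_SHIFT are hypothetical names, and the function
returns an index into a packed array rather than a vm_page_t. Unlike
the loop in the patch below, the sketch stops at the terminating zero
pair and returns -1 for an address that falls in a hole.

```c
#include <stdint.h>

#define EX_PAGE_SHIFT 13                         /* 8 KB pages, as on sparc64 */
#define ex_atop(x)    ((uint64_t)(x) >> EX_PAGE_SHIFT)

/*
 * Map a physical address to its index in a packed page array that holds
 * entries only for the segments listed in avail[] (start/end pairs,
 * terminated by a zero end address).  Cost is linear in the number of
 * segments.  Returns -1 if pa lies in no segment.
 */
static int64_t
ex_phys_to_index(const uint64_t *avail, uint64_t pa)
{
    uint64_t skipped = 0;   /* pages in the segments before the match */
    int i;

    for (i = 0; avail[i + 1] != 0; i += 2) {
        if (pa >= avail[i] && pa < avail[i + 1])
            return ((int64_t)(skipped + ex_atop(pa - avail[i])));
        skipped += ex_atop(avail[i + 1] - avail[i]);
    }
    return (-1);
}
```

With two segments of two and three pages, for example, the first page
of the second segment maps to index 2 no matter how far apart the
segments sit in the physical address space.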
>How-To-Repeat:

>Fix:


Patch attached with submission follows:

--- vm_page.c.dist	Sun Apr 15 11:12:39 2007
+++ vm_page.c	Sun Apr 15 11:11:24 2007
@@ -212,7 +212,7 @@
 	/* the biggest memory array is the second group of pages */
 	vm_paddr_t end;
 	vm_paddr_t biggestsize;
-	vm_paddr_t low_water, high_water;
+	vm_paddr_t low_water;
 	int biggestone;
 
 	vm_paddr_t total;
@@ -221,6 +221,7 @@
 	biggestsize = 0;
 	biggestone = 0;
 	nblocks = 0;
+	page_range = 0;
 	vaddr = round_page(vaddr);
 
 	for (i = 0; phys_avail[i + 1]; i += 2) {
@@ -229,20 +230,19 @@
 	}
 
 	low_water = phys_avail[0];
-	high_water = phys_avail[1];
 
 	for (i = 0; phys_avail[i + 1]; i += 2) {
 		vm_paddr_t size = phys_avail[i + 1] - phys_avail[i];
 
-		if (size > biggestsize) {
+		if ((size > biggestsize) && (size > (boot_pages * UMA_SLAB_SIZE))) {
 			biggestone = i;
 			biggestsize = size;
 		}
+
 		if (phys_avail[i] < low_water)
 			low_water = phys_avail[i];
-		if (phys_avail[i + 1] > high_water)
-			high_water = phys_avail[i + 1];
 		++nblocks;
+		page_range += size/PAGE_SIZE;
 		total += size;
 	}
 
@@ -297,8 +297,9 @@
 	 * use (taking into account the overhead of a page structure per
 	 * page).
 	 */
+	
 	first_page = low_water / PAGE_SIZE;
-	page_range = high_water / PAGE_SIZE - first_page;
+	total = page_range * PAGE_SIZE;
 	npages = (total - (page_range * sizeof(struct vm_page)) -
 	    (end - new_end)) / PAGE_SIZE;
 	end = new_end;
--- vm_page.h.dist	Sun Apr 15 11:12:29 2007
+++ vm_page.h	Sun Apr 15 11:12:17 2007
@@ -272,11 +272,21 @@
 extern vm_page_t vm_page_array;		/* First resident page in table */
 extern int vm_page_array_size;		/* number of vm_page_t's */
 extern long first_page;			/* first physical page number */
+static __inline vm_page_t vm_phys_to_vm_page(vm_paddr_t pa);
+
+static __inline vm_page_t vm_phys_to_vm_page(vm_paddr_t pa) {
+	int i,j;
+	
+	for (i = 0, j = 0; (phys_avail[i+1] <= pa) || (phys_avail[i] > pa); i+= 2)
+		j += atop(phys_avail[i+1] - phys_avail[i]);
+
+	return &vm_page_array[j + atop(pa - phys_avail[i])];
+}
 
 #define VM_PAGE_TO_PHYS(entry)	((entry)->phys_addr)
 
 #define PHYS_TO_VM_PAGE(pa) \
-		(&vm_page_array[atop(pa) - first_page ])
+		vm_phys_to_vm_page(pa)
 
 extern struct mtx vm_page_queue_mtx;
 #define vm_page_lock_queues()   mtx_lock(&vm_page_queue_mtx)

>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-bugs->alc 
Responsible-Changed-By: kris 
Responsible-Changed-When: Sat Apr 28 00:45:54 UTC 2007 
Responsible-Changed-Why:  
Assign to alc for evaluation 

http://www.freebsd.org/cgi/query-pr.cgi?pr=112194 

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/112194: commit references a PR
Date: Sat,  5 May 2007 19:50:46 +0000 (UTC)

 alc         2007-05-05 19:50:28 UTC
 
   FreeBSD src repository
 
   Modified files:
     sys/amd64/include    vmparam.h 
     sys/arm/include      vmparam.h 
     sys/i386/include     vmparam.h 
     sys/ia64/ia64        machdep.c 
     sys/ia64/include     vmparam.h 
     sys/powerpc/include  vmparam.h 
     sys/sparc64/include  vmparam.h 
     sys/sun4v/include    vmparam.h 
     sys/vm               vm_page.c vm_page.h 
   Log:
   Define every architecture as either VM_PHYSSEG_DENSE or
   VM_PHYSSEG_SPARSE depending on whether the physical address space is
   densely or sparsely populated with memory.  The effect of this
   definition is to determine which of two implementations of
   vm_page_array and PHYS_TO_VM_PAGE() is used.  The legacy
   implementation is obtained by defining VM_PHYSSEG_DENSE, and a new
   implementation that trades off time for space is obtained by defining
   VM_PHYSSEG_SPARSE.  For now, all architectures except for ia64 and
   sparc64 define VM_PHYSSEG_DENSE.  Defining VM_PHYSSEG_SPARSE on ia64
   allows the entirety of my Itanium 2's memory to be used.  Previously,
   only the first 1 GB could be used.  Defining VM_PHYSSEG_SPARSE on
   sparc64 allows USIIIi-based systems to boot without crashing.
   
   This change is a combination of Nathan Whitehorn's patch and my own
   work in perforce.
   
   Discussed with: kmacy, marius, Nathan Whitehorn
   PR:             112194
   
   Revision  Changes    Path
   1.47      +5 -0      src/sys/amd64/include/vmparam.h
   1.8       +5 -0      src/sys/arm/include/vmparam.h
   1.42      +5 -0      src/sys/i386/include/vmparam.h
   1.215     +0 -15     src/sys/ia64/ia64/machdep.c
   1.14      +5 -0      src/sys/ia64/include/vmparam.h
   1.6       +5 -0      src/sys/powerpc/include/vmparam.h
   1.16      +5 -0      src/sys/sparc64/include/vmparam.h
   1.3       +5 -0      src/sys/sun4v/include/vmparam.h
   1.342     +8 -0      src/sys/vm/vm_page.c
   1.148     +20 -2     src/sys/vm/vm_page.h
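
For illustration only, the compile-time selection described in the log
above might look roughly like the sketch below. It is not the committed
code. VM_PHYSSEG_DENSE and VM_PHYSSEG_SPARSE are the names from the
log; every ex_/EX_ identifier, the two-segment memory layout, and the
8 KB page size are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define VM_PHYSSEG_SPARSE          /* what a sparse platform would define */

#define EX_PAGE_SHIFT 13
#define ex_atop(x)    ((uint64_t)(x) >> EX_PAGE_SHIFT)

struct ex_vm_page { uint64_t phys_addr; };

/* Two 2-page segments, 8 GB apart; start/end pairs, zero-terminated. */
static const uint64_t ex_phys_avail[] = {
    0, 2ULL << EX_PAGE_SHIFT,
    8ULL << 30, (8ULL << 30) + (2ULL << EX_PAGE_SHIFT),
    0, 0
};

/* Packed array: one entry per page that exists (4 in total). */
static struct ex_vm_page ex_vm_page_array[4];

#if defined(VM_PHYSSEG_DENSE)
/* Direct index; the array would have to cover the whole address span. */
static const uint64_t ex_first_page = 0;
#define EX_PHYS_TO_VM_PAGE(pa) \
    (&ex_vm_page_array[ex_atop(pa) - ex_first_page])
#elif defined(VM_PHYSSEG_SPARSE)
/* Walk the segments; trades lookup time for a much smaller array. */
static inline struct ex_vm_page *
ex_sparse_lookup(uint64_t pa)
{
    uint64_t skipped = 0;
    int i;

    for (i = 0; ex_phys_avail[i + 1] != 0; i += 2) {
        if (pa >= ex_phys_avail[i] && pa < ex_phys_avail[i + 1])
            return (&ex_vm_page_array[skipped +
                ex_atop(pa - ex_phys_avail[i])]);
        skipped += ex_atop(ex_phys_avail[i + 1] - ex_phys_avail[i]);
    }
    return (NULL);                 /* pa is not backed by RAM */
}
#define EX_PHYS_TO_VM_PAGE(pa) ex_sparse_lookup(pa)
#else
#error "define VM_PHYSSEG_DENSE or VM_PHYSSEG_SPARSE"
#endif
```

Callers use EX_PHYS_TO_VM_PAGE() either way, so only the per-platform
definition decides which implementation is compiled in.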
 
State-Changed-From-To: open->patched 
State-Changed-By: alc 
State-Changed-When: Sat May 12 17:25:19 UTC 2007 
State-Changed-Why:  
A patch has been applied to the HEAD of CVS that addresses 
this issue on both ia64 and sparc64.  This PR should be closed 
after the submitter has had an opportunity to provide feedback. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=112194 

From: Nathan Whitehorn <nathanw@uchicago.edu>
To: bug-followup@FreeBSD.org,  nathanw@ginger.rh.uchicago.edu
Cc:  
Subject: Re: kern/112194: [vm] [patch] VM handles systems with sparse physical
 memory poorly
Date: Sat, 12 May 2007 23:06:23 -0500

 The patch currently in CVS works perfectly on my US IIIi machine. Thanks!
State-Changed-From-To: patched->closed 
State-Changed-By: alc 
State-Changed-When: Mon May 14 05:55:43 UTC 2007 
State-Changed-Why:  
The originator of the PR has verified that the patch committed 
to the HEAD of CVS works for him. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=112194 
>Unformatted:
