From gemini@geminix.org  Sun Aug 29 17:03:23 2004
Return-Path: <gemini@geminix.org>
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 4BAAA16A4CE
	for <FreeBSD-gnats-submit@freebsd.org>; Sun, 29 Aug 2004 17:03:23 +0000 (GMT)
Received: from gen129.n001.c02.escapebox.net (gen129.n001.c02.escapebox.net [213.73.91.129])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 973E643D41
	for <FreeBSD-gnats-submit@freebsd.org>; Sun, 29 Aug 2004 17:03:22 +0000 (GMT)
	(envelope-from gemini@geminix.org)
Received: from gemini by geminix.org with local (Exim 3.36 #1)
	id 1C1T5L-0006sc-00; Sun, 29 Aug 2004 19:03:19 +0200
Message-Id: <E1C1T5L-0006sc-00@geminix.org>
Date: Sun, 29 Aug 2004 19:03:19 +0200
From: Uwe Doering <gemini@geminix.org>
Reply-To: Uwe Doering <gemini@geminix.org>
To: FreeBSD-gnats-submit@freebsd.org
Cc: Uwe Doering <gemini@geminix.org>
Subject: Possible race conditions in pmap.c
X-Send-Pr-Version: 3.113
X-GNATS-Notify:

>Number:         71109
>Category:       kern
>Synopsis:       [pmap] [patch] Possible race conditions in pmap.c
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    alc
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sun Aug 29 17:10:28 GMT 2004
>Closed-Date:    Sat Aug 26 02:59:04 GMT 2006
>Last-Modified:  Sat Aug 26 02:59:04 GMT 2006
>Originator:     Uwe Doering
>Release:        FreeBSD 4.5-RELEASE i386
>Organization:
EscapeBox - Managed On-Demand UNIX Servers
http://www.escapebox.net

>Environment:
System: FreeBSD geminix.org 4.5-RELEASE FreeBSD 4.5-RELEASE #3: Sun Aug 29 09:13:05 GMT 2004 root@localhost:/RELENG_4_Enhanced i386


>Description:
pmap_allocpte() and pmap_enter_quick() in 'src/sys/i386/i386/pmap.c'
both call pmap_page_lookup() and _pmap_allocpte().  Either of the
latter two functions may return NULL, which can indicate that the
function has blocked.  The caller is then supposed to start over,
since the operation at its level can no longer be considered
atomic.

Now, in pmap_allocpte() the return value of pmap_page_lookup()
isn't checked for NULL, and in pmap_enter_quick() the same is
true for _pmap_allocpte().  These flaws cause race conditions
that can result in a kernel panic.
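
The retry protocol described above can be sketched in userland C as
follows.  This is only a minimal model under stated assumptions:
'struct page', lookup_may_block() and hold_page() are hypothetical
names invented for illustration and do not appear in pmap.c.

```c
#include <stddef.h>

/* Hypothetical stand-ins; none of these names come from pmap.c. */
struct page {
	int hold_count;
};

static struct page the_page;
static int blocked_once;

/*
 * Model of a lookup that may sleep.  On the first call it pretends to
 * have blocked and returns NULL; on later calls it returns the page.
 */
static struct page *
lookup_may_block(void)
{
	if (!blocked_once) {
		blocked_once = 1;
		return (NULL);
	}
	return (&the_page);
}

/* Caller that honours the protocol: NULL means "start over". */
static struct page *
hold_page(int *retries)
{
	struct page *m;

retry:
	m = lookup_may_block();
	if (m == NULL) {
		(*retries)++;
		goto retry;
	}
	m->hold_count++;	/* m is known to be non-NULL here */
	return (m);
}
```

Without the NULL check, the increment of 'hold_count' would
dereference a NULL pointer whenever the lookup reported that it had
blocked.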

>How-To-Repeat:
The problem becomes apparent when looking at the relevant source
code.

>Fix:
Please consider adopting the patch below.  It applies cleanly to
RELENG_4 as of today, and, as far as I can tell, also applies to
the Alpha architecture (possibly with minor tweaks).

Apart from adding the missing NULL checks, the patch also narrows
the existing NULL check in pmap_enter_quick(): it is now done only
after pmap_page_lookup() has been called, because 'mpte' cannot be
NULL when the first branch of the if-else clause is executed.

This patch also removes a superfluous call to VM_WAIT in
_pmap_allocpte().  The preceding function vm_page_grab() already
does the appropriate sleeping before returning NULL, even if called
without the VM_ALLOC_RETRY flag.  So we can return from
_pmap_allocpte() right away.
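
Why the extra wait is redundant can be shown with a small userland
model.  It is a sketch only: grab_model(), alloc_with_extra_wait(),
alloc_patched() and the 'sleeps' counter are hypothetical names, and
the model simply assumes, as stated above, that the allocator sleeps
before it returns NULL.

```c
#include <stddef.h>

/* Hypothetical model; these names are not FreeBSD interfaces. */
static int sleeps;	/* number of times the caller was put to sleep */

/* Stand-in for an allocator that sleeps by itself before failing. */
static void *
grab_model(void)
{
	sleeps++;		/* models the wait done inside the allocator */
	return (NULL);		/* then reports failure */
}

/* Old shape: waits a second time although the allocator already did. */
static void *
alloc_with_extra_wait(void)
{
	void *m;

	m = grab_model();
	if (m == NULL) {
		sleeps++;	/* models the redundant VM_WAIT */
		return (NULL);	/* caller retries */
	}
	return (m);
}

/* Patched shape: return right away so that the caller can retry. */
static void *
alloc_patched(void)
{
	return (grab_model());
}
```

In this model the old shape sleeps twice per failed allocation and
the patched shape sleeps once; the caller retries in both cases.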


--- pmap.c.diff begins here ---
--- src/sys/i386/i386/pmap.c.orig	Thu May  6 20:56:50 2004
+++ src/sys/i386/i386/pmap.c	Mon May 31 13:03:52 2004
@@ -1228,7 +1228,6 @@
 	m = vm_page_grab(pmap->pm_pteobj, ptepindex,
 			VM_ALLOC_ZERO);
 	if (m == NULL) {
-		VM_WAIT;
 		/*
 		 * Indicate the need to retry.  While waiting, the page table
 		 * page may have been allocated.
@@ -1316,6 +1315,8 @@
 		} else {
 			m = pmap_page_lookup(pmap->pm_pteobj, ptepindex);
 			pmap->pm_ptphint = m;
+			if (m == NULL)
+				goto retry;
 		}
 		m->hold_count++;
 	} else {
@@ -2105,12 +2106,14 @@
 				} else {
 					mpte = pmap_page_lookup(pmap->pm_pteobj, ptepindex);
 					pmap->pm_ptphint = mpte;
+					if (mpte == NULL)
+						goto retry;
 				}
-				if (mpte == NULL)
-					goto retry;
 				mpte->hold_count++;
 			} else {
 				mpte = _pmap_allocpte(pmap, ptepindex);
+				if (mpte == NULL)
+					goto retry;
 			}
 		}
 	} else {
--- pmap.c.diff ends here ---
>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-bugs->alc 
Responsible-Changed-By: alc 
Responsible-Changed-When: Sun Dec 5 20:15:52 GMT 2004 
Responsible-Changed-Why:  


http://www.freebsd.org/cgi/query-pr.cgi?pr=71109 

From: "Alan L. Cox" <alc@imimic.com>
To: freebsd-gnats-submit@FreeBSD.org, gemini@geminix.org
Cc: alc@cs.rice.edu
Subject: Re: kern/71109: [patch] Possible race conditions in pmap.c
Date: Sun, 05 Dec 2004 14:12:00 -0600

 The VM_WAIT is, indeed, unnecessary.  I suspect, but haven't verified, 
 that it is an artifact of merging PAE support.
 
 There is not, however, a race.  The reason is that we never call 
 pmap_page_lookup() on a missing or busy page table page.  So, the check 
 for a busy page is pointless and confusing.  (HEAD and RELENG_5 have 
 remedied this.)
 
 To see why, observe that pmap_page_lookup() is called only after 
 inspection of the page directory entry ("pde") has shown that the page 
 table page exists.  So, unless the page directory entry has been 
 corrupted by a hardware error or a different bug, the 
 pmap_page_lookup() call will succeed in finding a page.
 
 A page table page is only marked busy for a short time in
 _pmap_unwire_pte_hold(), pmap_release() and _pmap_allocpte().  Since
 the kernel is non-preemptive, the busy flag should be cleared before
 it is tested anywhere.  For a problem to occur, an interrupt handler 
 would have to free a page that is mapped into a user address space. 
 That is not supposed to happen (and would itself be an error).
 
 I'm happy to elaborate on any of these points if you like.
 
 Regards,
 Alan
State-Changed-From-To: open->closed 
State-Changed-By: alc 
State-Changed-When: Sat Aug 26 02:56:06 UTC 2006 
State-Changed-Why:  
The hypothesized race condition does not exist. 

>Unformatted:
