From nobody@FreeBSD.org  Sat Jan 17 16:13:04 2009
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 54EE8106566B
	for <freebsd-gnats-submit@FreeBSD.org>; Sat, 17 Jan 2009 16:13:04 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (www.freebsd.org [IPv6:2001:4f8:fff6::21])
	by mx1.freebsd.org (Postfix) with ESMTP id 281C78FC21
	for <freebsd-gnats-submit@FreeBSD.org>; Sat, 17 Jan 2009 16:13:04 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.14.3/8.14.3) with ESMTP id n0HGD3t7009449
	for <freebsd-gnats-submit@FreeBSD.org>; Sat, 17 Jan 2009 16:13:03 GMT
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.14.3/8.14.3/Submit) id n0HGD3Qj009412;
	Sat, 17 Jan 2009 16:13:03 GMT
	(envelope-from nobody)
Message-Id: <200901171613.n0HGD3Qj009412@www.freebsd.org>
Date: Sat, 17 Jan 2009 16:13:03 GMT
From: Dmitrij Tejblum <tejblum@yandex-team.ru>
To: freebsd-gnats-submit@FreeBSD.org
Subject: Possible deadlock in rt_check() (sys/net/route.c)
X-Send-Pr-Version: www-3.1
X-GNATS-Notify:

>Number:         130652
>Category:       kern
>Synopsis:       [kernel] [patch] Possible deadlock in rt_check() (sys/net/route.c)
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    rwatson
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sat Jan 17 16:20:03 UTC 2009
>Closed-Date:    Wed Feb 25 11:22:10 UTC 2009
>Last-Modified:  Wed Feb 25 11:22:10 UTC 2009
>Originator:     Dmitrij Tejblum
>Release:        7.1-STABLE
>Organization:
OOO Yandex
>Environment:
FreeBSD 7.1-STABLE; net/route.c 1.120.2.7
>Description:
An excerpt from rt_check():
rt_check()
{
/*1*/   RT_LOCK(rt0);
retry:
        ...
                if ((rt = rt0->rt_gwroute) != NULL) {
/*2*/                   RT_LOCK(rt);            /* NB: gwroute */
                        ....
                }
                 
                if (rt == NULL) {  /* NOT AN ELSE CLAUSE */
/*3*/                   RT_TEMP_UNLOCK(rt0); /* MUST return to undo this */
/*4*/                   rt = rtalloc1_fib(rt0->rt_gateway, 1, 0UL, fibnum);
                        ....
/*5*/                   RT_RELOCK(rt0);
                        ....
                                rt0->rt_gwroute = rt;
                }
                RT_LOCK_ASSERT(rt);
                RT_UNLOCK(rt0);
                ....
}

The function deals with two routes, rt0 and rt. Usually, it locks rt0 at point /*1*/, then locks rt = rt0->rt_gwroute at point /*2*/, then unlocks rt0 and is done. But sometimes no usable rt0->rt_gwroute is found (rt == NULL), and rt is instead locked inside rtalloc1_fib() at point /*4*/. Then, at point /*5*/, the function relocks rt0, which was unlocked at point /*3*/, while still holding the lock on rt. The locking order of rt0 and rt is thus reversed relative to the usual path, so a deadlock is possible.

(Also, if after RT_RELOCK(rt0) we find that rt0 is unusable, we must not forget to free rt before retrying.)
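
The intended locking order can be illustrated with a minimal userland sketch. It assumes pthread mutexes in place of RT_LOCK()/RT_UNLOCK(); struct route, lookup_locked(), route_unref() and gateway_check() are hypothetical names made up for this illustration and are not the kernel's code. lookup_locked() only models the property of rtalloc1_fib() that matters here, namely that the route comes back locked and referenced.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rtentry; not the kernel definition. */
struct route {
	pthread_mutex_t	 lock;
	int		 up;		/* models RTF_UP */
	int		 refcnt;
	struct route	*gwroute;	/* models rt_gwroute */
};

/* Models rtalloc1_fib(): returns the route locked, with a reference. */
static struct route *
lookup_locked(struct route *gw)
{
	pthread_mutex_lock(&gw->lock);
	gw->refcnt++;
	return (gw);
}

/* Models RTFREE(): takes the route lock itself to drop a reference. */
static void
route_unref(struct route *rt)
{
	pthread_mutex_lock(&rt->lock);
	rt->refcnt--;
	pthread_mutex_unlock(&rt->lock);
}

/*
 * Returns the gateway route locked, with rt0 unlocked, or NULL if rt0
 * is down.  rt is only ever locked while rt0 is held (rt0, then rt)
 * or while rt0 is not held at all; never rt, then rt0.
 */
static struct route *
gateway_check(struct route *rt0, struct route *gw)
{
	struct route *rt;

	assert(rt0 != gw);			/* two distinct locks */
	pthread_mutex_lock(&rt0->lock);		/* point 1 */
retry:
	if (!rt0->up) {
		pthread_mutex_unlock(&rt0->lock);
		return (NULL);
	}
	if ((rt = rt0->gwroute) != NULL) {
		pthread_mutex_lock(&rt->lock);	/* point 2: rt0, then rt */
	} else {
		pthread_mutex_unlock(&rt0->lock);	/* point 3 */
		rt = lookup_locked(gw);			/* point 4 */
		pthread_mutex_unlock(&rt->lock);	/* drop rt first */
		pthread_mutex_lock(&rt0->lock);		/* point 5 */
		if (!rt0->up) {
			route_unref(rt);	/* do not leak the reference */
			goto retry;
		}
		if (rt0->gwroute != NULL) {
			if (rt0->gwroute != rt)
				route_unref(rt);
		} else {
			rt0->gwroute = rt;	/* cache keeps the reference */
		}
		goto retry;	/* rt was unlocked; take it in normal order */
	}
	pthread_mutex_unlock(&rt0->lock);
	return (rt);
}
```

On success the caller holds the lock on the returned route and rt0 is unlocked, which corresponds to the RT_LOCK_ASSERT(rt) and RT_UNLOCK(rt0) at the end of the excerpt. The extra pass through retry is the price of having dropped the lock on rt.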

>How-To-Repeat:

>Fix:


Patch attached with submission follows:

--- net/route.c	2008-12-05 20:40:46.000000000 +0300
+++ net/route.c	2009-01-17 18:59:12.000000000 +0300
@@ -1634,24 +1634,31 @@ retry:
 			 * Relock it and lose the added reference.
 			 * All sorts of things could have happenned while we
 			 * had no lock on it, so check for them.
+			 * rt need to be unlocked to avoid possible deadlock.
 			 */
+			RT_UNLOCK(rt);
 			RT_RELOCK(rt0);
-			if (rt0 == NULL || ((rt0->rt_flags & RTF_UP) == 0))
+			if (rt0 == NULL || ((rt0->rt_flags & RTF_UP) == 0)) {
 				/* Ru-roh.. what we had is no longer any good */
+				RTFREE(rt);
 				goto retry;
+			}
 			/* 
 			 * While we were away, someone replaced the gateway.
 			 * Since a reference count is involved we can't just
 			 * overwrite it.
 			 */
 			if (rt0->rt_gwroute) {
-				if (rt0->rt_gwroute != rt) {
-					RTFREE_LOCKED(rt);
-					goto retry;
-				}
+				if (rt0->rt_gwroute != rt)
+					RTFREE(rt);
 			} else {
 				rt0->rt_gwroute = rt;
 			}
+			/* 
+			 * Since rt was not locked, we need recheck that
+			 * it still may be used (e.g. up)
+			 */
+			goto retry;
 		}
 		RT_LOCK_ASSERT(rt);
 		RT_UNLOCK(rt0);


>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-bugs->freebsd-net 
Responsible-Changed-By: linimon 
Responsible-Changed-When: Sat Jan 17 21:31:56 UTC 2009 
Responsible-Changed-Why:  
Over to maintainer(s). 

http://www.freebsd.org/cgi/query-pr.cgi?pr=130652 
Responsible-Changed-From-To: freebsd-net->rwatson 
Responsible-Changed-By: rwatson 
Responsible-Changed-When: Sat Feb 21 15:22:13 UTC 2009 
Responsible-Changed-Why:  
Grab ownership of this PR since I'm taking a look at deadlocks relating to 
routing in 7.x currently. 


From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/130652: commit references a PR
Date: Wed, 25 Feb 2009 11:18:30 +0000 (UTC)

 Author: rwatson
 Date: Wed Feb 25 11:18:18 2009
 New Revision: 189029
 URL: http://svn.freebsd.org/changeset/base/189029
 
 Log:
   Correct a deadlock and a rtentry leak in rt_check():
   
   - In the event that a gateway route has to be looked up, drop the lock
     on 'rt' before reacquiring the lock on 'rt0' in order to avoid deadlock.
   
   - In the event the original route has evaporated or is no longer up
     after the gateway route lookup, call RTFREE() on the gateway route
     before retrying.
   
   This is a potential errata candidate patch.
   
   PR:		kern/130652
   Submitted by:	Dmitrij Tejblum <tejblum at yandex-team.ru>
   Reviewed by:	bz
   Tested by:	Pete French <petefrench at ticketswitch.com>
 
 Modified:
   stable/7/sys/net/route.c
 
 Modified: stable/7/sys/net/route.c
 ==============================================================================
 --- stable/7/sys/net/route.c	Wed Feb 25 11:13:13 2009	(r189028)
 +++ stable/7/sys/net/route.c	Wed Feb 25 11:18:18 2009	(r189029)
 @@ -1650,27 +1650,34 @@ retry:
  				return (ENETUNREACH);
  			}
  			/*
 -			 * Relock it and lose the added reference.
 -			 * All sorts of things could have happenned while we
 -			 * had no lock on it, so check for them.
 +			 * Relock it and lose the added reference.  All sorts
 +			 * of things could have happenned while we had no
 +			 * lock on it, so check for them.  rt need to be
 +			 * unlocked to avoid possible deadlock.
  			 */
 +			RT_UNLOCK(rt);
  			RT_RELOCK(rt0);
 -			if (rt0 == NULL || ((rt0->rt_flags & RTF_UP) == 0))
 +			if (rt0 == NULL || ((rt0->rt_flags & RTF_UP) == 0)) {
  				/* Ru-roh.. what we had is no longer any good */
 +				RTFREE(rt);
  				goto retry;
 +			}
  			/* 
  			 * While we were away, someone replaced the gateway.
  			 * Since a reference count is involved we can't just
  			 * overwrite it.
  			 */
  			if (rt0->rt_gwroute) {
 -				if (rt0->rt_gwroute != rt) {
 -					RTFREE_LOCKED(rt);
 -					goto retry;
 -				}
 +				if (rt0->rt_gwroute != rt)
 +					RTFREE(rt);
  			} else {
  				rt0->rt_gwroute = rt;
  			}
 +			/* 
 +			 * Since rt was not locked, we need recheck that
 +			 * it still may be used (e.g. up)
 +			 */
 +			goto retry;
  		}
  		RT_LOCK_ASSERT(rt);
  		RT_UNLOCK(rt0);
 
State-Changed-From-To: open->closed 
State-Changed-By: rwatson 
State-Changed-When: Wed Feb 25 11:21:30 UTC 2009 
State-Changed-Why:  
Patch applied to 7-stable, thanks for the bug report and fix!  We'll look 
at doing an errata patch for the various route locking issues we've seen 
in 7.1, assuming re@ approval. 


>Unformatted:
