From art@FreeBSD.org  Fri Aug 26 06:05:02 2011
Return-Path: <art@FreeBSD.org>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 9B6B11065672
	for <FreeBSD-gnats-submit@freebsd.org>; Fri, 26 Aug 2011 06:05:02 +0000 (UTC)
	(envelope-from art@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id 6D4928FC15
	for <FreeBSD-gnats-submit@freebsd.org>; Fri, 26 Aug 2011 06:05:02 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p7Q652de018238
	for <FreeBSD-gnats-submit@freebsd.org>; Fri, 26 Aug 2011 06:05:02 GMT
	(envelope-from art@freefall.freebsd.org)
Received: (from art@localhost)
	by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p7Q652FD018236;
	Fri, 26 Aug 2011 06:05:02 GMT
	(envelope-from art)
Message-Id: <201108260605.p7Q652FD018236@freefall.freebsd.org>
Date: Fri, 26 Aug 2011 06:05:02 GMT
From: Artem Belevich <art@FreeBSD.org>
Reply-To: Artem Belevich <art@FreeBSD.org>
To: FreeBSD-gnats-submit@freebsd.org
Cc:
Subject: amd + NFS reconnect = ICMP storm + unkillable process + hung amd mount.
X-Send-Pr-Version: 3.113
X-GNATS-Notify:

>Number:         160198
>Category:       kern
>Synopsis:       [rpc] amd + NFS reconnect = ICMP storm + unkillable process + hung amd mount.
>Confidential:   no
>Severity:       serious
>Priority:       low
>Responsible:    art
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Fri Aug 26 06:10:08 UTC 2011
>Closed-Date:    Mon Sep 05 07:12:28 UTC 2011
>Last-Modified:  Mon Sep 05 07:12:28 UTC 2011
>Originator:     Artem Belevich
>Release:        FreeBSD 8.2-STABLE i386
>Organization:
FreeBSD
>Environment:
FreeBSD stable/8, head

>Description:

When a process is interrupted during an NFS reconnect over
UDP, the process gets stuck in an unkillable state.

In my particular case the NFS connection is to the amd process
on localhost. Continuous reconnects amount to a self-inflicted
DoS attack on amd, which renders it unresponsive and in turn
hangs all other processes that access amd-mounted filesystems.
As a side effect we also generate a rather high rate of ICMP
port unreachable replies. All in all the system ends up
virtually unavailable, and in many cases it takes a reboot to
get it out of this state.

The stuck process always has clnt_reconnect_call() in its backtrace:

	18779 100511 collect2         -                
	mi_switch+0x176
	turnstile_wait+0x1cb 
	_mtx_lock_sleep+0xe1 
	sleepq_catch_signals+0x386
	sleepq_timedwait_sig+0x19 
	_sleep+0x1b1 
	clnt_dg_call+0x7e6
	clnt_reconnect_call+0x12e 
	nfs_request+0x212 
	nfs_getattr+0x2e4
	VOP_GETATTR_APV+0x44 
	nfs_bioread+0x42a 
	VOP_READLINK_APV+0x4a
	namei+0x4f9 
	kern_statat_vnhook+0x92 
	kern_statat+0x15
	freebsd32_stat+0x2e 
	syscallenter+0x23d
	

>How-To-Repeat:
In my case the problem most frequently occurs when a parallel
build that touches an amd-mounted filesystem is interrupted.

>Fix:
clnt_dg_call() uses msleep(), which may return ERESTART when
the current process is interrupted. If that happens we return
to clnt_reconnect_call() with RPC_CANTRECV.
clnt_reconnect_call() handles RPC_CANTRECV by trying to
reconnect again, and the story repeats. Because the current
code never returns to userland, the process never quits and
stays stuck, in most cases forever.

The fix is to convert ERESTART to RPC_INTR which is what's
done in other places where it's handled in RPC code.
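
The retry loop described above can be illustrated with a small
userland sketch. This is a simplified model, not the kernel code:
all sim_* names, the SIM_ERESTART value and the max_attempts cap
are hypothetical and exist only for illustration. The real
clnt_reconnect_call() has more states and no such cap.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in: ERESTART is kernel-only on FreeBSD. */
#define SIM_ERESTART (-1)

/* Hypothetical, reduced set of RPC status codes. */
enum sim_stat { SIM_SUCCESS, SIM_CANTRECV, SIM_INTR };

/*
 * Map a sleep error to an RPC status.  With erestart_is_intr == 0
 * this models the old behaviour (only EINTR maps to "interrupted");
 * with 1 it models the behaviour proposed in this PR.
 */
static enum sim_stat
sim_map_error(int error, int erestart_is_intr)
{
	if (error == EINTR)
		return (SIM_INTR);
	if (error == SIM_ERESTART && erestart_is_intr)
		return (SIM_INTR);
	return (SIM_CANTRECV);
}

/*
 * Model of a datagram call made while a signal is pending: the
 * sleep always fails with ERESTART.
 */
static enum sim_stat
sim_dg_call(int erestart_is_intr)
{
	return (sim_map_error(SIM_ERESTART, erestart_is_intr));
}

/*
 * Model of the reconnect loop: retry for as long as the call
 * reports SIM_CANTRECV.  max_attempts only keeps the sketch
 * finite.  Returns the number of attempts made.
 */
static int
sim_reconnect_call(int erestart_is_intr, int max_attempts,
    enum sim_stat *statp)
{
	enum sim_stat stat;
	int attempts = 0;

	assert(max_attempts > 0);
	do {
		attempts++;
		stat = sim_dg_call(erestart_is_intr);
	} while (stat == SIM_CANTRECV && attempts < max_attempts);
	*statp = stat;
	return (attempts);
}
```

With erestart_is_intr set to 0 the loop runs until the artificial
cap is reached; with 1 it stops after the first attempt and
reports an interrupted call, which lets the caller unwind.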
>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-bugs->art 
Responsible-Changed-By: art 
Responsible-Changed-When: Fri Aug 26 06:13:18 UTC 2011 
Responsible-Changed-Why:  
Mine. 


http://www.freebsd.org/cgi/query-pr.cgi?pr=160198 

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/160198: commit references a PR
Date: Sun, 28 Aug 2011 18:09:31 +0000 (UTC)

 Author: art
 Date: Sun Aug 28 18:09:17 2011
 New Revision: 225234
 URL: http://svn.freebsd.org/changeset/base/225234
 
 Log:
   Make sure RPC calls over UDP return RPC_INTR status if the process has
   been interrupted in a restartable syscall. Otherwise we could end up
   in an (almost) endless loop in clnt_reconnect_call().
   
   PR: kern/160198
   Reviewed by: rmacklem
   Approved by: re (kib), avg (mentor)
   MFC after: 1 week
 
 Modified:
   head/sys/rpc/clnt_dg.c
 
 Modified: head/sys/rpc/clnt_dg.c
 ==============================================================================
 --- head/sys/rpc/clnt_dg.c	Sun Aug 28 16:11:24 2011	(r225233)
 +++ head/sys/rpc/clnt_dg.c	Sun Aug 28 18:09:17 2011	(r225234)
 @@ -467,7 +467,10 @@ send_again:
  		    cu->cu_waitflag, "rpccwnd", 0);
  		if (error) {
  			errp->re_errno = error;
 -			errp->re_status = stat = RPC_CANTSEND;
 +			if (error == EINTR || error == ERESTART)
 +				errp->re_status = stat = RPC_INTR;
 +			else
 +				errp->re_status = stat = RPC_CANTSEND;
  			goto out;
  		}
  	}
 @@ -636,7 +639,7 @@ get_reply:
  		 */
  		if (error != EWOULDBLOCK) {
  			errp->re_errno = error;
 -			if (error == EINTR)
 +			if (error == EINTR || error == ERESTART)
  				errp->re_status = stat = RPC_INTR;
  			else
  				errp->re_status = stat = RPC_CANTRECV;
 _______________________________________________
 svn-src-all@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/svn-src-all
 To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
 

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/160198: commit references a PR
Date: Mon,  5 Sep 2011 06:54:22 +0000 (UTC)

 Author: art
 Date: Mon Sep  5 06:54:13 2011
 New Revision: 225384
 URL: http://svn.freebsd.org/changeset/base/225384
 
 Log:
   MFC r225234:
   
   Make sure RPC calls over UDP return RPC_INTR status if the process has
   been interrupted in a restartable syscall. Otherwise we could end up
   in an (almost) endless loop in clnt_reconnect_call().
   
   PR: kern/160198
   Reviewed by: rmacklem
   Approved by: avg (mentor)
 
 Modified:
   stable/8/sys/rpc/clnt_dg.c
 Directory Properties:
   stable/8/sys/   (props changed)
   stable/8/sys/amd64/include/xen/   (props changed)
   stable/8/sys/cddl/contrib/opensolaris/   (props changed)
   stable/8/sys/contrib/dev/acpica/   (props changed)
   stable/8/sys/contrib/pf/   (props changed)
 
 Modified: stable/8/sys/rpc/clnt_dg.c
 ==============================================================================
 --- stable/8/sys/rpc/clnt_dg.c	Mon Sep  5 06:11:17 2011	(r225383)
 +++ stable/8/sys/rpc/clnt_dg.c	Mon Sep  5 06:54:13 2011	(r225384)
 @@ -467,7 +467,10 @@ send_again:
  		    cu->cu_waitflag, "rpccwnd", 0);
  		if (error) {
  			errp->re_errno = error;
 -			errp->re_status = stat = RPC_CANTSEND;
 +			if (error == EINTR || error == ERESTART)
 +				errp->re_status = stat = RPC_INTR;
 +			else
 +				errp->re_status = stat = RPC_CANTSEND;
  			goto out;
  		}
  	}
 @@ -636,7 +639,7 @@ get_reply:
  		 */
  		if (error != EWOULDBLOCK) {
  			errp->re_errno = error;
 -			if (error == EINTR)
 +			if (error == EINTR || error == ERESTART)
  				errp->re_status = stat = RPC_INTR;
  			else
  				errp->re_status = stat = RPC_CANTRECV;
 
State-Changed-From-To: open->closed 
State-Changed-By: art 
State-Changed-When: Mon Sep 5 07:10:40 UTC 2011 
State-Changed-Why:  
Fix committed to head and MFC'ed to -8. 


http://www.freebsd.org/cgi/query-pr.cgi?pr=160198 
>Unformatted:
