From nobody@FreeBSD.org  Fri Apr  8 22:07:20 2005
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id E473216A4CE
	for <freebsd-gnats-submit@FreeBSD.org>; Fri,  8 Apr 2005 22:07:20 +0000 (GMT)
Received: from www.freebsd.org (www.freebsd.org [216.136.204.117])
	by mx1.FreeBSD.org (Postfix) with ESMTP id A2DE943D2D
	for <freebsd-gnats-submit@FreeBSD.org>; Fri,  8 Apr 2005 22:07:20 +0000 (GMT)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.13.1/8.13.1) with ESMTP id j38M7Ked097855
	for <freebsd-gnats-submit@FreeBSD.org>; Fri, 8 Apr 2005 22:07:20 GMT
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.13.1/8.13.1/Submit) id j38M7KR7097854;
	Fri, 8 Apr 2005 22:07:20 GMT
	(envelope-from nobody)
Message-Id: <200504082207.j38M7KR7097854@www.freebsd.org>
Date: Fri, 8 Apr 2005 22:07:20 GMT
From: Dylan Simon <dylan@dylex.net>
To: freebsd-gnats-submit@FreeBSD.org
Subject: suspending nfs file access hangs other access to the file
X-Send-Pr-Version: www-2.3

>Number:         79700
>Category:       kern
>Synopsis:       [nfs] suspending nfs file access hangs other access to the file
>Confidential:   no
>Severity:       serious
>Priority:       low
>Responsible:    freebsd-bugs
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Fri Apr 08 22:10:35 GMT 2005
>Closed-Date:    Mon Nov 19 08:29:35 UTC 2007
>Last-Modified:  Mon Nov 19 08:29:35 UTC 2007
>Originator:     Dylan Simon
>Release:        FreeBSD 5.3-RELEASE-p5 i386
>Organization:
Rainfinity
>Environment:
System: FreeBSD druid.rainfinity.prv 5.3-RELEASE-p5 FreeBSD 5.3-RELEASE-p5 #1: Tue Mar 8 14:36:00 PST 2005 dylan@druid.rainfinity.prv:/usr/obj/usr/src/sys/QUARK i386      
>Description:
Suspending a process that is actively writing to a file on an NFS mount causes
subsequent access to that file by other processes to hang until the first
process is resumed.

A tcpdump shows all outstanding WRITE operations have successfully completed,
and the ACCESS to the file is not sent until the process is resumed (although
access to other files on that mount continues without interruption).

The hung process is in state D.  A truss of the hung process shows it hanging
before the lstat() of the file.  For example, if we suspend a cp to file3, a
truss of ls -l on the directory shows:

open(".",0x4,05001257435)                        = 6 (0x6)
fstat(6,0xbfbfde60)                              = 0 (0x0)
fcntl(6,F_SETFD,0x1)                             = 0 (0x0)
break(0x8054000)                                 = 0 (0x0)
__sysctl(0xbfbfdc18,0x2,0x281b3f1c,0xbfbfdc14,0x0,0x0) = 0 (0x0)
fstatfs(0x6,0xbfbfdc80)                          = 0 (0x0)
break(0x8055000)                                 = 0 (0x0)
fstat(6,0xbfbfde60)                              = 0 (0x0)
fchdir(0x6)                                      = 0 (0x0)
getdirentries(0x6,0x8054000,0x1000,0x8053014)    = 512 (0x200)
lstat("file0",0x8052248)                         = 0 (0x0)
lstat("file1",0x8052348)                         = 0 (0x0)
lstat("file2",0x8052448)                         = 0 (0x0)
<hang here until resume>
lstat("file3",0x8052548)                         = 0 (0x0)
getdirentries(0x6,0x8054000,0x1000,0x8053014)    = 0 (0x0)
lseek(6,0x0,SEEK_SET)                            = 0 (0x0)
close(6)                                         = 0 (0x0)

This has been reproduced on 5.2.1 as well as 5.3.  It happens about half the
time.
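
To spot processes stuck in state D while the copy is suspended, a small
filter over ps output can help.  This is only a sketch: the function name
list_uninterruptible is made up for illustration, and it assumes the state
is the second column, as produced by the ps invocation in the comment.

```sh
# list_uninterruptible (hypothetical helper): read ps output on stdin and
# keep the header line plus every row whose second column starts with "D".
list_uninterruptible() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# Intended use on the affected client (assumes the state is column 2):
#   ps -axo pid,state,wchan,command | list_uninterruptible
```

The wchan column then gives a hint about what the hung process is sleeping on.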

>How-To-Repeat:
mount -t nfs -o intr server:/export /mnt/path

The mount can be soft or hard.  Only UDP was tried.  It happened on both fast
(100Mb) and slow (T1) connections to the server.

cd /mnt/path
cp big_file file2
^Z
Suspend the copy in the middle.  Now, in any other shell:

cd /mnt/path
ls -l

This hangs until the suspended copy is resumed.  The hang occurs about half
the time; the other half, the ls -l completes fine.  Resuming the copy and
suspending it again often triggers the problem again.
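
The manual steps above could be scripted roughly as follows.  This is a
sketch under assumptions, not a verified reproducer: the function name
check_ls_hang and its arguments are made up for illustration, and it sends
SIGSTOP where the interactive ^Z sends SIGTSTP, on the assumption that the
effect on the copy is the same.

```sh
# check_ls_hang DIR SRC LIMIT (hypothetical helper)
#   DIR   directory on the NFS mount (default: /mnt/path)
#   SRC   file inside DIR to copy   (default: big_file)
#   LIMIT seconds to wait for ls -l before calling it blocked (default: 10)
check_ls_hang() (
    dir=${1:-/mnt/path}
    src=${2:-big_file}
    limit=${3:-10}
    marker=${TMPDIR:-/tmp}/ls-done.$$

    cd "$dir" || exit 2
    rm -f "$marker"

    # Start the copy, then suspend it partway through.
    # Assumption: SIGSTOP stands in for the ^Z (SIGTSTP) used in the report.
    cp "$src" file2 >/dev/null &
    cp_pid=$!
    sleep 1
    kill -STOP "$cp_pid" 2>/dev/null || :

    # Run ls -l in the background; it creates the marker once it returns.
    ( ls -l >/dev/null 2>&1; : > "$marker" ) &

    waited=0
    while [ ! -e "$marker" ] && [ "$waited" -lt "$limit" ]; do
        sleep 1
        waited=$((waited + 1))
    done

    if [ -e "$marker" ]; then
        echo "ls -l completed"
        status=0
    else
        echo "ls -l still blocked after ${limit}s"
        status=1
    fi

    # Resume the copy so that nothing is left suspended, then clean up.
    kill -CONT "$cp_pid" 2>/dev/null || :
    wait
    rm -f "$marker"
    exit "$status"
)
```

Calling check_ls_hang /mnt/path big_file 10 several times in a row should
show both outcomes, since the problem only appears about half the time.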
>Fix:
      
>Release-Note:
>Audit-Trail:

From: Stephan Uphoff <ups@tree.com>
To: Dylan Simon <dylan@dylex.net>
Cc: freebsd-gnats-submit@FreeBSD.org
Subject: Re: kern/79700: suspending nfs file access hangs other access to
	the file
Date: Thu, 19 May 2005 18:52:14 -0400

 This is a problem with the usage of interruptible sleep in the NFS code.
 The following description of the problem in NetBSD should be close
 enough to the FreeBSD problem.
 
 http://mail-index.netbsd.org/tech-kern/2003/06/27/0022.html
 
 I don't think this will be fixed anytime soon and I recommend not using
 the "intr" option.
 (In my opinion the "intr" option should never be used during normal
 operation, since file operations may not behave as expected by programs.)
 
 Stephan
 
Responsible-Changed-From-To: freebsd-bugs->cel 
Responsible-Changed-By: cel 
Responsible-Changed-When: Fri May 12 21:13:41 UTC 2006 
Responsible-Changed-Why:  


http://www.freebsd.org/cgi/query-pr.cgi?pr=79700 
Responsible-Changed-From-To: cel->freebsd-bugs 
Responsible-Changed-By: cel 
Responsible-Changed-When: Mon Mar 12 15:31:23 UTC 2007 
Responsible-Changed-Why:  
Back to the public pool. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=79700 
State-Changed-From-To: open->closed 
State-Changed-By: kmacy 
State-Changed-When: Mon Nov 19 08:28:32 UTC 2007 
State-Changed-Why:  

Suspending a process while holding a vnode lock implies that other processes 
will not be able to acquire the file's vnode lock. Unfortunately, this is  
expected behaviour and not likely to change. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=79700 
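
The effect described in the closing note can be illustrated with a
user-space analogy.  This is not kernel code and does not touch NFS: a
mkdir-based lock directory stands in for the vnode lock, and the names
try_lock and lock_demo are made up for illustration.  A holder takes the
lock and is stopped before it releases it, so a second attempt to take the
lock keeps failing until the holder is resumed.

```sh
# try_lock DIR ATTEMPTS (hypothetical helper): try mkdir once per second,
# return 0 as soon as it succeeds, 1 after ATTEMPTS failures.
try_lock() {
    attempts=0
    while ! mkdir "$1" 2>/dev/null; do
        attempts=$((attempts + 1))
        if [ "$attempts" -ge "$2" ]; then
            return 1
        fi
        sleep 1
    done
    return 0
}

# lock_demo (hypothetical): stop a lock holder and watch a waiter block.
lock_demo() (
    lock=${TMPDIR:-/tmp}/lockdemo.$$
    rmdir "$lock" 2>/dev/null || :

    # Holder: takes the lock, works for 2 seconds, then releases it.
    ( mkdir "$lock" && sleep 2; rmdir "$lock" ) >/dev/null 2>&1 &
    holder=$!

    # Stop the holder while it still owns the lock.
    sleep 1
    kill -STOP "$holder"

    # The waiter retries well past the holder's 2 seconds of work.
    if try_lock "$lock" 4; then
        echo "acquired while holder is stopped"
    else
        echo "blocked while holder is stopped"
    fi

    # Once resumed, the holder releases the lock and the waiter gets it.
    kill -CONT "$holder"
    if try_lock "$lock" 5; then
        echo "acquired after holder resumed"
        rmdir "$lock"
    else
        echo "still blocked after holder resumed"
    fi
    wait
)
```

Running lock_demo is expected to report the waiter as blocked first and as
successful only after the holder has been resumed, which mirrors what the
ls -l in this report experiences.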
>Unformatted:
