From nobody@FreeBSD.org  Sat Mar 27 10:51:18 2004
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 1E84316A4CE
	for <freebsd-gnats-submit@FreeBSD.org>; Sat, 27 Mar 2004 10:51:18 -0800 (PST)
Received: from www.freebsd.org (www.freebsd.org [216.136.204.117])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 171E243D1D
	for <freebsd-gnats-submit@FreeBSD.org>; Sat, 27 Mar 2004 10:51:18 -0800 (PST)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.12.10/8.12.10) with ESMTP id i2RIpH72066469
	for <freebsd-gnats-submit@FreeBSD.org>; Sat, 27 Mar 2004 10:51:17 -0800 (PST)
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.12.10/8.12.10/Submit) id i2RIpHK1066468;
	Sat, 27 Mar 2004 10:51:17 -0800 (PST)
	(envelope-from nobody)
Message-Id: <200403271851.i2RIpHK1066468@www.freebsd.org>
Date: Sat, 27 Mar 2004 10:51:17 -0800 (PST)
From: Patrick Mackinlay <patrick@spacesurfer.com>
To: freebsd-gnats-submit@FreeBSD.org
Subject: mmap and/or ftruncate does not work correctly on nfs mounted file systems
X-Send-Pr-Version: www-2.3

>Number:         64816
>Category:       kern
>Synopsis:       [nfs] [patch] mmap and/or ftruncate does not work correctly on nfs mounted file systems
>Confidential:   no
>Severity:       serious
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sat Mar 27 11:00:31 PST 2004
>Closed-Date:    
>Last-Modified:  Sun Jul 01 15:59:44 UTC 2012
>Originator:     Patrick Mackinlay
>Release:        4.9 (5.2.1 also tried and found not to work)
>Organization:
SpaceSurfer Ltd.
>Environment:
FreeBSD ws3.spacesurfer.com 4.9-STABLE FreeBSD 4.9-STABLE #0: Sat Feb 14 18:01:36 GMT 2004     pim@ws3.uknet.spacesurfer.com:/usr/obj/usr/src/sys/WS3  i386

>Description:
When a file on an NFS-mounted file system has its last bytes memory-mapped with mmap() and ftruncate() is then used to increase the size of the file, the ftruncate() call causes any changes made through the mmap()ed area not to be synced to disk (unless msync() or munmap() is called before the ftruncate() call).
The C program below demonstrates the problem. The files tested were on a Linux 2.4 NFS server, and the test program worked fine on Linux 2.4 NFS clients.

#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/mman.h>

void error(char *msg)
 {
 fprintf(stderr, "Error: %s\nSystem error %d: %s\n", msg, errno, strerror(errno));
 exit(-1);
 }

#define SZ 1024 // Less than page size

int main(int argn, char *argv[])
 {
 int fd;
 char buffer[SZ];
 char *map;

 if (argn!=2)
  {
  fprintf(stderr, "Usage:\n %s [filename]\n", argv[0]);
  exit(-1);
  }

 memset(buffer, 0, SZ);

 fd=open(argv[1], O_RDWR|O_CREAT|O_TRUNC, S_IRUSR|S_IWUSR);
 if (fd==-1) error("Could not create file");

 if (write(fd, buffer, SZ)!=SZ) error("Could not write buffer");

 map=mmap(NULL, SZ, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
 if (map==MAP_FAILED) error("Map failed");
 map[SZ-1]=1;

 if (ftruncate(fd, SZ+1)!=0) error("Could not truncate file");
 
 if (map[SZ-1]==1)
  printf("Test passed\n");
 else
  printf("Test failed\n");

 exit(0);
 } 
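
The workaround mentioned above (flushing the mapping before resizing) can be sketched as follows. This is only an illustration under stated assumptions: the helper name grow_after_msync() and its return convention are hypothetical and not part of the original test program, and it has not been verified against the affected NFS clients.

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#define SZ 1024 /* same size as the test program: less than a page */

/*
 * Hypothetical helper (not part of the original report): writes SZ zero
 * bytes, dirties the last byte through a shared mapping, flushes it with
 * msync(), then grows the file with ftruncate().
 * Returns 1 if the byte survived, 0 if it was lost, -1 on a system error.
 */
int grow_after_msync(const char *path)
{
    char buffer[SZ] = {0};
    char *map;
    int fd, result;

    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);
    if (fd == -1)
        return -1;
    if (write(fd, buffer, SZ) != SZ) {
        close(fd);
        return -1;
    }
    map = mmap(NULL, SZ, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }
    map[SZ - 1] = 1;

    /* Workaround: write the dirty page back before resizing the file. */
    if (msync(map, SZ, MS_SYNC) != 0 || ftruncate(fd, SZ + 1) != 0)
        result = -1;
    else
        result = (map[SZ - 1] == 1) ? 1 : 0;

    munmap(map, SZ);
    close(fd);
    return result;
}
```

Calling grow_after_msync() with a path on the NFS mount and comparing the result with the original program's output would show whether the explicit msync() is what makes the difference.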

>How-To-Repeat:
      
>Fix:
      
>Release-Note:
>Audit-Trail:

From: Peter Edwards <peadar@freebsd.org>
To: freebsd-gnats-submit@FreeBSD.org, patrick@spacesurfer.com
Cc: Stephen McKay <smckay@internode.on.net>
Subject: Re: misc/64816: mmap and/or ftruncate does not work correctly on
 nfs mounted file systems
Date: Tue, 20 Apr 2004 01:05:13 +0100

 
 I'm pretty sure I know what's going on here.
 
 There's an optimisation in nfs_getpages() that returns partially valid 
 pages on the assumption that the partially _invalid_ parts are only 
 those beyond the end of the file. In the test program, for example,
 assuming the file doesn't exist initially, this happens when the code
 hits this line:
 
  > map[SZ-1]=1;
 
 The first 1K of the file is valid, but the rest isn't (because it
 doesn't exist).
 
 However, once a page is mapped and dragged in for a fault, if the
 requested page in nfs_getpages() isn't totally valid on return, the
 invalid parts are zeroed and marked as valid by vm_fault().
 
 Later, at the ftruncate(), we call nfs_flush(). While writing out the 
 dirty data, it gets marked as no longer "valid" (vfs_busy_pages(), 
 called eventually from bwrite()). However, the end of the page doesn't 
 go through the bwrite() interface (because it's not present in the 
 file), so it's not caught by this.
 
 For our case, by the time we finish ftruncate(), we now have the first 
 two DEV_BSIZE chunks of the first page marked invalid and not clean, and 
 the remaining 6 marked as valid and clean (assuming intel architecture). 
 This is the exact opposite of the assumption that the optimisation 
 makes: that the only clean parts happen to be the "real" parts of the file.
 
 The attached patch addresses the problem, and it's run through a 
 buildworld over NFS. It simply ensures that a partially valid page meets 
 the previously assumed criteria.
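
The criterion described above can be modelled outside the kernel. The following is a minimal user-space sketch, assuming 512-byte DEV_BSIZE chunks and one "valid" bit per chunk with the lowest bit covering the first chunk of the page. The function name valid_up_to_eof(), the CHUNK_SIZE macro and the parameter names are hypothetical; this is not kernel API, only an illustration of the loop in the attached patch.

```c
#include <stdint.h>

#define CHUNK_SIZE 512 /* assumed value of DEV_BSIZE */

/*
 * valid:         bitmask of valid chunks in the page, bit 0 = first chunk.
 * bytes_in_page: how many bytes of the file fall inside this page
 *                (file size minus the page's offset in the file).
 * Returns 1 when every chunk that holds real file data is marked valid,
 * i.e. the only invalid parts lie beyond the end of the file.
 */
int valid_up_to_eof(uint32_t valid, long bytes_in_page)
{
    long resid = bytes_in_page;
    uint32_t mask;

    /* Walk the contiguous run of valid chunks from the start of the page. */
    for (mask = 1; valid & mask; mask <<= 1) {
        resid -= CHUNK_SIZE;
        if (resid <= 0)
            return 1; /* valid run reaches the end of the file data */
    }
    return 0; /* an invalid chunk was found before the end of the file */
}
```

With the state described for the test program after ftruncate() (first two chunks invalid, remaining six valid, so a mask of 0xFC), valid_up_to_eof(0xFC, 1025) returns 0 and the page would be re-read, whereas a plain "valid != 0" test would have returned the page as it stood.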
 
 I'm awaiting a review from a somewhat more experienced hacker that I've 
 been pestering before committing this, but in the meantime, I thought 
 I'd share the analysis and patch. Any feedback appreciated.
 
 --------------060505050408010706000408
 Content-Type: text/plain;
  name="patchnfs_getpages.txt"
 Content-Transfer-Encoding: 7bit
 Content-Disposition: inline;
  filename="patchnfs_getpages.txt"
 
 Index: sys/nfsclient/nfs_bio.c
 ===================================================================
 RCS file: /usr/cvs/FreeBSD-CVS/src/sys/nfsclient/nfs_bio.c,v
 retrieving revision 1.130
 diff -u -r1.130 nfs_bio.c
 --- sys/nfsclient/nfs_bio.c	14 Apr 2004 23:23:55 -0000	1.130
 +++ sys/nfsclient/nfs_bio.c	19 Apr 2004 23:19:33 -0000
 @@ -40,6 +40,7 @@
  #include <sys/bio.h>
  #include <sys/buf.h>
  #include <sys/kernel.h>
 +#include <sys/sysctl.h>
  #include <sys/mount.h>
  #include <sys/proc.h>
  #include <sys/resourcevar.h>
 @@ -113,6 +114,7 @@
  	struct nfsmount *nmp;
  	vm_object_t object;
  	vm_page_t *pages;
 +	struct nfsnode *np;
  
  	GIANT_REQUIRED;
  
 @@ -120,6 +122,7 @@
  	td = curthread;				/* XXX */
  	cred = curthread->td_ucred;		/* XXX */
  	nmp = VFSTONFS(vp->v_mount);
 +	np = VTONFS(vp);
  	pages = ap->a_m;
  	count = ap->a_count;
  
 @@ -137,26 +140,48 @@
  	npages = btoc(count);
  
  	/*
 -	 * If the requested page is partially valid, just return it and
 -	 * allow the pager to zero-out the blanks.  Partially valid pages
 -	 * can only occur at the file EOF.
 +	 * If the requested page is partially valid, return it if all
 +	 * the invalid parts are beyond the end of the file.
 +	 * Partially valid pages can only occur at the file EOF.
 +	 *
 +	 * Note:
 +	 * The bits beyond the file may be marked as valid if the file is
 +	 * mapped: vm_fault() calls vm_page_zero_invalid() for the sections
 +	 * of the page that the pager doesn't validate, (for NFS, those
 +	 * beyond the end of the file). Unlike the other bits in the page,
 +	 * these will not get cleared by vinvalbuf(), so we can end up in the
 +	 * situation where the part of the page representing real data in the
 +	 * file is not valid, while the part extending past the end is marked
 +	 * as such.
  	 */
  
  	{
 +		int resid;
 +		u_long mask; /* Largest type for "valid" in vm_page. */
  		vm_page_t m = pages[ap->a_reqpage];
  
  		VM_OBJECT_LOCK(object);
  		vm_page_lock_queues();
  		if (m->valid != 0) {
 -			/* handled by vm_fault now	  */
 -			/* vm_page_zero_invalid(m, TRUE); */
 -			for (i = 0; i < npages; ++i) {
 -				if (i != ap->a_reqpage)
 -					vm_page_free(pages[i]);
 +			off_t pgoff = IDX_TO_OFF(m->pindex);
 +			off_t fileoff = np->n_size;
 +			resid = fileoff - pgoff;
 +			KASSERT(fileoff - pgoff < PAGE_SIZE,
 +			    ("partially valid page not at end of file"));
 +			mask = 1;
 +			for (mask = 1; m->valid & mask; mask <<= 1) {
 +				resid -= DEV_BSIZE;
 +				if (resid <= 0) {
 +					/* Page is valid enough, return it */
 +					for (i = 0; i < npages; ++i) {
 +						if (i != ap->a_reqpage)
 +							vm_page_free(pages[i]);
 +					}
 +					vm_page_unlock_queues();
 +					VM_OBJECT_UNLOCK(object);
 +					return(0);
 +				}
  			}
 -			vm_page_unlock_queues();
 -			VM_OBJECT_UNLOCK(object);
 -			return(0);
  		}
  		vm_page_unlock_queues();
  		VM_OBJECT_UNLOCK(object);
 @@ -229,8 +254,6 @@
  			 */
  			m->valid = 0;
  			vm_page_set_validclean(m, 0, size - toff);
 -			/* handled by vm_fault now	  */
 -			/* vm_page_zero_invalid(m, TRUE); */
  		} else {
  			/*
  			 * Read operation was short.  If no error occured
 
 --------------060505050408010706000408--
State-Changed-From-To: open->analyzed 
State-Changed-By: peadar 
State-Changed-When: Mon Apr 19 17:10:13 PDT 2004 
State-Changed-Why:  
Analyzed, and patch for consideration. 


Responsible-Changed-From-To: freebsd-bugs->peadar 
Responsible-Changed-By: peadar 
Responsible-Changed-When: Mon Apr 19 17:10:13 PDT 2004 
Responsible-Changed-Why:  
I <heart> NFS problems. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=64816 

From: Rong-En Fan <rafan@infor.org>
To: bug-followup@FreeBSD.org, patrick@spacesurfer.com
Cc: peadar@FreeBSD.org
Subject: Re: kern/64816: [nfs] mmap and/or ftruncate does not work correctly on nfs mounted file systems
Date: Sat, 11 Mar 2006 10:32:52 +0800 (CST)

 Hi all,
 
 While browsing NFS-related PRs, I saw this one. The test program
 still fails on 6.0-RELEASE. The patch posted by peadar@ solves
 this and makes the test program succeed.
 
 What is the status of this PR?
 
 Cheers,
 Rong-En Fan

From: "Peter Edwards" <peadar.edwards@gmail.com>
To: "Rong-En Fan" <rafan@infor.org>
Cc: bug-followup@freebsd.org, patrick@spacesurfer.com, peadar@freebsd.org
Subject: Re: kern/64816: [nfs] mmap and/or ftruncate does not work correctly on nfs mounted file systems
Date: Sun, 12 Mar 2006 16:16:57 +0000

 On 3/11/06, Rong-En Fan <rafan@infor.org> wrote:
 > Hi all,
 >
 > While browsing NFS-related PRs, I saw this one. The test program
 > still fails on 6.0-RELEASE. The patch posted by peadar@ solves
 > this and makes the test program succeed.
 >
 > What is the status of this PR?
 >
 > Cheers,
 > Rong-En Fan
 >
 Ah, I'd forgotten about this.
 There are unfortunately some problems with the patch. Matt Dillon was
 very helpful in describing the problem, but before I had a complete
 understanding of what he was saying, I got distracted by real life.
 I'll refresh my knowledge of the issue and update the PR sometime
 this week.
Responsible-Changed-From-To: peadar->freebsd-bugs 
Responsible-Changed-By: linimon 
Responsible-Changed-When: Tue Jul 31 14:50:58 UTC 2007 
Responsible-Changed-Why:  
Committer is away from FreeBSD work right now, so assign this back to 
the general pool, while noting that it contains a patch. 

State-Changed-From-To: analyzed->open 
State-Changed-By: eadler 
State-Changed-When: Sun Jul 1 15:59:43 UTC 2012 
State-Changed-Why:  
unowned PRs should not be in analyzed state 

>Unformatted:
