From joelh@thor.piquan.org  Sun Mar 11 11:13:34 2012
Return-Path: <joelh@thor.piquan.org>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id ABB271065672
	for <FreeBSD-gnats-submit@freebsd.org>; Sun, 11 Mar 2012 11:13:34 +0000 (UTC)
	(envelope-from joelh@thor.piquan.org)
Received: from thor.piquan.org (unknown [IPv6:2001:470:1f05:1741:201:2ff:fe8b:103e])
	by mx1.freebsd.org (Postfix) with ESMTP id 59C2B8FC1A
	for <FreeBSD-gnats-submit@freebsd.org>; Sun, 11 Mar 2012 11:13:34 +0000 (UTC)
Received: from thor.piquan.org (localhost [127.0.0.1])
	by thor.piquan.org (8.14.5/8.14.5) with ESMTP id q2BBDXIx024870
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Sun, 11 Mar 2012 04:13:34 -0700 (PDT)
	(envelope-from joelh@thor.piquan.org)
Received: (from joelh@localhost)
	by thor.piquan.org (8.14.5/8.14.5/Submit) id q2BBDX0V024869;
	Sun, 11 Mar 2012 04:13:33 -0700 (PDT)
	(envelope-from joelh)
Message-Id: <201203111113.q2BBDX0V024869@thor.piquan.org>
Date: Sun, 11 Mar 2012 04:13:33 -0700 (PDT)
From: Joel Ray Holveck <joelh@juniper.net>
Reply-To: Joel Ray Holveck <joelh@juniper.net>
To: FreeBSD-gnats-submit@freebsd.org
Cc: David Wolfskill <dwolf@juniper.net>
Subject: msync reports success after a failed pager flush
X-Send-Pr-Version: 3.113
X-GNATS-Notify:

>Number:         165927
>Category:       kern
>Synopsis:       [libc] msync(2) reports success after a failed pager flush
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sun Mar 11 11:20:10 UTC 2012
>Closed-Date:    Tue Apr 10 10:53:07 UTC 2012
>Last-Modified:  Tue Apr 10 10:53:07 UTC 2012
>Originator:     Joel Ray Holveck <joelh@juniper.net>
>Release:        FreeBSD 8.3-PRERELEASE i386
>Organization:
Juniper Networks, Inc.
>Environment:
System: FreeBSD thor.piquan.org 8.3-PRERELEASE FreeBSD 8.3-PRERELEASE #2: Sat Feb 25 15:52:16 PST 2012 root@thor.piquan.org:/usr/obj/usr/src/sys/THOR i386

>Description:
When a process writes to an mmap-backed file, there are common
circumstances under which the changed data is not properly flushed.
Nevertheless, msync may report success.

The bug is most easily demonstrated using NFS, so much of this
description refers to NFS-based errors.  However, these are only
examples; the bug can apply to many other filesystems as well.

If a process has an NFS-backed file mmapped and dirties the data,
there are several common circumstances under which it might not be
properly flushed.  The bug in kern/165923 is one such situation: the
backing file is written with the wrong uid, so the server returns
NFSERR_ACCES.  Another client might delete the file, making the server
return NFSERR_STALE.

Formerly (e.g., in 8.2-RELEASE), this would cause the client's VM
subsystem to go into an infinite loop: the client would attempt to
flush to the server, the server would return an error, and the client
would leave the pages on the dirty list while still needing to flush
them, ad infinitum.

In r223054 (on stable/8; MFC r222586), this behavior was changed: the
VM system marks the pages as clean to avoid this type of loop.
However, this comes with its own set of problems.

As an example, consider a process gathering data into an mmap-backed
datastore.  While it is writing, another client changes the ownership
or mode of the file.  Next, the syncer daemon attempts to flush the
datastore; the flush fails, so the pages are marked clean.  The
data-gathering process later calls msync, and since the pages are
"clean" (according to the client's VM system), msync returns success.
However, the data has never been written to disk.

>How-To-Repeat:
See the program in the "How-To-Repeat" section of kern/165923.

If kern/165923 has not yet been fixed, then that program will
demonstrate the bug by itself using the instructions in that PR: note
that the pages are not written, but msync returns success.

Alternatively, the program can still demonstrate the bug, but with
more effort.  Make sure that both WAIT_FOR_SYNC and DO_MMAP are turned
on.  While the client program sleeps during the WAIT_FOR_SYNC
interval, run "chflags uchg backing-store" on the server.  (A chmod
won't be sufficient on a FreeBSD 8.2 server, but might be on others.)
Be quick; you have to do this before the client's syncer flushes the
file, which will happen within 0-30 seconds.  (If kern/165923 has not
been fixed, then you don't have to hurry; the syncer can't save the
file.)  Either wait for the sleep to return, or press ^C (which will
stop the sleep and continue with the call to msync).

Observe (using "od -X" or similar) that the file's contents will not
have changed, but the msync succeeded.

This indicates that calling msync(2) is necessary, but NOT sufficient,
for a process to verify that mapped files are flushed.  That it is
necessary at all is contrary to the documentation in the msync(2) and
mmap(2) man pages.  That it is not sufficient is contrary to POSIX's
assertion that msync may be used "for synchronized I/O data integrity
completion" (and to more explicit verbiage in the informative
sections; cf.
<http://pubs.opengroup.org/onlinepubs/9699919799/functions/msync.html>),
which is the subject of the present PR.

>Fix:
The VM system currently (as of r222586) marks pages that cannot be
written as clean.  Instead, the VM object should be made unavailable
(unmapped, set VM_PROT_NONE, or similar), so that later memory
accesses raise a SIGSEGV and msyncs return EINVAL (or ENOMEM according
to POSIX.1-2008).  While this means that the program will almost
certainly exit with an error, that is appropriate, since its write did
fail.  (This is also similar to what happens if a swap drive fails.)

This bug is most visible in conjunction with kern/165923, since that
bug causes the sort of failure that triggers the bug currently under
discussion.  However, they are independent.  As described in
How-To-Repeat, an analogous situation can arise with NFSERR_STALE.
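For reference, the committed resolution (recorded in the audit trail
below) took a different route than the one proposed here: msync(2) now
returns EIO after a failed pager flush (r233100), and r233101 added a
sysctl that can restore the old NFS client behaviour at runtime:

```shell
# Restore the pre-r222586 NFS client behaviour: keep pages dirty
# (and retry the flush) when a write RPC fails, instead of marking
# them clean.  Default is 0.
sysctl vfs.nfs.nfs_keep_dirty_on_error=1
```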
>Release-Note:
>Audit-Trail:

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/165927: commit references a PR
Date: Sat, 17 Mar 2012 23:00:43 +0000 (UTC)

 Author: kib
 Date: Sat Mar 17 23:00:32 2012
 New Revision: 233100
 URL: http://svn.freebsd.org/changeset/base/233100
 
 Log:
   In vm_object_page_clean(), do not clean OBJ_MIGHTBEDIRTY object flag
   if the filesystem performed short write and we are skipping the page
   due to this.
   
   Propogate write error from the pager back to the callers of
   vm_pageout_flush().  Report the failure to write a page from the
   requested range as the FALSE return value from vm_object_page_clean(),
   and propagate it back to msync(2) to return EIO to usermode.
   
   While there, convert the clearobjflags variable in the
   vm_object_page_clean() and arguments of the helper functions to
   boolean.
   
   PR:	kern/165927
   Reviewed by:	alc
   MFC after:	2 weeks
 
 Modified:
   head/sys/vm/vm_contig.c
   head/sys/vm/vm_map.c
   head/sys/vm/vm_mmap.c
   head/sys/vm/vm_object.c
   head/sys/vm/vm_object.h
   head/sys/vm/vm_pageout.c
   head/sys/vm/vm_pageout.h
 
 Modified: head/sys/vm/vm_contig.c
 ==============================================================================
 --- head/sys/vm/vm_contig.c	Sat Mar 17 22:29:05 2012	(r233099)
 +++ head/sys/vm/vm_contig.c	Sat Mar 17 23:00:32 2012	(r233100)
 @@ -137,7 +137,8 @@ vm_contig_launder_page(vm_page_t m, vm_p
  			   object->type == OBJT_DEFAULT) {
  			vm_page_unlock_queues();
  			m_tmp = m;
 -			vm_pageout_flush(&m_tmp, 1, VM_PAGER_PUT_SYNC, 0, NULL);
 +			vm_pageout_flush(&m_tmp, 1, VM_PAGER_PUT_SYNC, 0,
 +			    NULL, NULL);
  			VM_OBJECT_UNLOCK(object);
  			vm_page_lock_queues();
  			return (0);
 
 Modified: head/sys/vm/vm_map.c
 ==============================================================================
 --- head/sys/vm/vm_map.c	Sat Mar 17 22:29:05 2012	(r233099)
 +++ head/sys/vm/vm_map.c	Sat Mar 17 23:00:32 2012	(r233100)
 @@ -2591,6 +2591,7 @@ vm_map_sync(
  	vm_object_t object;
  	vm_ooffset_t offset;
  	unsigned int last_timestamp;
 +	boolean_t failed;
  
  	vm_map_lock_read(map);
  	VM_MAP_RANGE_CHECK(map, start, end);
 @@ -2620,6 +2621,7 @@ vm_map_sync(
  
  	if (invalidate)
  		pmap_remove(map->pmap, start, end);
 +	failed = FALSE;
  
  	/*
  	 * Make a second pass, cleaning/uncaching pages from the indicated
 @@ -2648,7 +2650,8 @@ vm_map_sync(
  		vm_object_reference(object);
  		last_timestamp = map->timestamp;
  		vm_map_unlock_read(map);
 -		vm_object_sync(object, offset, size, syncio, invalidate);
 +		if (!vm_object_sync(object, offset, size, syncio, invalidate))
 +			failed = TRUE;
  		start += size;
  		vm_object_deallocate(object);
  		vm_map_lock_read(map);
 @@ -2658,7 +2661,7 @@ vm_map_sync(
  	}
  
  	vm_map_unlock_read(map);
 -	return (KERN_SUCCESS);
 +	return (failed ? KERN_FAILURE : KERN_SUCCESS);
  }
  
  /*
 
 Modified: head/sys/vm/vm_mmap.c
 ==============================================================================
 --- head/sys/vm/vm_mmap.c	Sat Mar 17 22:29:05 2012	(r233099)
 +++ head/sys/vm/vm_mmap.c	Sat Mar 17 23:00:32 2012	(r233100)
 @@ -508,6 +508,8 @@ sys_msync(td, uap)
  		return (EINVAL);	/* Sun returns ENOMEM? */
  	case KERN_INVALID_ARGUMENT:
  		return (EBUSY);
 +	case KERN_FAILURE:
 +		return (EIO);
  	default:
  		return (EINVAL);
  	}
 
 Modified: head/sys/vm/vm_object.c
 ==============================================================================
 --- head/sys/vm/vm_object.c	Sat Mar 17 22:29:05 2012	(r233099)
 +++ head/sys/vm/vm_object.c	Sat Mar 17 23:00:32 2012	(r233100)
 @@ -101,9 +101,10 @@ SYSCTL_INT(_vm, OID_AUTO, old_msync, CTL
      "Use old (insecure) msync behavior");
  
  static int	vm_object_page_collect_flush(vm_object_t object, vm_page_t p,
 -		    int pagerflags, int flags, int *clearobjflags);
 +		    int pagerflags, int flags, boolean_t *clearobjflags,
 +		    boolean_t *eio);
  static boolean_t vm_object_page_remove_write(vm_page_t p, int flags,
 -		    int *clearobjflags);
 +		    boolean_t *clearobjflags);
  static void	vm_object_qcollapse(vm_object_t object);
  static void	vm_object_vndeallocate(vm_object_t object);
  
 @@ -775,7 +776,7 @@ vm_object_terminate(vm_object_t object)
   * page should be flushed, and FALSE otherwise.
   */
  static boolean_t
 -vm_object_page_remove_write(vm_page_t p, int flags, int *clearobjflags)
 +vm_object_page_remove_write(vm_page_t p, int flags, boolean_t *clearobjflags)
  {
  
  	/*
 @@ -784,7 +785,7 @@ vm_object_page_remove_write(vm_page_t p,
  	 * cleared in this case so we do not have to set them.
  	 */
  	if ((flags & OBJPC_NOSYNC) != 0 && (p->oflags & VPO_NOSYNC) != 0) {
 -		*clearobjflags = 0;
 +		*clearobjflags = FALSE;
  		return (FALSE);
  	} else {
  		pmap_remove_write(p);
 @@ -806,21 +807,25 @@ vm_object_page_remove_write(vm_page_t p,
   *	Odd semantics: if start == end, we clean everything.
   *
   *	The object must be locked.
 + *
 + *	Returns FALSE if some page from the range was not written, as
 + *	reported by the pager, and TRUE otherwise.
   */
 -void
 +boolean_t
  vm_object_page_clean(vm_object_t object, vm_ooffset_t start, vm_ooffset_t end,
      int flags)
  {
  	vm_page_t np, p;
  	vm_pindex_t pi, tend, tstart;
 -	int clearobjflags, curgeneration, n, pagerflags;
 +	int curgeneration, n, pagerflags;
 +	boolean_t clearobjflags, eio, res;
  
  	mtx_assert(&vm_page_queue_mtx, MA_NOTOWNED);
  	VM_OBJECT_LOCK_ASSERT(object, MA_OWNED);
  	KASSERT(object->type == OBJT_VNODE, ("Not a vnode object"));
  	if ((object->flags & OBJ_MIGHTBEDIRTY) == 0 ||
  	    object->resident_page_count == 0)
 -		return;
 +		return (TRUE);
  
  	pagerflags = (flags & (OBJPC_SYNC | OBJPC_INVAL)) != 0 ?
  	    VM_PAGER_PUT_SYNC : VM_PAGER_CLUSTER_OK;
 @@ -829,6 +834,7 @@ vm_object_page_clean(vm_object_t object,
  	tstart = OFF_TO_IDX(start);
  	tend = (end == 0) ? object->size : OFF_TO_IDX(end + PAGE_MASK);
  	clearobjflags = tstart == 0 && tend >= object->size;
 +	res = TRUE;
  
  rescan:
  	curgeneration = object->generation;
 @@ -845,7 +851,7 @@ rescan:
  				if ((flags & OBJPC_SYNC) != 0)
  					goto rescan;
  				else
 -					clearobjflags = 0;
 +					clearobjflags = FALSE;
  			}
  			np = vm_page_find_least(object, pi);
  			continue;
 @@ -854,12 +860,16 @@ rescan:
  			continue;
  
  		n = vm_object_page_collect_flush(object, p, pagerflags,
 -		    flags, &clearobjflags);
 +		    flags, &clearobjflags, &eio);
 +		if (eio) {
 +			res = FALSE;
 +			clearobjflags = FALSE;
 +		}
  		if (object->generation != curgeneration) {
  			if ((flags & OBJPC_SYNC) != 0)
  				goto rescan;
  			else
 -				clearobjflags = 0;
 +				clearobjflags = FALSE;
  		}
  
  		/*
 @@ -874,8 +884,10 @@ rescan:
  		 * behind, but there is not much we can do there if
  		 * filesystem refuses to write it.
  		 */
 -		if (n == 0)
 +		if (n == 0) {
  			n = 1;
 +			clearobjflags = FALSE;
 +		}
  		np = vm_page_find_least(object, pi + n);
  	}
  #if 0
 @@ -884,11 +896,12 @@ rescan:
  
  	if (clearobjflags)
  		vm_object_clear_flag(object, OBJ_MIGHTBEDIRTY);
 +	return (res);
  }
  
  static int
  vm_object_page_collect_flush(vm_object_t object, vm_page_t p, int pagerflags,
 -    int flags, int *clearobjflags)
 +    int flags, boolean_t *clearobjflags, boolean_t *eio)
  {
  	vm_page_t ma[vm_pageout_page_count], p_first, tp;
  	int count, i, mreq, runlen;
 @@ -921,7 +934,7 @@ vm_object_page_collect_flush(vm_object_t
  	for (tp = p_first, i = 0; i < count; tp = TAILQ_NEXT(tp, listq), i++)
  		ma[i] = tp;
  
 -	vm_pageout_flush(ma, count, pagerflags, mreq, &runlen);
 +	vm_pageout_flush(ma, count, pagerflags, mreq, &runlen, eio);
  	return (runlen);
  }
  
 @@ -939,17 +952,20 @@ vm_object_page_collect_flush(vm_object_t
   * Note: certain anonymous maps, such as MAP_NOSYNC maps,
   * may start out with a NULL object.
   */
 -void
 +boolean_t
  vm_object_sync(vm_object_t object, vm_ooffset_t offset, vm_size_t size,
      boolean_t syncio, boolean_t invalidate)
  {
  	vm_object_t backing_object;
  	struct vnode *vp;
  	struct mount *mp;
 -	int flags, fsync_after;
 +	int error, flags, fsync_after;
 +	boolean_t res;
  
  	if (object == NULL)
 -		return;
 +		return (TRUE);
 +	res = TRUE;
 +	error = 0;
  	VM_OBJECT_LOCK(object);
  	while ((backing_object = object->backing_object) != NULL) {
  		VM_OBJECT_LOCK(backing_object);
 @@ -995,13 +1011,16 @@ vm_object_sync(vm_object_t object, vm_oo
  			fsync_after = FALSE;
  		}
  		VM_OBJECT_LOCK(object);
 -		vm_object_page_clean(object, offset, offset + size, flags);
 +		res = vm_object_page_clean(object, offset, offset + size,
 +		    flags);
  		VM_OBJECT_UNLOCK(object);
  		if (fsync_after)
 -			(void) VOP_FSYNC(vp, MNT_WAIT, curthread);
 +			error = VOP_FSYNC(vp, MNT_WAIT, curthread);
  		VOP_UNLOCK(vp, 0);
  		VFS_UNLOCK_GIANT(vfslocked);
  		vn_finished_write(mp);
 +		if (error != 0)
 +			res = FALSE;
  		VM_OBJECT_LOCK(object);
  	}
  	if ((object->type == OBJT_VNODE ||
 @@ -1021,6 +1040,7 @@ vm_object_sync(vm_object_t object, vm_oo
  		    OFF_TO_IDX(offset + size + PAGE_MASK), flags);
  	}
  	VM_OBJECT_UNLOCK(object);
 +	return (res);
  }
  
  /*
 
 Modified: head/sys/vm/vm_object.h
 ==============================================================================
 --- head/sys/vm/vm_object.h	Sat Mar 17 22:29:05 2012	(r233099)
 +++ head/sys/vm/vm_object.h	Sat Mar 17 23:00:32 2012	(r233100)
 @@ -227,7 +227,7 @@ void vm_object_set_writeable_dirty (vm_o
  void vm_object_init (void);
  void vm_object_page_cache(vm_object_t object, vm_pindex_t start,
      vm_pindex_t end);
 -void vm_object_page_clean(vm_object_t object, vm_ooffset_t start,
 +boolean_t vm_object_page_clean(vm_object_t object, vm_ooffset_t start,
      vm_ooffset_t end, int flags);
  void vm_object_page_remove(vm_object_t object, vm_pindex_t start,
      vm_pindex_t end, int options);
 @@ -238,7 +238,7 @@ void vm_object_reference_locked(vm_objec
  int  vm_object_set_memattr(vm_object_t object, vm_memattr_t memattr);
  void vm_object_shadow (vm_object_t *, vm_ooffset_t *, vm_size_t);
  void vm_object_split(vm_map_entry_t);
 -void vm_object_sync(vm_object_t, vm_ooffset_t, vm_size_t, boolean_t,
 +boolean_t vm_object_sync(vm_object_t, vm_ooffset_t, vm_size_t, boolean_t,
      boolean_t);
  void vm_object_madvise (vm_object_t, vm_pindex_t, int, int);
  #endif				/* _KERNEL */
 
 Modified: head/sys/vm/vm_pageout.c
 ==============================================================================
 --- head/sys/vm/vm_pageout.c	Sat Mar 17 22:29:05 2012	(r233099)
 +++ head/sys/vm/vm_pageout.c	Sat Mar 17 23:00:32 2012	(r233100)
 @@ -445,7 +445,8 @@ more:
  	/*
  	 * we allow reads during pageouts...
  	 */
 -	return (vm_pageout_flush(&mc[page_base], pageout_count, 0, 0, NULL));
 +	return (vm_pageout_flush(&mc[page_base], pageout_count, 0, 0, NULL,
 +	    NULL));
  }
  
  /*
 @@ -459,9 +460,12 @@ more:
   *
   *	Returned runlen is the count of pages between mreq and first
   *	page after mreq with status VM_PAGER_AGAIN.
 + *	*eio is set to TRUE if pager returned VM_PAGER_ERROR or VM_PAGER_FAIL
 + *	for any page in runlen set.
   */
  int
 -vm_pageout_flush(vm_page_t *mc, int count, int flags, int mreq, int *prunlen)
 +vm_pageout_flush(vm_page_t *mc, int count, int flags, int mreq, int *prunlen,
 +    boolean_t *eio)
  {
  	vm_object_t object = mc[0]->object;
  	int pageout_status[count];
 @@ -493,6 +497,8 @@ vm_pageout_flush(vm_page_t *mc, int coun
  	vm_pager_put_pages(object, mc, count, flags, pageout_status);
  
  	runlen = count - mreq;
 +	if (eio != NULL)
 +		*eio = FALSE;
  	for (i = 0; i < count; i++) {
  		vm_page_t mt = mc[i];
  
 @@ -522,6 +528,8 @@ vm_pageout_flush(vm_page_t *mc, int coun
  			vm_page_lock(mt);
  			vm_page_activate(mt);
  			vm_page_unlock(mt);
 +			if (eio != NULL && i >= mreq && i - mreq < runlen)
 +				*eio = TRUE;
  			break;
  		case VM_PAGER_AGAIN:
  			if (i >= mreq && i - mreq < runlen)
 
 Modified: head/sys/vm/vm_pageout.h
 ==============================================================================
 --- head/sys/vm/vm_pageout.h	Sat Mar 17 22:29:05 2012	(r233099)
 +++ head/sys/vm/vm_pageout.h	Sat Mar 17 23:00:32 2012	(r233100)
 @@ -102,7 +102,7 @@ extern void vm_waitpfault(void);
  
  #ifdef _KERNEL
  boolean_t vm_pageout_fallback_object_lock(vm_page_t, vm_page_t *);
 -int vm_pageout_flush(vm_page_t *, int, int, int, int *);
 +int vm_pageout_flush(vm_page_t *, int, int, int, int *, boolean_t *);
  void vm_pageout_oom(int shortage);
  boolean_t vm_pageout_page_lock(vm_page_t, vm_page_t *);
  void vm_contig_grow_cache(int, vm_paddr_t, vm_paddr_t);
 _______________________________________________
 svn-src-all@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/svn-src-all
 To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
 

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/165927: commit references a PR
Date: Sat, 17 Mar 2012 23:03:32 +0000 (UTC)

 Author: kib
 Date: Sat Mar 17 23:03:20 2012
 New Revision: 233101
 URL: http://svn.freebsd.org/changeset/base/233101
 
 Log:
   Add sysctl vfs.nfs.nfs_keep_dirty_on_error to switch the nfs client
   behaviour on error from write RPC back to behaviour of old nfs client.
   When set to not zero, the pages for which write failed are kept dirty.
   
   PR:	kern/165927
   Reviewed by:	alc
   MFC after:	2 weeks
 
 Modified:
   head/sys/fs/nfsclient/nfs_clbio.c
   head/sys/fs/nfsclient/nfs_clvnops.c
 
 Modified: head/sys/fs/nfsclient/nfs_clbio.c
 ==============================================================================
 --- head/sys/fs/nfsclient/nfs_clbio.c	Sat Mar 17 23:00:32 2012	(r233100)
 +++ head/sys/fs/nfsclient/nfs_clbio.c	Sat Mar 17 23:03:20 2012	(r233101)
 @@ -66,6 +66,7 @@ extern int ncl_numasync;
  extern enum nfsiod_state ncl_iodwant[NFS_MAXASYNCDAEMON];
  extern struct nfsmount *ncl_iodmount[NFS_MAXASYNCDAEMON];
  extern int newnfs_directio_enable;
 +extern int nfs_keep_dirty_on_error;
  
  int ncl_pbuf_freecnt = -1;	/* start out unlimited */
  
 @@ -348,9 +349,11 @@ ncl_putpages(struct vop_putpages_args *a
  	pmap_qremove(kva, npages);
  	relpbuf(bp, &ncl_pbuf_freecnt);
  
 -	vnode_pager_undirty_pages(pages, rtvals, count - uio.uio_resid);
 -	if (must_commit)
 -		ncl_clearcommit(vp->v_mount);
 +	if (error == 0 || !nfs_keep_dirty_on_error) {
 +		vnode_pager_undirty_pages(pages, rtvals, count - uio.uio_resid);
 +		if (must_commit)
 +			ncl_clearcommit(vp->v_mount);
 +	}
  	return rtvals[0];
  }
  
 
 Modified: head/sys/fs/nfsclient/nfs_clvnops.c
 ==============================================================================
 --- head/sys/fs/nfsclient/nfs_clvnops.c	Sat Mar 17 23:00:32 2012	(r233100)
 +++ head/sys/fs/nfsclient/nfs_clvnops.c	Sat Mar 17 23:03:20 2012	(r233101)
 @@ -241,6 +241,10 @@ int newnfs_directio_enable = 0;
  SYSCTL_INT(_vfs_nfs, OID_AUTO, nfs_directio_enable, CTLFLAG_RW,
  	   &newnfs_directio_enable, 0, "Enable NFS directio");
  
 +int nfs_keep_dirty_on_error;
 +SYSCTL_INT(_vfs_nfs, OID_AUTO, nfs_keep_dirty_on_error, CTLFLAG_RW,
 +    &nfs_keep_dirty_on_error, 0, "Retry pageout if error returned");
 +
  /*
   * This sysctl allows other processes to mmap a file that has been opened
   * O_DIRECT by a process.  In general, having processes mmap the file while
 

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/165927: commit references a PR
Date: Sat, 31 Mar 2012 06:45:20 +0000 (UTC)

 Author: kib
 Date: Sat Mar 31 06:44:48 2012
 New Revision: 233728
 URL: http://svn.freebsd.org/changeset/base/233728
 
 Log:
   MFC r233100:
   In vm_object_page_clean(), do not clean OBJ_MIGHTBEDIRTY object flag
   if the filesystem performed short write and we are skipping the page
   due to this.
   
   Propogate write error from the pager back to the callers of
   vm_pageout_flush().  Report the failure to write a page from the
   requested range as the FALSE return value from vm_object_page_clean(),
   and propagate it back to msync(2) to return EIO to usermode.
   
   While there, convert the clearobjflags variable in the
   vm_object_page_clean() and arguments of the helper functions to
   boolean.
   
   PR:	kern/165927
 
 Modified:
   stable/9/sys/vm/vm_contig.c
   stable/9/sys/vm/vm_map.c
   stable/9/sys/vm/vm_mmap.c
   stable/9/sys/vm/vm_object.c
   stable/9/sys/vm/vm_object.h
   stable/9/sys/vm/vm_pageout.c
   stable/9/sys/vm/vm_pageout.h
 Directory Properties:
   stable/9/sys/   (props changed)
 
 Modified: stable/9/sys/vm/vm_contig.c
 ==============================================================================
 --- stable/9/sys/vm/vm_contig.c	Sat Mar 31 01:21:54 2012	(r233727)
 +++ stable/9/sys/vm/vm_contig.c	Sat Mar 31 06:44:48 2012	(r233728)
 @@ -139,7 +139,8 @@ vm_contig_launder_page(vm_page_t m, vm_p
  			   object->type == OBJT_DEFAULT) {
  			vm_page_unlock_queues();
  			m_tmp = m;
 -			vm_pageout_flush(&m_tmp, 1, VM_PAGER_PUT_SYNC, 0, NULL);
 +			vm_pageout_flush(&m_tmp, 1, VM_PAGER_PUT_SYNC, 0,
 +			    NULL, NULL);
  			VM_OBJECT_UNLOCK(object);
  			vm_page_lock_queues();
  			return (0);
 
 Modified: stable/9/sys/vm/vm_map.c
 ==============================================================================
 --- stable/9/sys/vm/vm_map.c	Sat Mar 31 01:21:54 2012	(r233727)
 +++ stable/9/sys/vm/vm_map.c	Sat Mar 31 06:44:48 2012	(r233728)
 @@ -2591,6 +2591,7 @@ vm_map_sync(
  	vm_object_t object;
  	vm_ooffset_t offset;
  	unsigned int last_timestamp;
 +	boolean_t failed;
  
  	vm_map_lock_read(map);
  	VM_MAP_RANGE_CHECK(map, start, end);
 @@ -2620,6 +2621,7 @@ vm_map_sync(
  
  	if (invalidate)
  		pmap_remove(map->pmap, start, end);
 +	failed = FALSE;
  
  	/*
  	 * Make a second pass, cleaning/uncaching pages from the indicated
 @@ -2648,7 +2650,8 @@ vm_map_sync(
  		vm_object_reference(object);
  		last_timestamp = map->timestamp;
  		vm_map_unlock_read(map);
 -		vm_object_sync(object, offset, size, syncio, invalidate);
 +		if (!vm_object_sync(object, offset, size, syncio, invalidate))
 +			failed = TRUE;
  		start += size;
  		vm_object_deallocate(object);
  		vm_map_lock_read(map);
 @@ -2658,7 +2661,7 @@ vm_map_sync(
  	}
  
  	vm_map_unlock_read(map);
 -	return (KERN_SUCCESS);
 +	return (failed ? KERN_FAILURE : KERN_SUCCESS);
  }
  
  /*
 
 Modified: stable/9/sys/vm/vm_mmap.c
 ==============================================================================
 --- stable/9/sys/vm/vm_mmap.c	Sat Mar 31 01:21:54 2012	(r233727)
 +++ stable/9/sys/vm/vm_mmap.c	Sat Mar 31 06:44:48 2012	(r233728)
 @@ -509,6 +509,8 @@ sys_msync(td, uap)
  		return (EINVAL);	/* Sun returns ENOMEM? */
  	case KERN_INVALID_ARGUMENT:
  		return (EBUSY);
 +	case KERN_FAILURE:
 +		return (EIO);
  	default:
  		return (EINVAL);
  	}
 
 Modified: stable/9/sys/vm/vm_object.c
 ==============================================================================
 --- stable/9/sys/vm/vm_object.c	Sat Mar 31 01:21:54 2012	(r233727)
 +++ stable/9/sys/vm/vm_object.c	Sat Mar 31 06:44:48 2012	(r233728)
 @@ -101,9 +101,10 @@ SYSCTL_INT(_vm, OID_AUTO, old_msync, CTL
      "Use old (insecure) msync behavior");
  
  static int	vm_object_page_collect_flush(vm_object_t object, vm_page_t p,
 -		    int pagerflags, int flags, int *clearobjflags);
 +		    int pagerflags, int flags, boolean_t *clearobjflags,
 +		    boolean_t *eio);
  static boolean_t vm_object_page_remove_write(vm_page_t p, int flags,
 -		    int *clearobjflags);
 +		    boolean_t *clearobjflags);
  static void	vm_object_qcollapse(vm_object_t object);
  static void	vm_object_vndeallocate(vm_object_t object);
  
 @@ -774,7 +775,7 @@ vm_object_terminate(vm_object_t object)
   * page should be flushed, and FALSE otherwise.
   */
  static boolean_t
 -vm_object_page_remove_write(vm_page_t p, int flags, int *clearobjflags)
 +vm_object_page_remove_write(vm_page_t p, int flags, boolean_t *clearobjflags)
  {
  
  	/*
 @@ -783,7 +784,7 @@ vm_object_page_remove_write(vm_page_t p,
  	 * cleared in this case so we do not have to set them.
  	 */
  	if ((flags & OBJPC_NOSYNC) != 0 && (p->oflags & VPO_NOSYNC) != 0) {
 -		*clearobjflags = 0;
 +		*clearobjflags = FALSE;
  		return (FALSE);
  	} else {
  		pmap_remove_write(p);
 @@ -805,21 +806,25 @@ vm_object_page_remove_write(vm_page_t p,
   *	Odd semantics: if start == end, we clean everything.
   *
   *	The object must be locked.
 + *
 + *	Returns FALSE if some page from the range was not written, as
 + *	reported by the pager, and TRUE otherwise.
   */
 -void
 +boolean_t
  vm_object_page_clean(vm_object_t object, vm_ooffset_t start, vm_ooffset_t end,
      int flags)
  {
  	vm_page_t np, p;
  	vm_pindex_t pi, tend, tstart;
 -	int clearobjflags, curgeneration, n, pagerflags;
 +	int curgeneration, n, pagerflags;
 +	boolean_t clearobjflags, eio, res;
  
  	mtx_assert(&vm_page_queue_mtx, MA_NOTOWNED);
  	VM_OBJECT_LOCK_ASSERT(object, MA_OWNED);
  	KASSERT(object->type == OBJT_VNODE, ("Not a vnode object"));
  	if ((object->flags & OBJ_MIGHTBEDIRTY) == 0 ||
  	    object->resident_page_count == 0)
 -		return;
 +		return (TRUE);
  
  	pagerflags = (flags & (OBJPC_SYNC | OBJPC_INVAL)) != 0 ?
  	    VM_PAGER_PUT_SYNC : VM_PAGER_CLUSTER_OK;
 @@ -828,6 +833,7 @@ vm_object_page_clean(vm_object_t object,
  	tstart = OFF_TO_IDX(start);
  	tend = (end == 0) ? object->size : OFF_TO_IDX(end + PAGE_MASK);
  	clearobjflags = tstart == 0 && tend >= object->size;
 +	res = TRUE;
  
  rescan:
  	curgeneration = object->generation;
 @@ -844,7 +850,7 @@ rescan:
  				if ((flags & OBJPC_SYNC) != 0)
  					goto rescan;
  				else
 -					clearobjflags = 0;
 +					clearobjflags = FALSE;
  			}
  			np = vm_page_find_least(object, pi);
  			continue;
 @@ -853,12 +859,16 @@ rescan:
  			continue;
  
  		n = vm_object_page_collect_flush(object, p, pagerflags,
 -		    flags, &clearobjflags);
 +		    flags, &clearobjflags, &eio);
 +		if (eio) {
 +			res = FALSE;
 +			clearobjflags = FALSE;
 +		}
  		if (object->generation != curgeneration) {
  			if ((flags & OBJPC_SYNC) != 0)
  				goto rescan;
  			else
 -				clearobjflags = 0;
 +				clearobjflags = FALSE;
  		}
  
  		/*
 @@ -873,8 +883,10 @@ rescan:
  		 * behind, but there is not much we can do there if
  		 * filesystem refuses to write it.
  		 */
 -		if (n == 0)
 +		if (n == 0) {
  			n = 1;
 +			clearobjflags = FALSE;
 +		}
  		np = vm_page_find_least(object, pi + n);
  	}
  #if 0
 @@ -883,11 +895,12 @@ rescan:
  
  	if (clearobjflags)
  		vm_object_clear_flag(object, OBJ_MIGHTBEDIRTY);
 +	return (res);
  }
  
  static int
  vm_object_page_collect_flush(vm_object_t object, vm_page_t p, int pagerflags,
 -    int flags, int *clearobjflags)
 +    int flags, boolean_t *clearobjflags, boolean_t *eio)
  {
  	vm_page_t ma[vm_pageout_page_count], p_first, tp;
  	int count, i, mreq, runlen;
 @@ -920,7 +933,7 @@ vm_object_page_collect_flush(vm_object_t
  	for (tp = p_first, i = 0; i < count; tp = TAILQ_NEXT(tp, listq), i++)
  		ma[i] = tp;
  
 -	vm_pageout_flush(ma, count, pagerflags, mreq, &runlen);
 +	vm_pageout_flush(ma, count, pagerflags, mreq, &runlen, eio);
  	return (runlen);
  }
  
 @@ -938,17 +951,20 @@ vm_object_page_collect_flush(vm_object_t
   * Note: certain anonymous maps, such as MAP_NOSYNC maps,
   * may start out with a NULL object.
   */
 -void
 +boolean_t
  vm_object_sync(vm_object_t object, vm_ooffset_t offset, vm_size_t size,
      boolean_t syncio, boolean_t invalidate)
  {
  	vm_object_t backing_object;
  	struct vnode *vp;
  	struct mount *mp;
 -	int flags, fsync_after;
 +	int error, flags, fsync_after;
 +	boolean_t res;
  
  	if (object == NULL)
 -		return;
 +		return (TRUE);
 +	res = TRUE;
 +	error = 0;
  	VM_OBJECT_LOCK(object);
  	while ((backing_object = object->backing_object) != NULL) {
  		VM_OBJECT_LOCK(backing_object);
 @@ -994,13 +1010,16 @@ vm_object_sync(vm_object_t object, vm_oo
  			fsync_after = FALSE;
  		}
  		VM_OBJECT_LOCK(object);
 -		vm_object_page_clean(object, offset, offset + size, flags);
 +		res = vm_object_page_clean(object, offset, offset + size,
 +		    flags);
  		VM_OBJECT_UNLOCK(object);
  		if (fsync_after)
 -			(void) VOP_FSYNC(vp, MNT_WAIT, curthread);
 +			error = VOP_FSYNC(vp, MNT_WAIT, curthread);
  		VOP_UNLOCK(vp, 0);
  		VFS_UNLOCK_GIANT(vfslocked);
  		vn_finished_write(mp);
 +		if (error != 0)
 +			res = FALSE;
  		VM_OBJECT_LOCK(object);
  	}
  	if ((object->type == OBJT_VNODE ||
 @@ -1020,6 +1039,7 @@ vm_object_sync(vm_object_t object, vm_oo
  		    OFF_TO_IDX(offset + size + PAGE_MASK), flags);
  	}
  	VM_OBJECT_UNLOCK(object);
 +	return (res);
  }
  
  /*
 
 Modified: stable/9/sys/vm/vm_object.h
 ==============================================================================
 --- stable/9/sys/vm/vm_object.h	Sat Mar 31 01:21:54 2012	(r233727)
 +++ stable/9/sys/vm/vm_object.h	Sat Mar 31 06:44:48 2012	(r233728)
 @@ -228,7 +228,7 @@ void vm_object_set_writeable_dirty (vm_o
  void vm_object_init (void);
  void vm_object_page_cache(vm_object_t object, vm_pindex_t start,
      vm_pindex_t end);
 -void vm_object_page_clean(vm_object_t object, vm_ooffset_t start,
 +boolean_t vm_object_page_clean(vm_object_t object, vm_ooffset_t start,
      vm_ooffset_t end, int flags);
  void vm_object_page_remove(vm_object_t object, vm_pindex_t start,
      vm_pindex_t end, int options);
 @@ -239,7 +239,7 @@ void vm_object_reference_locked(vm_objec
  int  vm_object_set_memattr(vm_object_t object, vm_memattr_t memattr);
  void vm_object_shadow (vm_object_t *, vm_ooffset_t *, vm_size_t);
  void vm_object_split(vm_map_entry_t);
 -void vm_object_sync(vm_object_t, vm_ooffset_t, vm_size_t, boolean_t,
 +boolean_t vm_object_sync(vm_object_t, vm_ooffset_t, vm_size_t, boolean_t,
      boolean_t);
  void vm_object_madvise (vm_object_t, vm_pindex_t, int, int);
  #endif				/* _KERNEL */
 
 Modified: stable/9/sys/vm/vm_pageout.c
 ==============================================================================
 --- stable/9/sys/vm/vm_pageout.c	Sat Mar 31 01:21:54 2012	(r233727)
 +++ stable/9/sys/vm/vm_pageout.c	Sat Mar 31 06:44:48 2012	(r233728)
 @@ -445,7 +445,8 @@ more:
  	/*
  	 * we allow reads during pageouts...
  	 */
 -	return (vm_pageout_flush(&mc[page_base], pageout_count, 0, 0, NULL));
 +	return (vm_pageout_flush(&mc[page_base], pageout_count, 0, 0, NULL,
 +	    NULL));
  }
  
  /*
 @@ -459,9 +460,12 @@ more:
   *
   *	Returned runlen is the count of pages between mreq and first
   *	page after mreq with status VM_PAGER_AGAIN.
 + *	*eio is set to TRUE if pager returned VM_PAGER_ERROR or VM_PAGER_FAIL
 + *	for any page in runlen set.
   */
  int
 -vm_pageout_flush(vm_page_t *mc, int count, int flags, int mreq, int *prunlen)
 +vm_pageout_flush(vm_page_t *mc, int count, int flags, int mreq, int *prunlen,
 +    boolean_t *eio)
  {
  	vm_object_t object = mc[0]->object;
  	int pageout_status[count];
 @@ -493,6 +497,8 @@ vm_pageout_flush(vm_page_t *mc, int coun
  	vm_pager_put_pages(object, mc, count, flags, pageout_status);
  
  	runlen = count - mreq;
 +	if (eio != NULL)
 +		*eio = FALSE;
  	for (i = 0; i < count; i++) {
  		vm_page_t mt = mc[i];
  
 @@ -522,6 +528,8 @@ vm_pageout_flush(vm_page_t *mc, int coun
  			vm_page_lock(mt);
  			vm_page_activate(mt);
  			vm_page_unlock(mt);
 +			if (eio != NULL && i >= mreq && i - mreq < runlen)
 +				*eio = TRUE;
  			break;
  		case VM_PAGER_AGAIN:
  			if (i >= mreq && i - mreq < runlen)
 
 Modified: stable/9/sys/vm/vm_pageout.h
 ==============================================================================
 --- stable/9/sys/vm/vm_pageout.h	Sat Mar 31 01:21:54 2012	(r233727)
 +++ stable/9/sys/vm/vm_pageout.h	Sat Mar 31 06:44:48 2012	(r233728)
 @@ -102,7 +102,7 @@ extern void vm_waitpfault(void);
  
  #ifdef _KERNEL
  boolean_t vm_pageout_fallback_object_lock(vm_page_t, vm_page_t *);
 -int vm_pageout_flush(vm_page_t *, int, int, int, int *);
 +int vm_pageout_flush(vm_page_t *, int, int, int, int *, boolean_t *);
  void vm_pageout_oom(int shortage);
  boolean_t vm_pageout_page_lock(vm_page_t, vm_page_t *);
  void vm_contig_grow_cache(int, vm_paddr_t, vm_paddr_t);
 _______________________________________________
 svn-src-all@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/svn-src-all
 To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
 

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/165927: commit references a PR
Date: Sat, 31 Mar 2012 06:50:38 +0000 (UTC)

 Author: kib
 Date: Sat Mar 31 06:50:27 2012
 New Revision: 233730
 URL: http://svn.freebsd.org/changeset/base/233730
 
 Log:
   MFC r233101:
  Add a sysctl, vfs.nfs.nfs_keep_dirty_on_error, to switch the NFS client's
  behaviour on a write RPC error back to that of the old NFS client: when
  set to a nonzero value, pages for which the write failed are kept dirty.
   
   PR:	kern/165927
 
 Modified:
   stable/9/sys/fs/nfsclient/nfs_clbio.c
   stable/9/sys/fs/nfsclient/nfs_clvnops.c
 Directory Properties:
   stable/9/sys/   (props changed)
   stable/9/sys/fs/   (props changed)
 
 Modified: stable/9/sys/fs/nfsclient/nfs_clbio.c
 ==============================================================================
 --- stable/9/sys/fs/nfsclient/nfs_clbio.c	Sat Mar 31 06:48:41 2012	(r233729)
 +++ stable/9/sys/fs/nfsclient/nfs_clbio.c	Sat Mar 31 06:50:27 2012	(r233730)
 @@ -66,6 +66,7 @@ extern int ncl_numasync;
  extern enum nfsiod_state ncl_iodwant[NFS_MAXASYNCDAEMON];
  extern struct nfsmount *ncl_iodmount[NFS_MAXASYNCDAEMON];
  extern int newnfs_directio_enable;
 +extern int nfs_keep_dirty_on_error;
  
  int ncl_pbuf_freecnt = -1;	/* start out unlimited */
  
 @@ -348,9 +349,11 @@ ncl_putpages(struct vop_putpages_args *a
  	pmap_qremove(kva, npages);
  	relpbuf(bp, &ncl_pbuf_freecnt);
  
 -	vnode_pager_undirty_pages(pages, rtvals, count - uio.uio_resid);
 -	if (must_commit)
 -		ncl_clearcommit(vp->v_mount);
 +	if (error == 0 || !nfs_keep_dirty_on_error) {
 +		vnode_pager_undirty_pages(pages, rtvals, count - uio.uio_resid);
 +		if (must_commit)
 +			ncl_clearcommit(vp->v_mount);
 +	}
  	return rtvals[0];
  }
  
 
 Modified: stable/9/sys/fs/nfsclient/nfs_clvnops.c
 ==============================================================================
 --- stable/9/sys/fs/nfsclient/nfs_clvnops.c	Sat Mar 31 06:48:41 2012	(r233729)
 +++ stable/9/sys/fs/nfsclient/nfs_clvnops.c	Sat Mar 31 06:50:27 2012	(r233730)
 @@ -241,6 +241,10 @@ int newnfs_directio_enable = 0;
  SYSCTL_INT(_vfs_nfs, OID_AUTO, nfs_directio_enable, CTLFLAG_RW,
  	   &newnfs_directio_enable, 0, "Enable NFS directio");
  
 +int nfs_keep_dirty_on_error;
 +SYSCTL_INT(_vfs_nfs, OID_AUTO, nfs_keep_dirty_on_error, CTLFLAG_RW,
 +    &nfs_keep_dirty_on_error, 0, "Retry pageout if error returned");
 +
  /*
   * This sysctl allows other processes to mmap a file that has been opened
   * O_DIRECT by a process.  In general, having processes mmap the file while
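The new knob is an ordinary read-write sysctl, so on a patched system restoring the old client behaviour is a one-line change (shown here as a usage fragment, not output from an actual session):

```
# Keep pages dirty when the write RPC fails (old nfs client behaviour):
sysctl vfs.nfs.nfs_keep_dirty_on_error=1

# To persist across reboots, add to /etc/sysctl.conf:
vfs.nfs.nfs_keep_dirty_on_error=1
```

It defaults to 0, i.e. the new behaviour of undirtying the pages and letting the error propagate to msync(2).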
 

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/165927: commit references a PR
Date: Tue, 10 Apr 2012 10:44:52 +0000 (UTC)

 Author: kib
 Date: Tue Apr 10 10:44:41 2012
 New Revision: 234094
 URL: http://svn.freebsd.org/changeset/base/234094
 
 Log:
   MFC r233100:
   In vm_object_page_clean(), do not clean OBJ_MIGHTBEDIRTY object flag
   if the filesystem performed short write and we are skipping the page
   due to this.
   
  Propagate write error from the pager back to the callers of
   vm_pageout_flush().  Report the failure to write a page from the
   requested range as the FALSE return value from vm_object_page_clean(),
   and propagate it back to msync(2) to return EIO to usermode.
   
   While there, convert the clearobjflags variable in the
   vm_object_page_clean() and arguments of the helper functions to
   boolean.
   
   PR:	kern/165927
   Tested by:	David Wolfskill
 
 Modified:
   stable/8/sys/vm/vm_contig.c
   stable/8/sys/vm/vm_map.c
   stable/8/sys/vm/vm_mmap.c
   stable/8/sys/vm/vm_object.c
   stable/8/sys/vm/vm_object.h
   stable/8/sys/vm/vm_pageout.c
   stable/8/sys/vm/vm_pageout.h
 Directory Properties:
   stable/8/sys/   (props changed)
   stable/8/sys/amd64/include/xen/   (props changed)
   stable/8/sys/boot/   (props changed)
   stable/8/sys/cddl/contrib/opensolaris/   (props changed)
   stable/8/sys/contrib/dev/acpica/   (props changed)
   stable/8/sys/contrib/pf/   (props changed)
   stable/8/sys/dev/e1000/   (props changed)
   stable/8/sys/i386/conf/XENHVM   (props changed)
 
 Modified: stable/8/sys/vm/vm_contig.c
 ==============================================================================
 --- stable/8/sys/vm/vm_contig.c	Tue Apr 10 09:27:41 2012	(r234093)
 +++ stable/8/sys/vm/vm_contig.c	Tue Apr 10 10:44:41 2012	(r234094)
 @@ -135,7 +135,8 @@ vm_contig_launder_page(vm_page_t m, vm_p
  		} else if (object->type == OBJT_SWAP ||
  			   object->type == OBJT_DEFAULT) {
  			m_tmp = m;
 -			vm_pageout_flush(&m_tmp, 1, VM_PAGER_PUT_SYNC, 0, NULL);
 +			vm_pageout_flush(&m_tmp, 1, VM_PAGER_PUT_SYNC, 0,
 +			    NULL, NULL);
  			VM_OBJECT_UNLOCK(object);
  			return (0);
  		}
 
 Modified: stable/8/sys/vm/vm_map.c
 ==============================================================================
 --- stable/8/sys/vm/vm_map.c	Tue Apr 10 09:27:41 2012	(r234093)
 +++ stable/8/sys/vm/vm_map.c	Tue Apr 10 10:44:41 2012	(r234094)
 @@ -2573,6 +2573,7 @@ vm_map_sync(
  	vm_object_t object;
  	vm_ooffset_t offset;
  	unsigned int last_timestamp;
 +	boolean_t failed;
  
  	vm_map_lock_read(map);
  	VM_MAP_RANGE_CHECK(map, start, end);
 @@ -2602,6 +2603,7 @@ vm_map_sync(
  
  	if (invalidate)
  		pmap_remove(map->pmap, start, end);
 +	failed = FALSE;
  
  	/*
  	 * Make a second pass, cleaning/uncaching pages from the indicated
 @@ -2630,7 +2632,8 @@ vm_map_sync(
  		vm_object_reference(object);
  		last_timestamp = map->timestamp;
  		vm_map_unlock_read(map);
 -		vm_object_sync(object, offset, size, syncio, invalidate);
 +		if (!vm_object_sync(object, offset, size, syncio, invalidate))
 +			failed = TRUE;
  		start += size;
  		vm_object_deallocate(object);
  		vm_map_lock_read(map);
 @@ -2640,7 +2643,7 @@ vm_map_sync(
  	}
  
  	vm_map_unlock_read(map);
 -	return (KERN_SUCCESS);
 +	return (failed ? KERN_FAILURE : KERN_SUCCESS);
  }
  
  /*
 
 Modified: stable/8/sys/vm/vm_mmap.c
 ==============================================================================
 --- stable/8/sys/vm/vm_mmap.c	Tue Apr 10 09:27:41 2012	(r234093)
 +++ stable/8/sys/vm/vm_mmap.c	Tue Apr 10 10:44:41 2012	(r234094)
 @@ -487,6 +487,8 @@ msync(td, uap)
  		return (EINVAL);	/* Sun returns ENOMEM? */
  	case KERN_INVALID_ARGUMENT:
  		return (EBUSY);
 +	case KERN_FAILURE:
 +		return (EIO);
  	default:
  		return (EINVAL);
  	}
 
 Modified: stable/8/sys/vm/vm_object.c
 ==============================================================================
 --- stable/8/sys/vm/vm_object.c	Tue Apr 10 09:27:41 2012	(r234093)
 +++ stable/8/sys/vm/vm_object.c	Tue Apr 10 10:44:41 2012	(r234094)
 @@ -105,9 +105,10 @@ SYSCTL_INT(_vm, OID_AUTO, old_msync, CTL
      "Use old (insecure) msync behavior");
  
  static int	vm_object_page_collect_flush(vm_object_t object, vm_page_t p,
 -		    int pagerflags, int flags, int *clearobjflags);
 +		    int pagerflags, int flags, boolean_t *clearobjflags,
 +		    boolean_t *eio);
  static boolean_t vm_object_page_remove_write(vm_page_t p, int flags,
 -		    int *clearobjflags);
 +		    boolean_t *clearobjflags);
  static void	vm_object_qcollapse(vm_object_t object);
  static void	vm_object_vndeallocate(vm_object_t object);
  
 @@ -772,7 +773,7 @@ vm_object_terminate(vm_object_t object)
  }
  
  static boolean_t
 -vm_object_page_remove_write(vm_page_t p, int flags, int *clearobjflags)
 +vm_object_page_remove_write(vm_page_t p, int flags, boolean_t *clearobjflags)
  {
  
  	/*
 @@ -781,7 +782,7 @@ vm_object_page_remove_write(vm_page_t p,
  	 * cleared in this case so we do not have to set them.
  	 */
  	if ((flags & OBJPC_NOSYNC) != 0 && (p->oflags & VPO_NOSYNC) != 0) {
 -		*clearobjflags = 0;
 +		*clearobjflags = FALSE;
  		return (FALSE);
  	} else {
  		pmap_remove_write(p);
 @@ -803,20 +804,24 @@ vm_object_page_remove_write(vm_page_t p,
   *	Odd semantics: if start == end, we clean everything.
   *
   *	The object must be locked.
 + *
 + *	Returns FALSE if some page from the range was not written, as
 + *	reported by the pager, and TRUE otherwise.
   */
 -void
 +boolean_t
  vm_object_page_clean(vm_object_t object, vm_pindex_t start, vm_pindex_t end,
      int flags)
  {
  	vm_page_t np, p;
  	vm_pindex_t pi, tend;
 -	int clearobjflags, curgeneration, n, pagerflags;
 +	int curgeneration, n, pagerflags;
 +	boolean_t clearobjflags, eio, res;
  
  	VM_OBJECT_LOCK_ASSERT(object, MA_OWNED);
  	KASSERT(object->type == OBJT_VNODE, ("Not a vnode object"));
  	if ((object->flags & OBJ_MIGHTBEDIRTY) == 0 ||
  	    object->resident_page_count == 0)
 -		return;
 +		return (TRUE);
  
  	pagerflags = (flags & (OBJPC_SYNC | OBJPC_INVAL)) != 0 ?
  	    VM_PAGER_PUT_SYNC : VM_PAGER_CLUSTER_OK;
 @@ -835,7 +840,8 @@ vm_object_page_clean(vm_object_t object,
  	 * stay dirty so do not mess with the page and do not clear the
  	 * object flags.
  	 */
 -	clearobjflags = 1;
 +	clearobjflags = TRUE;
 +	res = TRUE;
  
  rescan:
  	curgeneration = object->generation;
 @@ -858,7 +864,11 @@ rescan:
  			continue;
  
  		n = vm_object_page_collect_flush(object, p, pagerflags,
 -		    flags, &clearobjflags);
 +		    flags, &clearobjflags, &eio);
 +		if (eio) {
 +			res = FALSE;
 +			clearobjflags = FALSE;
 +		}
  		if (object->generation != curgeneration)
  			goto rescan;
  
 @@ -874,8 +884,10 @@ rescan:
  		 * behind, but there is not much we can do there if
  		 * filesystem refuses to write it.
  		 */
 -		if (n == 0)
 +		if (n == 0) {
  			n = 1;
 +			clearobjflags = FALSE;
 +		}
  		np = vm_page_find_least(object, pi + n);
  	}
  	vm_page_unlock_queues();
 @@ -886,11 +898,12 @@ rescan:
  	vm_object_clear_flag(object, OBJ_CLEANING);
  	if (clearobjflags && start == 0 && tend == object->size)
  		vm_object_clear_flag(object, OBJ_MIGHTBEDIRTY);
 +	return (res);
  }
  
  static int
  vm_object_page_collect_flush(vm_object_t object, vm_page_t p, int pagerflags,
 -    int flags, int *clearobjflags)
 +    int flags, boolean_t *clearobjflags, boolean_t *eio)
  {
  	vm_page_t ma[vm_pageout_page_count], p_first, tp;
  	int count, i, mreq, runlen;
 @@ -921,7 +934,7 @@ vm_object_page_collect_flush(vm_object_t
  	for (tp = p_first, i = 0; i < count; tp = TAILQ_NEXT(tp, listq), i++)
  		ma[i] = tp;
  
 -	vm_pageout_flush(ma, count, pagerflags, mreq, &runlen);
 +	vm_pageout_flush(ma, count, pagerflags, mreq, &runlen, eio);
  	return (runlen);
  }
  
 @@ -935,17 +948,20 @@ vm_object_page_collect_flush(vm_object_t
   * Note: certain anonymous maps, such as MAP_NOSYNC maps,
   * may start out with a NULL object.
   */
 -void
 +boolean_t
  vm_object_sync(vm_object_t object, vm_ooffset_t offset, vm_size_t size,
      boolean_t syncio, boolean_t invalidate)
  {
  	vm_object_t backing_object;
  	struct vnode *vp;
  	struct mount *mp;
 -	int flags, fsync_after;
 +	int error, flags, fsync_after;
 +	boolean_t res;
  
  	if (object == NULL)
 -		return;
 +		return (TRUE);
 +	res = TRUE;
 +	error = 0;
  	VM_OBJECT_LOCK(object);
  	while ((backing_object = object->backing_object) != NULL) {
  		VM_OBJECT_LOCK(backing_object);
 @@ -991,16 +1007,18 @@ vm_object_sync(vm_object_t object, vm_oo
  			fsync_after = FALSE;
  		}
  		VM_OBJECT_LOCK(object);
 -		vm_object_page_clean(object,
 +		res = vm_object_page_clean(object,
  		    OFF_TO_IDX(offset),
  		    OFF_TO_IDX(offset + size + PAGE_MASK),
  		    flags);
  		VM_OBJECT_UNLOCK(object);
  		if (fsync_after)
 -			(void) VOP_FSYNC(vp, MNT_WAIT, curthread);
 +			error = VOP_FSYNC(vp, MNT_WAIT, curthread);
  		VOP_UNLOCK(vp, 0);
  		VFS_UNLOCK_GIANT(vfslocked);
  		vn_finished_write(mp);
 +		if (error != 0)
 +			res = FALSE;
  		VM_OBJECT_LOCK(object);
  	}
  	if ((object->type == OBJT_VNODE ||
 @@ -1013,6 +1031,7 @@ vm_object_sync(vm_object_t object, vm_oo
  		    purge ? FALSE : TRUE);
  	}
  	VM_OBJECT_UNLOCK(object);
 +	return (res);
  }
  
  /*
 
 Modified: stable/8/sys/vm/vm_object.h
 ==============================================================================
 --- stable/8/sys/vm/vm_object.h	Tue Apr 10 09:27:41 2012	(r234093)
 +++ stable/8/sys/vm/vm_object.h	Tue Apr 10 10:44:41 2012	(r234094)
 @@ -220,7 +220,7 @@ void vm_object_set_writeable_dirty (vm_o
  void vm_object_init (void);
  void vm_object_page_cache(vm_object_t object, vm_pindex_t start,
      vm_pindex_t end);
 -void vm_object_page_clean (vm_object_t, vm_pindex_t, vm_pindex_t, boolean_t);
 +boolean_t vm_object_page_clean(vm_object_t, vm_pindex_t, vm_pindex_t, boolean_t);
  void vm_object_page_remove (vm_object_t, vm_pindex_t, vm_pindex_t, boolean_t);
  boolean_t vm_object_populate(vm_object_t, vm_pindex_t, vm_pindex_t);
  void vm_object_reference (vm_object_t);
 @@ -228,7 +228,7 @@ void vm_object_reference_locked(vm_objec
  int  vm_object_set_memattr(vm_object_t object, vm_memattr_t memattr);
  void vm_object_shadow (vm_object_t *, vm_ooffset_t *, vm_size_t);
  void vm_object_split(vm_map_entry_t);
 -void vm_object_sync(vm_object_t, vm_ooffset_t, vm_size_t, boolean_t,
 +boolean_t vm_object_sync(vm_object_t, vm_ooffset_t, vm_size_t, boolean_t,
      boolean_t);
  void vm_object_madvise (vm_object_t, vm_pindex_t, int, int);
  #endif				/* _KERNEL */
 
 Modified: stable/8/sys/vm/vm_pageout.c
 ==============================================================================
 --- stable/8/sys/vm/vm_pageout.c	Tue Apr 10 09:27:41 2012	(r234093)
 +++ stable/8/sys/vm/vm_pageout.c	Tue Apr 10 10:44:41 2012	(r234094)
 @@ -391,7 +391,8 @@ more:
  	/*
  	 * we allow reads during pageouts...
  	 */
 -	return (vm_pageout_flush(&mc[page_base], pageout_count, 0, 0, NULL));
 +	return (vm_pageout_flush(&mc[page_base], pageout_count, 0, 0, NULL,
 +	    NULL));
  }
  
  /*
 @@ -405,9 +406,12 @@ more:
   *
   *	Returned runlen is the count of pages between mreq and first
   *	page after mreq with status VM_PAGER_AGAIN.
 + *	*eio is set to TRUE if pager returned VM_PAGER_ERROR or VM_PAGER_FAIL
 + *	for any page in runlen set.
   */
  int
 -vm_pageout_flush(vm_page_t *mc, int count, int flags, int mreq, int *prunlen)
 +vm_pageout_flush(vm_page_t *mc, int count, int flags, int mreq, int *prunlen,
 +    boolean_t *eio)
  {
  	vm_object_t object = mc[0]->object;
  	int pageout_status[count];
 @@ -439,6 +443,8 @@ vm_pageout_flush(vm_page_t *mc, int coun
  	vm_pager_put_pages(object, mc, count, flags, pageout_status);
  
  	runlen = count - mreq;
 +	if (eio != NULL)
 +		*eio = FALSE;
  	vm_page_lock_queues();
  	for (i = 0; i < count; i++) {
  		vm_page_t mt = mc[i];
 @@ -467,6 +473,8 @@ vm_pageout_flush(vm_page_t *mc, int coun
  			 * will try paging out it again later).
  			 */
  			vm_page_activate(mt);
 +			if (eio != NULL && i >= mreq && i - mreq < runlen)
 +				*eio = TRUE;
  			break;
  		case VM_PAGER_AGAIN:
  			if (i >= mreq && i - mreq < runlen)
 
 Modified: stable/8/sys/vm/vm_pageout.h
 ==============================================================================
 --- stable/8/sys/vm/vm_pageout.h	Tue Apr 10 09:27:41 2012	(r234093)
 +++ stable/8/sys/vm/vm_pageout.h	Tue Apr 10 10:44:41 2012	(r234094)
 @@ -102,7 +102,7 @@ extern void vm_waitpfault(void);
  
  #ifdef _KERNEL
  boolean_t vm_pageout_fallback_object_lock(vm_page_t, vm_page_t *);
 -int vm_pageout_flush(vm_page_t *, int, int, int, int *);
 +int vm_pageout_flush(vm_page_t *, int, int, int, int *, boolean_t *);
  void vm_pageout_oom(int shortage);
  void vm_contig_grow_cache(int, vm_paddr_t, vm_paddr_t);
  #endif
 
State-Changed-From-To: open->closed 
State-Changed-By: kib 
State-Changed-When: Tue Apr 10 10:52:25 UTC 2012 
State-Changed-Why:  
Committed to all supported branches. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=165927 
>Unformatted:
