From citrin@mx3.rambler.ru  Mon Sep  8 17:45:45 2008
Return-Path: <citrin@mx3.rambler.ru>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id A1FDD106564A
	for <FreeBSD-gnats-submit@freebsd.org>; Mon,  8 Sep 2008 17:45:45 +0000 (UTC)
	(envelope-from citrin@mx3.rambler.ru)
Received: from mx-carp-ext.rambler.ru (mx-carp-ext.rambler.ru [81.19.66.237])
	by mx1.freebsd.org (Postfix) with ESMTP id 6262E8FC1D
	for <FreeBSD-gnats-submit@freebsd.org>; Mon,  8 Sep 2008 17:45:45 +0000 (UTC)
	(envelope-from citrin@mx3.rambler.ru)
Received: by mx3.rambler.ru (Postfix, from userid 1072)
	id 6F01C5C3B; Mon,  8 Sep 2008 21:34:36 +0400 (MSD)
Message-Id: <20080908173436.6F01C5C3B@mx3.rambler.ru>
Date: Mon,  8 Sep 2008 21:34:36 +0400 (MSD)
From: Anton Yuzhaninov <citrin@citrin.ru>
To: FreeBSD-gnats-submit@freebsd.org
Subject: sendfile on tmpfs data corruption
X-Send-Pr-Version: 3.113
X-GNATS-Notify:

>Number:         127213
>Category:       kern
>Synopsis:       [tmpfs] [patch] sendfile on tmpfs data corruption
>Confidential:   no
>Severity:       critical
>Priority:       medium
>Responsible:    delphij
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Mon Sep 08 17:50:02 UTC 2008
>Closed-Date:    Sat Feb 19 13:40:44 UTC 2011
>Last-Modified:  Sat Feb 19 13:40:44 UTC 2011
>Originator:     Anton Yuzhaninov
>Release:        FreeBSD 7.1-PRERELEASE amd64
>Organization:
Rambler
>Environment:
System: FreeBSD mx3.rambler.ru 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #1: Fri Sep 5 21:15:43 MSD 2008 root@mx3.rambler.ru:/usr/obj/usr/src/sys/MAIL amd64

>Description:
sendfile(2) currupts data for files on tmpfs
>How-To-Repeat:
1. Create tmpfs and write test file

# mount -t tmpfs tmpfs /spool/tmpfs
# echo 'this is test file' > /spool/tmpfs/tt
# cat /spool/tmpfs/tt
this is test file

2. Transfer this file using sendfile. E. g. using web server from port
www/nginx

# cat nginx.conf
events {
    worker_connections  1024;
}

http {
    default_type  application/octet-stream;
    access_log off;
    sendfile   on;

    server {
        listen       8081;
        location / {
            root   /spool/tmpfs;
        }
    }
}

# nginx -c nginx.conf

Content of fetched file will be different garbage on each request:

# fetch -qo - http://127.0.0.1:8081/tt | hd
00000000  d4 98 7a 00 08 00 00 00  dc 98 7a 00 08 00 00 00  |Ԙz.....ܘz.....|
00000010  e4 98                                             ||
00000012

# fetch -qo - http://127.0.0.1:8081/tt | md5
ff035bff2dcf972ee7dfd023455997ef
# fetch -qo - http://127.0.0.1:8081/tt | md5
bb07cd1a0ede3c7af24e4a86a14b0493
# fetch -qo - http://127.0.0.1:8081/tt | md5
e84539257c2650da866a186435acc93d

With sendfile disbled in nginx.conf file is not corrupted, but we need
sendfile support for other application (where only sendfile supported).

>Fix:
Not known.
>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-bugs->freebsd-fs 
Responsible-Changed-By: linimon 
Responsible-Changed-When: Mon Sep 8 22:26:04 UTC 2008 
Responsible-Changed-Why:  
Over to maintainer(s). 

http://www.freebsd.org/cgi/query-pr.cgi?pr=127213 

From: Nate Eldredge <neldredge@math.ucsd.edu>
To: bug-followup@FreeBSD.org, citrin@citrin.ru
Cc:  
Subject: Re: kern/127213: [tmpfs] sendfile on tmpfs data corruption
Date: Sun, 5 Oct 2008 23:20:42 -0700 (PDT)

 Hi,
 
 I investigated this a bit.
 
 First, note that this bug has some security implications, because it 
 appears that the garbage written by sendfile is kernel memory contents, 
 which could contain something sensitive.  It is sufficient for an attacker 
 to have read access to a file on a mounted tmpfs.  So it should really get 
 fixed.
 
 I'm not terribly familiar with vfs or vm internals, but it appears that 
 sendfile causes VOP_READ to be called with the IO_VMIO flag and a dummy 
 uio.  tmpfs_read (in sys/fs/tmpfs/tmpfs_vnops.c) doesn't handle this 
 correctly; it always just copies the data to the supplied uio, which in 
 this case does nothing.  It looks like the data is supposed to make it 
 into vn->v_object, and tmpfs_read doesn't do that.  (If I understand it 
 correctly, on a normal filesystem this is taken care of by bread().)
 
 I am not sure what the correct semantics of IO_VMIO are supposed to be, so 
 I don't know what the correct fix would be.  However, a quick fix is to 
 not have a v_object at all; remove the call to vnode_create_vobject in 
 tmpfs_open.  This seems to be legal since procfs, etc, work that way.  It 
 does however mean that sendfile doesn't work at all.
 
 I am curious what was the point of having a v_object in the first place, 
 since the data is already in virtual memory.  Unless the goal was just to 
 make sendfile work, which evidently wasn't successful.
 
 Incidentally, to the initial reporter, what application do you have that 
 requires sendfile?  In my experience, most things will fall back to a 
 read/write loop if sendfile fails, since sendfile isn't available on all 
 systems or under all circumstances.  So if you apply the quick fix so that 
 sendfile always fails, it might at least get your application working 
 again.
 
 -- 
 
 Nate Eldredge
 neldredge@math.ucsd.edu

From: Maxim Konovalov <maxim@macomnet.ru>
To: Nate Eldredge <neldredge@math.ucsd.edu>
Cc: bug-followup@freebsd.org
Subject: Re: kern/127213: [tmpfs] sendfile on tmpfs data corruption
Date: Mon, 6 Oct 2008 11:36:38 +0400 (MSD)

 Hello,
 
 On Mon, 6 Oct 2008, 06:40-0000, Nate Eldredge wrote:
 
 [...]
 >  Incidentally, to the initial reporter, what application do you have
 >  that requires sendfile?  In my experience, most things will fall
 >  back to a read/write loop if sendfile fails, since sendfile isn't
 >  available on all systems or under all circumstances.  So if you
 >  apply the quick fix so that sendfile always fails, it might at
 >  least get your application working again.
 >
 As stated in the PR Andrey used nginx (ports/www/nginx).  But I could
 easily reproduce the bug with the stock ftpd(8) with the ftproot on
 tmpfs.
 
 -- 
 Maxim Konovalov

From: Anton Yuzhaninov <citrin@citrin.ru>
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/127213: [tmpfs] sendfile on tmpfs data corruption
Date: Tue, 07 Oct 2008 01:38:50 +0400

  > Incidentally, to the initial reporter, what application do you have that
  > requires sendfile?
 
 We want to use tmpfs with our homegrown application, which can work only using sendfile().
 Currently we use md+ufs, but with md data present in memory twice - in md and in VM cache.
 
 -- 
   Anton Yuzhaninov

From: Nate Eldredge <neldredge@math.ucsd.edu>
To: Maxim Konovalov <maxim@macomnet.ru>
Cc: bug-followup@freebsd.org, JH <jh275@s6.sector6.net>
Subject: Re: kern/127213: [tmpfs] sendfile on tmpfs data corruption
Date: Mon, 6 Oct 2008 15:22:57 -0700 (PDT)

 On Mon, 6 Oct 2008, Maxim Konovalov wrote:
 
 > Hello,
 >
 > On Mon, 6 Oct 2008, 06:40-0000, Nate Eldredge wrote:
 >
 > [...]
 >>  Incidentally, to the initial reporter, what application do you have
 >>  that requires sendfile?  In my experience, most things will fall
 >>  back to a read/write loop if sendfile fails, since sendfile isn't
 >>  available on all systems or under all circumstances.  So if you
 >>  apply the quick fix so that sendfile always fails, it might at
 >>  least get your application working again.
 >>
 > As stated in the PR Andrey used nginx (ports/www/nginx).  But I could
 > easily reproduce the bug with the stock ftpd(8) with the ftproot on
 > tmpfs.
 
 To simplify matters further, here is the testcase I used when testing 
 this, which uses sendfile to send some data over a unix domain socket. 
 Do:
 
 ./server /tmpfs/data mysocket &
 ./client mysocket >data.out
 cmp /tmpfs/data data.out
 
 If things work right, data and data.out should be identical.  But if data 
 is a file on a tmpfs, data.out contains apparently random kernel memory 
 contents.
 
 # This is a shell archive.  Save it in a file, remove anything before
 # this line, and then unpack it by entering "sh file".  Note, it may
 # create directories; files and directories will be owned by you and
 # have default permissions.
 #
 # This archive contains:
 #
 #	Makefile
 #	client.c
 #	server.c
 #	util.c
 #	util.h
 #
 echo x - Makefile
 sed 's/^X//' >Makefile << 'END-of-Makefile'
 XCC = gcc
 XCFLAGS = -Wall -W -g
 X
 Xall : server client
 X
 Xserver : server.o util.o
 X	$(CC) -o $@ $>
 X
 Xclient : client.o util.o
 X	$(CC) -o $@ $>
 X
 Xserver.o client.o util.o : util.h
 X
 Xclean :
 X	rm -f server client *.o
 END-of-Makefile
 echo x - client.c
 sed 's/^X//' >client.c << 'END-of-client.c'
 X#include <stdio.h>
 X#include <fcntl.h>
 X#include <unistd.h>
 X#include <stdlib.h>
 X#include "util.h"
 X
 Xint main(int argc, char *argv[]) {
 X	int s;
 X	if (argc < 2) {
 X		fprintf(stderr, "Usage: %s socketname\n", argv[0]);
 X		exit(1);
 X	}
 X	if ((s = connect_unix_socket(argv[1])) < 0) {
 X		exit(1);
 X	}
 X	fake_sendfile(s, 1);
 X	return 0;
 X}
 X
 X
 X 
 END-of-client.c
 echo x - server.c
 sed 's/^X//' >server.c << 'END-of-server.c'
 X#include <stdio.h>
 X#include <fcntl.h>
 X#include <unistd.h>
 X#include <stdlib.h>
 X#include "util.h"
 X
 Xint main(int argc, char *argv[]) {
 X	int f, listener, connection;
 X	if (argc < 3) {
 X		fprintf(stderr, "Usage: %s filename socketname\n", argv[0]);
 X		exit(1);
 X	}
 X	if ((f = open(argv[1], O_RDONLY)) < 0) {
 X		perror(argv[1]);
 X		exit(1);
 X	}
 X	if ((listener = listen_unix_socket(argv[2])) < 0) {
 X		exit(1);
 X	}
 X	if ((connection = accept_unix_socket(listener)) >= 0) {
 X		real_sendfile(f, connection);
 X	}
 X	return 0;
 X}
 X
 X
 X 
 END-of-server.c
 echo x - util.c
 sed 's/^X//' >util.c << 'END-of-util.c'
 X/* send data from file to unix domain socket */
 X
 X#include <stdio.h>
 X#include <time.h>
 X#include <signal.h>
 X#include <errno.h>
 X#include <sys/types.h>
 X#include <sys/socket.h>
 X#include <sys/un.h>
 X#include <string.h>
 X#include <stdlib.h>
 X#include <unistd.h>
 X
 Xint create_unix_socket(void) {
 X	int fd;
 X	if ((fd = socket(PF_LOCAL, SOCK_STREAM, 0)) < 0) {
 X		perror("socket");
 X		return -1;
 X	}
 X	return fd;
 X}
 X
 Xint make_unix_sockaddr(const char *pathname, struct sockaddr_un *sa) {
 X	memset(sa, 0, sizeof(*sa));
 X	sa->sun_family = PF_LOCAL;
 X	if (strlen(pathname) + 1 > sizeof(sa->sun_path)) {
 X		fprintf(stderr, "%s: pathname too long (max %lu)\n",
 X			pathname, sizeof(sa->sun_path));
 X		errno = ENAMETOOLONG;
 X		return -1;
 X	}
 X	strcpy(sa->sun_path, pathname);
 X	return 0;
 X}
 X
 Xstatic char *sockname;
 Xvoid delete_socket(void) {
 X	unlink(sockname);
 X}
 X
 Xint listen_unix_socket(const char *path) {
 X	int fd;
 X	struct sockaddr_un sa;
 X	if (make_unix_sockaddr(path, &sa) < 0)
 X		return -1;
 X	if ((fd = create_unix_socket()) < 0)
 X		return -1;
 X	if (bind(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
 X		perror("bind");
 X		close(fd);
 X		return -1;
 X	}
 X	sockname = strdup(path);
 X	atexit(delete_socket);
 X
 X	if (listen(fd, 5) < 0) {
 X		perror("listen");
 X		close(fd);
 X		return -1;
 X	}
 X	return fd;
 X}
 X
 Xint accept_unix_socket(int fd) {
 X	int s;
 X	if ((s = accept(fd, NULL, 0)) < 0) {
 X		perror("accept");
 X		return -1;
 X	}
 X	return s;
 X}
 X
 Xint connect_unix_socket(const char *path) {
 X	int fd;
 X	struct sockaddr_un sa;
 X	if (make_unix_sockaddr(path, &sa) < 0)
 X		return -1;
 X	if ((fd = create_unix_socket()) < 0)
 X		return -1;
 X	if (connect(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
 X		perror("connect");
 X		return -1;
 X	}
 X	return fd;
 X}
 X
 X#define BUFSIZE 65536
 X
 Xint fake_sendfile(int from, int to) {
 X	char buf[BUFSIZE];
 X	int v;
 X	int sent = 0;
 X	while ((v = read(from, buf, BUFSIZE)) > 0) {
 X		int d = 0;
 X		while (d < v) {
 X			int w = write(to, buf, v - d);
 X			if (w <= 0) {
 X				perror("write");
 X				return -1;
 X			}
 X			d += w;
 X			sent += w;
 X		}
 X	}
 X	if (v != 0) {
 X		perror("read");
 X		return -1;
 X	}
 X	return sent;
 X}
 X
 Xint real_sendfile(int from, int to) {
 X	int v;
 X	v = sendfile(from, to, 0, 0, NULL, NULL, 0);
 X	if (v < 0) {
 X		perror("sendfile");
 X	}
 X	return v;
 X}
 X 
 X 
 END-of-util.c
 echo x - util.h
 sed 's/^X//' >util.h << 'END-of-util.h'
 X/* send data from file to unix domain socket */
 X
 X#include <stdio.h>
 X#include <time.h>
 X#include <signal.h>
 X#include <errno.h>
 X#include <sys/types.h>
 X#include <sys/socket.h>
 X#include <sys/un.h>
 X
 Xint create_unix_socket(void);
 Xint make_unix_sockaddr(const char *pathname, struct sockaddr_un *sa);
 Xint listen_unix_socket(const char *path);
 Xint accept_unix_socket(int fd);
 Xint connect_unix_socket(const char *path);
 Xint fake_sendfile(int from, int to);
 Xint real_sendfile(int from, int to);
 X 
 X 
 END-of-util.h
 exit
 
 
 -- 
 
 Nate Eldredge
 neldredge@math.ucsd.edu

From: Maxim Konovalov <maxim@macomnet.ru>
To: Nate Eldredge <neldredge@math.ucsd.edu>
Cc: bug-followup@freebsd.org, JH <jh275@s6.sector6.net>
Subject: Re: kern/127213: [tmpfs] sendfile on tmpfs data corruption
Date: Tue, 7 Oct 2008 07:43:17 +0400 (MSD)

 On Mon, 6 Oct 2008, 15:22-0700, Nate Eldredge wrote:
 
 > On Mon, 6 Oct 2008, Maxim Konovalov wrote:
 >
 > > Hello,
 > >
 > > On Mon, 6 Oct 2008, 06:40-0000, Nate Eldredge wrote:
 > >
 > > [...]
 > > >  Incidentally, to the initial reporter, what application do you have
 > > >  that requires sendfile?  In my experience, most things will fall
 > > >  back to a read/write loop if sendfile fails, since sendfile isn't
 > > >  available on all systems or under all circumstances.  So if you
 > > >  apply the quick fix so that sendfile always fails, it might at
 > > >  least get your application working again.
 > > >
 > > As stated in the PR Andrey used nginx (ports/www/nginx).  But I could
 > > easily reproduce the bug with the stock ftpd(8) with the ftproot on
 > > tmpfs.
 >
 > To simplify matters further, here is the testcase I used when
 > testing this, which uses sendfile to send some data over a unix
 > domain socket. Do:
 >
 > ./server /tmpfs/data mysocket &
 > ./client mysocket >data.out
 > cmp /tmpfs/data data.out
 >
 > If things work right, data and data.out should be identical.  But if
 > data is a file on a tmpfs, data.out contains apparently random
 > kernel memory contents.
 >
 Hi Nate,
 
 It'd be really nice if you extend
 src/tools/regression/sockets/sendfile regression test for this bug.
 Now it doesn't detect this case.
 
 -- 
 Maxim Konovalov

From: Marc Olzheim <zlo@zlo.nu>
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/127213: [tmpfs] sendfile on tmpfs data corruption
Date: Wed, 29 Jul 2009 10:51:40 +0200

 This is not just a bug, but a security hole. What you are seeing is the
 content of other VM pages. This has exactly the same symptoms as
 http://security.freebsd.org/advisories/FreeBSD-SA-05:02.sendfile.asc
 
 In short: do not use tmpfs unless you trust all users on the system with
 root access.

From: Bruce Cran <bruce@cran.org.uk>
To: bug-followup@FreeBSD.org, citrin@citrin.ru
Cc:  
Subject: Re: kern/127213: [tmpfs] sendfile on tmpfs data corruption
Date: Mon, 7 Sep 2009 03:41:11 +0100

 I've submitted the PR kern/138465 which contains a patch to
 tools/regression/sockets/sendfile which adds an MD5 field to the header
 structure; when all the data has been received the MD5 is recalculated
 and the test fails if they differ.
 
 As expected it fails when /tmp is a tmpfs filesystem.
 
 -- 
 Bruce

From: Gleb Kurtsou <gleb.kurtsou@gmail.com>
To: bug-followup@FreeBSD.org, citrin@citrin.ru
Cc:  
Subject: Re: kern/127213: [tmpfs] sendfile on tmpfs data corruption
Date: Fri, 2 Oct 2009 23:48:29 +0300

 --GvXjxJ+pjyke8COw
 Content-Type: text/plain; charset=utf-8
 Content-Disposition: inline
 
 [ resending it with changed subject.
 looks like it was ignored by bug-followup@ ]
 
 Try the patch attached, it fixes the bug for me.
 The same workaround is used by zfs.
 
 
 --GvXjxJ+pjyke8COw
 Content-Type: text/plain; charset=utf-8
 Content-Disposition: attachment; filename="tmpfs-sendfile.patch.txt"
 
 diff --git a/sys/fs/tmpfs/tmpfs_vnops.c b/sys/fs/tmpfs/tmpfs_vnops.c
 index db8ceea..a6e4510 100644
 --- a/sys/fs/tmpfs/tmpfs_vnops.c
 +++ b/sys/fs/tmpfs/tmpfs_vnops.c
 @@ -43,6 +43,8 @@ __FBSDID("$FreeBSD$");
  #include <sys/priv.h>
  #include <sys/proc.h>
  #include <sys/resourcevar.h>
 +#include <sys/sched.h>
 +#include <sys/sf_buf.h>
  #include <sys/stat.h>
  #include <sys/systm.h>
  #include <sys/unistd.h>
 @@ -428,15 +430,72 @@ tmpfs_setattr(struct vop_setattr_args *v)
  }
  
  /* --------------------------------------------------------------------- */
 +static int
 +tmpfs_nocacheread(vm_object_t tobj, vm_pindex_t idx,
 +    vm_offset_t offset, size_t tlen, struct uio *uio)
 +{
 +	vm_page_t	m;
 +	int		error;
 +
 +	VM_OBJECT_LOCK(tobj);
 +	vm_object_pip_add(tobj, 1);
 +	m = vm_page_grab(tobj, idx, VM_ALLOC_WIRED |
 +	    VM_ALLOC_ZERO | VM_ALLOC_NORMAL | VM_ALLOC_RETRY);
 +	if (m->valid != VM_PAGE_BITS_ALL) {
 +		if (vm_pager_has_page(tobj, idx, NULL, NULL)) {
 +			error = vm_pager_get_pages(tobj, &m, 1, 0);
 +			if (error != 0) {
 +				printf("tmpfs get pages from pager error [read]\n");
 +				goto out;
 +			}
 +		} else
 +			vm_page_zero_invalid(m, TRUE);
 +	}
 +	VM_OBJECT_UNLOCK(tobj);
 +	error = uiomove_fromphys(&m, offset, tlen, uio);
 +	VM_OBJECT_LOCK(tobj);
 +out:
 +	vm_page_lock_queues();
 +	vm_page_unwire(m, TRUE);
 +	vm_page_unlock_queues();
 +	vm_page_wakeup(m);
 +	vm_object_pip_subtract(tobj, 1);
 +	VM_OBJECT_UNLOCK(tobj);
 +
 +	return (error);
 +}
 +
 +static __inline int
 +tmpfs_nocacheread_buf(vm_object_t tobj, vm_pindex_t idx,
 +    vm_offset_t offset, size_t tlen, void *buf)
 +{
 +	struct uio uio;
 +	struct iovec iov;
 +
 +	uio.uio_iovcnt = 1;
 +	uio.uio_iov = &iov;
 +	iov.iov_base = buf;
 +	iov.iov_len = tlen;
 +
 +	uio.uio_offset = 0;
 +	uio.uio_resid = tlen;
 +	uio.uio_rw = UIO_READ;
 +	uio.uio_segflg = UIO_SYSSPACE;
 +	uio.uio_td = curthread;
 +
 +	return (tmpfs_nocacheread(tobj, idx, offset, tlen, &uio));
 +}
  
  static int
  tmpfs_mappedread(vm_object_t vobj, vm_object_t tobj, size_t len, struct uio *uio)
  {
 +	struct sf_buf	*sf;
  	vm_pindex_t	idx;
  	vm_page_t	m;
  	vm_offset_t	offset;
  	off_t		addr;
  	size_t		tlen;
 +	char		*ma;
  	int		error;
  
  	addr = uio->uio_offset;
 @@ -460,33 +519,30 @@ lookupvpg:
  		vm_page_wakeup(m);
  		VM_OBJECT_UNLOCK(vobj);
  		return	(error);
 +	} else if (m != NULL && uio->uio_segflg == UIO_NOCOPY) {
 +		if (vm_page_sleep_if_busy(m, FALSE, "tmfsmr"))
 +			goto lookupvpg;
 +		vm_page_busy(m);
 +		VM_OBJECT_UNLOCK(vobj);
 +		sched_pin();
 +		sf = sf_buf_alloc(m, SFB_CPUPRIVATE);
 +		ma = (char *)sf_buf_kva(sf);
 +		error = tmpfs_nocacheread_buf(tobj, idx, offset, tlen,
 +		    ma + offset);
 +		if (error == 0) {
 +			uio->uio_offset += tlen;
 +			uio->uio_resid -= tlen;
 +		}
 +		sf_buf_free(sf);
 +		sched_unpin();
 +		VM_OBJECT_LOCK(vobj);
 +		vm_page_wakeup(m);
 +		VM_OBJECT_UNLOCK(vobj);
 +		return	(error);
  	}
  	VM_OBJECT_UNLOCK(vobj);
  nocache:
 -	VM_OBJECT_LOCK(tobj);
 -	vm_object_pip_add(tobj, 1);
 -	m = vm_page_grab(tobj, idx, VM_ALLOC_WIRED |
 -	    VM_ALLOC_ZERO | VM_ALLOC_NORMAL | VM_ALLOC_RETRY);
 -	if (m->valid != VM_PAGE_BITS_ALL) {
 -		if (vm_pager_has_page(tobj, idx, NULL, NULL)) {
 -			error = vm_pager_get_pages(tobj, &m, 1, 0);
 -			if (error != 0) {
 -				printf("tmpfs get pages from pager error [read]\n");
 -				goto out;
 -			}
 -		} else
 -			vm_page_zero_invalid(m, TRUE);
 -	}
 -	VM_OBJECT_UNLOCK(tobj);
 -	error = uiomove_fromphys(&m, offset, tlen, uio);
 -	VM_OBJECT_LOCK(tobj);
 -out:
 -	vm_page_lock_queues();
 -	vm_page_unwire(m, TRUE);
 -	vm_page_unlock_queues();
 -	vm_page_wakeup(m);
 -	vm_object_pip_subtract(tobj, 1);
 -	VM_OBJECT_UNLOCK(tobj);
 +	error = tmpfs_nocacheread(tobj, idx, offset, tlen, uio);
  
  	return	(error);
  }
 
 --GvXjxJ+pjyke8COw--
State-Changed-From-To: open->patched 
State-Changed-By: delphij 
State-Changed-When: Wed Oct 7 23:17:33 UTC 2009 
State-Changed-Why:  
Patch from gk@ has been applied against -HEAD, MFC reminder. 


Responsible-Changed-From-To: freebsd-fs->delphij 
Responsible-Changed-By: delphij 
Responsible-Changed-When: Wed Oct 7 23:17:33 UTC 2009 
Responsible-Changed-Why:  
Take. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=127213 

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/127213: commit references a PR
Date: Wed,  7 Oct 2009 23:17:24 +0000 (UTC)

 Author: delphij
 Date: Wed Oct  7 23:17:15 2009
 New Revision: 197850
 URL: http://svn.freebsd.org/changeset/base/197850
 
 Log:
   Add a special workaround to handle UIO_NOCOPY case.  This fixes data
   corruption observed when sendfile() is being used.
   
   PR:		kern/127213
   Submitted by:	gk
   MFC after:	2 weeks
 
 Modified:
   head/sys/fs/tmpfs/tmpfs_vnops.c
 
 Modified: head/sys/fs/tmpfs/tmpfs_vnops.c
 ==============================================================================
 --- head/sys/fs/tmpfs/tmpfs_vnops.c	Wed Oct  7 23:01:31 2009	(r197849)
 +++ head/sys/fs/tmpfs/tmpfs_vnops.c	Wed Oct  7 23:17:15 2009	(r197850)
 @@ -43,6 +43,8 @@ __FBSDID("$FreeBSD$");
  #include <sys/priv.h>
  #include <sys/proc.h>
  #include <sys/resourcevar.h>
 +#include <sys/sched.h>
 +#include <sys/sf_buf.h>
  #include <sys/stat.h>
  #include <sys/systm.h>
  #include <sys/unistd.h>
 @@ -428,15 +430,72 @@ tmpfs_setattr(struct vop_setattr_args *v
  }
  
  /* --------------------------------------------------------------------- */
 +static int
 +tmpfs_nocacheread(vm_object_t tobj, vm_pindex_t idx,
 +    vm_offset_t offset, size_t tlen, struct uio *uio)
 +{
 +	vm_page_t	m;
 +	int		error;
 +
 +	VM_OBJECT_LOCK(tobj);
 +	vm_object_pip_add(tobj, 1);
 +	m = vm_page_grab(tobj, idx, VM_ALLOC_WIRED |
 +	    VM_ALLOC_ZERO | VM_ALLOC_NORMAL | VM_ALLOC_RETRY);
 +	if (m->valid != VM_PAGE_BITS_ALL) {
 +		if (vm_pager_has_page(tobj, idx, NULL, NULL)) {
 +			error = vm_pager_get_pages(tobj, &m, 1, 0);
 +			if (error != 0) {
 +				printf("tmpfs get pages from pager error [read]\n");
 +				goto out;
 +			}
 +		} else
 +			vm_page_zero_invalid(m, TRUE);
 +	}
 +	VM_OBJECT_UNLOCK(tobj);
 +	error = uiomove_fromphys(&m, offset, tlen, uio);
 +	VM_OBJECT_LOCK(tobj);
 +out:
 +	vm_page_lock_queues();
 +	vm_page_unwire(m, TRUE);
 +	vm_page_unlock_queues();
 +	vm_page_wakeup(m);
 +	vm_object_pip_subtract(tobj, 1);
 +	VM_OBJECT_UNLOCK(tobj);
 +
 +	return (error);
 +}
 +
 +static __inline int
 +tmpfs_nocacheread_buf(vm_object_t tobj, vm_pindex_t idx,
 +    vm_offset_t offset, size_t tlen, void *buf)
 +{
 +	struct uio uio;
 +	struct iovec iov;
 +
 +	uio.uio_iovcnt = 1;
 +	uio.uio_iov = &iov;
 +	iov.iov_base = buf;
 +	iov.iov_len = tlen;
 +
 +	uio.uio_offset = 0;
 +	uio.uio_resid = tlen;
 +	uio.uio_rw = UIO_READ;
 +	uio.uio_segflg = UIO_SYSSPACE;
 +	uio.uio_td = curthread;
 +
 +	return (tmpfs_nocacheread(tobj, idx, offset, tlen, &uio));
 +}
  
  static int
  tmpfs_mappedread(vm_object_t vobj, vm_object_t tobj, size_t len, struct uio *uio)
  {
 +	struct sf_buf	*sf;
  	vm_pindex_t	idx;
  	vm_page_t	m;
  	vm_offset_t	offset;
  	off_t		addr;
  	size_t		tlen;
 +	char		*ma;
  	int		error;
  
  	addr = uio->uio_offset;
 @@ -461,33 +520,30 @@ lookupvpg:
  		vm_page_wakeup(m);
  		VM_OBJECT_UNLOCK(vobj);
  		return	(error);
 +	} else if (m != NULL && uio->uio_segflg == UIO_NOCOPY) {
 +		if (vm_page_sleep_if_busy(m, FALSE, "tmfsmr"))
 +			goto lookupvpg;
 +		vm_page_busy(m);
 +		VM_OBJECT_UNLOCK(vobj);
 +		sched_pin();
 +		sf = sf_buf_alloc(m, SFB_CPUPRIVATE);
 +		ma = (char *)sf_buf_kva(sf);
 +		error = tmpfs_nocacheread_buf(tobj, idx, offset, tlen,
 +		    ma + offset);
 +		if (error == 0) {
 +			uio->uio_offset += tlen;
 +			uio->uio_resid -= tlen;
 +		}
 +		sf_buf_free(sf);
 +		sched_unpin();
 +		VM_OBJECT_LOCK(vobj);
 +		vm_page_wakeup(m);
 +		VM_OBJECT_UNLOCK(vobj);
 +		return	(error);
  	}
  	VM_OBJECT_UNLOCK(vobj);
  nocache:
 -	VM_OBJECT_LOCK(tobj);
 -	vm_object_pip_add(tobj, 1);
 -	m = vm_page_grab(tobj, idx, VM_ALLOC_WIRED |
 -	    VM_ALLOC_ZERO | VM_ALLOC_NORMAL | VM_ALLOC_RETRY);
 -	if (m->valid != VM_PAGE_BITS_ALL) {
 -		if (vm_pager_has_page(tobj, idx, NULL, NULL)) {
 -			error = vm_pager_get_pages(tobj, &m, 1, 0);
 -			if (error != 0) {
 -				printf("tmpfs get pages from pager error [read]\n");
 -				goto out;
 -			}
 -		} else
 -			vm_page_zero_invalid(m, TRUE);
 -	}
 -	VM_OBJECT_UNLOCK(tobj);
 -	error = uiomove_fromphys(&m, offset, tlen, uio);
 -	VM_OBJECT_LOCK(tobj);
 -out:
 -	vm_page_lock_queues();
 -	vm_page_unwire(m, TRUE);
 -	vm_page_unlock_queues();
 -	vm_page_wakeup(m);
 -	vm_object_pip_subtract(tobj, 1);
 -	VM_OBJECT_UNLOCK(tobj);
 +	error = tmpfs_nocacheread(tobj, idx, offset, tlen, uio);
  
  	return	(error);
  }
 _______________________________________________
 svn-src-all@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/svn-src-all
 To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
 

From: Andriy Gapon <avg@freebsd.org>
To: bug-followup@freebsd.org, Xin LI <delphij@freebsd.org>
Cc:  
Subject: Re: kern/127213: [tmpfs] [patch] sendfile on tmpfs data corruption
Date: Wed, 15 Sep 2010 13:26:05 +0300

 r197850 has never been MFC-ed?!
 
 -- 
 Andriy Gapon

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/127213: commit references a PR
Date: Fri, 24 Sep 2010 17:27:04 +0000 (UTC)

 Author: delphij
 Date: Fri Sep 24 17:26:57 2010
 New Revision: 213106
 URL: http://svn.freebsd.org/changeset/base/213106
 
 Log:
   MFC r197850:
   
   Add a special workaround to handle UIO_NOCOPY case.  This fixes data
   corruption observed when sendfile() is being used.
   
   Requested by:	avg
   PR:		kern/127213
   Submitted by:	gk
 
 Modified:
   stable/8/sys/fs/tmpfs/tmpfs_vnops.c
 Directory Properties:
   stable/8/sys/   (props changed)
   stable/8/sys/amd64/include/xen/   (props changed)
   stable/8/sys/cddl/contrib/opensolaris/   (props changed)
   stable/8/sys/contrib/dev/acpica/   (props changed)
   stable/8/sys/contrib/pf/   (props changed)
   stable/8/sys/dev/xen/xenpci/   (props changed)
 
 Modified: stable/8/sys/fs/tmpfs/tmpfs_vnops.c
 ==============================================================================
 --- stable/8/sys/fs/tmpfs/tmpfs_vnops.c	Fri Sep 24 16:40:46 2010	(r213105)
 +++ stable/8/sys/fs/tmpfs/tmpfs_vnops.c	Fri Sep 24 17:26:57 2010	(r213106)
 @@ -43,6 +43,8 @@ __FBSDID("$FreeBSD$");
  #include <sys/priv.h>
  #include <sys/proc.h>
  #include <sys/resourcevar.h>
 +#include <sys/sched.h>
 +#include <sys/sf_buf.h>
  #include <sys/stat.h>
  #include <sys/systm.h>
  #include <sys/unistd.h>
 @@ -433,15 +435,72 @@ tmpfs_setattr(struct vop_setattr_args *v
  }
  
  /* --------------------------------------------------------------------- */
 +static int
 +tmpfs_nocacheread(vm_object_t tobj, vm_pindex_t idx,
 +    vm_offset_t offset, size_t tlen, struct uio *uio)
 +{
 +	vm_page_t	m;
 +	int		error;
 +
 +	VM_OBJECT_LOCK(tobj);
 +	vm_object_pip_add(tobj, 1);
 +	m = vm_page_grab(tobj, idx, VM_ALLOC_WIRED |
 +	    VM_ALLOC_ZERO | VM_ALLOC_NORMAL | VM_ALLOC_RETRY);
 +	if (m->valid != VM_PAGE_BITS_ALL) {
 +		if (vm_pager_has_page(tobj, idx, NULL, NULL)) {
 +			error = vm_pager_get_pages(tobj, &m, 1, 0);
 +			if (error != 0) {
 +				printf("tmpfs get pages from pager error [read]\n");
 +				goto out;
 +			}
 +		} else
 +			vm_page_zero_invalid(m, TRUE);
 +	}
 +	VM_OBJECT_UNLOCK(tobj);
 +	error = uiomove_fromphys(&m, offset, tlen, uio);
 +	VM_OBJECT_LOCK(tobj);
 +out:
 +	vm_page_lock_queues();
 +	vm_page_unwire(m, TRUE);
 +	vm_page_unlock_queues();
 +	vm_page_wakeup(m);
 +	vm_object_pip_subtract(tobj, 1);
 +	VM_OBJECT_UNLOCK(tobj);
 +
 +	return (error);
 +}
 +
 +static __inline int
 +tmpfs_nocacheread_buf(vm_object_t tobj, vm_pindex_t idx,
 +    vm_offset_t offset, size_t tlen, void *buf)
 +{
 +	struct uio uio;
 +	struct iovec iov;
 +
 +	uio.uio_iovcnt = 1;
 +	uio.uio_iov = &iov;
 +	iov.iov_base = buf;
 +	iov.iov_len = tlen;
 +
 +	uio.uio_offset = 0;
 +	uio.uio_resid = tlen;
 +	uio.uio_rw = UIO_READ;
 +	uio.uio_segflg = UIO_SYSSPACE;
 +	uio.uio_td = curthread;
 +
 +	return (tmpfs_nocacheread(tobj, idx, offset, tlen, &uio));
 +}
  
  static int
  tmpfs_mappedread(vm_object_t vobj, vm_object_t tobj, size_t len, struct uio *uio)
  {
 +	struct sf_buf	*sf;
  	vm_pindex_t	idx;
  	vm_page_t	m;
  	vm_offset_t	offset;
  	off_t		addr;
  	size_t		tlen;
 +	char		*ma;
  	int		error;
  
  	addr = uio->uio_offset;
 @@ -465,33 +524,30 @@ lookupvpg:
  		vm_page_wakeup(m);
  		VM_OBJECT_UNLOCK(vobj);
  		return	(error);
 +	} else if (m != NULL && uio->uio_segflg == UIO_NOCOPY) {
 +		if (vm_page_sleep_if_busy(m, FALSE, "tmfsmr"))
 +			goto lookupvpg;
 +		vm_page_busy(m);
 +		VM_OBJECT_UNLOCK(vobj);
 +		sched_pin();
 +		sf = sf_buf_alloc(m, SFB_CPUPRIVATE);
 +		ma = (char *)sf_buf_kva(sf);
 +		error = tmpfs_nocacheread_buf(tobj, idx, offset, tlen,
 +		    ma + offset);
 +		if (error == 0) {
 +			uio->uio_offset += tlen;
 +			uio->uio_resid -= tlen;
 +		}
 +		sf_buf_free(sf);
 +		sched_unpin();
 +		VM_OBJECT_LOCK(vobj);
 +		vm_page_wakeup(m);
 +		VM_OBJECT_UNLOCK(vobj);
 +		return	(error);
  	}
  	VM_OBJECT_UNLOCK(vobj);
  nocache:
 -	VM_OBJECT_LOCK(tobj);
 -	vm_object_pip_add(tobj, 1);
 -	m = vm_page_grab(tobj, idx, VM_ALLOC_WIRED |
 -	    VM_ALLOC_ZERO | VM_ALLOC_NORMAL | VM_ALLOC_RETRY);
 -	if (m->valid != VM_PAGE_BITS_ALL) {
 -		if (vm_pager_has_page(tobj, idx, NULL, NULL)) {
 -			error = vm_pager_get_pages(tobj, &m, 1, 0);
 -			if (error != 0) {
 -				printf("tmpfs get pages from pager error [read]\n");
 -				goto out;
 -			}
 -		} else
 -			vm_page_zero_invalid(m, TRUE);
 -	}
 -	VM_OBJECT_UNLOCK(tobj);
 -	error = uiomove_fromphys(&m, offset, tlen, uio);
 -	VM_OBJECT_LOCK(tobj);
 -out:
 -	vm_page_lock_queues();
 -	vm_page_unwire(m, TRUE);
 -	vm_page_unlock_queues();
 -	vm_page_wakeup(m);
 -	vm_object_pip_subtract(tobj, 1);
 -	VM_OBJECT_UNLOCK(tobj);
 +	error = tmpfs_nocacheread(tobj, idx, offset, tlen, uio);
  
  	return	(error);
  }
 _______________________________________________
 svn-src-all@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/svn-src-all
 To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
 
State-Changed-From-To: patched->closed 
State-Changed-By: delphij 
State-Changed-When: Sat Feb 19 13:39:53 UTC 2011 
State-Changed-Why:  
Patch MFC'ed as 213106. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=127213 
>Unformatted:
