From nobody@FreeBSD.org  Sun Oct  5 08:24:09 2008
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 02CDE106569D
	for <freebsd-gnats-submit@FreeBSD.org>; Sun,  5 Oct 2008 08:24:09 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (www.freebsd.org [IPv6:2001:4f8:fff6::21])
	by mx1.freebsd.org (Postfix) with ESMTP id E3F6E8FC16
	for <freebsd-gnats-submit@FreeBSD.org>; Sun,  5 Oct 2008 08:24:08 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.14.3/8.14.3) with ESMTP id m958O8xb055625
	for <freebsd-gnats-submit@FreeBSD.org>; Sun, 5 Oct 2008 08:24:08 GMT
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.14.3/8.14.3/Submit) id m958O8Lu055624;
	Sun, 5 Oct 2008 08:24:08 GMT
	(envelope-from nobody)
Message-Id: <200810050824.m958O8Lu055624@www.freebsd.org>
Date: Sun, 5 Oct 2008 08:24:08 GMT
From: Nejc koberne <nejc@skoberne.net>
To: freebsd-gnats-submit@FreeBSD.org
Subject: Rewinding on unionfs and Subversion
X-Send-Pr-Version: www-3.1
X-GNATS-Notify:

>Number:         127872
>Category:       bin
>Synopsis:       [libc] [patch] Rewinding on unionfs and Subversion
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sun Oct 05 08:30:04 UTC 2008
>Closed-Date:    
>Last-Modified:  Wed Oct 08 20:49:20 UTC 2008
>Originator:     Nejc koberne
>Release:        7.0-STABLE
>Organization:
Tnode.com
>Environment:
FreeBSD server.domain.com 7.0-STABLE FreeBSD 7.0-STABLE #5: Sun Aug 10 09:54:42 CEST 2008     root@server.domain.com:/usr/src/sys/amd64/compile/SERVER  amd64
>Description:
0. Introduction
---------------

We tried to install Subversion in a jail on a unionfs filesystem.
Unfortunately, the FreeBSD subversion port doesn't run "make check" before
installing, so we spent quite some time debugging before finding out that
it was actually a filesystem bug.

1. The bug in the real world.
-----------------------------

After installing Subversion and getting it running with mod_dav_svn,
any attempt to run "svn commit" or "svn import" produces errors similar
to these in the Apache error log:

[Wed Oct 01 21:04:35 2008] [error] [client 10.1.1.11] Could not MERGE resource "/svn/test2/!svn/act/e28480c9-eb8f-dd11-808c-0018fe7759ca" into "/svn/test2".  [409, #0]
[Wed Oct 01 21:04:35 2008] [error] [client 10.1.1.11] An error occurred while committing the transaction.  [409, #2]
[Wed Oct 01 21:04:35 2008] [error] [client 10.1.1.11] Can't remove '/usr/local/svn/test2/db/transactions/2-2.txn/node.0.0'  [409, #2]
[Wed Oct 01 21:04:35 2008] [error] [client 10.1.1.11] Can't remove file '/usr/local/svn/test2/db/transactions/2-2.txn/node.0.0': No such file or directory  [409, #2]

The last error is also shown on the client side (svn binary). However,
all actions complete successfully, so the error should not appear at
all. Running "make check" after building Subversion from source fails
with identical errors.

2. Bug location
---------------

After some ktracing and following the function calls back to their
origin, we concluded that the bug is probably in the Standard C library
(libc).

>How-To-Repeat:
3. Proof of concept code
------------------------

The following code works fine on UFS, but fails on unionfs. It was
taken from the Subversion codebase and rewritten to call libc functions
directly instead of going through the Apache APR library. It simply
shows that UFS and unionfs behave differently.

-----------------------------------------------------------------------
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>   /* mkdir() */
#include <dirent.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>     /* unlink(), rmdir(), close() */

typedef struct dirent direntry;

int remove_dir(char *path) {
        int need_rewind;

        if (path[0] == '\0') path = ".";

        DIR *dir;
        if ((dir = opendir(path)) == NULL) {
                perror("opendir");
                return 1;
        }

        do {
                need_rewind = 0;

                int ret;
                long position;
                direntry entry;
                direntry *result;

                while (1) {
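                        if ((ret = readdir_r(dir, &entry, &result)) != 0) {
                                /* readdir_r returns the error number; it does not set errno */
                                fprintf(stderr, "readdir_r: %s\n", strerror(ret));
                                return 1;
                        }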

                        if (result == NULL) {
                                break;
                        }

                        printf("Working on '%s'\n", entry.d_name);
                        printf("  entry.d_fileno is %lu\n", (unsigned long)entry.d_fileno);

                        if ((entry.d_type == DT_DIR) &&
                           ((strcmp(entry.d_name, ".") == 0) ||
                           (strcmp(entry.d_name, "..") == 0))) {
                                continue;
                        }
                        else {
                                char fullpath[1000];

                                need_rewind = 1;

                                /* bounded copy: avoids overflowing fullpath on long paths */
                                snprintf(fullpath, sizeof(fullpath), "%s/%s", path, entry.d_name);

                                printf("  fullpath is '%s'\n", fullpath);

                                if (entry.d_type == DT_DIR) {
                                        if (remove_dir(fullpath)) {
                                                return 1;
                                        }
                                }
                                else {
                                        if (unlink(fullpath) == -1) {
                                                perror("unlink");
                                                return 1;
                                        }
                                }
                        }
                }

                if (need_rewind) {
                        printf("Rewinding\n");
                        rewinddir(dir);
                }
        } while (need_rewind);

        if (closedir(dir) == -1) {
                perror("closedir");
                return 1;
        }

        if (rmdir(path) == -1) {
                perror("rmdir");
                return 1;
        }

        return 0;
}

int main(int argc, const char *const argv[]) {
        if (mkdir("test", 0755) == -1) {
                perror("mkdir");
                return 1;
        }

        int file = -1;
        if ((file = open("test/file",
             O_WRONLY | O_CREAT | O_TRUNC, 0644)) == -1) {
                perror("open");
                return 1;
        }

        if (close(file) == -1) {
                perror("close");
                return 1;
        }

        if (remove_dir("test")) {
                printf("Test failed\n");
                return 1;
        }
        else {
                printf("It works as it should\n");
                return 0;
        }
}
----------------------------------------------------------------------
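
The property the program above depends on can also be checked in
isolation. The following is only a sketch, not part of the original test
case: stale_after_rewind() and the file name "probe" are hypothetical
names, and readdir() is used instead of readdir_r() for brevity. It
reads a directory once, unlinks a file while the stream is still open,
rewinds, and reports whether the removed name is still returned.

```c
#include <dirent.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Count entries called `name` from the current position to the end. */
static int count_name(DIR *dir, const char *name)
{
        struct dirent *de;
        int hits = 0;

        while ((de = readdir(dir)) != NULL) {
                if (strcmp(de->d_name, name) == 0)
                        hits++;
        }
        return hits;
}

/*
 * Hypothetical helper. `dirpath` must name an existing, writable directory.
 * Returns 0 if a file unlinked while the stream is open is gone after
 * rewinddir(), 1 if a stale entry is still returned, -1 on a setup error.
 */
int stale_after_rewind(const char *dirpath)
{
        char filepath[1024];
        DIR *dir;
        int fd, n, before, after;

        n = snprintf(filepath, sizeof(filepath), "%s/probe", dirpath);
        if (n < 0 || (size_t)n >= sizeof(filepath))
                return -1;

        fd = open(filepath, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd == -1)
                return -1;
        close(fd);

        dir = opendir(dirpath);
        if (dir == NULL) {
                unlink(filepath);
                return -1;
        }

        before = count_name(dir, "probe");      /* first pass sees the file */
        if (unlink(filepath) == -1) {
                closedir(dir);
                return -1;
        }
        rewinddir(dir);
        after = count_name(dir, "probe");       /* second pass should not */
        closedir(dir);

        if (before != 1)
                return -1;
        return after != 0;
}
```

A return value of 1 on a unionfs mount, against 0 on UFS, would match
the discrepancy described in this report.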

>Fix:
See the attached patch for /usr/src/lib/libc/gen/rewinddir.c.
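
For union stacks the patch re-reads every directory entry, sorts
pointers to the entries by name with a stable sort, and hides duplicates
by setting their inode number to zero. The sketch below illustrates only
that de-duplication step. It is not the patch itself: struct entry is a
hypothetical stand-in for struct dirent, and a small insertion sort
replaces the mergesort()/alphasort pair so the example builds outside
FreeBSD.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct dirent: only the fields used here. */
struct entry {
        unsigned long fileno;   /* 0 marks the entry as hidden */
        char name[256];
};

/* Stable insertion sort of entry pointers by name. */
static void sort_by_name(struct entry **v, size_t n)
{
        size_t i, j;

        for (i = 1; i < n; i++) {
                struct entry *key = v[i];

                /* strict '>' keeps equal names in their original order */
                for (j = i; j > 0 && strcmp(v[j - 1]->name, key->name) > 0; j--)
                        v[j] = v[j - 1];
                v[j] = key;
        }
}

/*
 * Sort the n pointers in `v` by name and zero the fileno of every entry
 * whose name repeats an earlier one. Because the sort is stable, the
 * entry that was read first is the one that stays visible.
 * Returns the number of entries hidden.
 */
size_t hide_duplicates(struct entry **v, size_t n)
{
        struct entry *kept = NULL;
        size_t i, hidden = 0;

        sort_by_name(v, n);
        for (i = 0; i < n; i++) {
                if (kept == NULL || strcmp(v[i]->name, kept->name) != 0) {
                        kept = v[i];
                } else {
                        v[i]->fileno = 0;
                        hidden++;
                }
        }
        return hidden;
}
```

The stability requirement matters because entries from the upper layer
are read first; an unstable sort could leave the lower layer's copy of a
name visible instead.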

Patch attached with submission follows:

--- /usr/src/lib/libc/gen/rewinddir.c   2007-01-09 01:27:55.000000000 +0100
+++ rewinddir.c 2008-10-05 02:24:15.000000000 +0200
@@ -33,8 +33,17 @@
 #include <sys/cdefs.h>
 __FBSDID("$FreeBSD: src/lib/libc/gen/rewinddir.c,v 1.6 2007/01/09 00:27:55 imp Exp $");

+#include "namespace.h"
 #include <sys/types.h>
+#include <sys/param.h>
+#include <sys/mount.h>
+#include <errno.h>
+#include <fcntl.h>
+#include <unistd.h>
 #include <dirent.h>
+#include <stdlib.h>
+#include <string.h>
+#include "un-namespace.h"

 #include "telldir.h"

@@ -42,7 +51,158 @@
 rewinddir(dirp)
        DIR *dirp;
 {
-
        _seekdir(dirp, dirp->dd_rewind);
+
+       int incr;
+       int unionstack;
+
+    /*
+     * Use the system page size if that is a multiple of DIRBLKSIZ.
+     * Hopefully this can be a big win someday by allowing page
+     * trades to user space to be done by _getdirentries().
+     */
+    incr = getpagesize();
+    if ((incr % DIRBLKSIZ) != 0)
+        incr = DIRBLKSIZ;
+
+    /*
+     * Determine whether this directory is the top of a union stack.
+     */
+    if (dirp->dd_flags & DTF_NODUP) {
+        struct statfs sfb;
+
+        if (_fstatfs(dirp->dd_fd, &sfb) < 0)
+            goto fail;
+        unionstack = !strcmp(sfb.f_fstypename, "unionfs")
+            || (sfb.f_flags & MNT_UNION);
+    } else {
+        unionstack = 0;
+    }
+
+    if (unionstack) {
+        int len = dirp->dd_len;
+        int space = dirp->dd_len;
+        char *buf = dirp->dd_buf;
+        char *ddptr = dirp->dd_buf;
+        char *ddeptr;
+        int n;
+        struct dirent **dpv;
+
+               if (lseek(dirp->dd_fd, 0, SEEK_SET) == -1)
+                       goto fail;
+               dirp->dd_seek = 0;
+
+        /*
+         * The strategy here is to read all the directory
+         * entries into a buffer, sort the buffer, and
+         * remove duplicate entries by setting the inode
+         * number to zero.
+         */
+
+        do {
+            /*
+             * Always make at least DIRBLKSIZ bytes
+             * available to _getdirentries
+             */
+            if (space < DIRBLKSIZ) {
+                space += incr;
+                len += incr;
+                buf = reallocf(buf, len);
+                if (buf == NULL)
+                    goto fail;
+                ddptr = buf + (len - space);
+            }
+
+            n = _getdirentries(dirp->dd_fd, ddptr, space, &dirp->dd_seek);
+            if (n > 0) {
+                ddptr += n;
+                space -= n;
+            }
+        } while (n > 0);
+
+        ddeptr = ddptr;
+        dirp->dd_flags |= __DTF_READALL;
+
+        /*
+         * There is now a buffer full of (possibly) duplicate
+         * names.
+         */
+        dirp->dd_buf = buf;
+
+        /*
+         * Go round this loop twice...
+         *
+         * Scan through the buffer, counting entries.
+         * On the second pass, save pointers to each one.
+         * Then sort the pointers and remove duplicate names.
+         */
+        for (dpv = 0;;) {
+            n = 0;
+            ddptr = buf;
+            while (ddptr < ddeptr) {
+                struct dirent *dp;
+
+                dp = (struct dirent *) ddptr;
+                if ((long)dp & 03L)
+                    break;
+                if ((dp->d_reclen <= 0) ||
+                    (dp->d_reclen > (ddeptr + 1 - ddptr)))
+                    break;
+                ddptr += dp->d_reclen;
+                if (dp->d_fileno) {
+                    if (dpv)
+                        dpv[n] = dp;
+                    n++;
+                }
+            }
+
+            if (dpv) {
+                struct dirent *xp;
+
+                /*
+                 * This sort must be stable.
+                 */
+                mergesort(dpv, n, sizeof(*dpv), alphasort);
+
+                dpv[n] = NULL;
+                xp = NULL;
+
+                /*
+                 * Scan through the buffer in sort order,
+                 * zapping the inode number of any
+                 * duplicate names.
+                 */
+                for (n = 0; dpv[n]; n++) {
+                    struct dirent *dp = dpv[n];
+
+                    if ((xp == NULL) ||
+                        strcmp(dp->d_name, xp->d_name)) {
+                        xp = dp;
+                    } else {
+                        dp->d_fileno = 0;
+                    }
+                    if (dp->d_type == DT_WHT &&
+                        (dirp->dd_flags & DTF_HIDEW))
+                        dp->d_fileno = 0;
+                }
+
+                free(dpv);
+                break;
+            } else {
+                dpv = malloc((n+1) * sizeof(struct dirent *));
+                if (dpv == NULL)
+                    break;
+            }
+        }
+
+        dirp->dd_len = len;
+               dirp->dd_loc = 0;
+        dirp->dd_size = ddptr - dirp->dd_buf;
+    }
+
+/*
+ * Silently ignore any errors
+ */
+fail:
        dirp->dd_rewind = telldir(dirp);
 }


>Release-Note:
>Audit-Trail:
>Unformatted:
