From nobody@FreeBSD.org  Mon Jun 11 10:18:28 2012
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 4F9CA106566B
	for <freebsd-gnats-submit@FreeBSD.org>; Mon, 11 Jun 2012 10:18:28 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from red.freebsd.org (red.freebsd.org [IPv6:2001:4f8:fff6::22])
	by mx1.freebsd.org (Postfix) with ESMTP id 398CB8FC0C
	for <freebsd-gnats-submit@FreeBSD.org>; Mon, 11 Jun 2012 10:18:28 +0000 (UTC)
Received: from red.freebsd.org (localhost [127.0.0.1])
	by red.freebsd.org (8.14.4/8.14.4) with ESMTP id q5BAISY4072550
	for <freebsd-gnats-submit@FreeBSD.org>; Mon, 11 Jun 2012 10:18:28 GMT
	(envelope-from nobody@red.freebsd.org)
Received: (from nobody@localhost)
	by red.freebsd.org (8.14.4/8.14.4/Submit) id q5BAISlM072549;
	Mon, 11 Jun 2012 10:18:28 GMT
	(envelope-from nobody)
Message-Id: <201206111018.q5BAISlM072549@red.freebsd.org>
Date: Mon, 11 Jun 2012 10:18:28 GMT
From: Peter Maloney <peter.maloney@brockmann-consult.de>
To: freebsd-gnats-submit@FreeBSD.org
Subject: [nfs] [zfs] .zfs/snapshot directory is messed up when viewed by a Linux client, and ls -l  can hang it
X-Send-Pr-Version: www-3.1
X-GNATS-Notify:

>Number:         168947
>Category:       kern
>Synopsis:       [nfs] [zfs] .zfs/snapshot directory is messed up when viewed by a Linux client, and ls -l  can hang it
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    freebsd-fs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Mon Jun 11 10:20:11 UTC 2012
>Closed-Date:    
>Last-Modified:  Sun Jun 17 22:37:40 UTC 2012
>Originator:     Peter Maloney
>Release:        8.3-STABLE csup *default date=2012.06.01.00.00.00
>Organization:
Brockmann Consult
>Environment:
FreeBSD bcnas1bak.bc.local 8.3-STABLE FreeBSD 8.3-STABLE #2: Thu Jun  7 12:15:33 CEST 2012     root@bcnas1bak.bc.local:/usr/obj/usr/src/sys/GENERIC  amd64

>Description:
I set this to serious/medium because it can cause hangs. I did not mark it critical because only one server is affected, I can work around it, and I am not sure whether it still hangs with the latest FreeBSD I am running.

-------------------------------------
The messed up .zfs snapshot directory 
(non-critical, but likely related): 
-------------------------------------

FreeBSD local (FreeBSD nfs client looks the same):
root@bcnas1bak:/tank/.zfs/snapshot# ls -ld daily*
drwxr-xr-x  29 root  wheel  29 Apr 11 13:47 daily-2012-04-13T00:00:00
drwxr-xr-x  29 root  wheel  29 Apr 11 13:47 daily-2012-04-14T00:00:01
drwxr-xr-x  29 root  wheel  29 Apr 11 13:47 daily-2012-04-15T00:00:00
drwxr-xr-x  29 root  wheel  29 Apr 11 13:47 daily-2012-04-16T00:00:00
drwxr-xr-x  29 root  wheel  29 Apr 11 13:47 daily-2012-04-17T00:00:00
drwxr-xr-x  29 root  wheel  29 Apr 11 13:47 daily-2012-04-18T00:00:00
..

Linux nfs client:
# mount -t nfs -o nolock bcnas1bak:/tank/bcnasvm1 tank
# cd tank/.zfs/snapshot
# ls -ld daily*

(this line looks fine)
drwxr-xr-x 29 root  root         29 Apr 11 13:47 daily-2012-04-13T00:00:00

(this line is a file, not a directory, and has wrong uid and gid)
-r-xr-xr-x  1 peter vboxusers 50384 Jan 18  2010 daily-2012-06-10T00:18:00

(this line is a link, not a directory)
lrwxrwxrwx  1 root  root         21 May 17  2011 daily-2012-05-20T00:18:00 -> package-x-generic.png

etc.
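
The mismatch above can be checked mechanically on the Linux client. The following is a minimal sketch, assuming a POSIX shell; the function name list_non_dirs is hypothetical and not part of any existing tool. It prints every entry of a directory that is a symlink or otherwise not a directory, so a healthy .zfs/snapshot listing should produce no output.

```shell
#!/bin/sh
# list_non_dirs DIR: print entries of DIR that are not plain directories.
# Hypothetical helper; every entry under .zfs/snapshot is expected to be a
# directory, so any output suggests the client sees bogus attributes.
list_non_dirs() {
    dir=$1
    for entry in "$dir"/*; do
        # Skip the unexpanded pattern when the directory is empty.
        [ -e "$entry" ] || [ -L "$entry" ] || continue
        if [ -L "$entry" ]; then
            # Test for symlinks first: -d would follow a link to a directory.
            printf 'symlink: %s\n' "${entry##*/}"
        elif [ ! -d "$entry" ]; then
            printf 'not-a-directory: %s\n' "${entry##*/}"
        fi
    done
}
```

With the mount from the example above, this could be run as: list_non_dirs tank/.zfs/snapshot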


-------------------------------------
The hang (last seen on 8.2-STABLE)
(critical?)
-------------------------------------

Unfortunately, every attempt I have made to set up a second server that hangs has failed. Only one production server hangs this way. The replicated backup server with all the same snapshots (shown above) does not hang.

The hang is probably not caused by using up all memory while mounting the snapshots: I can hang it just as easily by mounting only a small number of them. After running "ls -l" on the Linux nfs client (following a fresh reboot due to a previous PR I submitted), the ARC is using only 667.29 MiB, and the system has 44 GB free.

When it hangs, only the one dataset is affected. All nfs services are frozen. "ls" in the dataset directories hangs. If I remember correctly, "zpool" commands work, and "zfs list" will hang, but "zfs list tank/someunrelateddataset" will not.

Note also that I have not yet hung my production server this way with 8.3. The last hang was with 8.2-STABLE from September 2011.

-------------------------------------
more info...
-------------------------------------

Sometimes I would see messages like this:
Oct 31 14:55:39 bcnas1 mountd[47733]: can't delete exports for /tank/.zfs/snapshot/daily-2011-10-06T09:27:52: Invalid argument

But lately I don't, possibly because I hid the .zfs directory with "zfs set snapdir=hidden tank". Or possibly because of a newer FreeBSD.
>How-To-Repeat:
To reproduce the messed-up snapshot directory, take a snapshot every day (or perhaps create many at once), then possibly restart nfs or reboot, then connect with a Linux client and run ls -l in the .zfs/snapshot directory.
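
The "make lots all at once" variant could be scripted roughly as follows. This is only a sketch under assumptions: the function name print_snapshot_cmds and the "repro-" snapshot prefix are hypothetical, and I have not confirmed that bulk-created snapshots trigger the problem. The function only prints "zfs snapshot" commands, so the output can be reviewed before it is run on the server.

```shell
#!/bin/sh
# print_snapshot_cmds DATASET COUNT: print COUNT "zfs snapshot" commands.
# Hypothetical helper; it only prints, so nothing is created until the
# output is piped to sh. The "repro-" prefix is an assumption.
print_snapshot_cmds() {
    dataset=$1
    count=$2
    i=1
    while [ "$i" -le "$count" ]; do
        # Zero-pad the counter so the snapshots sort in creation order.
        printf 'zfs snapshot %s@repro-%03d\n' "$dataset" "$i"
        i=$((i + 1))
    done
}
```

For example, on a test server: print_snapshot_cmds tank/bcnasvm1 100 | sh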
>Fix:


>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-bugs->freebsd-fs 
Responsible-Changed-By: linimon 
Responsible-Changed-When: Sun Jun 17 22:37:26 UTC 2012 
Responsible-Changed-Why:  
Over to maintainer(s). 

http://www.freebsd.org/cgi/query-pr.cgi?pr=168947 
>Unformatted:
