From nobody@FreeBSD.ORG Mon Jul 12 10:11:32 1999
Return-Path: <nobody@FreeBSD.ORG>
Received: by hub.freebsd.org (Postfix, from userid 32767)
	id BDE74150A7; Mon, 12 Jul 1999 10:11:32 -0700 (PDT)
Message-Id: <19990712171132.BDE74150A7@hub.freebsd.org>
Date: Mon, 12 Jul 1999 10:11:32 -0700 (PDT)
From: conrad@th.physik.uni-bonn.de
Sender: nobody@FreeBSD.ORG
To: freebsd-gnats-submit@freebsd.org
Subject: At boottime NFS mounts on a 3.2 client from a 2.2.7 server fails, when in turn the server has NFS mounted filesystems on the client
X-Send-Pr-Version: www-1.0

>Number:         12609
>Category:       kern
>Synopsis:       At boottime NFS mounts on a 3.2 client from a 2.2.7 server fails, when in turn the server has NFS mounted filesystems on the client
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-bugs
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Mon Jul 12 10:20:00 PDT 1999
>Closed-Date:    Tue Jun 26 03:31:00 PDT 2001
>Last-Modified:  Tue Jun 26 03:32:19 PDT 2001
>Originator:     Jan Conrad
>Release:        3.2-STABLE, 2.2.7-RELEASE, also occurred with MATT's NFS patches installed
>Organization:
Univ. of Bonn
>Environment:
client: FreeBSD merlin.th.physik.uni-bonn.de 3.2-STABLE FreeBSD 3.2-STABLE #4: Mon Jul 12 11:46:44 CEST 1999     conrad@merlin.th.physik.uni-bonn.de:/nilles/freebsd/src/sys/compile/MERLIN  i386
server: FreeBSD dirac.th.physik.uni-bonn.de 2.2.7-RELEASE FreeBSD 2.2.7-RELEASE #0: Sun Oct 25 16:58:29 CET 1998     conrad@merlin.physik.uni-bonn.de:/nilles/freebsd/src/sys/compile/MACH  i386
>Description:
Consider the following setup:

merlin's fstab:
  pauli:/nilles/backup    /nilles/backup          nfs rw,bg,soft,nosuid,nodev 0 0
  dirac:/nilles/share     /nilles/share           nfs rw,bg,soft,nodev        0 0
  dirac:/nilles/home      /nilles/home            nfs rw,bg,soft,nosuid,nodev 0 0
  dirac:/usr/www/httpd/root       /nilles/www     nfs rw,bg,soft,nosuid,nodev 0 0
dirac's fstab:
  /nilles/freebsd@merlin  /nilles/freebsd         nfs rw,bg,soft,nodev    0 0

(pauli runs 2.2.7 as well....)
When dirac does NOT mount /nilles/freebsd@merlin, merlin reboots
gracefully. But when /nilles/freebsd@merlin is mounted on dirac
(even without being accessed), the mounts of /nilles/share etc. on
merlin are backgrounded at boot time.

The following message appears (for the first two mount points ..@dirac):
	nfs: bad MNT RPC: RPC: timed out
(the message is generated by mount_nfs!)
The mount of /nilles/backup@pauli, however, succeeds before the mounts
..@dirac are attempted.

Output from 'tcpdump -s 200 -vv host merlin':
(the first packet stems from (presumably dirac's) xntpd)
<BOOT, /nilles/freebsd@merlin mounted on dirac>
17:05:06.194400 dirac.th.physik.uni-bonn.de.ntp > merlin.th.physik.uni-bonn.de.ntp: v3 sym_act strat 2 poll 6 prec -16 dist 0.016845 disp 0.003265 ref sunmanager.lrz-muenchen.de@3140780589.210189819 orig 3140780614.147031784 rec +0.001723289 xmt +92.047328948 (ttl 64, id 53479)
17:05:35.047701 arp who-has merlin.th.physik.uni-bonn.de tell merlin.th.physik.uni-bonn.de
17:05:35.331584 arp who-has pauli.th.physik.uni-bonn.de tell merlin.th.physik.uni-bonn.de
17:05:35.336906 arp who-has dirac.th.physik.uni-bonn.de tell merlin.th.physik.uni-bonn.de
17:05:35.336921 arp reply dirac.th.physik.uni-bonn.de is-at 0:90:27:1c:b1:c3
17:05:35.337018 merlin.th.physik.uni-bonn.de.1017 > dirac.th.physik.uni-bonn.de.sunrpc: udp 56 (ttl 64, id 8)
17:05:35.337119 dirac.th.physik.uni-bonn.de.sunrpc > merlin.th.physik.uni-bonn.de.1017: udp 28 (ttl 64, id 54383)
17:05:35.337336 merlin.th.physik.uni-bonn.de.1016 > dirac.th.physik.uni-bonn.de.sunrpc: udp 56 (ttl 64, id 9)
17:05:35.337397 dirac.th.physik.uni-bonn.de.sunrpc > merlin.th.physik.uni-bonn.de.1016: udp 28 (ttl 64, id 54384)
17:05:35.337637 merlin.th.physik.uni-bonn.de.408c9418 > dirac.th.physik.uni-bonn.de.nfs: 40 null (ttl 64, id 10)
17:05:35.337692 dirac.th.physik.uni-bonn.de.nfs > merlin.th.physik.uni-bonn.de.408c9418: reply ok 24 null (ttl 64, id 54385)
17:05:35.337894 merlin.th.physik.uni-bonn.de.1014 > dirac.th.physik.uni-bonn.de.sunrpc: udp 56 (ttl 64, id 11)
17:05:35.337959 dirac.th.physik.uni-bonn.de.sunrpc > merlin.th.physik.uni-bonn.de.1014: udp 28 (ttl 64, id 54386)
17:05:35.338219 merlin.th.physik.uni-bonn.de.1013 > dirac.th.physik.uni-bonn.de.983: udp 112 (ttl 64, id 12)
17:05:35.338527 dirac.th.physik.uni-bonn.de.4cac80a7 > merlin.th.physik.uni-bonn.de.nfs: 92 getattr fh 0,131079/2 (ttl 64, id 54387)
17:05:35.338664 merlin.th.physik.uni-bonn.de > dirac.th.physik.uni-bonn.de: icmp: merlin.th.physik.uni-bonn.de udp port nfsd unreachable (ttl 255, id 13)
17:05:36.724607 dirac.th.physik.uni-bonn.de.4cac80a7 > merlin.th.physik.uni-bonn.de.nfs: 92 getattr fh 0,131079/2 (ttl 64, id 54399)
17:05:36.724715 merlin.th.physik.uni-bonn.de > dirac.th.physik.uni-bonn.de: icmp: merlin.th.physik.uni-bonn.de udp port nfsd unreachable (ttl 255, id 14)
17:05:39.494636 dirac.th.physik.uni-bonn.de.4cac80a7 > merlin.th.physik.uni-bonn.de.nfs: 92 getattr fh 0,131079/2 (ttl 64, id 54424)
17:05:39.494743 merlin.th.physik.uni-bonn.de > dirac.th.physik.uni-bonn.de: icmp: merlin.th.physik.uni-bonn.de udp port nfsd unreachable (ttl 255, id 15)
17:05:45.024673 dirac.th.physik.uni-bonn.de.4cac80a7 > merlin.th.physik.uni-bonn.de.nfs: 92 getattr fh 0,131079/2 (ttl 64, id 54473)
17:05:45.024782 merlin.th.physik.uni-bonn.de > dirac.th.physik.uni-bonn.de: icmp: merlin.th.physik.uni-bonn.de udp port nfsd unreachable (ttl 255, id 16)
<AND ONCE AGAIN>
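
In the failing trace above, dirac keeps sending the same getattr request to merlin while merlin's nfsd port is still unreachable. The short Python sketch below only does arithmetic on the timestamps copied from that trace (the variable names are made up for the example). Each gap comes out at roughly twice the previous one, which looks like an exponential retransmit backoff on dirac's side.

```python
# Timestamps (seconds past 17:05) of dirac's repeated getattr requests,
# copied from the failing trace above.
getattr_sent = [35.338527, 36.724607, 39.494636, 45.024673]

# Gap between each request and the next retry.
gaps = [later - earlier for earlier, later in zip(getattr_sent, getattr_sent[1:])]

# Ratio of each gap to the previous one; values near 2 suggest doubling.
ratios = [b / a for a, b in zip(gaps, gaps[1:])]

for gap in gaps:
    print(f"retry after {gap:.3f} s")
print("growth factors:", [round(r, 2) for r in ratios])
```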

<BOOT, /nilles/freebsd@merlin NOT mounted on dirac>
17:08:18.196237 dirac.th.physik.uni-bonn.de.ntp > merlin.th.physik.uni-bonn.de.ntp: v3 sym_act strat 2 poll 6 prec -16 dist 0.015213 disp 0.002761 ref unios.rz.Uni-Osnabrueck.DE@3140780890.215399742 orig 3140780800.776019096 rec +0.328375816 xmt +97.420174598 (ttl 64, id 58653)
17:08:39.028644 arp who-has merlin.th.physik.uni-bonn.de tell merlin.th.physik.uni-bonn.de
17:08:39.312554 arp who-has pauli.th.physik.uni-bonn.de tell merlin.th.physik.uni-bonn.de
17:08:39.317972 arp who-has dirac.th.physik.uni-bonn.de tell merlin.th.physik.uni-bonn.de
17:08:39.317988 arp reply dirac.th.physik.uni-bonn.de is-at 0:90:27:1c:b1:c3
17:08:39.318086 merlin.th.physik.uni-bonn.de.1017 > dirac.th.physik.uni-bonn.de.sunrpc: udp 56 (ttl 64, id 8)
17:08:39.318175 dirac.th.physik.uni-bonn.de.sunrpc > merlin.th.physik.uni-bonn.de.1017: udp 28 (ttl 64, id 59013)
17:08:39.318391 merlin.th.physik.uni-bonn.de.1016 > dirac.th.physik.uni-bonn.de.sunrpc: udp 56 (ttl 64, id 9)
17:08:39.318451 dirac.th.physik.uni-bonn.de.sunrpc > merlin.th.physik.uni-bonn.de.1016: udp 28 (ttl 64, id 59014)
17:08:39.318690 merlin.th.physik.uni-bonn.de.408c0aaf > dirac.th.physik.uni-bonn.de.nfs: 40 null (ttl 64, id 10)
17:08:39.318744 dirac.th.physik.uni-bonn.de.nfs > merlin.th.physik.uni-bonn.de.408c0aaf: reply ok 24 null (ttl 64, id 59015)
17:08:39.318946 merlin.th.physik.uni-bonn.de.1014 > dirac.th.physik.uni-bonn.de.sunrpc: udp 56 (ttl 64, id 11)
17:08:39.319010 dirac.th.physik.uni-bonn.de.sunrpc > merlin.th.physik.uni-bonn.de.1014: udp 28 (ttl 64, id 59016)
17:08:39.319272 merlin.th.physik.uni-bonn.de.1013 > dirac.th.physik.uni-bonn.de.983: udp 112 (ttl 64, id 12)
17:08:39.319747 dirac.th.physik.uni-bonn.de.983 > merlin.th.physik.uni-bonn.de.1013: udp 68 (ttl 64, id 59017)
17:08:39.320193 merlin.th.physik.uni-bonn.de.126e77e4 > dirac.th.physik.uni-bonn.de.nfs: 92 getattr fh 0,263172/2 (ttl 64, id 13)
17:08:39.320276 dirac.th.physik.uni-bonn.de.nfs > merlin.th.physik.uni-bonn.de.126e77e4: reply ok 112 getattr DIR 755 ids 151/43 sz 512  (ttl 64, id 59020)
17:08:39.320665 merlin.th.physik.uni-bonn.de.126e77e5 > dirac.th.physik.uni-bonn.de.nfs: 92 fsinfo [|nfs] (ttl 64, id 14)
17:08:39.320731 dirac.th.physik.uni-bonn.de.nfs > merlin.th.physik.uni-bonn.de.126e77e5: reply ok 164 fsinfo POST: DIR 755 ids 151/43 sz 512 nlink 11 rdev 3/232 fsid 3000000e8 nodeid e800000000 a/m/ctime 931737610.000000 913809812.000000 913809812.000000  [|nfs] (ttl 64, id 59021)
17:08:39.320934 merlin.th.physik.uni-bonn.de.126e77e6 > dirac.th.physik.uni-bonn.de.nfs: 92 fsstat fh 0,263172/2 (ttl 64, id 15)
17:08:39.321004 dirac.th.physik.uni-bonn.de.nfs > merlin.th.physik.uni-bonn.de.126e77e6: reply ok 168 fsstat POST: DIR 755 ids 151/43 sz 512 nlink 11 rdev 3/232 fsid 3000000e8 nodeid e800000000 a/m/ctime 931737610.000000 913809812.000000 913809812.000000  [|nfs] (ttl 64, id 59022)
<AND SO ON>

When dirac was a DEC UNIX box over a year ago (the clients were all
FreeBSD), the problem never occurred. However, it did occur once when
mounting a filesystem from a very heavily loaded DEC UNIX box,
so maybe it's a problem on the FreeBSD client side.
>How-To-Repeat:
Difficult: I tried other 2.2.7 machines as the server but could not
reproduce it (though I have only one server loaded like dirac).
It did happen one time (!) when mounting a filesystem from a
DEC UNIX 4.0c box.
>Fix:
Putting mount -a -t nfs in /etc/rc after network_pass_3 doesn't help.
Maybe inserting a sleep would.
Real solution: unknown (to me...)

>Release-Note:
>Audit-Trail:

From: Jan Conrad <conrad@th.physik.uni-bonn.de>
To: freebsd-gnats-submit@freebsd.org, conrad@th.physik.uni-bonn.de
Cc:  
Subject: Re: kern/12609: At boottime NFS mounts on a 3.2 client from a 2.2.7
 server fails, when in turn the server has NFS mounted filesystems on the
 client
Date: Thu, 15 Jul 1999 13:28:38 +0200 (CEST)

 Hi again,
 
 The problem is due to the use of realpath in mountd. Realpath in turn uses
 getcwd, the 'canonical' part of which lstats some directory entries when
 looking at a parent directory of a mount point.
 
 If two NFS mount points are in the same directory and the one that comes
 first in the directory is down for some reason, a getcwd somewhere in the
 second one will block until the first one is up again.
 
 Since mountd uses realpath to check the mount point, it will not
 process the mount request until the first one is up again.
 
 Anyhow - I think this causes some NFS lookups, even if not by mountd....
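
To make the mechanism concrete, here is a minimal Python sketch of a getcwd that climbs through '..' and lstats the entries of each parent directory until it finds the one it came from. It is an illustration only, not the FreeBSD libc code: the name `getcwd_by_walking` and its return value are invented for this example, and for simplicity it lstats entries in every parent directory, whereas the analysis above says the lstat calls happen in the parent directory of a mount point.

```python
import os

def getcwd_by_walking(start="."):
    """Rebuild the absolute path of `start` by climbing through '..'.

    Returns (path, probed). `probed` lists every directory entry that was
    handed to lstat() on the way up; if one of them is a hung NFS mount
    point, the lstat() on it is where the caller would block.
    """
    parts, probed = [], []
    path = start
    cur = os.stat(path)
    while True:
        parent = os.path.join(path, "..")
        pst = os.stat(parent)
        if (pst.st_dev, pst.st_ino) == (cur.st_dev, cur.st_ino):
            break  # '..' is the directory itself: this is the root
        for name in os.listdir(parent):  # directory order, not sorted
            entry = os.path.join(parent, name)
            probed.append(entry)
            try:
                st = os.lstat(entry)  # may hang on a dead NFS mount
            except OSError:
                continue
            if (st.st_dev, st.st_ino) == (cur.st_dev, cur.st_ino):
                parts.append(name)  # found the entry we climbed out of
                break
        else:
            raise FileNotFoundError(f"{path} not found in its parent")
        path, cur = parent, pst
    return "/" + "/".join(reversed(parts)), probed
```

Every sibling that appears in the directory before the wanted entry gets an lstat. In the setup from this report, that would presumably be the lstat of /nilles/freebsd on dirac while merlin is still booting.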
 
 Dirty Workaround:
     hide all NFS mountpoints for one machine in a single dir (e.g.
     /mount/machine/dir) and then symlink to it..
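
As a rough illustration of that layout, the following Python sketch turns fstab-style entries into per-server mount points under /mount plus the symlinks that keep the old paths working. The helper name `plan_layout`, the /mount default, and the use of the last path component as the directory name are assumptions made for this example. It only computes a plan and touches nothing on disk.

```python
import posixpath

def plan_layout(entries, base="/mount"):
    """Map (remote, mountpoint) pairs to per-server mount dirs plus symlinks.

    Returns a list of (remote, new_mountpoint, symlink_path) tuples, where
    symlink_path is the old mount point that should become a symlink.
    """
    plan = []
    for remote, mountpoint in entries:
        if ":" in remote:
            server = remote.split(":", 1)[0]      # host:/path spelling
        else:
            server = remote.rsplit("@", 1)[1]     # /path@host spelling
        leaf = posixpath.basename(mountpoint.rstrip("/"))
        plan.append((remote, posixpath.join(base, server, leaf), mountpoint))
    return plan

# Entries taken from the fstab excerpts in the description above.
for remote, new_mp, link in plan_layout([
    ("dirac:/nilles/share", "/nilles/share"),
    ("pauli:/nilles/backup", "/nilles/backup"),
    ("/nilles/freebsd@merlin", "/nilles/freebsd"),
]):
    print(f"mount {remote} on {new_mp}, then symlink {link} -> {new_mp}")
```

With this layout, mount points from different servers no longer share a parent directory, so a walk up from one of them only meets siblings served by the same machine. Two mounts from one server that end in the same directory name would collide and need distinct names.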
 
 Better:
     Rewrite getcwd (Is there no syscall to find out the device of a
     mountpoint without doing a stat on it??)
     (I am gonna ask this on freebsd-hackers)
 
 best regards
     Jan
 
 -- 
 Physikalisches Institut der Universitaet Bonn
 Nussallee 12
 D-53115 Bonn
 GERMANY
 
 
 

From: Jan Conrad <conrad@th.physik.uni-bonn.de>
To: <freebsd-gnats-submit@FreeBSD.org>
Cc: <freebsd-bugs@freebsd.org>,
	Jan Conrad <conrad@th.physik.uni-bonn.de>
Subject: Re: kern/12609: At boottime NFS mounts on a 3.2 client from a 2.2.7
 server fails, when in turn the server has NFS mounted filesystems on the
 client
Date: Tue, 26 Jun 2001 12:28:29 +0200 (CEST)

 Hi,
 
 please close this PR.
 
 The problem should have been solved in 3.x.
 (see bin/6658)
 
 -Jan
 
 
State-Changed-From-To: open->closed 
State-Changed-By: dwmalone 
State-Changed-When: Tue Jun 26 03:31:00 PDT 2001 
State-Changed-Why:  
Submitter says problem should be fixed in 3.x. 

http://www.FreeBSD.org/cgi/query-pr.cgi?pr=12609 
>Unformatted:
