From lennox@cs.columbia.edu  Thu Sep  4 15:55:33 2003
Return-Path: <lennox@cs.columbia.edu>
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 1686516A4BF
	for <FreeBSD-gnats-submit@freebsd.org>; Thu,  4 Sep 2003 15:55:33 -0700 (PDT)
Received: from cs.columbia.edu (cs.columbia.edu [128.59.16.20])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 2FE9B43F85
	for <FreeBSD-gnats-submit@freebsd.org>; Thu,  4 Sep 2003 15:55:32 -0700 (PDT)
	(envelope-from lennox@cs.columbia.edu)
Received: from cnr.cs.columbia.edu (cnr.cs.columbia.edu [128.59.19.133])
	by cs.columbia.edu (8.12.9/8.12.9) with ESMTP id h84MtTaH009275
	(version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NOT)
	for <FreeBSD-gnats-submit@freebsd.org>; Thu, 4 Sep 2003 18:55:31 -0400 (EDT)
Received: from cnr.cs.columbia.edu (localhost [127.0.0.1])
	by cnr.cs.columbia.edu (8.12.9/8.12.9) with ESMTP id h84MsxuT041672
	for <FreeBSD-gnats-submit@freebsd.org>; Thu, 4 Sep 2003 18:54:59 -0400 (EDT)
	(envelope-from lennox@cnr.cs.columbia.edu)
Received: (from lennox@localhost)
	by cnr.cs.columbia.edu (8.12.9/8.12.9/Submit) id h84MsxdA041659;
	Thu, 4 Sep 2003 18:54:59 -0400 (EDT)
Message-Id: <200309042254.h84MsxdA041659@cnr.cs.columbia.edu>
Date: Thu, 4 Sep 2003 18:54:59 -0400 (EDT)
From: Jonathan Lennox <lennox@cs.columbia.edu>
Reply-To: Jonathan Lennox <lennox@cs.columbia.edu>
To: FreeBSD-gnats-submit@freebsd.org
Cc:
Subject: FreeBSD client rpc.lockd incompatible with Linux server rpc.lockd
X-Send-Pr-Version: 3.113
X-GNATS-Notify:

>Number:         56461
>Category:       kern
>Synopsis:       [rpc] FreeBSD client rpc.lockd incompatible with Linux server rpc.lockd
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    freebsd-bugs
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          change-request
>Submitter-Id:   current-users
>Arrival-Date:   Thu Sep 04 16:00:24 PDT 2003
>Closed-Date:    Fri Nov 02 13:41:47 UTC 2012
>Last-Modified:  Fri Nov 02 13:41:47 UTC 2012
>Originator:     Jonathan Lennox
>Release:        FreeBSD 5.1-RELEASE-p2 i386
>Organization:
Columbia University Computer Science
>Environment:
System: FreeBSD cnr.cs.columbia.edu 5.1-RELEASE-p2 FreeBSD 5.1-RELEASE-p2 #0: Wed Aug 27 22:24:11 EDT 2003 lennox@cnr.cs.columbia.edu:/usr/obj/usr/src/sys/CNR i386


>Description:

Linux's implementation of NFS NLM locks is buggy: it doesn't support lock
cookies longer than 8 bytes in size.  See the comment in
<http://lxr.linux.no/source/include/linux/lockd/xdr.h?v=2.6.0-test2> on the
definition of 'struct nlm_cookie': "NLM cookies. Technically they can be 1K,
Nobody uses over 8 bytes however."

Unfortunately, this is actually "nobody" except FreeBSD 5.x, which uses
16-byte cookies.  As a result, any attempt by a FreeBSD client to lock an
NFS-mounted file from a Linux server results in the process on the FreeBSD
client hanging, unkillably.

Getting this fixed in Linux will probably be difficult -- after all, it
doesn't inconvenience *Linux* users.  Moreover, since this hasn't been fixed
as of Linux 2.6-test, any server-side fix is going to take a *long* time to
be reliably deployed.  As such, I'm afraid that in order to have successful
interoperation with Linux NFS servers, the FreeBSD NFS lock client code
needs to be modified to send only 8-byte NLM cookies.

The patch I've attached below is a quick-and-dirty fix, as recommended by
Dan Nelson on freebsd-hackers on 29 April 2003.  However, it loses
functionality, since the protection against PID recycling is disabled.

A proper fix would be either to somehow compress all three pieces of
information -- pid, pid_start, and msg_seq -- into eight bytes (difficult);
maintain an in-kernel table mapping an eight-byte sequence number to
lockd_msg_ident; or find some other, smaller way of defending against pid
recycling.
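
As a rough illustration of the table-based option, the sketch below maps an
eight-byte sequence number to the full identifier. It is a userland
approximation under stated assumptions, not kernel code: cookie_alloc(),
cookie_claim(), TABLE_SLOTS and the fixed-size array are hypothetical names
invented here, and locking is omitted entirely. Only struct lockd_msg_ident
mirrors the existing (pre-patch) definition.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>
#include <sys/types.h>

/* Mirrors the pre-patch structure from nfs_lock.h. */
struct lockd_msg_ident {
	pid_t		pid;		/* The process ID. */
	struct timeval	pid_start;	/* Start time of process id */
	int		msg_seq;	/* Sequence number of message */
};

#define TABLE_SLOTS	64		/* hypothetical table size */

/* Hypothetical table entry: an 8-byte cookie and the identifier it names. */
struct cookie_entry {
	uint64_t		cookie;
	struct lockd_msg_ident	ident;
	int			in_use;
};

static struct cookie_entry cookie_table[TABLE_SLOTS];
static uint64_t next_cookie = 1;	/* 0 is reserved to mean "no cookie" */

/* Store ident and return its 8-byte cookie, or 0 if the table is full. */
static uint64_t
cookie_alloc(const struct lockd_msg_ident *ident)
{
	size_t i;

	for (i = 0; i < TABLE_SLOTS; i++) {
		if (!cookie_table[i].in_use) {
			cookie_table[i].in_use = 1;
			cookie_table[i].cookie = next_cookie++;
			cookie_table[i].ident = *ident;
			return (cookie_table[i].cookie);
		}
	}
	return (0);
}

/* Look up a cookie, copy out its identifier, and free the slot. */
static int
cookie_claim(uint64_t cookie, struct lockd_msg_ident *out)
{
	size_t i;

	for (i = 0; i < TABLE_SLOTS; i++) {
		if (cookie_table[i].in_use &&
		    cookie_table[i].cookie == cookie) {
			*out = cookie_table[i].ident;
			cookie_table[i].in_use = 0;
			return (0);
		}
	}
	return (-1);	/* unknown or already claimed */
}
```

With something along these lines, only the eight-byte cookie would travel
over the wire, while pid_start stays on the client and can still be compared
when the answer comes back.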

>How-To-Repeat:

Make sure rpc.lockd and rpc.statd are running.

NFS-mount a filesystem from a Linux fileserver.

flock() a file on that filesystem.

Observe the flock()ing process hanging.  Notice that not even kill -9 will
kill the process.
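
A minimal helper for the locking step might look like the sketch below. The
function name try_flock() is made up for illustration; the path to pass in is
any file on the NFS mount. On an affected client the flock() call is where
the process would be expected to hang.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Take and release an exclusive flock() on path (hypothetical helper).
 * Returns 0 on success or an errno value on failure.
 */
static int
try_flock(const char *path)
{
	int fd, error = 0;

	fd = open(path, O_RDWR | O_CREAT, 0644);
	if (fd == -1)
		return (errno);
	if (flock(fd, LOCK_EX) == -1)	/* the reported hang occurs here */
		error = errno;
	else
		(void)flock(fd, LOCK_UN);
	(void)close(fd);
	return (error);
}
```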

>Fix:

Apply the following patch, and rebuild rpc.lockd and your kernel.

--- nfs_lock.h.orig	Thu Sep  4 18:11:45 2003
+++ nfs_lock.h	Thu Sep  4 18:12:17 2003
@@ -49,12 +49,10 @@
 /*
  * This structure is used to uniquely identify the process which originated
  * a particular message to lockd.  A sequence number is used to differentiate
- * multiple messages from the same process.  A process start time is used to
- * detect the unlikely, but possible, event of the recycling of a pid.
+ * multiple messages from the same process.
  */
 struct lockd_msg_ident {
 	pid_t		pid;            /* The process ID. */
-	struct timeval	pid_start;	/* Start time of process id */
 	int		msg_seq;	/* Sequence number of message */
 };
 
--- nfs_lock.c.orig	Thu Sep  4 18:11:50 2003
+++ nfs_lock.c	Thu Sep  4 18:14:45 2003
@@ -117,7 +117,6 @@
 		p->p_nlminfo->pid_start = p->p_stats->p_start;
 		timevaladd(&p->p_nlminfo->pid_start, &boottime);
 	}
-	msg.lm_msg_ident.pid_start = p->p_nlminfo->pid_start;
 	msg.lm_msg_ident.msg_seq = ++(p->p_nlminfo->msg_seq);
 
 	msg.lm_fl = *fl;
@@ -257,8 +256,8 @@
 	 */
 	if (targetp->p_nlminfo == NULL ||
 	    ((ansp->la_msg_ident.msg_seq != -1) &&
-	      (timevalcmp(&targetp->p_nlminfo->pid_start,
-			&ansp->la_msg_ident.pid_start, !=) ||
+	      (/*timevalcmp(&targetp->p_nlminfo->pid_start,
+                 &ansp->la_msg_ident.pid_start, !=) || */
 	       targetp->p_nlminfo->msg_seq != ansp->la_msg_ident.msg_seq))) {
 		PROC_UNLOCK(targetp);
 		return (EPIPE);
>Release-Note:
>Audit-Trail:

From: Kris Kennaway <kris@obsecurity.org>
To: freebsd-gnats-submit@FreeBSD.org, lennox@cs.columbia.edu
Cc:  
Subject: Re: kern/56461
Date: Fri, 10 Oct 2003 23:19:36 -0700

 I can confirm that this patch resolves my interoperability problems
 with a FreeBSD 5.1 client and Linux 2.4.x (Redhat) server.  Thanks!
 
 Kris
Responsible-Changed-From-To: freebsd-bugs->kris 
Responsible-Changed-By: kris 
Responsible-Changed-When: Thu Nov 6 19:53:46 PST 2003 
Responsible-Changed-Why:  
I am looking at this 

http://www.freebsd.org/cgi/query-pr.cgi?pr=56461 

From: Jonathan Lennox <lennox@cs.columbia.edu>
To: freebsd-gnats-submit@FreeBSD.org
Cc:  
Subject: Re: kern/56461: FreeBSD client rpc.lockd incompatible with Linux server rpc.lockd
Date: Tue, 13 Jan 2004 17:11:33 -0500

 This problem is still present under FreeBSD 5.2-RELEASE.
 
 The patch still applies.  I also need to apply the patch in PR bin/56500
 (rpc.lockd should use reserved ports) to get the NLM client to work with our
 Linux NFS server.
Responsible-Changed-From-To: kris->freebsd-bugs 
Responsible-Changed-By: kris 
Responsible-Changed-When: Wed May 5 22:13:10 PDT 2004 
Responsible-Changed-Why:  
I no longer have the resources to test this problem 

http://www.freebsd.org/cgi/query-pr.cgi?pr=56461 
State-Changed-From-To: open->analyzed 
State-Changed-By: bms 
State-Changed-When: Fri Jun 18 11:54:22 GMT 2004 
State-Changed-Why:  
It's pretty clear we have a good analysis for this now. 
The problem is limited in scope to -CURRENT. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=56461 

From: Bruce M Simpson <bms@spc.org>
To: freebsd-net@FreeBSD.org
Cc: alfred@FreeBSD.org, kris@FreeBSD.org,
	Jonathan Lennox <lennox@cs.columbia.edu>,
	freebsd-gnats-submit@FreeBSD.org
Subject: Re: kern/56461: FreeBSD client rpc.lockd incompatible with Linux server rpc.lockd
Date: Fri, 18 Jun 2004 12:49:30 +0100

 I've attached my thoughts on this issue. I haven't gone ahead and
 committed the fix in the PR as it makes us just as braindead as Linux,
 but it would be good to be able to have this in GENERIC so that it
 can be enabled in those situations where it's needed.
 
 Regards,
 BMS
 
 [Attachment: lockd-linux-compat.txt]
 
 Synopsis:
 
 Linux NFS advisory locks are broken and incompatible with the rest
 of the world. FreeBSD 5.x in particular uses BSD/OS derived NFS code
 and thus is affected. FreeBSD 4.x does not implement client-side NFS
 advisory locks.
 
 This problem is also documented as existing for MacOS X, IRIX and BSD/OS:
 http://www.netsys.com/bsdi-users/2002-04/msg00036.html
 http://www.uwsg.iu.edu/hypermail/linux/kernel/0311.0/0498.html
 http://lists.freebsd.org/pipermail/freebsd-hackers/2003-July/001833.html
 http://lists.freebsd.org/pipermail/freebsd-hackers/2003-April/000592.html
 
 The patch provided in the PR is verified to solve the problem, but
 it would be good to make this functionality optional at run-time,
 as many people are likely to be using Linux NFS shares read/write
 with advisory locks.
 
 Walkthrough:
 
 The addition of pid_start to struct lockd_msg_ident is what triggered
 this problem. The offending member is referenced by the NFS code, and
 rpc.lockd itself.
 
 The kernel interface code for rpc.lockd resides in
 src/usr.sbin/rpc.lockd/kern.c.
 
 LOCKD_MSG is what gets passed from the kernel to rpc.lockd via the
 named pipe /var/run/lock.
 
 NFSCLNT_LOCKDANS is used by lockd to send a response back. struct
 lockd_ans is the structure passed via this syscall. The kernel code
 for this is in nfslockdans(), in src/sys/nfsclient/nfs_lock.c.
 
 Proposed solution:
 
 Actual NLM request conversion to/from the kernel happens in rpc.lockd;
 there are several places in kern.c, notably test_request() and
 lock_request(), which reference struct nlm4_testargs, struct nlm_testargs,
 struct nlm_lockargs, and struct nlm4_lockargs.
 These are defined in src/include/rpcsvc/nlm_prot.x.
 
 XXX Are the lockd cookies different from the regular NFS filehandles?
 
 	arg4.cookie.n_bytes = (char *)&msg->lm_msg_ident;
 	arg4.cookie.n_len = sizeof(msg->lm_msg_ident);
 
 There's no need to change this structure, just the number of bytes
 provided by it; the lm_msg_ident structure needs to change if we're
 doing Linux compatibility, and this is probably best served by adding
 a sysctl to keep track of whether we're in this mode or not.
 
 So embedding a union of structs in lm_msg_ident is probably the way to go,
 taking the sizeof() of the embedded struct as appropriate.
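
A possible shape for that union, as a sketch only: the names ident_full,
ident_compat, msg_ident_u and cookie_len() are hypothetical, and the
linux_compat flag stands in for whatever the sysctl would expose. The point
is just that n_len would come from the member in use rather than from the
whole structure.

```c
#include <stddef.h>
#include <sys/time.h>
#include <sys/types.h>

/* Full identifier, as currently sent (includes the process start time). */
struct ident_full {
	pid_t		pid;
	struct timeval	pid_start;
	int		msg_seq;
};

/* Reduced identifier for servers limited to 8-byte cookies. */
struct ident_compat {
	pid_t		pid;
	int		msg_seq;
};

/* Hypothetical union embedded where lm_msg_ident lives today. */
union msg_ident_u {
	struct ident_full	full;
	struct ident_compat	compat;
};

/* Number of cookie bytes to send for the current mode. */
static size_t
cookie_len(int linux_compat)
{
	return (linux_compat ? sizeof(struct ident_compat) :
	    sizeof(struct ident_full));
}
```

The cookie assignment would then use cookie_len() for n_len instead of
sizeof(msg->lm_msg_ident).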
 
 I would suggest adding a sysctl to the tree: vfs.nfs.pid_start_locks,
 "Use process start time as well as PID to differentiate client-side NFS locks".
 This should be referenced from nfslockdans() as per the original patch
 to check if the timevalcmp comparison should be skipped.
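
The check in nfslockdans() could then be gated roughly as follows. This is a
userland sketch under assumptions: pid_start_locks is a plain variable
standing in for the proposed sysctl, the two *_sketch structures are cut-down
stand-ins for the kernel ones, and answer_is_stale() is a hypothetical name.
The logic follows the condition in the original patch, with the start-time
comparison made conditional instead of commented out.

```c
#include <stdbool.h>
#include <sys/time.h>

/* Stand-in for the proposed vfs.nfs.pid_start_locks sysctl. */
static bool pid_start_locks = true;

/* Cut-down stand-ins for p_nlminfo and la_msg_ident. */
struct nlminfo_sketch {
	struct timeval	pid_start;
	int		msg_seq;
};
struct ans_ident_sketch {
	struct timeval	pid_start;
	int		msg_seq;
};

/* Returns true when the answer does not match the waiting request. */
static bool
answer_is_stale(const struct nlminfo_sketch *info,
    const struct ans_ident_sketch *ans)
{
	/* As in the existing code, msg_seq == -1 bypasses the checks. */
	if (ans->msg_seq == -1)
		return (false);
	/* Only compare start times when the sysctl asks for it. */
	if (pid_start_locks &&
	    (info->pid_start.tv_sec != ans->pid_start.tv_sec ||
	    info->pid_start.tv_usec != ans->pid_start.tv_usec))
		return (true);
	return (info->msg_seq != ans->msg_seq);
}
```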
 

From: Alfred Perlstein <alfred@freebsd.org>
To: freebsd-net@FreeBSD.org, kris@FreeBSD.org,
	Jonathan Lennox <lennox@cs.columbia.edu>,
	freebsd-gnats-submit@FreeBSD.org
Cc:  
Subject: Re: kern/56461: FreeBSD client rpc.lockd incompatible with Linux server rpc.lockd
Date: Fri, 18 Jun 2004 10:51:21 -0700

 This fucking sucks.
 
 *Sigh* make it a sysctl, but can someone please lay the smack
 down on the linuxiots and have them fix their crap?
 
 
 
 * Bruce M Simpson <bms@spc.org> [040618 04:50] wrote:
 > I've attached my thoughts on this issue. I haven't gone ahead and
 > committed the fix in the PR as it makes us just as braindead as Linux,
 > but it would be good to be able to have this in GENERIC so that it
 > can be enabled in those situations where it's needed.
 > 
 > Regards,
 > BMS
 
 > Synopsis:
 > 
 > Linux NFS advisory locks are broken and incompatible with the rest
 > of the world. FreeBSD 5.x in particular uses BSD/OS derived NFS code
 > and thus is affected. FreeBSD 4.x does not implement client-side NFS
 > advisory locks.
 > 
 > This problem is also documented as existing for MacOS X, IRIX and BSD/OS:
 > http://www.netsys.com/bsdi-users/2002-04/msg00036.html
 > http://www.uwsg.iu.edu/hypermail/linux/kernel/0311.0/0498.html
 > http://lists.freebsd.org/pipermail/freebsd-hackers/2003-July/001833.html
 > http://lists.freebsd.org/pipermail/freebsd-hackers/2003-April/000592.html
 > 
 > The patch provided in the PR is verified to solve the problem, but
 > it would be good to make this functionality optional at run-time,
 > as many people are likely to be using Linux NFS shares read/write
 > with advisory locks.
 > 
 > Walkthrough:
 > 
 > The addition of pid_start to struct lockd_msg_ident is what triggered
 > this problem. The offending member is referenced by the NFS code, and
 > rpc.lockd itself.
 > 
 > The kernel interface code for rpc.lockd resides in
 > src/usr.sbin/rpc.lockd/kern.c.
 > 
 > LOCKD_MSG is what gets passed from the kernel to rpc.lockd via the
 > named pipe /var/run/lock.
 > 
 > NFSCLNT_LOCKDANS is used by lockd to send a response back. struct
 > lockd_ans is the structure passed via this syscall. The kernel code
 > for this is in nfslockdans(), in src/sys/nfsclient/nfs_lock.c.
 > 
 > Proposed solution:
 > 
 > Actual NLM request conversion to/from the kernel happens in rpc.lockd;
 > there are several places in kern.c, notably test_request() and
 > lock_request(), which reference struct nlm4_testargs, struct nlm_testargs,
 > struct nlm_lockargs, and struct nlm4_lockargs.
 > These are defined in src/include/rpcsvc/nlm_prot.x.
 > 
 > XXX Are the lockd cookies different from the regular NFS filehandles?
 > 
 > 	arg4.cookie.n_bytes = (char *)&msg->lm_msg_ident;
 > 	arg4.cookie.n_len = sizeof(msg->lm_msg_ident);
 > 
 > There's no need to change this structure, just the number of bytes
 > provided by it; the lm_msg_ident structure needs to change if we're
 > doing Linux compatibility, and this is probably best served by adding
 > a sysctl to keep track of whether we're in this mode or not.
 > 
 > So embedding a union of structs in lm_msg_ident is probably the way to go,
 > taking the sizeof() of the embedded struct as appropriate.
 > 
 > I would suggest adding a sysctl to the tree: vfs.nfs.pid_start_locks,
 > "Use process start time as well as PID to differentiate client-side NFS locks".
 > This should be referenced from nfslockdans() as per the original patch
 > to check if the timevalcmp comparison should be skipped.
 
 
 
 
 -- 
 - Alfred Perlstein
 - Research Engineering Development Inc.
 - email: bright@mu.org cell: 408-480-4684

From: Dan Nelson <dnelson@allantgroup.com>
To: freebsd-net@FreeBSD.org, alfred@FreeBSD.org, kris@FreeBSD.org,
	Jonathan Lennox <lennox@cs.columbia.edu>,
	freebsd-gnats-submit@FreeBSD.org
Cc:  
Subject: Re: kern/56461: FreeBSD client rpc.lockd incompatible with Linux server rpc.lockd
Date: Fri, 18 Jun 2004 17:35:07 -0500

 In the last episode (Jun 18), Bruce M Simpson said:
 > I've attached my thoughts on this issue. I haven't gone ahead and
 > committed the fix in the PR as it makes us just as braindead as
 > Linux, but it would be good to be able to have this in GENERIC so
 > that it can be enabled in those situations where it's needed.
 
 Linux kernels 2.4.26 and above have fixed this particular bug, so the
 need for a compatibility hack on our end is not as great anymore.
 
 http://www.kernel.org/pub/linux/kernel/v2.4/ChangeLog-2.4.26 , search
 for "cookie".
 
 -- 
 	Dan Nelson
 	dnelson@allantgroup.com
Responsible-Changed-From-To: freebsd-bugs->bms 
Responsible-Changed-By: bms 
Responsible-Changed-When: Tue Jun 22 16:30:32 GMT 2004 
Responsible-Changed-Why:  
I am working on a patch to convert this into a mount option 

http://www.freebsd.org/cgi/query-pr.cgi?pr=56461 
State-Changed-From-To: analyzed->suspended 
State-Changed-By: bms 
State-Changed-When: Wed Jun 23 04:39:05 GMT 2004 
State-Changed-Why:  
Problem fixed in Linux now but will take time to filter down to 
each distro (disparate userland and kernel components responsible 
for the issue). Therefore a workaround in FreeBSD probably isn't 
warranted. 


Responsible-Changed-From-To: bms->freebsd-bugs 
Responsible-Changed-By: bms 
Responsible-Changed-When: Wed Jun 23 04:39:05 GMT 2004 
Responsible-Changed-Why:  
Back to the free pool 

http://www.freebsd.org/cgi/query-pr.cgi?pr=56461 
State-Changed-From-To: suspended->closed 
State-Changed-By: eadler 
State-Changed-When: Fri Nov 2 13:41:46 UTC 2012 
State-Changed-Why:  
this has probably filtered down to various distros by now 

http://www.freebsd.org/cgi/query-pr.cgi?pr=56461 
>Unformatted:
