From quinot@inf.enst.fr  Mon Jan 14 13:13:17 2002
Return-Path: <quinot@inf.enst.fr>
Received: from infres.enst.fr (infres-192.enst.fr [137.194.192.1])
	by hub.freebsd.org (Postfix) with ESMTP id F05E537B402
	for <FreeBSD-gnats-submit@freebsd.org>; Mon, 14 Jan 2002 13:13:16 -0800 (PST)
Received: from shalmaneser.enst.fr (shalmaneser.enst.fr [137.194.162.11])
	by infres.enst.fr (Postfix) with ESMTP
	id B67161888; Mon, 14 Jan 2002 22:13:14 +0100 (MET)
Received: by shalmaneser.enst.fr (Postfix, from userid 11117)
	id 15F52112ED; Mon, 14 Jan 2002 22:13:13 +0100 (CET)
Message-Id: <20020114211313.15F52112ED@shalmaneser.enst.fr>
Date: Mon, 14 Jan 2002 22:13:13 +0100 (CET)
From: Thomas Quinot <thomas@cuivre.fr.eu.org>
Reply-To: Thomas Quinot <thomas@cuivre.fr.eu.org>
To: FreeBSD-gnats-submit@freebsd.org
Cc: thomas@cuivre.fr.eu.org
Subject: rpc.lockd problems on server
X-Send-Pr-Version: 3.113
X-GNATS-Notify:

>Number:         33897
>Category:       bin
>Synopsis:       rpc.lockd problems on server
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    alfred
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Mon Jan 14 13:20:00 PST 2002
>Closed-Date:    Wed Jan 16 16:13:10 PST 2002
>Last-Modified:  Wed Jan 16 16:29:18 PST 2002
>Originator:     Thomas Quinot
>Release:        FreeBSD 5.0-CURRENT i386
>Organization:
>Environment:
System: FreeBSD shalmaneser.enst.fr 5.0-CURRENT FreeBSD 5.0-CURRENT #0: Mon Jan 7 11:41:40 CET 2002 quinot@shalmaneser.enst.fr:/usr/obj/usr/src/sys/SHALMANESER i386


	
>Description:
	Since my last -CURRENT update, rpc.lockd dumps core every now
	and then, and procmail running on a Solaris NFS client
	hangs when trying to deliver mail to a mailbox on my FreeBSD server.
>How-To-Repeat:
	procmail delivery from Solaris client to FreeBSD 5-CURRENT server.
>Fix:
	None known so far.


>Release-Note:
>Audit-Trail:

From: Thomas Quinot <thomas@cuivre.fr.eu.org>
To: freebsd-gnats-submit@freebsd.org
Cc:  
Subject: Re: bin/33897: rpc.lockd problems on server
Date: Tue, 15 Jan 2002 10:09:04 +0100

 I was able to get a core dump from a binary with debugging symbols.
 Here we go:
 
 Script started on Tue Jan 15 10:05:27 2002
 # gdb rpc.lockd -c /rpc.lockd.core
 GNU gdb 4.18
 Copyright 1998 Free Software Foundation, Inc.
 GDB is free software, covered by the GNU General Public License, and you are
 welcome to change it and/or distribute copies of it under certain conditions.
 Type "show copying" to see the conditions.
 There is absolutely no warranty for GDB.  Type "show warranty" for details.
 This GDB was configured as "i386-unknown-freebsd"...
 Core was generated by `rpc.lockd'.
 Program terminated with signal 11, Segmentation fault.
 Reading symbols from /usr/lib/librpcsvc.so.2...done.
 Reading symbols from /usr/lib/libutil.so.3...done.
 Reading symbols from /usr/lib/libc.so.5...done.
 Reading symbols from /usr/libexec/ld-elf.so.1...done.
 #0  0x804db24 in retry_blockingfilelocklist ()
     at /usr/src/usr.sbin/rpc.lockd/lockd_lock.c:1263
 1263				LIST_INSERT_BEFORE(nfl, ifl, nfslocklist);
 (gdb) print nfl
 $1 = (struct file_lock *) 0x0
 (gdb) print ifl
 $2 = (struct file_lock *) 0x8065800
 (gdb) print nfslocklist
 No symbol "nfslocklist" in current context.
 (gdb) bt
 #0  0x804db24 in retry_blockingfilelocklist ()
     at /usr/src/usr.sbin/rpc.lockd/lockd_lock.c:1263
 #1  0x804de5d in unlock_partialfilelock (fl=0xbfbff1ec)
     at /usr/src/usr.sbin/rpc.lockd/lockd_lock.c:1511
 #2  0x804e2ad in do_unlock (fl=0xbfbff1ec)
     at /usr/src/usr.sbin/rpc.lockd/lockd_lock.c:1767
 #3  0x804e5e1 in unlock (lock=0xbfbff6cc, flags=2)
     at /usr/src/usr.sbin/rpc.lockd/lockd_lock.c:1946
 #4  0x804c310 in nlm4_unlock_4_svc (arg=0xbfbff6c4, rqstp=0xbfbffc24)
     at /usr/src/usr.sbin/rpc.lockd/lock_proc.c:1114
 #5  0x804ac36 in nlm_prog_4 (rqstp=0xbfbffc24, transp=0x805c000)
     at nlm_prot_svc.c:434
 #6  0x280d5c25 in svc_getreq_common () from /usr/lib/libc.so.5
 #7  0x280d5a28 in svc_getreqset () from /usr/lib/libc.so.5
 #8  0x2809fff4 in svc_run () from /usr/lib/libc.so.5
 #9  0x804afdc in main (argc=1, argv=0xbfbffdd8)
     at /usr/src/usr.sbin/rpc.lockd/lockd.c:207
 #10 0x80498eb in _start ()
 (gdb) quit
 
 Script done on Tue Jan 15 10:05:58 2002
 
 Hope this helps,
 Thomas.
 
Responsible-Changed-From-To: freebsd-bugs->alfred 
Responsible-Changed-By: sheldonh 
Responsible-Changed-When: Tue Jan 15 01:54:24 PST 2002 
Responsible-Changed-Why:  
Over to our rpc.lockd maintainer. 

http://www.FreeBSD.org/cgi/query-pr.cgi?pr=33897 

From: Mike Makonnen <mike_makonnen@yahoo.com>
To: freebsd-gnats-submit@freebsd.org
Cc:  
Subject: Re: bin/33897: rpc.lockd problems on server
Date: Tue, 15 Jan 2002 16:44:00 -0800

 Alfred, I took a look at retry_blockingfilelocklist() and the solution seemed simple enough. Please correct me if I am wrong. It seems said routine doesn't take into account boundary conditions when putting back file_lock entries into the blocked lock-list. Specifically, it fails when the file_lock being put back is the last element in the list, and when it is the only element in the list. I've included a patch below. 
 
 Basically, it introduces another variable: pfl, which keeps track of the list item before ifl. That way if nfl is NULL, ifl gets inserted after pfl. If pfl is also NULL, then it gets inserted at the head of the list (since it was the only element in the list).
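 
 A minimal, self-contained sketch of the boundary handling described
 above, using the <sys/queue.h> LIST macros. This is not the rpc.lockd
 code: struct node, its granted flag, and the reinsert() and retry_all()
 helpers are hypothetical names chosen for illustration. It only shows
 the three reinsertion cases (a successor exists, only a predecessor
 exists, neither exists).

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical stand-in for struct file_lock. */
struct node {
	int id;
	int granted;			/* nonzero: drop from the list */
	LIST_ENTRY(node) link;
};
LIST_HEAD(node_list, node);

/*
 * Put elem back where it was removed from.  prev and next are the
 * neighbours it had before removal; either may be NULL.
 */
static void
reinsert(struct node_list *head, struct node *prev, struct node *next,
    struct node *elem)
{
	if (next != NULL)
		LIST_INSERT_BEFORE(next, elem, link);
	else if (prev != NULL)
		/* elem was the last element in the list */
		LIST_INSERT_AFTER(prev, elem, link);
	else
		/* no neighbours remain on the list */
		LIST_INSERT_HEAD(head, elem, link);
}

/*
 * Remove each element in turn; granted ones stay off the list, the
 * rest are reinserted in place.  Returns how many elements remain.
 */
static int
retry_all(struct node_list *head)
{
	struct node *cur, *next, *prev = NULL;
	int remaining = 0;

	for (cur = LIST_FIRST(head); cur != NULL; cur = next) {
		next = LIST_NEXT(cur, link);	/* NULL for the last element */
		LIST_REMOVE(cur, link);
		if (cur->granted)
			continue;		/* prev is unchanged */
		reinsert(head, prev, next, cur);
		prev = cur;
		remaining++;
	}
	return (remaining);
}
```

 In this sketch prev only advances when an element is put back on the
 list, so it always points at a node that is still linked. Calling
 LIST_INSERT_BEFORE() unconditionally, as the unpatched routine does,
 dereferences a NULL pointer whenever the element removed was the last
 one.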
 
 Thomas, could you give it a try and see if it solves your problems?
 
 
 cheers,
 mike makonnen
 
 Index: rpc.lockd/lockd_lock.c
 ===================================================================
 RCS file: /FreeBSD/ncvs/src/usr.sbin/rpc.lockd/lockd_lock.c,v
 retrieving revision 1.6
 diff -u -r1.6 lockd_lock.c
 --- rpc.lockd/lockd_lock.c	2 Dec 2001 11:10:46 -0000	1.6
 +++ rpc.lockd/lockd_lock.c	15 Jan 2002 21:37:16 -0000
 @@ -1226,11 +1226,12 @@
  retry_blockingfilelocklist(void)
  {
  	/* Retry all locks in the blocked list */
 -	struct file_lock *ifl, *nfl; /* Iterator */
 +	struct file_lock *ifl, *nfl, *pfl; /* Iterator */
  	enum partialfilelock_status pflstatus;
  
  	debuglog("Entering retry_blockingfilelocklist\n");
  
 +	pfl = NULL;
  	ifl = LIST_FIRST(&blockedlocklist_head);
  	debuglog("Iterator choice %p\n",ifl);
  
 @@ -1241,6 +1242,7 @@
  		 */
  		nfl = LIST_NEXT(ifl, nfslocklist);
  		debuglog("Iterator choice %p\n",ifl);
 +		debuglog("Prev iterator choice %p\n",pfl);
  		debuglog("Next iterator choice %p\n",nfl);
  
  		/*
 @@ -1260,11 +1262,24 @@
  		} else {
  			/* Reinsert lock back into same place in blocked list */
  			debuglog("Replacing blocked lock\n");
 -			LIST_INSERT_BEFORE(nfl, ifl, nfslocklist);
 +			if (nfl == NULL)
 +				/* ifl is the last elem. in the list */
 +				if (pfl == NULL)
 +					/* ifl is the only elem. in the list */
 +					LIST_INSERT_HEAD(&blockedlocklist_head, ifl, nfslocklist);
 +				else
 +					LIST_INSERT_AFTER(pfl, ifl, nfslocklist);
 +			else
 +				LIST_INSERT_BEFORE(nfl, ifl, nfslocklist);
  		}
  
  		/* Valid increment behavior regardless of state of ifl */
  		ifl = nfl;
 +		/* if a lock was granted incrementing pfl would make it nfl */
 +		if (pfl != NULL && (LIST_NEXT(pfl, nfslocklist) != nfl))
 +			pfl = LIST_NEXT(pfl, nfslocklist);
 +		else
 +			pfl = LIST_FIRST(&blockedlocklist_head);
  	}
  
  	debuglog("Exiting retry_blockingfilelocklist\n");
 

From: Mike Makonnen <mike_makonnen@yahoo.com>
To: bug-followup@freebsd.org
Cc:  
Subject: Re: bin/33897: rpc.lockd problems on server
Date: Tue, 15 Jan 2002 20:44:05 -0800

 Thomas, please use this patch instead.
 The previous patch was correct, but this one is cleaner :)
 
 cheers,
 mike makonnen
 
 Index: rpc.lockd/lockd_lock.c
 ===================================================================
 RCS file: /FreeBSD/ncvs/src/usr.sbin/rpc.lockd/lockd_lock.c,v
 retrieving revision 1.6
 diff -u -r1.6 lockd_lock.c
 --- rpc.lockd/lockd_lock.c	2 Dec 2001 11:10:46 -0000	1.6
 +++ rpc.lockd/lockd_lock.c	16 Jan 2002 04:16:51 -0000
 @@ -1226,11 +1226,12 @@
  retry_blockingfilelocklist(void)
  {
  	/* Retry all locks in the blocked list */
 -	struct file_lock *ifl, *nfl; /* Iterator */
 +	struct file_lock *ifl, *nfl, *pfl; /* Iterator */
  	enum partialfilelock_status pflstatus;
  
  	debuglog("Entering retry_blockingfilelocklist\n");
  
 +	pfl = NULL;
  	ifl = LIST_FIRST(&blockedlocklist_head);
  	debuglog("Iterator choice %p\n",ifl);
  
 @@ -1241,6 +1242,7 @@
  		 */
  		nfl = LIST_NEXT(ifl, nfslocklist);
  		debuglog("Iterator choice %p\n",ifl);
 +		debuglog("Prev iterator choice %p\n",pfl);
  		debuglog("Next iterator choice %p\n",nfl);
  
  		/*
 @@ -1260,11 +1262,20 @@
  		} else {
  			/* Reinsert lock back into same place in blocked list */
  			debuglog("Replacing blocked lock\n");
 -			LIST_INSERT_BEFORE(nfl, ifl, nfslocklist);
 +			if (pfl != NULL)
 +				LIST_INSERT_AFTER(pfl, ifl, nfslocklist);
 +			else
 +				/* ifl is the only elem. in the list */
 +				LIST_INSERT_HEAD(&blockedlocklist_head, ifl, nfslocklist);
  		}
  
  		/* Valid increment behavior regardless of state of ifl */
  		ifl = nfl;
 +		/* if a lock was granted incrementing pfl would make it nfl */
 +		if (pfl != NULL && (LIST_NEXT(pfl, nfslocklist) != nfl))
 +			pfl = LIST_NEXT(pfl, nfslocklist);
 +		else
 +			pfl = LIST_FIRST(&blockedlocklist_head);
  	}
  
  	debuglog("Exiting retry_blockingfilelocklist\n");

From: Thomas Quinot <thomas@cuivre.fr.eu.org>
To: Mike Makonnen <mike_makonnen@yahoo.com>
Cc: freebsd-gnats-submit@freebsd.org
Subject: Re: bin/33897: rpc.lockd problems on server
Date: Wed, 16 Jan 2002 14:28:05 +0100

 On 2002-01-15, Mike Makonnen wrote:
 
 > Thomas, could you give it a try and see if it solves your problems?
 
 Looks way better. I have just patched and rebooted with the new
 rpc.lockd and with 15 minutes uptime, I have not had an rpc.lockd core
 dump yet. I was able to open mailboxes and deliver email from a
 Solaris NFS client. I'll follow up after a few further hours of testing.
 
 Thanks for your prompt help!
 Thomas.
 
 -- 
     Thomas.Quinot@Cuivre.FR.EU.ORG
State-Changed-From-To: open->closed 
State-Changed-By: alfred 
State-Changed-When: Wed Jan 16 16:13:10 PST 2002 
State-Changed-Why:  
Mike Makonnen's patch seems good. 
Fixed in revision 1.7 of src/usr.sbin/rpc.lockd/lockd_lock.c 

>Unformatted:
