From admin@tw2.thebbs.org  Fri Apr 11 08:57:53 2003
Return-Path: <admin@tw2.thebbs.org>
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 41ECC37B401
	for <FreeBSD-gnats-submit@freebsd.org>; Fri, 11 Apr 2003 08:57:53 -0700 (PDT)
Received: from tw2.thebbs.org (hssx-yktn-59-202.sasknet.sk.ca [142.165.59.202])
	by mx1.FreeBSD.org (Postfix) with ESMTP id AB85D43FDD
	for <FreeBSD-gnats-submit@freebsd.org>; Fri, 11 Apr 2003 08:57:51 -0700 (PDT)
	(envelope-from admin@tw2.thebbs.org)
Received: from tw2.thebbs.org (localhost.kingcole.local [127.0.0.1])
	by tw2.thebbs.org (8.12.6/8.12.6) with ESMTP id h3B8UST4000900;
	Fri, 11 Apr 2003 02:34:13 -0600 (CST)
	(envelope-from admin@tw2.thebbs.org)
Received: (from root@localhost)
	by tw2.thebbs.org (8.12.6/8.12.6/Submit) id h3B8UR8J000899;
	Fri, 11 Apr 2003 02:30:27 -0600 (CST)
Message-Id: <200304110830.h3B8UR8J000899@tw2.thebbs.org>
Date: Fri, 11 Apr 2003 02:30:27 -0600 (CST)
From: Stephen Hurd <admin@tw2.thebbs.org>
Reply-To: Stephen Hurd <shurd@sasktel.net>
To: FreeBSD-gnats-submit@freebsd.org
Cc: Deuce <deuce@lordlegacy.com>, Rob Swindell <rob@synchro.net>
Subject: [PATCH] no sane record locking on *nix.  (More types needed)
X-Send-Pr-Version: 3.113
X-GNATS-Notify:

>Number:         50827
>Category:       kern
>Synopsis:       [kernel] [patch] [request] add sane record locking
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          suspended
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          change-request
>Submitter-Id:   current-users
>Arrival-Date:   Fri Apr 11 09:00:24 PDT 2003
>Closed-Date:    
>Last-Modified:  Sat Jan 26 04:49:41 UTC 2008
>Originator:     Stephen Hurd
>Release:        FreeBSD 4.7-RELEASE i386
>Organization:
>Environment:
System: FreeBSD sharon.kingcole.local 4.7-RELEASE FreeBSD 4.7-RELEASE #33: Fri Jan 31 00:06:43 CST 2003 admin@sharon.kingcole.local:/usr/src/sys/compile/SHARON i386


>Description:
	Record locking in POSIX systems is in a sad state.  Due to some poor
	design choices when fcntl() was developed, our only methods of record
	locking are (to quote the man page) "completely stupid".  There are
	times when fcntl() locking is utterly unmanageable, and flock() locks,
	which always cover the whole file, are just too much.

	Currently, the record locking capabilities of DOS 3.0+ are more useful
	than those in any POSIX-compliant *nix.

	After leading the battle for sane locks with the implementation of
	flock() in 4.2BSD, the time has come for BSD to take up the job once
	again and implement sane record locking.
>How-To-Repeat:
	1) Write a threaded multi-instance server.
	2) Do record locking.
	3) Stop swearing at fcntl() (Optional)

>Fix:

	The following patch adds the F_SANEWRLCK, F_SANERDLCK, F_SANEUNLCK, 
	F_SANEWRLCKNO, and F_SANERDLCKNO lock types to fcntl().  The types
	with NO (No Overlap) will cause locks to conflict regardless of owner
	(i.e., you cannot get an exclusive lock that is contained inside of a
	shared lock you currently hold).

	Sane locks are released when the file descriptor that holds them is
	closed, not when any descriptor for the file is closed.  You can also
	get an exclusive lock without opening the file with write access and
	a shared lock without opening the file with read access.

--- sanelock.patch begins here ---
diff -c /sys/kern.old/kern_descrip.c /sys/kern/kern_descrip.c
*** /sys/kern.old/kern_descrip.c	Fri Apr 11 01:47:51 2003
--- /sys/kern/kern_descrip.c	Fri Apr 11 01:51:28 2003
***************
*** 328,333 ****
--- 328,359 ----
  			error = VOP_ADVLOCK(vp, (caddr_t)p->p_leader, F_UNLCK,
  				&fl, F_POSIX);
  			break;
+ 		case F_SANEWRLCKNO:
+ 			flg |= F_NOOVRLP;
+ 		case F_SANEWRLCK:
+ 			fl.l_type=F_WRLCK;
+ 			flg &= ~F_POSIX;
+ 			flg |= F_FLOCK;
+ 			fp->f_flag |= FHASLOCK;
+ 			error = VOP_ADVLOCK(vp, (caddr_t)fp, F_SETLK,
+ 			    &fl, flg);
+ 			break;
+ 		case F_SANERDLCKNO:
+ 			flg |= F_NOOVRLP;
+ 		case F_SANERDLCK:
+ 			fl.l_type=F_RDLCK;
+ 			flg &= ~F_POSIX;
+ 			flg |= F_FLOCK;
+ 			fp->f_flag |= FHASLOCK;
+ 			error = VOP_ADVLOCK(vp, (caddr_t)fp, F_SETLK,
+ 			    &fl, flg);
+ 			break;
+ 		case F_SANEUNLCK:
+ 			flg &= ~F_POSIX;
+ 			flg |= F_FLOCK;
+ 			error = VOP_ADVLOCK(vp, (caddr_t)fp, F_UNLCK,
+ 				&fl, F_FLOCK);
+ 			break;
  		default:
  			error = EINVAL;
  			break;
diff -c /sys/kern.old/kern_lockf.c /sys/kern/kern_lockf.c
*** /sys/kern.old/kern_lockf.c	Fri Apr 11 01:47:51 2003
--- /sys/kern/kern_lockf.c	Fri Apr 11 01:48:03 2003
***************
*** 578,585 ****
  	start = lock->lf_start;
  	end = lock->lf_end;
  	while (lf != NOLOCKF) {
! 		if (((type & SELF) && lf->lf_id != lock->lf_id) ||
! 		    ((type & OTHERS) && lf->lf_id == lock->lf_id)) {
  			*prev = &lf->lf_next;
  			*overlap = lf = lf->lf_next;
  			continue;
--- 578,586 ----
  	start = lock->lf_start;
  	end = lock->lf_end;
  	while (lf != NOLOCKF) {
! 		if  ((!(lock->lf_flags & F_NOOVRLP)) &&
! 		    (((type & SELF) && lf->lf_id != lock->lf_id) ||
! 		    ((type & OTHERS) && lf->lf_id == lock->lf_id))) {
  			*prev = &lf->lf_next;
  			*overlap = lf = lf->lf_next;
  			continue;
diff -c /sys/sys.old/fcntl.h /sys/sys/fcntl.h
*** /sys/sys.old/fcntl.h	Fri Apr 11 01:48:32 2003
--- /sys/sys/fcntl.h	Fri Apr 11 01:48:11 2003
***************
*** 167,176 ****
--- 167,196 ----
  #define	F_RDLCK		1		/* shared or read lock */
  #define	F_UNLCK		2		/* unlock */
  #define	F_WRLCK		3		/* exclusive or write lock */
+ #ifndef _POSIX_SOURCE
+ /*
+  * The following lock types do NOT follow the completely stupid POSIX
+  * fcntl() semantics.  Locks are per file descriptor not per file, and
+  * you can request an exclusive lock on a file opened for read as well as
+  * a read lock on a file opened for write.
+  */
+ #define F_SANERDLCK	4		/* sane shared or read lock */
+ #define F_SANEUNLCK	5		/* unlock sane locks */
+ #define F_SANEWRLCK	6		/* sane exclusive or write lock */
+ 
+ /*
+  * These lock types are sane locks that fail if there is ANY lock in the region 
+  * they are locking that would conflict (ie: process conflicts with itself as
+  * well as other processes).
+  */
+ #define F_SANERDLCKNO	7		/* don't up/downgrade or merge locks */
+ #define F_SANEWRLCKNO	8
+ #endif
  #ifdef _KERNEL
  #define	F_WAIT		0x010		/* Wait until lock is granted */
  #define	F_FLOCK		0x020	 	/* Use flock(2) semantics for lock */
  #define	F_POSIX		0x040	 	/* Use POSIX semantics for lock */
+ #define F_NOOVRLP	0x080		/* Don't allow overlapping locks */
  #endif
  
  /*
--- sanelock.patch ends here ---


>Release-Note:
>Audit-Trail:

From: Stephen Hurd <shurd@sasktel.net>
To: freebsd-gnats-submit@FreeBSD.org, shurd@sasktel.net
Cc:  
Subject: Re: kern/50827: [PATCH] no sane record locking on *nix.  (More types
 needed)
Date: Thu, 24 Jun 2004 17:35:46 -0600

 Works with 5.2.1 sources.

From: "Simon L. Nielsen" <simon@FreeBSD.org>
To: Stephen Hurd <shurd@sasktel.net>
Cc: freebsd-gnats-submit@FreeBSD.org
Subject: Re: kern/50827: [PATCH] no sane record locking on *nix.  (More types needed)
Date: Fri, 25 Jun 2004 11:15:49 +0200

 On 2004.06.24 23:40:31 +0000, Stephen Hurd wrote:
 
 >  Works with 5.2.1 sources.
 
 Do you mean that the patches are not needed for 5.2.1 (so the PR can be
 closed) or that the patches work on 5.2.1?
 
 -- 
 Simon L. Nielsen
 

From: Stephen Hurd <shurd@sasktel.net>
To: freebsd-gnats-submit@FreeBSD.org, shurd@sasktel.net
Cc:  
Subject: Re: kern/50827: [PATCH] no sane record locking on *nix.  (More types
 needed)
Date: Wed, 30 Jun 2004 19:37:17 -0600

 Actually, I meant that the following patch works on 5.2.1  :-)
 
 --- sanelock.patch begins here ---
 diff -c /sys/kern.old/kern_descrip.c /sys/kern/kern_descrip.c
 *** /sys/kern.old/kern_descrip.c        Thu Jun 24 17:25:47 2004
 --- /sys/kern/kern_descrip.c    Thu Jun 24 18:22:09 2004
 ***************
 *** 380,385 ****
 --- 380,411 ----
                         error = VOP_ADVLOCK(vp, (caddr_t)p->p_leader, F_UNLCK,
                             flp, F_POSIX);
                         break;
 +               case F_SANEWRLCKNO:
 +                       flg |= F_NOOVRLP;
 +               case F_SANEWRLCK:
 +                       flp->l_type=F_WRLCK;
 +                       flg &= ~F_POSIX;
 +                       flg |= F_FLOCK;
 +                       fp->f_flag |= FHASLOCK;
 +                       error = VOP_ADVLOCK(vp, (caddr_t)fp, F_SETLK,
 +                           flp, flg);
 +                       break;
 +               case F_SANERDLCKNO:
 +                       flg |= F_NOOVRLP;
 +               case F_SANERDLCK:
 +                       flp->l_type=F_RDLCK;
 +                       flg &= ~F_POSIX;
 +                       flg |= F_FLOCK;
 +                       fp->f_flag |= FHASLOCK;
 +                       error = VOP_ADVLOCK(vp, (caddr_t)fp, F_SETLK,
 +                           flp, flg);
 +                       break;
 +               case F_SANEUNLCK:
 +                       flg &= ~F_POSIX;
 +                       flg |= F_FLOCK;
 +                       error = VOP_ADVLOCK(vp, (caddr_t)fp, F_UNLCK,
 +                               flp, F_FLOCK);
 +                       break;
                 default:
                         error = EINVAL;
                         break;
 Only in /sys/kern: kern_descrip.c.orig
 diff -c /sys/kern.old/kern_lockf.c /sys/kern/kern_lockf.c
 *** /sys/kern.old/kern_lockf.c  Thu Jun 24 17:25:48 2004
 --- /sys/kern/kern_lockf.c      Thu Jun 24 17:28:52 2004
 ***************
 *** 605,612 ****
         start = lock->lf_start;
         end = lock->lf_end;
         while (lf != NOLOCKF) {
 !               if (((type & SELF) && lf->lf_id != lock->lf_id) ||
 !                   ((type & OTHERS) && lf->lf_id == lock->lf_id)) {
                         *prev = &lf->lf_next;
                         *overlap = lf = lf->lf_next;
                         continue;
 --- 605,613 ----
         start = lock->lf_start;
         end = lock->lf_end;
         while (lf != NOLOCKF) {
 !               if  ((!(lock->lf_flags & F_NOOVRLP)) &&
 !                   (((type & SELF) && lf->lf_id != lock->lf_id) ||
 !                   ((type & OTHERS) && lf->lf_id == lock->lf_id))) {
                         *prev = &lf->lf_next;
                         *overlap = lf = lf->lf_next;
                         continue;
 diff -c /sys/sys.old/fcntl.h /sys/sys/fcntl.h
 *** /sys/sys.old/fcntl.h	Fri Apr 11 01:48:32 2003
 --- /sys/sys/fcntl.h	Fri Apr 11 01:48:11 2003
 ***************
 *** 167,176 ****
 --- 167,196 ----
   #define	F_RDLCK		1		/* shared or read lock */
   #define	F_UNLCK		2		/* unlock */
   #define	F_WRLCK		3		/* exclusive or write lock */
 + #ifndef _POSIX_SOURCE
 + /*
 +  * The following lock types do NOT follow the completely stupid POSIX
 +  * fcntl() semantics.  Locks are per file descriptor not per file, and
 +  * you can request an exclusive lock on a file opened for read as well as
 +  * a read lock on a file opened for write.
 +  */
 + #define F_SANERDLCK	4		/* sane shared or read lock */
 + #define F_SANEUNLCK	5		/* unlock sane locks */
 + #define F_SANEWRLCK	6		/* sane exclusive or write lock */
 + 
 + /*
 +  * These lock types are sane locks that fail if there is ANY lock in the region
 +  * they are locking that would conflict (ie: process conflicts with itself as
 +  * well as other processes).
 +  */
 + #define F_SANERDLCKNO	7		/* don't up/downgrade or merge locks */
 + #define F_SANEWRLCKNO	8
 + #endif
   #ifdef _KERNEL
   #define	F_WAIT		0x010		/* Wait until lock is granted */
   #define	F_FLOCK		0x020	 	/* Use flock(2) semantics for lock */
   #define	F_POSIX		0x040	 	/* Use POSIX semantics for lock */
 + #define F_NOOVRLP	0x080		/* Don't allow overlapping locks */
   #endif
   
   /*
 --- sanelock.patch ends here ---
State-Changed-From-To: open->suspended 
State-Changed-By: linimon 
State-Changed-When: Tue Oct 25 23:38:39 GMT 2005 
State-Changed-Why:  
Mark as 'suspended' since this does not seem as though it is being 
actively worked on. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=50827 
>Unformatted:
