From freebsd@snark.ratmir.ru  Fri Apr 18 21:29:25 2003
Return-Path: <freebsd@snark.ratmir.ru>
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id C71BE37B401
	for <FreeBSD-gnats-submit@freebsd.org>; Fri, 18 Apr 2003 21:29:25 -0700 (PDT)
Received: from snark.ratmir.ru (snark.ratmir.ru [213.24.248.177])
	by mx1.FreeBSD.org (Postfix) with ESMTP id C8B6A43FBF
	for <FreeBSD-gnats-submit@freebsd.org>; Fri, 18 Apr 2003 21:29:24 -0700 (PDT)
	(envelope-from freebsd@snark.ratmir.ru)
Received: from snark.ratmir.ru (freebsd@localhost [127.0.0.1])
	by snark.ratmir.ru (8.12.9/8.12.9) with ESMTP id h3J4TMC2068895;
	Sat, 19 Apr 2003 08:29:22 +0400 (MSD)
	(envelope-from freebsd@snark.ratmir.ru)
Received: (from freebsd@localhost)
	by snark.ratmir.ru (8.12.9/8.12.9/Submit) id h3J4TMrW068894;
	Sat, 19 Apr 2003 08:29:22 +0400 (MSD)
Message-Id: <200304190429.h3J4TMrW068894@snark.ratmir.ru>
Date: Sat, 19 Apr 2003 08:29:22 +0400 (MSD)
From: Alex Semenyaka <alexs@snark.ratmir.ru>
Reply-To: Alex Semenyaka <alexs@snark.ratmir.ru>
To: FreeBSD-gnats-submit@freebsd.org
Cc: Alex Semenyaka <alexs@snark.ratmir.ru>
Subject: Control the cache size for pwd_mkdb to speedup vipw
X-Send-Pr-Version: 3.113
X-GNATS-Notify:

>Number:         51148
>Category:       bin
>Synopsis:       [patch] Control the cache size for pwd_mkdb(8) to speedup vipw
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          change-request
>Submitter-Id:   current-users
>Arrival-Date:   Fri Apr 18 21:30:11 PDT 2003
>Closed-Date:    
>Last-Modified:  Sat May 24 17:43:52 UTC 2008
>Originator:     Alex Semenyaka
>Release:        FreeBSD 4.8-RC i386
>Organization:
Ratmir
>Environment:

FIRST:
FreeBSD home.rinet.ru 4.8-RELEASE FreeBSD 4.8-RELEASE #2: Tue Apr  1 20:54:29 MSD 2003     marck@kucha.rinet.ru:/usr/obj/FreeBSD/src.stable-48/sys/home  i386

CPU: AMD Duron(tm) Processor  (601.37-MHz 686-class CPU)
  Origin = "AuthenticAMD"  Id = 0x630  Stepping = 0
  Features=0x183f9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR>
  AMD Features=0xc0440000<RSVD,AMIE,DSP,3DNow!>
real memory  = 134152192 (131008K bytes)
avail memory = 127193088 (124212K bytes)

SECOND:
FreeBSD snark.ratmir.ru 4.8-RC FreeBSD 4.8-RC #7: Sun Mar 30 07:23:48 MSD 2003 root@snark.ratmir.ru:/usr/obj/usr/src/sys/SNARK  i386

CPU: Intel Pentium III (1002.28-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0x68a  Stepping = 10
  Features=0x387f9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,PN,MMX,FXSR,SSE>
real memory  = 268435456 (262144K bytes)
avail memory = 256475136 (250464K bytes)


>Description:

Some FreeBSD boxes have really huge master.passwd files: there is no nss
under 4.x, so a lot of programs need every user to be listed there. The host
described as FIRST above stores the mailboxes and home pages of all users of
an ISP in Moscow, Russia. Its master.passwd holds more than 10000 lines.

pwd_mkdb has an option that sets the size of the cache used when the
database is opened (-s <cache size in megabytes>). It can significantly
speed up the database rebuild. But there is no way to use this option
through the pw_mkdb() function, which is shipped with vipw in 4.x and was
moved to libutil in -CURRENT. That function execs pwd_mkdb with the options
-p -d <dir>, and optionally with -u <user>. vipw (and maybe some other
programs in -CURRENT) simply calls pw_mkdb(). So vipw has no way to set the
cache size used by pwd_mkdb, although doing so can be very useful.
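
To make the limitation concrete, here is a minimal sketch of the argument
list described above. It is not the libutil or vipw source: the helper name
build_pwd_mkdb_argv() and its signature are hypothetical, chosen only for
illustration. The point is that nothing in the list carries a cache size, so
the child always runs with the built-in default.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (an assumption, not the real pw_mkdb() code):
 * fill argv[] with "pwd_mkdb -p -d <dir> [-u <user>] <file>".
 * argv must have room for at least 8 pointers.
 * Returns the number of arguments, not counting the NULL terminator.
 */
static int
build_pwd_mkdb_argv(const char *dir, const char *user, const char *file,
    const char *argv[])
{
	int n = 0;

	assert(dir != NULL && file != NULL);
	argv[n++] = "pwd_mkdb";
	argv[n++] = "-p";
	argv[n++] = "-d";
	argv[n++] = dir;
	if (user != NULL) {	/* -u is added only when a user is given */
		argv[n++] = "-u";
		argv[n++] = user;
	}
	argv[n++] = file;
	argv[n] = NULL;		/* exec-style lists end with NULL */
	return (n);
}
```

An exec'ed child inherits the environment of its parent unless the caller
replaces it, which is why an environment variable can reach pwd_mkdb without
any change to the callers.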

Measurements from the real host are given below (its "uname -a" output and
the dmesg part about the processor and memory are above, marked as FIRST).
I measured the time pwd_mkdb needs to rebuild the master.passwd database
with the same arguments that pw_mkdb() uses, plus the "-s" option. Here are
the results:

cache size (MB)		total time to run
passed with -s

   1			1m1.434s
   2			0m32.604s
   4			0m3.550s
   8			0m2.792s
   12			0m2.749s
   16			0m2.528s

bash-2.05b# wc -l master.passwd
   10021 master.passwd

The default cache size is 2M. As you can see, it can make pwd_mkdb run
very slowly.

The next table is for another computer (described above under the SECOND
label); here master.passwd was generated artificially:

cache size (MB)		total time to run
passed with -s

   2			2m54.342s
   4			2m18.354s
   8			0m39.987s
   12			0m10.734s
   16			0m7.535s
   24			0m7.469s
   32			0m7.728s

bash-2.05b# wc -l master.passwd
   37038 master.passwd

Now, even 8M is not enough.
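
The gain can be read off both tables as a ratio of run times. The sketch
below is only an illustration: the helper names time_to_seconds() and
speedup() are hypothetical, and the input is assumed to be in the
"<minutes>m<seconds>s" form printed by the bash "time" builtin, as in the
tables.

```c
#include <assert.h>
#include <stdio.h>

/* Convert a value such as "1m1.434s" to seconds; returns -1.0 on bad input. */
static double
time_to_seconds(const char *s)
{
	int minutes;
	double seconds;

	assert(s != NULL);
	if (sscanf(s, "%dm%lfs", &minutes, &seconds) != 2)
		return (-1.0);
	return (minutes * 60.0 + seconds);
}

/*
 * How many times faster the "fast" run is than the "slow" run.
 * Returns -1.0 if either value cannot be parsed or "fast" is not positive.
 */
static double
speedup(const char *slow, const char *fast)
{
	double t_slow = time_to_seconds(slow);
	double t_fast = time_to_seconds(fast);

	if (t_slow < 0.0 || t_fast <= 0.0)
		return (-1.0);
	return (t_slow / t_fast);
}
```

With the numbers above, going from 2M to 16M is roughly 13 times faster on
FIRST (0m32.604s against 0m2.528s) and roughly 23 times faster on SECOND
(2m54.342s against 0m7.535s).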

>How-To-Repeat:

Create a really big master.passwd file (10000, 20000 or more records) and run

#pwd_mkdb -p -s N -d /etc master.passwd

where N is 2 to 16.

>Fix:

I suggest passing the cache size through an environment variable, which I
called PWDB_CACHE. By default the behaviour of pwd_mkdb and of all programs
that inherit the problem does not change. But a system administrator who
needs to speed up the rebuild of the user database can just set PWDB_CACHE
to some value greater than 2 (2M is the built-in default) and enjoy. As on
the command line, the unit is the megabyte.
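
For comparison with the patch, here is a minimal sketch of a stricter way
to read such a variable. It is not part of the patch: the helper name
parse_cache_mb() is hypothetical, and it uses strtol() where the patch uses
atoi(), so a value such as "12M" is rejected instead of being read as 12.
The limits are passed in by the caller; the patch uses 2 and 16.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Parse a cache size in megabytes and clamp it to [min_mb, max_mb].
 * Returns min_mb when the text is NULL, empty, not a plain decimal
 * number, or too large to fit in a long.
 */
static long
parse_cache_mb(const char *text, long min_mb, long max_mb)
{
	char *end;
	long val;

	assert(min_mb <= max_mb);
	if (text == NULL || *text == '\0')
		return (min_mb);
	errno = 0;
	val = strtol(text, &end, 10);
	if (errno != 0 || end == text || *end != '\0')
		return (min_mb);	/* overflow, no digits, or trailing junk */
	if (val < min_mb)
		val = min_mb;
	if (val > max_mb)
		val = max_mb;
	return (val);
}
```

A caller could then write something like
parse_cache_mb(getenv("PWDB_CACHE"), 2, 16) * 1024 * 1024 to get the size in
bytes. Falling back to the minimum on bad input keeps the default behaviour
unchanged, which matches the intent described above.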

Here is the patch:

diff -r -u ../pwd_mkdb/pwd_mkdb.c ./pwd_mkdb.c
--- ../pwd_mkdb/pwd_mkdb.c	Fri Jul 12 01:16:52 2002
+++ ./pwd_mkdb.c	Sat Apr 19 07:31:52 2003
@@ -66,14 +66,17 @@
 #define	SECURE		2
 #define	PERM_INSECURE	(S_IRUSR|S_IWUSR|S_IRGRP|S_IROTH)
 #define	PERM_SECURE	(S_IRUSR|S_IWUSR)
+#define	MINCACHE	2
+#define	MAXCACHE	16
+#define	CACHESZ_ENV	"PWDB_CACHE"
 
 HASHINFO openinfo = {
-	4096,		/* bsize */
-	32,		/* ffactor */
-	256,		/* nelem */
-	2048 * 1024,	/* cachesize */
-	NULL,		/* hash() */
-	0		/* lorder */
+	4096,			/* bsize */
+	32,			/* ffactor */
+	256,			/* nelem */
+	MINCACHE * 1024 * 1024,	/* cachesize */
+	NULL,			/* hash() */
+	0			/* lorder */
 };
 
 static enum state { FILE_INSECURE, FILE_SECURE, FILE_ORIG } clean;
@@ -101,6 +104,8 @@
 	int ch, cnt, ypcnt, makeold, tfd, yp_enabled = 0;
 	unsigned int len;
 	int32_t pw_change, pw_expire;
+	int cache;
+	char *penv;
 	const char *t;
 	char *p;
 	char buf[MAX(MAXPATHLEN, LINE_MAX * 2)], tbuf[1024];
@@ -116,6 +121,17 @@
 	strcpy(prefix, _PATH_PWD);
 	makeold = 0;
 	username = NULL;
+
+	penv = getenv(CACHESZ_ENV);
+	if (penv != NULL) {
+		cache = atoi(penv);
+		if (cache < MINCACHE)
+			cache = MINCACHE;
+		if (cache > MAXCACHE)
+			cache = MAXCACHE;
+		openinfo.cachesize = cache * 1024 * 1024;
+	}
+
 	while ((ch = getopt(argc, argv, "Cd:ps:u:vN")) != -1)
 		switch(ch) {
 		case 'C':                       /* verify only */
>Release-Note:
>Audit-Trail:

From: Doug Barton <DougB@FreeBSD.org>
To: Alex Semenyaka <alexs@ratmir.ru>
Cc: freebsd-hackers@freebsd.org, freebsd-stable@freebsd.org,
	freebsd-gnats-submit@freebsd.org
Subject: Re: bin/51148 Slow vipw and fast pwd_mkdb
Date: Sun, 20 Apr 2003 14:11:34 -0700 (PDT)

 On Sat, 19 Apr 2003, Alex Semenyaka wrote:
 
 > Hello,
 >
 > could somebody comment on PR bin/51148? It is a suggestion on how to pass
 > a cache size value to pwd_mkdb when we are doing vipw or such.
 > It can give a great speed-up when master.passwd is really big (and
 > sometimes it is). An appropriate cache size can make the process 10 to
 > 100 or more times faster. I gave the results of measurements in that
 > problem report.
 
 Having been in a "mondo huge master.passwd file" situation myself, my
 comment is that people in this situation should probably not be relying on
 tools like vipw to manage their stuff. However, my feelings aren't strong
 enough to prompt me to close your PR, so good luck. :)
 
 Doug
 
 -- 
 
     This .signature sanitized for your protection

From: "Geoffrey C. Speicher" <geoff@speicher.org>
To: Doug Barton <DougB@freebsd.org>
Cc: Alex Semenyaka <alexs@ratmir.ru>, freebsd-hackers@freebsd.org,
	freebsd-gnats-submit@freebsd.org, freebsd-stable@freebsd.org
Subject: Re: bin/51148 Slow vipw and fast pwd_mkdb
Date: Sun, 20 Apr 2003 19:13:30 -0400 (EDT)

 On Sun, 20 Apr 2003, Doug Barton wrote:
 
 > On Sat, 19 Apr 2003, Alex Semenyaka wrote:
 > 
 > > Hello,
 > >
 > > could somebody comment on PR bin/51148? It is a suggestion on how to pass
 > > a cache size value to pwd_mkdb when we are doing vipw or such.
 > > It can give a great speed-up when master.passwd is really big (and
 > > sometimes it is). An appropriate cache size can make the process 10 to
 > > 100 or more times faster. I gave the results of measurements in that
 > > problem report.
 > 
 > Having been in a "mondo huge master.passwd file" situation myself, my
 > comment is that people in this situation should probably not be relying on
 > tools like vipw to manage their stuff. However, my feelings aren't strong
 > enough to prompt me to close your PR, so good luck. :)
 
 Are your feelings strong enough to commit bin/38676 so that people can use
 pw(8) safely and concurrently on mondo huge master.passwd files?  :)
 
 bin/23501 was supposed to have fixed the problem, but it doesn't appear to
 have been committed before it was closed.  bin/38676 is a revised/improved
 version of the original patch in bin/23501.
 
 Geoff
 

From: Alex Semenyaka <alexs@ratmir.ru>
To: Doug Barton <DougB@freebsd.org>
Cc: Alex Semenyaka <alexs@ratmir.ru>, freebsd-hackers@freebsd.org,
	freebsd-gnats-submit@freebsd.org, freebsd-stable@freebsd.org
Subject: Re: bin/51148 Slow vipw and fast pwd_mkdb
Date: Mon, 21 Apr 2003 14:49:46 +0400

 Hi,
 
 On Sun, Apr 20, 2003 at 02:11:34PM -0700, Doug Barton wrote:
 > Having been in a "mondo huge master.passwd file" situation myself, my
 > comment is that people in this situation should probably not be relying on
 > tools like vipw to manage their stuff.
 
 Actually, the ISP from which I took that huge master.passwd for the test
 has exactly such a corporate policy. The main ideas: 1) no automatic changes
 to master.passwd; 2) a visual check and fix of the users' data in
 master.passwd from time to time (heh, column from 'master.passwd's). It may
 be good, it may be bad, but it exists.
 
 > However, my feelings aren't strong
 > enough to prompt me to close your PR, so good luck. :)
 
 Thanks :)
 
 								SY, Alex
>Unformatted:
