Newsgroups: news.software.b
Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!caen!ox.com!ox.com!emv
From: emv@ox.com (Ed Vielmetti)
Subject: Re: egrep hazardrous to your system's cnews health
In-Reply-To: res@colnet.uucp's message of Sat, 11 May 1991 04:03:14 GMT
Message-ID: <EMV.91May12163548@poe.aa.ox.com>
Sender: usenet@ox.com (Usenet News Administrator)
Organization: OTA Limited Partnership, Ann Arbor MI.
References: <1991May11.040314.15393@colnet.uucp>
Date: Sun, 12 May 1991 20:35:51 GMT

> egrep and C news performance

there are a number of programs out there which go by the name of
egrep, perform roughly the same function, and whose performance
differs measurably depending on system speed, available memory, and
the nature of the data to be grep'd for.  

your best bet would be to get an egrep that uses the Henry Spencer
regexp library or one of its derivatives; that would probably be
substantially better than the egrep that shipped with your Unix PC.
I'll have to check on the pedigree of the GNU 'egrep'; it may also be
suited to your needs.

the other thing to note in the particular example is that it makes
very conservative assumptions (for portability considerations, no
doubt) about the behavior of your stock utilities.  you could save an
awk invocation if your wc supports the '-l' flag, since "wc -l" gives
the same line count as "wc | awk '{print $1}'" for modern versions of wc.
there may well be other constructs which you could characterize,
isolate, and recode for efficiency on your own system.
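
as a rough sketch of that substitution (the sample text and the
variable names n_awk and n_wc are made up for illustration, and this
is not taken from the C news scripts themselves), the two forms can be
checked against each other on your own system like so:

```shell
# hypothetical sample input: three lines of text
sample='one
two
three'

# portable form: take the first field of wc's multi-column output
n_awk=$(printf '%s\n' "$sample" | wc | awk '{print $1}')

# shorter form: ask wc for the line count only, saving the awk process
n_wc=$(printf '%s\n' "$sample" | wc -l)

# some wc implementations pad the count with leading blanks,
# so compare numerically rather than as strings
if [ "$n_awk" -eq "$n_wc" ]; then
    echo "same count: $n_awk"
fi
```

if a script uses the count as a string rather than a number, the
padding some versions of wc emit may still matter, which is one
plausible reason for the awk in the original.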

C news does not ship with its own versions of sh, awk, ls, egrep, wc,
tr, sed, and cat; it is assumed that the vendor versions will be good
enough to suffice.  the existing shell scripts have to work around
many known deficiencies in these programs to be
reasonably portable.  There is enough consistency in the construction
of C news shell idioms that it's well within reason to look at
methodically replacing them with special purpose programs or more
precisely tailored shell constructs; I'd expect that the UUNET funding
of further C news development will have some of this in mind.

-- 
Edward Vielmetti, vice president for research, MSEN Inc.  emv@msen.com

"(6) The Plan shall identify how agencies and departments can
collaborate to ... expand efforts to improve, document, and evaluate
unclassified public-domain software developed by federally-funded
researchers and other software, including federally-funded educational
and training software; "
			"High-Performance Computing Act of 1991, S. 272"

