Newsgroups: comp.compression
Path: utzoo!utgpu!watserv1!watmath!looking!brad
From: brad@looking.on.ca (Brad Templeton)
Subject: Re: Modeling vs encoding (Re: Lempel-Ziv v/s huffman encoding)
Organization: Looking Glass Software Ltd.
Date: Tue, 02 Apr 91 05:50:23 GMT
Message-ID: <1991Apr02.055023.27834@looking.on.ca>
References: <1991Mar31.000653.4598@zorch.SF-Bay.ORG> <12546@pt.cs.cmu.edu> <1991Apr01.053616.3665@looking.on.ca> <1991Apr1.191751.4211@nntp-server.caltech.edu>

If PKZIP does do that, I am at a loss to explain why it sends the full
tree in the file (unless it was being really forward-looking, which by and
large it was not). More to the point, why does it build a complete temp
file and do a full two-pass compress?

If you heard it from Katz himself, I believe it, but it seems odd that
it does the full two passes.

My two-pass compressor has a mode where it will scan the first N bytes of
the file and build tables on those, doing a one-pass compress from then on.
(You need such a mode to compress arbitrarily sized stdin, or you can use
the "retrain huffer" mode, which outputs new tables every N bytes.)
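
A rough sketch of the "build tables on the first N bytes" idea, in Python
rather than anything taken from my compressor. The function names are made
up for illustration, and seeding every byte value with a count of 1 (so that
bytes missing from the prefix still get a code) is an assumption of this
sketch, not a description of what any real program does. It only counts
output bits; it does not write a compressed stream.

```python
import heapq
from collections import Counter

def huffman_code_lengths(freqs):
    """Return {symbol: code length in bits} for a Huffman code over freqs."""
    if len(freqs) == 1:
        return {sym: 1 for sym in freqs}
    # Heap entries: (weight, unique tie breaker, {symbol: depth so far}).
    heap = [(w, i, {sym: 0}) for i, (sym, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, a = heapq.heappop(heap)
        w2, _, b = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: d + 1 for s, d in a.items()}
        merged.update({s: d + 1 for s, d in b.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]

def prefix_trained_bits(data, n):
    """Bits to code all of data with a table built from its first n bytes."""
    # Assumption: start each byte value at 1 so unseen bytes remain codable.
    freqs = Counter({b: 1 for b in range(256)})
    freqs.update(data[:n])
    lengths = huffman_code_lengths(freqs)
    return sum(lengths[b] for b in data)

def two_pass_bits(data):
    """Bits to code data with a table built from the whole input."""
    lengths = huffman_code_lengths(Counter(data))
    return sum(lengths[b] for b in data)
```

Comparing prefix_trained_bits(data, n) against two_pass_bits(data) on your
own files gives a feel for how much the one-pass shortcut costs. Neither
figure includes the size of the table itself.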

This semi-one-pass mode is only about one or two percent worse on
a typical large file, which is not bad. On large files that
vary in data composition, such as general tar files, the
retraining method can give the best compression.
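
To see why retraining can win on mixed data, here is a back-of-envelope
estimate in Python. It uses order-0 entropy as a stand-in for the real
Huffman-coded size (entropy is a lower bound on it), and TABLE_BITS is an
assumed flat cost for sending one table. Both are simplifications for
illustration, and the function names are hypothetical.

```python
import math
from collections import Counter

TABLE_BITS = 256 * 8  # assumed cost of one table: 8 bits per byte value

def entropy_bits(block):
    """Order-0 entropy of block in bits (lower bound on Huffman-coded size)."""
    counts = Counter(block)
    total = len(block)
    return -sum(c * math.log2(c / total) for c in counts.values())

def single_table_cost(data):
    """One table for the whole input, as in a plain two-pass compress."""
    return TABLE_BITS + entropy_bits(data)

def retrained_cost(data, n):
    """A fresh table for every n-byte block, each table paid for separately."""
    blocks = [data[i:i + n] for i in range(0, len(data), n)]
    return sum(TABLE_BITS + entropy_bits(b) for b in blocks)

# Two halves with disjoint byte values, like unrelated members of a tar file.
mixed = bytes(range(16)) * 4096 + bytes(range(128, 144)) * 4096
# The same amount of data with one composition throughout.
uniform = bytes(range(16)) * 8192
```

On the mixed input, retrained_cost(mixed, 65536) comes out below
single_table_cost(mixed), because each block's table fits its own half. On
the uniform input the extra table is pure overhead and the single table is
cheaper.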


BTW, I am looking for a good name for my compressor.  Current name is
"ISP" for the "Incredible Shrinking Program" but I think it sounds a bit
wimpy, and too much like LISP.

Other choices:
	ZAP		- but it may be too close to ZIP
	Scrunch/Scrinch
	YACS		- (Yet another compression synonym)
	Pressor		- the name of its inner subroutine


What is your pick?

-- 
Brad Templeton, ClariNet Communications Corp. -- Waterloo, Ontario 519/884-7473
