Newsgroups: comp.sources.wanted
Path: utzoo!utgpu!watserv1!watmath!kgdykes
From: kgdykes@watmath.waterloo.edu (Ken Dykes)
Subject: want to compress ascii text in strings not files
Organization: S.D.G. UofWaterloo
Date: Wed, 9 Jan 91 05:34:59 GMT
Message-ID: <1991Jan9.053459.13231@watmath.waterloo.edu>
Keywords: compression, strings, ascii text subset
Lines: 39

is there an effective algorithm (what is it? :-) or, even lazier, C source
for a function (pascal or 286 assembler can also be survived :-)
which will reduce the in-memory requirements for strings?

My application has several K of strings, in an ascii subset, used for messages
and diagnostics. (they vary from 1-2 chars long to about 90 chars in length,
   with an average length of probably 20-40 chars)
I am hoping to reduce its already large memory requirements at the expense
of a little CPU.

the bytes may be considered a 7bit ascii subset (alphabetics, digits,
punctuation, white-space, ie: no ctrl-chars or \0177 rubouts) (the ctrl-chars
are represented by C printf-like sequences such as \n for newline, \f for
formfeed, etc)
actually the sequence "\n" occurs in >50% of the strings, so perhaps
a special representation like ctrl-J could be used :-) :-)
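
a minimal sketch of that ctrl-J idea, assuming the strings are stored
NUL-terminated and that only \n, \f and \t need handling (the function name
collapse_escapes is hypothetical, made up for illustration). each two-char
escape is rewritten in place as the one control byte it stands for, so every
occurrence saves one byte:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: rewrite the two-character sequences \n, \f and \t
 * in place as the single control byte they stand for.  Any other backslash
 * pair (including an escaped backslash) is copied through unchanged.
 * Returns the new length, not counting the terminating NUL. */
size_t collapse_escapes(char *s)
{
    char *src = s;
    char *dst = s;

    assert(s != NULL);
    while (*src != '\0') {
        if (src[0] == '\\' && src[1] != '\0') {
            char c;
            switch (src[1]) {
            case 'n': c = '\n'; break;
            case 'f': c = '\f'; break;
            case 't': c = '\t'; break;
            default:  c = 0;    break;
            }
            if (c != 0) {
                *dst++ = c;          /* two chars become one */
            } else {
                *dst++ = src[0];     /* unknown pair: keep both chars so */
                *dst++ = src[1];     /* "\\" is not rescanned as an escape */
            }
            src += 2;
            continue;
        }
        *dst++ = *src++;
    }
    *dst = '\0';
    return (size_t)(dst - s);
}
```

the output never grows, so working in place is safe; whatever prints the
messages would then have to accept raw control bytes instead of expanding
the escapes itself.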

the in-memory byte capacity is actually 9 bits, not 8 (honeywell bull dps8
architecture, or if you prefer pretend it's DEC Tops-10 or Tops-20: words
of 36 bits (four 9bit bytes)),
although i would be interested in a more portable 8-bit, not 9-bit,
algorithm too.

i can think of some crude shift-&-pack algorithms, but am hoping someone
has something nifty & efficient.
If i can get a 33% space reduction without obscene amounts of cpu required
i will be ecstatic.
the simplest shift-2bits-&-pack would give me about 28% with a lot of
fairly expensive shift operations.
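
for the portable 8-bit case, here is a rough sketch of the kind of crude
shift-&-pack i mean. it is only an illustration, not a recommendation: the
names pack7 and unpack7 are made up, it assumes every char is below 128, and
it assumes the string length is stored separately. it drops the unused top
bit, so 8 chars fit in 7 bytes, which is only a 12.5% saving on 8-bit bytes:

```c
#include <assert.h>
#include <stddef.h>

/* Pack n 7-bit characters from src into dst, filling each 8-bit output
 * byte from the low bits up.  dst must have room for (7*n + 7) / 8 bytes.
 * Returns the number of bytes written. */
size_t pack7(const char *src, size_t n, unsigned char *dst)
{
    unsigned long acc = 0;   /* bit accumulator */
    int nbits = 0;           /* number of valid bits in acc */
    size_t out = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        unsigned c = (unsigned char)src[i];
        assert(c < 128);     /* assumption: 7-bit input only */
        acc |= (unsigned long)c << nbits;
        nbits += 7;
        while (nbits >= 8) {
            dst[out++] = (unsigned char)(acc & 0xFF);
            acc >>= 8;
            nbits -= 8;
        }
    }
    if (nbits > 0)           /* flush the final partial byte */
        dst[out++] = (unsigned char)(acc & 0xFF);
    return out;
}

/* Inverse of pack7: recover n characters from packed into dst.
 * Does not add a terminating NUL; the caller does that. */
void unpack7(const unsigned char *packed, size_t n, char *dst)
{
    unsigned long acc = 0;
    int nbits = 0;
    size_t in = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (nbits < 7) {     /* refill only when a full char is not ready */
            acc |= (unsigned long)packed[in++] << nbits;
            nbits += 8;
        }
        dst[i] = (char)(acc & 0x7F);
        acc >>= 7;
        nbits -= 7;
    }
}
```

every char costs a shift and a mask in each direction, which is exactly the
cpu expense i am worried about.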

actually i suppose an accurate description of a compress/decompress
algorithm would be preferred, come to think of it; it will after all be
a teeny part of what is essentially a commercial package and i wouldn't
want to be accused of stealing code :-)

   -ken, not at all versed in data compression
-- 
   - Ken Dykes, Software Development Group, UofWaterloo, Canada [43.47N 80.52W]
         kgdykes@watmath.waterloo.edu  [129.97.128.1]        watmath!kgdykes
         postmaster@watbun.waterloo.edu       B8 P6/6 s+ f+ m t w e r p
