[HN Gopher] LZAV - Fast In-Memory Data Compression Algorithm (In C)
       ___________________________________________________________________
        
       LZAV - Fast In-Memory Data Compression Algorithm (In C)
        
       Author : mmphosis
       Score  : 24 points
       Date   : 2023-07-20 00:39 UTC (22 hours ago)
        
 (HTM) web link (github.com)
        
       | someplaceguy wrote:
       | What do you mean "in-memory"? Is there any other kind of data
       | compression algorithm?
       | 
       | How does it compare with zstd? I find that zstd, given the right
       | compression level, can beat almost every other compression
       | algorithm nowadays on at least one or two metrics, and very often
       | on all metrics.
       | 
       | For example, I just tried to compress the `enwik9` dataset with
       | both lz4 and `zstd -1 --single-thread` and zstd was both faster
       | on compression and decompression and also produced a 357 MB file
       | compared to 509 MB with lz4. According to these results,
       | apparently it would beat LZAV on every metric?
       | 
       | That said, even though I used `--single-thread`, somehow zstd
       | still used 150% of CPU according to `time` (?).
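The three metrics in the comparison above (output size, compression time, decompression time) and the CPU percentage that `time` reports can be collected with a small harness. The following is a minimal sketch, not the commenter's setup: it uses Python's standard `zlib` as a stand-in codec because neither lz4, zstd, nor LZAV ships with Python, and the helper name `measure` is hypothetical. The CPU figure is process CPU time divided by wall-clock time, so a value above 100% means more than one thread was busy; the sketch does not explain why zstd showed 150%.

```python
import time
import zlib


def measure(name, compress, decompress, data):
    """Time one compress/decompress round trip and report basic metrics."""
    wall0, cpu0 = time.perf_counter(), time.process_time()
    packed = compress(data)
    wall1, cpu1 = time.perf_counter(), time.process_time()
    restored = decompress(packed)
    wall2 = time.perf_counter()

    if restored != data:
        raise RuntimeError(f"{name}: round trip did not restore the input")

    compress_wall = wall1 - wall0
    compress_cpu = cpu1 - cpu0  # user + system CPU time, summed over all threads
    return {
        "name": name,
        "ratio": len(packed) / len(data),  # smaller is better
        "compress_s": compress_wall,
        "decompress_s": wall2 - wall1,
        # CPU time / wall time, the same quantity `time` prints as a percentage.
        "cpu_percent": 100.0 * compress_cpu / compress_wall if compress_wall > 0 else float("nan"),
    }


if __name__ == "__main__":
    sample = b"An example sentence that repeats many times. " * 50_000
    for level in (1, 6, 9):
        result = measure(
            f"zlib -{level}",
            lambda d, lv=level: zlib.compress(d, lv),
            zlib.decompress,
            sample,
        )
        print(result)
```

A single run on one input says little; repeating each measurement and using realistic data would be needed before drawing conclusions like "beats it on every metric".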
        
         | mmphosis wrote:
         | I don't know how practical it would be but an outside-of-memory
         | data compression algorithm would be cool. There are no
         | facilities provided to do anything but in-memory with lzav.
         | 
         | I imagine there are better algorithms than lzav for speed and
         | small compression sizes. What I like so far is that lzav is a
         | single header file: lzav.h
         | 
         | Terse code, simple.
        
           | someplaceguy wrote:
           | > I don't know how practical it would be but an
           | outside-of-memory data compression algorithm would be cool.
           | 
           | What does "outside-of-memory" data compression algorithm
           | mean, exactly?
           | 
           | > What I like so far is that lzav is a single header file
           | 
           | That's cool indeed.
        
             | fluoridation wrote:
             | > What does "outside-of-memory" data compression algorithm
             | mean, exactly?
             | 
             | You keep your data structures (dictionary, Huffman tree,
             | etc.) in storage and stream them into memory only as
             | necessary. Same for the data being compressed. You could
             | use an extremely large dictionary while only using a few
             | kilobytes of memory.
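One way to picture the idea in the comment above is an LZ77-style decoder whose "dictionary" (the already-decoded output) lives in a file rather than in RAM. The sketch below is only an illustration under stated assumptions: the token format (a `bytes` literal or a `(distance, length)` pair) is hypothetical and is not LZAV's format, and the function names are made up for this example. The decoder resolves each back-reference by seeking in the output file, so its memory use does not grow with the size of the data.

```python
import io
import tempfile


def decode_out_of_core(tokens, out):
    """Decode hypothetical LZ77-style tokens into `out`.

    `out` must be a seekable binary file opened for reading and writing.
    A token is either a bytes literal or a (distance, length) back-reference
    into the output written so far. Returns the total output length.
    """
    end = out.seek(0, io.SEEK_END)
    for tok in tokens:
        if isinstance(tok, bytes):
            out.seek(end)
            out.write(tok)
            end += len(tok)
            continue

        distance, length = tok
        if distance <= 0 or distance > end:
            raise ValueError("back-reference points outside the output")
        # Copy byte by byte so overlapping matches (distance < length) work.
        for _ in range(length):
            out.seek(end - distance)
            byte = out.read(1)
            out.seek(end)
            out.write(byte)
            end += 1
    return end


def decode_to_bytes(tokens):
    """Convenience wrapper: decode via a temporary file and return the result."""
    with tempfile.TemporaryFile() as f:
        decode_out_of_core(tokens, f)
        f.seek(0)
        return f.read()


if __name__ == "__main__":
    print(decode_to_bytes([b"abc", (3, 6), b"!"]))
```

The per-byte seeks make this slow; a practical design would buffer reads and writes, trading a bounded amount of memory for far fewer I/O calls.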
        
             | mmphosis wrote:
             | > What does "outside-of-memory" data compression algorithm
             | mean, exactly?
             | 
             | Not RAM or ROM. Buffered from a device: database cluster,
             | disk blocks/files, network packets, serial port, ...
             | 
             | To be complete, going the other way: ..., "in-cache", or
             | "in-register" data compression
        
         | kuroguro wrote:
         | I think it means "not streaming", i.e. the whole
         | compressed/uncompressed buffer has to be held in memory.
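The distinction drawn here between "in-memory" and "streaming" can be shown with a codec that offers both interfaces. The sketch below uses Python's standard `zlib` purely as a stand-in; it is not LZAV, which according to the thread only provides the in-memory style. The one-shot call needs the whole input and the whole output as single buffers, while the streaming version handles one chunk at a time. The function names are illustrative.

```python
import zlib


def compress_one_shot(data: bytes) -> bytes:
    """In-memory style: the full input and full output exist as single buffers."""
    return zlib.compress(data)


def compress_stream(chunks):
    """Streaming style: consume an iterable of chunks, yield compressed pieces."""
    comp = zlib.compressobj()
    for chunk in chunks:
        piece = comp.compress(chunk)
        if piece:
            yield piece
    yield comp.flush()  # emit whatever the compressor still holds


def decompress_stream(chunks):
    """Streaming decompression of pieces produced by compress_stream."""
    decomp = zlib.decompressobj()
    for chunk in chunks:
        piece = decomp.decompress(chunk)
        if piece:
            yield piece
    tail = decomp.flush()
    if tail:
        yield tail


def split(data: bytes, size: int):
    """Yield `data` in consecutive slices of at most `size` bytes."""
    for start in range(0, len(data), size):
        yield data[start:start + size]


if __name__ == "__main__":
    data = b"hello world " * 10_000
    streamed = b"".join(decompress_stream(compress_stream(split(data, 4096))))
    print(streamed == data, zlib.decompress(compress_one_shot(data)) == data)
```

In real use the chunks would come from a file, socket, or other device instead of a `bytes` object that is already in memory.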
        
       | grebnov wrote:
       | The 'Silesia compression corpus' is the standard (albeit
       | outdated) lossless data compression corpus. The English wiki data
       | set does not provide a fully representative comparison, as it
       | does not include other file types such as executables or
       | structured data.
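The point about representativeness can be illustrated by compressing different kinds of data with the same codec. The following is a rough sketch using synthetic inputs and Python's standard `zlib`; the samples are invented for this example and are not taken from the Silesia corpus or from `enwik9`, so only the spread between them is meaningful, not the absolute numbers.

```python
import os
import struct
import zlib


def ratio(data: bytes) -> float:
    """Compressed size divided by original size (smaller is better)."""
    return len(zlib.compress(data, 6)) / len(data)


def make_samples():
    """Build three synthetic inputs with very different redundancy."""
    return {
        # Highly repetitive text, far more regular than real prose.
        "text": b"the quick brown fox jumps over the lazy dog. " * 2000,
        # Fixed-width binary records: a counter, a small field, and padding.
        "structured": b"".join(
            struct.pack("<IHH", i, i % 7, 0) for i in range(10_000)
        ),
        # Random bytes, which a lossless compressor cannot shrink.
        "random": os.urandom(90_000),
    }


if __name__ == "__main__":
    for name, data in make_samples().items():
        print(f"{name:>10}: {len(data)} bytes, ratio {ratio(data):.3f}")
```

A benchmark limited to one of these categories would rank codecs by how they handle that category alone, which is why mixed corpora are used for comparisons.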
        
       ___________________________________________________________________
       (page generated 2023-07-20 23:02 UTC)