[HN Gopher] LZAV - Fast In-Memory Data Compression Algorithm (In C)
___________________________________________________________________
LZAV - Fast In-Memory Data Compression Algorithm (In C)
Author : mmphosis
Score : 24 points
Date : 2023-07-20 00:39 UTC (22 hours ago)
(HTM) web link (github.com)
| someplaceguy wrote:
| What do you mean "in-memory"? Is there any other kind of data
| compression algorithm?
|
| How does it compare with zstd? I find that zstd, given the right
| compression level, can beat almost every other compression
| algorithm nowadays on at least one or two metrics, and very often
| on all metrics.
|
| For example, I just tried to compress the `enwik9` dataset with
| both lz4 and `zstd -1 --single-thread` and zstd was both faster
| on compression and decompression and also produced a 357 MB file
| compared to 509 MB with lz4. According to these results,
| apparently it would beat LZAV on every metric?
|
| That said, even though I used `--single-thread`, somehow zstd
| still used 150% of CPU according to `time` (?).
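A comparison like the one above can be reproduced in a few lines. The following is a minimal sketch only: it uses Python's standard `zlib` module as a stand-in codec (neither zstd, lz4, nor LZAV), the `measure` helper is a hypothetical name, and the repeated-sentence input is a placeholder rather than `enwik9`. It reports wall-clock time, so it says nothing about CPU utilisation.

```python
import time
import zlib


def measure(data: bytes, level: int) -> dict:
    """Time one compress/decompress round trip and report size and speed."""
    t0 = time.perf_counter()
    packed = zlib.compress(data, level)
    t1 = time.perf_counter()
    unpacked = zlib.decompress(packed)
    t2 = time.perf_counter()
    if unpacked != data:
        raise ValueError("round trip mismatch")
    return {
        "level": level,
        "compressed_bytes": len(packed),
        "ratio": len(data) / len(packed),   # higher means smaller output
        "compress_s": t1 - t0,              # wall-clock seconds
        "decompress_s": t2 - t1,            # wall-clock seconds
    }


if __name__ == "__main__":
    # Placeholder input; a real comparison would read a corpus file instead.
    sample = b"the quick brown fox jumps over the lazy dog\n" * 20000
    for lvl in (1, 6, 9):
        print(measure(sample, lvl))
```

Running each codec through the same harness on the same input keeps the size and speed figures comparable; a single run on one file is still only a rough indication.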
| mmphosis wrote:
| I don't know how practical it would be but an outside-of-memory
| data compression algorithm would be cool. There are no
| facilities provided to do anything but in-memory with lzav.
|
| I imagine there are better algorithms than lzav for speed and
| small compression sizes. What I like so far is that lzav is a
| single header file: lzav.h
|
| Terse code, simple.
| [deleted]
| someplaceguy wrote:
| > I don't know how practical it would be but an outside-of-memory
| data compression algorithm would be cool.
|
| What does "outside-of-memory" data compression algorithm
| mean, exactly?
|
| > What I like so far is that lzav is a single header file
|
| That's cool indeed.
| fluoridation wrote:
| > What does "outside-of-memory" data compression algorithm
| mean, exactly?
|
| You keep your data structures (dictionary, Huffman tree,
| etc.) in storage and stream them into memory only as
| necessary. Same for the data being compressed. You could
| use an extremely large dictionary while only using a few
| kilobytes of memory.
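The "same for the data being compressed" part of this idea can be sketched with a streaming codec. This is an illustration under stated assumptions, not LZAV: it uses Python's standard `zlib` module, whose sliding window still lives in memory, so it does not show an on-disk dictionary. The function names and the 4096-byte chunk size are hypothetical choices.

```python
import io
import zlib

CHUNK = 4096  # bytes of input held in memory per read; an arbitrary choice


def stream_compress(src, dst, chunk_size=CHUNK):
    """Compress from file-like src to file-like dst one chunk at a time."""
    comp = zlib.compressobj()
    while True:
        block = src.read(chunk_size)
        if not block:
            break
        dst.write(comp.compress(block))
    dst.write(comp.flush())  # emit whatever the compressor still buffers


def stream_decompress(src, dst, chunk_size=CHUNK):
    """Decompress chunk by chunk; per-call output size is not capped here."""
    decomp = zlib.decompressobj()
    while True:
        block = src.read(chunk_size)
        if not block:
            break
        dst.write(decomp.decompress(block))
    dst.write(decomp.flush())


if __name__ == "__main__":
    raw = io.BytesIO(b"hello world " * 50000)
    packed = io.BytesIO()
    stream_compress(raw, packed)
    print(len(raw.getvalue()), "->", len(packed.getvalue()))
```

Any object with `read` and `write` methods works here, so the same loop applies to files, sockets, or pipes; only one input chunk plus the codec's internal state is resident at a time.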
| mmphosis wrote:
| > What does "outside-of-memory" data compression algorithm
| mean, exactly?
|
| Not RAM or ROM. Buffered from a device: database cluster,
| disk blocks/files, network packets, serial port, ...
|
| To be complete, going the other way: ..., "in-cache", or
| "in-register" data compression
| kuroguro wrote:
| I think it means "not streaming" - i.e. the whole
| compressed/uncompressed buffer has to be held in memory.
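In that reading, an "in-memory" compressor is a buffer-to-buffer call. The sketch below shows the shape of such an interface using Python's standard `zlib` module as a stand-in; it is not LZAV's API. The helper names and the `expected_len` parameter are hypothetical, included on the assumption that the caller tracks the original size alongside the compressed bytes.

```python
import zlib


def compress_buffer(src: bytes) -> bytes:
    """One-shot: the entire input and the entire output live in memory."""
    return zlib.compress(src)


def decompress_buffer(packed: bytes, expected_len: int) -> bytes:
    """One-shot decompress that checks the caller-supplied original length."""
    out = zlib.decompress(packed)
    if len(out) != expected_len:
        raise ValueError("unexpected decompressed length")
    return out


if __name__ == "__main__":
    original = b"in-memory means whole buffers\n" * 1000
    packed = compress_buffer(original)
    restored = decompress_buffer(packed, len(original))
    print(len(original), len(packed), restored == original)
```

The trade-off is simplicity against memory: there is no state to carry between calls, but both the full input and the full output must fit in RAM at once.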
| grebnov wrote:
| The 'Silesia compression corpus' is the standard (albeit
| outdated) lossless data compression corpus. The English wiki data
| set does not provide a fully representative comparison, as it
| does not include other file types such as executables or
| structured data.
___________________________________________________________________
(page generated 2023-07-20 23:02 UTC)