[HN Gopher] Zstandard RFC 8878
___________________________________________________________________
Zstandard RFC 8878
Author : itroot
Score : 63 points
Date : 2021-11-19 19:41 UTC (3 hours ago)
(HTM) web link (datatracker.ietf.org)
(TXT) w3m dump (datatracker.ietf.org)
| buryat wrote:
| i forgive facebook all their abuses just because they gave us
| zstd
| metafex wrote:
| they didn't though. zstd has been around even before the main
| dev joined fb, i distinctly remember it being under the person's
| personal github name.
| kzrdude wrote:
| You could read about lz4 and then later zstd on
| http://fastcompression.blogspot.com/ long before he joined
| facebook.
| oofbey wrote:
| I think you should read more about Facebook. Try e.g. the
| Damian Collins email dump, and read about how their android app
| tricked people into letting it record all phone call and text
| message records, knowing full well users would hate it if they
| found out.
|
| Clearly they produce good technology. But the company is
| morally bankrupt.
| Y_Y wrote:
| The PDF of the information I assume you're referring to
| is: https://www.parliament.uk/documents/commons-
| committees/cultu...
|
| Here is a quote from the summary, but I could not find where
| it was substantiated in the 250-page document:
|
| > Facebook knew that the changes to its policies on the
| Android mobile phone system, which enabled the Facebook app
| to collect a record of calls and texts sent by the user would
| be controversial. To mitigate any bad PR, Facebook planned to
| make it as hard of possible for users to know that this was
| one of the underlying features of the upgrade of their app.
| jeffbee wrote:
| Does this mean the Zstd magic number is now cast in stone?
| wmf wrote:
| The file format was finalized years ago, so yes.
| cornstalks wrote:
| It's an Informational RFC, not a Standards Track RFC
| (https://en.wikipedia.org/wiki/Request_for_Comments#Status).
| That said, I think the magic number is pretty firmly
| established.
| lifthrasiir wrote:
| You may have mistaken Brotli (whose file format has no magic
| number and prevents an easy identification) with Zstandard
| (whose file format does have defined magic numbers 28 B5 2F FD
| or [50-5F] 2A 4D 18).
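A minimal sketch of how the magic numbers quoted above could be checked, assuming only the byte sequences given in the comment (28 B5 2F FD for a Zstandard frame, [50-5F] 2A 4D 18 for a skippable frame) and reading them as little-endian 32-bit values. The function name `classify_frame` and its return strings are hypothetical, chosen for illustration.

```python
import struct

# The on-disk bytes 28 B5 2F FD, read as a little-endian 32-bit integer.
ZSTD_MAGIC = 0xFD2FB528
# The on-disk bytes [50-5F] 2A 4D 18: the low 4 bits of the first byte vary.
SKIPPABLE_BASE = 0x184D2A50
SKIPPABLE_MASK = 0xFFFFFFF0


def classify_frame(data: bytes) -> str:
    """Classify the first 4 bytes of `data` (hypothetical helper)."""
    if len(data) < 4:
        return "unknown"
    (magic,) = struct.unpack_from("<I", data, 0)
    if magic == ZSTD_MAGIC:
        return "zstd"
    if (magic & SKIPPABLE_MASK) == SKIPPABLE_BASE:
        return "skippable"
    return "unknown"
```

Something like `classify_frame(open(path, "rb").read(4))` would then be enough for a quick file-type check; this only inspects the first frame and does not validate the rest of the stream.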
| jeffbee wrote:
| No I'm not confused, it's just that the Zstd magic number has
| had _8 different values_ over the years, so I'm just
| wondering if we're past that yet.
| lifthrasiir wrote:
| Ah sure. The wire format has been fixed since 0.8.0
| (2016-08), so you must have seen a very early phase of
| development (which took one full year).
| stouset wrote:
| Can you shed some light on why this might be something of
| concern?
| ggm wrote:
| It's said to be a good fit for ZFS. I tend to use lz4 because it's
| baked into the older systems I use, but it may be at a point
| where my default should be zstd.
|
| bz2/gz still predominates for compressed objects in filestore
| from what I can see.
| thriftwy wrote:
| Zstandard has a very cool dictionary training feature, which lets
| you keep a separate dictionary and get a 50% compression ratio on
| very small (~100 B) but repetitive data such as database records.
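The idea of a separate dictionary for tiny records can be illustrated with a rough sketch. Note the substitution: this uses zlib's preset-dictionary support from the Python standard library as a stand-in, not Zstandard's dictionary training, and the dictionary here is written by hand rather than trained. The record layout, `SAMPLE_DICT`, and the helper names are all hypothetical.

```python
import zlib

# Hand-written stand-in for a trained dictionary (assumption: records
# share this JSON shape). A real zstd dictionary is built from samples.
SAMPLE_DICT = b'{"id": , "status": "active", "country": "US", "plan": "free"}'
RECORD = b'{"id": 48213, "status": "active", "country": "US", "plan": "free"}'


def deflate(data: bytes, zdict: bytes = b"") -> bytes:
    """Compress one small record, optionally priming with a dictionary."""
    if zdict:
        comp = zlib.compressobj(level=9, zdict=zdict)
    else:
        comp = zlib.compressobj(level=9)
    return comp.compress(data) + comp.flush()


def inflate(blob: bytes, zdict: bytes = b"") -> bytes:
    """Decompress; the same dictionary must be supplied if one was used."""
    decomp = zlib.decompressobj(zdict=zdict) if zdict else zlib.decompressobj()
    return decomp.decompress(blob) + decomp.flush()


plain = deflate(RECORD)
primed = deflate(RECORD, SAMPLE_DICT)
print(len(RECORD), len(plain), len(primed))
```

The dictionary is stored once and shared by compressor and decompressor, so each record only pays for what differs from it. That is the property the comment above describes; the actual ratios with a trained zstd dictionary will differ from this zlib sketch.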
| kzrdude wrote:
| by the way, zlib-ng also seems interesting. In the sense that
| it's cleaning up and improving a very aged library
| https://github.com/Dead2/zlib-ng
| vlovich123 wrote:
| Is there a reason zstd isn't popular for HTTP and only brotli and
| gzip see adoption?
| zinekeller wrote:
| Because Facebook doesn't have a browser.
|
| (But seriously, Mozilla engineers warned the Chrome team
| that they were rushing the inclusion of Brotli, since the
| compression wars were heating up. They still proceeded though,
| which is unsurprising.)
| lifthrasiir wrote:
| While that might well be one reason, it should also be noted
| that Zstandard optimizes for the decompression speed with a
| reasonable compression ratio while Brotli concentrates on the
| compression ratio at the slight expense of speed (though it
| is very hard to do a fair comparison). This is evident from
| their defaults, where zstd uses a fairly low level (3 out of
| -7..22) and Brotli uses the maximum level (11 out of 0..11).
| But both have decompression speeds far exceeding 100 MB/s,
| which is the practical limit for most Internet users, so
| zstd's higher decompression speed wouldn't matter much in the
| web context.
| lifthrasiir wrote:
| Brotli was arguably designed specifically for the web, because
| it was originally used in the WOFF2 font format and also has a
| large preset dictionary collected from the web
| (including HTML, CSS and JS fragments). Zstandard had no such
| consideration, and while it _could_ be as efficient as Brotli
| with the right dictionary, it has less merit than
| Brotli in the web context.
| duskwuff wrote:
| Brotli has some pretty wild optimizations for web content,
| including a gigantic (~120 KB) predefined dictionary packed
| full of sequences commonly used in HTML/JS/CSS content. This
| gives it a huge advantage on small text files.
| jhgb wrote:
| I assume it's because it's very new? That would seem like an
| obvious explanation.
| wolf550e wrote:
| zstd is from 2015.
| loeg wrote:
| For comparison, brotli is from 2013.
| erichocean wrote:
| Does Zstandard still have the junk Facebook license attached to
| it?
| lifthrasiir wrote:
| Not since 1.3.1.
| erichocean wrote:
| Thanks!
| felixhandte wrote:
| As someone on the Zstd team, I'm always happy to see it on HN!
| I'm curious though what motivates the submission?
| thewakalix wrote:
| Probably its use in elfshaker[0].
|
| [0] https://news.ycombinator.com/item?id=29277779
| kzrdude wrote:
| Zstd is always interesting.
|
| For many applications (file formats), ubiquity is important, so
| it would be fun if zstd becomes ubiquitous and can be relied on
| to be available. Let's say for example in future versions of
| HDF (HDF5 or later).
| m0zg wrote:
| Zstd is an amazing bit of work and all I ever use for data
| compression nowadays (or LZ4 when speed is even more critical).
| Several times the compression/decompression speed of gzip,
| approximately the same compression ratio with default settings.
|
| It's also supported by tar in recent Linux distros, if zstd is
| installed, so "tar acf blah.tar.zst *" works fine, and "tar xf
| blah.tar.zst" works automatically as well. Give it a try, folks,
| and retire gzip shortly afterwards.
| nigeltao wrote:
| > Several times the compression/decompression speed of gzip
|
| Just be careful that you're comparing against the best
| implementation of gzip. One recent re-implementation of zcat
| was 3.1x faster than /bin/zcat (and the CRC-32 implementation
| within was 7.3x faster than /bin/crc32). Both programs decode
| exactly the same file format. They're just different
| implementations. For details, see:
| https://nigeltao.github.io/blog/2021/fastest-safest-png-deco...
| mjevans wrote:
| I get why someone might want to avoid .zstd, but that is the
| short name offered for humans.
|
| Was .zs not sufficient if a file format ending in 'std' is so
| abhorrent?
| m0zg wrote:
| I'm not the one who came up with the extension. It just sort
| of organically happened I guess. I'd prefer "zstd" myself,
| but, frankly, "zst" is fine as well.
| diroussel wrote:
| Four letter extensions work really well for .java and
| .json. It seems strange to abbreviate zstd any further.
| m0zg wrote:
| I'd like to point out though that "zstd" is itself an
| abbreviation, and "zstandard" would be quite onerous.
___________________________________________________________________
(page generated 2021-11-19 23:01 UTC)