[HN Gopher] LZW and GIF explained
___________________________________________________________________
LZW and GIF explained
Author : networked
Score : 79 points
Date : 2024-05-24 07:43 UTC (1 days ago)
(HTM) web link (www.eecis.udel.edu)
(TXT) w3m dump (www.eecis.udel.edu)
| mjevans wrote:
| That's probably an OK writeup but seems focused on someone who's
| good at learning without doodles and pictures. A visual /
| kinesthetic variation would be nice, even if it's just ASCII art
| of the data patterns.
| duskwuff wrote:
| A comment in the HTML source says this page was last modified
| in 1997. I think I can forgive the author for not including
| diagrams.
| robinsonb5 wrote:
| Yeah, .gif files weren't around back then. ;)
| mjevans wrote:
| Surely PRE blocks existed in 1997. "even if it's just ASCII
| art of the data patterns."
| salamo wrote:
| I found out the other day that zstd supports dictionary
| compression [0]. It's a pretty neat concept: you can "train" a
| dictionary on your corpus of text, and use this dictionary to
| compress subsequent documents. This is really useful if you have
| a lot of small documents with a similar distribution of
| characters. You only have to transmit the dictionary once over
| the wire once, and then clients can get great compression ratios
| after that.
|
| For example, suppose you have a lot of chess PGNs you'd like to
| store [1]. If you compress each PGN separately, you won't get
| nearly as high of a compression ratio as if you stored them all
| in the same file and compressed that file. But storing games
| individually is more convenient. Here are the numbers:
|
| Original PGN file: 780 bytes
|
| Default zstd compression: 508 bytes (35% decrease)
|
| With dictionary: 376 bytes (52% decrease)
|
| I'm thinking about ways to compress at an even higher ratio.
| Chess PGNs are a great candidate for this task because they have
| a huge amount of redundant and repeated text.
|
| [0] https://news.ycombinator.com/item?id=35917279
|
| [1] https://en.wikipedia.org/wiki/Portable_Game_Notation
| kevingadd wrote:
| The shared dictionary technique you describe is presently used
| for modern versions of HTTPS and for Brotli, IIRC. Very
| effective.
| follower wrote:
| ...though with the slightly unexpected side effect (for
| Brotli, at least) that your executable may end up containing
| (~200KB, from memory) of _very_ unexpected[-1] plain text
| strings which might ( & has[0]) lead to questions from
| software end-users asking why your software contains
| "random"[1] text (including potentially "culturally
| sensitive" words/phrases related to religion such as "Holy
| Roman Emperor", "Muslims", "dollars", "emacs"[2] or similar).
|
| (I encountered this aspect while investigating potential size
| optimization opportunities for the Godot game engine's
| web/WASM builds--though presumably the Brotli dictionary
| compresses well if the transfer encoding is... Brotli. :D )
|
| ---- footnotes ----
|
| [-1] An aspect also previously mentioned on HN:
| https://news.ycombinator.com/item?id=27160590
|
| [0] "This needs to be reviewed immediately #876":
| https://github.com/google/brotli/issues/876
|
| [1] Which, regardless of meaning, certainly bears
| similarities to the type of "unexpected weird text"
| commonly/normally associated with spam, malware, LLMs and
| other entities of ill repute.
|
| [2] The final example may not actually be factual. :)
| RicoElectrico wrote:
| Just XOR it with 0x55 ;)
| follower wrote:
| "Pros always use this one weird trick Brotli compression
| hates." :)
| follower wrote:
| My personal favourite GIF-related explainer:
|
| * "What's In A GIF":
| https://www.matthewflickinger.com/lab/whatsinagif/
|
| It's a five part explanation that includes both great
| visualisations and bespoke analysis tools[2].
|
| Here's a direct link to the section on LZW compression[0]:
|
| * https://www.matthewflickinger.com/lab/whatsinagif/lzw_image_...
|
| [0] Which somewhat confusingly credits "John Barkaus's LZW and
| GIF Explained" which _actually_ turns out[1] to be... the text of
| "LZW and GIF explained----Steve Blackstock" via a news group post
| by John Barkaus (and is the text at OP's HN post URL).
|
| [1]
| http://web.archive.org/web/20050217131148/http://www.danbbs....
|
| [2] Including this _fantastic_ example of visualizing individual
| sections of a multi-part binary file format, the UX /UI of which
| I think would be a great addition to a "hex viewer" application:
| https://www.matthewflickinger.com/lab/whatsinagif/gif_explor...
___________________________________________________________________
(page generated 2024-05-25 23:02 UTC)