[HN Gopher] Finally Understanding PNG (2017)
       ___________________________________________________________________
        
       Finally Understanding PNG (2017)
        
       Author : Asdrubalini
       Score  : 117 points
       Date   : 2021-05-26 08:42 UTC (14 hours ago)
        
 (HTM) web link (compress-or-die.com)
 (TXT) w3m dump (compress-or-die.com)
        
       | nayuki wrote:
       | This article talks about understanding the PNG pixel data and
       | compression, which is wonderful. Recently I made a tool to
       | understand another aspect of the PNG file format, which is the
       | full list of chunks, metadata chunks, and chunk fields.
       | https://www.nayuki.io/page/png-file-chunk-inspector
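        | The chunk layout itself is simple: a 4-byte big-endian length, a
        | 4-byte type, the data, then a CRC-32 over type plus data. A
        | minimal sketch of a walker in Python (stdlib only; the function
        | name is my own):
        | 
        |     import struct, zlib
        | 
        |     def walk_png_chunks(path):
        |         with open(path, "rb") as f:
        |             assert f.read(8) == b"\x89PNG\r\n\x1a\n"  # signature
        |             while True:
        |                 header = f.read(8)
        |                 if len(header) < 8:
        |                     break
        |                 length, ctype = struct.unpack(">I4s", header)
        |                 data = f.read(length)
        |                 crc, = struct.unpack(">I", f.read(4))
        |                 ok = crc == zlib.crc32(ctype + data)
        |                 print(ctype.decode("ascii"), length,
        |                       "CRC ok" if ok else "CRC BAD")
        |                 if ctype == b"IEND":
        |                     break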
        
       | acqq wrote:
        | I'm old enough that I feel it must be noted: "HDD" was _never_
        | used for diskettes, even the "high density" ones.
       | 
       | HDD:
       | 
       | https://en.wikipedia.org/wiki/Hard_disk_drive
       | 
       | Diskettes:
       | 
       | https://en.wikipedia.org/wiki/Floppy_disk#3+1%E2%81%842-inch...
       | 
       | I'm commenting "HDD" as written in: "1.44 MB. This is also pretty
       | much the size of an uncompressed BMP image (and incidentally the
       | size of a 3.5 " HDD ... for all who can still remember" in the
       | current text of the article.
        
         | dahart wrote:
          | Oh that's funny, I am also old enough, but I'd have sworn I
          | used floppy disks that had "HDD" on them. I think you're right
          | though. All the pics show "2HD" and the rarer "3HD". There
          | were disks with "High Density Diskette" printed on them, and
          | the initialisms "HD" and "DD" (double density). But I can't
          | find any examples of HDD outside of hard disks.
        
         | H8crilA wrote:
         | Low quality/tangential comment:
         | 
          | It was a _floppy_ disk, not a _hard_ disk. The magnetic surface
          | in an FDD disk is floppy; the magnetic surface in an HDD is
          | hard (rigid).
        
           | afavour wrote:
            | > The magnetic surface in an FDD disk is floppy; the magnetic
            | surface in an HDD is hard
           | 
           | Maybe I'm wrong but I was under the impression that a
           | "floppy" was called that because the original ones - 5.25",
           | _were_ floppy.
        
             | dahart wrote:
              | I think that is true, but it should be noted that the
              | _original_ ones were 8 inches. :)
             | https://en.wikipedia.org/wiki/Floppy_disk
        
             | pansa2 wrote:
              | > _the original ones - 5.25"_
             | 
             | Nitpick: The first floppy disks were 8" in size.
        
             | H8crilA wrote:
             | The 3.5" also had a floppy disk inside, only surrounded
             | with a plastic rigid case.
        
             | lonelygirl15a wrote:
             | "Floppy" refers to the media, not the packaging; some
             | (3.5") had harder plastic shells, but inside was the same
             | floppy plastic material. Hard disk platters were/are rigid
             | (metal or glass).
        
               | dahart wrote:
               | Yes that became true after 3.5 inch disks with hard
               | shells came out and were called "floppy", but before that
               | all the floppy disks had soft outer shells, and I believe
               | disks were named floppy for exactly that reason, not
               | because the inner media was floppy.
        
             | tym0 wrote:
             | Even the 3.5" were floppy inside, they just had a hard case
             | around it.
        
       | UglyToad wrote:
       | At some point I'm going to get round to adding heuristic filter
       | selection to my PNG library BigGustave [0]. I have a TODO note
       | for how to start implementing it.
       | 
        | Though to be honest, using a palette for colors tends to produce
        | a huge saving for a lot of image types (those with 256 or fewer
        | colors, so documents, not natural scenes), making filter
        | selection a small added bonus.
       | 
       | [0]:
       | https://github.com/EliotJones/BigGustave/blob/master/src/Big...
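        | For reference, a common starting point is the "minimum sum of
        | absolute differences" heuristic: filter each row every way,
        | treat the filtered bytes as signed, and keep the filter with the
        | smallest total magnitude. A sketch in Python covering two of the
        | five filters (Sub and Up; Average and Paeth omitted for brevity,
        | and bpp is bytes per pixel):
        | 
        |     def filt_sub(row, prev, bpp):
        |         # PNG Sub filter: subtract the byte bpp positions back
        |         return bytes((row[i] - (row[i - bpp] if i >= bpp else 0))
        |                      & 0xFF for i in range(len(row)))
        | 
        |     def filt_up(row, prev, bpp):
        |         # PNG Up filter: subtract the byte from the row above
        |         return bytes((row[i] - prev[i]) & 0xFF
        |                      for i in range(len(row)))
        | 
        |     def msad(filtered):
        |         # bytes as signed values; smaller sums deflate better
        |         return sum(b if b < 128 else 256 - b for b in filtered)
        | 
        |     def pick_filter(row, prev, bpp):
        |         candidates = {0: bytes(row),              # None
        |                       1: filt_sub(row, prev, bpp),
        |                       2: filt_up(row, prev, bpp)}
        |         best = min(candidates, key=lambda f: msad(candidates[f]))
        |         return best, candidates[best]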
        
       | ericpauley wrote:
        | Nit: it was somewhat disingenuous that the first two images
        | don't actually encode the same content, since noise was added
        | to the right image to accentuate the paletting effects. If the
        | image on the right didn't have noise, the effect would be far
        | less pronounced.
        
       | Hamuko wrote:
       | Fun PNG fact: the official pronunciation of "PNG" is actually
       | "ping": http://www.libpng.org/pub/png/spec/iso/index-
       | object.html#1Sc...
       | 
       | I imagine there won't be a mainstream culture war over it like
       | there is with GIF and JIF though.
        
       | drsopp wrote:
       | > A PNG is compressed in three steps: pre-filtering, dictionary-
       | based coding via LZ77, and finally entropy coding according to a
       | certain Mr. Huffman.
       | 
       | Huffman was a Doctor of Science.
       | 
       | Also, (2017).
       | 
        | Edit: Couldn't find OP's email. I don't have Twitter.
        
       | TonyTrapp wrote:
       | > These filters are the reason why the PNG is a nose ahead of the
       | GIF format and is therefore almost always smaller
       | 
       | Not really. I found that for many many many pictures, not using a
       | filter turns out to be the smallest option. It's really just that
       | DEFLATE is much better than LZW as used by GIF.
        
       | masklinn wrote:
       | > This is due to the fact that the specification of the PNG has a
       | sliding window of 32 kB for performance reasons.
       | 
       | It's a bit of a shortcut: in reality PNG uses DEFLATE, and it's
       | DEFLATE which has a 32kB window.
       | 
        | And it's not exactly performance reasons but memory budget,
        | especially as deflate was designed for _streaming_, both when
       | compressing and decompressing: when decompressing you need a
       | buffer at least as large as the window, and when compressing you
       | need that plus most likely an index of some sort (so you don't
       | have to look for matches linearly through the buffer).
       | 
        | This is more relevant with the context that Phil Katz created
        | deflate in 1991, when high-end desktops had reached high single-
        | digit megabytes of RAM but the low end might not even be there
        | (Windows 3.0 required 1MB to run decently).
       | 
        | By comparison, if memory serves zstd uses a window of 8 MB (or 16
        | MB depending on settings), though it's configurable anywhere from
        | 1 kB to 512 MB.
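        | The streaming property is easy to see with Python's zlib (a
        | sketch; wbits=15 is the 32 kB window, negative means raw
        | deflate): the decompressor can be fed arbitrarily small chunks
        | and only ever needs the last 32 kB of output as history.
        | 
        |     import zlib
        | 
        |     data = b"example payload " * 10_000
        |     c = zlib.compressobj(9, zlib.DEFLATED, -15)  # 2**15 window
        |     blob = c.compress(data) + c.flush()
        | 
        |     # stream-decompress in 4 kB slices; memory use stays bounded
        |     d = zlib.decompressobj(-15)
        |     out = b"".join(d.decompress(blob[i:i + 4096])
        |                    for i in range(0, len(blob), 4096))
        |     assert out + d.flush() == data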
        
         | userbinator wrote:
          | ...and 32k allows the buffer to fit into a single 64k segment
          | along with the Huffman tables, no doubt a useful property at
          | the time. Deflate is actually very similar to LZH, and there
          | are small extensions to both which make the window 64k; but in
          | my experience that doesn't provide all that much extra
          | compression (a few percent).
        
           | magicalhippo wrote:
           | > no doubt a useful property at the time
           | 
            | For reference: https://en.wikipedia.org/wiki/X86_memory_segmentation#Real_m...
        
         | jandrese wrote:
         | I find it amusing when reading the docs on compression programs
         | and seeing statements like "Compression level of -9 should be
         | used with caution as it can consume over 8 megabytes of main
         | memory".
        
       | blowski wrote:
       | I learned a lot from that, thank you.
       | 
       | I wonder how much bandwidth I've wasted for people over the years
       | because I've saved poorly optimised versions from Photoshop. Are
       | there automated tools that can be included in a CI pipeline to
       | help with this?
        
         | chris_overseas wrote:
         | Squoosh (https://github.com/GoogleChromeLabs/squoosh) gives
         | very impressive results in my (admittedly somewhat limited)
         | experience.
        
         | willis936 wrote:
          | For graphics with only a few colors you can save a few extra
          | bits by moving to lower bit depths. PNG explicitly supports an
          | 8-bit palette color mode that is pretty lean. You can even
          | specify fewer than 8 bpp if your palette is small enough for
          | its cardinality to fit within a lower power of 2.
         | 
         | I recall PNG8 not being well supported in some applications.
         | I've run into this with game engines, which I found strange
         | since asset size is an important thing to keep in mind.
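          | For instance, with Pillow (my choice of tool here, not from
          | the article) you can quantize to a small palette and let the
          | encoder drop to fewer bits per pixel:
          | 
          |     from PIL import Image
          | 
          |     img = Image.open("logo.png")        # hypothetical input
          |     # quantize to a 16-color palette; the PNG encoder can
          |     # then store the image at 4 bits per pixel
          |     pal = img.convert("RGB").quantize(colors=16)
          |     pal.save("logo-small.png", optimize=True)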
        
           | giantrobot wrote:
            | IE6 and below had no end of trouble with PNG transparency.
            | IIRC it supported transparency correctly only on PNG8
            | images, and only in img tags, not backgrounds, and ignored
            | the alpha channel entirely on PNG32.
           | 
           | There were some IE-specific hacks you could use with their
           | proprietary CSS extensions but they were really annoying. I
           | think IE7 finally fixed the transparency bugs but since there
           | was still a huge IE6 cohort it was still a crapshoot relying
           | on transparent PNGs.
        
         | ogurechny wrote:
          | I think the first step of any optimization strategy is to
          | understand what you are doing, and to state a goal that can
          | clearly be either reached or not. Size crunching on a couple of
          | images is fun; valiantly processing a sizable collection of
          | pictures with optipng makes you realize that your CPU(s) have
          | been glowing with heat for hours just to take 10 bytes off a
          | file that still fills the same number of disk sectors. That's
          | an even bigger waste of energy than mining.
         | 
          | One particular trick is running images with a visually low
          | number of distinct colors, like logos and screenshots, through
          | pngquant. You don't have to worry about whether the actual
          | number of colors is less than 256, how to posterize or filter
          | the image to decrease that number, or how to suppress
          | artificial colors from font anti-aliasing -- the tool measures
          | visual likeness, and provides the means to set an acceptable
          | limit so that it leaves unsuitable images intact. Obviously,
          | when you're dealing with long-term archival storage, or assets
          | for further editing like stock clipart, such presentation-
          | oriented lossy optimization should not be used.
         | 
          | Other automatic tools are referenced in the article, though you
          | should treat them as a nice sanity check that prevents images
          | holding raw uncompressed data, or simple two-color graphs saved
          | as 4-byte RGBA, from appearing in your project, and not as a
          | magical solution that will send it to the sky performance-wise.
         | 
          | Bandwidth (and loading time) is usually wasted on complex high-
          | resolution images like photos or combined graphics with
          | photographic details. These can't really be optimized well
          | enough in PNG with naive lossless or lossy methods, and have to
          | be converted to JPEG to reach a sane size, or regenerated as
          | independent layers that use different compression.
        
         | karim79 wrote:
         | The https://kraken.io service has an API, and there's a gulp
         | and grunt plugin and a bunch of other stuff here:
         | https://kraken.io/docs/integration-libraries
        
           | djxfade wrote:
           | Kraken never answers any support requests, and I have
           | experienced outages without any information. I would not rely
           | on them.
        
           | blowski wrote:
           | That looks perfect, except for it being a SaaS product with a
           | minimum monthly spend. Is there a similar pay-as-you-go
           | model, or ideally self-hosted?
        
             | karim79 wrote:
             | Good question - not at the moment, but we're evaluating the
             | idea. Perhaps the answer would be to just make the build
             | pipeline tools free for developers to a certain limit.
        
             | aargh_aargh wrote:
             | Wait, why not optipng [1], pngcrush [2] or any other of the
             | handful of locally running lossless PNG optimizers [3]?
              | [1] https://packages.debian.org/unstable/optipng
              | [2] https://packages.debian.org/unstable/pngcrush
              | [3] http://www.olegkikin.com/png_optimizers/
        
               | blowski wrote:
               | Do they do what the article is recommending? Genuine
               | question, I don't know the answer.
        
               | TonyTrapp wrote:
               | They all optimize PNGs in different ways. For some files
               | one tool is the best, for others another tool. There are
               | even meta-tools like PNGGauntlet that just brute-force
               | the compression with several of these tools (e.g. pngout
               | and optipng) at the same time. That way, you can be
               | relatively sure that your file will be close to being the
               | smallest possible.
        
               | sumtechguy wrote:
                | If you are in the Windows world, I have been using this:
                | https://sourceforge.net/projects/nikkhokkho/files/FileOptimi...
                | 
                | It contains all of those and just tries them all. I've
                | been meaning to pop them all out into a bash script to
                | automate some of my stuff on a cron job instead of using
                | a GUI.
        
         | virtue3 wrote:
         | I've always just used this (it's more old school but def
         | solid):
         | 
         | https://pmt.sourceforge.io/pngcrush/
        
       | sprash wrote:
        | An additional strategy for reducing PNG sizes losslessly, not
        | mentioned in the article, is brute-force optimization of the
        | DEFLATE stream (via zopflipng). This usually gives you a few
        | percent improvement for "free".
        
       | huachimingo wrote:
        | Pico-8 bases its cartridge format on a custom PNG.
        
       | userbinator wrote:
       | _Finding the optimal filter for each line is a science in itself,
       | which, of course, could be approached with brute computer power.
       | But since you probably do not want to wait a week in front of
       | your computer, until the PNG is finally calculated, the filters
       | are selected by experience values and assumptions._
       | 
        | That paragraph makes it sound like this article was written two
        | decades ago. pngcrush, which does brute-force, still takes only a
        | minute or two on average (less-than-screen-sized) files.
        
         | nayuki wrote:
          | PNG defines 5 possible filter methods per row: 0 = None, 1 =
          | Sub, 2 = Up, 3 = Average, 4 = Paeth. Brute-forcing the filter
          | search takes exponential time, Θ(5^imageHeight).
          | https://www.w3.org/TR/2003/REC-PNG-20031110/#9Filters
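          | A toy illustration of that blow-up in Python (here
          | filtered_rows[r][f] is row r pre-filtered with filter type f,
          | and each serialized row is its filter byte plus the data):
          | 
          |     import itertools, zlib
          | 
          |     def exhaustive_search(filtered_rows):
          |         best = None
          |         h = len(filtered_rows)
          |         for combo in itertools.product(range(5), repeat=h):
          |             stream = b"".join(bytes([f]) + filtered_rows[r][f]
          |                               for r, f in enumerate(combo))
          |             size = len(zlib.compress(stream, 9))
          |             if best is None or size < best[0]:
          |                 best = (size, combo)
          |         return best  # 5**h candidates: hopeless beyond tiny h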
        
         | tialaramex wrote:
          | pngcrush doesn't actually brute force, for _exactly the reason
          | the article gives_: it would take too long.
          | 
          | Instead, pngcrush uses yet more heuristics to guess what else
          | is likely to work, and then tries some of the other options
          | that wouldn't be picked by the defaults.
          | 
          | By default it only tries a few approaches that it deems _most
          | likely_ to work. Even with the "brute" parameter it does not
          | truly try everything; again, that would take too long, and
          | pngcrush "brute" is already very slow. It just tries all the
          | methods it would ever have used on any image.
         | 
          | Actual brute force of a relatively tiny 32x32 image would
          | require trying all five filters for each line _in combination_
          | with each such filter for each other line, or else you might
          | miss a combination which, astonishingly, was smaller even
          | though it wasn't immediately obvious why. I think that's 5
          | raised to the power 32 combinations (about 2.3 x 10^22).
        
           | magicalhippo wrote:
            | It could also use a look-ahead scheme. That is, cloning the
            | compression state, compressing the current line with each of
            | the different filters, retaining the state that led to the
            | highest compression, and discarding the others.
            | 
            | This could be trivially extended to multiple lines of look-
            | ahead, though at exponential memory and CPU cost.
            | 
            | Probably dog slow though.
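            | For what it's worth, zlib's compressor objects can be
            | cloned, so a one-row look-ahead is easy to sketch in Python
            | (candidates[f] is the row filtered with type f; the
            | flush-based size estimate is only approximate):
            | 
            |     import zlib
            | 
            |     def choose_row_filter(comp, candidates):
            |         best = None
            |         for f, data in enumerate(candidates):
            |             trial = comp.copy()   # clone compressor state
            |             size = len(trial.compress(bytes([f]) + data)
            |                        + trial.flush(zlib.Z_FULL_FLUSH))
            |             if best is None or size < best[0]:
            |                 best = (size, f, data)
            |         _, f, data = best
            |         comp.compress(bytes([f]) + data)  # advance stream
            |         return f
            | 
            |     comp = zlib.compressobj(9)
            |     # per row: f = choose_row_filter(comp, candidates)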
        
             | tialaramex wrote:
              | The simplest of these schemes is pretty fast, only five
              | times slower than the naive approach. _But_ it absolutely
              | doesn't give you the benefit of trying all the
              | combinations, and I expect it will produce worse results
              | than existing tools.
              | 
              | The problem is, while you know which outputs compressed
              | best _so far_, you can't be sure which will compress best
              | in combination with future outputs you haven't yet
              | filtered.
             | 
              | Imagine for example that one filter for the current row
              | gives what seems like a random sequence that doesn't
              | compress very well. Your proposed scheme rejects this
              | filter: it compressed poorly, so don't use it. But what if
              | it turns out that every future row can be compressed to the
              | _exact same sequence_ with some other filter? Deflate can
              | compress the rows _extremely well_ together, even though
              | the first one on its own did not compress.
        
               | userbinator wrote:
                | I haven't looked into the theory of it much, but wouldn't
                | the local minimum also be the global one?
                | 
                | In your example, even if the first row is more poorly
                | compressed with one filter, could future rows "catch up"
                | on the loss? Unless you have a specific counterexample, I
                | don't think it can, because the filtration operates
                | before compression and there's no feedback as such.
        
               | tialaramex wrote:
               | > I don't think it can, because the filtration operates
               | before compression and there's no feedback as such.
               | 
                | The compressor consumes the _entire_ filtered data. You
                | may have been thinking it goes something like
                | 
                |     Compress(Filter(row1, method=2)) .
                |     Compress(Filter(row2, method=4)) .
                |     Compress(Filter(row3, method=1)) and so on.
                | 
                | But that would be terribly wasteful. What they do is
                | actually closer to
                | 
                |     Compress(Filter(row1, method=2) .
                |              Filter(row2, method=4) .
                |              Filter(row3, method=1) and so on)
                | 
                | Does that make it clearer that if the _results_ of two
                | Filter steps are more or less similar, even if those
                | results compress badly on their own, it helps them
                | compress _better_ when they're both nearby?
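                | A quick way to convince yourself of this in Python: a
                | random row is nearly incompressible alone, but sixteen
                | copies in one deflate stream shrink almost back to the
                | size of one.
                | 
                |     import os, zlib
                | 
                |     row = os.urandom(64)  # "incompressible" row
                |     alone = len(zlib.compress(row, 9))
                |     both = len(zlib.compress(row * 16, 9))
                |     print(alone, both)    # both << 16 * alone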
        
               | magicalhippo wrote:
                | Obviously, local approaches can fail to find the global
                | minimum. This is somewhat mitigated, I imagine, by the
                | limited 32 kB window deflate has.
                | 
                | Would be a fun weekend project...
        
       | [deleted]
        
       ___________________________________________________________________
       (page generated 2021-05-26 23:02 UTC)