[HN Gopher] John Carmack on JPEG
       ___________________________________________________________________
        
       John Carmack on JPEG
        
       Author : tosh
       Score  : 85 points
        Date   : 2021-06-04 21:43 UTC (1 hour ago)
        
 (HTM) web link (twitter.com)
 (TXT) w3m dump (twitter.com)
        
       | TazeTSchnitzel wrote:
       | > You can do [direct use of YUV] today, but you need to do the
       | color conversion manually in a shader
       | 
       | Many mobile GPUs support an extension that does YUV conversion
       | for you (GL_EXT_YUV_target). Maybe it's of less interest to
       | desktop GPU vendors?
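        | 
        | A minimal sketch of the per-pixel math such a shader (or the
        | extension) performs, assuming the full-range BT.601
        | coefficients JPEG/JFIF normally uses (the function name is
        | just illustrative):
        | 
        |     # Full-range BT.601 Y'CbCr -> R'G'B' (8-bit), the
        |     # conversion a YUV-sampling shader applies per pixel.
        |     def ycbcr_to_rgb(y, cb, cr):
        |         r = y + 1.402 * (cr - 128)
        |         g = (y - 0.344136 * (cb - 128)
        |                - 0.714136 * (cr - 128))
        |         b = y + 1.772 * (cb - 128)
        |         clamp = lambda v: max(0, min(255, round(v)))
        |         return clamp(r), clamp(g), clamp(b)
        | 
        |     print(ycbcr_to_rgb(128, 128, 128))  # grey stays grey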
        
         | jayd16 wrote:
         | I think the problem is moving the dev flow from layered bitmaps
         | to a shaded flow.
        
       | pornel wrote:
       | IIRC this optimization has already been used by Opera Mobile and
       | Edge in the past: upload YUV to the GPU RAM, and then convert
       | pixels to RGB on the fly when displaying. Unfortunately, I can't
       | find the links to their respective blog posts (<shakes fist at
        | Google substituting all keywords with their synonyms and
       | searching recent pages only>).
       | 
       | However, chroma subsampling is a very primitive form of 2x2 block
       | compression (NB: not related to JPEG's DCT block size). These
       | days GPUs support much better compression natively, with 4x4
       | blocks, and much fancier modes. With clever encoding of such
       | textures, it's even possible to achieve compression ratio
       | comparable with JPEG's, while having a straightforward way to
       | convert the compressed file to the compressed texture format.
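        | 
        | Back-of-the-envelope storage rates, assuming 8-bit channels
        | (the BC1/BC7 figures are just nominal block sizes, picked as
        | examples of the 4x4 GPU formats meant above):
        | 
        |     # Bits per pixel for a few raw/compressed layouts.
        |     rgba8888 = 32                 # 4 bytes per pixel
        |     yuv420 = (4*8 + 8 + 8) / 4    # 2x2 block: 4 Y + 1 Cb + 1 Cr
        |     bc1 = 64 / 16                 # 8-byte block per 4x4 pixels
        |     bc7 = 128 / 16                # 16-byte block per 4x4 pixels
        |     print(rgba8888, yuv420, bc1, bc7)  # 32 12.0 4.0 8.0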
        
       | WalterGR wrote:
        | I feel daft, but:
        | 
        |   The vast majority of images are jpegs, which are internally
        |   420 YUV, but they get converted to 32 bit RGB for use in
        |   apps. Using native YUV formats would save half the memory
        |   and rendering bandwidth, speed loading, and provide a tiny
        |   quality improvement.
       | 
       | What does he mean by _using_ native YUV formats? Something (I
       | wave my hand) in the rendering pipeline from the JPEG in memory
       | to pixels on the screen?
        
         | PragmaticPulp wrote:
         | > What does he mean by using native YUV formats?
         | 
         | Your display uses 3 bytes per pixel. 8 bits for each of the R,
         | G, and B channels. This is known as RGB888. (Ignoring the A or
         | alpha transparency channel for now).
         | 
         | YUV420 uses chroma subsampling, which means the color
         | information is stored at a lower resolution than the brightness
         | information. Groups of 4 pixels will have the same color, but
         | each pixel can have a different brightness. Our eyes are more
         | sensitive to brightness changes than color changes, so this is
         | usually unnoticeable.
         | 
          | This is very advantageous for compression: YUV420 requires
          | only 6 bytes per 4 pixels, or 1.5 bytes per pixel, because
          | groups of pixels share a single color value. That's half as
          | many bytes as RGB888.
         | 
         | When you decompress a JPEG, you first get a YUV420 output.
         | Converting from YUV420 to RGB888 doesn't add any information,
         | but it doubles the number of bits used to represent the image
         | because it stores the color value for every individual pixel
         | instead of groups of pixels. This is easier to manipulate in
         | software, but it takes twice as much memory to store and twice
         | as much bandwidth to move around relative to YUV420.
         | 
         | The idea is that if your application can work with YUV420
         | through the render pipeline and then let a GPU shader do the
         | final conversion to RGB888 within the GPU, you cut your memory
         | and bandwidth requirements in half at the expense of additional
         | code complexity.
         | 
         | Wikipedia is a good source of diagrams and details that explain
         | this further:
         | https://en.wikipedia.org/wiki/YUV#Y%E2%80%B2UV420p_(and_Y%E2...
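          | 
          | The savings are easy to put numbers on; a sketch for a
          | hypothetical 4000x3000 photo:
          | 
          |     # Decoded size of a 4000x3000 image (illustrative).
          |     w, h = 4000, 3000
          |     rgba = w * h * 4                   # RGBA8888
          |     rgb = w * h * 3                    # RGB888
          |     yuv = w * h + 2 * (w//2) * (h//2)  # YUV420
          |     print(rgba / 2**20, rgb / 2**20, yuv / 2**20)
          |     # ~45.8 MiB vs ~34.3 MiB vs ~17.2 MiB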
        
           | ranman wrote:
           | Thanks, this is a very clear explanation.
        
         | vlovich123 wrote:
         | Yup. You typically first decode the JPEG into a raw image
         | format in GPU memory & then that gets transferred to scanout HW
         | that actually sends it to the display (e.g. via HDMI). The
          | argument feels a bit weak though. At that point, why not just
          | use JPEG through your entire pipeline & save even more
          | bandwidth? Of course you have to decompress if you're
          | modifying the image in any way, so it doesn't help there.
        
           | edflsafoiewq wrote:
           | You can't texel-fetch directly from JPEG data, it has to be
           | decompressed first.
        
             | vlovich123 wrote:
              | Yes of course. GPUs do already support compressed
              | textures that get decompressed on the fly, though. Of
              | course JPEG has a lot of baggage that actually makes it
              | a poor format for this, but perhaps newer formats might
              | be useful. What you lose in "optimality" (compression
              | schemes designed for GPUs are better for games), you
              | win in "generality" (the ability to use this outside
              | games to, for example, improve web browsing battery
              | life).
        
               | mrec wrote:
               | There's a very hard distinction between GPU texture
               | compression formats and general image compression formats
               | - the former need to support random access lookup in
               | constant time. Anything meeting that criterion is not
               | going to be generally competitive; it's like the
               | difference between UTF-32 strings and UTF-8 strings.
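                | 
                | That random access is concrete: in a fixed-rate
                | 4x4 format like BC1 the block holding any texel
                | sits at a directly computable offset. A sketch,
                | assuming 8-byte blocks laid out row-major, no mip
                | levels or swizzling:
                | 
                |     # Constant-time block lookup in BC1.
                |     def bc1_block_offset(x, y, width):
                |         blocks_per_row = (width + 3) // 4
                |         block = (y // 4) * blocks_per_row + x // 4
                |         return block * 8  # 8 bytes per 4x4 block
                | 
                |     print(bc1_block_offset(100, 50, 1024))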
        
         | david-gpu wrote:
         | You could do the YUV to RGB conversion operation in your pixel
         | shaders in the GPU. That way you save a bit of bandwidth
         | compared to uncompressed RGB.
         | 
         | It's been done. There are even GPUs that support this operation
         | natively, so there's no additional overhead.
        
         | mindfulplay wrote:
         | I would imagine a shader that converts YUV to ARGB at the time
         | of rendering as opposed to storing it all the way along the
         | pipeline as 32 bit integers.
         | 
         | It's a bit tricky because rendering pipelines composite the
         | final image through many layers of offscreen compositing before
         | the pixel hits the screen.
         | 
          | The core issue is that the offscreen composited layers would
          | still be 32-bit textures, which is the bigger problem.
          | Perhaps a Skia-based draw list that carries this through the
          | pipeline could help preserve it.
        
       | cornstalks wrote:
       | One of the challenges with Y'CbCr (what Carmack is calling "YUV")
       | is that there are so many flavors.
       | 
       | He mentions 4:2:0 chroma subsampling. But he doesn't mention
        | chroma siting. Or alternative subsampling schemes. Or matrix
       | coefficients. Or full-range vs video-range (a.k.a. JPEG vs MPEG
       | range). Heck, how you even arrange subsampled data varies by
       | system (many libraries like planar; Apple likes bi-planar; etc.).
       | 
       | I'd love to see more support for rendering subsampled Y'CbCr
       | formats so you don't have to use so much RAM, but it gets
       | complicated quick.
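        | 
        | One example of the variation: full-range and video-range luma
        | differ only by a scale and offset, so mixing them up shifts
        | and flattens the whole image. A sketch for 8-bit luma:
        | 
        |     # JPEG uses full range (0..255); video range is 16..235.
        |     def full_to_video_luma(y):
        |         return 16 + y * 219 / 255
        | 
        |     def video_to_full_luma(y):
        |         return (y - 16) * 255 / 219
        | 
        |     print(full_to_video_luma(255))  # 235.0
        |     print(video_to_full_luma(16))   # 0.0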
        
         | jayd16 wrote:
         | Is this a real problem? Surely something in the current
         | pipeline understands what YUV flavor JPEG uses.
        
       | modeless wrote:
       | This would save memory but add compute cost if the image is drawn
       | to the framebuffer or another sRGB buffer more than once or
       | twice. It wouldn't necessarily be a win to make this behavior the
       | default everywhere.
       | 
       | In the case of web browsers it depends how the image is used. An
       | <img> with a large JPEG is probably drawn only once during tile
       | rasterization, and browsers could certainly use the memory
       | savings, so it would probably be a win. But if you had a small
       | JPEG used as a page background and tiled over the whole screen,
       | the memory savings would be small and you'd be wasting power
       | converting the same pixels from YUV to sRGB over and over, so
       | that would likely be a loss.
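        | 
        | Rough numbers for that tradeoff (purely illustrative: a 12 MP
        | photo drawn once vs a 64x64 tile repainted over a 1080p
        | screen every frame):
        | 
        |     # Memory saved is fixed per image; conversion work scales
        |     # with the destination pixels shaded from the YUV source.
        |     photo_px, tile_px = 4000 * 3000, 64 * 64
        |     screen_px = 1920 * 1080
        |     saved_photo = photo_px * (4 - 1.5) / 2**20  # ~28.6 MiB
        |     saved_tile = tile_px * (4 - 1.5) / 2**10    # ~10 KiB
        |     print(saved_photo, saved_tile, screen_px)
        |     # big one-time win vs tiny win plus ~2.07M
        |     # YUV->RGB conversions every frame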
        
         | raphlinus wrote:
         | That is not _necessarily_ true. In some cases, the compositor
            | may be configured to sample the input buffer directly from
         | YUV420, applying the transformation on scanout and thereby
         | saving memory bandwidth to read from the framebuffer. This
         | makes a tremendous amount of sense when the source is a video,
         | but much less sense when the source may be rendered vector
         | graphics, which generally look pretty bad with subsampled
         | chroma.
        
           | modeless wrote:
           | Sure, if you can use a YUV framebuffer then that can save
           | memory bandwidth during scanout (though the conversion to RGB
           | still happens because the screen is not YUV). But that
           | doesn't apply in the tiled page background case I mentioned,
           | as web page contents are composited in RGB.
        
       | lilyball wrote:
       | I am fascinated by his use of backslash-escaping the tweet EOD to
       | signify that it's being followed by another one.
        
         | supernintendo wrote:
          | I suppose it takes fewer characters than marking the ordinal
          | of the tweet, and it's probably familiar to Carmack's
          | technically-minded audience as escaping the return character
          | in a command line (to add a new line to the command instead
          | of running it).
        
       | hatsunearu wrote:
       | Maybe I'm too stupid to understand this but AFAIK YUV isn't
       | exactly linear so you still need to convert to linear space
       | (trichromatic linear color space like RGB/XYZ)... no?
        
         | david-gpu wrote:
         | YUV is a linear transformation of RGB to the best of my
         | knowledge. Wikipedia seems to agree.
        
           | jacobolus wrote:
            | JPEG uses Y'CbCr, not YUV per se. The matrix transformation
           | is applied to gamma-encoded R'G'B' (typically sRGB), not
           | linear RGB.
           | 
           | http://poynton.ca/PDFs/YUV_and_luminance_harmful.pdf
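            | 
            | A sketch of the ordering the Poynton note is about
            | (assuming the sRGB transfer curve and BT.601 luma
            | weights): the weights are applied to gamma-encoded
            | values, so Y' is not proportional to linear luminance.
            | 
            |     # sRGB encode (linear -> gamma), then BT.601 luma.
            |     def srgb_encode(c):  # linear light, 0..1
            |         if c <= 0.0031308:
            |             return 12.92 * c
            |         return 1.055 * c ** (1 / 2.4) - 0.055
            | 
            |     def luma_prime(r, g, b):  # gamma-encoded R'G'B'
            |         return 0.299*r + 0.587*g + 0.114*b
            | 
            |     rgb_linear = (0.5, 0.1, 0.1)
            |     rgb_prime = [srgb_encode(c) for c in rgb_linear]
            |     print(luma_prime(*rgb_prime))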
        
             | brigade wrote:
             | The only important bit of pedantry over analog/digital and
             | prime is reminding people that no one deals with linear
             | light unless you already know you're doing so, and that
             | blending in nonlinear space is rarely correct, no matter
             | how common it is to do so.
        
             | david-gpu wrote:
             | I think we are in agreement? What you are saying is that
             | conversion from linear YUV to linear RGB is, indeed,
             | linear.
             | 
              | Further, the transformation from Y'CbCr to gamma-encoded
             | R'G'B' is also a linear operation. Right?
        
         | simias wrote:
         | Indeed but that's also true for RGB (which is typically sRGB on
         | computers) and you can have linear YUV if you want. "RGB" and
         | "YUV" can really mean a whole bunch of things. There are many
          | RGBs and probably twice as many YUVs due to the hell that is
          | video standards.
         | 
         | I'm not sure I understand where Carmack is coming from here
         | though (am I missing some context? I don't use twitter and
         | these threads are always a huge pain for me to follow
         | especially since Carmack doesn't even bother breaking on full
         | sentences). I don't get how processing in YUV instead of RGB
         | has anything to do with 10bit components for instance.
         | 
         | Also, in my experience most video software deals with YUV
         | natively and only converts as needed. It's probably different
         | in the gaming and image processing world but that's because
         | everything else is RGB and it seems to be a big ask to just
         | tell everybody to convert to YUV.
         | 
          | Besides, if quality is of the essence, you will typically store
          | more than 10 bits for internal processing, probably 16 and
         | maybe even floats if you want to have as much range as
         | possible.
         | 
         | I dunno, I won't pretend that I'm smarter than Carmack, but I
         | wish there was a bit more context because it's a bit opaque for
         | me at the moment.
        
           | Jenk wrote:
           | > I don't use twitter and these threads are always a huge
           | pain for me to follow especially since Carmack doesn't even
           | bother breaking on full sentences
           | 
           | This site (threadreaderapp.com) may be of interest to you. It
           | aggregates threads into a readable column as if it were a
           | single article, here's Carmack's "thread":
           | https://threadreaderapp.com/thread/1400930510671601666.html
           | 
           | Extremely useful for dialogues/conversations on twitter.
        
           | alisonkisk wrote:
           | https://news.ycombinator.com/item?id=27399731
           | 
           | explains.
        
       | Animats wrote:
       | There's JPEG 2000, which is nothing like classic JPEG. It doesn't
       | have the artifacts classic JPEG introduces at sharp edges, so you
       | can read text. It has more color depth if you want it. It also
       | doesn't mess up non-color data such as normal maps the way
       | classic JPEG does.
       | 
       | JPEG 2000 is not used much. Decoding is slow, and encoding is
       | slower. The decoders are either buggy or proprietary. It has way
       | too many options internally. The big users of JPEG 2000 are
       | medical. It has a "lossless" mode, and medical imagery is usually
       | stored lossless because you really don't want compression
       | artifacts in X-rays.
       | 
       | (I've been struggling with JPEG 2000 recently. Second Life and
       | Open Simulator use it for asset storage. Some images won't
       | decompress properly with OpenJPEG, a free decoder. I finally
       | found out why. There's a field used for "personal health
       | information" in JPEG 2000. This is where the patient name and
       | such go in a CAT scan. That feature was added after the original
       | version. Some older images apparently have junk in that field,
       | which causes problems.)
        
         | TheDudeMan wrote:
         | "you really don't want compression artifacts in X-rays"
         | 
         | You really don't want compression artifacts in any image, which
         | is why you should set the compression ratio wisely when
         | encoding. I don't see how X-rays are any different.
        
           | eco wrote:
           | Sure, but you don't normally end up getting an unnecessary
           | biopsy with most other image artifacts.
        
           | clord wrote:
           | some images manage to communicate in spite of high
           | compression. X-rays are an example of an image where the cost
           | of misinterpretation is very high.
        
           | kragen wrote:
           | Compression artifacts in X-rays can easily kill people,
           | either by requiring additional unnecessary X-rays (which
           | cause cancer) or by causing erroneous diagnoses. Compression
           | artifacts in filtered photos of your cute pet turtle for
           | Instagram are much less likely to kill people.
        
           | me_again wrote:
           | IIRC there was a study which indicated oncologists' ability
           | to detect tumors in X-ray images was degraded even with lossy
           | compression ratios which didn't introduce obvious artifacts.
           | I couldn't find it with a quick search though.
        
           | [deleted]
        
         | jacobolus wrote:
         | Other big users of JPEG 2000 include libraries and archives
         | storing e.g. historic map images.
        
         | Keyframe wrote:
         | _JPEG 2000 is not used much_
         | 
          | There's one big win though. Digital cinema projectors, now in
          | pretty much all theatres, run DCPs, which have movies encoded
          | with it.
        
           | aaron-santos wrote:
           | Adding to the list: ESA's Sentinel 2 products
            | (granules/tiles) are typically distributed in JPEG2000
           | format. Anyone working with this imagery has probably
           | experienced all the trouble associated with extracting and
           | transforming it into a more useful format.
        
         | TazeTSchnitzel wrote:
         | > Second Life and Open Simulator use it for asset storage
         | 
         | JPEG2000 is such a cool choice for Second Life. The fact any
         | truncation of a JPEG2000 bitstream is just a lower-resolution
         | version of the image makes for convenient progressive
         | enhancement when loading textures over the network, right?
         | 
         | (And Second Life even used JPEG2000 for geometry, sort-of. I
         | guess with the advent of proper mesh support, that may be less
         | common now though.)
        
         | Sebb767 wrote:
         | What's the advantage of JPEG2000 lossless over, say, PNG?
          | Archives will probably be compressed later on anyway, and
          | using a widely-supported and known format seems far better
          | suited for archival purposes.
        
       | Dylan16807 wrote:
       | Does anyone know what fraction of jpegs in the real world are
       | 4:2:0?
       | 
       | For photos I use whatever the camera likes, but for everything
       | else I use 4:4:4.
        
         | colonwqbang wrote:
         | It's good to remember that image sensor pixels only have one
         | colour anyway (they are each either G, R or B). 67% of colour
          | information is made up right from the start before you even
         | encode anything. Another reason why 4:2:0 sampling may not be
         | as bad as it sounds.
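          | 
          | A sketch of that fraction: a 2x2 Bayer cell measures one
          | channel per photosite, so two thirds of every pixel's
          | colour is interpolated (demosaiced) before any encoding:
          | 
          |     # One RGGB Bayer cell: each photosite records a single
          |     # channel; the other two are interpolated.
          |     bayer_cell = [["R", "G"],
          |                   ["G", "B"]]
          |     samples = sum(len(row) for row in bayer_cell)  # 4
          |     needed = samples * 3       # 12 values for full RGB
          |     print(1 - samples / needed)  # ~0.667 interpolated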
        
       | onurcel wrote:
       | I thought he was working on AGI
        
         | TheDudeMan wrote:
         | That's only some of the threads.
        
         | xnx wrote:
         | I think a lot of the input data to his models are images.
        
       | yboris wrote:
       | The future of images is JPEG XL - https://jpeg.org/jpegxl/
       | 
       | A great overview and explanation:
       | https://cloudinary.com/blog/time_for_next_gen_codecs_to_deth...
        
         | pornel wrote:
         | It is, but that's not relevant to what Carmack was talking
         | about.
        
       | kragen wrote:
       | The XVideo extension, commonly used by MPlayer (at least in the
       | past, maybe still?), has supported sending YUV images to your
        | video card for 20 1/2 years, since XFree86 4.0.2:
       | https://www.cs.ait.ac.th/~on/mplayer/en/xv.html
        
       ___________________________________________________________________
       (page generated 2021-06-04 23:00 UTC)