[HN Gopher] John Carmack on JPEG
___________________________________________________________________
John Carmack on JPEG
Author : tosh
Score : 85 points
Date : 2021-06-04 21:43 UTC (1 hour ago)
(HTM) web link (twitter.com)
(TXT) w3m dump (twitter.com)
| TazeTSchnitzel wrote:
| > You can do [direct use of YUV] today, but you need to do the
| color conversion manually in a shader
|
| Many mobile GPUs support an extension that does YUV conversion
| for you (GL_EXT_YUV_target). Maybe it's of less interest to
| desktop GPU vendors?
| jayd16 wrote:
| I think the problem is moving the dev workflow from layered
| bitmaps to a shader-based flow.
| pornel wrote:
| IIRC this optimization has already been used by Opera Mobile and
| Edge in the past: upload YUV to the GPU RAM, and then convert
| pixels to RGB on the fly when displaying. Unfortunately, I can't
| find the links to their respective blog posts (<shakes fist at
| Google substituting all keywords with their synonyms and
| searching recent pages only>).
|
| However, chroma subsampling is a very primitive form of 2x2 block
| compression (NB: not related to JPEG's DCT block size). These
| days GPUs support much better compression natively, with 4x4
| blocks, and much fancier modes. With clever encoding of such
| textures, it's even possible to achieve a compression ratio
| comparable to JPEG's, while having a straightforward way to
| convert the compressed file to the compressed texture format.
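|
| To make the 4x4 block idea concrete, here's a sketch of one
| such format's layout, BC1/DXT1 (the struct and field names are
| mine; real encoders choose the endpoints cleverly):
|
|     #include <stdint.h>
|
|     /* BC1/DXT1: 8 bytes per 4x4 block = 0.5 bytes/pixel,
|        vs 1.5 bytes/pixel for 4:2:0 chroma subsampling. */
|     typedef struct {
|         uint16_t color0;  /* endpoint 0, RGB565 */
|         uint16_t color1;  /* endpoint 1, RGB565 */
|         uint32_t indices; /* 2 bits per pixel, picking one of 4
|                              points on the segment between the
|                              two endpoints */
|     } BC1Block;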
| WalterGR wrote:
| I feel daft, but:
|
| > The vast majority of images are jpegs, which are internally
| 420 YUV, but they get converted to 32 bit RGB for use in apps.
| Using native YUV formats would save half the memory and
| rendering bandwidth, speed loading, and provide a tiny quality
| improvement.
|
| What does he mean by _using_ native YUV formats? Something (I
| wave my hand) in the rendering pipeline from the JPEG in memory
| to pixels on the screen?
| PragmaticPulp wrote:
| > What does he mean by using native YUV formats?
|
| Your display uses 3 bytes per pixel. 8 bits for each of the R,
| G, and B channels. This is known as RGB888. (Ignoring the A or
| alpha transparency channel for now).
|
| YUV420 uses chroma subsampling, which means the color
| information is stored at a lower resolution than the brightness
| information. Groups of 4 pixels will have the same color, but
| each pixel can have a different brightness. Our eyes are more
| sensitive to brightness changes than color changes, so this is
| usually unnoticeable.
|
| This is very advantageous for compression because YUV420
| requires 6 bytes per 4 pixels, or 1.5 bytes per pixel, because
| groups of pixels share a single color value. That's half as
| many bytes as RGB888.
|
| When you decompress a JPEG, you first get a YUV420 output.
| Converting from YUV420 to RGB888 doesn't add any information,
| but it doubles the number of bits used to represent the image
| because it stores the color value for every individual pixel
| instead of groups of pixels. This is easier to manipulate in
| software, but it takes twice as much memory to store and twice
| as much bandwidth to move around relative to YUV420.
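|
| A back-of-the-envelope check in C (the 1920x1080 size is just
| an example):
|
|     #include <stdio.h>
|
|     int main(void) {
|         const long w = 1920, h = 1080;
|         const long rgb888 = w * h * 3;     /* 3 bytes/pixel   */
|         const long yuv420 = w * h * 3 / 2; /* 1.5 bytes/pixel */
|         printf("RGB888: %ld bytes\n", rgb888); /* 6220800 */
|         printf("YUV420: %ld bytes\n", yuv420); /* 3110400 */
|         return 0;
|     }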
|
| The idea is that if your application can work with YUV420
| through the render pipeline and then let a GPU shader do the
| final conversion to RGB888 within the GPU, you cut your memory
| and bandwidth requirements in half at the expense of additional
| code complexity.
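|
| For reference, the per-pixel math such a shader would perform,
| sketched in C with the full-range BT.601 coefficients that
| JPEG/JFIF uses (function names are mine):
|
|     #include <stdint.h>
|
|     static uint8_t clamp255(double v) {
|         return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)(v + 0.5);
|     }
|
|     static void ycbcr_to_rgb(uint8_t y, uint8_t cb, uint8_t cr,
|                              uint8_t *r, uint8_t *g, uint8_t *b)
|     {
|         *r = clamp255(y + 1.402    * (cr - 128));
|         *g = clamp255(y - 0.344136 * (cb - 128)
|                         - 0.714136 * (cr - 128));
|         *b = clamp255(y + 1.772    * (cb - 128));
|     }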
|
| Wikipedia is a good source of diagrams and details that explain
| this further:
| https://en.wikipedia.org/wiki/YUV#Y%E2%80%B2UV420p_(and_Y%E2...
| ranman wrote:
| Thanks, this is a very clear explanation.
| vlovich123 wrote:
| Yup. You typically first decode the JPEG into a raw image
| format in GPU memory & then that gets transferred to scanout HW
| that actually sends it to the display (e.g. via HDMI). The
| argument feels a bit weak though. At that point, why not just
| use JPEG through your entire pipeline & save even more
| bandwidth? Of course you have to decompress if you're modifying
| the image in any way, so it doesn't help there.
| edflsafoiewq wrote:
| You can't texel-fetch directly from JPEG data, it has to be
| decompressed first.
| vlovich123 wrote:
| Yes, of course. GPUs do already support compressed textures
| that get decompressed on the fly, though. JPEG has a lot of
| baggage that actually makes it a poor format for this, but
| perhaps newer formats might be useful. What you lose in
| "optimality" (compression schemes designed for GPUs are
| better for games), you win in "generality" (the ability to
| use this outside games to, for example, improve web browsing
| battery life).
| mrec wrote:
| There's a very hard distinction between GPU texture
| compression formats and general image compression formats
| - the former need to support random access lookup in
| constant time. Anything meeting that criterion is not
| going to be generally competitive; it's like the
| difference between UTF-32 strings and UTF-8 strings.
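|
| Concretely, "random access in constant time" means the
| byte offset of the block holding any texel is a closed-form
| expression (a sketch, assuming a BC1-style 8-byte 4x4
| block):
|
|     #include <stddef.h>
|
|     /* Offset of the 4x4 block containing texel (x, y). */
|     size_t block_offset(size_t x, size_t y, size_t width) {
|         size_t blocks_per_row = (width + 3) / 4;
|         return ((y / 4) * blocks_per_row + (x / 4)) * 8;
|     }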
| david-gpu wrote:
| You could do the YUV to RGB conversion operation in your pixel
| shaders in the GPU. That way you save a bit of bandwidth
| compared to uncompressed RGB.
|
| It's been done. There are even GPUs that support this operation
| natively, so there's no additional overhead.
| mindfulplay wrote:
| I would imagine a shader that converts YUV to ARGB at the time
| of rendering as opposed to storing it all the way along the
| pipeline as 32 bit integers.
|
| It's a bit tricky because rendering pipelines composite the
| final image through many layers of offscreen compositing before
| the pixel hits the screen.
|
| The core issue is that the offscreen composited layers would
| still be 32-bit textures, which is the bigger problem. A
| Skia-style draw list that encodes the YUV data through the
| pipeline could perhaps help preserve it.
| cornstalks wrote:
| One of the challenges with Y'CbCr (what Carmack is calling "YUV")
| is that there are so many flavors.
|
| He mentions 4:2:0 chroma subsampling. But he doesn't mention
| chroma siting. Or alternative subsampling schemes. Or matrix
| coefficients. Or full-range vs video-range (a.k.a. JPEG vs MPEG
| range). Heck, how you even arrange subsampled data varies by
| system (many libraries like planar; Apple likes bi-planar; etc.).
|
| I'd love to see more support for rendering subsampled Y'CbCr
| formats so you don't have to use so much RAM, but it gets
| complicated quick.
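|
| To illustrate just the layout axis, a sketch in C (the type
| and field names are mine) of two common arrangements of the
| same 4:2:0 data:
|
|     #include <stdint.h>
|
|     /* I420, planar: three separate planes. */
|     typedef struct {
|         uint8_t *y;   /* w   x h   */
|         uint8_t *cb;  /* w/2 x h/2 */
|         uint8_t *cr;  /* w/2 x h/2 */
|     } I420Image;
|
|     /* NV12, bi-planar (Apple's preference): a luma plane plus
|        one plane of interleaved Cb,Cr pairs. */
|     typedef struct {
|         uint8_t *y;    /* w   x h              */
|         uint8_t *cbcr; /* w/2 x h/2 CbCr pairs */
|     } NV12Image;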
| jayd16 wrote:
| Is this a real problem? Surely something in the current
| pipeline understands what YUV flavor JPEG uses.
| modeless wrote:
| This would save memory but add compute cost if the image is drawn
| to the framebuffer or another sRGB buffer more than once or
| twice. It wouldn't necessarily be a win to make this behavior the
| default everywhere.
|
| In the case of web browsers it depends how the image is used. An
| <img> with a large JPEG is probably drawn only once during tile
| rasterization, and browsers could certainly use the memory
| savings, so it would probably be a win. But if you had a small
| JPEG used as a page background and tiled over the whole screen,
| the memory savings would be small and you'd be wasting power
| converting the same pixels from YUV to sRGB over and over, so
| that would likely be a loss.
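|
| Rough arithmetic for that second case (sizes are
| hypothetical):
|
|     #include <stdio.h>
|
|     int main(void) {
|         const long stored = 256 * 256;   /* small tiled JPEG  */
|         const long drawn  = 1920 * 1080; /* covers the screen */
|         /* Memory saved by keeping the image as YUV420: */
|         printf("saved: %ld bytes\n", stored * 3 / 2); /* ~96 KiB */
|         /* YUV->sRGB conversions per full repaint: */
|         printf("conversions: %ld\n", drawn); /* ~2 million */
|         return 0;
|     }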
| raphlinus wrote:
| That is not _necessarily_ true. In some cases, the compositor
| may be configured to sample the input buffer directly from
| YUV420, applying the transformation on scanout and thereby
| saving memory bandwidth to read from the framebuffer. This
| makes a tremendous amount of sense when the source is a video,
| but much less sense when the source may be rendered vector
| graphics, which generally look pretty bad with subsampled
| chroma.
| modeless wrote:
| Sure, if you can use a YUV framebuffer then that can save
| memory bandwidth during scanout (though the conversion to RGB
| still happens because the screen is not YUV). But that
| doesn't apply in the tiled page background case I mentioned,
| as web page contents are composited in RGB.
| lilyball wrote:
| I am fascinated by his use of backslash-escaping the tweet EOD to
| signify that it's being followed by another one.
| supernintendo wrote:
| I suppose it takes fewer characters than marking the ordinal
| of the tweet, and it's probably familiar to Carmack's
| technically-minded audience from escaping the return character
| in a command line (to add a new line to the command instead of
| running it).
| hatsunearu wrote:
| Maybe I'm too stupid to understand this, but AFAIK YUV isn't
| exactly linear, so you still need to convert to a linear space
| (a trichromatic linear color space like RGB/XYZ)... no?
| david-gpu wrote:
| YUV is a linear transformation of RGB to the best of my
| knowledge. Wikipedia seems to agree.
| jacobolus wrote:
| JPEG uses Y'CbCr, not YUV per se. The matrix transformation
| is applied to gamma-encoded R'G'B' (typically sRGB), not
| linear RGB.
|
| http://poynton.ca/PDFs/YUV_and_luminance_harmful.pdf
| brigade wrote:
| The only important bit of pedantry over analog/digital and
| prime is reminding people that no one deals with linear
| light unless you already know you're doing so, and that
| blending in nonlinear space is rarely correct, no matter
| how common it is to do so.
| david-gpu wrote:
| I think we are in agreement? What you are saying is that
| conversion from linear YUV to linear RGB is, indeed,
| linear.
|
| Further, the transformation from Y'CbCr to gamma-encoded
| R'G'B' is also a linear operation. Right?
| simias wrote:
| Indeed but that's also true for RGB (which is typically sRGB on
| computers) and you can have linear YUV if you want. "RGB" and
| "YUV" can really mean a whole bunch of things. There are many
| RGBs and probably twice as many YUVs due to the hell that is
| video standards.
|
| I'm not sure I understand where Carmack is coming from here
| though (am I missing some context? I don't use twitter and
| these threads are always a huge pain for me to follow,
| especially since Carmack doesn't even bother breaking on full
| sentences). I don't get how processing in YUV instead of RGB
| has anything to do with 10-bit components, for instance.
|
| Also, in my experience most video software deals with YUV
| natively and only converts as needed. It's probably different
| in the gaming and image processing world but that's because
| everything else is RGB and it seems to be a big ask to just
| tell everybody to convert to YUV.
|
| Besides, if quality is of the essence, you will typically store
| more than 10 bits for internal processing, probably 16, and
| maybe even floats if you want to have as much range as
| possible.
|
| I dunno, I won't pretend that I'm smarter than Carmack, but I
| wish there was a bit more context because it's a bit opaque for
| me at the moment.
| Jenk wrote:
| > I don't use twitter and these threads are always a huge
| pain for me to follow especially since Carmack doesn't even
| bother breaking on full sentences
|
| This site (threadreaderapp.com) may be of interest to you. It
| aggregates threads into a readable column as if it were a
| single article, here's Carmack's "thread":
| https://threadreaderapp.com/thread/1400930510671601666.html
|
| Extremely useful for dialogues/conversations on twitter.
| alisonkisk wrote:
| https://news.ycombinator.com/item?id=27399731 explains.
| Animats wrote:
| There's JPEG 2000, which is nothing like classic JPEG. It doesn't
| have the artifacts classic JPEG introduces at sharp edges, so you
| can read text. It has more color depth if you want it. It also
| doesn't mess up non-color data such as normal maps the way
| classic JPEG does.
|
| JPEG 2000 is not used much. Decoding is slow, and encoding is
| slower. The decoders are either buggy or proprietary. It has way
| too many options internally. The big users of JPEG 2000 are
| medical. It has a "lossless" mode, and medical imagery is usually
| stored lossless because you really don't want compression
| artifacts in X-rays.
|
| (I've been struggling with JPEG 2000 recently. Second Life and
| Open Simulator use it for asset storage. Some images won't
| decompress properly with OpenJPEG, a free decoder. I finally
| found out why. There's a field used for "personal health
| information" in JPEG 2000. This is where the patient name and
| such go in a CAT scan. That feature was added after the original
| version. Some older images apparently have junk in that field,
| which causes problems.)
| TheDudeMan wrote:
| "you really don't want compression artifacts in X-rays"
|
| You really don't want compression artifacts in any image, which
| is why you should set the compression ratio wisely when
| encoding. I don't see how X-rays are any different.
| eco wrote:
| Sure, but you don't normally end up getting an unnecessary
| biopsy with most other image artifacts.
| clord wrote:
| some images manage to communicate in spite of high
| compression. X-rays are an example of an image where the cost
| of misinterpretation is very high.
| kragen wrote:
| Compression artifacts in X-rays can easily kill people,
| either by requiring additional unnecessary X-rays (which
| cause cancer) or by causing erroneous diagnoses. Compression
| artifacts in filtered photos of your cute pet turtle for
| Instagram are much less likely to kill people.
| me_again wrote:
| IIRC there was a study which indicated oncologists' ability
| to detect tumors in X-ray images was degraded even with lossy
| compression ratios which didn't introduce obvious artifacts.
| I couldn't find it with a quick search though.
| [deleted]
| jacobolus wrote:
| Other big users of JPEG 2000 include libraries and archives
| storing e.g. historic map images.
| Keyframe wrote:
| _JPEG 2000 is not used much_
|
| There's one big win though. Digital cinema projectors, now in
| pretty much all theatres, run DCPs, which have movies encoded
| with it.
| aaron-santos wrote:
| Adding to the list: ESA's Sentinel-2 products
| (granules/tiles) are typically distributed in JPEG2000
| format. Anyone working with this imagery has probably
| experienced all the trouble associated with extracting and
| transforming it into a more useful format.
| TazeTSchnitzel wrote:
| > Second Life and Open Simulator use it for asset storage
|
| JPEG2000 is such a cool choice for Second Life. The fact that
| any truncation of a JPEG2000 bitstream is just a
| lower-resolution version of the image makes for convenient
| progressive enhancement when loading textures over the
| network, right?
|
| (And Second Life even used JPEG2000 for geometry, sort-of. I
| guess with the advent of proper mesh support, that may be less
| common now though.)
| Sebb767 wrote:
| What's the advantage of JPEG2000 lossless over, say, PNG?
| Archives will probably be compressed later on anyway, and
| using a widely-supported and known format seems far more
| suited for archival purposes.
| Dylan16807 wrote:
| Does anyone know what fraction of jpegs in the real world are
| 4:2:0?
|
| For photos I use whatever the camera likes, but for everything
| else I use 4:4:4.
| colonwqbang wrote:
| It's good to remember that image sensor pixels only have one
| colour anyway (they are each either G, R or B). 67% of colour
| information is made up right from the start, before you even
| encode anything. Another reason why 4:2:0 sampling may not be
| as bad as it sounds.
| onurcel wrote:
| I thought he was working on AGI
| TheDudeMan wrote:
| That's only some of the threads.
| xnx wrote:
| I think a lot of the input data to his models are images.
| yboris wrote:
| The future of images is JPEG XL - https://jpeg.org/jpegxl/
|
| A great overview and explanation:
| https://cloudinary.com/blog/time_for_next_gen_codecs_to_deth...
| pornel wrote:
| It is, but that's not relevant to what Carmack was talking
| about.
| kragen wrote:
| The XVideo extension, commonly used by MPlayer (at least in the
| past, maybe still?), has supported sending YUV images to your
| video card for 20 1/2 years, since XFree86 4.0.2:
| https://www.cs.ait.ac.th/~on/mplayer/en/xv.html
___________________________________________________________________
(page generated 2021-06-04 23:00 UTC)