[HN Gopher] Nasir Ahmed's digital-compression breakthrough helpe...
___________________________________________________________________
Nasir Ahmed's digital-compression breakthrough helped make
JPEGs/MPEGs possible
Author : Brajeshwar
Score : 92 points
Date : 2024-08-20 13:52 UTC (9 hours ago)
(HTM) web link (spectrum.ieee.org)
(TXT) w3m dump (spectrum.ieee.org)
| selimthegrim wrote:
| His Signals and Systems book is still around and not bad either.
| nayuki wrote:
| It's about the
| https://en.wikipedia.org/wiki/Discrete_cosine_transform .
|
| > (subtitle) His digital-compression breakthrough helped make
| JPEGs and MPEGs possible
|
| Technically, the DCT isn't restricted to only digital
| compression. The DCT performs a matrix multiplication on a real
| vector, giving a real vector as output. You can perform a DCT on
| a finite sequence of analog values if you really wanted to, by
| performing a specific weighted sum of the values to yield a new
| sequence of analog values.
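|
| A minimal sketch of this weighted-sum view, assuming the
| orthonormal DCT-II scaling (one common convention among several).
| The helper names dct_matrix, apply and transpose are made up for
| illustration:

```python
import math

def dct_matrix(n):
    """Build the n x n DCT-II matrix with orthonormal scaling (an assumed convention)."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def apply(matrix, vector):
    """Matrix-vector product: every output is a weighted sum of the inputs."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def transpose(matrix):
    return [list(col) for col in zip(*matrix)]

samples = [1.0, 2.0, 3.0, 4.0]          # any real values will do
C = dct_matrix(len(samples))
coeffs = apply(C, samples)              # real vector in, real vector out
restored = apply(transpose(C), coeffs)  # with this scaling the transpose inverts C

print(coeffs)
print(restored)
```

| Nothing here depends on the inputs being quantized; the weights
| are fixed and only the input values change.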
| bob1029 wrote:
| The DCT is really neat, but the actual compression magic comes
| from a combination of side effects that occur after you apply it:
|
| 1. The DCT (II) packs lower frequency coefficients into the top-
| left corner of the block.
|
| 2. Quantization helps to zero out many higher frequency
| coefficients (toward bottom-right corner). This is where your
| information loss occurs.
|
| 3. Clever zig-zag scanning of the quantized coefficients means
| that you wind up with long runs of zeroes.
|
| 4. Zig-zag scanned blocks are RLE coded. This is the first form
| of actual compression.
|
| 5. RLE coded blocks are sent through Huffman or arithmetic
| coding. This is the final form of actual compression (for
| intra-frame-only/JPEG considerations). Additional compression
| occurs in MPEG, et al. with interframe techniques.
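|
| Steps 2 to 4 above can be sketched roughly as follows. This is a
| simplified illustration, not the JPEG bitstream format: it assumes
| a 4x4 block of already-transformed coefficients (values invented
| for the example), a single uniform quantization step instead of a
| per-frequency table, and a bare "EOB" marker for the trailing
| zeroes. All function names are hypothetical:

```python
def quantize(block, step):
    """Divide every coefficient by a uniform step and round (the lossy part)."""
    return [[round(c / step) for c in row] for row in block]

def zigzag_order(n):
    """Return (row, col) pairs in zig-zag order for an n x n block."""
    order = []
    for s in range(2 * n - 1):  # s = row + col picks one anti-diagonal
        diag = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Alternate direction on each diagonal, starting (0,0), (0,1), (1,0), ...
        order.extend(reversed(diag) if s % 2 == 0 else diag)
    return order

def run_length_encode(values):
    """Emit (zeros_skipped, value) pairs, then one end-of-block marker."""
    pairs, run = [], 0
    for v in values:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    pairs.append("EOB")  # any trailing zeroes collapse into this marker
    return pairs

# Invented coefficients: large values near the top-left, small ones elsewhere.
coeffs = [
    [620.0, -31.0, 9.0, 2.0],
    [-42.0,   2.0, -3.0, 1.0],
    [  7.0,  -4.0,  2.0, 0.0],
    [  3.0,   1.0,  0.0, 0.0],
]

quantized = quantize(coeffs, step=10)
scanned = [quantized[r][c] for r, c in zigzag_order(4)]
encoded = run_length_encode(scanned)

print(scanned)
print(encoded)
```

| The entropy-coding stage (step 5) would then assign short codes to
| the most frequent pairs; it is left out here.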
| kappi wrote:
| DCT is now replaced by the Hadamard Transform, which can be
| implemented with additions/subtractions and doesn't have the
| drift problem of DCT. HT was considered before DCT, but at that
| time DCT was picked because of better perceptual quality. Later,
| during H.264 standardization, HT replaced DCT and is now used
| in all video codecs instead of DCT.
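|
| For reference, a minimal sketch of an unnormalized fast
| Walsh-Hadamard transform that uses only additions and
| subtractions. This is a generic textbook butterfly, not the
| transform specified by any particular codec; the name fwht is made
| up for the example:

```python
def fwht(values):
    """Unnormalized fast Walsh-Hadamard transform; length must be a power of two."""
    out = list(values)
    n = len(out)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b  # one add, one subtract
        h *= 2
    return out

block = [4, 2, 2, 0]
coeffs = fwht(block)
# Applying the transform twice yields n times the input, so integer
# inputs round-trip exactly after dividing by n.
restored = [c // len(block) for c in fwht(coeffs)]

print(coeffs)
print(restored)
```

| Because everything stays in integers, an encoder and a decoder
| that both run this arithmetic get bit-identical results.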
| mbtwl wrote:
| Nope.
|
| H.265/HEVC
| https://en.m.wikipedia.org/wiki/High_Efficiency_Video_Coding
|
| Also not true for H.266/VVC.
| aidenn0 wrote:
| AV1 also uses DCT and DST, but not Hadamard.
| kappi wrote:
| Correct, it is an integer DCT. A lot of techniques were adopted
| from the integer transform of H.264. That's what I meant, not
| the floating-point DCT proposed in the '70s.
| aidenn0 wrote:
| Interestingly enough, JPEG XR used a form of the Hadamard
| Transformation, but JPEG XL (which is newer) uses DCT and
| Haar transforms.
|
| [edit]
|
| Combined with the information from sibling comments, it seems
| that the Hadamard transform was something used in standards
| developed in the '00s but not since.
| pornel wrote:
| The "actual compression magic" was used before DCT in other
| codecs, but applied directly to pixels it gave lousy results.
|
| You can also look at '90s software video codecs developed when
| DCT was still too expensive for video. They had all kinds of
| approaches to quantization and entropy coding, and they all
| produced a pixelated mess.
|
| DCT is the key ingredient that enabled compression of
| photographic content.
| HarHarVeryFunny wrote:
| What's so special about DCT for image compression?
|
| The main idea of lossy image compression is throwing away
| fine detail, which means converting to the frequency domain and
| throwing away high-frequency coefficients. Conceptually, FFT
| would work fine for this, so the use of DCT instead seems more
| like an optimization than a key component.
| dilippkumar wrote:
| Hey! Nice to see this here.
|
| My graduate thesis advisor was a coinventor of the DCT [0]. I
| miss my grad school days - he was a great advisor.
|
| [0]. https://en.wikipedia.org/wiki/K._R._Rao
| bob1029 wrote:
| I really like the book he co-authored with P. Yip [0]. Grabbed
| a copy on AbeBooks a few years ago while working on a custom
| codec. Excellent coverage of the transform from many angles,
| including reference diagrams of how to implement the various
| transforms in software/hardware and ~200 pages worth of
| discussion around applications.
|
| [0]: https://dl.acm.org/doi/10.5555/96810
| trhway wrote:
| The first layer of the visual cortex (and what the input-layer
| convolutional kernels in visual NNs converge to) consists of
| Gabor kernels: a cosine multiplied by an exponentially decreasing
| amplitude, which de facto limits the spatial attention of a given
| neuron to a spot.
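|
| A minimal 1-D sketch of such a kernel, assuming the usual Gaussian
| envelope as the "decreasing amplitude". The function name and the
| wavelength/sigma values are arbitrary choices for illustration:

```python
import math

def gabor_1d(x, wavelength=4.0, sigma=2.0):
    """Cosine carrier multiplied by a Gaussian envelope centred at x = 0."""
    envelope = math.exp(-(x * x) / (2.0 * sigma * sigma))
    carrier = math.cos(2.0 * math.pi * x / wavelength)
    return envelope * carrier

# Sample the kernel on integer positions -6..6; the centre is at index 6.
kernel = [gabor_1d(x) for x in range(-6, 7)]

for x, w in zip(range(-6, 7), kernel):
    print(x, round(w, 4))
```

| The weights oscillate like a cosine near the centre and fade
| toward zero at the edges, which is what confines the response to
| a small spot.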
| max_ wrote:
| One thing I recommend people do is study compression
| algorithms like JPEG.
|
| I find the relationship between compression algos & cognitive
| science very interesting.
| drunkspider wrote:
| What's the relationship between compression algorithms and
| cognitive science?
| tedd4u wrote:
| "Lossless" compression is based on information that can be
| discarded without negative consequences because it cannot be
| perceived by humans. The data is real and there, you just
| can't see it or hear it. If you can quantify what information
| humans can't perceive, you can discard it, leaving less data
| and possibly more amenable data for a subsequent lossless
| compression phase. MP3, JPEG, MPEG all benefit from this
| understanding of the human perceptual system.
| omneity wrote:
| You're talking about lossy compression. Specifically
| perceptual lossy compression[0].
|
| Lossless compression is entirely reversible. Nothing is
| lost and nothing is discarded, perceived or not, like zip.
|
| 0: https://arxiv.org/abs/2106.02782
| hnlmorg wrote:
| You have it backwards there. You're describing lossy
| compression.
|
| Lossless is formats like Flac and zip. Lossless compression
| basically stores the same data in more efficient (from a
| file size perspective) states rather than discarding stuff
| that isn't perceived.
|
| The clue is in the name of the term: "lossy" means you lose
| data. "Lossless" means you don't lose data. So if a zip
| file was lossy, you'd never be able to decompress it.
| Whereas you cannot restore data you've lost from an MP3.
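|
| The distinction can be sketched with Python's standard zlib
| module for the lossless side. The "lossy" half is a deliberately
| crude stand-in (dropping the low 4 bits of each byte) and is not
| how MP3 or JPEG actually work; the sample data is invented:

```python
import zlib

original = b"low-frequency data compresses well. " * 32

# Lossless: decompressing gives back exactly the bytes that went in.
packed = zlib.compress(original)
restored = zlib.decompress(packed)

# Crude lossy stand-in: discard the low 4 bits of every sample.
# Once dropped, those bits cannot be recovered from `coarse` alone.
samples = bytes(range(0, 256, 5))
coarse = bytes((s >> 4) << 4 for s in samples)

print(restored == original, len(original), len(packed))
print(coarse == samples)
```

| The first comparison holds for any input; the second fails as
| soon as any sample has non-zero low bits.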
| nayuki wrote:
| Perhaps
| https://en.wikipedia.org/wiki/Human_visual_system_model ,
| https://en.wikipedia.org/wiki/Psychoacoustics
| max_ wrote:
| There are many other resources.
|
| But this is an example https://archive.is/KShWY#9
| laidoffamazon wrote:
| Extremely impressive, done while doing research at Kansas State
| University with a PhD from the University of New Mexico. I don't
| know if any new major advancements have come from people from
| state schools today.
| mkoubaa wrote:
| Is this sarcastic?
| duped wrote:
| This is why it's important to pay attention in linear algebra
| class as a CS undergrad!
| kleiba wrote:
| Wikipedia writes: "Ahmed developed a practical DCT algorithm with
| his PhD students T. Raj Natarajan, Wills Dietrich, and Jeremy
| Fries, and his friend Dr. K. R. Rao at the University of Texas at
| Arlington in 1973." [1]
|
| So perhaps it would be fair to give due credit to the co-workers
| as well.
|
| [1] https://en.wikipedia.org/wiki/Discrete_cosine_transform
___________________________________________________________________
(page generated 2024-08-20 23:00 UTC)