[HN Gopher] Perceptually lossless (talking head) video compressi...
___________________________________________________________________
Perceptually lossless (talking head) video compression at 22kbit/s
Author : skandium
Score : 179 points
Date : 2024-11-08 07:30 UTC (15 hours ago)
(HTM) web link (mlumiste.com)
(TXT) w3m dump (mlumiste.com)
| andrewstuart wrote:
| The more magic AI makes, the less magical the world becomes.
| andai wrote:
| ?
| EarlKing wrote:
| Clearly Sauron is a jealous ringmaker and doesn't like hobbits
| using his ring to shitpost.
| Joel_Mckay wrote:
| Probably just disappointed at the wasted bandwidth:
|
| 24fps * 52 facial 3D markers * 16bit packed delta planar
| projected offsets (x,y) = 19.968 kbps
|
| And this is done in Unreal games on a potato graphics card
| all the time:
|
| https://apps.apple.com/us/app/live-link-face/id1495370836
|
| I am sure calling modern heuristics "AI" gets people excited,
| but it doesn't seem "Magical" when trivial implementations
| are functionally equivalent. =3
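|
| The arithmetic above can be checked with a minimal sketch, assuming
| each of the 52 markers is sent once per frame as a single 16-bit
| packed (x, y) delta with no further entropy coding. The function
| name is hypothetical and only illustrates the multiplication.

```python
def landmark_bitrate_kbps(fps: int, markers: int, bits_per_marker: int) -> float:
    """Raw bitrate of an uncompressed per-frame landmark stream, in kbit/s."""
    return fps * markers * bits_per_marker / 1000


# 24 frames/s * 52 markers * 16 bits per marker
print(landmark_bitrate_kbps(24, 52, 16))  # 19.968
```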
| scotty79 wrote:
| I think the point here is to make it photorealistic which
| everything apart from AI still fails at superhard.
| Joel_Mckay wrote:
| Take a minute to look something up first, and then
| formulate a more interesting opinion for us to discuss:
|
| https://www.unrealengine.com/en-US/metahuman
|
| The artifacts in raster image data are nowhere near what a
| reasonable model can achieve even at low resolutions. =3
| scotty79 wrote:
| I know metahuman. As impressive as it is, when you judge
| by the standards of game graphics, if you are ever
| misled into thinking metahumans are real humans or even
| real physically existing things it's time to see your eye
| doctor (and/or do MRI head scan).
|
| On the other hand AI videos can be easily mistaken for
| people or hyper realistic physical sculptures.
|
| https://img-9gag-fun.9cache.com/photo/aYQ776w_460svvp9.webm
|
| There's something basic about how light works that
| traditional computer graphics still fails to grasp.
| Looking at its productions and comparing it to what AI
| generates is like looking at output of amateur and an
| artist. Sure, maybe artist doesn't always draw all 5
| fingers but somehow captures the essence of the image in
| seemingly random arrangement of light and dark strokes,
| while amateur just tries to do their best but fails in
| some very significant ways.
| Joel_Mckay wrote:
| "AI" videos make many errors all the time, but most
| people are not aware of what to look for... Undetectable
| CGI is done in film/games all the time, and indeed it
| takes talent to hide the fact it is fake.
|
| One could rely on the media encoder to garble output
| enough to look more plausible (people on potato devices
| are used to looking at garbage content.) However, at the
| end of the day the "uncanny valley" effect takes over
| every time even for live action data in an auto-generated
| asset, as the missing data can't be "Magically" recovered
| with 100% certainty.
|
| Bye =3
| scotty79 wrote:
| Undetectable CGI in games ... right. I don't think you
| are a gamer.
|
| In movies it can be done with enough manual tweaking
| by artists and a lot of photographic content around to
| borrow sense of reality from it.
|
| "Potato" devices by which I assume you mean average
| phones, currently have better resolutions than PCs had
| very recently and a lot still do (1080p).
|
| And a photo on 480p still looks more real than anything
| CGI (not AI).
|
| Your signature is hilarious. I won't comment about the
| reasons because I don't want this whole thread to get
| flagged.
| Joel_Mckay wrote:
| I think most "AI" slop content falls under this
| phenomenon:
|
| https://www.youtube.com/watch?v=vJG698U2Mvo
|
| Several 8bit games had their own aesthetic charm, but
| were at least fun...
|
| Cheers, =3
| satvikpendem wrote:
| > Any sufficiently advanced technology is indistinguishable
| from magic.
|
| - Arthur C. Clarke
| HPsquared wrote:
| This is the power of numerical methods.
| andrewstuart wrote:
| There's a finite amount of magic and if AI borrows it here
| then it must be repaid there.
| psychoslave wrote:
| The greatest feat ever: let magic disappear before wonder of
| understanding.
| xyzsparetimexyz wrote:
| Oh shut up. There's plenty of awful uses for ai but this isn't
| one of them
| andai wrote:
| What did you mean by this?
| AndrewVos wrote:
| Elon weirdly looks more human than usual in the AI version!
| LeoPanthera wrote:
| This is very impressive, but "perceptually lossless" isn't a
| thing and doesn't make sense. It means "lossy".
| high_byte wrote:
| why not? if you change one pixel by one pixel brightness unit
| it is perceptually the same.
|
| for the record, I found liveportrait to be well within the
| uncanny valley. it looks great for ai generated avatars, but
| the difference is very perceptually noticeable on familiar
| faces. still it's great.
| codeflo wrote:
| GP is correct, that's the definition of "lossy". We don't
| need to invent ever new marketing buzzwords for well-
| established technical concepts.
| AndrewDucker wrote:
| GP is incorrect.
|
| There is "Is identical", "looks identical" and "has lost
| sufficient detail to clearly not be the original." - being
| able to differentiate between these three states is useful.
| Rygian wrote:
| Lossless means "is identical".
|
| The other two are variations of lossy.
|
| Calling one of them "perceptually lossless" is cheating,
| to the disadvantage of algorithms that honestly advertise
| themselves as lossy while still achieving "looks
| identical" compression.
| protimewaster wrote:
| It's a well established term, though. It's been used in
| academic works for a long time (since at least 1970), and
| it's basically another term for the notion of
| "transparency" as it relates to data compression.
| TeMPOraL wrote:
| I honestly don't notice this anymore. Advertisers have
| been using such language since time immemorial, to the
| point it's pretty much a rule that an adjective with a
| qualifier means "not actually ${adjective}, but kind of
| like it in ${specific circumstances}". So "perceptually
| lossless" just means "not actually lossless, except you
| couldn't tell it from truly lossless just by looking".
| tialaramex wrote:
| Importantly the first one is parameterless, but the
| second and third are parameterized by the audience. For
| example humans don't see colour very well, some animals
| have much better colour gamut, while some can't
| distinguish colour at all.
| travisjungroth wrote:
| Perceptually lossless (nature for dogs) video compression
| at 15bit/s.
| protimewaster wrote:
| But this marketing term has been regularly used in academic
| papers for nearly 50 years (or probably more), so it seems
| like it should get a pass IMO.
|
| It's also used in the first paragraph of the Wikipedia
| article on the term "transparency" as it relates to data
| compression.
| Dylan16807 wrote:
| It is in no way the _definition_ of lossy. It is a _subset_
| of lossy. Most lossy image/video compression has visible
| artifacting, putting it outside the subset.
| LegionMammal978 wrote:
| For one, it doesn't obey the transitive property like a truly
| lossless process should: unless it settles into a fixed
| point, a perceptually lossless copy of a copy of a copy,
| etc., will eventually become perceptually different. E.g.,
| screenshot-of-screenshot chains, each of which visually
| resembles the previous one, but which altogether make the
| original content unreadable.
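|
| A toy sketch of the copy-of-a-copy point, not a real codec: the
| threshold and the 1% dimming "codec" are assumptions chosen only to
| show that every step can stay under a perceptual threshold while
| the end of the chain does not.

```python
THRESHOLD = 2.0  # hypothetical just-noticeable difference, in brightness units


def lossy_copy(value: float) -> float:
    """Toy 'codec': each copy dims the value by 1%."""
    return value * 0.99


def generations(original: float, n: int) -> list[float]:
    """Return the original followed by n successive copies-of-copies."""
    chain = [original]
    for _ in range(n):
        chain.append(lossy_copy(chain[-1]))
    return chain


chain = generations(100.0, 10)
step_diffs = [abs(a - b) for a, b in zip(chain, chain[1:])]
total_diff = abs(chain[0] - chain[-1])

print(all(d < THRESHOLD for d in step_diffs))  # True: each copy resembles its parent
print(total_diff > THRESHOLD)                  # True: the last copy drifted from the original
```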
| _ZeD_ wrote:
| also are .mp3, yet they are hardly discernible from the
| originals
| rini17 wrote:
| not at 22kbit :)
| bityard wrote:
| Ability to tell MP3 from the original source was always
| dependent on encoder quality, bitrate, and the source
| material. In the mid 2000's, I tried to encode all of my
| music as MP3. Most of it sounded just fine because
| pop/rock/alt/etc are busy and "noisy" by design. But some
| songs (particularly with few instruments, high dynamic range,
| and female vocals) were just awful no matter how high I
| cranked the bitrate. And I'm not even an "audiophile,"
| whatever that means these days.
|
| No doubt encoders and the codecs themselves have improved
| vastly since then. It would be interesting to see if I could
| tell the difference in a double-blind test today.
| comboy wrote:
| These are always fun
|
| https://abx.digitalfeed.net/
|
| https://www.npr.org/sections/therecord/2015/06/02/411473508
| /...
| unshavedyak wrote:
| iirc there's "easy" (though i don't know them) tests to
| validate if the signal is lossless or not. When played over
| speakers for humans, at least.
|
| I always intend to figure out how that works, because i
| don't feel a lot of audiophiles are actually speaking truth
| in many cases lol. Still, i don't know - i can't remember
| my sources to figure it out for myself :/
| Dwedit wrote:
| Lossy audio formats suddenly become very discernible once you
| subtract the left channel from the right channel. Try that
| with Lossless audio vs MP3, Vorbis, Opus, AAC, etc. You're
| listening to only the errors at that point.
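|
| The subtraction itself can be sketched as below, assuming the two
| channels are already decoded into equal-length lists of samples.
| The sample values are made up; the sketch only shows that content
| shared by both channels cancels, leaving whatever differs between
| them.

```python
def side_channel(left: list[float], right: list[float]) -> list[float]:
    """Per-sample difference L - R; anything common to both channels cancels."""
    return [l - r for l, r in zip(left, right)]


# Toy stereo frame: a shared (centre) signal plus a small deviation in one channel.
center = [0.5, -0.25, 0.125]
left = list(center)
right = [c + e for c, e in zip(center, [0.0, 0.01, 0.0])]

print(side_channel(left, left))   # [0.0, 0.0, 0.0]
print(side_channel(left, right))  # only the small per-channel difference remains
```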
| lifthrasiir wrote:
| It is definitely a thing given a good perceptual metric. The
| metric doesn't even have to be very accurate if the distortion
| is highly bounded, like only altering the lowermost bit. It is
| unfortunate that most commonly used distortion metrics like
| PSNR are not really that, though.
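|
| For reference, a minimal sketch of PSNR on 8-bit samples, applied to
| the "only altering the lowermost bit" case. The pixel values are
| made up for illustration.

```python
import math


def psnr(original: list[int], distorted: list[int], peak: int = 255) -> float:
    """Peak signal-to-noise ratio in dB for two equal-length 8-bit sample lists."""
    mse = sum((a - b) ** 2 for a, b in zip(original, distorted)) / len(original)
    if mse == 0:
        return math.inf  # identical inputs
    return 10 * math.log10(peak ** 2 / mse)


pixels = [0, 64, 128, 200, 254]
lsb_flipped = [p ^ 1 for p in pixels]  # toggle only the lowest bit of each sample

print(round(psnr(pixels, lsb_flipped), 2))  # 48.13
```

| Every sample moves by exactly one level, so the mean squared error
| is 1 and the result is 20*log10(255) dB regardless of image content.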
| rini17 wrote:
| But that's mathematically impossible, to restore signal from
| extremely low bitrate stream with any highly bounded
| distortion. Perhaps only if you have highly restricted set of
| possible input, which online meetings aren't.
| lifthrasiir wrote:
| > Perhaps only if you have highly restricted set of possible
| input, which online meetings aren't.
|
| Are you sure? After all, you can effectively summarize
| meetings in a plain text which is extremely restricted in
| comparison to the original input. Granted, the exact manner
| of speech and motions and all subtleties should also be
| included to be fair, but that information is still far too
| limited to fill the 20 kbps bandwidth.
|
| We need far more bandwidth only because we don't yet have
| an efficient way to reconstruct the input faithfully from
| such highly condensed information. Whenever we actually
| could, we ended up having a very efficient lossy algorithm
| that still preserves enough information for us humans.
| Unless you are strictly talking about the lossless
| compression---which is however very irrelevant in this
| particular topic---, we should expect much more compression
| in the future even though that might not be feasible today.
| rini17 wrote:
| Okay, did not know you measure distortion that way.
| rob74 wrote:
| Yeah, all lossy compression could be called "perceptually
| lossless" if the perception is bad enough...
| bux93 wrote:
| A family member of mine didn't see the point of 1080p. Turned
| out they needed cataract surgery and got fancy replacement
| lenses in their eyes. After that, they saw the point.
| Dylan16807 wrote:
| Needing to define "perception" is a much weaker criticism
| than "isn't a thing and doesn't make sense".
|
| It's easy enough to specify an average person looking very
| closely, or a 99th percentile person, or something like that,
| and show the statistics backing it up.
| tatersolid wrote:
| I read "perceptually lossless" to be equivalent to
| "transparent", a more common phrase used in the audio/video
| codec world. It's the bitrate/quality at which some large
| fraction of human viewers can't distinguish a losslessly-
| encoded sample and the lossy-encoded sample, for some large
| fraction of content (constants vary in research papers).
|
| As an example, crf=18 in libx264 is considered "perceptually
| lossless" for most video content.
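|
| A minimal sketch of what such an encode might look like, assuming
| an ffmpeg build with libx264 is available. The helper name and the
| file names are hypothetical; the code only builds the argument list
| and does not run it.

```python
def x264_crf_command(src: str, dst: str, crf: int = 18) -> list[str]:
    """Build (but do not run) an ffmpeg argument list for a constant-rate-factor encode."""
    return ["ffmpeg", "-i", src, "-c:v", "libx264", "-crf", str(crf), dst]


cmd = x264_crf_command("input.mp4", "output.mp4")
print(" ".join(cmd))
# To actually encode (requires ffmpeg with libx264 on PATH):
#   import subprocess; subprocess.run(cmd, check=True)
```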
| Bjartr wrote:
| It may sound like marketing wank, but it does appear to be an
| established term of art in academia as far back as 1997 [1]
|
| It just means that a person can't readily distinguish between
| the compressed image and the uncompressed image. Usually
| because it takes some aspect(s) of the human visual system into
| account.
|
| [1]
| https://scholar.google.com/scholar?hl=en&as_sdt=0%2C22&q=per...
| k__ wrote:
| Is this the real-time discussion all over again?
| Ladsko wrote:
| Can you propose a better term for the concept then? Perceiving
| something as lossless is a real world metric that has a proper
| use case. "Perceptually lossless" does not try to imply that it
| is not lossy.
| ComplexSystems wrote:
| The term for this is "transparency." A codec is "transparent"
| if people can't tell the difference between the original and
| the compressed version.
| edflsafoiewq wrote:
| "Transparency" is a fairly annoying term for this in
| image/video because of the obvious polysemy.
| unshavedyak wrote:
| So it would be `transparent lossy compression`? To this
| layman `perceptually lossless` sounds more clear, but i
| understand the issue with the name.
| ranger_danger wrote:
| As there are several patents, published studies, IEEE papers
| and thousands of google results for the term, I think it's safe
| to say that many people do not agree with your interpretation
| of the term.
|
| "As a rule, strong feelings about issues do not emerge from
| deep understanding." -Sloman and Fernbach
| Brian_K_White wrote:
| It means what it already says for itself, and does not need
| correcting into incorrectness.
|
| "no perceived loss" is a perfectly internally consistent and
| sensible concept and is actually orthogonal to whether it's
| actually lossless or lossy.
|
| For instance an actually lossless block of data could be
| perceptually lossy if displayed the wrong way.
|
| In fact, even actual lossless data is always actually lossy,
| and only ever "perceptually lossless", and there is no such
| thing as actually lossless, because anything digital is always
| only a lossy approximation of anything analog. There is loss
| both at the ADC and at the DAC stage.
|
| If you want to criticize a term for being nonsense misleading
| dishonest bullshit, then I guess "lossless" is that term, since
| it never existed and never can exist.
| unshavedyak wrote:
| Similar to your points, i also expect `perceptually lossless`
| to be a valid term in the future with respect to AI. Ie i can
| imagine a compression which destroys detail, but on the
| opposite end it uses "AI" to reconstruct detail. Of course
| though, the AI is hallucinating the detail, so objectively it
| is lossy but perceptibly it is lossless because you cannot
| know which detail is incorrect if the ML is doing a good job.
|
| In that scenario it certainly would not be `transparent` ie
| visually without any lossy artifacts. But your perception of
| it would look lossless.
|
| The future is going to be weird.
| rowanG077 wrote:
| Why don't you think it's a thing? A trivial example is audio. A
| ton of audio speakers can produce frequencies people cannot
| hear. If you have an unprocessed audio recording from a high
| end microphone one of the first compression steps you can do
| is clip off imperceptible frequencies. A form of compression.
| red0point wrote:
| > But one overlooked use case of the technology is (talking head)
| video compression.
|
| > On a spectrum of model architectures, it achieves higher
| compression efficiency at the cost of model complexity. Indeed,
| the full LivePortrait model has 130m parameters compared to
| DCVC's 20 million. While that's tiny compared to LLMs, it
| currently requires an Nvidia RTX 4090 to run it in real time (in
| addition to parameters, a large culprit is using expensive
| warping operations). That means deploying to edge runtimes such
| as Apple Neural Engine is still quite a ways ahead.
|
| It's very cool that this is possible, but the compression use
| case is indeed .. a bit far fetched. An insanely large model
| requiring the most expensive consumer GPU to run on both ends and
| at the same time being limited in bandwidth so much (22kbps) is a
| _very_ limited scenario.
| jl6 wrote:
| 130m parameters isn't insanely large, even for smartphone
| memory. The high GPU usage is a barrier at the moment, but I
| wouldn't put it past Apple to have 4090-level GPU performance
| in an iPhone before 2030.
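|
| A back-of-the-envelope sketch of the memory claim, assuming the
| weights are stored as plain 32-bit or 16-bit floats. The article
| does not state the precision, and activations and runtime overhead
| are ignored.

```python
def weights_megabytes(parameters: int, bytes_per_parameter: int) -> float:
    """Storage for the weights alone, in MB (10**6 bytes); ignores activations."""
    return parameters * bytes_per_parameter / 1e6


PARAMS = 130_000_000  # "130m parameters" from the quoted article

print(weights_megabytes(PARAMS, 4))  # 520.0 (32-bit floats)
print(weights_megabytes(PARAMS, 2))  # 260.0 (16-bit floats)
```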
| gambiting wrote:
| One cool use would be communication in space - where it's
| feasible that both sides would have access to high-end compute
| units but have a very limited bandwidth between each other.
| bliteben wrote:
| Wonder if it's better than a single color channel hologram
| though
| JamesLeonis wrote:
| Increasingly mobile networks are like this. There are all
| kinds of bandwidth issues, especially when customers are
| subject to metered pricing for data.
| bityard wrote:
| Bandwidth is not the limitation in space comms, latency is.
| cogman10 wrote:
| Underwater communications, on the other hand, could use
| this.
|
| Though, I somewhat doubt even 22kbps is available
| generally.
| omh wrote:
| One use case might be if you have limited bandwidth, perhaps
| only a voice call, and want to join a video conference. I could
| imagine dialling in to a conference with a virtual face as an
| improvement over no video at all.
| loa_in_ wrote:
| Staying in contact with someone for hours on metered mobile
| internet connection comes to mind. Low bandwidth translates to
| low total data volume over time. If I could be video chatting
| on one of those free internet SIM cards that's a breakthrough.
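|
| To put "low total data volume" in numbers, a rough sketch assuming
| a constant 22 kbit/s video stream in one direction, with audio and
| protocol overhead left out:

```python
def megabytes_per_hour(bitrate_kbps: float) -> float:
    """Data transferred in one hour at a constant bitrate, in MB (10**6 bytes)."""
    bits = bitrate_kbps * 1000 * 3600
    return bits / 8 / 1e6


print(round(megabytes_per_hour(22), 1))  # 9.9
```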
| loudmax wrote:
| The trade-off may not be worth it today, but the processing
| power we can expect in the coming years will make this
| accessible to ordinary consumers. When your laptop or phone or
| AR headset has the processing power to run these models, it
| will make more efficient use of limited bandwidth, even if more
| bandwidth is available. I don't think available bandwidth will
| scale at the same rate as processing power, but even if it
| does, the picture will be that much more realistic.
| Vecr wrote:
| Fire Upon the Deep had more or less this. Story important, so I
| won't say more. That series in general had absolutely brutal
| bandwidth limitations.
| pastelsky wrote:
| Did not expect to see Emraan Hashmi in this post!
| shaan7 wrote:
| Indeed! Bollywood makes it to HN xD
| JimDabell wrote:
| I got some interesting replies when I suggested this technique
| here:
|
| https://news.ycombinator.com/item?id=22907718
| antiquark wrote:
| Not quite lossless... look at the bicycle seat behind him. When
| he tilts his head, the seat moves with his hair.
| manmal wrote:
| His gaze also doesn't quite match.
| hinkley wrote:
| Why is nobody noticing the eyes?? This is important!
|
| I feel like I'm taking crazy pills.
| olddustytrail wrote:
| Read the text underneath the image and you'll understand.
| hinkley wrote:
| No, I really don't. He acknowledges it's not in keeping
| with the title or the thesis and then just sort of waves
| it off.
|
| Smells like rationalization to me.
| skandium wrote:
| Well, this probably isn't a problem with the model, but
| the source frame having wrong eye gaze. Besides,
| perceptually lossless need not be defined in a side-by-
| side comparison context. If you were only viewing the
| right hand side video, how could you tell the eye gaze is
| off? The point was more that the movement looks
| natural, unlike almost all neural avatars up to this
| year.
| manmal wrote:
| Your argumentation does make sense to me; but it also
| makes the term lossless pull a lot of weight. Lossless in
| video encoding is usually defined by zero difference
| between source and target.
| metaphor wrote:
| Very noticeable jitter in bicycle front tire too.
| gwd wrote:
| This reminds me of a scene in "A Fire Upon the Deep" (1992) where
| they're on a video call with someone on another spaceship; but
| something seems a bit "off". Then someone notices that the actual
| bitrate they're getting from the other vessel is tiny -- far
| lower than they should be getting given the conditions -- and so
| most of what they're seeing on their own screens isn't actual
| video feed, but their local computer's reconstruction.
| miohtama wrote:
| And also it was a deep fake.
|
| BTW This is the best sci-fi book ever.
| Retric wrote:
| Might be better if you like space opera style really soft
| science fiction. I really didn't enjoy it.
| gwd wrote:
| A friend of mine and I both read it about the same time and
| discussed it afterwards. I thought it was pretty good, he
| thought it was not that great. What we agreed on was that
| in spite of there being many fantastic aspects to the book,
| on the whole it failed to be an awesome novel.
|
| Definitely worth giving it a try if you're a programmer,
| just for the fact that it's written by another programmer:
| the opening scene where they find a bunch of rules written
| down and just follow them reminds me of ACPI; the
| discussion of public-key cryptography and shipping drives
| full of one-time-pad around the galaxy; the "compression
| scheme" with the video.
| Boxxed wrote:
| I agree that it was good but not particularly great. A
| Deepness in the Sky, however, is fantastic -- similar in
| many aspects but just flat out better all around.
| mercutio2 wrote:
| Fascinating. Vinge is about the furthest from "soft" sci-fi
| I can think of. We must have very different definitions of
| what makes something soft.
|
| It's certainly true that Vinge doesn't spend much time on
| the engineering details, but I find him unusually clear on
| "imagine if we had this kind of impossible-now technology,
| but the rest of what we know about physics remained, how
| would people behave?"
|
| He was, after all, a physics professor.
|
| Rainbow's End is much clearer on this than his distant
| future stuff, of course.
| opo wrote:
| >He was, after all, a physics professor.
|
| Actually, he was a mathematics and computer science
| teacher at San Diego State University.
|
| https://en.wikipedia.org/wiki/Vernor_Vinge
| Retric wrote:
| Soft vs hard is based on how closely the world tracks
| with modern physics/science. As such even just FTL is
| soft, let alone everything else that doesn't fit.
| jrussino wrote:
| > Soft vs hard is based on how closely the world tracks
| with modern physics/science
|
| Maybe it's not productive to quibble about definitions
| like this, but FWIW I don't agree with this criterion. I
| would argue Greg Egan's work, for example, is just about
| the "hardest" sci-fi there is, and yet much of that work
| takes place in universes that are entirely unlike our
| own.
|
| Personally, I think what makes for "hard" sci-fi is that
| the rules of the universe are well-laid-out and
| consistent, and that the story springs (at least in some
| significant part) out of the consequences of those rules.
| That may mean a story set in the "future", where we have
| new technology or discover new physics, or "alternate
| universe" sci-fi like Egan's.
| Retric wrote:
| If changing the laws of the universe is fine, then
| nothing gets excluded even Harry Potter. It's one of
| those definitions that allows anything and ultimately
| only feels fine because you're adding some other
| criteria.
|
| In defense of hard science fiction, it's a meaningful
| category to talk about even if it's not something you
| personally care about. People often want to weaken it but
| that just opens a door for a new category say "scientific
| science fiction" and we are back to square one.
|
| Asking questions like what does AGI look like when they
| can't just magically solve all issues can be fun. Hand
| waving the singularity as some religious event can also
| make interesting stories but so is considering how chaos
| theory limits what computation can actually achieve.
| com2kid wrote:
| > If changing the laws of the universe is fine, then
| nothing gets excluded even Harry Potter.
|
| Greg Egan's law changes are on the level of "I consulted
| with a bunch of theoretical physics professors and asked
| them what the implication of tweaking this one
| fundamental constant would be, then I spent years
| meticulously crafting a world that takes into account
| those implications, and I had other physics professors
| check over my work to make sure it was within the bounds
| of actuality, and then I wrote a story about characters
| in this new world."
|
| > Asking questions like what does AGI look like when they
| can't just magically solve all issues can be fun.
|
| Greg Egan actually has a great book about this!
| Permutation City. CPU cycles aren't unlimited, and there
| are tons of ethical problems being confronted with the
| entire "simulate a person" thing.
| exe34 wrote:
| > If changing the laws of the universe is fine, then
| nothing gets excluded even Harry Potter
|
| the laws of the universe in Harry Potter are so fickle
| and ever changing with the plot line that to me, it can
| only be considered soft. compare with Egan who takes a
| given cosmology and then works 100% within that world.
| that's hard.
| Retric wrote:
| That's why I brought HP up.
|
| Characters don't necessarily know the underlying rules of
| the universe they live in. Further the rules can change
| over time. So there's infinitely many possible underlying
| physical rules that fit any possible work of fiction.
| Trying to work out what they might be can be fun.
|
| Which is why it's really other criteria people use when
| they think changes to physical laws are still hard
| science fiction.
| jamiek88 wrote:
| That is simply your personal definition, right?
|
| You don't claim to be definitive?
| com2kid wrote:
| > Fascinating. Vinge is about the furthest from "soft"
| sci-fi I can think of. We must have very different
| definitions of what makes something soft.
|
| That award goes to Greg Egan who has full list of
| citations on his website for each of his novels, as well
| as a list of mathematicians and physicists he requested
| help from.
|
| If you want to read books that occasionally delve into
| pages of equations, Greg Egan is the author for you!
| (Seriously though, really good books, and the
| implications of his "what-ifs" are pretty damn cool)
| loxias wrote:
| Seconding this, Greg Egan is one of the best of all time.
|
| The short stories "Luminous" and "Dark Integers", the
| novels "Diaspora" and "Schild's Ladder". So good.
|
| qntm (another author) hits somewhat similarly.
| exe34 wrote:
| i might have to have another go at dichronauts. that one
| broke my mind a few pages in and I had to stop.
| aaronblohowiak wrote:
| It uses technological differences as key plot and setting
| components not just space as sea, so it is sci fi but it is
| improbable in many ways so yea "soft" sci fi or more
| speculative fiction
| lern_too_spel wrote:
| The softness is deceptive. Hard concepts about
| communication and different types of brains are essential
| to the plot.
| jf wrote:
| I beg to differ. A Deepness In The Sky is the best sci-fi
| book ever.
| space_fountain wrote:
| I think I agree both books were good and "A Deepness In The
| Sky" was better, but I would warn everyone that I thought
| both books used dramatic irony (showing us that characters
| were evil while hiding this from main characters) to hold
| attention to a degree that I kind of hated. And in "A
| Deepness In The Sky" sexual violence was used repeatedly to
| illustrate how evil the main characters were. I found it
| unnecessary and a bit in poor taste.
|
| On the other hand I think both books developed ideas
| wonderfully and there are bits of them I keep coming back
| to, even if I'll probably never reread them
| janandonly wrote:
| I came here to reply just this exactly and found a fellow geek
| beat me to it. Indeed a brilliant book.
| Rebelgecko wrote:
| Was that the same book that had the concept of (paraphrasing
| using modern terminology) doing interstellar communications by
| sending back and forth LLMs trained on the people who wanted to
| talk, prompted to try and get a good business deal or whatever?
| initramfs wrote:
| nice feature for low bandwidth 4G cell systems.
|
| Reminds me of the video chat in Metal Gear Solid 1
| https://youtu.be/59ialBNj4lE?t=21
| hinkley wrote:
| Nice feature for many to one video conferencing as well. Though
| I don't know if the organizers will agree.
| dormento wrote:
| Now that you mention it, it never occurred to me that Snake's
| radio transmitted video as well. "Did you like my new
| sunglasses?"
|
| If you could reserve a small portion of the radio bandwidth to
| broadcast a thumbnail + low bandwidth compressed representation
| of the face movements, you could technically have something
| similar without encoding any video (think low res, eye + mouth
| movements).
| MayeulC wrote:
| I like how the saddle in the background moves with the
| reconstructed head; it probably works better with uncluttered
| backgrounds.
|
| This is interesting tech, and the considerations in the
| introduction are particularly noteworthy. I never considered the
| possibility of animating 2D avatars with no 3D pipeline at all.
| up2isomorphism wrote:
| "Perceptually lossless" is an oxymoron.
| Brian_K_White wrote:
| There is no oxymoron in "no perceived loss".
| ranger_danger wrote:
| As there are several patents, published studies, IEEE papers
| and thousands of google results for the term, I think it's safe
| to say that many people do not agree with your interpretation
| of the term.
| hinkley wrote:
| You're still listening to vinyl, arntcha?
|
| Lossiness definitely matters when you're doing forensics. But
| not for consumers.
|
| If you just want to bop to Taylor who the fuck cares. The iPod
| ended that argument. Yes I can be a perfectionist, or I can
| have one thousand songs in my pocket. That was more than half
| of your collection for many people at the time.
| jacobgorm wrote:
| Related Show HN https://news.ycombinator.com/item?id=31516108
| hinkley wrote:
| The second example shown is not perceptually lossless, unless
| you're so far on the spectrum you won't make eye contact even
| with a picture of a person. The reconstructed head doesn't look
| in the same direction as the original.
|
| However it does raise an interesting property in that if you
| _are_ on the spectrum or have ADHD, you only need one headshot of
| yourself staring directly at the camera and then the capture
| software can stop you from looking at your taskbar or off into
| space.
| DCH3416 wrote:
| > unless you're so far on the spectrum you won't make eye
| contact even with a picture of a person.
|
| I don't know. I think you'd be surprised.
|
| That's already kind of an issue with vloggers. Often they're
| looking just left or right of the camera at a monitor or
| something.
| zbobet2012 wrote:
| These sorts of models pop up here quite a bit, and they ignore
| fundamental facts of video codecs (video specific lossy
| compression technologies).
|
| Traditional codecs have always focused on trade offs among encode
| complexity, decode complexity, and latency. Where complexity =
| compute. If every target device ran a 4090 at full power, we
| could go far below 22kbps with traditional codec techniques for
| content like this. 22kbps isn't particularly impressive given
| these compute constraints.
|
| This is my field, and trust me we (MPEG committees, AOM) look at
| "AI" based models, including GANs constantly. They don't yet look
| promising compared to traditional methods.
|
| Oh and benchmarking against a video compression standard that's
| over twenty years old isn't doing a lot either for the
| plausibility of these methods.
| skandium wrote:
| This is my field as well, although I come from the neural
| network angle.
|
| Learned video codecs definitely do look promising: Microsoft's
| DCVC-FM (https://github.com/microsoft/DCVC) beats H.267 in BD-
| rate. Another benefit of the learned approach is being able to
| run on soon commodity NPUs, without special hardware
| accommodation requirements.
|
| In the CLIC challenge, hybrid codecs (traditional + learned
| components) are so far the best, so that has been a letdown for
| pure end to end learned codecs, agree. But something like H.267
| is currently not cheap to run either.
| smokel wrote:
| Why so sour? This particular article doesn't seem to ignore a
| lot, it even references the Nvidia work that inspired it, as
| well as a recent benchmark.
|
| Someone was just having fun here, it's not as if they present
| it as a general codec.
| tommiegannert wrote:
| Now that we're moving towards context-specific compression
| algorithms, can we please use WASM as the file header for these
| media files, instead of inventing something new. :)
___________________________________________________________________
(page generated 2024-11-08 23:01 UTC)