[HN Gopher] Linux kernel VP9 codec V4L2 control interface
___________________________________________________________________
Linux kernel VP9 codec V4L2 control interface
Author : mfilion
Score : 77 points
Date : 2021-09-14 13:54 UTC (9 hours ago)
(HTM) web link (lkml.iu.edu)
(TXT) w3m dump (lkml.iu.edu)
| CameronNemo wrote:
 | Unfortunately, userspace support for this (e.g. in VAAPI,
 | ffmpeg) is not done yet. Until VAAPI support is implemented,
 | videos in Firefox will be unaccelerated. I think it is the same
 | deal for Chrome.
| miduil wrote:
 | Wouldn't the gstreamer support that is mentioned in the
 | patch description directly enable hardware acceleration in
 | Firefox? Or do I misunderstand to what extent Firefox is using
 | gstreamer at the moment?
| CameronNemo wrote:
| Firefox does not use gstreamer at all AFAIK.
| miduil wrote:
 | Ah, I confused it with ffmpeg, which also uses VAAPI of
 | course.
 |
 | Also, gstreamer was once added for something, but that was
 | 7 years ago; I guess getting video decoding running was a
 | different story
|
| https://wiki.mozilla.org/index.php?title=Special:Search&lim
| i...
| fragileone wrote:
| HW acceleration on Linux was fixed about a year ago
| https://9to5linux.com/firefox-81-enters-beta-gpu-acceleratio...
| [deleted]
| CameronNemo wrote:
| Firefox uses VA-API. That library does not support this
| hardware.
|
| Edit: as explained below, the linked work is for specific ARM
| hardware like the rk3399 SoC.
| zajio1am wrote:
| > Until VAAPI support is implemented, videos in Firefox will be
| unaccelerated. I think it is the same deal for Chrome.
|
 | That is an issue with Firefox; other software (mplayer, mpv)
 | has supported VAAPI for many years. And with youtube-dl
 | integration in mpv, why even play videos in Firefox?
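 |
 | (e.g., since mpv runs youtube-dl/yt-dlp on URLs it doesn't
 | recognize, something like
 |
 |     mpv --hwdec=vaapi 'https://www.youtube.com/watch?v=...'
 |
 | plays a YouTube video VAAPI-accelerated with no browser
 | involved - the URL here is just a placeholder)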
| CameronNemo wrote:
 | Nope, not an issue in Firefox - an issue in VAAPI. Firefox
 | supports VAAPI just fine; VAAPI does not support this
 | hardware/API. Considering it is a new API, I am still holding
 | out hope that support gets added.
| megous wrote:
| https://github.com/noneucat/libva-v4l2-request#branch=fix-
| ke...
|
| And there's probably some branch somewhere that supports
| VP9 too.
| fguerraz wrote:
| The usefulness of hardware acceleration for video decoding is
| highly debatable.
|
 | 1) It's not always much more energy efficient; sometimes it
 | is, but by less than you'd think, since GPUs need power too
|
 | 2) It greatly increases the complexity of client software,
 | which has to implement both accelerated and unaccelerated
 | decoding, leading to poorer software quality
|
 | 3) Driver quality is usually terrible: lists of working
 | hardware/software combinations have to be maintained, and in
 | some cases holes have to be punched in sandboxes [1]
|
 | 4) HW support usually lags behind state-of-the-art encoding.
 | Youtube is already using AV1, but the vast majority of
 | devices won't support it in hardware before something else
 | comes along
|
 | 5) Highly optimised software decoders, such as dav1d, are
 | extremely effective and save bandwidth and power compared to
 | HW VP9.
|
 | EDIT: I'm mostly talking about the desktop/laptop use case
 | here, where things are very fragmented. On a mobile phone,
 | where manufacturers control hardware and software end to
 | end, it's a different story.
|
| [1] https://bugzilla.mozilla.org/show_bug.cgi?id=1698778
| FpUser wrote:
| >"2) It increases greatly the complexity of client software
| that has to implement both accelerated and unaccelerated
| decoding, leading to poorer software quality"
|
 | I happen to have my own product with just that - software
 | and HW-accelerated decoding. It plays videos at a few
 | resolutions, and the presence of HW acceleration allowed me
 | to play 4K videos (first on the market in my segment) with
 | close to 0% CPU consumption on low-end PCs. Competitors at
 | that stage would not even dream of offering 4K content.
|
| As to "poorer software quality" - please do not play FUD. I
| just looked at the source code - the HW accelerated path
| (decodes from source to DirectX texture) added miniscule 1200
| lines of code good chunk of which are headers / declarations.
| The software is being used by tens of thousands of clients
| and I have about zero reports where enabling HW decoding has
| lead to error.
| zajio1am wrote:
| > The usefulness of hardware acceleration for video decoding
| is highly debatable.
|
 | Disagree. On low-end hardware the advantages are clear. On my
 | older Intel NUC I can play 1080p H.264 (using mpv) hw-
 | accelerated with 15% CPU load, or software-decoded with 75%
 | CPU load. In the first case the NUC is silent; in the second,
 | the core temperature rises and eventually its fan starts
 | spinning.
| antisthenes wrote:
| > On my older Intel NUC i can play 1080p H.264 (using mpv)
| hw-accelerated with 15% cpu load, or software decoded with
| 75% cpu load
|
| These numbers are meaningless without measuring watt-hours
| used for the task.
|
 | I was able to play 1080p H.264 video with hardware
 | acceleration on an 8800 GS with an Athlon X2 5000, with
 | about the same CPU utilization, back in 2008-2009. There
 | was a special library (shareware) that enabled HW
 | acceleration way before it was commonplace on integrated
 | GPUs. I forget what it was called, but it was Nvidia/CUDA
 | only.
|
| That was 12+ years ago.
|
 | Obviously GPUs have become more efficient since then, but
 | so have CPUs. It also matters how efficiently the video
 | stream was encoded. It's entirely possible that under
 | certain options, hardware decoding's advantages are almost
 | entirely negated.
| kimixa wrote:
 | Also, there are "levels" of hardware acceleration - using
 | CUDA (or any other shader-level acceleration) will always
 | be less efficient than a dedicated hardware block.
|
 | And there are multiple steps in decoding a video. Some
 | steps in some codecs fit different acceleration schemes
 | better, so a full-pipeline decode may not be worth the
 | hardware cost at one point in time; later, transistors get
 | cheaper or new hw decode techniques are discovered, and
 | more steps can be done in dedicated hardware blocks. Those
 | hardware blocks may also have hard limits - if one can only
 | (say) cope with 1080p60 at a certain profile level for a
 | codec, trying to do anything beyond that will likely skip
 | the HW block completely, as it's hard to do any kind of
 | "hybrid" decode if it's not a whole pipeline step.
|
| "HW Video Decode Acceleration" isn't a simple boolean.
| kllrnohj wrote:
| > The usefulness of hardware acceleration for video decoding
| is highly debatable.
|
| No it isn't. There's a reason it's used on 99% of consumer
| devices. Hardware companies are generally not in the business
| of adding to the BOM cost for no reason. Linux alone is the
| outlier.
|
| > It's not always much more energy efficient, but it
| sometimes is, but less than you'd think, GPUs need power too
|
| "As you can see a GPU enabled VLC is 70% more energy
| efficient than using the CPU!"
|
| https://devblogs.microsoft.com/sustainable-software/vlc-
| ener...
|
| chrome-hw showing 1/4th the power consumption of chrome-sw on
| the same video on more recent Apple M1:
| https://singhkays.com/blog/apple-silicon-m1-video-power-
| cons...
|
| Also hardware decoders have consistent performance, which is
| not true of CPU-based decoders. This is especially
| problematic & obvious at high resolutions. Windows & MacOS
| ultrabooks can do 4k video all day long without an issue.
| Linux ultrabooks get noticeably choppy at 1440p and 4k is
| right out.
|
| This is also why you'll find ultra-low end SoCs regularly
| prioritizing hardware decoders over faster CPUs, notably
| those in every smart TV & the majority of TV streaming
| dongles/sticks/boxes. Which really shouldn't be surprising,
| fixed-function hardware has _always_ been drastically more
| efficient than programmable hardware, and video has changed
| nothing about that.
|
| > 2) It increases greatly the complexity of client software
| that has to implement both accelerated and unaccelerated
| decoding, leading to poorer software quality
|
| Sounds like a job for a library, which is how every other OS
| makes this a non-issue.
|
| > 4) HW support usually lags behind state of the art
| encoding. Youtube is already using av1, but the vast majority
| of devices won't support it in hardware before something else
| comes up
|
| Youtube also still uses VP9 so that power efficiency didn't
| regress on existing hardware, and mid-tier TV SoCs with AV1
| decoder support are already here (such as the Amlogic
| S905X4). Sony's 2021 BRAVIA XR line also has HW AV1 decoders
| up to 4k.
|
| > 5) Highly optimised decoders, such as dav1d, are extremely
| effective and save bandwidth and power compared to HW VP9.
|
 | Care to back that up with a source? All I can find are
 | statements that dav1d is fast, but no evidence that it is
 | efficient. The only thing I can find is this:
 | https://visionular.com/en/av1-encoder-optimization-
 | from-the-...
|
 | which has dav1d using more power than ffmpeg-h264 but less
 | than openhevc - but those are also software decoders, which,
 | as noted above, take _significantly_ more power than
 | hardware decoders for the same codecs.
| [deleted]
| tau255 wrote:
| Disagree.
|
 | I can run multiple 1080p Twitch streams with mpv using
 | streamlink and the appropriate decoder flags, while watching
 | even one stream in Chromium puts a lot of strain on my
 | laptop and gets the fan running immediately.
|
 | So from my perspective it is very useful to offload video
 | decoding to the GPU and leave CPU cycles for other work. Is
 | it more energy efficient? I never checked, but the GPU fan
 | does not really spin any faster, and looking at the
 | temperature graphs it does not seem to strain it much.
|
 | I tried enabling GPU acceleration in the browser (Chromium-
 | based) and I still don't really know why it is so flaky and
 | unreliable.
| pantalaimon wrote:
| > 2) It increases greatly the complexity of client software
| that has to implement both accelerated and unaccelerated
| decoding, leading to poorer software quality
|
 | Only if you are not using any abstraction layers. GStreamer
 | should take care of using a hardware decoder if one is
 | available, and otherwise fall back to software decoding.
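 |
 | A minimal sketch of that fallback in C: playbin picks the
 | highest-ranked decoder element at runtime, so a hardware
 | decoder is autoplugged when one exists for the stream and a
 | software decoder is used otherwise. Build with
 | pkg-config --cflags --libs gstreamer-1.0:
 |
 |     #include <gst/gst.h>
 |
 |     int main(int argc, char *argv[])
 |     {
 |         gst_init(&argc, &argv);
 |         if (argc < 2) {
 |             g_printerr("usage: %s <uri>\n", argv[0]);
 |             return 1;
 |         }
 |
 |         /* playbin assembles the demux/decode/display
 |          * pipeline itself; decoder choice is by element
 |          * rank, not hardcoded, which is where the HW/SW
 |          * fallback lives. */
 |         GstElement *play =
 |             gst_element_factory_make("playbin", NULL);
 |         g_object_set(play, "uri", argv[1], NULL);
 |         gst_element_set_state(play, GST_STATE_PLAYING);
 |
 |         /* Block until an error or end-of-stream. */
 |         GstBus *bus = gst_element_get_bus(play);
 |         GstMessage *msg = gst_bus_timed_pop_filtered(
 |             bus, GST_CLOCK_TIME_NONE,
 |             GST_MESSAGE_ERROR | GST_MESSAGE_EOS);
 |         if (msg)
 |             gst_message_unref(msg);
 |
 |         gst_object_unref(bus);
 |         gst_element_set_state(play, GST_STATE_NULL);
 |         gst_object_unref(play);
 |         return 0;
 |     }
 |
 | The application code is identical either way; only the
 | autoplugged element differs.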
| brigade wrote:
| Hybrid decoders that use GPU shaders are somewhat rare; HW
| decoding pretty much always means "ASIC". And ASIC power draw
| for decoders is typically in the <1W range.
|
| For dav1d, even YouTube-tier 1080p SW decoding is using +4-5W
| on my laptop, and 4k60 is +15-20W.
| fguerraz wrote:
| Yes, again, I'm talking about PCs here, where it's usually
| implemented in shaders.
| CameronNemo wrote:
| But the email is about ARM SoCs with dedicated VPU IP
| blocks.
| kllrnohj wrote:
| No it isn't. "NVDEC" is an actual ASIC block in the GPU
| silicon. It's not "shaders". Same with AMD's VCN. And
| Intel's QuickSync.
|
| If it was just shaders then there'd be basically no
| concerns with driver quality or hardware support, just
| like there aren't with CPU decoders.
| brigade wrote:
 | So was I? Which phone can even achieve a 20W power
 | draw...
 |
 | The only hybrid VP9 decoders were AMD's, which only
 | supported Windows and which they stopped shipping years ago
 | (any current/Linux AMD drivers that support VP9 decoding
 | only do so via an ASIC), and Intel's, which was only
 | supported on 3 generations of GPUs (Gen7.5, Gen8, and
 | Gen9) and was obsoleted by an ASIC in Gen9.5.
| AshamedCaptain wrote:
| > ASIC power draw for decoders is typically in the <1W
| range.
|
 | Many times even "standalone" HW decoders use or share GPU
 | components (e.g., almost always the memory). Just bumping up
 | the GPU's memory controller clock already consumes >10W on
 | my system.
| Arnavion wrote:
| >HW decoding pretty much always means "ASIC"
|
| Indeed. For example, hardware decoding is the difference
| between choppy video and smooth video on the PinePhone
| because the CPU isn't powerful enough and the GPU is
| useless for decoding.
|
| (And to fguerraz's edit that their comment doesn't apply to
| mobile phones "where manufacturers control hardware and
| software end to end", the manufacturer does not control the
| software on the PinePhone.)
| [deleted]
| megous wrote:
 | This is a kernel API for VPUs, not for GPUs.
 |
 | The power reduction is not really questionable. You can't
 | achieve smooth playback at full resolution without the VPU
 | on the devices where these things are used.
| qwerty456127 wrote:
 | I just wonder how the presence of a niche codec in the
 | kernel affects kernel size and performance.
 |
 | I would be glad if everybody used it so it became
 | mainstream, but the reality is H.264 and H.265.
| rjsw wrote:
 | This is a hardware driver that conforms to a standard
 | interface; it doesn't implement a VP9 codec in software in
 | the kernel.
| marcodiego wrote:
 | This is a driver interface. If it is not standardized, we get
 | that ugly situation where a userspace app only works with
 | hardware from a specific vendor.
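 |
 | To make that concrete, here is a rough sketch of the
 | standardized userspace side using the media request API.
 | The control and struct names (V4L2_CID_STATELESS_VP9_FRAME,
 | struct v4l2_ctrl_vp9_frame) follow this patch series and
 | could still change before merging; decode_frame is a made-up
 | helper, and error handling and buffer setup are omitted:
 |
 |     #include <sys/ioctl.h>
 |     #include <linux/media.h>
 |     #include <linux/videodev2.h>
 |
 |     /* Decode one frame: userspace parses the VP9 headers
 |      * and hands the parameters plus the bitstream to the
 |      * driver, the same way for every conforming vendor. */
 |     void decode_frame(int video_fd, int media_fd,
 |                       struct v4l2_ctrl_vp9_frame *params,
 |                       struct v4l2_buffer *bitstream_buf)
 |     {
 |         int req_fd;
 |
 |         /* A request ties this frame's controls and buffer
 |          * together. */
 |         ioctl(media_fd, MEDIA_IOC_REQUEST_ALLOC, &req_fd);
 |
 |         /* Attach the parsed frame parameters to the
 |          * request. */
 |         struct v4l2_ext_control ctrl = {
 |             .id   = V4L2_CID_STATELESS_VP9_FRAME,
 |             .size = sizeof(*params),
 |             .ptr  = params,
 |         };
 |         struct v4l2_ext_controls ctrls = {
 |             .which      = V4L2_CTRL_WHICH_REQUEST_VAL,
 |             .request_fd = req_fd,
 |             .count      = 1,
 |             .controls   = &ctrl,
 |         };
 |         ioctl(video_fd, VIDIOC_S_EXT_CTRLS, &ctrls);
 |
 |         /* Queue the bitstream against the same request;
 |          * the decoded frame shows up on the CAPTURE
 |          * queue. */
 |         bitstream_buf->flags |= V4L2_BUF_FLAG_REQUEST_FD;
 |         bitstream_buf->request_fd = req_fd;
 |         ioctl(video_fd, VIDIOC_QBUF, bitstream_buf);
 |
 |         /* Hand the whole request to the driver. */
 |         ioctl(req_fd, MEDIA_REQUEST_IOC_QUEUE);
 |     }
 |
 | The point is that nothing above is vendor-specific: the same
 | code should drive any driver implementing this interface.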
| dcgudeman wrote:
| youtube uses VP9 so I wouldn't call it niche.
| _joel wrote:
 | Yes, unless you force it to H.264 it'll default to VP9,
 | because, well, Google.
| qwerty456127 wrote:
 | Many videos are not available in VP9. I noticed this a
 | couple of years ago when I had to use vanilla Ubuntu without
 | the "install 3rd-party software" checkbox checked during
 | installation - Firefox refused to play many YouTube videos.
 |
 | It also supposedly makes sense to force H.264 to increase
 | the chances of hardware acceleration being used.
| rememberlenny wrote:
 | Could someone explain the significance of this, why it took
 | so long, and what it opens up?
| rkangel wrote:
 | VP9 is an open-source, royalty-free video codec. It was
 | developed by Google to provide a free alternative to codecs
 | like H.264 and H.265, which require paying licence fees to
 | implement.
|
 | Codecs can be implemented in software, but also in hardware,
 | which is much more power efficient. This change lets the
 | Linux kernel expose hardware VP9 decoders so that software
 | can decode (play) VP9 video much more efficiently when that
 | hardware is available.
| [deleted]
| maggit wrote:
| It looks like it implements hardware acceleration of the VP9
| codec for some specific hardware (Rockchip VDEC and Hantro G2).
| This opens up playing, for example, lots of YouTube videos with
| less CPU usage on devices with that hardware. I can't comment
| on whether or not it "took so long" as I have no idea which
| hardware this is.
|
| The title makes it out to be something fundamental in Linux,
| but this is just one driver becoming more complete.
| megous wrote:
 | It's a new media subsystem userspace API, not just a new
 | driver, and the API will be stable from the get-go instead
 | of languishing in the staging area like the H.264 one.
| CameronNemo wrote:
| Is the h264 one going to move out of staging anytime soon?
| marcodiego wrote:
 | AFAIK the RK3399 is special in this area: its codecs need no
 | binary blobs. This means it can encourage other vendors to do
 | the same and get RYF-certified. ARM SBCs based on the RK3399
 | could become the only modern affordable systems with RYF
 | certification.
| rjsw wrote:
| I don't think the Allwinner codecs need binary blobs.
| Teknoman117 wrote:
 | Now if only the standard release for the PineBook Pro used a
 | kernel newer than 5.7, so we could get hardware codecs.
 |
 | I might just have to sit down and figure out how to cross-
 | compile Gentoo for it.
| marcodiego wrote:
| Try armbian
| yjftsjthsd-h wrote:
| You don't need to go full Gentoo to install a custom kernel.
| From Manjaro you could even pull the linux-mainline AUR
| package and just build+install that (or linux-git or any of
| the others), if you want the easy way out.
| Teknoman117 wrote:
| It was more of a "there are a ton of packages which also
| need to be patched" kind of thing as well.
| CameronNemo wrote:
| I mean compiling your own kernel and compiling the OS are
| very different. I am running 5.14 on my PBP right now.
|
| No support for external displays, though.
___________________________________________________________________
(page generated 2021-09-14 23:01 UTC)