[HN Gopher] GTK: Introducing Graphics Offload
___________________________________________________________________
GTK: Introducing Graphics Offload
Author : signa11
Score : 249 points
Date : 2023-11-18 08:31 UTC (14 hours ago)
(HTM) web link (blog.gtk.org)
(TXT) w3m dump (blog.gtk.org)
| ng55QPSK wrote:
| Is the same infrastructure available in Windows and MacOS?
| diath wrote:
| The last paragraph says:
|
| > At the moment, graphics offload will only work with Wayland
| on Linux. There is some hope that we may be able to implement
| similar things on MacOS, but for now, this is Wayland-only. It
| also depends on the content being in dmabufs.
| knocte wrote:
| From the article:
|
| > What are the limitations?
|
| > At the moment, graphics offload will only work with Wayland
| on Linux. There is some hope that we may be able to implement
| similar things on MacOS, but for now, this is Wayland-only. It
| also depends on the content being in dmabufs.
| PlutoIsAPlanet wrote:
| macOS supports similar things in its own native stack, but
| GTK doesn't make use of it.
| pjmlp wrote:
| Nope, it is yet another step making Gtk only relevant for Linux
| development.
| andersa wrote:
| Has it been relevant for something else before?
| pjmlp wrote:
| Yes, back when G stood for Gimp, and not GNOME.
| danieldk wrote:
| Even in those days, Gtk+ applications were quite horrible
| on non-X11 platforms. GTK has never been a good cross-
| platform toolkit in contrast to e.g. Qt.
| pjmlp wrote:
| I guess people enjoying GIMP and Inkscape would beg to
| differ.
| danieldk wrote:
| Maybe it's a thing on Windows (I don't know), but I've
| never seen anyone use GIMP or Inkscape on macOS. I'm
| pretty sure they exist somewhere, but all Mac users I
| know use Photoshop, Pixelmator or Affinity Photo rather
| than GIMP.
| ognarb wrote:
| Inkscape devs have a lot of trouble making their app work
| on other OSes.
| jdub wrote:
| Supporting a feature on one platform does not make a toolkit
| less relevant or practical on another platform.
| pjmlp wrote:
| Except this has been happening for quite a while, hence why
| a couple of cross platform projects have migrated from Gtk
| to Qt, including Subsurface, a bit ironically, given the
| relationship of the project to Linus.
| jdub wrote:
| Gtk has always been primarily built by and for Linux
| users.
| pjmlp wrote:
| The GIMP Tolkit was always cross platform, the GNOME
| Tolkit not really.
| jdub wrote:
| That is ahistorical, and the misnaming doesn't help make
| your point.
| pjmlp wrote:
| As a former random Gtkmm contributor, with articles in
| The C/C++ Users Journal, I am not the revisionist here.
| DonHopkins wrote:
| What's a Tolkit? And why two of them? I thought GTK was
| the Toolkit, GIMP was the Image Manipulation Program, and
| Gnome was the desktop Network Object Model Environment.
| Am I a revisionist here? (I certainly have my
| reservations about them!)
| pjmlp wrote:
| GTK stands for The GIMP Toolkit, as it was originally used
| to write GIMP, which actually started as a MOTIF
| application.
|
| When GNOME adopted GTK as its foundation, there was a
| clear separation between GTK and the GNOME libraries,
| back in the 1.0 - 2.0 days.
|
| Eventually GNOME's needs became GTK's roadmap.
|
| The rest one can find in the history books.
| DonHopkins wrote:
| Dude, I know. I've been implementing user interface
| toolkits since the early 80's, but I've still never heard
| of a "Tolkit", which you mentioned twice, so I asked you
| what it was -- are you making a silly pun like "Tollkit"
| for "Toolkit" or "Lamework" for "Framework" or "Bloatif"
| for "Motif" and I'm missing it? No hits on urban
| dictionary, even. And also you still haven't explained
| whether I'm a revisionist or not.
|
| Just like you, I love to write articles about user
| interface stuff all the time, too. Just in the past week:
|
| My enthusiastic but balanced response to somebody who
| EMPHATICALLY DEMANDED PIE MENUS ONLY for GIMP, and who
| loves pie fly, but pushed my button by defending the name
| GIMP by insisting that instead of the GIMP project simply
| and finally conceding its name is offensive, that our
| entire society adapt by globally re-signifying a widely
| known offensive hurtful word (so I suggested he first go
| try re-signifying the n-word first, and see how that
| went):
|
| https://news.ycombinator.com/item?id=38233793
|
| (While I would give more weight to the claim that the
| name GIMP is actually all about re-signifying an
| offensive term if it came from a qualified and empathic
| and wheelchair using interface designer like Haraldur
| Ingi Thorleifsson, I doubt that's actually the real
| reason, just like it's not white people's job to re-
| signify the n-word by saying it all the time...)
|
| Meet the man who is making Iceland wheelchair accessible
| one ramp at a time:
|
| https://scoop.upworthy.com/meet-the-man-who-is-making-
| icelan...
|
| Elon Musk apologises after mocking laid-off Twitter
| worker, questioning his disability:
|
| https://www.abc.net.au/news/2023-03-08/elon-musk-
| haraldur-th...
|
| The article about redesigning GIMP we were discussing
| credited Blender with being the first to show what mouse
| buttons do what at the bottom of the screen, which
| actually the Lisp Machine deserves credit for, as far as
| I know:
|
| https://news.ycombinator.com/item?id=38237231
|
| I made a joke about how telling GIMP developers to make
| it more like Photoshop was like telling RMS to develop
| Open Software for Linux, instead of Free Software for
| GNU/Linux, and somebody took the bait so I flamed about
| the GIMP developer's lack of listening skills:
|
| https://news.ycombinator.com/item?id=38238274
|
| Somebody used the phrase "Easy as pie" in a discussion
| about user interface design so I had to chime in:
|
| https://news.ycombinator.com/item?id=38239113
|
| Discussion about HTML Web Components, in which I confess
| my secret affair with XML, XSLT, obsolete proprietary
| Microsoft technologies, and Punkemon pie menus:
|
| https://news.ycombinator.com/item?id=38253752
|
| Deep interesting discussion about Blender 4.0 release
| notes, focusing on its historic development and its
| developer's humility and openness to its users'
| suggestions, in which I commented on its excellent Python
| integration.
|
| https://news.ycombinator.com/item?id=38263171
|
| Comment on how Blender earned its loads of money and
| support by being responsive to its users.
|
| https://news.ycombinator.com/item?id=38232404
|
| Dark puns about user interface toolkits and a cheap shot
| at Motif, with an analogy between GIMP and Blender:
|
| https://news.ycombinator.com/item?id=38263088
|
| A content warning to a parent who wanted to know which
| videos their 8-year-old should watch on YouTube to learn
| Blender:
|
| https://news.ycombinator.com/item?id=38288629
|
| Posing with a cement garden gnome flipping the bird with
| Chris Toshok and Miguel de Icaza and his mom at GDC2010:
|
| https://www.facebook.com/photo/?fbid=299606531754&set=a.5
| 173...
|
| https://www.facebook.com/photo/?fbid=299606491754&set=a.5
| 173...
|
| https://www.facebook.com/photo/?fbid=299606436754&set=a.5
| 173...
| vore wrote:
| It was clearly a typo you could choose to ignore
| charitably instead of nitpick. Also, what is the rest of
| this comment and how is it related to GTK?
| DonHopkins wrote:
| Because he was incorrectly nitpicking himself, and was
| wrong to call somebody else a revisionist without citing
| any proof, while he was factually incorrect himself, and
| offering an appeal to authority of himself as a writer
| and "random Gtkmm contributor" instead. I too have lots
| of strong opinions about GTK, GNOME, and GIMP, so I am
| happy for the opportunity to write them up, summarize
| them, and share them.
|
| You'll have to read the rest of the comment and follow
| the links to know what it says, because I already wrote
| and summarized it, and don't want to write it again just
| for you, because I don't believe you'd read it a second
| time if you didn't read it the first time. Just use
| ChatGPT, dude.
|
| Then you will see that it has a lot to do with GTK and
| GNOME and GIMP, even including exclusive photos of Miguel
| de Icaza and his mom with a garden gnome flipping the
| bird.
| pjmlp wrote:
| Oopsie I touched a nerve.
| DonHopkins wrote:
| You HAD to mention MOTIF! ;) There's a reason I call it
| BLOATIF and SLOWTIF...
|
| https://donhopkins.medium.com/the-x-windows-
| disaster-128d398...
|
| >The Motif Self-Abuse Kit
|
| >X gave Unix vendors something they had professed to want
| for years: a standard that allowed programs built for
| different computers to interoperate. But it didn't give
| them enough. X gave programmers a way to display windows
| and pixels, but it didn't speak to buttons, menus, scroll
| bars, or any of the other necessary elements of a
| graphical user interface. Programmers invented their own.
| Soon the Unix community had six or so different interface
| standards. A bunch of people who hadn't written 10 lines
| of code in as many years set up shop in a brick building
| in Cambridge, Massachusetts, that was the former home of
| a failed computer company and came up with a "solution:"
| the Open Software Foundation's Motif.
|
| >What Motif does is make Unix slow. Real slow. A stated
| design goal of Motif was to give the X Window System the
| window management capabilities of HP's circa-1988 window
| manager and the visual elegance of Microsoft Windows. We
| kid you not.
|
| >Recipe for disaster: start with the Microsoft Windows
| metaphor, which was designed and hand coded in assembler.
| Build something on top of three or four layers of X to
| look like Windows. Call it "Motif." Now put two 486 boxes
| side by side, one running Windows and one running
| Unix/Motif. Watch one crawl. Watch it wither. Watch it
| drop faster than the putsch in Russia. Motif can't
| compete with the Macintosh OS or with DOS/Windows as a
| delivery platform.
| smoldesu wrote:
| > Eventually GNOME's needs became GTK's roadmap.
|
| Exactly? If you're still holding out for GTK to be a non-
| Linux toolkit in 2023 then you're either an incredibly
| misguided contributor and/or ignorant of the history
| behind the toolkit. The old GTK does not exist anymore,
| you either use GNOME's stack or you don't.
| chrismorgan wrote:
| GNOME co-opted and sabotaged GTK for anyone that's not
| GNOME. GTK used to be capable of being fairly OS-neutral,
| and was certainly quite neutral within Linux and so
| became the widget toolkit of choice for diverse desktop
| environments and worked well thus; but over time GNOME
| has taken it over completely, and the desires of other
| desktop environments are utterly ignored. The GNOME
| Foundation has become a very, very bad custodian for GTK.
|
| As you say, the old GTK is dead. GNOME murdered it. I
| mourn it.
| smoldesu wrote:
| Yeah, I don't disagree with anything you've said. Still
| though, I use GTK because it works and think the pushback
| against it is silly. GTK was never destined to be the
| cross-platform native framework. If that was attainable,
| people would have forked GTK 2 (for what?) or GTK 3 (too
| quirky). Now we're here, and the only stakeholder on the
| project is the enormously opinionated GNOME team.
|
| They've made a whole lot of objective and subjective
| missteps in the past, but I don't think it's fair to
| characterize them as an evil party here. They did the
| work, they reap the rewards, and they take the flak for
| the myriad of different ways the project could/should
| have gone.
| jamesfmilne wrote:
| macOS has IOSurface [0], so it can be done there too. It would
| require someone to implement it for GTK.
|
| [0] https://developer.apple.com/documentation/iosurface
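|
| For reference, allocating a shareable surface with the C API
| looks roughly like this (a sketch; the property keys are from
| the IOSurface headers, error handling omitted):
|
|     #include <IOSurface/IOSurface.h>
|
|     /* Create a BGRA surface that another process can map. */
|     static IOSurfaceRef create_shared_surface (void)
|     {
|         int w = 1920, h = 1080, bpe = 4;
|         int fmt = 'BGRA';
|         CFStringRef keys[4] = {
|             kIOSurfaceWidth, kIOSurfaceHeight,
|             kIOSurfaceBytesPerElement, kIOSurfacePixelFormat
|         };
|         CFNumberRef vals[4] = {
|             CFNumberCreate (NULL, kCFNumberIntType, &w),
|             CFNumberCreate (NULL, kCFNumberIntType, &h),
|             CFNumberCreate (NULL, kCFNumberIntType, &bpe),
|             CFNumberCreate (NULL, kCFNumberIntType, &fmt),
|         };
|         CFDictionaryRef props = CFDictionaryCreate (NULL,
|             (const void **) keys, (const void **) vals, 4,
|             &kCFTypeDictionaryKeyCallBacks,
|             &kCFTypeDictionaryValueCallBacks);
|         IOSurfaceRef surf = IOSurfaceCreate (props);
|         CFRelease (props);
|         /* Share with another process via
|          * IOSurfaceCreateMachPort(). */
|         return surf;
|     }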
| audidude wrote:
| When I wrote the macOS backend and GL renderer I made them
| use IOSurface already. So it's really a matter of setting up
| CALayer automatically the same way that we do it on Linux.
|
| I don't really have time for that though, I only wrote the
| macOS port because I had some extra holiday hacking time.
| torginus wrote:
| On Windows and DirectX, you have the concept of Shared Handles,
| which are essentially handles you can pass between process
| boundaries. It also comes with a mutex mechanism to signal who
| is using the resource at the moment. Fun fact - Windows at the
| kernel level works with the concept of 'objects', which can be
| file handles, window handles, threads, mutexes, or in this
| case, textures, which are reference counted. Sharing a
| particular texture is just exposing the handle to multiple
| processes.
|
| A bit of reading if you are interested:
|
| https://learn.microsoft.com/en-us/windows/win32/direct3darti...
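|
| A rough sketch of the round trip in C, using the COM macros
| from d3d11.h (the texture must have been created with
| D3D11_RESOURCE_MISC_SHARED; the IDXGIKeyedMutex
| synchronization and error handling are left out):
|
|     #define COBJMACROS
|     #include <d3d11.h>
|     #include <dxgi.h>
|
|     /* Producer: export a shared handle for a texture. */
|     HANDLE export_texture (ID3D11Texture2D *tex)
|     {
|         IDXGIResource *res = NULL;
|         HANDLE handle = NULL;
|         ID3D11Texture2D_QueryInterface (tex,
|             &IID_IDXGIResource, (void **) &res);
|         IDXGIResource_GetSharedHandle (res, &handle);
|         IDXGIResource_Release (res);
|         return handle; /* pass across the process boundary */
|     }
|
|     /* Consumer: open the same texture on another device,
|      * possibly in another process. */
|     ID3D11Texture2D *import_texture (ID3D11Device *dev,
|                                      HANDLE handle)
|     {
|         ID3D11Texture2D *tex = NULL;
|         ID3D11Device_OpenSharedResource (dev, handle,
|             &IID_ID3D11Texture2D, (void **) &tex);
|         return tex;
|     }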
| diath wrote:
| I wonder if there are plans to make it work with X11 in the
| future. I've yet to see the benefit of trying to switch to
| Wayland on my desktop; as-is, it just doesn't work the way my
| 8-year-old setup does.
| knocte wrote:
| I doubt they have the energy to backport bleeding edge tech.
| PlutoIsAPlanet wrote:
| This is one of the benefits of the Wayland protocol over X,
| being able to do this kind of thing relatively
| straightforwardly.
|
| Once support for hardware planes becomes more common in Wayland
| compositors, this can be tied to ultimately allow no-copy
| rendering to the display for non-fullscreen applications, which
| for video playback (incl. the likes of YouTube) means reduced
| CPU & GPU usage and less power draw, as well as reduced
| latency.
| AshamedCaptain wrote:
| > This is one of the benefits of the Wayland protocol over X
|
| What.
|
| The original design of X actually encouraged a separate
| surface / Window for each single widget on your UI. This was
| actually removed in Gtk+3 ("windowless widgets"). And now
| they are bringing it back just for wayland ("subsurfaces").
| As far as I can read, it is practically the same concept.
| tadfisher wrote:
| The original design of X had clients send drawing commands
| to the X server, basically treating the server as a remote
| Cairo/Skia-like 2D rasterizer, and subwindows were a cheap
| way to avoid pixel-level damage calculations. This was
| obviated in the common case by the Xdamage extension. Later
| use of windows as a rendering surface for shared
| client/server buffers was added with Xshm, then for device
| video buffers with Xv.
|
| GTK3 got rid of windowed widgets because Keith Packard
| introduced the Xrender extension, which basically added 2D
| compositing to X and removed the last remaining use for
| per-widget subwindows.
| AshamedCaptain wrote:
| This is completely wrong. Xrender is completely
| orthogonal to having windows or not. Heck, Xrender takes
| a _window_ as target -- Xrender is just an extension to
| allow more complicated drawing commands to be sent to the
| server (like alpha composition). You make your toolkit's
| programmer's life more complicated, not less, by having
| windowless widgets (at the very minimum you now have to
| complicate your rendering & event handling code with
| offsets and clip regions and the like).
|
| The excuse that was used when introducing windowless
| widgets is to reduce tearing/noise during resizing, as
| Gtk+ had trouble synchronizing the resizing of all the
| windows at the same time.
| play_ac wrote:
| >Xrender is completely orthogonal to having windows or
| not. Heck, Xrender takes a _window_ as target -- Xrender
| is just an extension to allow more complicated drawing
| commands to be sent to the server (like alpha
| composition).
|
| Yes, that's the point. When you can tell Xrender to
| efficiently composite some pixmaps then there's really no
| reason to use sub-windows ever.
|
| >You make your toolkit's programmer's life more
| complicated, not less, by having windowless widgets (at
| the very minimum you now have to complicate your code
| with offsets and clip regions and the like).
|
| No, you still had to have offsets and clip regions before
| too because the client still had to set and update those.
| And it was more complicated because when you made a sub-
| window every single bit of state like that had to be
| synchronized with the X server and repeatedly copied over
| the wire. With client-side rendering everything is simply
| stored in the client and never has to deal with that
| problem.
| AshamedCaptain wrote:
| > When you can tell Xrender to efficiently composite some
| pixmaps then there's really no reason to use sub-windows
| ever.
|
| There is, or we would not be having subsurfaces on
| Wayland or this entire discussion in the first place.
|
| Are you seriously arguing that the only reason to use
| windows in Xorg is to have composition? People were using
| Xshape/Xmisc and the like to handle the lack of alpha
| channels in the core protocol? This is not what I
| remember. I would be surprised if Xshape even worked on
| non-top-level windows. Heck, even MOTIF had windowless
| widgets (called gadgets iirc), and the purpose most
| definitely was not composition-related.
| audidude wrote:
| Drawables on Drawables doesn't help here at all.
|
| Sure it lets you do fast 2d acceleration but we don't use
| 2d accel infrastructure anywhere anymore.
|
| Subsurfaces have been in Wayland since the beginning of
| the protocol.
|
| This is simply getting them to work on demand so we can
| do something X (or Xv) could never do since Drawables
| would get moved to new memory (that may not even be
| mappable on the CPU-side) on every frame.
|
| And that's to actually use the scanout plane correctly to
| avoid powering up the 3d part of the GPU when doing video
| playback on composited systems.
| hurryer wrote:
| No screen tearing is a major benefit of using a compositor.
| mrob wrote:
| And screen tearing is a major benefit of not using a
| compositor. There's an unavoidable tradeoff between image
| quality and latency. Neither is objectively better than the
| other. Xorg has the unique advantage that you can easily
| switch between them by changing the TearFree setting with
| xrandr.
| RVuRnvbM2e wrote:
| It's not unique. Wayland has a tearing protocol.
|
| https://gitlab.freedesktop.org/wayland/wayland-
| protocols/-/t...
| mrob wrote:
| This is something that every application has to opt in to
| individually. It's not a global setting like TearFree.
| RVuRnvbM2e wrote:
| This is just untrue.
| mrob wrote:
| From the XML file that describes the protocol:
|
| "This global is a factory interface, allowing clients to
| inform which type of presentation the content of their
| surfaces is suitable for."
|
| Note that "global" refers to the interface, not the
| setting. Which Wayland compositor has the equivalent
| feature of "xrandr --output [name] --set TearFree off"?
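|
| For contrast, the per-surface opt-in on the client side looks
| roughly like this (a sketch, assuming a header generated from
| the tearing-control-v1 XML with wayland-scanner):
|
|     #include <wayland-client.h>
|     #include "tearing-control-v1-client-protocol.h"
|
|     /* Ask the compositor to allow tearing page flips for one
|      * surface. The manager global must already have been
|      * bound from the wl_registry. */
|     static void
|     allow_tearing (struct wp_tearing_control_manager_v1 *mgr,
|                    struct wl_surface *surface)
|     {
|         struct wp_tearing_control_v1 *tc =
|             wp_tearing_control_manager_v1_get_tearing_control (
|                 mgr, surface);
|         wp_tearing_control_v1_set_presentation_hint (tc,
|             WP_TEARING_CONTROL_V1_PRESENTATION_HINT_ASYNC);
|     }
|
| Whether a compositor exposes a global override on top of this
| is left up to the compositor.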
| kaba0 wrote:
| Which is the correct default? No application should
| unknowingly render half-ready frames, that's stupid. The
| few niches where it makes sense (games, 3D applications)
| can opt into it, and do their own thing.
| badsectoracula wrote:
| That is subjective, i do not want the input latency
| induced by synchronizing to the monitor's refresh rate in
| my desktop as it makes it feel sluggish. The only time i
| want this is when i watch some video (and that is of
| course only when i actively watch the video, sometimes i
| put a video at the background when i do other stuff) - so
| for my case the correct default is to have this disabled
| with the only exception being when watching videos.
| kaba0 wrote:
| Okay, and then the program will get an event that
| propagates down to the correct component, which reacts in
| some way; everything that changes as a result is marked
| damaged, and every damaged component is re-rendered with
| synchronization from the framework itself. An application
| has to be specifically coded (e.g. a text editor writing
| the rendered character directly into the front buffer) to
| actually make efficient use of tearing; otherwise it will
| literally just tear, with zero benefit.
| mrob wrote:
| You don't need to do anything special. Just render to the
| front buffer immediately and don't worry about the
| current scanout position. If it's above the part of the
| screen you're updating, great, you saved some latency. If
| it's below, latency is the same as if you waited for
| vsync. And if it's in the middle, you at least get a
| partial update early.
| JelteF wrote:
| Could you explain in what scenario you think it is better
| to have a display show two half images slightly faster
| (milliseconds) than one full one?
| mrob wrote:
| Text editing. I mostly work on a single line at a time.
| The chance of the tear ending up in that line is low. And
| even if it does, it lasts only for a single screen
| refresh cycle, so it's not a big deal.
|
| And you're not limited to two images. As the frame rate
| increases, the number of images increases and the tear
| becomes less noticeable. Blur Busters explains:
|
| https://blurbusters.com/faq/benefits-of-frame-rate-above-
| ref...
| kaba0 wrote:
| As the frame rate increases, the latency decreases,
| making it a non-issue. I'd rather choose this option over
| blinking screens.
| mrob wrote:
| The minimum latency is bottlenecked by the monitor unless
| you allow tearing.
| AshamedCaptain wrote:
| Many technologies have been invented to allow displaying
| "two half images slightly faster", such as interlaced
| scanning...
|
| Most humans will actually prefer the "slightly faster"
| option. (Obviously if you can do both, then they'd prefer
| that; but given the trade-off...)
| andreyv wrote:
| First person shooters. Vertical synchronization causes a
| noticeable output delay.
|
| For example, with a 60 Hz display and vsync, game actions
| might be shown up to 16 ms later than without vsync,
| which is ages in FPS.
| badsectoracula wrote:
| Input latency. I find the forced vsync by compositors
| annoying even when doing simple stuff like moving or
| resizing windows - it gives a sluggish feel to my
| desktop. This is something i notice even on a high
| refresh rate monitor.
| beebeepka wrote:
| Personally - gaming. Never liked vsync
| maccard wrote:
| > I've yet to see the benefit of trying to switch to Wayland on
| my desktop
|
| how about Graphics Offload?
| diath wrote:
| This feature would be nice-to-have but is not impactful
| enough (at least to me) to outweigh the cons of having to
| switch to Wayland, which would include migrating my DE and
| getting accustomed to it, as well as looking for replacement
| applications for those that do not work properly with Wayland
| (most notably ones that deal with global keyboard hooks).
| Admittedly I have never tried XWayland which I think could
| potentially solve some of these issues.
| maccard wrote:
| I think if you're waiting for a magic-bullet feature to
| upgrade, you might be waiting a long time, and even Wayland
| will be replaced at that point. Instead, look at the
| combination of features (like this) and think about it and
| future upgrades. I think you're right that xwayland is
| probably a compromise for now if you need it for things
| like global shortcuts.
| mnd999 wrote:
| If it worked exactly the same there would indeed be no benefit.
| If you're happy with that you have then there's no reason to
| switch.
| aktuel wrote:
| I am sorry to tell you that X11 is completely unmaintained by
| now. So the chances of that happening are zero.
| NGRhodes wrote:
| FYI 21.1.9 was released less than a month ago
| (https://lists.x.org/archives/xorg/2023-October/061515.html),
| they are still fixing bugs.
| dralley wrote:
| You mean, they're still fixing critical CVEs.
| badsectoracula wrote:
| Which are still bugs.
|
| Also only two of the four changes mentioned in the mail
| are about CVEs.
| AshamedCaptain wrote:
| Frankly, it was X11 which introduced "Graphics Offload" in the
| first place, with stuff like XV, chroma keying, and hardware
| overlays. Then compositors came and we moved to
| texture_from_surface extensions and uploading things into GPUs.
| This is just the eternal wheel of reinventing things in
| computing (TM) doing yet another iteration and unlikely to give
| any tangible benefits over the situation from decades ago.
| play_ac wrote:
| No, nothing like this exists in X11. Xorg still doesn't
| really have support for non-RGB surfaces. DRI3 gets you part
| of the way there for attaching GPU buffers but the way
| surfaces work would have to be overhauled to work more like
| Wayland, where they can be any format supported by the GPU.
| There isn't any incentive to implement this in X11 either
| because X11 is supposed to work over the network and none of
| this stuff would.
|
| Yes, you're technically right that this would have been
| possible years ago but it wasn't actually ever done, because
| X11 never had the ability to do it at the same time as using
| compositing.
| AshamedCaptain wrote:
| > Xorg still doesn't really have support for non-RGB
| surfaces
|
| You really need to add context to these statements, because
| _right now_ I am using through Xorg a program which uses a
| frigging colormap, which is as non-RGB as it gets. The
| entire reason Xlib has this "WhitePixel" and XGetPixel and
| XYPixmap and other useless functions which normally fetch a
| lot of ire is because it tries to go out of its way to
| support practically other-worldly color visuals and image
| formats. If anything, I'd say it is precisely RGB which has
| the most problems with X11, especially when you go beyond
| 24bpp.
|
| > there for attaching GPU buffers
|
| None of this is about the GPU, but about directly
| presenting images for _hardware_ composition using direct
| scan-out, hardware layers or not. Exactly what Xv is about,
| and the reason Xv supports formats like YUV.
|
| > There isn't any incentive to implement this in X11 either
| because X11 is supposed to work over the network and none
| of this stuff would
|
| As if that prevented any of the extensions done to X11 in
| the last three decades, including Xv.
| kaba0 wrote:
| There are plenty of wheel reinventions in IT, but let's not
| pretend that modern graphics are anything like they used to
| be. We have 8k@120Hz screens now; the number of pixels that
| have to be displayed in a short amount of time is staggering.
| AshamedCaptain wrote:
| At the same time, you also have hardware that can push
| those pixels without problem. When X and these technologies
| were introduced, the hardware was not able to store the
| entire framebuffer for one screenful in memory, let alone
| two. Nowadays you are able to store a handful at the very
| minimum. Certainly there's a different level of performance
| here in all parts, but the concepts have changed very
| little, and this entire article kind of shows it.
| audidude wrote:
| This would require protocol changes for X11 at best, and nobody
| is adding new protocols. Especially when nobody does Drawable
| of Drawables anymore and everyone uses client-side drawing with Xshm.
|
| You need to dynamically change stacking of subsurfaces on a
| per-frame basis when doing the CRTC.
| AshamedCaptain wrote:
| I really don't see why it would need a new protocol. You can
| change stacking of "subsurfaces" in the traditional X11
| fashion and you can most definitely do "drawables of
| drawables". At the very least I'd bet most clients still
| create a separate window for video content.
|
| I agree though it would require a lot of changes to the
| server and no one is in the mood (like, dynamically decide
| whether I composite this window or push it to a Xv port or
| hardware plane? practically inconceivable in the current
| graphics stack, albeit it is not a technical X limitation
| per-se). This entire feature is also going to be pretty
| pointless in Wayland desktop space either way because no one
| is in the mood either -- your dmabufs are going to end up in
| the GPU anyway for the foreseeable future, just because of
| the complexity of liftoff, variability of GPUs, and the like.
| audidude wrote:
| > I really don't see why it would need a new protocol.
|
| You'll need API to remap the Drawable to the scanout plane
| from a compositor on a per-frame basis (so when submitting
| the CRTC) and the compositor isn't in control of the CRTC.
| So...
| AshamedCaptain wrote:
| This assumes it would be the role of the window manager
| or compositor, rather than the server, to decide that,
| which is not how I was thinking about it. But I guess
| it'd make sense (policy vs mechanism). "Per-frame basis"
| I don't see why; it just mirrors Wayland concepts. Still, as
| protocols go, it's quite a minor change, and one
| applications don't necessarily have to support.
| kelnos wrote:
| I would very much doubt it. This would likely require work on
| Xorg itself (a new protocol extension, maybe; I don't believe
| X11 supports anything but RGB, [+A, with XRender] for windows,
| and you'd probably need YUV support for this to be useful),
| which no one seems to care to do. And the GTK developers seem
| to see their X11 windowing backend as legacy code that they
| want to remove as soon as they can do so without getting too
| many complaints.
| MarcusE1W wrote:
| Is this something where it would be helpful if the Linux
| (environment) developers worked together? Like the (graphics)
| kernel, GTK, KDE, Wayland, ... guys all in one room (or video
| conference) to discuss requirements and iron out one graphics
| architecture that is efficient and transparent?
|
| I think it's good that different graphics systems exist but it
| feels unnecessary that every team has to make their own
| discoveries how to handle the existing pieces.
|
| If work were coordinated at least on the requirements and
| architecture level then I think a lot of synergies could be
| achieved. After that, everyone can implement the architecture
| the way that works best for their use case, but some common
| elements could be relied on.
| rawoul wrote:
| They are:
|
| https://indico.freedesktop.org/event/4/
|
| https://emersion.fr/blog/2023/hdr-hackfest-wrap-up/
|
| https://gitlab.freedesktop.org/wayland/wayland-protocols/-/w...
|
| ...
| jdub wrote:
| They do, it's just not hugely visible. Two great conferences
| where some of that work happened were linux.conf.au and the
| Linux Plumbers Conference.
| dontlaugh wrote:
| That's exactly how Wayland came to be.
| BearOso wrote:
| That's exactly what happened. This is the original intent for
| subsurfaces. A bunch of Wayland developers got together and
| wrote the spec a long time ago. The only thing happening now is
| Gtk making use of them transparently in the toolkit.
|
| Subsurfaces didn't have bug-free implementations for a while,
| so maybe some people avoided them. But I know some of us
| emulator programmers have been using them for output
| (especially because they can update asynchronously from the
| parent surface), and I think a couple media players do, too.
| It's not something that most applications really need.
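|
| The raw client-side setup is small -- a sketch, assuming the
| wl_compositor and wl_subcompositor globals are already bound:
|
|     #include <wayland-client.h>
|
|     /* Give video frames their own subsurface so the
|      * compositor has the option of scanning them out
|      * directly. parent is the window's main wl_surface. */
|     static struct wl_subsurface *
|     make_video_subsurface (struct wl_compositor *comp,
|                            struct wl_subcompositor *subcomp,
|                            struct wl_surface *parent)
|     {
|         struct wl_surface *video =
|             wl_compositor_create_surface (comp);
|         struct wl_subsurface *sub =
|             wl_subcompositor_get_subsurface (subcomp, video,
|                                              parent);
|         wl_subsurface_set_position (sub, 0, 0);
|         /* Desynchronized: frames can be committed
|          * independently of the parent surface. */
|         wl_subsurface_set_desync (sub);
|         return sub;
|     }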
| jiehong wrote:
| I'm not sure I understand why an overlay allows partial
| offloading while rounding the corners of the video does not.
|
| Couldn't the rounded corners of a video also be an overlay?
|
| I'm sure I'm missing something here, but the article does not
| explain that point.
| andyferris wrote:
| I think it's that the _window_ has rounded corners, and you
| don't want the content appearing outside the window.
| audidude wrote:
| No, you can already be sure it's the right size. This has to
| do with what it takes to occlude the rounded area from the
| final display.
| orra wrote:
| I'd love to know the answer to that. This is fantastic work,
| but it'd be a shame for it to be scunnered by rounded corners.
| phkahler wrote:
| Because the UX folks want what they want. I want my UI out of
| the way, including the corners of video and my CPU load.
| audidude wrote:
| Easily solved by black bars just like people are used to on
| a TV. I assume most video players will do this when in
| windowed mode.
| orra wrote:
| That's a good observation. Plus, you'll get black bars
| anyway, if you resize the window to not be the same
| aspect ratio as the video.
| play_ac wrote:
| >Couldn't the rounded corners of a video also be an overlay?
|
| No, because the clipping is done in the client after the content
| is drawn. The client doesn't have the full screen contents. To
| make it work with an overlay, the clipping would have to be
| moved to the server. There could be another extension that lets
| you pass an alpha mask texture to the server to use as a clip
| mask. But this doesn't exist (yet?)
| audidude wrote:
| If you have the video extend to where the corners are rounded,
| you must use a "rounded clip" on the video on top of the shadow
| region (since they abut).
|
| That means you have to power up the 3d part of the GPU to do
| that (because the renderer does it in shaders).
|
| Whereas if you add some 9 pixels of black above/below to
| account for the rounded corner, there is no clipping of the
| video and you can use hardware scanout planes.
|
| That's important because keeping the 3d part of the GPU turned
| off is a huge power savings. And the scanout plane can already
| scale for you to the correct size.
| unwind wrote:
| Very cool!
|
| I think I found a minor typo:
|
| _GTK 4.14 will introduce a GtkGraphicsOffload widget, whose only
| job it is to give a hint that GTK should try to offload the
| content of its child widget by attaching it to a subsurface
| instead of letting GSK process it like it usually does._
|
| I think that "GSK" near the end should just be "GTK". It's not a
| very near miss on standard qwerty, though ...
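|
| (For context, going by the post, using the new widget would
| look roughly like this -- pre-release 4.14 API, so details may
| change, and "movie.webm" is just a hypothetical file:)
|
|     #include <gtk/gtk.h>
|
|     /* Hint that the child's content should be offloaded to
|      * a subsurface instead of going through GSK. */
|     static GtkWidget *
|     make_offloaded_video (void)
|     {
|         GtkWidget *video =
|             gtk_video_new_for_filename ("movie.webm");
|         GtkWidget *offload = gtk_graphics_offload_new (video);
|         gtk_graphics_offload_set_enabled (
|             GTK_GRAPHICS_OFFLOAD (offload),
|             GTK_GRAPHICS_OFFLOAD_ENABLED);
|         return offload;
|     }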
| pja wrote:
| GSK is the GTK Scene Graph Kit:
| https://en.wikipedia.org/wiki/GTK_Scene_Graph_Kit
| unwind wrote:
| Wow thanks, TIL! Goes to show how far removed I am from GTK
| these days I guess. :/
| caslon wrote:
| Not a typo: https://en.wikipedia.org/wiki/GTK_Scene_Graph_Kit
| ahartmetz wrote:
| It is strange that the article doesn't compare and contrast to
| full-screen direct scanout, which most X11 and presumably Wayland
| compositors implement, e.g. KDE's kwin-wayland since 2021:
| https://invent.kde.org/plasma/kwin/-/merge_requests/502
|
| Maybe that is because full-screen direct scanout doesn't take
| much (if anything) in a toolkit, it's almost purely a compositor
| feature.
| kaba0 wrote:
| Is there a significant difference? Hardware planes are
| basically that, just optionally not full-screen.
| smallstepforman wrote:
| BeOS and Haiku allowed exposure to kernel graphics buffers
| decades ago (https://www.haiku-os.org/legacy-
| docs/bebook/BDirectWindow.ht...), which bypass the roundtrip to
| compositor and back. A companion article describing the design is
| here (https://www.haiku-os.org/legacy-
| docs/benewsletter/Issue3-12....)
|
| After 25 years, GTK joins the party ...
|
| With no native video drivers, moving a VESA Haiku window with
| video playback still seems smoother than doing the same in
| Gnome/Kde under X11.
| pjmlp wrote:
| Like most OSes where the desktop APIs are part of the whole
| developer experience, and not yet another pluggable piece.
| tmountain wrote:
| BeOS was truly ahead of its time. It's a shame that it didn't
| get more traction.
| ris wrote:
| There must be a name for the logical fallacy I see whenever
| someone pines over a past "clearly superior" technology that
| wasn't adopted or had its project cancelled, but I can never
| grasp it. I guess the closest thing to it is an unfalsifiable
| statement.
|
| The problem comes from people remembering all of the positive
| traits (often just _promises_) of a technology but either
| forgetting all the problems it had or the technology never
| being given enough of a chance for people to discover all the
| areas in which it is a bit crap.
|
| BeOS? Impressive technology demo. Missing _loads_ of features
| expected in a modern OS, even at that time. No multi-user
| model even!
|
| This is also a big phenomenon in e.g. aviation. The greatest
| fighter jet _ever_ is always the one that (tragically!) got
| cancelled before anyone could discover its weaknesses.
| darkwater wrote:
| This, plus nostalgia's rose-tinted glasses, plus rooting for
| the underdog (who lost), plus "my niche tech is better than
| your mainstream one".
| AshamedCaptain wrote:
| > There must be a name for the logical fallacy I see
| whenever someone pines over a past "clearly superior"
| technology that wasn't adopted or had its project
| cancelled, but I can never think of it. I guess the closest
| thing to it is an unfalsifiable statement.
|
| You can run Haiku today, so it's hardly unfalsifiable, nor
| an effect of nostalgia or whatever way you want to phrase
| it.
|
| > BeOS? Impressive technology demo. Missing loads of
| features expected in a modern OS, even at that time. No
| multi-user model even!
|
| "Even at that time" is just false. multi-user safe OSes
| abound in the 90s?
| dihrbtk wrote:
| Windows NT...?
| ris wrote:
| > You can run Haiku today, so it's hardly unfalsifiable
|
| Excellent, so let's falsify it: how come 20 years later
| the best thing people really have to say about BeOS/Haiku
| is that it has smooth window dragging?
|
| > multi-user safe OSes abound in the 90s?
|
| Windows NT.
|
| Linux and at least two other free unixes.
|
| Countless proprietary unixes.
|
| VMS.
|
| The widespread desktop OSs in the 90s were not considered
| serious OSs even then, more accidents of backward-
| compatibility needs.
| tialaramex wrote:
| The thing about Haiku is that their plan (initially as
| "OpenBeOS") was: since they're just re-doing BeOS, and some
| of BeOS is Open Source, they'll just get some basics
| working, then they're right back on the horse, and in a few
| years they'll be far ahead of where BeOS was.
|
| _Over two decades later_ they don't have a 1.0 release.
| cmrdporcupine wrote:
| Exactly this. I'm as big a fan of "alternative tech
| timelines" as the next nerd, but I also can see in
| retrospect why we have the set of compromises we have
| today, and all along I watched the intense efforts people
| made to navigate the mindbogglingly complicated minefield
| of competing approaches and political players and technical
| innovations that were on the scene.
|
| People have been working damned hard to build things like
| Gtk, Qt, etc. not to mention Wayland, etc. all the while
| maintaining compatibility etc and I personally am happy for
| their efforts.
|
| BeOS/HaikuOS is a product of a mid-90s engineering scene
| that predates the proliferation of GPUs, the web, and the
| set of programming languages that we work with today.
| There's nothing wrong with it in that context, but it's
| also not "better." Just different compromises.
|
| The other one I see nostalgia nerds reach for is the Amiga.
| A system _highly_ coupled to a set of custom chips that
| only made sense when RAM was as fast as (or faster than) the
| CPU, whose OS had no memory protection, and which was
| easily outstripped in technical abilities by the early 90s
| by PCs with commodity ISA cards, etc., because of the
| development of economies of scale in computer
| manufacturing, etc. It was ahead of its time for about 2-3
| years in the mid 80s, but in a way that was a dead end.
|
| Anyways, what we have right now is a messy set of
| compromises. It doesn't hurt to go looking for
| simplifications, but it _does_ hurt to pretend that the
| compromises don't exist for a reason.
|
| EDIT: I would add though that "multiuser" as part of a
| desktop (or especially mobile) OS has maybe proven to be a
| pointless thing. The vast majority of Linux machines out
| there in the world are run in a single user fashion, even
| if they are capable of multiuser. Android phones,
| Chromebooks, desktop machines, and even many servers --
| mostly run with just one master user. And we've also seen
| how rather not-good the Unix account permissions model is
| in terms of security, and how weak its timesharing of
| resources etc is in context of today's needs -- hence the
| development of cgroups, containers, and virtual machine /
| hypervisor etc.
| BenjiWiebe wrote:
| They run with one master user perhaps, but they have
| multiple users at one time anyways.
| cmrdporcupine wrote:
| I mean, in those cases they're almost always just using
| multiple users as a proxy for job authorization levels,
| not people.
|
| Anybody who is serious about securing a single physical
| machine for multiuser access isn't doing it through
| multiple OS accounts, and is slicing it up by VMs,
| instead.
|
| I _do_ have a home (Windows) computer that gets used by
| multiple family members through accounts, but I think
| this isn 't a common real world use case.
| pjmlp wrote:
| Unfortunately being technologically superior isn't enough
| to win.
|
| Money, marketing, politics, wanting to be part of the
| crowd, usually play a bigger role.
| timetraveller26 wrote:
| I was greatly impressed with the BeOS filesystem's
| SQL-esque indexing and querying.
| fulafel wrote:
| You can bypass the compositor with X11 hw acceleration features
| too. But what about Wayland? I thought apps always go through
| the compositor there. As shown eg in the diagram at
| https://www.apertis.org/architecture/wayland_compositors/
|
| Drawing directly from app to kernel graphics buffers (or hw
| backed surfaces) and participating in composition are
| orthogonal I think. The compositor may be compositing the
| kernel or hw backed surfaces.
| arghwhat wrote:
| Bypassing _composition_ is a key feature in Wayland. The
| protocols are all written with zero-copy in mind. The X11
| tricks are already there, forwarding one client's buffers
| straight to hardware, while the end-game with libliftoff is
| offloading multiple subsurfaces directly to individual planes
| at once.
|
| The _compositor_ is the entire display server in Wayland, and
| is also the component responsible for bypassing composition
| when possible.
| fulafel wrote:
| In my meager understanding which I'm happy to be corrected
| about: In a windowed scenario (vs fullscreen), in both the
| X direct-rendering and Wayland scenarios the application
| provides a (possibly gpu backed) surface that the
| compositor uses as a texture when forming the full screen
| video output.
|
| In a full-screen case AFAIK it's possible to skip the
| compositing step with X11, and maybe with Wayland too.
|
| "Zero copy" seems a bit ambiguous term in graphics because
| there's the kind of copying where whole screen or window
| sized buffers are being copied about, and then there are
| compositing operations where various kinds of surfaces are
| inputs to the final rendering, where also pixels are copied
| around possibly several times in shaders but there aren't
| necessary extra texture sized intermediate buffers
| involved.
| arghwhat wrote:
| > In a full-screen case AFAIK it's possible to skip the
| compositing step with X11, and maybe with Wayland too.
|
| This is the trivial optimization all Wayland compositors
| do.
|
| The neater trick is to do this for non-fullscreen content
| - and even just parts of windows - using overlay planes.
| Some compositors have their own logic for this, but
| libliftoff aims to generalize it.
|
| Zero-copy is not really ambiguous, but to clarify:
| Wayland protocols are designed to maximize the cases
| where a buffer rendered by a client can be presented
| directly by the display hardware as-is (scanned out),
| without any intermediate operations on the content.
|
| Note "maximize" - the content must be compatible with
| hardware capabilities. Wayland provides hints to stay
| within capabilities, but a client might pick a render
| buffer modifier/format that cannot be scanned out by the
| display hardware. GPUs have a _lot_ of random
| limitations.
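|
| For the curious, the libliftoff flow is roughly this
| (paraphrased from its README; exact signatures differ
| between versions, and the CRTC_*/SRC_* position properties
| are omitted):
|
|     #include <stdint.h>
|     #include <libliftoff.h>
|     #include <xf86drmMode.h>
|
|     /* Sketch: try to put a client framebuffer (fb_id, a DRM
|      * FB wrapping the client's dmabuf) on a hardware plane. */
|     static void
|     scanout_layer (int drm_fd, uint32_t crtc_id, uint32_t fb_id)
|     {
|         struct liftoff_device *dev =
|             liftoff_device_create (drm_fd);
|         struct liftoff_output *out =
|             liftoff_output_create (dev, crtc_id);
|         struct liftoff_layer *layer =
|             liftoff_layer_create (out);
|
|         liftoff_layer_set_property (layer, "FB_ID", fb_id);
|
|         drmModeAtomicReq *req = drmModeAtomicAlloc ();
|         if (liftoff_output_apply (out, req, 0))
|             drmModeAtomicCommit (drm_fd, req,
|                                  DRM_MODE_ATOMIC_NONBLOCK,
|                                  NULL);
|         drmModeAtomicFree (req);
|     }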
| AshamedCaptain wrote:
| To this day, moving a Haiku window under a 5k unaccelerated EFI
| GOP framebuffer still feels _significantly_ faster than doing
| the same under Windows 11, and everything KWin's X/Wayland has
| to offer on the same hardware (AMD Navi 2 GPU).
|
| > BeOS and Haiku allowed exposure to kernel graphics buffers
| decades ago
|
| In any case, X also allowed this for ages, with XV and XShm and
| the like. Of course then everyone got rid of this in order to
| have the GPU in the middle for fancier animations and whatnot,
| and things went downhill since.
| baybal2 wrote:
| Android has horrific UI latency despite heavily employing
| hardware acceleration.
|
| Enlightenment's EFL was exclusively software-rendered for a
| long time, but is buttery smooth despite largely relying on
| full redraws most of the time.
|
| Hardware acceleration does not compensate for the lack of
| hard computer science knowledge about efficient memory
| operations, caching, and such.
| play_ac wrote:
| >In any case, X also allowed this for ages, with XV and XShm
| and the like.
|
| No, XShm doesn't do that and the way XV does it is completely
| dependent on drivers. If you're using Glamor then XV won't
| use overlays at all. XShm uses a buffer created in CPU memory
| allocated by the client that the X server then has to copy to
| the screen.
|
| > Of course then everyone got rid of this in order to have
| the GPU in the middle for fancier animations
|
| No, for video, the GPU is used in the middle so you can do
| post-processing without copying everything into main memory
| and stalling the whole pipeline. I'd like to see an actual
| benchmark for how a fullscreen 5k video with post-processing
| plays on Haiku without any hardware acceleration.
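|
| To make that copy concrete, the canonical XShm setup -- the
| client draws into a SysV shared segment, and XShmPutImage
| still performs a server-side copy to the window:
|
|     #include <X11/Xlib.h>
|     #include <X11/extensions/XShm.h>
|     #include <sys/ipc.h>
|     #include <sys/shm.h>
|
|     /* One copy saved vs. plain XPutImage, but still a copy,
|      * and still through CPU memory. */
|     static XImage *
|     create_shm_image (Display *dpy, XShmSegmentInfo *info,
|                       int width, int height)
|     {
|         int scr = DefaultScreen (dpy);
|         XImage *img = XShmCreateImage (dpy,
|             DefaultVisual (dpy, scr),
|             DefaultDepth (dpy, scr),
|             ZPixmap, NULL, info, width, height);
|         info->shmid = shmget (IPC_PRIVATE,
|             img->bytes_per_line * img->height,
|             IPC_CREAT | 0600);
|         info->shmaddr = img->data =
|             shmat (info->shmid, NULL, 0);
|         info->readOnly = False;
|         XShmAttach (dpy, info);
|         /* Later: XShmPutImage(dpy, win, gc, img, ...) makes
|          * the server copy the segment into the window. */
|         return img;
|     }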
| AshamedCaptain wrote:
| > XShm uses a buffer created in CPU memory allocated by the
| client that the X server then has to copy to the screen.
|
| Fair enough. Even with XShmCreatePixmap, you are still
| never simply mmaping the card's actual entire framebuffer,
| unlike what BDirectWindow allows (if https://www.haiku-
| os.org/legacy-docs/benewsletter/Issue3-12.... is to be
| believed, which is closer to something like DGA). In XShm,
| the server still has to copy your shared memory segment to
| the actual framebuffer.
|
| (sorry for previous answer here, I misunderstood your
| comment)
|
| > No, for video, the GPU is used in the middle so you can
| do post-processing without copying everything into main
| memory and stalling the whole pipeline.
|
| Depends on what you mean by "post-processing". You can do
| many types of card-accelerated zero-copy post-processing
| using XV: colorspace conversion, scaling, etc. At the time,
| scaling the video in software or even just doing an extra
| memory copy per frame would have tanked the frame rate --
| Xine can be used to watch DVDs on Pentium II-level hardware.
| Obviously you cannot put the video in the faces of rotating
| 3D cubes, but this is precisely what I call "fancy
| animations".
| play_ac wrote:
| >colorspace conversion, scaling
|
| There's a lot more than that. Please consider installing
| the latest version of VLC or something like that and
| checking all the available post-processing effects and
| filters. These aren't "fancy animations" and they're not
| rotating 3D cubes, they're basic features that a video
| player is required to support now. If you want to support
| arbitrary filters then you need to use the GPU. All these
| players stopped using XV ages ago; on X11 you'll get the
| GPU rendering pipeline too because of this.
|
| I don't really see the point of making these
| condescending remarks, like trying to suggest that
| everyone is stupid and only interested in making
| wobbly windows and spinning cubes. Those have never been
| an actual feature of anything besides Compiz, which is a
| dead project.
| AshamedCaptain wrote:
| I don't see what you mean by "condescending remarks", but
| I do think it is stretching it to claim "arbitrary
| filters" is a "basic feature that a video player is
| required to support now". As a consumer, I have
| absolutely _never_ used any such video filters, doubt
| most consumers are even aware of them, have seen few
| video players which support them, and most definitely I
| have no idea what they are in VLC. Do they even enable
| any video filters by default? The only video filter I
| have sometimes used is deinterlacing which doesn't really
| fit well in the GPU rendering pipeline anyway but fits
| very nicely in fixed hardware. So yes, I hardly see the
| need to stop using native accelerated video output and
| fall back to the GPU just in case someone wants to use such
| filters. This is how I end up with a card which consumes
| 20W just for showing a static desktop on two monitors.
|
| Anyway, discussing this is beside the point, and
| forgive me for the rant above.
|
| If you really need GPU video filters then the GPU is
| obviously going to be the best way to implement them,
| there's no discussion possible about that. But the entire
| point of TFA is to (dynamically) go back to a model where
| the GPU is _not_ in the middle. And that model -- sans
| GPU -- happens to match what Xv was doing and is actually
| faster and less power-consuming than always blindly
| using the GPU, which is where we are now, post-Xv.
| DonHopkins wrote:
| That's also how SunView worked on SunOS in 1982, and Sun's
| later GX graphics accelerated framebuffer driver worked in the
| 90's. The kernel managed the clipping list, and multiple
| processes shared the same memory, locking and respecting the
| clipping list and pixel buffers in shared memory (main memory,
| not GPU memory!), so multiple processes could draw on different
| parts of the screen efficiently, without incurring system calls
| and context switches.
|
| https://en.wikipedia.org/wiki/SunView
|
| Programmers Reference Manual for the Sun Window System, rev C
| of 1 November 1983: Page 23, Locking and Clipping:
|
| http://bitsavers.trailing-edge.com/pdf/sun/sunos/1.0/800-109...
|
| But GPUs change the picture entirely. From what I understand by
| reading the article, GTK uses GL to render in the GPU then
| copies the pixels into main memory for the compositor to mix
| with other windows. But in modern GPU-first systems, the
| compositor is running in the GPU, so there would be no reason
| to ping-pong the pixels back and forth between CPU and GPU
| memory after drawing 3D or even 2D graphics with the GPU, even
| when having different processes draw and render the same
| pixels.
|
| So I'm afraid Wayland still has a lot of catching up to do, if
| it still uses a software compositor, and has to copy pixels
| back from the GPU that it drew with OpenGL. (Which is what I
| interpret the article as saying.)
|
| More recently (on an archeological time scale, but for many
| years by now), MacOS, Windows, iOS, and Android have all
| developed ways of sharing graphics between multiple processes
| not only in shared main CPU memory, but also on the GPU, which
| greatly accelerates rendering, and is commonly used by web
| browsers, real time video playing and processing tools, desktop
| window managers, and user interface toolkits.
|
| There are various APIs to pass handles to "External" or "IO
| Surface" shared GPU texture memory around between multiple
| processes. I've written about those APIs on Hacker News
| frequently over the years:
|
| https://news.ycombinator.com/item?id=13534298
|
| DonHopkins on Jan 31, 2017 | parent | context | favorite | on:
| Open-sourcing Chrome on iOS
|
| It's my understanding that only embedded WKWebViews are allowed
| to enable the JIT compiler, but not UIWebViews (or in-process
| JavaScriptCore engines). WKWebView is an out-of-process web
| browser that uses IOSurface [1] to project the image into your
| embedding application and IPC to send messages.
|
| So WKWebView's dynamically generated code is running safely
| firewalled in a separate address space controlled by Apple and
| not accessible to your app, while older UIWebViews run in the
| address space of your application, and aren't allowed to write
| to code pages, so their JIT compiler is disabled.
|
| Since it's running in another process, WkWebView's
| JavaScriptEngine lacks the ability to expose your own Objective
| C classes to JavaScript so they can be called directly [2], but
| it does include a less efficient way of adding script message
| handlers that call back to Objective C code via IPC [3].
|
| [1] https://developer.apple.com/reference/iosurface
|
| [2]
| https://developer.apple.com/reference/javascriptcore/jsexpor...
|
| [3]
| https://developer.apple.com/reference/webkit/wkusercontentco...
|
| https://news.ycombinator.com/item?id=18763463
|
| DonHopkins on Dec 26, 2018 | parent | context | favorite | on:
| WKWebView, an Electron alternative on macOS/iOS
|
| Yes, it's a mixed bag with some things better and others worse.
| But having a great JavaScript engine with the JIT enabled is
| pretty important for many applications. But breaking up the
| browser into different processes and communicating via messages
| and sharing textures in GPU memory between processes
| (IOSurface, GL_TEXTURE_EXTERNAL_OES, etc) is the inextricable
| direction of progress, what all the browsers are doing now, and
| why for example Firefox had to make so many old single-process
| XP-COM xulrunner plug-ins obsolete.
|
| IOSurface:
|
| https://developer.apple.com/documentation/iosurface?language...
|
| https://shapeof.com/archives/2017/12/moving_to_metal_episode...
|
| GL_TEXTURE_EXTERNAL_OES:
|
| https://developer.android.com/reference/android/graphics/Sur...
|
| http://www.felixjones.co.uk/neo%20website/Android_View/
|
| pcwalton on Dec 27, 2018 | prev [-]
|
| Chrome and Firefox with WebRender are going the opposite
| direction and just putting all their rendering in the chrome
| process/"GPU process" to begin with.
|
| DonHopkins on Dec 27, 2018 | parent [-]
|
| Yes I know, that's exactly what I meant by "breaking up the
| browser into different processes". They used to all be in the
| same process. Now they're in different processes, and
| communicate via messages and shared GPU memory using platform
| specific APIs like IOSurface. So it's no longer possible to
| write an XP/COM plugin for the browser in C++, and call it from
| the renderer, because it's running in a different process, so
| you have to send messages and use shared memory instead. But
| then if the renderer crashes, the entire browser doesn't crash.
|
| https://news.ycombinator.com/item?id=20313751
|
| DonHopkins on June 29, 2019 | parent | context | favorite | on:
| Red Hat Expecting X.org to "Go into Hard Maintenan...
|
| Actually, Electron (and most other web browsers) on the Mac
| OS/X and iOS use IOSurface to share zero-copy textures in GPU
| memory between the render and browser processes. Android and
| Windows (I presume, but don't know the name of the API, probably
| part of DirectX) have similar techniques. It's like shared
| memory, but for texture memory in the GPU between separate
| heavy weight processes. Since simply sharing main memory
| between processes wouldn't be nearly as efficient, requiring
| frequent uploading and downloading textures to and from the
| GPU.
|
| Mac OS/X and iOS IOSurface:
|
| https://developer.apple.com/documentation/iosurface?language...
|
| http://neugierig.org/software/chromium/notes/2010/08/mac-acc...
|
| https://github.com/SimHacker/UnityJS/blob/master/notes/IOSur...
|
| Android SurfaceTexture and GL_TEXTURE_EXTERNAL_OES:
|
| https://developer.android.com/reference/android/graphics/Sur...
|
| https://www.khronos.org/registry/OpenGL/extensions/OES/OES_E...
|
| https://docs.google.com/document/d/1J0fkaGS9Gseczw3wJNXvo_r-...
|
| https://github.com/SimHacker/UnityJS/blob/master/notes/ZeroC...
|
| https://github.com/SimHacker/UnityJS/blob/master/notes/Surfa...
|
| https://news.ycombinator.com/item?id=25997356
|
| DonHopkins on Feb 2, 2021 | parent | context | favorite | on:
| VideoLAN is 20 years old today
|
| >Probably do a multi-process media player, like Chrome is
| doing, with parsers and demuxers in a different process, and
| different ones for decoders and renderers. Knowing that you
| probably need to IPC several Gb/s between them. Chrome and
| other browsers and apps, and drivers like virtual webcams, and
| libraries like Syphon, can all pass "zero-copy" image buffers
| around between different processes by sharing buffers in GPU
| memory (or main memory too of course) and sending IPC messages
| pointing to the shared buffers.
|
| That's how the browser's web renderer processes efficiently
| share the rendered images with the web browser user interface
| process, for example. And how virtual webcam drivers can work
| so efficiently, too.
|
| Check out iOS/macOS's "IOSurface":
|
| https://developer.apple.com/documentation/iosurface
|
| >IOSurface: Share hardware-accelerated buffer data
| (framebuffers and textures) across multiple processes. Manage
| image memory more efficiently.
|
| >Overview: The IOSurface framework provides a framebuffer
| object suitable for sharing across process boundaries. It is
| commonly used to allow applications to move complex image
| decompression and draw logic into a separate process to enhance
| security.
|
| And Android's "SurfaceTexture" and GL_TEXTURE_EXTERNAL_OES:
|
| https://developer.android.com/reference/android/graphics/Sur...
|
| >The image stream may come from either camera preview or video
| decode. A Surface created from a SurfaceTexture can be used as
| an output destination for the android.hardware.camera2,
| MediaCodec, MediaPlayer, and Allocation APIs. When
| updateTexImage() is called, the contents of the texture object
| specified when the SurfaceTexture was created are updated to
| contain the most recent image from the image stream. This may
| cause some frames of the stream to be skipped.
|
| https://source.android.com/devices/graphics/arch-st
|
| >The main benefit of external textures is their ability to
| render directly from BufferQueue data. SurfaceTexture instances
| set the consumer usage flags to GRALLOC_USAGE_HW_TEXTURE when
| it creates BufferQueue instances for external textures to
| ensure that the data in the buffer is recognizable by GLES.
|
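| A hedged GLES sketch of what consuming such an external
| texture looks like (the directive and sampler type are from
| the OES_EGL_image_external extension spec):
|
|   /* Fragment shader: external textures use a dedicated
|      sampler type, not sampler2D. */
|   static const char *frag_src =
|     "#extension GL_OES_EGL_image_external : require\n"
|     "precision mediump float;\n"
|     "varying vec2 v_uv;\n"
|     "uniform samplerExternalOES u_tex;\n"
|     "void main() { gl_FragColor = texture2D(u_tex, v_uv); }\n";
|
|   /* And the texture binds to the external target, not
|      GL_TEXTURE_2D: */
|   glBindTexture(GL_TEXTURE_EXTERNAL_OES, tex_id);
|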
| And Syphon, which has a rich ecosystem of apps and tools and
| libraries:
|
| http://syphon.v002.info
|
| >Syphon is an open source Mac OS X technology that allows
| applications to share frames - full frame rate video or stills
| - with one another in realtime. Now you can leverage the
| expressive power of a plethora of tools to mix, mash, edit,
| sample, texture-map, synthesize, and present your imagery using
| the best tool for each part of the job. Syphon gives you
| flexibility to break out of single-app solutions and mix
| creative applications to suit your needs.
|
| Of course there's a VLC Syphon server:
|
| https://github.com/rsodre/VLCSyphon
| mananaysiempre wrote:
| > From what I understand by reading the article, GTK uses GL
| to render in the GPU then copies the pixels into main memory
| for the compositor to mix with other windows.
|
| This seems very strange to me. It's how things would work
| with wl_shm, which is the baseline pixel-pushing interface in
| Wayland, but AFAIU Gtk uses EGL / Mesa, which in turn uses
| Linux dmabufs, which is how you do hardware-accelerated
| rendering / DRI on Linux today in general.
|
| However, _how_ precisely Linux dmabufs work in a DRI context
| is not clear to me, because the documentation is lacking, to
| say the least. It seems that you can ask to map dmabufs into
| memory, and you can create EGLSurfaces from them, but are
| they always mapped into CPU memory (if only kernel-side), or
| can they be bare GPU memory handles until the user asks to
| map them?
|
| I'd hope for the latter, and if so, the only thing the work
| discussed in the article avoids is extra _GPU_-side blits
| (video decoding buffer to window buffer to screen), which is
| non-negligible but not necessarily the end of the world.
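|
| (For what it's worth, a dmabuf is just a file descriptor;
| whether CPU mapping works is up to the exporting driver. A
| sketch of the documented CPU-access path from
| linux/dma-buf.h -- mmap() simply fails for buffers the
| exporter keeps in unmappable GPU memory:)
|
|   #include <sys/mman.h>
|   #include <sys/ioctl.h>
|   #include <linux/dma-buf.h>
|
|   /* CPU reads must be bracketed by sync ioctls so the
|      driver can flush/invalidate caches. */
|   void *map_dmabuf_for_read(int fd, size_t size) {
|     void *p = mmap(NULL, size, PROT_READ, MAP_SHARED, fd, 0);
|     if (p == MAP_FAILED)
|       return NULL; /* bare GPU handle: not CPU-mappable */
|     struct dma_buf_sync s = {
|       .flags = DMA_BUF_SYNC_START | DMA_BUF_SYNC_READ };
|     ioctl(fd, DMA_BUF_IOCTL_SYNC, &s);
|     return p;
|   }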
| rjsw wrote:
| Linux on ARM SoCs with HW video decoders that are separate
| to the GPU can use the V4L2 API to avoid some copying. The
| decoder writes a frame to a buffer that the GPU can see
| then you use GL to get the GPU to merge it into the
| framebuffer.
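|
| A sketch of that handoff with the documented V4L2 ioctl
| (VIDIOC_EXPBUF turns a driver-owned capture buffer into a
| dmabuf fd that the GL side can import; the buffer index is
| illustrative):
|
|   #include <string.h>
|   #include <sys/ioctl.h>
|   #include <linux/videodev2.h>
|
|   /* Export decoded-frame buffer `index` as a dmabuf fd. */
|   int export_v4l2_buffer(int video_fd, unsigned index) {
|     struct v4l2_exportbuffer exp;
|     memset(&exp, 0, sizeof exp);
|     exp.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
|     exp.index = index;
|     if (ioctl(video_fd, VIDIOC_EXPBUF, &exp) < 0)
|       return -1;
|     return exp.fd; /* import as EGLImage, texture with GL */
|   }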
| DonHopkins wrote:
| I misunderstood the article saying "exports the resulting
| texture" as meaning it exported it to CPU memory, but
| audidude explained how it actually works.
|
| I believe GL_TEXTURE_EXTERNAL_OES is an Android-only OpenGL
| extension that takes the place of some uses of DMABUF but
| is not as flexible and general.
|
| ChatGPT seems to know more about them, but I can't
| guarantee how accurate and up-to-date it is:
|
| https://chat.openai.com/share/abff036b-3020-4093-a13b-86cbf0...
|
| The tricky bit may be teaching pytorch to accept dmabuf
| handles and read and write dmabuf GPU buffers. (And ffmpeg
| too!)
| audidude wrote:
| I can't respond to everything incorrect in this, because it's
| way too long to read. But from the very start...
|
| Also, I wrote a significant part of GTK's current OpenGL
| renderer.
|
| > But GPUs change the picture entirely. From what I
| understand by reading the article, GTK uses GL to render in
| the GPU then copies the pixels into main memory for the
| compositor to mix with other windows.
|
| This is absolutely and completely incorrect. Once we get
| things into GL, the texture is backed by a DMABUF on Linux.
| You never read it back into main memory. That would be very,
| very, very stupid.
|
| > But in modern GPU-first systems, the compositor is running
| in the GPU, so there would be no reason to ping-pong the
| pixels back and forth between CPU and GPU memory after
| drawing 3D or even 2D graphics with the GPU, even when having
| different processes draw and render the same pixels.
|
| Yes, the compositor is running in the GPU too. So of course
| we just tell the compositor what the GL texture id is, and it
| only composites if it cannot map that texture (again, because
| it's really a DMABUF) as a toplevel plane for hardware
| scanout, which works _without_ using 3D capabilities at all.
|
| That doesn't mean unaccelerated. It means it doesn't power
| up the 3D part of the GPU. It's the fastest way in/out with
| the least power. You can avoid "compositing" from a
| compositor too when things are done right.
|
| > So I'm afraid Wayland still has a lot of catching up to do,
| if it still uses a software compositor, and has to copy
| pixels back from the GPU that it drew with OpenGL. (Which is
| what I interpret the article as saying.)
|
| Again, completely wrong.
|
| > Check out iOS/macOS's "IOSurface":
|
| Fun fact, I wrote the macos backend for GTK too. And yes, it
| uses IOSurface just like DMABUF works on Linux.
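|
| (The export step presumably goes through Mesa's
| EGL_MESA_image_dma_buf_export extension; a single-plane
| sketch, error handling omitted:)
|
|   #include <stdint.h>
|   #include <EGL/egl.h>
|   #include <EGL/eglext.h>
|   #include <GLES2/gl2.h>
|
|   /* Wrap a GL texture in an EGLImage, then ask Mesa for
|      the dmabuf fd that backs it. */
|   int export_texture_dmabuf(EGLDisplay d, EGLContext c,
|                             GLuint tex) {
|     PFNEGLEXPORTDMABUFIMAGEMESAPROC export_image =
|       (PFNEGLEXPORTDMABUFIMAGEMESAPROC)
|         eglGetProcAddress("eglExportDMABUFImageMESA");
|     EGLImage img = eglCreateImage(d, c, EGL_GL_TEXTURE_2D,
|       (EGLClientBuffer)(uintptr_t)tex, NULL);
|     int fd = -1;
|     EGLint stride, offset;
|     export_image(d, img, &fd, &stride, &offset);
|     return fd; /* hand to the compositor, zero-copy */
|   }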
| kaba0 wrote:
| I'm not the parent poster, I'm just trying to grab this
| opportunity now that I've "met" someone so familiar with GTK :)
|
| Could you please share your opinion on the toolkit, and its
| relation to others? Also, I heard that there was quite a
| lot of tech debt in GTK3, and that part of the reason GTK4
| came as a bigger update was to fix it -- what would you
| say, was it successful? Or are there still some legacy
| decisions that harm the project somewhat?
| audidude wrote:
| > Also, I heard that there was quite a lot of tech debt
| > in GTK3, and that part of the reason GTK4 came as a
| > bigger update was to fix it
|
| GTK 3 itself was trying to lose the tech debt of 2.x
| (which in turn was losing that of 1.x). But they were all
| still wrapping X11, a fundamentally crap API for graphics
| in this century.
|
| GTK 4 changed that, and it now wraps a Wayland model of
| API. That drastically simplified GDK, which is why I
| could write a macOS backend in a couple of weeks.
|
| It also completely changed how we draw. We no longer do
| immediate-mode drawing (in the form of Cairo) and instead
| use a retained mode of draw commands. That allows for lots
| of new things you just couldn't do before with the old
| drawing model. It will also allow us to do a lot more fun
| things in the future (like threaded/tiled renderers).
|
| The APIs all over the place were simplified and focused.
| I can't imagine writing an application the size of GNOME
| Builder again with anything less than GTK 4.
|
| Hell, last GNOME cycle I rewrote Sysprof from scratch in
| a couple of months, and it's become my co-pilot every day.
| kaba0 wrote:
| Thanks for the comment and for your work!
| DonHopkins wrote:
| Thank you for the correction, it's a relief! That's nice
| work.
|
| I'm sorry, I misinterpreted the paragraph in the article
| saying "exports" as meaning that it exports the pixels from
| GPU memory to CPU memory, rather than just passing a
| reference the way GL_TEXTURE_EXTERNAL_OES and IOSurface do.
|
| >GTK has already been using dmabufs since 4.0: When
| composing a frame, GTK translates all the render nodes
| (typically several for each widget) into GL commands, sends
| those to the GPU, and mesa then exports the resulting
| texture as a dmabuf and attaches it to our Wayland surface.
|
| Perhaps I'd have been less confused if it said "passes a
| reference handle to the resulting texture in GPU memory"
| instead of "exports the resulting texture", because
| "exports" sounds expensive to me.
|
| Out of curiosity about the big picture, are dmabufs a Linux
| thing that's independent of OpenGL, or independent of the
| device driver, or built on top of GL_TEXTURE_EXTERNAL_OES,
| or is GL_TEXTURE_EXTERNAL_OES/SurfaceTexture just an
| Android or OpenGL ES thing that's an alternative to dmabufs
| in Linux? Do they work without any dependencies on X or
| Wayland or OpenGL, I hope? (Since pytorch doesn't use
| OpenGL.)
|
| https://source.android.com/docs/core/graphics/arch-st
|
| One practical non-GUI use case I have for passing
| references to GPU textures between processes on Linux is
| pytorch. I'd like to decompress video in one process or
| docker container on a cloud instance with an NVidia
| accelerator, then pass zero-copy references to the
| resulting frames into another process (or even two --
| each frame of video needs to be run through two different
| vision models) in another docker container running pytorch,
| sharing and multitasking the same GPU. The handles could be
| sent through a shared local file system or IPC (like how
| IOSurface uses Mach messages to magically send handles, or
| using unix domain sockets or ZeroMQ or something like
| that). But I don't know if that's supported at the Linux
| operating system level (ubuntu), or if I'd have to drop
| down to the NVidia driver level to do it.
|
| NVidia has some nice GPU video decompressor libraries, but
| they don't necessarily play well with pytorch in the same
| process, so I'd like to run them (or possibly ffmpeg) in a
| different process, but on the same GPU. Is it even
| possible, or am I barking up the wrong tree?
|
| It would be ideal if ffmpeg had a built-in "headless" way
| to perform accelerated video decompression and push out GPU
| texture handles to other processes somehow, instead of
| rendering itself or writing pixels to files or touching CPU
| memory.
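|
| (If dmabuf handles do work for this, the plumbing between
| containers would just be the classic SCM_RIGHTS dance,
| since a dmabuf is an ordinary file descriptor; a hedged
| sketch:)
|
|   #include <string.h>
|   #include <sys/socket.h>
|   #include <sys/uio.h>
|
|   /* Send one fd (e.g. a dmabuf) across a connected
|      AF_UNIX socket; the kernel dups it into the peer. */
|   int send_fd(int sock, int fd) {
|     char byte = 0;
|     struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
|     char buf[CMSG_SPACE(sizeof fd)];
|     memset(buf, 0, sizeof buf);
|     struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1,
|       .msg_control = buf, .msg_controllen = sizeof buf };
|     struct cmsghdr *c = CMSG_FIRSTHDR(&msg);
|     c->cmsg_level = SOL_SOCKET;
|     c->cmsg_type = SCM_RIGHTS;
|     c->cmsg_len = CMSG_LEN(sizeof fd);
|     memcpy(CMSG_DATA(c), &fd, sizeof fd);
|     return sendmsg(sock, &msg, 0) < 0 ? -1 : 0;
|   }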
| audidude wrote:
| > Out of curiosity about the big picture, are dmabufs a
| Linux thing that's independent of OpenGL, or independent
| of the device driver,
|
| They are independent of the graphics subsystem altogether
| (although that is where they got their start, afaik).
| Your webcam also uses DMABUF. So if you want to display
| your webcam from a GTK 4 application, this
| GtkGraphicsOffload will help you take that DMABUF from
| your camera (which may not be mappable into CPU memory,
| but can be passed via DMA to your GPU) and display it in a
| GTK
| application. It could either be composited on the GPU, or
| mapped directly to scanout if the right conditions are
| met.
|
| I wrote a library recently (libmks) and found the
| culprits in Qemu/VirGL/virtio_gpu that were preventing
| passing a DMABUF from inside a guest VM to the host. That
| stuff is all fixed now, so theoretically you could even
| have a webcam in a VM, rendered by a GTK 4 application
| via VirGL, with the compositor submitting the scene to
| the host OS, which can itself set the planes correctly to
| get the same performance as if it were running on the
| host OS.
|
| > I'd like to be able to decompress video in one process
| or docker container on a cloud instance with an NVidia
| accelerator, and then pass zero-copy references
|
| If you want this stuff with NVidia, and you're a
| customer, I highly suggest you tell your NVidia
| representative this. Getting them to use DMABUF in a
| fashion that can be used from other sub-systems would be
| fantastic.
|
| But at its core, if you were using Mesa and open drivers
| for some particular piece of hardware, yes, it's capable
| of working given the right conditions.
| play_ac wrote:
| No, that isn't allowing exposure to kernel graphics buffers.
| That's allowing clients to draw to the main framebuffer with no
| acceleration at all. If you're memory mapping pixels into user
| space and drawing with the CPU then you're necessarily leaving
| kernel space. Around the same time X11 had an extension called
| DGA that did the same thing. It was removed because it doesn't
| work correctly when you have hardware acceleration.
|
| So the optimization only makes sense for a machine like yours
| with no native drivers. With any kind of GPU acceleration it
| will actually make things much slower. GTK doesn't do this
| because it would only be useful for that kind of machine
| running around 25 years ago.
| DonHopkins wrote:
| Shared kernel graphics buffers in main memory or memory-
| mapped framebuffer device memory are one thing (common in
| the early 80's, e.g. 1982 SunView using /dev/fb), but does
| it expose modern shared GPU texture buffers to multiple
| processes? Those are a whole other ball game, and orders of
| magnitude more efficient, since they don't require ping-
| ponging pixels back and forth between the CPU and GPU when
| drawing, compositing, or crossing process boundaries.
| AshamedCaptain wrote:
| > With any kind of GPU acceleration it will actually make
| things much slower. GTK doesn't do this because it would only
| be useful for that kind of machine running around 25 years
| ago.
|
| Precisely one of the points of TFA is to be able to use the
| "25 year old" hardware overlay support whenever possible
| (instead of the GPU) in order to save power, like Android
| (and classic Xv) does.
| ris wrote:
| > After 25 years, GTK joins the party ...
|
| I mean... shall we start a list of ways in which BeOS/Haiku
| have yet to "join the party" that the Linux desktop has
| managed?
|
| Juggling the various needs of one hell of a lot more users,
| across a lot more platforms, with a lot more API-consuming
| apps to keep working, on systems designed with a lot more
| component independence, is a much harder problem to solve.
| CyberDildonics wrote:
| Are you mixing up number of users with technical
| sophistication?
| pengaru wrote:
| Wasn't this one of the main security issues with the BeOS
| architecture?
|
| It's not the same thing as what's being done here via Wayland.
| BeOS is more YOLO-style direct access to the framebuffer
| contents, without solving any of the hard problems (I don't
| think it was really possible to do properly with the
| hardware available at the time).
| thriftwy wrote:
| I can't imagine anybody passing video frames one by one
| through a system call as an array of pixels.
|
| I believe neither Xv nor GL-based renderers do that, even
| before we discuss hw accel.
| chlorion wrote:
| Emulators for older systems very often do this!
|
| Older consoles like the NES had a picture processing unit
| (PPU) that generated the image. You need to emulate its
| state and its interaction with the rest of the system,
| possibly cycle by cycle, which makes it impractical to do
| on the GPU as a shader or whatever.
|
| This is kind of a niche use case for sure, but it's
| interesting.
| thriftwy wrote:
| They usually use libSDL instead of GTK, though.
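|
| The per-frame push there is typically SDL2's streaming
| texture path (a sketch; the PPU emulation fills `pixels`
| on the CPU, SDL uploads it once per frame, and the GPU
| does the scaling):
|
|   #include <SDL.h>
|
|   /* tex created once with SDL_CreateTexture(ren,
|      SDL_PIXELFORMAT_ARGB8888,
|      SDL_TEXTUREACCESS_STREAMING, 256, 240); */
|   void present_frame(SDL_Renderer *ren, SDL_Texture *tex,
|                      const Uint32 *pixels) {
|     SDL_UpdateTexture(tex, NULL, pixels,
|                       256 * sizeof(Uint32));
|     SDL_RenderClear(ren);
|     SDL_RenderCopy(ren, tex, NULL, NULL);
|     SDL_RenderPresent(ren);
|   }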
| donatj wrote:
| Is this the same sort of thing Windows 95-XP had before Vista
| added DWM?
|
| Back when videos wouldn't show up in screenshots, and their
| position on screen could sometimes get out of sync with the
| window they were playing in?
|
| I never fully understood what was happening, but my theory
| at the time was that the video was being sent to the video
| card separately from the rendered windows.
| superkuh wrote:
| By going Wayland-only, Gtk is becoming less of a GUI toolkit
| and more of just an extremely specific and arbitrary lib for
| the GNOME desktop environment.
| mixedCase wrote:
| Gtk is not going Wayland-only.
| superkuh wrote:
| https://www.phoronix.com/news/GTK5-Might-Drop-X11 "Red Hat's
| Matthias Clasen opened an issue on GTK entitled "Consider
| dropping the X11 backend""
| https://gitlab.gnome.org/GNOME/gtk/-/issues/5004
| walteweiss wrote:
| Everyone is going Wayland, not just GNOME; X is obsolete.
| superkuh wrote:
| And the handful of feature-incompatible Wayland compositors
| are feature-incomplete. You still can't do keyboard/mouse
| sharing under any of them. That's just one of innumerable
| things you can't do.
| walteweiss wrote:
| I assume that will happen over time, won't it?
| freedomben wrote:
| > _By going Wayland-only_
|
| You mean this one very small piece? That seems a bit
| hyperbolic.
| amelius wrote:
| This sounds so overly complicated considering that you can do all
| this in HTML without much effort.
| freedomben wrote:
| I wonder, then: why don't they just implement GTK with HTML?
|
| Probably because HTML is at the very _top_ of the stack, while
| this is much lower... Without everything below it on the stack,
| HTML is just a text file.
| amelius wrote:
| You're reading into it too much.
|
| All I meant was: if the API of HTML is simple, then why does
| GTK's API have to be this complicated?
| kaba0 wrote:
| This is literally an implementation detail; the API part is
| a single flag you can set if you want your subsurface to
| potentially make use of this feature.
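|
| As a sketch of how that opt-in might look in application
| code (based on the GtkGraphicsOffload widget the article
| introduces; treat the exact names as approximate):
|
|   #include <gtk/gtk.h>
|
|   /* Wrap the video widget so GTK may hand its dmabuf to
|      the compositor as a subsurface instead of compositing
|      it itself. */
|   GtkWidget *make_offloaded_video(GdkPaintable *stream) {
|     GtkWidget *picture = gtk_picture_new_for_paintable(stream);
|     return gtk_graphics_offload_new(picture);
|   }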
| bogwog wrote:
| Semi-related question: Are there any real benefits to having the
| compositor deal with compositing your buffers as opposed to doing
| it yourself? Especially if you're already using hardware
| acceleration, passing your buffers to the system compositor seems
| like it could potentially introduce some latency.
|
| I guess it would allow the user/system to customize/tweak the
| compositing process somehow, but for what purpose?
| kaba0 wrote:
| Any kind of post-processing effect, like transparency or
| zoom; things like window previews and overview screens
| (these are sometimes possible without it as well); and
| tear-freedom.
| neurostimulant wrote:
| Rounded corners seem like a feature with an unexpectedly
| high performance penalty, but UI designers refuse to let
| them go.
| bee_rider wrote:
| Is it possible that they are just the well-known
| representative example? I vaguely suspect that is the case,
| but I can't think of the broader class they are an example
| of, haha.
|
| The play button they show seems like a good one, though. It
| is really nice to have it overlaid on the video.
| DonHopkins wrote:
| Fortunately the Play button disappears when the video starts
| playing, so it has no effect on the frame rate!
|
| Or instead of a triangular Play button, you could draw a big
| funny nose in some position and orientation, and the game
| would be to pause the video on a frame with somebody's face
| in it, with the nose in just the right spot.
|
| I don't know why the VLC project is ignoring my PRs.
| solarkraft wrote:
| It's something I as a user would also refuse to let go, given
| that the performance penalty is reasonably small (I think it
| is).
| torginus wrote:
| I think the point is that it's not: rather than just copying
| a rectangular area to the screen, you have to go through the
| intermediate step of rendering everything to a temporary
| buffer and compositing the result via a shader.
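|
| The per-pixel test itself is cheap -- a distance check like
| this C sketch -- the cost is the extra offscreen pass and
| blend, not the math:
|
|   #include <math.h>
|
|   /* Coverage for a w x h rect with corner radius r:
|      nonzero inside, 0 outside. A shader evaluates
|      something like this for every pixel. */
|   static int inside_rounded_rect(float x, float y,
|                                  float w, float h, float r) {
|     float cx = fmaxf(r - x, fmaxf(x - (w - r), 0.0f));
|     float cy = fmaxf(r - y, fmaxf(y - (h - r), 0.0f));
|     return cx * cx + cy * cy <= r * r;
|   }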
| chris_wot wrote:
| But... the example given shows that they place the video
| frame behind the window and make the front window
| transparent except for the round play button. This
| apparently offloads the frame... so why not just do the
| same for rounded corners?
|
| What am I missing?
| DonHopkins wrote:
| It's not like crazy, out-of-control, avant-garde, different-
| thinking UI designers haven't gone and totally ruined the
| user interface of a simple video player before!
|
| Interface Hall of Shame - QuickTime 4.0 Player (1999):
|
| http://hallofshame.gp.co.at/qtime.htm
| bsder wrote:
| Professional designers mostly cut their teeth on physical
| objects and physical objects almost _never_ have sharp corners.
|
| This then got driven into the ground with the "Fisher-Price
| GUI" that is the norm on mobile because you can't do anything
| with precision since you don't have a mouse.
|
| I would actually really like to see a UI with just rectangles.
| Really. It's okay, designers. Take a deep breath and say: "GUIs
| aren't bound by the physical". BeOS and MacOS used to be very
| rectangular. Give us a nice nostalgia wave of fad design with
| rectangles, please.
|
| Animations and drop shadows are another thing I'd like to see
| disappear.
| twoodfin wrote:
| Rounded corners for windows have been in the Macintosh
| operating system since the beginning.
|
| https://www.folklore.org/StoryView.py?story=Round_Rects_Are_...
| sylware wrote:
| The Steam Deck's Wayland compositor is built on
| DRM (dmabufs)/Vulkan.
|
| (Sad that it is written in C++.)
|
| But what surprised me even more: I was expecting GTK, and
| moreover GTK 4, to be on par with Valve's software.
|
| On Linux, I would not even think of coding a modern,
| hardware-accelerated system GFX component that is not
| DRM (dmabuf)/Vulkan.
| charcircuit wrote:
| This blog post is not talking about Mutter, GNOME's compositor.
| GTK's hardware acceleration had already been using dmabufs
| before adding this graphics offload feature.
| sylware wrote:
| But as the article states, with GL, not Vulkan.
|
| Unless the article is obsolete itself?
| charcircuit wrote:
| Sure, but OpenGL itself is still useful and used in modern
| software.
| sylware wrote:
| It is legacy and has started to be retired.
|
| Not to mention GL is a massive kludge and gigantic bloat
| compared to Vulkan (mostly due to the GLSL compiler). So
| it is good to let it go.
| tristan957 wrote:
| If you have spare time, the GTK maintainers want people
| to work on the Vulkan renderer. Benjamin Otte and Georges
| Stavracas Neto have put in a bit of effort to make the
| Vulkan renderer better.
|
| GL is only deprecated on Mac from what I understand.
| charcircuit wrote:
| I don't think Valve's window toolkit ever supported
| Vulkan. Steam no longer uses OpenGL because they replaced
| their window toolkit with Chrome.
|
| >It is legacy and has started to be retired.
|
| The standard itself, but the implementations are still
| being maintained and new extensions are being added. It
| is still a solid base to build upon.
| ori_b wrote:
| The thing that's always felt slow to me in GTK is resizing
| windows, not getting pixels to the screen. I'm wondering if
| adding all these composited surfaces adds a cost when
| resizing windows and their associated out-of-process
| surfaces.
| rollcat wrote:
| More likely it removes costs. This is very specifically an
| optimization.
___________________________________________________________________
(page generated 2023-11-18 23:00 UTC)