[HN Gopher] Ubershaders: A Ridiculous Solution to an Impossible ...
       ___________________________________________________________________
        
       Ubershaders: A Ridiculous Solution to an Impossible Problem (2017)
        
       Author : Grognak
       Score  : 49 points
       Date   : 2024-05-16 15:52 UTC (1 day ago)
        
 (HTM) web link (dolphin-emu.org)
 (TXT) w3m dump (dolphin-emu.org)
        
       | dang wrote:
       | Discussed at the time:
       | 
       |  _Ubershaders: A Ridiculous Solution to an Impossible Problem_ -
       | https://news.ycombinator.com/item?id=14884992 - July 2017 (88
       | comments)
        
       | GaggiX wrote:
       | The shader compilation stutter reminds me of a video I recently
       | saw where a developer solved the problem by running a large
       | portion of his game during its first loading:
       | https://youtu.be/oG-H-IfXUqI
       | 
       | The developer recorded himself playing the game and during the
       | first loading of the game, the entire gameplay is replayed at
       | high speed in the background on the machine.
        
       | corysama wrote:
       | The pixel shading of the GameCube was slower than that of the OG
       | Xbox. But, it was quite a bit more flexible. Specifically, the
       | GameCube could load a couple textures, do a bit of math, then use
       | that math to load some more texels. The Xbox could only load
       | textures as the starting instructions before doing math and tried
       | to make up for that with a few "do very specific math and load
       | textures in a single instruction" ops.
       | 
       | But, still... Both GPUs were pretty well suited for this
       | ubershader approach because they had a small, fixed limit on the
       | number of instructions they could run. And, very strictly defined
       | functionality for each instruction. They weren't really "shaders"
       | as much as highly flexible fixed function stages that you could
       | reasonably wedge in a text shader compiler as a front end and
       | only get a moderate to high amount of complaints about how strict
       | and limited the rules were for the assembly. I recall that both
       | shading units could reasonably be fully specified as C structs
       | that you manually packed into the GPU registers instead of using
       | a shader compiler at all.
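
A rough illustration of the "C structs packed into GPU registers" idea
from the comment above, sketched in Python. The field names, bit widths,
and layout are hypothetical and chosen only for illustration; they are
not the real GameCube TEV or Xbox register-combiner formats.

```python
from dataclasses import dataclass

# Hypothetical layout (NOT a real GPU register format): four 4-bit input
# selectors, a 2-bit op, a 2-bit scale, and a 2-bit destination register.
_FIELDS = (("a", 4), ("b", 4), ("c", 4), ("d", 4),
           ("op", 2), ("scale", 2), ("dest", 2))


@dataclass(frozen=True)
class CombinerStage:
    a: int = 0
    b: int = 0
    c: int = 0
    d: int = 0
    op: int = 0
    scale: int = 0
    dest: int = 0

    def pack(self) -> int:
        """Pack the fields into one integer, first field in the lowest bits."""
        word, shift = 0, 0
        for name, width in _FIELDS:
            value = getattr(self, name)
            if not 0 <= value < (1 << width):
                raise ValueError(f"{name}={value} does not fit in {width} bits")
            word |= value << shift
            shift += width
        return word

    @classmethod
    def unpack(cls, word: int) -> "CombinerStage":
        """Inverse of pack(): slice each field back out of the word."""
        values, shift = {}, 0
        for name, width in _FIELDS:
            values[name] = (word >> shift) & ((1 << width) - 1)
            shift += width
        return cls(**values)


stage = CombinerStage(a=1, b=2, c=3, d=4, op=1, scale=2, dest=3)
word = stage.pack()
print(hex(word))
```

Because every field has a small fixed width and there is no control
flow, the whole "program" for a stage is just this one word; a handful
of such words would describe an entire pipeline configuration.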
        
         | bitwize wrote:
         | ISTR the GC pipeline being fixed-function while the Xbox had a
         | full-fat GPU (GeForce 3 variant) -- one of the reasons why the
         | Xbox absolutely smoked the other sixth-gen consoles in terms of
         | performance. Was I wrong?
        
           | phire wrote:
           | Extremely common misconception (even among developers on
           | those platforms)
           | 
           | In reality, the OG Xbox and GameCube GPUs are almost
           | identical in pixel shading capabilities (Though the
           | GameCube's vertex shading pipeline is legitimately fixed
           | function, but very flexible).
           | 
           | Despite their roughly equal capabilities, they were exposed
           | with very different APIs. The Xbox used the new-fangled
           | "Shader" style API that Microsoft was introducing to the
           | industry at the time, while TEV used a very extended version
           | of the older "Texture Environment" style API that was
           | introduced with DirectX 7 and OpenGL 1.3.
        
           | corysama wrote:
           | The Xbox had fairly capable vertex shaders. But, phire's
           | comment does a better job of explaining the pixel
           | capabilities of both machines than I did.
        
         | phire wrote:
         | _> The Xbox could only load textures as the starting
         | instructions before doing math and tried to make up for that
         | with a few  "do very specific math and load textures in a
         | single instruction" ops._
         | 
         | If you look closely, the TEV actually shares the same
         | limitation, it's just that the traditional representation
         | interleaves the texture fetch and math instructions (Because
         | the 3rd texture fetch "instruction" always feeds into the 3rd
         | math "instruction", for example). There are two independent
         | execution units, separated by a fifo and no way to backfeed
         | from the math back to texture fetch.
         | 
         | The two GPUs are roughly equivalent. The only reason the OG
         | Xbox is considered to "have pixel shaders" is that they were
         | exposed with a pixel shader API, while TEV was only ever
         | exposed with a "texture environment" based API. They are both
         | clearly register combiners, with no control flow, but they sit
         | right in the middle as GPUs were transitioning from register
         | combiners to "proper" pixel shaders. The team that designed
         | GameCube's GPU went on to develop the first DirectX 9 GPU.
         | 
         | I'm pretty sure the Xbox's pixel pipeline is slightly more
         | capable (and it also has programmable vertex shaders). But
         | developers all abandoned the Xbox in 2005. TEV has a much
         | better reputation for being flexible because TEV was used in
         | the Wii all the way to ~2013. And graphics developers who were
         | exposed to much better shaders on the Xbox, PS3 and PC got very
         | good at back porting those modern techniques to the more
         | limited Wii. More than one studio created unofficial shader
         | compilers for the Wii, so they could share the same shaders
         | across PS3/Xbox/Wii/PC.
         | 
         |  _> I recall that both shading units could reasonably be fully
         | specified as C structs that you manually packed into the GPU
         | registers instead of using a shader compiler at all._
         | 
         | Yeah, not that they ever exposed that API.
         | 
         | The GameCube had great support for recording display lists, so
         | you could record a display list while you called the API
         | commands to configure TEV and then call that display list later
         | to quickly load the "shader". Some games even saved those
         | display lists to disc (or maybe generated them from scratch
         | with external tools) as a form of offline shader compilation.
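
The record-then-replay pattern described in the comment above can be
sketched roughly as follows, in Python. Everything here is a stand-in:
RegisterFile, DisplayListRecorder, configure_shader, the register
addresses, and the values are all hypothetical, not the GameCube SDK
API or real TEV registers.

```python
class RegisterFile:
    """Stand-in for GPU state: a mapping from register address to value."""

    def __init__(self):
        self.regs = {}

    def write(self, addr, value):
        self.regs[addr] = value


class DisplayListRecorder:
    """Same write() interface as RegisterFile, but it only logs the writes."""

    def __init__(self):
        self.commands = []

    def write(self, addr, value):
        self.commands.append((addr, value))


def configure_shader(target):
    """Hypothetical 'API calls' that set up a two-stage combiner."""
    target.write(0xC0, 0x394321)  # stage 0 color combiner (made-up value)
    target.write(0xC1, 0x000042)  # stage 0 alpha combiner (made-up value)
    target.write(0xC2, 0x123456)  # stage 1 color combiner (made-up value)


def replay(commands, gpu):
    """Apply a previously recorded list of register writes."""
    for addr, value in commands:
        gpu.write(addr, value)


# Record once. The resulting tuple could equally be produced offline and
# stored on disc, which is the "offline shader compilation" case.
recorder = DisplayListRecorder()
configure_shader(recorder)
display_list = tuple(recorder.commands)

# Later: load the "shader" by replaying the list instead of re-running
# the configuration calls.
gpu = RegisterFile()
replay(display_list, gpu)
```

The point of the sketch is only that the recorded list and the direct
API calls leave the register state identical, so the list can be treated
as a precompiled form of the configuration.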
        
       | phire wrote:
       | Has it really been 9 years since I started working on
       | Ubershaders?
       | 
       | I'm a little surprised no better solution has come along. Vulkan
       | didn't even exist back then (and DirectX 12 had only just
       | released) but instead of making things better, it digs its feet
       | even deeper into the assumption that all shaders will be known
       | ahead of time (resulting in long "shader recompilation" dialogs
       | on startup on many games).
       | 
       | I've been tempted to build my own fast shader compiler into
       | Dolphin for many common GPU architectures. Hell, it wouldn't even
       | be a proper compiler, more of a templated emitter as all shaders
       | fit a pattern. Register allocation and scheduling could all be
       | pre-calculated.
       | 
       | But that would be even more insane than ubershaders, as it would
       | be one backend per gpu arch. And some drivers (like Nvidia) don't
       | provide a way to inject pre-compiled shader binaries.
       | 
       | On the positive side, ubershaders do solve the problem, and
       | modern GPU drivers do a much better job at accepting ubershaders
       | than they did 9 years ago. Though that's primarily because (as
       | far as I'm aware) examples of Dolphin's ubershader have made
       | their way into every single shader compiler test suite.
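
To make the contrast in the comment above concrete, here is a minimal
Python sketch of the two approaches: one general routine that interprets
the stage configuration at run time (the ubershader idea), and a
"templated emitter" that pastes each stage into a fixed pattern and
compiles the result. The stage formula is a simplified, hypothetical
combiner, the register names are made up, and Python's exec() merely
stands in for a driver's shader compile step; none of this is Dolphin's
actual code.

```python
# Hypothetical, simplified combiner. Each stage computes
#   result = d + ((1 - c) * a + c * b)   when op == "add"
#   result = d - ((1 - c) * a + c * b)   when op == "sub"
# with a, b, c, d selected by name from a small register set.

def run_ubershader(stages, regs):
    """One general routine: branches on the configuration at run time."""
    regs = dict(regs)
    for st in stages:
        a, b, c, d = (regs[st[k]] for k in ("a", "b", "c", "d"))
        blend = (1 - c) * a + c * b
        regs[st["dest"]] = d + blend if st["op"] == "add" else d - blend
    return regs["prev"]


def emit_specialized(stages):
    """Templated emitter: fill a fixed pattern per stage, compile once."""
    lines = ["def shader(regs):", "    regs = dict(regs)"]
    for st in stages:
        sign = "+" if st["op"] == "add" else "-"
        lines.append(
            f"    regs[{st['dest']!r}] = regs[{st['d']!r}] {sign} "
            f"((1 - regs[{st['c']!r}]) * regs[{st['a']!r}] "
            f"+ regs[{st['c']!r}] * regs[{st['b']!r}])"
        )
    lines.append("    return regs['prev']")
    namespace = {}
    exec("\n".join(lines), namespace)  # stand-in for the shader compile
    return namespace["shader"]


regs = {"tex": 0.5, "ras": 0.25, "konst": 0.5, "zero": 0.0, "prev": 0.0}
stages = [
    {"a": "tex", "b": "ras", "c": "konst", "d": "zero",
     "op": "add", "dest": "prev"},
    {"a": "prev", "b": "tex", "c": "zero", "d": "ras",
     "op": "sub", "dest": "prev"},
]

general = run_ubershader(stages, regs)
specialized = emit_specialized(stages)(regs)
print(general, specialized)
```

Both paths give the same result for a given configuration. The general
routine needs no per-configuration compile, while the emitted one has no
run-time branching on the configuration but must be generated (and, on a
real GPU, compiled per architecture) for every new configuration.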
        
       ___________________________________________________________________
       (page generated 2024-05-17 23:00 UTC)