[HN Gopher] GPU Architecture Types Explained
___________________________________________________________________
GPU Architecture Types Explained
Author : asicsp
Score : 111 points
Date : 2021-07-20 13:50 UTC (9 hours ago)
(HTM) web link (rastergrid.com)
(TXT) w3m dump (rastergrid.com)
| Pieman103021 wrote:
| Archived link -
| https://web.archive.org/web/20210720135744/https://rastergri...
| nspattak wrote:
| Apparently the poor web site felt some of the infinite Hacker
| News exposure love :)
| dragontamer wrote:
| This seems like a misnomer. This seems more like rendering API
| architectures than GPU architecture.
|
| Which is still important: Immediate mode vs Tile-based is a big
| shift in overall style. And GPU-hardware is designed for
| particular software architectures (because the CPU will be
| inevitably invoking calls in a certain pattern).
|
| But it'd probably be more accurate to call this blogpost
| "Rendering Architecture Types Explained" moreso than "GPU
| Architecture". A modern GPU running DirectX 9.0 or OpenGL 2.0
| would still be immediate mode for example.
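|
| A minimal sketch of the difference in processing order, assuming
| axis-aligned rectangles as primitives and a made-up 16-pixel tile
| size (TILE, covered_pixels and both function names are
| illustrative, not taken from any real GPU or API):

```python
from collections import defaultdict

TILE = 16  # hypothetical tile size in pixels


def covered_pixels(prim):
    """Pixels covered by a rectangle (x0, y0, x1, y1), end-exclusive."""
    x0, y0, x1, y1 = prim
    return [(x, y) for y in range(y0, y1) for x in range(x0, x1)]


def immediate_order(prims):
    """Immediate mode: rasterize each primitive fully, in submission order."""
    return [(i, px) for i, prim in enumerate(prims) for px in covered_pixels(prim)]


def tiled_order(prims):
    """Tile-based: bin every primitive first, then finish one tile at a time."""
    bins = defaultdict(list)
    for i, prim in enumerate(prims):
        for (x, y) in covered_pixels(prim):
            bins[(x // TILE, y // TILE)].append((i, (x, y)))
    out = []
    for tile in sorted(bins):  # submission order is preserved inside each tile
        out.extend(bins[tile])
    return out
```

| Both functions produce the same fragments; only the order in which
| they are produced differs.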
| [deleted]
| oflordal wrote:
| No, this is about HW architectures. While they are likely
| evolving towards one another, there are tile-based (like
| Imagination and ARM Mali) and immediate-mode (Nvidia, AMD)
| architectures that both implement the same APIs (OpenGL, Vulkan
| etc). All these HW architectures are modern and in use.
| opencl wrote:
| Basically all modern GPU architectures implement tiled
| rasterization. NVIDIA has been doing it since Maxwell (2014)
| and AMD has been doing it since Vega (2017). Even Intel has
| been doing it for a few years now starting with their Gen 11
| (2019) GPUs.
| Arelius wrote:
| Those are going to require some serious citations. I'm
| quite sure most desktop GPUs don't run as tiled renderers
| at least under normal circumstances.
| brigade wrote:
| Section 5.2 of Intel's Gen11 architecture manual [1]
|
| (yes, PTBR is only enabled on passes the driver thinks
| will benefit from it)
|
| [1] https://software.intel.com/content/dam/develop/extern
| al/us/e...
| ryuuchin wrote:
| > Specifically, Maxwell and Pascal use tile-based
| immediate-mode rasterizers that buffer pixel output,
| instead of conventional full-screen immediate-mode
| rasterizers.
|
| https://www.realworldtech.com/tile-based-rasterization-
| nvidi...
|
| He describes it as "tile-based immediate mode" in the
| article and the video should go into more detail about
| it. It's been a while since I watched it.
| cma wrote:
| The parent article already discusses that article, saying
| those GPUs don't use TBR in areas where the primitive
| count is too high or something:
|
| > Another class of hybrid architecture is one that is
| often referred to as tile-based immediate-mode rendering.
| As dissected in this article[1], this hybrid architecture
| is used since NVIDIA's Maxwell GPUs. Does that mean that
| this architecture is like a TBR one, or that it shares
| all benefits of both worlds? Well, not really...
|
| What the article and the video fails to show is what
| happens when you increase the primitive count.
| Guillemot's test application doesn't support large
| primitive counts, but the effect is already visible if we
| crank up both the primitive and attribute count. After a
| certain threshold it can be noted that not all primitives
| are rasterized within a tile before the GPU starts
| rasterizing the next tile, thus we're clearly not talking
| about a traditional TBR architecture.
|
| [1] https://www.realworldtech.com/tile-based-
| rasterization-nvidi...
| monocasa wrote:
| Classic TBDRs typically require multiple passes on tiles
| with large primitive counts as well. Each tile's buffer
| containing binned geometry generally has a max size, with
| multiple passes required if that buffer size is exceeded.
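|
| A rough sketch of that overflow behaviour, assuming a per-tile bin
| with a fixed capacity counted in primitives (render_tile and
| bin_capacity are hypothetical names; real hardware sizes its bins
| in bytes and the details vary by vendor):

```python
def render_tile(prims, bin_capacity):
    """Return the passes needed to draw `prims` in a single tile.

    Whenever the bin fills up it is flushed as one pass and refilled,
    so a tile with many primitives needs several passes.
    """
    if bin_capacity < 1:
        raise ValueError("bin_capacity must be at least 1")
    passes, current = [], []
    for prim in prims:
        current.append(prim)
        if len(current) == bin_capacity:
            passes.append(current)  # bin is full: flush it as one pass
            current = []
    if current:
        passes.append(current)  # final, partially filled pass
    return passes
```

| With a capacity of 4, ten primitives in one tile take three passes,
| while three primitives take one.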
| Arelius wrote:
| Yeah, please see
| https://news.ycombinator.com/item?id=27898421
|
| Having watched the video, I'm fairly certain what is
| being observed is not really tiled.
|
| I'm not, however, sure what "tile-based immediate-mode
| rasterizers that buffer pixel output" means, but I think that's
| enough qualifications to make it somewhat meaningless.
| All modern GPUs dispatch thread groups that could look
| like "tiles" and have plenty of buffers, likely including
| buffers between fragment output and render target
| output/color blending, but that doesn't make it a
| tiled/deferred renderer.
| monocasa wrote:
| AMD has even talked publicly about how their rasterizer
| can run in a TBDR mode that they call DSBR.
|
| https://pcper.com/2017/01/amd-vega-gpu-architecture-
| preview-...
| monocasa wrote:
| Interestingly, Nvidia has been using tile based rasterizers
| for a bit too. https://www.techpowerup.com/231129/on-nvidias-
| tile-based-ren...
| Arelius wrote:
| It's been often quoted that Nvidia has switched to tile
| based for their Desktop renderers, but I haven't seen a
| source that confirms this. I suspect this is speculation
| due to changes in raster order that produce side-effects
| that look tiled even though they aren't.
| ribit wrote:
| This has been empirically tested on multiple occasions.
| There is an article on Real World Technologies discussing
| this, and the results have been replicated for newer AMD
| GPUs as well. I have a little tool for macOS that tests
| these things out, and the Navi GPU on my MacBook is
| definitely a tiler (the Gen10 Intel GPU is not).
| [deleted]
| [deleted]
| lmeyerov wrote:
| Agreed. For non-movies/games people -- think ML, neural
| networks, simulations, ETL -- this is far from how we think
| about them. Instead, focus is much more on thread divergence,
| NUMA memory models, consistency models, hw/sw schedulers,
| latency hiding, growing variety of DMA modes, funny ISA stacks,
| etc. The rendering pipeline is a tiny bit relevant for GPGPU
| people, e.g., if you're trying to do 1990s style shoehorning of
| it into antiquated WebGL 1/2 rendering primitives because
| Google/Apple won't let you do the real thing.
| ribit wrote:
| I think that the article focuses too much on the academic
| distinction between immediate renderers and tilers but falls
| short of discussing how these techniques relate to real-world GPUs. For
| example, the fact that all contemporary AMD and Nvidia gaming
| GPUs are tilers with large tiles (that's one of the key reasons
| why Maxwell and Navi got a big boost in performance). Or that
| many mainstream mobile GPUs employ various hacks (e.g. vertex
| shader splitting) in order to simplify the architecture, which
| ultimately block their ability to scale to more advanced
| applications. Also notably missing is any mention of TBDR, which
| currently powers the fastest low-power mobile and desktop GPUs
| on the market.
| phire wrote:
| Regarding Maxwell and Navi: Actually, that's not true.
|
| The micro-benchmark that suggested Maxwell (and later) was a
| tiled deferred GPU was actually measuring something else. Each
| GPC gets assigned different screenspace areas, and concurrency
| rules between the areas are relaxed (unless explicitly required
| by shader atomics).
|
| The result looks somewhat like tiled deferred rendering in that
| micro-benchmark. But it's still very much immediate mode.
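|
| A toy model of that effect, assuming four raster units and a simple
| interleaved mapping of 16-pixel areas to units (NUM_UNITS, AREA and
| the owner() pattern are invented for illustration; the real
| mapping is not described here):

```python
NUM_UNITS = 4  # hypothetical number of raster units (GPCs)
AREA = 16      # hypothetical side length of a screen-space area, in pixels


def owner(x, y):
    """Map a pixel to the unit that owns its screen-space area."""
    return ((x // AREA) + (y // AREA) * 2) % NUM_UNITS


def distribute(prims):
    """Send each primitive's pixels to the owning unit, in submission order.

    Nothing is binned or deferred: every primitive is processed as soon
    as it is submitted, which is what keeps this immediate mode.
    """
    queues = {u: [] for u in range(NUM_UNITS)}
    for i, (x0, y0, x1, y1) in enumerate(prims):
        for y in range(y0, y1):
            for x in range(x0, x1):
                queues[owner(x, y)].append((i, (x, y)))
    return queues
```

| Each unit only ever touches its own areas, so if the units drift
| out of step the output can look tiled even though every primitive
| is still processed in submission order.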
|
| A similar thing happened with Navi.
|
| However, there are mobile GPUs (Qualcomm's Adreno) that
| dynamically switch between tiled deferred mode and immediate
| mode on a per renderpass basis, depending on what driver
| heuristics suggest will be faster.
| Jasper_ wrote:
| When did Adreno gain a deferred mode? Back when I was talking
| to Rob Clark in 2014 or so, it sounded like it was all
| immediate per-tile.
| phire wrote:
| GPU terminology is confusing at times.
|
| Imgtec and Apple use the term "Tile-Based Deferred
| Rendering" to mean a combination of tiling and deferred
| shading. Because that's what their GPUs do.
|
| Other vendors, like Qualcomm [1], still use the term
| "Deferred" in regards to their Tile-Based Rendering, simply
| because the draw calls are deferred. It doesn't mean
| deferred shading.
|
| Every company appears to make up the terminology as they
| go. I found an early presentation from ARTX [2] and they
| are using database terminology to describe what we now call
| vertex buffers.
|
| [1] https://developer.qualcomm.com/docs/adreno-
| gpu/developer-gui...
|
| [2] http://www.graphics.stanford.edu/courses/cs448a-01-fall
| /lect...
| cma wrote:
| >For example, the fact that all contemporary AMD and Nvidia
| gaming GPUs are tilers with large tiles (that's one of the key
| reasons why Maxwell and Navi got a big boost in performance)
|
| There's a whole section on it near the end:
|
| "Another class of hybrid architecture is one that is often
| referred to as tile-based immediate-mode rendering. As
| dissected in this article, this hybrid architecture is used
| since NVIDIA's Maxwell GPUs."
|
| >Notably missing any mention of TBDR which currently powers the
| fastest low-power mobile and desktop GPUs on the market.
|
| Another section mentions:
|
| "There's a long-standing myth (that luckily slowly disappears)
| that deferred rendering techniques are not suitable for TBR
| GPUs. "
| qd6pwu4 wrote:
| 503 Service Unavailable
| squarefoot wrote:
| 2021: the year GPUs became unavailable, just like websites
| about them.
___________________________________________________________________
(page generated 2021-07-20 23:01 UTC)