[HN Gopher] Raytracing on AMD's RDNA 2/3, and Nvidia's Turing an...
___________________________________________________________________
Raytracing on AMD's RDNA 2/3, and Nvidia's Turing and Pascal
Author : treesciencebot
Score : 74 points
Date : 2023-03-22 19:00 UTC (4 hours ago)
(HTM) web link (chipsandcheese.com)
(TXT) w3m dump (chipsandcheese.com)
| matthewfcarlson wrote:
| I've often wondered why Nvidia cards are generally so much better
| at rendering scenes in Blender's Cycles renderer (a raytracing
| engine). The benchmarks on Blender's website are really telling
| (https://opendata.blender.org/benchmarks/query/?group_by=devi...):
| the only non-Nvidia entry on the first page is the AMD 2X EPYC
| 9654 96-Core.
|
| This really lays out the decisions that Nvidia made compared to
| AMD and how their approach tends to hide some of the shortcomings
| of GPUs (latency and utilization).
| zokier wrote:
| That is more of a software (ecosystem) thing. Nvidia's CUDA and
| OptiX are well beyond anything AMD has to offer. In Cycles'
| case, I believe that on Nvidia it is taking good advantage of
| RT cores while on AMD they are completely unused, which has a
| predictable effect on performance. Even ignoring the RT cores, I
| suspect the Nvidia code path is likely far more optimized than
| the AMD one.
|
| https://www.phoronix.com/news/AMD-HIP-RT-Blender-3.5-Plans
| dotnet00 wrote:
| Plus on AMD's side we have their inability to commit to fully
| supporting any specific system long term, limiting open
| source interest in doing things for them.
| Melatonic wrote:
| I thought AMD was all in on OpenCL?
| dotnet00 wrote:
| Nope, their OpenCL support has been kind of stagnant for
| a while, especially on Windows. On top of that, part of
| why Blender dropped its OpenCL supporting renderer was
| that AMD's OpenCL was still pretty buggy, making the
| renderer a pain to maintain.
|
| Lately their focus is ROCm and their CUDA-equivalent
| language, but it also has limited official hardware
| support and AFAIK the Windows SDK for it is still not
| public.
|
| Similar commitment issues have plagued their custom
| renderers.
| my123 wrote:
| AMD's OpenCL driver has been significantly _worse_ than NVIDIA's
| for ages, afaik...
| br1 wrote:
| Interesting that cards/drivers customize so much of ray tracing,
| like rasterization in the pre-Vulkan/Metal/D3D12 or even
| fixed-function GPU days.
| ladberg wrote:
| Would love to see a more in-depth article on BVH construction
| itself! I'm decently familiar with the main concepts but have no
| clue what the current SOTA looks like (is that even public
| info?).
|
| BVH construction is my favorite question to ask in interviews
| because there's no single best solution and it mostly relies on
| mathy heuristics to get a decent tree. You can also always devote
| more time to making a more optimal tree but there's a tradeoff
| where it'll eventually take more time than it saves in
| raytracing.
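|
| As a rough illustration of the kind of "mathy heuristic" meant
| here, below is a minimal Python sketch of a top-down build using
| the surface area heuristic (SAH). The comment above does not name
| a heuristic, so SAH is an assumption, as are all names (build,
| Node, c_trav, c_isect) and the cost constants. Primitives are
| represented only by their axis-aligned bounding boxes.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Vec = Tuple[float, float, float]
Box = Tuple[Vec, Vec]  # (min corner, max corner)


def union(a: Box, b: Box) -> Box:
    """Smallest box enclosing both a and b."""
    return (tuple(map(min, a[0], b[0])), tuple(map(max, a[1], b[1])))


def area(b: Box) -> float:
    """Surface area of an axis-aligned box."""
    dx, dy, dz = (hi - lo for lo, hi in zip(b[0], b[1]))
    return 2.0 * (dx * dy + dy * dz + dz * dx)


def bounds(boxes: List[Box]) -> Box:
    out = boxes[0]
    for b in boxes[1:]:
        out = union(out, b)
    return out


def centroid(b: Box, axis: int) -> float:
    return 0.5 * (b[0][axis] + b[1][axis])


@dataclass
class Node:
    box: Box
    prims: Optional[List[Box]] = None  # set only for leaves
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(prims: List[Box], c_trav: float = 1.0, c_isect: float = 1.0) -> Node:
    """Top-down SAH build with a full sweep over all splits on all axes.

    c_trav and c_isect are hypothetical relative costs of one traversal
    step and one primitive test; real builders tune these per target.
    """
    box = bounds(prims)
    n = len(prims)
    best_cost = c_isect * n  # cost of keeping this node as a leaf
    best = None
    parent_area = area(box)
    if n > 1 and parent_area > 0.0:
        for axis in range(3):
            order = sorted(prims, key=lambda p: centroid(p, axis))
            # right_area[i] = surface area of bounds(order[i:])
            right_area = [0.0] * n
            acc = order[-1]
            for i in range(n - 1, 0, -1):
                acc = union(acc, order[i])
                right_area[i] = area(acc)
            acc = order[0]
            for i in range(1, n):  # candidate split: order[:i] | order[i:]
                acc = union(acc, order[i - 1])  # acc = bounds(order[:i])
                cost = c_trav + c_isect * (
                    area(acc) * i + right_area[i] * (n - i)) / parent_area
                if cost < best_cost:
                    best_cost, best = cost, (order[:i], order[i:])
    if best is None:  # no split beats a leaf under this cost model
        return Node(box, prims=prims)
    return Node(box,
                left=build(best[0], c_trav, c_isect),
                right=build(best[1], c_trav, c_isect))
```

| The full sweep over every split on every axis is the slow,
| higher-quality end of the tradeoff; binned variants evaluate
| fewer candidate splits to build faster, usually at some cost in
| tree quality.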
| frogblast wrote:
| This is the best I've found that covers recent developments:
|
| https://meistdan.github.io/publications/bvh_star/paper.pdf
| shmerl wrote:
| Ray tracing on Linux for CP2077 with 7900 XTX is still barely
| usable, but it's getting better.
|
| I'd say RDNA 3 is not really giving useful ray tracing at, for
| example, 2560x1440 unless you use upscaling to speed it up. Maybe
| in a few GPU generations ray tracing will become usable at
| native resolutions.
| sylware wrote:
| I did not get into the real details yet, but Mesa RADV pulls in
| that horrible glslang because of some shaders related to
| acceleration structures.
|
| Personally, as a dev, I patch it to compile all that out (and
| all the tracers at the same time), since ray tracing currently
| has a ridiculous benefit-to-technical-cost ratio.
|
| This defeats the very purpose of Vulkan SPIR-V: getting rid of
| those horrible high-level shader compilers from the driver stack
| and keeping them contained at the application level.
|
| It seems beyond clumsy, but as I said, I need to get into the
| details of why those shaders exist in the first place, and then
| why they are not written directly in RDNA assembly or SPIR-V
| assembly (that would require an "assembler" coded in simple and
| plain C).
| TazeTSchnitzel wrote:
| Generating a ray tracing acceleration structure is very
| complex. Who'd want to implement that in assembly language?
| pixelesque wrote:
| I suspect the reason the author is seeing very shallow trees for
| Nvidia might be because the lower levels are done fully behind
| the scenes:
|
| https://forums.developer.nvidia.com/t/extracting-bvh-from-op...
|
| As someone who deals with BVHs a lot for ray intersection, I find
| it pretty difficult to believe that leaf nodes with that number
| of primitives will be anywhere near performant, even with fast
| dedicated hardware like the RT cores.
|
| It's true that Nvidia cards are faster at primitive intersection
| than at ray/box tests, but I don't believe the ratio is in the
| 100x range, which I suspect would be needed if the BVHs were
| that shallow and the leaf nodes that large.
| TinkersW wrote:
| Isn't a wide BVH how Embree works: 1 ray vs. SIMD-width boxes?
| Maybe Nvidia is simply doing the same thing but with the wider
| GPU SIMD (32, I believe).
| berkut wrote:
| Yes, but 4- or 8-wide is the norm: the wider you go, the more
| sorting you have to do to traverse things in order or find the
| nearest hit, which has an overhead (hardware may help with
| this, but it's still an overhead).
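|
| To make the sorting overhead concrete, here is a small Python
| sketch of one traversal step at a wide node: test the ray
| against every child box, then order the hits front to back.
| The function names and the scalar loop are assumptions for
| illustration; a real implementation would test all children in
| SIMD lanes and is not claimed to match Embree or any vendor's
| hardware.

```python
from typing import List, Optional, Sequence, Tuple

Vec = Tuple[float, float, float]
Box = Tuple[Vec, Vec]  # (min corner, max corner)


def slab_entry(origin: Vec, direction: Vec, box: Box,
               t_max: float) -> Optional[float]:
    """Entry distance of the ray into the box, or None on a miss."""
    t0, t1 = 0.0, t_max
    for o, d, lo, hi in zip(origin, direction, box[0], box[1]):
        if d == 0.0:
            # Ray is parallel to this slab: it must start inside it.
            if o < lo or o > hi:
                return None
            continue
        ta, tb = (lo - o) / d, (hi - o) / d
        if ta > tb:
            ta, tb = tb, ta
        t0, t1 = max(t0, ta), min(t1, tb)
        if t0 > t1:
            return None
    return t0


def ordered_children(origin: Vec, direction: Vec,
                     child_boxes: Sequence[Box],
                     t_max: float = float("inf")) -> List[Tuple[float, int]]:
    """Return (entry distance, child index) for each hit child, nearest first.

    The sort is the per-node overhead that grows with node width.
    """
    hits = []
    for idx, box in enumerate(child_boxes):
        t = slab_entry(origin, direction, box, t_max)
        if t is not None:
            hits.append((t, idx))
    hits.sort()
    return hits
```

| With 4 or 8 children the sort is tiny; with 32 it becomes a
| noticeable share of the work at every node visited.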
|
| Previous indications from Nvidia about their BVHs don't seem
| to show anything about very shallow trees for any of the BVH
| algorithms that OptiX supports (scroll to bottom for reverse
| visualisation of a BVH hierarchy on top of the Stanford Bunny
| model):
| https://drive.google.com/file/d/1B5fNRFwv2LsGlCBJ8oKYRiiDUtL...
| frogblast wrote:
| I strongly suspect the reason Nvidia trees are so shallow is
| that NSight simply isn't showing the actual tree structure,
| probably because Nvidia considers that proprietary. It appears
| to just list all the leaves of a tree in a big flat list. But
| there definitely is a tree in there.
| Arrath wrote:
| I'm very curious to see it unrolled down to its actual
| structure.
| kevingadd wrote:
| Perhaps the rest of it isn't a tree and is some other
| optimized data structure? Like some sort of spatial hash or
| sort
___________________________________________________________________
(page generated 2023-03-22 23:01 UTC)