[HN Gopher] Show HN: I've made a Monte-Carlo raytracer for glTF ...
___________________________________________________________________
Show HN: I've made a Monte-Carlo raytracer for glTF scenes in
WebGPU
This is a GPU "software" raytracer (i.e. using manual ray-scene
intersections and not RTX) written using the WebGPU API that
renders glTF scenes. It supports many materials, textures, material
& normal mapping, and heavily relies on multiple importance
sampling to speed up convergence.
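As a rough illustration of the multiple importance sampling part,
the weight usually applied to each sample is the balance heuristic;
the sketch below is WGSL-style pseudocode with illustrative names,
not code taken from the repository.

    // Balance heuristic: weight for a sample drawn from strategy A,
    // given the PDFs of both strategies evaluated at that sample.
    fn mis_balance_heuristic(pdf_a: f32, pdf_b: f32) -> f32 {
        return pdf_a / (pdf_a + pdf_b);
    }

    // Example: weighting a direct light sample against BSDF
    // sampling, so directions both strategies can find are not
    // counted twice.
    fn weighted_light_sample(radiance: vec3<f32>, bsdf_value: vec3<f32>,
                             cos_theta: f32, pdf_light: f32,
                             pdf_bsdf: f32) -> vec3<f32> {
        let w = mis_balance_heuristic(pdf_light, pdf_bsdf);
        return radiance * bsdf_value * cos_theta * (w / pdf_light);
    }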
Author : lisyarus
Score : 125 points
Date   : 2024-12-26 17:24 UTC (1 day ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| modeless wrote:
| Do you have a link that runs in the browser?
| lisyarus wrote:
| Nope, this project is desktop-only
| modeless wrote:
| You should try building it with Emscripten. SDL2 is
| supported.
| crazygringo wrote:
| This is a complete side question, but it always astonishes me
| how "real" raytraced scenes can look in terms of lighting, yet
| it's too complex/slow for video games.
|
| How far have we gotten in terms of training AI models on
| raytraced lighting, to simulate it but fast enough for video
| games? Training an AI not on rendered scenes from any particular
| viewpoint, but rather on how light and shadows would be "baked
| into" textures?
|
| Because what raytracing excels at is the overall realism of
| diffuse light. And it seems like the kind of thing AI would be
| good at learning?
|
| I've always thought, e.g. when looking at the shadows trees
| cast, that I couldn't care less whether each leaf shape in the
| shadow is accurate or entirely hallucinated. The important
| thing seems to be the overall light diffusion combined with
| correct nearby shadow shapes for objects. Which seems like
| something AI would excel at?
| Etheryte wrote:
| At any reasonable quality, AI is even more expensive than
| raytracing. A simple intuition for this is the fact that you
| can easily run a raytracer on consumer hardware, even if at low
| FPS, whereas you need a beefy setup to run most AI models and
| they still take a while.
| toshinoriyagi wrote:
| While some very large models may need beefy hardware, there
| are multiple forms of deep learning used for similar
| purposes:
|
| Nvidia's DLSS is a neural network that upscales images so
| that games can be rendered quickly at a lower resolution and
| then upscaled to the display resolution in less total time
| than rendering natively at the display resolution.
|
| Nvidia's DLDSR downscales a greater-than-native resolution
| image faster than typical downscaling algorithms used in DSR.
|
| Nvidia's RTX HDR is a post-processing filter that takes an
| sRGB image and converts it to HDR.
|
| So, it is very likely that a model that converts rasterized
| images to raytraced versions is possible, and fast. The most
| likely roadblock is the lack of a quality dataset for
| training such a model. Not all games have ray tracing, and
| even fewer have quality implementations.
| mywittyname wrote:
| > So, it is very likely that a model that converts
| rasterized images to raytraced versions is possible, and
| fast.
|
| How would this even work and not just be a DLSS derivative?
|
| The magic of ray tracing is the ability to render light
| sources and reflections that are not directly in view. So where
| is the information coming from that the algorithm would use
| to place and draw the lights, shadows, reflections, etc?
|
| I'm not asking to be snarky. I can usually "get there from
| here" when it comes to theoretical technology, but I can't
| work out how a raster image would contain enough data to
| allow for accurate ray tracing to be applied for objects
| whose effects are only included due to ray tracing.
| jsheard wrote:
| To be clear, DLSS is a very different beast from your
| typical AI upscaler: it uses the principle of temporal
| reuse, where _real_ samples from previous frames are
| combined with samples from the current frame in order to
| converge towards a higher resolution over time. It's not
| guessing new samples out of thin air, just guessing whether
| old samples are still usable, which is why DLSS is so fast
| and accurate compared to general-purpose AI upscalers and
| why you can't use DLSS on images or videos.
| jms55 wrote:
| To add to this, DLSS 2 functions exactly the same as a
| non-ML temporal upscaler does: it blends pixels from the
| previous frame with pixels from the current frame.
|
| The ML part of DLSS is that the blend weights are
| determined by a neural net, rather than handwritten
| heuristics.
|
| DLSS 1 _did_ try to use neural networks to predict the
| new (upscaled) pixels outright, which went really poorly
| for a variety of reasons I don't feel like getting into,
| hence why they abandoned that approach.
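|
| A rough WGSL-style sketch of that blend step (the names and the
| simple disocclusion check below are illustrative, not DLSS
| internals):
|
|   // Blend the reprojected history pixel with the freshly
|   // rendered pixel. In a hand-written temporal upscaler the
|   // weight comes from heuristics (depth/velocity disagreement,
|   // color clamping); in DLSS 2 a neural net outputs it instead.
|   fn resolve_pixel(history: vec3<f32>, current: vec3<f32>,
|                    history_valid: bool, blend_weight: f32) -> vec3<f32> {
|       if (!history_valid) {
|           return current; // disocclusion: nothing usable to reuse
|       }
|       return mix(history, current, blend_weight);
|   }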
| TuringTest wrote:
| Yeah, but that has something to do with
|
| 1) commercial hardware pipelines being improved for decades
| in handling 3D polygons, and
|
| 2) graphical AI models being trained on understanding
| natural language in addition to rendering.
|
| I can imagine a new breed of specialized generative graphical
| AI that entirely skips language and is trained on stock 3D
| objects as input, which could potentially perform much
| better.
| 8n4vidtmkvmk wrote:
| I'm not convinced. We have "hyper" and "lightning" diffusion
| models that run 1-4 steps and are pretty quick on consumer
| hardware. I really have no idea which would be quicker with
| some optimizations and hardware tailored for the use-case.
| jsheard wrote:
| The hard part is keeping everything coherent over time in a
| dynamic scene with a dynamic camera. Hallucinating vaguely
| plausible lighting may be adequate for a still image, but
| not so much in a game if you hallucinate shadows or
| reflections of off-screen objects that aren't really there,
| or "forget" that off-screen objects exist, or invent light
| sources that make no sense in context.
|
| The main benefit of raytracing in games is that it has
| accurate global knowledge of the scene beyond what's
| directly in front of the camera, as opposed to earlier
| approximations which tried to work with only what the
| camera sees. Img2img diffusion is the ultimate form of the
| latter approach in that it tries to infer _everything_ from
| what the camera sees, and guesses the rest.
| 8n4vidtmkvmk wrote:
| Right, but I'm not actually suggesting we use diffusion.
| At least, not the same models we're using now. We need to
| incorporate a few sample rays at least so that it 'knows'
| what's _actually_ off-screen, and then we just give it
| lots of training data of partially rendered images and
| fully rendered images so that it learns how to fill in
| the gaps. It shouldn't hallucinate very much if we do
| that. I don't know how to solve for temporal coherence
| though -- I guess we might want to train on videos
| instead of still images.
|
| Also, that new Google paper where it generates entire
| games from a single image has up to 60 seconds of
| 'memory' I think they said, so I don't think the
| "forgetting" is actually that big of a problem since we
| can refresh the memory with a properly rendered image at
| least that often.
|
| I'm just spitballing here though; I think Unreal 5.4 or
| 5.5 has put this into practice already with their new
| lighting system.
| jsheard wrote:
| > We need to incorporate a few sample rays at least so
| that it 'knows' what's actually off-screen, and then we
| just give it lots of training data of partially rendered
| images and fully rendered images so that it learns how to
| fill in the gaps.
|
| That's already a thing, there's ML-driven denoisers which
| take a rough raytraced image and do their best to infer
| what the fully converged image would look like based on
| their training data. For example in the offline rendering
| world there's Nvidia's OptiX denoiser and Intel's OIDN,
| and in the realtime world there's Nvidia's DLSS Ray
| Reconstruction which uses an ML model to do both
| upscaling and denoising at the same time.
|
| https://developer.nvidia.com/optix-denoiser
|
| https://www.openimagedenoise.org
| jampekka wrote:
| The current approach seems to be ray tracing a limited/feasible
| number of samples and upsampling/denoising the result using
| neural networks.
| omolobo wrote:
| See: https://research.nvidia.com/labs/rtr/tag/neural-rendering/
|
| Specifically this one, which seems to tackle what you
| mentioned:
| https://research.nvidia.com/labs/rtr/publication/hadadan2023...
| lispisok wrote:
| This is an interesting idea, but please, no more AI graphics
| generation in video games. Games don't get optimized anymore
| because devs rely on AI upscaling and frame generation to get
| playable framerates, and it makes the games look bad and play
| badly.
| holoduke wrote:
| No, it's because the hardware is not fast enough. Performance
| optimization is a large part of engine development. It
| happens at Epic as well.
| vrighter wrote:
| No, it's because the software they write has higher
| requirements than computers at the time can provide. They
| bite off more than the hardware can chew. And they already
| know exactly just how much the hardware we have can chew...
| yet they do it anyway. You write stuff for the hardware we
| have NOW. "The hardware is not fast enough" is never an
| excuse. The hardware was there first, you write software
| for it. You don't write software for nonexistent hardware
| and then complain that the current one isn't fast enough.
| The hardware is fine (it always is). It's the software
| that's too heavy. If you don't have enough compute power to
| render that particular effect... then maybe don't render
| that particular effect, and take technical constraints into
| account in your art style.
| holoduke wrote:
| I agree, that's true.
| nuclearsugar wrote:
| I think there will certainly be an AI 3D render engine at some
| point. But currently AI is used in 3D render engines to assist
| with denoising.
| https://docs.blender.org/manual/en/2.92/render/layers/denois...
| jms55 wrote:
| This was a recent presentation from SIGGRAPH 2024 that covered
| using neural nets to store baked (not dynamic!) lighting:
| https://advances.realtimerendering.com/s2024/#neural_light_g....
|
| Even with the fact that it's static lighting, you can already
| see a ton of the challenges that they faced. In the end they
| did get a fairly usable solution that improved on their
| existing baking tools, but it took what seems like months of
| experimenting without clear linear progress. They could have
| just as easily stalled out and been stuck with models that
| didn't work.
|
| And that was just for static lighting, not even realtime
| dynamic lighting. ML is going to need a lot of advancements
| before it can predict lighting wholesale, faster and easier
| than tracing rays.
|
| On the other hand, ML is really, really good at replacing all
| the mediocre handwritten heuristics 3D rendering has. For
| lighting, denoising low-signal (0.5-1 rays per pixel) lighting
| is a big area of research[0], since handwritten heuristics tend
| to struggle with so little data available, along with lighting
| caches[1], which have to adapt to a wide variety of situations
| that again make handwritten heuristics struggle.
|
| [0]: https://gpuopen.com/learn/neural_supersampling_and_denoising...,
| and the references it lists
|
| [1]: https://research.nvidia.com/publication/2021-06_real-time-ne...
| holoduke wrote:
| It's still hard to do in realtime. You need so much GPU memory
| that a second GPU must be used, at least today. The question is
| which gets there quicker: hard-calculated simulation or AI
| post-processing. Or maybe a combination?
| omolobo wrote:
| It's a mega-kernel, so you'll get poor occupancy past the first
| bounce. A better strategy is to shoot, sort, and repeat, which
| then also allows you to squeeze in an adaptive sampler in the
| middle.
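|
| To make "shoot, sort, and repeat" concrete, a minimal WGSL-style
| sketch of the queue side of a wavefront tracer could look like
| the following (the names, layout, and choice of material id as
| the sort key are assumptions, not code from this repository):
|
|   struct QueuedRay {
|       origin: vec3<f32>,
|       direction: vec3<f32>,
|       throughput: vec3<f32>,
|       pixel: u32,
|   }
|
|   @group(0) @binding(0) var<storage, read_write> next_rays: array<QueuedRay>;
|   @group(0) @binding(1) var<storage, read_write> next_keys: array<u32>;
|   @group(0) @binding(2) var<storage, read_write> next_count: atomic<u32>;
|
|   // Called at the end of the shade kernel for rays that survive
|   // the bounce: append the extension ray plus a sort key, so the
|   // host can sort the queue between bounces and relaunch a more
|   // coherent batch.
|   fn push_extension_ray(ray: QueuedRay, material_id: u32) {
|       let slot = atomicAdd(&next_count, 1u);
|       next_rays[slot] = ray;
|       next_keys[slot] = material_id;
|   }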
|
| > // No idea where negative values come from :(
|
| I don't know, but:
|
| > newRay.origin += sign(dot(newRay.direction, geometryNormal)) *
| geometryNormal * 1e-4;
|
| The new origin should be along the reflected ray, not along the
| direction of the normal. This line basically adds the normal
| (with a sign) to the origin (intersection point), which seems
| odd.
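|
| In code, the difference is roughly this (a sketch reusing the
| names from the quoted line):
|
|   // Quoted approach: nudge the origin off the surface along the
|   // geometric normal, flipped to whichever side the new ray
|   // leaves from.
|   newRay.origin += sign(dot(newRay.direction, geometryNormal))
|                  * geometryNormal * 1e-4;
|
|   // Suggested alternative: step a tiny distance along the new
|   // ray direction itself.
|   newRay.origin += newRay.direction * 1e-4;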
|
| A poor man's way to find where the negatives come from is to
| max(0, ...) stuff until you find it.
| lisyarus wrote:
| > It's a mega-kernel, so you'll get poor occupancy past the
| first bounce
|
| Sure! If you look into the to-do list, there's a "wavefront
| path tracer" entry :)
|
| > new origin should be along the reflected ray
|
| I've found that doing it the way I'm doing it works better for
| preventing self-intersections. Might be worth investigating,
| though.
| omolobo wrote:
| It probably works better when the reflected ray is almost
| tangent to the surface. But that should be an epsilon case.
| TomClabault wrote:
| > A better strategy is to shoot, sort, and repeat
|
| Do we have a good sorting strategy whose cost is amortized yet?
| Meister 2020
| (https://meistdan.github.io/publications/raysorting/paper.pdf)
| shows that the hard part is actually to hide the cost of the
| sorting.
|
| > squeeze in an adaptive sampler in the middle
|
| Can you expand on that? How does that work? I only know of
| adaptive sampling in screen space, where you shoot more or
| fewer rays at certain pixels based on their estimated variance
| so far.
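|
| For reference, the screen-space version I have in mind is
| roughly this (a WGSL-style sketch with made-up names):
|
|   // Per-pixel running sums kept across frames, from which the
|   // variance of the luminance estimate can be derived.
|   struct PixelStats {
|       sum: f32,     // sum of luminance samples
|       sum_sq: f32,  // sum of squared luminance samples
|       count: f32,   // number of samples taken so far
|   }
|
|   // Decide how many new rays this pixel gets next frame: pixels
|   // whose estimate is still noisy receive more.
|   fn samples_for_pixel(stats: PixelStats, low: u32, high: u32) -> u32 {
|       let mean = stats.sum / stats.count;
|       let variance = max(stats.sum_sq / stats.count - mean * mean, 0.0);
|       let rel_error = sqrt(variance / stats.count) / max(mean, 1e-4);
|       return select(low, high, rel_error > 0.05);
|   }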
| TomClabault wrote:
| After reading this paper a bit more, it seems that it
| focuses on simple scenes and simple materials only, which is
| a bit unfortunate. This is exactly where ray reordering
| overhead is going to be the most problematic.
|
| They also do talk about the potential of ray reordering for
| complex scenes and complex materials in the paper (because
| reordering helps with shading divergence since "all"
| reordered rays are pretty much going to hit the same
| material).
|
| So maybe ray reordering isn't dead just yet. Probably would
| have to try that at some point...
| pjmlp wrote:
| WebGPU projects that don't provide browser examples are kind of
| strange; then you might as well use Vulkan or whatever.
| lisyarus wrote:
| See my answer to artemonster above.
| sspiff wrote:
| WebGPU is a way nicer HAL if you're not an experienced graphics
| engineer. So even if you only target desktops, it's a valid
| choice.
|
| On the web, WebGPU is only supported by Chrome-based browser
| engines at this point, and a lot of software developers use
| Firefox (and don't really like encouraging a browser
| monoculture), so it doesn't make a ton of sense to target
| browser based WebGPU for some people at this point.
| lisyarus wrote:
| It's not as much about experience as it is about trade-offs.
| I've worked a lot with Vulkan and it's an incredible API, but
| when you're working alone and you don't have the goal of
| squeezing 250% performance out of your GPU on dozens of
| different GPU architectures, your performance becomes pretty
| much independent of a specific graphics API (unless your API
| doesn't support some stuff like multi-draw-indirect, etc).
| pjmlp wrote:
| The answer is a middleware engine, all of them with much nicer
| tooling available, without the constraints of a browser
| sandboxing design targeting a minimum common denominator of
| 2017 graphics APIs.
| artemonster wrote:
| > "GPU "software" raytracer"
|
| > WebGPU
|
| > this project is desktop-only
|
| Boss, I am confused, boss.
| lisyarus wrote:
| I'm using WebGPU as a nice modern graphics API that is at the
| same time much more user-friendly and easier to use compared to
| e.g. Vulkan. I'm using a desktop implementation of WebGPU
| called wgpu, via its C bindings called wgpu-native.
|
| My browser doesn't support WebGPU properly yet, so I don't
| really care about running this thing in the browser.
| firtoz wrote:
| That's a fascinating approach.
|
| And it makes me a bit sad about the state of WebGPU; hopefully
| that'll be resolved soon... I'm also on Linux, impatiently
| waiting for WebGPU to be supported in my browser.
| bezdomniy wrote:
| Very cool. I did a similar project with wgpu in Rust:
| https://github.com/bezdomniy/Rengin. Nice to find your project
| to see where I can improve!
___________________________________________________________________
(page generated 2024-12-27 23:02 UTC)