[HN Gopher] UE5 Nanite in WebGPU
       ___________________________________________________________________
        
       UE5 Nanite in WebGPU
        
       Author : vouwfietsman
       Score  : 221 points
       Date   : 2024-09-05 17:55 UTC (5 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | jsheard wrote:
       | It's cool that it kind of works, but they had to make some nasty
        | compromises to get around WebGPU's lack of 64 bit atomics.
       | Hopefully that will be added as an optional extension at some
       | point, hardware support is almost ubiquitous on desktop-class
       | hardware at least (AMD and Nvidia have had it forever but Apple
       | has only had it since the M3).
        
         | throwaway17_17 wrote:
         | What is the use case for atomics in the rasterizer? I can't
         | figure out what the atomic operations do inside the rendering
          | pipeline. I looked at the GitHub repo, but couldn't find
          | where the atomics were hoped to be used.
        
           | hrydgard wrote:
           | Pack Z and 32-bit color together into a 64-bit integer, then
           | do an atomic min (or max with reversed Z) to effectively do a
           | Z-query and a write really, really fast.
        
             | jsheard wrote:
             | Nanite writes out the ID of the primitive at that pixel
             | rather than the color, but otherwise yeah that's the idea.
             | After rasterization is done a separate pass uses that ID to
             | fetch the vertex data again and reconstruct all of the
             | material parameters, which can be freely written out
             | without atomics since there's exactly one thread per pixel
             | at that point.
        
           | neomantra wrote:
           | Visibility buffer needing atomics is noted briefly in the
           | long README. Link to discussion detailing it:
           | https://github.com/Scthe/nanite-webgpu/issues/1
        
           | jsheard wrote:
           | With traditional hardware rasterization there are specialized
           | hardware blocks which handle atomically updating the
           | framebuffer to whichever sample is currently the closest to
           | the camera, and discarding anything behind that. Nanite does
           | software rasterization instead, and one of their insights was
           | figuring out a practical way to cram all of the data needed
           | for each pixel into just 64 bits (depth in the high bits and
           | everything else in the low bits) which allows them to do
           | efficient depth sorting using min/max atomics from a compute
           | shader instead. The 64 bits are crucial though, that's the
           | absolute bare minimum useful amount of data per pixel so you
            | _really_ need 64 bit atomics. Nanite doesn't even try to
           | work without them.
           | 
           | To kind of get it working with 32 bit atomics this demo is
           | reducing depth to just 16 bits (not enough to avoid
           | artifacts) and only encoding a normal vector into the other
           | 16 bits, which is why the compute rasterized pixels are
           | untextured. There just aren't enough bits to store any more
           | material parameters or a primitive ID, the latter being how
           | Nanite does it.
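            | 
            | Roughly, that 32 bit fallback boils down to something like
            | this (my own sketch in the repo's WGSL-in-TS style; not the
            | project's actual code, and the names are made up):
            | 
            |   const swDepthSortSketch = /* wgsl */ `
            |   // One u32 per pixel: 16 bit depth in the high bits, a
            |   // 16 bit payload (e.g. packed normal) in the low bits.
            |   // visBuffer assumed cleared to 0xffffffff each frame.
            |   @group(0) @binding(0)
            |   var<storage, read_write> visBuffer: array<atomic<u32>>;
            | 
            |   fn packPixel(depth01: f32, payload16: u32) -> u32 {
            |     let d = u32(clamp(depth01, 0.0, 1.0) * 65535.0);
            |     return (d << 16u) | (payload16 & 0xffffu);
            |   }
            | 
            |   fn writePixel(idx: u32, depth01: f32, payload16: u32) {
            |     // Smallest packed value = closest sample wins
            |     // (with reversed Z you'd use atomicMax instead).
            |     let packed = packPixel(depth01, payload16);
            |     _ = atomicMin(&visBuffer[idx], packed);
            |   }
            |   `;
            | 
            | With 64 bit atomics the same trick fits full-precision depth
            | plus a 32 bit primitive ID, which is what Nanite relies on.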
        
         | my123 wrote:
         | Since the M2
        
           | jsheard wrote:
           | Right you are, 64 bit atomics were added with the Apple8 GPU
           | but only in M-series chips (M2 and up) and then the Apple9
           | GPU made it universal (A17 Pro and up).
           | 
           | https://developer.apple.com/metal/Metal-Feature-Set-
           | Tables.p...
        
       | moralestapia wrote:
        | Outstanding work. Also, thanks for providing actual demos of
        | the tech. I get 60-120fps on my MBP, which is phenomenal given
        | the amount of triangles in the scene.
        
       | macawfish wrote:
       | The camera controls on my phone are very hard to get down
        
         | aaroninsf wrote:
         | Browser/touchpad also :)
        
         | bob1029 wrote:
         | Couldn't we use something like this to provide a more intuitive
         | experience for the mobile web targets?
         | 
         | https://developer.mozilla.org/en-US/docs/Web/API/Device_orie...
        
       | TaylorAlexander wrote:
       | It says my iPhone 12 Pro Max doesn't have WebGPU, but I enabled
       | it in experimental features and another website[1] with WebGPU
       | demos now works. Has anyone gotten this working on iPhone? Would
       | be nice if the web app gave more info on what failed.
       | 
       | [1] https://webgpu.github.io/webgpu-samples/?sample=texturedCube
        
         | KMnO4 wrote:
         | I enabled WebGPU in Safari but I'm seeing a bunch of shader
         | errors.
         | 
          | WebGPU error [init][validation]: 6 errors generated while
          | compiling the shader:
          | 
          |   50:22: unresolved call target 'pack4x8snorm'
          |   50:9:  cannot bitcast from 'aY=' to 'f32'
          |   54:10: unresolved call target 'unpack4x8snorm'
          |   59:22: unresolved call target 'pack4x8unorm'
          |   59:9:  cannot bitcast from 'aY=' to 'f32'
          |   63:9:  unresolved call target 'unpack4x8unorm'
        
       | soulofmischief wrote:
       | Is the demo using user agent strings to determine compatibility?
        | That's not good; compatibility should be determined on a
        | case-by-case basis, by simply attempting to detect or use the
        | specific feature.
       | 
       | I am on Chromium, not Chrome, and use WebGPU all the time, but
       | the demos tell me to use Chrome, which I cannot do ethically.
       | Would love to try the demos out, this looks like a lot of hard
       | work!
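        | 
        | For reference, plain feature detection is only a few lines;
        | a rough sketch (assumes @webgpu/types for the TS types):
        | 
        |   async function hasWebGPU(): Promise<boolean> {
        |     // navigator.gpu is undefined without WebGPU support
        |     if (!("gpu" in navigator)) return false;
        |     try {
        |       const adapter = await navigator.gpu.requestAdapter();
        |       return adapter !== null;
        |     } catch {
        |       return false;
        |     }
        |   }
        | 
        | A page can then fall back (or explain what's missing) based on
        | the adapter's features and limits instead of the browser name.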
        
         | robin_reala wrote:
         | Don't think so. I'm on a Firefox that has experimental WebGPU
         | support enabled, and it fails with shader compilation errors
         | rather than any message.
        
         | drusepth wrote:
         | If this is the case, I imagine it'd be pretty easy to spoof
         | your UA and see the demo, even from Chromium.
        
         | gpm wrote:
         | > and use WebGPU all the time
         | 
         | I'm curious, what for?
        
           | soulofmischief wrote:
           | I've used it to build and/or run various machine learning
           | models for text generation, speech recognition, image
           | generation, depth estimation, etc. in the browser, in support
           | of an agentic system I've been building out.
           | 
           | Lots of future possibilities as well once support is more
           | ubiquitous!
        
         | Twirrim wrote:
         | It's not working for me on Chrome under Linux, nor on Android,
         | for what it's worth (though Firefox is what I use for
         | practically all my browsing needs). Something really odd with
         | their detection logic.
        
           | pjmlp wrote:
            | WebGPU is not supported on Linux, and it won't be for the
            | foreseeable future.
            | 
            | On Android you need at least Android 12, with good enough
            | Vulkan drivers that aren't blacklisted.
        
             | sva_ wrote:
             | > WebGPU is not supported on Linux, and it won't for the
             | foreseeable future.
             | 
             | A lot of it runs fine with a flag.
        
         | bakugo wrote:
         | >Is the demo using user agent strings to determine
         | compatibility
         | 
         | >I am on Chromium, not Chrome
         | 
         | Don't know about your build, but I'm using Ungoogled Chromium,
         | and it has the exact same user-agent string as Google Chrome.
         | 
         | Have you enabled the WebGL permission for the site in site
         | settings? I think it was disabled by default for me.
        
       | SaintSeiya wrote:
        | Honest question: it is claimed that the software rasterizer is
        | faster than the hardware one. Can someone explain why? Isn't
        | the purpose of the GPU to accelerate rasterization itself?
        | Unless it's a recent algorithm, or the "software rasterizer" is
        | actually running on the GPU and not the CPU, I don't see how.
        
         | janito wrote:
         | I'm also curious. From what I could read in the repository's
         | references, I think that the problem is that the GPU is bad at
         | rasterizing small triangles. Apparently each triangle in the
         | fixed function pipeline generates a batch of pixels to render
         | (16 in one of the slides I saw), so if the triangle covers only
         | one or two pixels, all others in the batch are wasted. I
          | speculate that the idea is to detect these small triangles
          | and draw them quickly using fewer pixel shader invocations
          | (still on the GPU, but without the graphics-specific fixed
          | functions), but I'm honestly not sure I understand what's
          | happening.
        
         | NotGMan wrote:
          | I'm a bit out of the GPU game, so this might be slightly
          | wrong in some places: the issue is with small triangles,
          | where you end up paying a huge cost. GPUs ALWAYS shade in
          | 2x2 blocks of pixels, not 1x1 pixels.
          | 
          | So if you have a very small triangle (small as in how many
          | pixels on the screen it covers) that covers just 1 pixel,
          | you still pay the price of a 2x2 block (4 pixels shaded
          | instead of 1), so 3 of the 4 are wasted.
         | 
          | Nanite auto-picks the best triangle density (LOD) to
          | minimize this, and probably optimizes many more perf metrics
          | that I have no idea about.
         | 
         | So even if you do it in software the point is that if you can
         | get rid of that 2x2 block penalty as much as possible you could
         | be faster than GPU doing 2x2 blocks in hardware since pixel
         | shaders can be very expensive.
         | 
         | This issue gets worse the larger the rendering resolution is.
         | 
         | Nanite then picks larger triangles instead of those tiny
         | 1-pixel ones since those are too small to give any visual
         | fidelity anyway.
         | 
          | Nanite's software rasterizer is also not used for large
          | triangles, since those are more efficient to do in hardware.
        
           | kllrnohj wrote:
           | > So even if you do it in software the point is that if you
           | can get rid of that 2x2 block penalty as much as possible you
           | could be faster than GPU doing 2x2 blocks in hardware since
           | pixel shaders can be very expensive.
           | 
           | Of course the obvious problem with that is if you don't have
           | most of the screen covered in such small triangles then
           | you're paying a large cost for nanite vs traditional means.
        
         | Animats wrote:
         | The answer to that is in this hour-long SIGGRAPH video.[1] Some
         | of the operations needed are not done well, or at all, by the
         | GPU.
         | 
         | [1] https://www.youtube.com/watch?v=eviSykqSUUw
        
           | janito wrote:
           | Here's the relevant part of the (really cool!) video:
           | https://www.youtube.com/watch?v=eviSykqSUUw&t=1888s
        
         | SaintSeiya wrote:
          | thanks all, yes it starts making sense now
        
         | bob1029 wrote:
          | I thought it was a software rasterizer running inside a
          | fragment shader on the GPU, not actually on the CPU. I need
          | to watch that video again to be sure, but I can't see how a
          | CPU could handle that many triangles.
        
           | raphlinus wrote:
           | To be precise, this is running in a compute shader
           | (rasterizeSwPass.wgsl.ts for the curious). You can think of
           | that as running the GPU in a mode where it's a type of
           | computer with some frustrating limitations, but also the
           | ability to efficiently run thousands of threads in parallel.
           | 
            | This is in contrast to hardware rasterization, where
            | dedicated hardware onboard the GPU decides which pixels
            | are covered by a given triangle and assigns those pixels
            | to a fragment shader, where the color (and potentially
            | other things) is computed and finally written to the
            | render target by a raster op (also a bit of specialized
            | hardware).
           | 
           | The seminal paper on this is cudaraster [1], which
           | implemented basic 3D rendering in CUDA (the CUDA of 13 years
           | ago is roughly comparable in power to compute shaders today),
           | and basically posed the question: how much does using the
           | specialized rasterization hardware help, compared with just
           | using compute? The answer is roughly 2x, though it depends a
           | lot on the details.
           | 
           | And those details are important. One of the assumptions that
           | hardware rasterization relies on for efficiency is that a
           | triangle covers dozens of pixels. In Nanite, that assumption
           | is not valid, in fact a great many triangles are
           | approximately a single pixel, and then software/compute
           | approaches actually start beating the hardware.
           | 
           | Nanite, like this project, thus actually uses a hybrid
           | approach: rasterization for medium to large triangles, and
           | compute for smaller ones. Both can share the same render
           | target.
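            | 
            | To give a flavor of the compute path, here's a hand-wavy
            | single-kernel sketch in the repo's WGSL-in-TS style (not
            | the actual rasterizeSwPass.wgsl.ts; it skips clipping,
            | subpixel precision, and the real packing scheme):
            | 
            |   const swRasterSketch = /* wgsl */ `
            |   struct ScreenTri { v0: vec3f, v1: vec3f, v2: vec3f };
            | 
            |   @group(0) @binding(0)
            |   var<storage, read> tris: array<ScreenTri>;
            |   @group(0) @binding(1)
            |   var<storage, read_write> visBuf: array<atomic<u32>>;
            | 
            |   const W: u32 = 1280u;
            |   const H: u32 = 720u;
            | 
            |   fn edgeFn(a: vec2f, b: vec2f, p: vec2f) -> f32 {
            |     return (b.x - a.x) * (p.y - a.y)
            |          - (b.y - a.y) * (p.x - a.x);
            |   }
            | 
            |   // One thread per (small) triangle.
            |   @compute @workgroup_size(64)
            |   fn main(@builtin(global_invocation_id) gid: vec3u) {
            |     if (gid.x >= arrayLength(&tris)) { return; }
            |     let t = tris[gid.x];
            |     let a = t.v0.xy;
            |     let b = t.v1.xy;
            |     let c = t.v2.xy;
            |     let area = edgeFn(a, b, c);
            |     if (area <= 0.0) { return; } // backfacing
            | 
            |     // Walk the clamped screen-space bounding box.
            |     var lo = floor(min(a, min(b, c)));
            |     var hi = ceil(max(a, max(b, c)));
            |     if (hi.x < 0.0 || hi.y < 0.0) { return; }
            |     lo = max(lo, vec2f(0.0));
            |     hi = min(hi, vec2f(f32(W) - 1.0, f32(H) - 1.0));
            |     for (var y = u32(lo.y); y <= u32(hi.y); y++) {
            |       for (var x = u32(lo.x); x <= u32(hi.x); x++) {
            |         let p = vec2f(f32(x) + 0.5, f32(y) + 0.5);
            |         let w0 = edgeFn(b, c, p);
            |         let w1 = edgeFn(c, a, p);
            |         let w2 = edgeFn(a, b, p);
            |         if (w0 < 0.0 || w1 < 0.0 || w2 < 0.0) {
            |           continue;
            |         }
            |         // Interpolate depth, then depth-sort with a
            |         // packed depth+ID atomic (see the 64/32 bit
            |         // atomics subthread above).
            |         let z = (w0 * t.v0.z + w1 * t.v1.z
            |                + w2 * t.v2.z) / area;
            |         let packed = (u32(z * 65535.0) << 16u)
            |                    | (gid.x & 0xffffu);
            |         _ = atomicMin(&visBuf[y * W + x], packed);
            |       }
            |     }
            |   }
            |   `;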
           | 
           | [1]: https://research.nvidia.com/publication/2011-08_high-
           | perform...
        
         | TinkersW wrote:
         | A couple reasons
         | 
          | 1. HW always does 2x2 blocks of pixels so it can have
          | derivatives, even if you don't use them.
         | 
          | 2. Accessing SV_PrimitiveID is surprisingly slow on
          | Nvidia/AMD; writing it out in the PS takes a huge perf hit
          | in HW. There are ways to work around this, but they aren't
          | trivial and differ between vendors, and you have to be
          | aware of the issue in the first place! I think some of the
          | "software" > "hardware" raster stuff may come from this.
         | 
         | The HW shader in this demo looks wonky though, it should be
         | writing out the visibility buffer, and instead it is writing
         | out a vec4 with color data, so of course that is going to hurt
         | perf. Way too many varyings being passed down also.
         | 
          | In a high-triangle-count HW rasterizer you want the
          | visibility buffer PS to do as little compute as possible
          | and write as little as possible, so it should only have 1
          | or 2 input varyings and simply write them out.
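          | 
          | In WGSL terms the entire fragment stage of the HW-raster
          | path can be about this big (a sketch assuming an r32uint
          | target, not this demo's actual shader):
          | 
          |   const visBufferFs = /* wgsl */ `
          |   struct VsOut {
          |     @builtin(position) pos: vec4f,
          |     // e.g. (instanceId << 7u) | triangleId, packed in
          |     // the vertex stage; the single flat varying.
          |     @location(0) @interpolate(flat) visId: u32,
          |   };
          | 
          |   @fragment
          |   fn fsMain(v: VsOut) -> @location(0) u32 {
          |     return v.visId; // depth test handles the sorting
          |   }
          |   `;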
        
       | moffkalast wrote:
       | > No WebGPU available. Please use Chrome.
       | 
       | Getting that on Chromium, lol.
        
         | gpm wrote:
          | I'm getting that in _google chrome proper_, but what completes
         | the joke is that in firefox I just get a blank page without the
         | message to use chrome.
         | 
         | Edit: WebGPU in chrome is behind a flag on linux:
         | https://github.com/gpuweb/gpuweb/wiki/Implementation-Status#...
        
       | Animats wrote:
       | Oh, nice. Third party implementations of Nanite playback.
       | 
       | Nanite is a very clever representation of graphics meshes.
       | They're directed acyclic graphs rather than trees. Repetition is
       | a link, not a copy. It's recursive; meshes can share submeshes,
       | which in turn can share submeshes, all the way down. It's also
       | set up for within-mesh level of detail support, so the submeshes
       | drop out when they're small enough. So you can have repetitive
       | content of very large size with a finite amount of data and fast
       | rendering times. The insight is that there are only so many
       | pixels on screen, so there's an upper bound on rendering work
       | really needed.
       | 
       | There's a really good SIGGRAPH video on this from someone at
       | Epic.
       | 
        | Current GPU designs are a mismatch for Nanite. Some new hardware
        | operations are needed to do more of this in the GPU, where it
       | belongs. Whether that will happen, with NVidia distracted by the
       | AI market, is a good question.
       | 
       | The scene needs a lot of instancing for this to pay off. Unreal
       | Engine demos show such things as a hall of identical statues. If
       | each statue was different, Nanite would help far less. So it
       | works best for projects where a limited number of objects are
       | reused to create large areas of content. That's the case for most
       | AAA titles. Watch a video of Cyberpunk 2077, and look for
       | railings and trash heaps. You'll see the same ones over and over
       | in totally different contexts.
       | 
       | Making a nanite mesh is complicated, with a lot of internal
       | offsets for linking, and so far only Unreal Engine's editor does
       | it. With playback now open source, someone will probably do that.
       | 
       | Those internal offsets in the format present an attack surface
       | which probably can be exploited with carefully crafted bad
       | content, like hostile Microsoft Word .doc files.
        
         | turtledragonfly wrote:
         | I think the SIGGRAPH talk you referred to is: "A Deep Dive into
         | Nanite Virtualized Geometry"
         | (https://www.youtube.com/watch?v=eviSykqSUUw)
         | 
         | There's also this short high-level intro (2.5 min) that I
         | thought was decent: "What is virtualized micropolygon geometry?
         | An explainer on Nanite"
         | (https://www.youtube.com/watch?v=-50MJf7hyOw)
        
         | Jasper_ wrote:
         | > Repetition is a link, not a copy. It's recursive; meshes can
         | share submeshes, which in turn can share submeshes, all the way
         | down.
         | 
         | While it does construct a DAG to perform the graph cut, the
         | final data set on disk is just a flat list of clusters for
         | consideration, along with their cutoffs for
         | inclusion/rejection. There seems to be a considerable
         | misunderstanding of what the DAG is used for, and how it's
         | constructed. It's constructed dynamically based on the vertex
         | data, and doesn't have anything to do with how the artist
         | constructed submeshes and things, nor does "repetition become a
         | link".
         | 
         | > The scene needs a lot of instancing for this to pay off.
         | Unreal Engine demos show such things as a hall of identical
         | statues. If each statue was different, Nanite would help far
         | less.
         | 
         | What makes you say this? The graph cut is _different_ for each
          | instance of the object, so they can't use traditional
         | instancing, and I don't even see how it could help.
        
           | Animats wrote:
           | It may not be based on what the mesh's creator considered
           | repetition, but repetition is encoded within the mesh. Not
           | sure if the mesh builder discovers some of the repetition
           | itself.
           | 
           | Look at a terrain example:
           | 
           | https://www.youtube.com/watch?v=DKvA7NZRUcg
        
             | Jasper_ wrote:
             | I'm not seeing what you claim to be seeing in that demo
             | video. I see a per-triangle debug view, and a per-cluster
             | debug view. None of that is showing repetition.
        
               | Animats wrote:
               | If there wasn't repetition, you'd need a really huge GPU
               | for that scene at that level of detail.
        
               | gmueckl wrote:
               | I don't want to estimate storage space right now, but
               | meshes can be stored very efficiently. For example, I
               | think UE uses an optimization where vertex positions are
               | heavily quantized to just a few bits within the meshlet's
               | bounding box. Index buffers can be constructed to share
               | the same vertices across LOD levels. Shading normals can
               | be quantized quite a bit before shading artifacts become
               | noticeable - if you even need them anymore at that
               | triangle density.
               | 
               | If your triangles are at or below the size of a texel,
               | texture values could even be looked up offline and stored
               | in the vertex attributes directly rather than keeping the
               | UV coordinates around, but that may not be a win.
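                | 
                | The position part is basically this (illustrative
                | sketch; bit count and names invented):
                | 
                |   // Quantize one coordinate to `bits` bits,
                |   // relative to the meshlet's bounding box.
                |   function quantizeAxis(
                |     v: number, lo: number, hi: number, bits = 10,
                |   ): number {
                |     const maxVal = (1 << bits) - 1;
                |     const t = (v - lo) / (hi - lo || 1); // 0..1
                |     const c = Math.min(Math.max(t, 0), 1);
                |     return Math.round(c * maxVal);
                |   }
                | 
                |   // e.g. 10 bits per axis, 30 bits per vertex:
                |   // quantizeAxis(x, box.minX, box.maxX)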
        
               | jms55 wrote:
               | Not necessarily. Nanite compresses meshes (including in-
               | memory) _very_ heavily, and _also_ streams in only the
               | visible mesh data.
               | 
               | In general, I wouldn't think of Nanite as "one thing".
               | It's a combination of many, many different techniques
               | that add up into some really good technology.
        
         | diggan wrote:
         | > and so far only Unreal Engine's editor does it
         | 
         | Not a major/mainstream engine by any means (a small Rust ECS
         | game engine) but Bevy also supports something similar under the
         | feature name "Virtual Geometry", mentioned here:
         | https://bevyengine.org/news/bevy-0-14/#virtual-geometry-expe...
         | 
         | Also, a technical deep dive into the feature from one of the
         | authors of the feature:
         | https://jms55.github.io/posts/2024-06-09-virtual-geometry-be...
        
         | vinkelhake wrote:
         | > Nanite playback
         | 
         | That's not what this is though. It's an implementation of the
         | techniques/technology used in Nanite. It doesn't load data from
         | Unreal Engine's editor. One of the mentioned goals:
          | 
          |     Simplicity. We start with an OBJ file and everything is
          |     done in the app. No magic pre-processing steps, Blender
          |     exports, etc. You set the breakpoint at loadObjFile()
          |     and F10 your way till the first frame finishes.
        
         | pcwalton wrote:
         | > Making a nanite mesh is complicated, with a lot of internal
         | offsets for linking, and so far only Unreal Engine's editor
         | does it.
         | 
         | meshoptimizer [1] is an OSS implementation of meshlet
         | generation, which is what most people think of when they think
         | of "Nanite's algorithm". Bevy, mentioned in a sibling reply,
         | uses meshoptimizer as the generation tool.
         | 
         | (Strictly speaking, "Nanite" is a brand name that encompasses a
         | large collection of techniques, including meshlets, software
         | rasterization, streaming geometry, etc. For clarity, when
         | discussing these concepts outside of the context of the Unreal
         | Engine specifically, I prefer to refer to individual techniques
         | instead of the "Nanite" brand. They're really separate, even
         | though they complement one another. For example, software
         | rasterization can be profitably used without meshlets if your
         | triangles are really small. Streaming geometry can be useful
         | even if you aren't using meshlets. And so on.)
         | 
         | [1]: https://github.com/zeux/meshoptimizer
        
           | jms55 wrote:
           | Small correction: meshoptimizer only does the grouping
           | triangles -> meshlets part, and the mesh simplification.
           | Actually building the DAG, grouping clusters together, etc is
           | handled by Bevy code (I'm the author, happy to answer
           | questions).
           | 
           | That said I do know zeux was interested in experimenting with
           | Nanite-like DAGs directly in meshoptimizer, so maybe a future
           | version of the library will have an end-to-end API.
        
         | jiggawatts wrote:
         | I read through the papers and my impression was that the
         | biggest gains were from quantised coordinates and dynamic LOD
         | for small patches instead of the entire mesh.
         | 
         | The logic behind nanite as I understood it was to keep the mesh
         | accuracy at roughly 1 pixel precision. So for example, a low
         | detail mesh can be used with coordinates rounded to just 10
         | bits (or whatever) if the resulting error is only about half a
         | pixel when perspective projected onto the screen.
         | 
         | I vaguely remember the quantisation pulling double duty: not
          | only does it reduce the data storage size, it also helps the
          | LOD generation because it snaps vertices to the same locations
          | in space. The duplicates can then be eliminated.
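          | 
          | Back-of-envelope version of that test (my sketch; the real
          | metric is more careful, e.g. using cluster bounding spheres):
          | 
          |   // Screen-space size, in pixels, of a world-space
          |   // error under simple perspective projection.
          |   function errorInPixels(
          |     worldError: number, // simplification error
          |     distance: number,   // camera -> cluster
          |     fovY: number,       // vertical FOV, radians
          |     screenHeightPx: number,
          |   ): number {
          |     const worldPerPx =
          |       (2 * distance * Math.tan(fovY / 2)) / screenHeightPx;
          |     return worldError / worldPerPx;
          |   }
          | 
          |   // Use the coarser LOD while its error stays sub-pixel,
          |   // i.e. errorInPixels(err, dist, fov, height) < 1.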
        
         | hyperthesis wrote:
         | This is like when Joel said git stores diffs.
        
       | devit wrote:
        | The name and description are very confusing, and a trademark
        | violation, since despite the claims it seems to be completely
        | unrelated to the actual Nanite in UE5; it's just an
        | implementation of something similar by a person unaffiliated
        | with UE5.
       | 
       | There is also Bevy's Virtual Geometry that provides similar
       | functionality and is probably much more useful since it's written
       | in Rust and integrated with a game engine:
       | https://jms55.github.io/posts/2024-06-09-virtual-geometry-be...
        
         | KMnO4 wrote:
         | I don't think it's really an issue. It's clear from the readme
         | that it's an implementation.
         | 
         | If I made an "implementation of OpenAI's GPT-3 in JS" you would
         | understand that to mean I took the architecture from the
         | whitepaper and reimplemented it.
        
       | smartmic wrote:
       | Wow, I can't remember the last time I read a project summary with
       | so much jargon - I literally didn't understand anything:
       | 
       | > UE5's Nanite implementation using WebGPU. Includes the meshlet
       | LOD hierarchy, software rasterizer and billboard impostors.
       | Culling on both per-instance and per-meshlet basis.
        
         | goodcjw2 wrote:
          | Guess this really shows how much domain-specific knowledge
          | there is in computer graphics...
          | 
          | Yet still, this post is now ranked #1 on HN.
        
         | bogwog wrote:
         | UE5 Nanite -> https://dev.epicgames.com/documentation/en-
         | us/unreal-engine/...
         | 
         | WebGPU -> https://developer.mozilla.org/en-
         | US/docs/Web/API/WebGPU_API
         | 
         | Meshlet -> https://developer.nvidia.com/blog/introduction-
         | turing-mesh-s...
         | 
         | LOD ->
         | https://en.wikipedia.org/wiki/Level_of_detail_(computer_grap...
         | 
         | Software rasterizer ->
         | https://en.wikipedia.org/wiki/Rasterisation ("software" means
         | it runs on the CPU instead of GPU)
         | 
         | Billboard imposters ->
         | https://www.alanzucconi.com/2018/08/25/shader-showcase-satur...
         | 
         | Culling -> https://en.wikipedia.org/wiki/Hidden-
         | surface_determination
        
           | nicebyte wrote:
           | > ("software" means it runs on the CPU instead of GPU)
           | 
           | no, in this context it means that the rasterisation algorithm
           | is implemented in a compute kernel, rather than using the
           | fixed hw built into the gpu. so rasterization still happens
           | on the gpu, just using programmable blocks.
        
       | theogravity wrote:
       | Using latest chrome on M2 Max for the jinx demo:
        | WebGPU error [frame][validation]: Fill size (7398781) is not a
        | multiple of 4 bytes.
        | - While encoding [CommandEncoder "main-frame-cmd-buffer"]
        |   .ClearBuffer([Buffer "rasterize-sw"], 0, 7398781).
        
         | stephc_int13 wrote:
         | I have the same error on Windows 11, GPU is a RTX4090. Browser
         | is Edge.
        
       | replete wrote:
        | Intel Mac, Chrome and ungoogled chromium:
        | 
        |   index.web.ts:159 Uncaught (in promise) OperationError:
        |   Instance dropped in popErrorScope
        
       | eigenvalue wrote:
       | Whenever I see rendered scenes like this (I.e., lots of
       | repetitive static geometry) I imagine that annoying guy's voice
       | going on about "unlimited detail" from that old vaporware video.
        | I guess Nanite really did solve that problem for real, as
        | opposed to whatever that old thing was using (I remember
        | something about octrees).
        
         | HappMacDonald wrote:
         | I recall those claims being made by a company called
         | "Euclidean", from Australia I think. Online rumors suggested
         | they might have been using octtrees, but later Euclidean videos
         | flatly denied that.
        
           | raphlinus wrote:
           | It's Euclideon. And it is octtrees. My interpretation after
           | reading a _fascinating_ Reddit thread [1] is that these
           | denials were misdirection. There 's definitely new interest
           | in splatting techniques (Gaussian in particular), though
           | they've long been an alternative to triangles in the 3D
           | world. I think it'd be fun to experiment with implementing
           | some of that using modern compute shaders.
           | 
           | [1]: https://www.reddit.com/r/VoxelGameDev/comments/1bz5vvy/a
           | _sma...
        
       | forrestthewoods wrote:
       | Note: this isn't actually UE5 Nanite in WebGPU. It's a totally
       | independent implementation of the same idea as Nanite.
       | 
       | This technique is starting to appear in a variety of places.
        | Nanite definitely made the idea famous, but Nanite is the name
        | of a specific implementation, not the name of the technique.
        
       | readyplayernull wrote:
       | Will virtual geometry be integrated into GPUs some day?
        
       | tech-no-logical wrote:
        | getting the message
        | 
        |   No WebGPU available. Please use Chrome.
        | 
        | on chrome (Version 129.0.6668.29 (Official Build) beta
        | (64-bit)), under windows
        
       | jms55 wrote:
       | It's been mentioned a couple of times in this thread, but Bevy
       | also has an implementation of Nanite's ideas (sometimes called
       | Virtual Geometry). I'm the author of that, happy to answer
       | questions :)
       | 
       | As for this project, Scthe did a great job! I've been talking
       | with them about several parts of the process, culminating in some
       | improvements to Bevy's code based on their experience
       | (https://github.com/bevyengine/bevy/pull/15023). Always happy to
       | see more people working on this, Nanite has a ton of cool ideas.
        
       | KronisLV wrote:
       | I wonder how other engines compare when it comes to LODs and
       | similar systems.
       | 
       | Godot has automatic LOD which seems pretty cool for what it is:
       | https://docs.godotengine.org/en/stable/tutorials/3d/mesh_lod...
       | 
       | Unity also has an LOD system, though despite how popular the
       | engine is, you have to create LOD models manually:
       | https://docs.unity3d.com/Manual/LevelOfDetail.html (unless you
       | dig through the asset store and find a plugin)
       | 
       | I did see an interesting approach in a lesser known engine called
       | NeoAxis: https://www.neoaxis.com/docs/html/NeoAxis_Levels.htm
       | however that engine ran very poorly for me on my old RX580,
       | although I haven't tried on my current A580.
       | 
       | As far as I can tell, Unreal is really quite far ahead of the
       | competition when it comes to putting lots of things on the
       | screen, except the downside of this is that artists will be
       | tempted to include higher quality assets in their games, bloating
       | the install sizes quite far.
        
         | kllrnohj wrote:
         | In theory Nanite is superior to precomputed LODs. In practice
          | it's less clear cut, as the automatic LODs aren't going to be
          | as good as artist-created ones, and it's not entirely
          | reasonable to expect them to be. Also the performance cost is
          | _huge_, as Nanite/virtual geometry is a poor fit for modern
          | GPUs. IIRC peak fill rate is 1/4th or something like that,
          | since GPU rasterization works on 2x2 quads, not per-pixel
          | like (compute) shaders do.
        
           | jsheard wrote:
           | Hardware-rasterizing small triangles is indeed inefficient
            | due to the 2x2 quad tax, but one of Nanite's tent-pole
           | features is a software rasterizer which sidesteps that
           | problem entirely. IIRC they said that for a screen entirely
           | filled with triangles roughly the size of a pixel, their
           | software raster ends up being about 3x faster than using the
           | raster hardware.
        
       | hising wrote:
        | I would love to see this, but it won't work on Linux + Chrome
        | even if WebGPU is enabled.
        
       | jesse__ wrote:
       | > If you want to add this tech to the existing engine, I'm not a
       | person you should be asking (I don't work in the industry).
       | 
       | Fucking .. bravo man.
        
       | astlouis44 wrote:
       | Here's an actual implementation of UE5 in WebGPU, for anyone
       | interested.
       | 
        | Just a disclaimer that it will only work in a WebGPU-enabled
        | browser on Windows (Chrome, Edge, etc.); unfortunately Mac has
        | issues for now. Also, there is no Nanite in this demo, but it
       | will be possible in the future.
       | 
       | https://play.spacelancers.com/
        
         | mdaniel wrote:
         | I was curious what "issues" Mac has, and at least for me it
          | didn't explode for any _good_ reason; it puked trying to
          | JSON.stringify() some capabilities object into localStorage,
          | which is a pretty piss-poor reason to bomb loading a webpage,
          | IMHO
        
       ___________________________________________________________________
       (page generated 2024-09-05 23:00 UTC)