[HN Gopher] UE5 Nanite in WebGPU
___________________________________________________________________
UE5 Nanite in WebGPU
Author : vouwfietsman
Score : 221 points
Date : 2024-09-05 17:55 UTC (5 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| jsheard wrote:
| It's cool that it kind of works, but they had to make some nasty
| compromises to get around WebGPU's lack of 64-bit atomics.
| Hopefully that will be added as an optional extension at some
| point; hardware support is almost ubiquitous on desktop-class
| hardware at least (AMD and Nvidia have had it forever but Apple
| has only had it since the M3).
| throwaway17_17 wrote:
| What is the use case for atomics in the rasterizer? I can't
| figure out what the atomic operations do inside the rendering
| pipeline. I looked at the GitHub repo, but couldn't find the
| place where the atomics were hoped for.
| hrydgard wrote:
| Pack Z and a 32-bit color together into a 64-bit integer, then
| do an atomic min (or max with reversed Z) to effectively do a
| depth test and a write really, really fast.
| jsheard wrote:
| Nanite writes out the ID of the primitive at that pixel
| rather than the color, but otherwise yeah that's the idea.
| After rasterization is done a separate pass uses that ID to
| fetch the vertex data again and reconstruct all of the
| material parameters, which can be freely written out
| without atomics since there's exactly one thread per pixel
| at that point.
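|
| To sketch the trick (CPU-side TypeScript with made-up names;
| the real thing is one 64-bit atomicMin per pixel in a shader):
      // Depth goes in the high 32 bits, primitive/cluster ID in
      // the low 32, so comparing packed values compares depth first.
      function packVisibility(depth01: number, primId: number): bigint {
        const depthBits = BigInt(Math.round(depth01 * 0xffffffff));
        return (depthBits << 32n) | (BigInt(primId) & 0xffffffffn);
      }
      // min() keeps the nearer sample, and its ID comes along for
      // free. WebGPU's missing piece is doing this atomically on u64.
      const nearer = (a: bigint, b: bigint) => (a < b ? a : b);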
| neomantra wrote:
| Visibility buffer needing atomics is noted briefly in the
| long README. Link to discussion detailing it:
| https://github.com/Scthe/nanite-webgpu/issues/1
| jsheard wrote:
| With traditional hardware rasterization there are specialized
| hardware blocks which handle atomically updating the
| framebuffer to whichever sample is currently the closest to
| the camera, and discarding anything behind that. Nanite does
| software rasterization instead, and one of their insights was
| figuring out a practical way to cram all of the data needed
| for each pixel into just 64 bits (depth in the high bits and
| everything else in the low bits) which allows them to do
| efficient depth sorting using min/max atomics from a compute
| shader instead. The 64 bits are crucial though; that's the
| absolute bare minimum useful amount of data per pixel, so you
| _really_ need 64-bit atomics. Nanite doesn't even try to
| work without them.
|
| To kind of get it working with 32-bit atomics, this demo is
| reducing depth to just 16 bits (not enough to avoid
| artifacts) and only encoding a normal vector into the other
| 16 bits, which is why the compute rasterized pixels are
| untextured. There just aren't enough bits to store any more
| material parameters or a primitive ID, the latter being how
| Nanite does it.
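|
| The fallback looks roughly like this in WGSL (a sketch with
| assumed names, not the demo's exact code):
      // 16 bits of depth + a 16-bit encoded normal in one u32, so a
      // plain u32 atomicMin still keeps the nearest sample per pixel.
      const packWgsl = /* wgsl */ `
        fn pack_depth_normal(depth01: f32, oct_normal16: u32) -> u32 {
          let d16 = u32(saturate(depth01) * 65535.0);
          return (d16 << 16u) | (oct_normal16 & 0xffffu);
        }
        // usage: atomicMin(&vis_buffer[pixel_idx], packed)
      `;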
| my123 wrote:
| Since the M2
| jsheard wrote:
| Right you are, 64 bit atomics were added with the Apple8 GPU
| but only in M-series chips (M2 and up) and then the Apple9
| GPU made it universal (A17 Pro and up).
|
| https://developer.apple.com/metal/Metal-Feature-Set-
| Tables.p...
| moralestapia wrote:
| Outstanding work. Also, thanks for providing actual demos of the
| tech. I get 60-120fps on my MBP which is phenomenal given the
| amount of triangles in the scene.
| macawfish wrote:
| The camera controls on my phone are very hard to get down.
| aaroninsf wrote:
| Browser/touchpad also :)
| bob1029 wrote:
| Couldn't we use something like this to provide a more intuitive
| experience for mobile web targets?
|
| https://developer.mozilla.org/en-US/docs/Web/API/Device_orie...
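|
| A sketch of the wiring (the event API is standard; the camera
| object here is hypothetical):
      declare const camera: { yaw: number; pitch: number }; // made up
      window.addEventListener("deviceorientation", (e) => {
        if (e.beta === null || e.gamma === null) return;
        // beta/gamma are device tilt in degrees; map to pitch/yaw.
        camera.pitch = ((e.beta - 45) * Math.PI) / 180;
        camera.yaw = (e.gamma * Math.PI) / 180;
      });
      // Note: iOS requires DeviceOrientationEvent.requestPermission()
      // from a user gesture before events fire.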
| TaylorAlexander wrote:
| It says my iPhone 12 Pro Max doesn't have WebGPU, but I enabled
| it in experimental features and another website[1] with WebGPU
| demos now works. Has anyone gotten this working on iPhone? Would
| be nice if the web app gave more info on what failed.
|
| [1] https://webgpu.github.io/webgpu-samples/?sample=texturedCube
| KMnO4 wrote:
| I enabled WebGPU in Safari but I'm seeing a bunch of shader
| errors.
|
| WebGPU error [init][validation]: 6 errors generated while
| compiling the shader:
|   50:22: unresolved call target 'pack4x8snorm'
|   50:9: cannot bitcast from 'aY=' to 'f32'
|   54:10: unresolved call target 'unpack4x8snorm'
|   59:22: unresolved call target 'pack4x8unorm'
|   59:9: cannot bitcast from 'aY=' to 'f32'
|   63:9: unresolved call target 'unpack4x8unorm'
| soulofmischief wrote:
| Is the demo using user agent strings to determine compatibility?
| That's not good, and feature compatibility should be determined
| on a case-by-case basis by simply attempting to detect/use the
| specific feature.
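|
| Detection is a few lines with the standard API (sketch; typings
| come from @webgpu/types in real code):
      async function webgpuAvailable(): Promise<boolean> {
        const gpu = (navigator as any).gpu;
        if (!gpu) return false;                  // API not exposed
        const adapter = await gpu.requestAdapter();
        return adapter !== null;                 // exposed but unusable
      }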
|
| I am on Chromium, not Chrome, and use WebGPU all the time, but
| the demos tell me to use Chrome, which I cannot do ethically.
| Would love to try the demos out, this looks like a lot of hard
| work!
| robin_reala wrote:
| Don't think so. I'm on a Firefox that has experimental WebGPU
| support enabled, and it fails with shader compilation errors
| rather than any message.
| drusepth wrote:
| If this is the case, I imagine it'd be pretty easy to spoof
| your UA and see the demo, even from Chromium.
| gpm wrote:
| > and use WebGPU all the time
|
| I'm curious, what for?
| soulofmischief wrote:
| I've used it to build and/or run various machine learning
| models for text generation, speech recognition, image
| generation, depth estimation, etc. in the browser, in support
| of an agentic system I've been building out.
|
| Lots of future possibilities as well once support is more
| ubiquitous!
| Twirrim wrote:
| It's not working for me on Chrome under Linux, nor on Android,
| for what it's worth (though Firefox is what I use for
| practically all my browsing needs). Something really odd with
| their detection logic.
| pjmlp wrote:
| WebGPU is not supported on Linux, and it won't be for the
| foreseeable future.
|
| On Android you need at least Android 12, with good enough
| Vulkan drivers that aren't blacklisted.
| sva_ wrote:
| > WebGPU is not supported on Linux, and it won't be for the
| > foreseeable future.
|
| A lot of it runs fine with a flag.
| bakugo wrote:
| >Is the demo using user agent strings to determine
| compatibility
|
| >I am on Chromium, not Chrome
|
| Don't know about your build, but I'm using Ungoogled Chromium,
| and it has the exact same user-agent string as Google Chrome.
|
| Have you enabled the WebGL permission for the site in site
| settings? I think it was disabled by default for me.
| SaintSeiya wrote:
| Honest question: it is claimed that the software rasterizer is
| faster than the hardware one. Can someone explain to me why?
| Isn't the purpose of the GPU to accelerate rasterization
| itself? Unless it's a recent algorithm, or the "software
| rasterizer" is actually running on the GPU and not the CPU, I
| don't see how.
| janito wrote:
| I'm also curious. From what I could read in the repository's
| references, I think that the problem is that the GPU is bad at
| rasterizing small triangles. Apparently each triangle in the
| fixed function pipeline generates a batch of pixels to render
| (16 in one of the slides I saw), so if the triangle covers only
| one or two pixels, all the others in the batch are wasted. I
| speculate that the idea is to then detect these small triangles
| and draw them quickly using fewer pixel shader invocations
| (still on the GPU, but without using the graphics-specific
| fixed functions), but I'm honestly not sure I understand what's
| happening.
| NotGMan wrote:
| I'm a bit out of the GPU game, so this might be slightly wrong
| in some places: the issue is with small triangles, because you
| end up paying a huge cost. GPUs ALWAYS shade in 2x2 blocks of
| pixels, never single pixels.
|
| So if you have a very small triangle (small as in how many
| pixels on the screen it covers) that covers 1 pixel you will
| still pay the price of a 2x2 block (4 pixels instead of 1), so
| you just did 300% extra work for that triangle.
|
| Nanite auto-picks the best triangle size to minimize this, and
| probably many more perf metrics that I have no idea about.
|
| So even if you do it in software, the point is that if you can
| get rid of that 2x2 block penalty as much as possible you could
| be faster than the GPU doing 2x2 blocks in hardware, since
| pixel shaders can be very expensive.
|
| This issue gets worse the larger the rendering resolution is.
|
| Nanite then picks larger triangles instead of those tiny
| 1-pixel ones, since those are too small to add any visual
| fidelity anyway.
|
| Nanite's software rasterizer is also not used for large
| triangles, since those are more efficient to do in hardware.
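|
| A toy model of that quad tax (bounding-box approximation; real
| hardware bills by covered quads, not the box):
      // Fragments are shaded in 2x2 quads, so cost rounds up to
      // whole quads.
      function shadedFragments(widthPx: number, heightPx: number): number {
        return Math.ceil(widthPx / 2) * Math.ceil(heightPx / 2) * 4;
      }
      // shadedFragments(1, 1) === 4: four fragments shaded for one
      // useful pixel, the "300% extra work" described above.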
| kllrnohj wrote:
| > So even if you do it in software, the point is that if you
| can get rid of that 2x2 block penalty as much as possible you
| could be faster than the GPU doing 2x2 blocks in hardware,
| since pixel shaders can be very expensive.
|
| Of course the obvious problem with that is if you don't have
| most of the screen covered in such small triangles then
| you're paying a large cost for nanite vs traditional means.
| Animats wrote:
| The answer to that is in this hour-long SIGGRAPH video.[1] Some
| of the operations needed are not done well, or at all, by the
| GPU.
|
| [1] https://www.youtube.com/watch?v=eviSykqSUUw
| janito wrote:
| Here's the relevant part of the (really cool!) video:
| https://www.youtube.com/watch?v=eviSykqSUUw&t=1888s
| SaintSeiya wrote:
| thanks all, yes it starts making sense now
| bob1029 wrote:
| I thought it was a software rasterizer running inside a
| fragment shader on the GPU, not actually on the CPU. I need to
| watch that video again to be sure, but I can't see how a CPU
| could handle that many triangles.
| raphlinus wrote:
| To be precise, this is running in a compute shader
| (rasterizeSwPass.wgsl.ts for the curious). You can think of
| that as running the GPU in a mode where it's a type of
| computer with some frustrating limitations, but also the
| ability to efficiently run thousands of threads in parallel.
|
| This is in contrast to hardware rasterization, where dedicated
| hardware onboard the GPU decides which pixels are covered by a
| given triangle and assigns those pixels to a fragment shader,
| where the color (and potentially other things) is computed and
| finally written to the render target as a raster op (also a bit
| of specialized hardware).
|
| The seminal paper on this is cudaraster [1], which
| implemented basic 3D rendering in CUDA (the CUDA of 13 years
| ago is roughly comparable in power to compute shaders today),
| and basically posed the question: how much does using the
| specialized rasterization hardware help, compared with just
| using compute? The answer is roughly 2x, though it depends a
| lot on the details.
|
| And those details are important. One of the assumptions that
| hardware rasterization relies on for efficiency is that a
| triangle covers dozens of pixels. In Nanite, that assumption
| is not valid, in fact a great many triangles are
| approximately a single pixel, and then software/compute
| approaches actually start beating the hardware.
|
| Nanite, like this project, thus actually uses a hybrid
| approach: rasterization for medium to large triangles, and
| compute for smaller ones. Both can share the same render
| target.
|
| [1]: https://research.nvidia.com/publication/2011-08_high-
| perform...
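|
| Conceptually the hybrid split is just a size test per triangle
| or cluster (a sketch; the threshold and names are made up, and
| the real heuristics are more involved):
      function binForRaster(screenAreaPx: number, id: number,
                            swQueue: number[], hwQueue: number[]): void {
        const SW_MAX_AREA_PX = 32 * 32; // assumption: tuned per GPU
        // Tiny triangles lose badly to the 2x2-quad hardware path,
        // so they go to the compute-shader rasterizer instead.
        (screenAreaPx < SW_MAX_AREA_PX ? swQueue : hwQueue).push(id);
      }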
| TinkersW wrote:
| A couple reasons
|
| 1. HW always does 2x2 blocks of pixels so it can have
| derivatives, even if you don't use them.
|
| 2. Accessing SV_PrimitiveID is surprisingly slow on Nvidia/AMD;
| by writing it out in the PS you will take a huge perf hit in
| HW. There are ways to work around this, but they aren't trivial
| and differ between vendors, and you have to be aware of the
| issue in the first place! I think some of the "software" >
| "hardware" raster stuff may come from this.
|
| The HW shader in this demo looks wonky though, it should be
| writing out the visibility buffer, and instead it is writing
| out a vec4 with color data, so of course that is going to hurt
| perf. Way too many varyings being passed down also.
|
| In a high-triangle HW rasterizer you want the visibility buffer
| PS to do as little compute as possible, and to write as little
| as possible, so it should only have 1 or 2 input varyings and
| simply write them out.
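|
| For reference, a minimal visibility-buffer PS is tiny; a WGSL
| sketch with assumed names:
      const visibilityFs = /* wgsl */ `
        @fragment
        fn fs(@location(0) @interpolate(flat) cluster_and_tri: u32)
            -> @location(0) u32 {
          // One flat varying in, one u32 out (r32uint target); all
          // material work is deferred to a later pass.
          return cluster_and_tri;
        }
      `;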
| moffkalast wrote:
| > No WebGPU available. Please use Chrome.
|
| Getting that on Chromium, lol.
| gpm wrote:
| I'm getting that in _google chrome proper_, but what completes
| the joke is that in firefox I just get a blank page without the
| message to use chrome.
|
| Edit: WebGPU in chrome is behind a flag on linux:
| https://github.com/gpuweb/gpuweb/wiki/Implementation-Status#...
| Animats wrote:
| Oh, nice. Third party implementations of Nanite playback.
|
| Nanite is a very clever representation of graphics meshes.
| They're directed acyclic graphs rather than trees. Repetition is
| a link, not a copy. It's recursive; meshes can share submeshes,
| which in turn can share submeshes, all the way down. It's also
| set up for within-mesh level of detail support, so the submeshes
| drop out when they're small enough. So you can have repetitive
| content of very large size with a finite amount of data and fast
| rendering times. The insight is that there are only so many
| pixels on screen, so there's an upper bound on rendering work
| really needed.
|
| There's a really good SIGGRAPH video on this from someone at
| Epic.
|
| Current GPU designs are a mismatch for Nanite. Some new hardware
| operations are needed to do more of this on the GPU, where it
| belongs. Whether that will happen, with NVidia distracted by the
| AI market, is a good question.
|
| The scene needs a lot of instancing for this to pay off. Unreal
| Engine demos show such things as a hall of identical statues. If
| each statue was different, Nanite would help far less. So it
| works best for projects where a limited number of objects are
| reused to create large areas of content. That's the case for most
| AAA titles. Watch a video of Cyberpunk 2077, and look for
| railings and trash heaps. You'll see the same ones over and over
| in totally different contexts.
|
| Making a nanite mesh is complicated, with a lot of internal
| offsets for linking, and so far only Unreal Engine's editor does
| it. With playback now open source, someone will probably do that.
|
| Those internal offsets in the format present an attack surface
| which probably can be exploited with carefully crafted bad
| content, like hostile Microsoft Word .doc files.
| turtledragonfly wrote:
| I think the SIGGRAPH talk you referred to is: "A Deep Dive into
| Nanite Virtualized Geometry"
| (https://www.youtube.com/watch?v=eviSykqSUUw)
|
| There's also this short high-level intro (2.5 min) that I
| thought was decent: "What is virtualized micropolygon geometry?
| An explainer on Nanite"
| (https://www.youtube.com/watch?v=-50MJf7hyOw)
| Jasper_ wrote:
| > Repetition is a link, not a copy. It's recursive; meshes can
| share submeshes, which in turn can share submeshes, all the way
| down.
|
| While it does construct a DAG to perform the graph cut, the
| final data set on disk is just a flat list of clusters for
| consideration, along with their cutoffs for
| inclusion/rejection. There seems to be a considerable
| misunderstanding of what the DAG is used for, and how it's
| constructed. It's constructed dynamically based on the vertex
| data, and doesn't have anything to do with how the artist
| constructed submeshes and things, nor does "repetition become a
| link".
|
| > The scene needs a lot of instancing for this to pay off.
| Unreal Engine demos show such things as a hall of identical
| statues. If each statue was different, Nanite would help far
| less.
|
| What makes you say this? The graph cut is _different_ for each
| instance of the object, so they can't use traditional
| instancing, and I don't even see how it could help.
| Animats wrote:
| It may not be based on what the mesh's creator considered
| repetition, but repetition is encoded within the mesh. Not
| sure if the mesh builder discovers some of the repetition
| itself.
|
| Look at a terrain example:
|
| https://www.youtube.com/watch?v=DKvA7NZRUcg
| Jasper_ wrote:
| I'm not seeing what you claim to be seeing in that demo
| video. I see a per-triangle debug view, and a per-cluster
| debug view. None of that is showing repetition.
| Animats wrote:
| If there wasn't repetition, you'd need a really huge GPU
| for that scene at that level of detail.
| gmueckl wrote:
| I don't want to estimate storage space right now, but
| meshes can be stored very efficiently. For example, I
| think UE uses an optimization where vertex positions are
| heavily quantized to just a few bits within the meshlet's
| bounding box. Index buffers can be constructed to share
| the same vertices across LOD levels. Shading normals can
| be quantized quite a bit before shading artifacts become
| noticeable - if you even need them anymore at that
| triangle density.
|
| If your triangles are at or below the size of a texel,
| texture values could even be looked up offline and stored
| in the vertex attributes directly rather than keeping the
| UV coordinates around, but that may not be a win.
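|
| A sketch of that position quantization (bit count assumed; UE's
| actual encoding is more elaborate):
      // Store each coordinate as an n-bit integer relative to the
      // meshlet's bounding box: 3 x 10 bits instead of 3 x 32-bit
      // floats per vertex.
      function quantize(v: number, lo: number, hi: number, bits = 10): number {
        const levels = (1 << bits) - 1;
        const t = Math.min(Math.max((v - lo) / (hi - lo), 0), 1);
        return Math.round(t * levels);
      }
      function dequantize(q: number, lo: number, hi: number, bits = 10): number {
        return lo + (q / ((1 << bits) - 1)) * (hi - lo);
      }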
| jms55 wrote:
| Not necessarily. Nanite compresses meshes (including in-
| memory) _very_ heavily, and _also_ streams in only the
| visible mesh data.
|
| In general, I wouldn't think of Nanite as "one thing".
| It's a combination of many, many different techniques
| that add up into some really good technology.
| diggan wrote:
| > and so far only Unreal Engine's editor does it
|
| Not a major/mainstream engine by any means (a small Rust ECS
| game engine) but Bevy also supports something similar under the
| feature name "Virtual Geometry", mentioned here:
| https://bevyengine.org/news/bevy-0-14/#virtual-geometry-expe...
|
| Also, a technical deep dive into the feature from one of the
| authors of the feature:
| https://jms55.github.io/posts/2024-06-09-virtual-geometry-be...
| vinkelhake wrote:
| > Nanite playback
|
| That's not what this is though. It's an implementation of the
| techniques/technology used in Nanite. It doesn't load data from
| Unreal Engine's editor. One of the mentioned goals:
| Simplicity. We start with an OBJ file and everything is done
| in the app. No magic pre-processing steps, Blender exports,
| etc. You set the breakpoint at loadObjFile() and F10
| your way till the first frame finishes.
| pcwalton wrote:
| > Making a nanite mesh is complicated, with a lot of internal
| offsets for linking, and so far only Unreal Engine's editor
| does it.
|
| meshoptimizer [1] is an OSS implementation of meshlet
| generation, which is what most people think of when they think
| of "Nanite's algorithm". Bevy, mentioned in a sibling reply,
| uses meshoptimizer as the generation tool.
|
| (Strictly speaking, "Nanite" is a brand name that encompasses a
| large collection of techniques, including meshlets, software
| rasterization, streaming geometry, etc. For clarity, when
| discussing these concepts outside of the context of the Unreal
| Engine specifically, I prefer to refer to individual techniques
| instead of the "Nanite" brand. They're really separate, even
| though they complement one another. For example, software
| rasterization can be profitably used without meshlets if your
| triangles are really small. Streaming geometry can be useful
| even if you aren't using meshlets. And so on.)
|
| [1]: https://github.com/zeux/meshoptimizer
| jms55 wrote:
| Small correction: meshoptimizer only does the grouping
| triangles -> meshlets part, and the mesh simplification.
| Actually building the DAG, grouping clusters together, etc is
| handled by Bevy code (I'm the author, happy to answer
| questions).
|
| That said I do know zeux was interested in experimenting with
| Nanite-like DAGs directly in meshoptimizer, so maybe a future
| version of the library will have an end-to-end API.
| jiggawatts wrote:
| I read through the papers and my impression was that the
| biggest gains were from quantised coordinates and dynamic LOD
| for small patches instead of the entire mesh.
|
| The logic behind nanite as I understood it was to keep the mesh
| accuracy at roughly 1 pixel precision. So for example, a low
| detail mesh can be used with coordinates rounded to just 10
| bits (or whatever) if the resulting error is only about half a
| pixel when perspective projected onto the screen.
|
| I vaguely remember the quantisation pulling double duty: not
| only does it reduce the data storage size, it also helps the LOD
| generation because it snaps vertices to the same locations in
| space. The duplicates can then be eliminated.
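|
| The pixel bound falls out of the projection; a worked sketch
| (numbers made up):
      // World-space height of one pixel at view-space depth z.
      function worldUnitsPerPixel(z: number, fovY: number, heightPx: number) {
        return (2 * z * Math.tan(fovY / 2)) / heightPx;
      }
      // At z = 100, 60 deg vertical FOV, 1080p this is ~0.107 world
      // units, so any quantization error well under half that is
      // invisible and distant clusters can use far fewer bits.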
| hyperthesis wrote:
| This is like when Joel said git stores diffs.
| devit wrote:
| The name and description are very confusing, and a trademark
| violation: despite the claims, it seems to be completely
| unrelated to the actual Nanite in UE5, just an implementation
| of something similar by a person unaffiliated with UE5.
|
| There is also Bevy's Virtual Geometry that provides similar
| functionality and is probably much more useful since it's written
| in Rust and integrated with a game engine:
| https://jms55.github.io/posts/2024-06-09-virtual-geometry-be...
| KMnO4 wrote:
| I don't think it's really an issue. It's clear from the readme
| that it's an implementation.
|
| If I made an "implementation of OpenAI's GPT-3 in JS" you would
| understand that to mean I took the architecture from the
| whitepaper and reimplemented it.
| smartmic wrote:
| Wow, I can't remember the last time I read a project summary with
| so much jargon - I literally didn't understand anything:
|
| > UE5's Nanite implementation using WebGPU. Includes the meshlet
| LOD hierarchy, software rasterizer and billboard impostors.
| Culling on both per-instance and per-meshlet basis.
| goodcjw2 wrote:
| Guess this really shows how much domain-specific knowledge
| there is in computer graphics...
|
| Yet still, this post is now ranked #1 on HN.
| bogwog wrote:
| UE5 Nanite -> https://dev.epicgames.com/documentation/en-
| us/unreal-engine/...
|
| WebGPU -> https://developer.mozilla.org/en-
| US/docs/Web/API/WebGPU_API
|
| Meshlet -> https://developer.nvidia.com/blog/introduction-
| turing-mesh-s...
|
| LOD ->
| https://en.wikipedia.org/wiki/Level_of_detail_(computer_grap...
|
| Software rasterizer ->
| https://en.wikipedia.org/wiki/Rasterisation ("software" means
| it runs on the CPU instead of GPU)
|
| Billboard imposters ->
| https://www.alanzucconi.com/2018/08/25/shader-showcase-satur...
|
| Culling -> https://en.wikipedia.org/wiki/Hidden-
| surface_determination
| nicebyte wrote:
| > ("software" means it runs on the CPU instead of GPU)
|
| no, in this context it means that the rasterisation algorithm
| is implemented in a compute kernel, rather than using the
| fixed hw built into the gpu. so rasterization still happens
| on the gpu, just using programmable blocks.
| theogravity wrote:
| Using latest chrome on M2 Max for the jinx demo:
| WebGPU error [frame][validation]: Fill size (7398781) is not a
| multiple of 4 bytes. - While encoding
| [CommandEncoder "main-frame-cmd-buffer"].ClearBuffer([Buffer
| "rasterize-sw"], 0, 7398781).
| stephc_int13 wrote:
| I have the same error on Windows 11, GPU is a RTX4090. Browser
| is Edge.
| replete wrote:
| Intel Mac, Chrome and ungoogled chromium: index.web.ts:159
| Uncaught (in promise) OperationError: Instance dropped in
| popErrorScope
| eigenvalue wrote:
| Whenever I see rendered scenes like this (i.e., lots of
| repetitive static geometry) I imagine that annoying guy's voice
| going on about "unlimited detail" from that old vaporware video.
| I guess nanite really did solve that problem for real, as opposed
| to whatever that old thing was using (I remember something about
| oct-trees or something).
| HappMacDonald wrote:
| I recall those claims being made by a company called
| "Euclidean", from Australia I think. Online rumors suggested
| they might have been using octtrees, but later Euclidean videos
| flatly denied that.
| raphlinus wrote:
| It's Euclideon. And it is octtrees. My interpretation after
| reading a _fascinating_ Reddit thread [1] is that these
| denials were misdirection. There's definitely new interest
| in splatting techniques (Gaussian in particular), though
| they've long been an alternative to triangles in the 3D
| world. I think it'd be fun to experiment with implementing
| some of that using modern compute shaders.
|
| [1]: https://www.reddit.com/r/VoxelGameDev/comments/1bz5vvy/a
| _sma...
| forrestthewoods wrote:
| Note: this isn't actually UE5 Nanite in WebGPU. It's a totally
| independent implementation of the same idea as Nanite.
|
| This technique is starting to appear in a variety of places.
| Nanite definitely made the idea famous, but Nanite is the name
| of a specific implementation, not the name of the technique.
| readyplayernull wrote:
| Will virtual geometry be integrated into GPUs some day?
| tech-no-logical wrote:
| getting the message No WebGPU available. Please
| use Chrome.
|
| on chrome (Version 129.0.6668.29 (Official Build) beta
| (64-bit)), under windows
| jms55 wrote:
| It's been mentioned a couple of times in this thread, but Bevy
| also has an implementation of Nanite's ideas (sometimes called
| Virtual Geometry). I'm the author of that, happy to answer
| questions :)
|
| As for this project, Scthe did a great job! I've been talking
| with them about several parts of the process, culminating in some
| improvements to Bevy's code based on their experience
| (https://github.com/bevyengine/bevy/pull/15023). Always happy to
| see more people working on this, Nanite has a ton of cool ideas.
| KronisLV wrote:
| I wonder how other engines compare when it comes to LODs and
| similar systems.
|
| Godot has automatic LOD which seems pretty cool for what it is:
| https://docs.godotengine.org/en/stable/tutorials/3d/mesh_lod...
|
| Unity also has an LOD system, though despite how popular the
| engine is, you have to create LOD models manually:
| https://docs.unity3d.com/Manual/LevelOfDetail.html (unless you
| dig through the asset store and find a plugin)
|
| I did see an interesting approach in a lesser known engine called
| NeoAxis: https://www.neoaxis.com/docs/html/NeoAxis_Levels.htm
| however that engine ran very poorly for me on my old RX580,
| although I haven't tried on my current A580.
|
| As far as I can tell, Unreal is really quite far ahead of the
| competition when it comes to putting lots of things on the
| screen, except the downside of this is that artists will be
| tempted to include higher quality assets in their games, bloating
| the install sizes quite far.
| kllrnohj wrote:
| In theory Nanite is superior to precomputed LODs. In practice
| it's less clear cut as they aren't going to be as good as
| artist-created LODs, and it's not entirely reasonable to expect
| them to be. Also the performance cost is _huge_, as Nanite
| /virtual geometry is a poor fit for modern GPUs. iirc peak fill
| rate is 1/4th or something like that as GPU rasterization works
| on 2x2 quads not per-pixel like shaders do.
| jsheard wrote:
| Hardware-rasterizing small triangles is indeed inefficient
| due to the 2x2 quad tax, but one of Nanite's tent-pole
| features is a software rasterizer which sidesteps that
| problem entirely. IIRC they said that for a screen entirely
| filled with triangles roughly the size of a pixel, their
| software raster ends up being about 3x faster than using the
| raster hardware.
| hising wrote:
| I would love to see this but it won't work on Linux + Chrome even
| if WebGPU is enabled.
| jesse__ wrote:
| > If you want to add this tech to the existing engine, I'm not a
| person you should be asking (I don't work in the industry).
|
| Fucking .. bravo man.
| astlouis44 wrote:
| Here's an actual implementation of UE5 in WebGPU, for anyone
| interested.
|
| Just a disclaimer that it will only work in WebGPU-enabled
| browsers on Windows (Chrome, Edge, etc.); unfortunately Mac has
| issues for now. Also, there is no Nanite in this demo, but it
| will be possible in the future.
|
| https://play.spacelancers.com/
| mdaniel wrote:
| I was curious what "issues" Mac has, and at least for me it
| didn't explode for any _good_ reason: it puked trying to
| JSON.stringify() some capabilities object into localStorage
| which is a pretty piss-poor reason to bomb loading a webpage,
| IMHO
___________________________________________________________________
(page generated 2024-09-05 23:00 UTC)