Posts by gfxstrand@mastodon.gamedev.place
 (DIR) Post #Ad0idj1d0AAXjMv2W0 by gfxstrand@mastodon.gamedev.place
       2023-12-20T18:43:19Z
       
       0 likes, 1 repeats
       
       Finished off my year on a good note. Time to kick back and enjoy a bit of a break so I can go into 2024 fresh. If you want to know what we've been up to with NVK and what I've got on the docket for 2024, I blogged about it today. Check out the @collabora blog: https://col.la/nvkhu
       
 (DIR) Post #Ad0idk1JIpVcofmIeu by gfxstrand@mastodon.gamedev.place
       2023-12-20T18:49:44Z
       
       0 likes, 0 repeats
       
       Gotta say, it's fun seeing everything come together like this. I've spent a long time building and now we're getting to the fun part where features are landing like crazy, games are running, and we make *something* at least 10% faster about every week. It's a good time! 😁
       
 (DIR) Post #Ad0idoD9ijMRoZLog4 by gfxstrand@mastodon.gamedev.place
       2023-12-20T19:33:49Z
       
       1 likes, 0 repeats
       
       @CalcProgrammer1 @serebit Be warned that we're seeing some issues with PRIME setups right now (as most NVIDIA laptops are). It's some sort of a bad interaction between nouveau and i915. I think the Red Hat folks are looking into it but it may be rocky for a bit. Hopefully we'll get all that sorted before it hits distros, though.
       
 (DIR) Post #AhU2tmvvHEYtcvNssy by gfxstrand@mastodon.gamedev.place
       2024-05-02T14:19:50Z
       
       1 likes, 0 repeats
       
       @lina Yay VM_BIND!
       
 (DIR) Post #Ai048uyAEGbKl8PZCq by gfxstrand@mastodon.gamedev.place
       2024-05-15T19:53:59Z
       
       0 likes, 1 repeats
       
       This week's project: Reworking NVK cbuf support. We've had a lot of issues with too much internal stalling and I think a lot of them come down to the fact that we're re-binding cbufs every draw call. My plan for root constants is to do inline updates with the LOAD_CONSTANT_BUFFER command. I don't know how much of a difference there is but I strongly suspect this pipelines much better. For bound cbufs, I'm planning to just make our dirty tracking way more competent. We'll see how it goes!
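
       One way to picture "more competent" dirty tracking for bound cbufs is the minimal sketch below. It is not NVK's code: the slot count, struct names, and functions are all hypothetical. The idea is only that a bind which does not change the slot's address or size leaves the slot clean, so a flush re-emits only the slots that actually changed.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_CBUFS 16 /* hypothetical slot count, not taken from NVK */

/* What is currently bound to one cbuf slot (hypothetical layout). */
struct cbuf_binding {
   uint64_t addr;
   uint32_t size;
};

struct cbuf_state {
   struct cbuf_binding bound[MAX_CBUFS];
   uint32_t dirty; /* bit i set => slot i must be re-emitted */
};

/* Record a bind; mark the slot dirty only if something changed. */
static void
cbuf_bind(struct cbuf_state *s, unsigned slot, uint64_t addr, uint32_t size)
{
   assert(slot < MAX_CBUFS);
   struct cbuf_binding *b = &s->bound[slot];
   if (b->addr == addr && b->size == size)
      return; /* redundant bind, nothing to re-emit */

   b->addr = addr;
   b->size = size;
   s->dirty |= UINT32_C(1) << slot;
}

/* Count the slots that would be re-emitted and clear the dirty mask.
 * A real driver would push the bind command for each dirty slot here.
 */
static unsigned
cbuf_flush(struct cbuf_state *s)
{
   unsigned emitted = 0;
   for (unsigned slot = 0; slot < MAX_CBUFS; slot++) {
      if (s->dirty & (UINT32_C(1) << slot))
         emitted++;
   }
   s->dirty = 0;
   return emitted;
}
```

       With this shape, binding the same buffer before every draw costs a compare instead of a re-bind.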
       
 (DIR) Post #Ai048xknsOhFOZBRIW by gfxstrand@mastodon.gamedev.place
       2024-05-17T21:14:43Z
       
       0 likes, 0 repeats
       
       So far, it's pretty clear that re-binding cbufs repeatedly is causing a significant bottleneck. By using inline cbuf updates for cb0 and disabling my bound cbuf optimization and switching back to global memory reads for UBOs, we can get a 70% perf boost in The Witness on a 4090. Yeah, clearly something is serializing inside that monster. IDK what all but something.
       
 (DIR) Post #Ai0490fx0tb6SNw6eO by gfxstrand@mastodon.gamedev.place
       2024-05-17T21:17:25Z
       
       0 likes, 0 repeats
       
       Next up: Bindless UBOs. On NVIDIA, bindless UBOs use a 64-bit descriptor with 40 bits of base address and 14 bits of size / 4. Those can be referenced directly from ALU instructions and should get nearly the same shader perf as bound cbufs. The real trick here, though, is that they require the use of the uniform register file. This, unfortunately, has funky register allocation implications because of how it interacts with uniformity and reconvergence.
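
       As a rough illustration of that descriptor, here is a sketch of packing one. Only the field widths (40 bits of address, 14 bits of size / 4) come from the post. The bit positions, the choice to store the address unshifted, and every name below are assumptions made for the example, not the hardware's documented layout.

```c
#include <assert.h>
#include <stdint.h>

/* Field widths from the post; field positions are ASSUMED. */
#define UBO_DESC_ADDR_BITS 40
#define UBO_DESC_SIZE_BITS 14

#define UBO_DESC_ADDR_MASK ((UINT64_C(1) << UBO_DESC_ADDR_BITS) - 1)
#define UBO_DESC_SIZE_MASK ((UINT64_C(1) << UBO_DESC_SIZE_BITS) - 1)

/* Pack a descriptor. Assumption: address in bits 0..39, stored
 * unshifted, and size / 4 in bits 40..53.
 */
static inline uint64_t
pack_bindless_ubo_desc(uint64_t base_addr, uint32_t size_B)
{
   assert(base_addr <= UBO_DESC_ADDR_MASK);
   assert(size_B % 4 == 0);

   uint64_t size_div_4 = size_B / 4;
   assert(size_div_4 <= UBO_DESC_SIZE_MASK);

   return base_addr | (size_div_4 << UBO_DESC_ADDR_BITS);
}

/* Recover the base address from a packed descriptor. */
static inline uint64_t
ubo_desc_base_addr(uint64_t desc)
{
   return desc & UBO_DESC_ADDR_MASK;
}

/* Recover the size in bytes from a packed descriptor. */
static inline uint32_t
ubo_desc_size_B(uint64_t desc)
{
   return (uint32_t)((desc >> UBO_DESC_ADDR_BITS) & UBO_DESC_SIZE_MASK) * 4;
}
```

       Under these assumptions, 14 bits of size / 4 caps a single bindless UBO at 65532 bytes.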
       
 (DIR) Post #Ai0493JjBybUeKYtvc by gfxstrand@mastodon.gamedev.place
       2024-05-17T21:20:51Z
       
       0 likes, 0 repeats
       
       I believe the rule that I want is that UGPRs can only ever be assigned in uniform control-flow and remain live until their last uniform use. If a UGPR is live-in to a divergent branch instruction, then the merge which forces re-convergence is considered to be a use as well.
       
 (DIR) Post #Ai0495x9ONKIpB1Pea by gfxstrand@mastodon.gamedev.place
       2024-05-17T21:22:52Z
       
       0 likes, 0 repeats
       
       I don't think they can ever safely be assigned outside of uniform control-flow. On NVIDIA HW, it's possible that two different portions of the wave may execute the same block at the same time without being converged. This means that we can't even assume the usual SSA interference rules within a single block unless we're guaranteed that all non-exited invocations are converged as they execute said block.
       
 (DIR) Post #Ai8DXhXZkQXtCaXxXU by gfxstrand@mastodon.gamedev.place
       2024-05-21T23:43:02Z
       
       1 likes, 0 repeats
       
       @CalcProgrammer1 So you're saying it took them longer to fix Wayland than it did me to write a whole new driver from blank files? 😉
       
 (DIR) Post #AirQiHOmZ8dS1APgLA by gfxstrand@mastodon.gamedev.place
       2024-06-12T17:24:07Z
       
       1 likes, 1 repeats
       
       Want to read some spooky driver code? https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/intel/vulkan_hasvk/anv_batch_chain.c?ref_type=heads#L1463 That function is mission-critical for performance for Vulkan on Intel hardware with older kernels/hardware.
       
 (DIR) Post #AirQiK3GhaD0FJN2iu by gfxstrand@mastodon.gamedev.place
       2024-06-12T17:24:17Z
       
       0 likes, 0 repeats
       
       In the early days of the Intel Vulkan driver, we had to deal with kernel patching of command buffers. This is because older Intel hardware only had 32-bit addresses and the kernel API didn't allow userspace to know addresses up-front. (No buffer device address for you!) Instead, the kernel would assign addresses on-demand and patch them into userspace BOs before executing. There was also a presumed address mechanism to avoid redundant relocations.
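
       The presumed address idea can be sketched roughly as follows. This is an illustration only, not the i915 or ANV code: the struct, its fields, and the function are hypothetical. Each relocation remembers the address that was last written into the batch, and the patch is skipped when the buffer still lives at that address.

```c
#include <assert.h>
#include <stdint.h>

/* One relocation entry (hypothetical layout for illustration). */
struct reloc {
   uint32_t batch_offset_dw; /* dword in the batch to patch */
   uint32_t target_bo;       /* index of the buffer being pointed at */
   uint32_t delta;           /* byte offset into the target buffer */
   uint32_t presumed_addr;   /* target address assumed when last written */
};

/* Patch only the relocations whose target moved since the presumed
 * address was recorded. Returns the number of dwords rewritten.
 */
static unsigned
apply_relocs(uint32_t *batch, uint32_t batch_len_dw,
             struct reloc *relocs, unsigned num_relocs,
             const uint32_t *bo_addrs)
{
   unsigned patched = 0;
   for (unsigned i = 0; i < num_relocs; i++) {
      struct reloc *r = &relocs[i];
      assert(r->batch_offset_dw < batch_len_dw);

      uint32_t actual = bo_addrs[r->target_bo];
      if (actual == r->presumed_addr)
         continue; /* batch already holds the right address */

      batch[r->batch_offset_dw] = actual + r->delta;
      r->presumed_addr = actual;
      patched++;
   }
   return patched;
}
```

       When nothing has moved, the loop does no writes at all, which is the case the mechanism is meant to make cheap.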
       
 (DIR) Post #AirQiMUzbTZdpsMDDM by gfxstrand@mastodon.gamedev.place
       2024-06-12T17:24:26Z
       
       0 likes, 0 repeats
       
       Unfortunately, thanks to Vulkan's execution model, the userspace driver is the only component in the system that has any idea which bits of memory actually have the right address. Also, because Vulkan heavily re-uses objects across command buffers, things like the descriptor pool would end up having relocations applied frequently and, since the descriptor pool is always in use, doing so would mean a stall on nearly every batch.
       
 (DIR) Post #AirQiOqgrm7P7kWZJQ by gfxstrand@mastodon.gamedev.place
       2024-06-12T17:24:38Z
       
       0 likes, 0 repeats
       
       The solution was to relocate in userspace and tell the kernel not to bother unless it actually moved something (which is rare). However, this means userspace is potentially racing with both the kernel and the hardware. But it's provably safe... And shipped in production for years and I never got a single bug report that ended up being that code.
       
 (DIR) Post #AjIm98T3OjXbMz2OSO by gfxstrand@mastodon.gamedev.place
       2024-06-25T23:47:13Z
       
       0 likes, 1 repeats
       
       
       
 (DIR) Post #AmS3d4ebT4JUMr7UoK by gfxstrand@mastodon.gamedev.place
       2024-09-28T04:33:59Z
       
       0 likes, 1 repeats
       
       Hey Nouveau users! Have any of you been using NVK+Zink as your daily driver in the last 3-6 months? (i.e., running your desktop, productivity apps on it, etc. Not just games.) If so, how well has it been working for you? Boosts welcome. 💜
       
 (DIR) Post #AmfvPJlL0hYVTzuAds by gfxstrand@mastodon.gamedev.place
       2024-10-04T21:13:42Z
       
       0 likes, 0 repeats
       
       A little preview of one of my XDC talks...
       
 (DIR) Post #Aodw8MbIAZJO6O3goK by gfxstrand@mastodon.gamedev.place
       2024-12-02T17:06:26Z
       
       0 likes, 0 repeats
       
       Another day. Another Vulkan release. Another Mesa MR implementing said Vulkan release: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32433
       
 (DIR) Post #Aodw8O9iO4ROvRAh6W by gfxstrand@mastodon.gamedev.place
       2024-12-02T17:11:12Z
       
       1 likes, 0 repeats
       
       For me personally, this continues an unbroken streak of day-zero Mesa implementations of new Vulkan versions. I was there with the Intel driver when the Vulkan spec was released and me and my team have hit day zero every time since. The difference this time around is that I wrote an entirely new driver between Vulkan 1.3 and Vulkan 1.4. While the Intel team should have an MR going up soon for ANV, my driver for this round is NVK. 😁
       
 (DIR) Post #Aodw8P4mxs5vmRsH44 by gfxstrand@mastodon.gamedev.place
       2024-12-02T18:05:38Z
       
       1 likes, 1 repeats
       
       And... Vulkan 1.4 support for NVK has now merged!