[HN Gopher] AMD GPU Debugger
       ___________________________________________________________________
        
       AMD GPU Debugger
        
       Author : ibobev
       Score  : 186 points
       Date   : 2025-12-08 16:06 UTC (6 hours ago)
        
 (HTM) web link (thegeeko.me)
 (TXT) w3m dump (thegeeko.me)
        
       | snarfy wrote:
       | Is there not an official tool from AMD?
        
         | c2h5oh wrote:
         | GDB supports it
         | https://sourceware.org/gdb/current/onlinedocs/gdb.html/AMD-G...
         | 
         | You also get UMR from AMD
         | https://gitlab.freedesktop.org/tomstdenis/umr
         | 
         | There is also a bunch of other tools provided:
         | https://gpuopen.com/radeon-gpu-detective/
         | https://gpuopen.com/news/introducing-radeon-developer-tool-s...
        
           | slavik81 wrote:
           | It's worth noting that upstream gdb (and clang) are somewhat
           | limited in GPU debugging support because they only use (and
           | emit) standardized DWARF debug information. The DWARF
           | standard will need updates before gdb and clang can reach
           | parity with the AMD forks, rocgdb and amdclang, in terms of
           | debugging support. It's nothing fundamental, but the AMD
           | forks use experimental DWARF features and the upstream
           | projects do not.
           | 
           | It's a little out of date now, but Lancelot Six gave a
           | presentation about the state of AMD GPU debugging in upstream
           | gdb at FOSDEM 2024.
           | https://archive.fosdem.org/2024/events/attachments/fosdem-20...
        
           | thegeeko wrote:
           | AMD's gdb is an actual debugger, but it only works with
           | applications that emit DWARF and use the amdkfd KMD, i.e. it
           | doesn't work with graphics. None of the rest are actual
           | debuggers. UMR does support wave stepping, but it doesn't try
           | to be a shader debugger; it's a tool for driver developers.
           | The other AMD tools don't have any debugging capabilities.
        
         | almostgotcaught wrote:
         | > After searching for solutions, I came across rocgdb, a
         | debugger for AMD's ROCm environment.
         | 
         | It's like the 3rd sentence in the blog post.......
        
           | djmips wrote:
           | To be fair, it wasn't clear that it was an official AMD
           | debugger, and besides, it's only for debugging ROCm
           | applications.
        
             | almostgotcaught wrote:
             | This sentence doesn't make any sense: a) ROCm is an AMD
             | product; b) ROCm "applications" are GPU "applications".
        
               | fc417fc802 wrote:
               | But not all GPU applications are ROCm applications (I
               | would think).
               | 
               | I can certainly understand OP's confusion. Navigating
               | parts of the GPU ecosystem that are new to you can be
               | incredibly confusing.
        
               | thegeeko wrote:
               | There are 2 AMD KMDs (kernel-mode drivers) in Linux: amdkfd
               | and amdgpu. Graphics applications use amdgpu, which is not
               | supported by amdgdb. amdgdb also has the limitation of
               | requiring DWARF, and the Mesa/AMD UMDs (user-mode drivers)
               | don't generate that.
        
       | shetaye wrote:
       | There also exists cuda-gdb[1], a first-party GDB for NVIDIA's
       | CUDA. I've found it to be pretty good. Since CUDA uses a
       | threading model, it works well with the GDB thread ergonomics
       | (though you can only single-step at the warp granularity IIRC by
       | the nature of SM execution).
       | 
       | [1] https://docs.nvidia.com/cuda/cuda-gdb/index.html
        
       | danjl wrote:
       | For NVIDIA cards, you can use NSight. There's also RenderDoc that
       | works on a large number of GPUs.
        
         | _zoltan_ wrote:
         | nsys and nvtx are awesome.
         | 
         | many don't know but you can use them without GPUs :)
        
       | whalesalad wrote:
       | Tangent: is anyone using a 7900 XTX for local
       | inference/diffusion? I finally installed Linux on my gaming pc,
       | and about 95% of the time it is just sitting off collecting dust.
       | I would love to put this card to work in some capacity.
        
         | qskousen wrote:
         | I've done it with a 6800XT, which should be similar. It's a
         | little trickier than with an Nvidia card (because everything is
         | designed for CUDA) but doable.
        
         | FuriouslyAdrift wrote:
         | You'd be much better off with any decent Nvidia card than with
         | the 7900 series.
         | 
         | AMD doesn't have a unified architecture across GPU and compute
         | like nVidia.
         | 
         | AMD compute cards are sold under the Instinct line and are
         | vastly more powerful than their GPUs.
         | 
         | Supposedly, they are moving back to a unified architecture in
         | the next generation of GPU cards.
        
         | Joona wrote:
         | I tested some image and text generation models, and generally
         | things just worked after replacing the default torch libraries
         | with AMD's rocm variants.
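
A minimal sketch of how one might confirm that the installed torch is a ROCm build, as described in the comment above. It assumes only that PyTorch, if present, exposes `torch.version.hip` (set on ROCm builds) and that ROCm builds report the GPU through the usual `torch.cuda` API. The helper name `describe_torch_backend` is made up for illustration.

```python
import importlib.util


def describe_torch_backend() -> dict:
    """Report whether torch is installed and whether it looks like a ROCm build."""
    # Avoid an ImportError on machines without PyTorch.
    if importlib.util.find_spec("torch") is None:
        return {"installed": False, "hip": None, "gpu_available": False}

    import torch

    # On ROCm builds torch.version.hip is a version string; otherwise it is None.
    hip = getattr(torch.version, "hip", None)
    # ROCm builds reuse the torch.cuda namespace for AMD GPUs.
    gpu_available = bool(torch.cuda.is_available())
    return {"installed": True, "hip": hip, "gpu_available": gpu_available}


if __name__ == "__main__":
    print(describe_torch_backend())
```

If `hip` comes back as `None` on a machine with an AMD card, the default (non-ROCm) wheels are probably still installed.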
        
         | universa1 wrote:
         | try it with ramalama[1]. worked fine here with a 7840u and a
         | 6900xt.
         | 
         | [1] https://ramalama.ai/
        
         | Gracana wrote:
         | I bought one when they were pretty new and I had issues with
         | rocm (iirc I was getting kernel oopses due to GPU OOMs) when
         | running LLMs. It worked mostly fine with ComfyUI unless I tried
         | to do especially esoteric stuff. From what I've heard lately
         | though, it should work just fine.
        
         | jjmarr wrote:
         | I've been using it for a few years on Gentoo. There were
         | challenges with Python two years ago, but over the past year it's
         | stabilized and I can even do img2video which is the most
         | difficult local inference task so far.
         | 
         | Performance-wise, the 7900 XTX is still the most cost-effective
         | way of getting 24 gigabytes that isn't a sketchy VRAM mod. And
         | VRAM is the main performance barrier since any LLM is going to
         | _barely_ fit in memory.
         | 
         | Highly suggest checking out TheRock. There's been a big
         | rearchitecting of ROCm to improve the UX/quality.
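
A rough sketch of the VRAM arithmetic behind the "barely fits" remark above. It counts only the model weights (parameters times bits per weight) plus a flat allowance for everything else. The 2 GiB overhead and the example model sizes are illustrative assumptions, not measurements, and real usage depends on context length and runtime.

```python
def weight_gib(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def fits(params_billions: float, bits_per_weight: float,
         vram_gib: float = 24.0, overhead_gib: float = 2.0) -> bool:
    """True if weights plus an assumed flat overhead fit in VRAM.

    overhead_gib is a placeholder for KV cache, activations and
    runtime buffers; it is an assumption, not a measured value.
    """
    return weight_gib(params_billions, bits_per_weight) + overhead_gib <= vram_gib


if __name__ == "__main__":
    # Hypothetical model sizes, chosen only to show the trend.
    for params, bits in [(7, 16), (13, 16), (32, 4), (70, 4)]:
        size = weight_gib(params, bits)
        print(f"{params}B @ {bits}-bit: {size:.1f} GiB weights, fits in 24 GiB: {fits(params, bits)}")
```

Under these assumptions a 13B model at 16 bits already exceeds 24 GiB, which is why quantized weights are the usual route on a card of this size.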
        
         | veddan wrote:
         | For LLMs, I just pulled the latest llama.cpp and built it.
         | Haven't had any issues with it. This was quite recently though,
         | things used to be a lot worse as I understand it.
        
       | mitchellh wrote:
       | Non-AMD, but Metal actually has a [relatively] excellent debugger
       | and general dev tooling. It's why I prefer to do all my GPU work
       | Metal-first and then adapt/port to other systems after that:
       | https://developer.apple.com/documentation/Xcode/Metal-debugg...
       | 
       | I'm not like a AAA game developer or anything so I don't know how
       | it holds up in intense 3D environments, but for my use cases it's
       | been absolutely amazing. To the point where I recommend people
       | who are dabbling in GPU work grab a Mac (Apple Silicon often
       | required) since it's such a better learning and experimentation
       | environment.
       | 
       | I'm sure it's linked somewhere there but in addition to
       | traditional debugging, you can actually emit formatted log
       | strings from your shaders and they show up interleaved with your
       | app logs. Absolutely bonkers.
       | 
       | The app I develop is GPU-powered on both Metal and OpenGL systems
       | and I haven't been able to find anything that comes near the
       | quality of Metal's tooling in the OpenGL world. People claim a
       | lot of stuff is equivalent, but as someone who has actively used
       | both, I strongly feel it doesn't hold a candle to what Apple has
       | done.
        
         | mattbee wrote:
         | My initiation into shaders was porting some graphics code from
         | OpenGL on Windows to PS5 and Xbox, and (for your NDA and devkit
         | fees) they give you some very nice debuggers on both platforms.
         | 
         | But yes, when you're stumbling around a black screen, tooling
         | is everything. Porting bits of shader code between syntaxes is
         | the easy bit.
         | 
         | Can you get better tooling on Windows if you stick to DirectX
         | rather than OpenGL?
        
           | mitchellh wrote:
           | > Can you get better tooling on Windows if you stick to
           | DirectX rather than OpenGL?
           | 
           | My app doesn't currently support Windows. My plan was to use
           | the full DirectX suite when I get there and go straight to
           | D3D and friends. I lack any experience on Windows, so I'd
           | love if someone who knows both macOS and Windows could
           | compare GPU debugging!
        
             | speps wrote:
             | Windows has PIX for Windows; PIX has been the name of the
             | GPU debugger since the Xbox 360. The Windows version is
             | similar, but it relies on debug layers that need to be
             | GPU-specific, which is usually handled automatically. Because
             | of that it's not as deep as the console version, but it lets
             | you get by. Most people use RenderDoc on supported platforms
             | though (Linux and Windows). It supports most APIs you can
             | find on these platforms.
        
         | billti wrote:
         | It's a full featured and beautifully designed experience, and
         | when it works it's amazing. However it regularly freezes or
         | hangs for me, and I've lost count of the number of times I've
         | had to 'force quit' Xcode or it's just outright crashed. Also,
         | for anything non-trivial it often refuses to profile and I have
         | to try to write a minimal repro to get it to capture anything.
         | 
         | I am writing compute shaders though, where one command buffer
         | can run for seconds repeatedly processing over a 1GB buffer,
         | and it seems the tools are heavily geared towards graphics work
         | where the workload per frame is much lighter. (With all the AI
         | focus, hopefully they'll start addressing this use-case more).
        
           | mitchellh wrote:
           | > However it regularly freezes or hangs for me, and I've lost
           | count of the number of times I've had to 'force quit' Xcode
           | or it's just outright crashed.
           | 
           | This has been my experience too. It isn't often enough to
           | diminish its value for me since I have basically no
           | comparable options on other platforms, but it definitely has
           | some sharp (crashy!) edges.
        
         | hoppp wrote:
         | Is your code easy to transfer to other environments? The Apple
         | vendor lock-in is not a great place for development if the end
         | product runs on servers, unlike using AMD GPUs which can be
         | found on the backend. Same goes for games because most gamers
         | either have an AMD or an Nvidia graphics card as playing on Mac
         | is still rare, so priority should be supporting those platforms.
         | 
         | It's probably awesome to use Metal and everything but the vendor
         | lock-in sounds like an issue.
        
           | mitchellh wrote:
           | It has been easy. All modern GPU APIs are basically the same
           | now unless you're relying on the most cutting edge features.
           | I've found converting between MSL, OpenGL (4.3+), and
           | WebGPU to be trivial. Also, LLMs are pretty good at it on
           | first pass.
        
             | hoppp wrote:
             | That's pretty cool then!
        
       ___________________________________________________________________
       (page generated 2025-12-08 23:00 UTC)