https://github.com/GPUOpen-Tools/radeon_gpu_detective

GPUOpen-Tools / radeon_gpu_detective: Tool for post-mortem analysis of GPU crashes.

License: MIT license. Latest commit: d04c01d (Aug 17, 2023), "Additional updates for v1.0 release."
# Radeon(tm) GPU Detective (RGD)

RGD is a tool for post-mortem analysis of GPU crashes. The tool performs offline processing of AMD GPU crash dump files and generates crash analysis reports in text and JSON formats.

To generate AMD GPU crash dumps for your application, use Radeon Developer Panel (RDP) and follow the Crash Analysis help manual.

## Build Instructions

It is recommended to build the tool using the "pre_build.py" script, which can be found under the "build" subdirectory.

Steps:

    cd build
    python pre_build.py

The script supports different options, such as using different MSVC toolset versions. For the list of options, run the script with -h.

By default, a solution is generated for VS 2019. To generate a solution for a different VS version or to use a different MSVC toolchain, use the --vs argument. For example, to generate the solution for VS 2022 with the VS 2022 toolchain (MSVC 17), run:

    python pre_build.py --vs 2022

## Running

Basic usage (text output):

    rgd --parse -o

Basic usage (JSON output):

    rgd --parse --json

For more options, run rgd -h to print the help manual.
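When RGD is driven from a script, the two basic invocations above can be assembled programmatically. The following is a minimal sketch, not part of RGD itself. It assumes that the input .rgd path follows `--parse` and that the output path follows `-o` (text) or `--json` (JSON). The helper name `build_rgd_command` and the file names `crash.rgd` and `crash_report.txt` are hypothetical. Run `rgd -h` to confirm the exact argument layout.

```python
import shlex


def build_rgd_command(dump_file, output_file, as_json=False, rgd_exe="rgd"):
    """Build an argv list for a basic RGD invocation.

    Assumption (check against `rgd -h`): the crash dump path follows
    --parse, and the report path follows -o (text) or --json (JSON).
    """
    output_flag = "--json" if as_json else "-o"
    return [rgd_exe, "--parse", dump_file, output_flag, output_file]


# Hypothetical file names, used only for illustration.
text_cmd = build_rgd_command("crash.rgd", "crash_report.txt")
json_cmd = build_rgd_command("crash.rgd", "crash_report.json", as_json=True)

# Print the commands in a shell-friendly form instead of executing them.
print(shlex.join(text_cmd))
print(shlex.join(json_cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where the `rgd` executable is on the PATH.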
## Usage

The rgd command line tool accepts AMD driver GPU crash dump files (.rgd files) as input and generates crash analysis report files with summarized information that can assist in debugging GPU crashes.

The basic usage is:

    rgd --parse -o

By default, the rgd command line tool's crash analysis output files include the following information:

* System information (CPUs, GPUs, driver, OS, etc.)
* Execution marker tree for each command buffer that was in flight during the crash.
* Summary of the markers that were in progress during the crash (similar to the marker tree, but without the hierarchy and including only markers that were in progress).
* Page fault summary (for crashes that were determined to be caused by a page fault).

The text and JSON output files contain the same information in different representations. For simplicity, this document refers to the human-readable text output.

Here are some more details about the crash analysis file's contents:

### Crash Analysis File Information

* Input crash dump file name: the full path to the .rgd file that was used to generate this file.
* Input crash dump file creation time
* RGD CLI version used

### System Information

This section is titled SYSTEM INFO and includes information about:

* Driver
* OS
* CPUs
* GPUs

### Markers in Progress

This section is titled MARKERS IN PROGRESS and lists only the execution markers that were in progress during the crash, for each command buffer that was determined to be in flight during the crash.

Here is the matching output for the tree below (see EXECUTION MARKER TREE):

    Command Buffer ID: 0x2e
    =======================
    Frame 268 CL0/DownSamplePS/CmdDraw
    Frame 268 CL0/DownSamplePS/CmdDraw
    Frame 268 CL0/DownSamplePS/CmdDraw
    Frame 268 CL0/DownSamplePS/CmdDraw
    Frame 268 CL0/DownSamplePS/CmdDraw
    Frame 268 CL0/Bloom/BlurPS/CmdDraw

Note that marker hierarchy is denoted by "/".
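Because each MARKERS IN PROGRESS line encodes the hierarchy with "/", the section is easy to post-process. The sketch below is only an illustration: it splits each line into its marker levels and counts repeated paths. The function name `parse_markers_in_progress` is hypothetical, and the code assumes that marker names do not themselves contain "/", which the README does not guarantee.

```python
from collections import Counter


def parse_markers_in_progress(lines):
    """Count marker paths, each split into its levels on "/".

    Assumption: marker names contain no "/" of their own.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        # Skip blank lines, the command buffer header and the "====" rule.
        if not line or line.startswith("Command Buffer ID") or set(line) == {"="}:
            continue
        counts[tuple(line.split("/"))] += 1
    return counts


# The sample output from the section above.
sample = """Command Buffer ID: 0x2e
=======================
Frame 268 CL0/DownSamplePS/CmdDraw
Frame 268 CL0/DownSamplePS/CmdDraw
Frame 268 CL0/DownSamplePS/CmdDraw
Frame 268 CL0/DownSamplePS/CmdDraw
Frame 268 CL0/DownSamplePS/CmdDraw
Frame 268 CL0/Bloom/BlurPS/CmdDraw
"""

counts = parse_markers_in_progress(sample.splitlines())
for path, occurrences in counts.items():
    print(occurrences, " > ".join(path))
```

Grouping identical paths this way shows at a glance which draw or dispatch calls were in flight more than once.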
### Execution Marker Tree

This section is titled EXECUTION MARKER TREE and contains a tree describing the marker status for each command buffer that was determined to be in flight during the crash.

User-provided marker strings are wrapped in "double quotes". Here is an example marker tree:

    Command Buffer ID: 0x2e
    =======================
    [>] "Frame 268 CL0"
    +-[X] "Depth + Normal + Motion Vector PrePass"
    +-[X] "Shadow Cascade Pass"
    +-[X] "TLAS Build"
    +-[X] "Classify tiles"
    +-[X] "Trace shadows"
    +-[X] "Denoise shadows"
    +-[X] CmdDispatch
    +-[X] CmdDispatch
    +-[X] "GltfPbrPass::DrawBatchList"
    +-[X] "Skydome Proc"
    +-[X] "GltfPbrPass::DrawBatchList"
    +-[>] "DownSamplePS"
    +-[>] "Bloom"
    +-[>] "BlurPS"
    | +-[>] CmdDraw
    | +-[ ] CmdDraw
    +-[ ] CmdDraw
    +-[ ] "BlurPS"
    +-[ ] CmdDraw
    +-[ ] "BlurPS"
    +-[ ] CmdDraw
    +-[ ] "BlurPS"
    +-[ ] CmdDraw
    +-[ ] "BlurPS"
    +-[ ] CmdDraw

#### Configuring the Execution Marker Output with RGD CLI

RGD CLI exposes a few options that impact how the marker tree is generated:

* --marker-src: include a suffix tag for each node in the tree indicating its origin (the component that submitted the marker). The supported components are:
  + [APP] for application markers.
  + [Driver-PAL] for markers originating from PAL.
  + [Driver-DX12] for markers originating from DXCP non-PAL code.
* --expand-markers: expand all parent nodes in the tree. By default, RGD collapses all nodes that do not have any sub-nodes in progress, as these would generally be considered "noise" when trying to find the culprit marker.

#### Configuring the Page Fault Summary with RGD CLI

* --va-timeline: print a table with all events that impacted the offending VA, sorted chronologically (this is only applicable to crashes that are caused by a page fault). Because this table can be extremely verbose and is not required for analyzing most crashes, it is not included in the output file by default.
* --all-resources: if specified, the tool's output will include all the resources from the input crash dump file, regardless of their virtual address.

### Page Fault Summary

If the crash was determined to be caused by a page fault, a section titled PAGE FAULT SUMMARY will include useful details about the page fault, such as:

* Offending VA: the virtual address that triggered the page fault.
* Resource timeline: a timeline of the associated resources (all resources that resided in the offending VA) with relevant events such as Create, Bind and Destroy. These events are sorted chronologically. Each line includes:
  + Time of event
  + Event type
  + Type of resource
  + Resource ID
  + Resource size
  + Resource name (if named, otherwise NULL)
* Associated resources: a list of all associated resources with their details, including:
  + Resource ID
  + Name (if named by the user)
  + Type and creation flags
  + Size
  + Virtual address and parent allocation base address
  + Commit type
  + Resource timeline that shows all events that are relevant to this specific resource in chronological order

A note about time tracking: The general time format used by RGD is which stands for . Beginning of time (00:00:00.00) is when the crash analysis session started (note that there is an expected lag between the start of the crashing process and the beginning of the crash analysis session, due to the time it takes to initialize crash analysis in the driver).

## Capturing AMD GPU Crash Dump Files

* To learn how to get started with RGD, see the RGD quickstart guide. More information can be found in the RGD help manual.
* The complete documentation can be found in the Radeon Developer Tool Suite archive under help/rgd/index.html.
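As a recap of the marker tree and page fault options described in the Usage section, the sketch below collects them into a list that a script could append to an rgd command line. It is an illustration under stated assumptions: the helper name `rgd_optional_flags` is hypothetical, and the code assumes each option is a plain switch that takes no value, which the README implies but does not state. Check `rgd -h` before relying on it.

```python
def rgd_optional_flags(marker_src=False, expand_markers=False,
                       va_timeline=False, all_resources=False):
    """Return the list of optional RGD switches that were requested.

    Assumption: every option is a value-less switch (verify with `rgd -h`).
    """
    requested = [
        ("--marker-src", marker_src),          # tag each node with its origin
        ("--expand-markers", expand_markers),  # do not collapse parent nodes
        ("--va-timeline", va_timeline),        # table of events on the offending VA
        ("--all-resources", all_resources),    # include resources at any address
    ]
    return [flag for flag, enabled in requested if enabled]


# Example: a page fault investigation that also wants marker origins.
flags = rgd_optional_flags(marker_src=True, va_timeline=True)
print(" ".join(flags))
```

Keeping the verbose options opt-in mirrors the tool's defaults, where the VA timeline and the full resource list are left out of the report unless requested.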
Release: Radeon GPU Detective v1.0 (latest, Aug 17, 2023).