CPU Profiling Tools
Supported on all platforms
- Enable the "Profiler" Gem via the Project Manager tool (bin/o3de) or the O3DE CLI (scripts/o3de)
CLI Example:
cd <engine_root>
./scripts/o3de enable-gem -pp <project_path> -gn Profiler
- Build as you would normally
- Enable the ImGui menu
- Host Platforms: While running either the GameLauncher or Editor, press the "Home" key to bring up the ImGui menu
- Mobile Platforms: Edit the respective platform's system config (e.g. Android = system_android_android.cfg) to include imgui_EnableImGui=1 before deploying to device
- Select "Profiler" from the menu bar, then the "CPU" entry in the dropdown
- Press Pause/Resume to stop or restart the profiler's per-frame data capture. While paused, the profiler shows the data for the last captured frame.
- Press Swap to Visualizer to switch to a view that helps analyze the capture.
- Press Capture to save the captured data to a JSON file.
- Use the Load File button to load a previously saved file for later analysis.
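The mobile step above (adding imgui_EnableImGui=1 to the platform's system config) can be scripted. The following is only a minimal sketch: the helper name enable_imgui is hypothetical and not part of O3DE, and it assumes the config file is a plain text file with one key=value setting per line.

```shell
# Hypothetical helper (not part of O3DE): make sure a platform system
# config file contains the imgui_EnableImGui setting.
# Assumption: the config is plain text with one key=value entry per line.
enable_imgui() {
    cfg_file="$1"

    # Create the file if it does not exist yet so grep has something to read.
    touch "$cfg_file"

    # If any line already sets imgui_EnableImGui, leave the file untouched
    # (an existing value such as 0 is NOT rewritten by this sketch).
    if grep -q '^imgui_EnableImGui=' "$cfg_file"; then
        return 0
    fi

    # If the last byte of the file is not a newline, add one first so the
    # new setting does not get glued onto the previous line.
    if [ -n "$(tail -c 1 "$cfg_file")" ]; then
        printf '\n' >> "$cfg_file"
    fi

    printf '%s\n' 'imgui_EnableImGui=1' >> "$cfg_file"
}
```

For Android this would be called as enable_imgui system_android_android.cfg from the directory that holds the config, before deploying to the device. Calling it more than once does not duplicate the line.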
Note: Only available on Windows
- Install Pix (installer link). Version 2107.01 and later is supported at this time.
- Download the WinPixEventRuntime nuget package (note that this link directs you to version 1.0.210818001 which is the currently supported version).
- Unzip the nuget package (optionally by changing the file extension to .zip) either to $LY_3RDPARTY_PATH/winpixeventruntime or to a custom path
- Set the CMake flag LY_PIX_ENABLED (pass LY_PIX_ENABLED=ON to your cmake configure command).
- If you unzipped the WinPixEventRuntime to an arbitrary path, set the cmake variable LY_PIX_PATH accordingly (note, this should point to where you extracted the WinPixEventRuntime package, not where the Pix executable is installed)
- After regenerating and recompiling, you'll be able to do captures for CPU timings, or GPU analysis.
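As a rough illustration of the two CMake settings above, the sketch below assembles a configure command line and prints it instead of running it. The helper name pix_configure_cmd, the build directory, and the example path are placeholders rather than values from this page; -B, -S, and -D are standard CMake command-line options. The sketch does not quote its arguments, so it assumes paths without spaces.

```shell
# Hypothetical helper (not part of O3DE): print a cmake configure command
# with PIX support enabled. It only prints the command; it does not run cmake.
#   $1 = build directory (placeholder, e.g. build/windows)
#   $2 = optional path where the WinPixEventRuntime package was extracted
# Assumption: neither path contains spaces.
pix_configure_cmd() {
    build_dir="$1"
    pix_path="$2"

    # LY_PIX_ENABLED=ON is the flag named in the steps above.
    cmd="cmake -B $build_dir -S . -DLY_PIX_ENABLED=ON"

    # LY_PIX_PATH is only needed when the package was extracted somewhere
    # other than $LY_3RDPARTY_PATH/winpixeventruntime.
    if [ -n "$pix_path" ]; then
        cmd="$cmd -DLY_PIX_PATH=$pix_path"
    fi

    printf '%s\n' "$cmd"
}
```

For example, pix_configure_cmd build/windows C:/tools/winpixeventruntime prints a command that sets both variables. Add whatever generator and project options you normally pass to your configure step.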
Ensure that GPU capture is unchecked when launching or attaching to the executable you wish to profile. After Pix is attached to the runtime, collect a timing capture (not the "legacy capture") by hitting the big play button. You'll generally want to collect samples (at a 4k to 8k rate), along with kernel and IO data as needed. Callstacks on context switches can also be helpful when diagnosing multithreaded workload issues and performance bottlenecks.