Nsight Systems: Unified Memory Trace Support for GB10 (SM121)

Hello NVIDIA Engineers!

I’m profiling LLM inference workloads on a DGX Spark (GB10, SM121) using Nsight Systems 2025.x. The profiling works well overall, but I noticed that Unified Memory tracing is not supported on this platform:

CUDA device 0: Unified Memory trace is not supported by the current driver version or configuration.

Environment:

- Hardware: NVIDIA GB10 (DGX Spark)

- GPU Architecture: SM121 (Blackwell)

- Driver: 580.95.05

- CUDA: 13.1

- Nsight Systems: 2026.1.1.204-261137176666v0 OSX.

Use Case:

My LLM inference workload on GB10 heavily uses Unified Memory for CPU-GPU data movement. Without UM tracing support, I’m unable to profile a critical aspect of my application’s performance:

1. Page fault analysis - Understanding when and where page faults occur during inference

2. Memory migration patterns - Identifying bottlenecks in data movement between CPU and GPU

3. Prefetch effectiveness - Validating whether prefetching strategies are working as intended

This is a significant gap in profiling capability for workloads that rely on Unified Memory on the GB10 platform.

Questions:

1. Is UM tracing support planned for SM121/GB10 in a future driver or Nsight Systems release?

2. Is this a hardware limitation of the GB10, or a driver/software limitation that could be addressed?

3. Are there alternative profiling approaches you’d recommend for understanding memory access patterns on GB10?

Workaround Attempts:

I’ve tried enabling the options explicitly via `--cuda-um-cpu-page-faults=true` and `--cuda-um-gpu-page-faults=true` on `nsys launch`, but the same limitation message appears.
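The attempted invocation can be sketched roughly as follows. This is a minimal illustration rather than the exact command that was run: the helper name `build_nsys_command` and the application path `./my_inference_app` are hypothetical placeholders, and it assumes a profiling session suitable for `nsys launch` is already set up.

```python
import shlex
import shutil
import subprocess


def build_nsys_command(app_args, subcommand="launch"):
    """Return an nsys argv that requests Unified Memory page-fault tracing.

    `app_args` is the target application and its arguments (placeholder here).
    """
    return [
        "nsys",
        subcommand,
        "--cuda-um-cpu-page-faults=true",  # request CPU-side UM page-fault events
        "--cuda-um-gpu-page-faults=true",  # request GPU-side UM page-fault events
        *app_args,
    ]


if __name__ == "__main__":
    # "./my_inference_app" is a hypothetical binary name, not from the report.
    cmd = build_nsys_command(["./my_inference_app"])
    print(shlex.join(cmd))
    # Only attempt to run when nsys is actually on PATH.
    if shutil.which("nsys"):
        subprocess.run(cmd, check=False)
```

On GB10 the limitation message quoted at the top of the post is still printed with these flags set, so the flags alone do not appear to be the blocker.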

Thank you for your continued work on Nsight Systems - it’s an invaluable tool for CUDA performance optimization. Any guidance on UM tracing support for Blackwell desktop GPUs would be appreciated.

Maybe it’s a touch cosmetic, but it would be nice if the CPU info worked as well:

CPU info

| CPU core | Socket | Core type | Max frequency | MPIDR | Performance / Efficient (P/E) |
|---|---|---|---|---|---|
| #0 | #36 | Unknown | 2,81 GHz | none | none |

(…)

Hello,

Have you managed to resolve this issue?
I’m facing the same problem (on Jetson Thor), and I haven’t found a solution so far.

Thanks.

UVM profiling is not supported on Spark, but it is currently being worked on for a future release.

Thank you for your reply.

I am currently using Jetson Thor, and I would like to ask whether the situation is the same for Jetson Thor as well.

Thanks.

I am not sure. For questions not related to Spark, you will have to go to the relevant forums.