Problem
Running nsys profile --trace=cuda,nvtx python script.py inside an AI Workbench
container on DGX Spark GB10 produces a report with CPU/OS rows only. The CUDA lane
is absent. This is despite the workload actively using the GPU (confirmed via
nvidia-smi during the run).
Proposed Fix
The Holohub dev container passes --cap-add CAP_SYS_PTRACE explicitly in its
docker run command, and nsys works there. AI Workbench does not add this flag,
and nsys CUDA tracing fails. The fix is for Workbench to expose a capabilities
field in spec.yaml (or equivalent) so users can add SYS_PTRACE to their
project containers.
nsys status --environment reports SYS_PTRACE: Fail inside the Workbench container
(see Diagnostic Output below), which points to the missing capability as the root
cause. The fix is already proven in NVIDIA’s own tooling; it just hasn’t been
surfaced in Workbench.
Environment
| Component | Version |
|---|---|
| Hardware | DGX Spark GB10 |
| NVIDIA AI Workbench | latest (Desktop App, macOS) |
| Container base image | nvcr.io/nvidia/rapidsai/notebooks:26.04-cuda13-py3.12 |
| CUDA | 13.1.1 |
| nsys | 2024.2.3 (installed via apt.txt) |
| Python | 3.12 (conda) |
Symptom
No CUDA events collected. Does the process use CUDA?
No NVTX events collected. Does the process use NVTX?
Tracing nvtx alone yields only ranges emitted by NVIDIA’s own CCCL C++ libraries.
Python-side NVTX annotations do not appear, regardless of whether the Python nvtx
package or cupy.cuda.nvtx.RangePush/RangePop is used.
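
For reference, the annotation pattern we use looks roughly like the sketch below. It is not our actual workload: the phase names and the run_workload function are hypothetical stand-ins, and the body is CPU-only so it runs anywhere. It assumes the nvtx package from PyPI and falls back to a no-op when that package is not installed.

```python
# Minimal NVTX annotation sketch. Names here are illustrative only.
from contextlib import contextmanager

try:
    import nvtx  # NVIDIA's Python NVTX bindings (pip install nvtx)
    _HAVE_NVTX = True
except ImportError:
    _HAVE_NVTX = False


@contextmanager
def phase(name):
    """Open an NVTX range around a block if nvtx is available, else do nothing."""
    if _HAVE_NVTX:
        with nvtx.annotate(name):
            yield
    else:
        yield


def run_workload(n=1000):
    """Stand-in for a real algorithm: two annotated phases."""
    with phase("setup"):
        data = list(range(n))
    with phase("compute"):
        total = sum(x * x for x in data)
    return total


if __name__ == "__main__":
    print(run_workload())
```

In a working environment, each phase name should show up as a range in the NVTX row of the nsys timeline. In the Workbench container none of them do.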
Diagnostic Output
nsys status --environment inside the Workbench container:
CPU Profiling Environment Check
Root privilege: disabled
Linux Kernel Paranoid Level = 4
Linux perf_event_open syscall available: Fail
CPU Profiling Environment (process-tree): Fail
CUDA Profiling Environment Check
CUDA driver: 570
SYS_PTRACE: Fail ← the blocker
The container is missing CAP_SYS_PTRACE. Without it, nsys cannot inject its
interception library and CUDA tracing is impossible regardless of nsys version.
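
The same two conditions can be checked without nsys by reading procfs directly. The sketch below is our own helper, not part of any NVIDIA tool. It decodes the capability masks in /proc/self/status (CAP_SYS_PTRACE is bit 19 and CAP_SYS_ADMIN is bit 21 in linux/capability.h) and prints the perf paranoid level. Note that a non-root process can show an empty CapEff even when the capability is present in the container's bounding set, so it prints both.

```python
# Sketch: report capability bits and perf_event_paranoid for the current process.
CAP_SYS_PTRACE = 19  # bit index from <linux/capability.h>
CAP_SYS_ADMIN = 21


def parse_cap_field(status_text, field="CapEff"):
    """Return the capability mask for `field` from /proc/<pid>/status text."""
    prefix = field + ":"
    for line in status_text.splitlines():
        if line.startswith(prefix):
            return int(line.split()[1], 16)
    raise ValueError(f"{field} line not found")


def has_capability(mask, bit):
    """True if capability number `bit` is set in the mask."""
    return bool((mask >> bit) & 1)


def read_text(path):
    with open(path) as f:
        return f.read()


if __name__ == "__main__":
    status = read_text("/proc/self/status")
    for field in ("CapEff", "CapBnd"):
        mask = parse_cap_field(status, field)
        print(f"{field}: SYS_PTRACE={has_capability(mask, CAP_SYS_PTRACE)} "
              f"SYS_ADMIN={has_capability(mask, CAP_SYS_ADMIN)}")
    paranoid = read_text("/proc/sys/kernel/perf_event_paranoid").strip()
    print("perf_event_paranoid:", paranoid)
```

Running this inside both the Workbench container and a container started with --cap-add SYS_PTRACE should make the difference visible in the CapBnd line.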
Root Cause
AI Workbench does not pass --cap-add=SYS_PTRACE to its docker run command.
The Holohub dev container (an NVIDIA project) explicitly adds this flag and nsys
works there:
# from Holohub dev_container launch --verbose (March 2025, thread below):
docker run ... --cap-add CAP_SYS_PTRACE --ipc=host --ulimit=memlock=-1 ...
The Workbench spec.yaml has no capabilities field, and none of the official
Workbench example repos appear to expose one.
Reference: “Nsight in holohub not working”
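
To compare behavior outside Workbench, the same image can be started by hand with and without the capability. The sketch below only builds and prints the command; it does not run it. Several things here are assumptions: that nsys is installed in the image (in our project it comes from apt.txt, so the stock image may need it added), that the NVIDIA Container Toolkit is set up so --gpus all works, and that mounting the current directory at /work is acceptable. The /work path and the report output name are arbitrary choices for illustration.

```python
# Sketch: build a docker run command that profiles a script with nsys,
# optionally adding CAP_SYS_PTRACE. Prints the command instead of running it.
import os
import shlex


def build_docker_cmd(image, script, add_ptrace=True):
    """Return the docker run argument list for an nsys profiling session."""
    cmd = ["docker", "run", "--rm", "--gpus", "all"]
    if add_ptrace:
        cmd += ["--cap-add", "SYS_PTRACE"]
    # Mount the current directory so the script and the report are shared.
    cmd += ["-v", f"{os.getcwd()}:/work", "-w", "/work", image]
    cmd += ["nsys", "profile", "--trace=cuda,nvtx", "-o", "report",
            "python", script]
    return cmd


if __name__ == "__main__":
    image = "nvcr.io/nvidia/rapidsai/notebooks:26.04-cuda13-py3.12"
    for flag in (True, False):
        print(shlex.join(build_docker_cmd(image, "script.py", add_ptrace=flag)))
```

If the report from the first command contains a CUDA row and the second does not, that isolates the capability as the only variable.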
Questions
- Does Workbench spec.yaml support adding Linux capabilities (e.g., SYS_PTRACE,
  SYS_ADMIN) or seccomp overrides? If so, what is the correct field?
- What is the supported workflow for nsys profiling in a Workbench container on
  DGX Spark GB10? The machine was purchased specifically for hardware profiling.
- Is the host paranoid level 4 an additional blocker once the container capability
  is fixed, or does CUDA tracing work regardless of the CPU profiling environment?
Code
We are profiling a Python scientific computing workload with NVTX annotations
around each algorithmic phase. Full source is available to NVIDIA personnel on
request.
Related Thread
This follows up on: “Nsys profile not showing any GPU data” (November 2025),
which reports the identical symptom on DGX Spark and was acknowledged by @aniculescu
in December 2025 with no resolution posted. We believe the root cause identified
above explains that report as well.