Hi,
I am using nsys to profile a GH200 machine.
According to the user manual, the NVLink performance data should be shown in the nsys report, but I do not see it in the GUI (image below), and nsys does not report any error saying it could not be collected.
Platform Linux
OS Rocky Linux 9.3 (Blue Onyx)
Hardware platform armv8
Serial number Local (CLI)
GPU descriptions NVIDIA GH200 120GB
NVIDIA driver version 550.54.15
Max EMC frequency 1.60 GHz
CPU context switch supported
GPU context switch supported
Tunnel traffic through SSH no
Timestamp counter supported
NVIDIA Nsight Systems version 2024.1.1.59-241133802077
What does diagnostics summary say? Are there any messages mentioning GPU Metrics?
This is the only message regarding GPU metrics in the diagnostics summary:
Information Analysis 00:05.251
Number of GPU Metrics events collected: 286,057.
Could you share the report? You can either upload it publicly, or send it to me privately to pkovalenko@nvidia.com, or, if you don’t want to share confidential data, collect and share a new report by profiling sleep 1. Note that 2024.1 is a fairly old release, so please try the latest one first.
I still have the same problem: the NVLink metrics are not visible in nsys. I am now running 2024.5. Attached is the report, collected with:
nsys profile --gpu-metrics-set=gh100 --gpu-metrics-devices=all --cuda-um-cpu-page-faults=true --cuda-um-gpu-page-faults=true --event-sample=system-wide sleep 1
report4.nsys-rep.zip (255.5 KB)
Thanks for your patience. Looking in the report, it seems that the NVLink metrics have been scheduled and collected, but for some reason are not getting displayed on the timeline. The latest nsys-ui shows your report exactly the same way which means the problem hasn’t been fixed. I’ll do more digging and see what’s going on.
OK, so I must admit I got confused as well and didn’t notice there’s a single GPU in your system. NVLink metrics are only available on multi-GPU systems. If your goal is to observe the traffic between GPU and CPU (which also goes through NVLink), you should be looking at CTC Throughput metrics which are available in the latest website release.
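For anyone hitting the same thing, here is a minimal sketch of a pre-check based on the point above (NVLink metrics only on multi-GPU systems). It assumes the output of nvidia-smi -L has one line per GPU starting with "GPU <index>:"; the helper names count_gpus, nvlink_metrics_expected and list_gpus are made up for illustration and are not part of nsys or any NVIDIA tool.

```python
import re
import subprocess


def count_gpus(listing: str) -> int:
    """Count lines of the form 'GPU <index>: ...' in `nvidia-smi -L` output.

    Assumption: each physical GPU is reported on its own line with that prefix.
    """
    return sum(
        1 for line in listing.splitlines() if re.match(r"GPU \d+:", line.strip())
    )


def nvlink_metrics_expected(listing: str) -> bool:
    """Per the reply above, NVLink metrics are only available with more than one GPU."""
    return count_gpus(listing) > 1


def list_gpus() -> str:
    """Run `nvidia-smi -L` and return its stdout (requires the NVIDIA driver tools)."""
    result = subprocess.run(
        ["nvidia-smi", "-L"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

On a machine with the driver installed, nvlink_metrics_expected(list_gpus()) would return False for a single-GPU system like the one in this thread, in which case the CTC Throughput metrics mentioned above are the ones to look at for GPU-CPU traffic.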