According to the CUDA 12.8 release notes, “profiling of NVLink metrics is now supported using profiler host API and range profiler APIs.”
Would it be possible to get a relevant example?
I tried to retrieve NVLink metrics such as nvlrx__bytes.sum
on a DGX A100 server using the range_profiling and userrange_profiling samples.
In range_profiling, I encountered:
cuptiProfilerHostConfigAddMetrics() failed with error(999): CUPTI_ERROR_UNKNOWN
In userrange_profiling, I encountered:
NVPW_RawMetricsConfig_AddMetrics() with error NVPA_STATUS_ERROR
Neither of these error codes provides any useful information.
Other metrics, such as PCIe metrics, are retrieved successfully; only the NVLink metrics fail.
After changing CUpti_ProfilerType from CUPTI_PROFILER_TYPE_RANGE_PROFILER to CUPTI_PROFILER_TYPE_PM_SAMPLING, this error disappeared.
So, is there a CUPTI profiler type that supports NVLink metrics the way PM sampling does?