On a machine with one NVIDIA GeForce RTX 5080 and CUDA Toolkit 12.9, I am unable to query the number of passes for lts__t_* metrics. The failing step in the workflow is the call to NVPW_RawMetricsConfig_AddMetrics(…), which returns the error code NVPA_STATUS_ERROR.
Metrics I have encountered this for are:
- lts__t_requests_srcnode_gpc_realtime.avg
- lts__t_requests_srcunit_ce_realtime.avg
- syslts__t_requests_srcnode_gpc_realtime.avg
This behavior can be reproduced with the CUPTI sample code cupti_metric_properties.cpp that ships with CUDA Toolkit 12.9:
reproducer_cupti_metric_properties.cpp.txt (30.8 KB)
Are these metrics limited to the Range Profiling API or can they be used with the MetricsEvaluator API found with the PerfWorks Metrics API?
The metrics you listed are for Periodic Sampling (PM Sampling). To get properties for these metrics and determine the number of passes needed for data collection, set rawMetricsConfigCreateParams.activityKind to NVPA_ACTIVITY_KIND_REALTIME_SAMPLED.
If you are doing range-based profiling, these metrics aren’t supported—you should use non-realtime metrics in that case.
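As a rough illustration of the point above, here is a minimal sketch of choosing the activity kind per metric. It rests on assumptions that are not confirmed in this thread: that realtime (PM Sampling) metrics can be recognized by `_realtime` in their base name, as is the case for the three metrics listed, and that everything else belongs to range-based profiling. The `ActivityKind` enum is a stand-in for the `NVPA_ACTIVITY_KIND_*` constants so the snippet compiles without the toolkit headers, and the helper names `metricBaseName`, `isRealtimeMetric`, and `activityKindFor` are hypothetical.

```cpp
#include <string>

// Stand-in for the NVPA_ActivityKind constants from nvperf_host.h, so this
// sketch compiles without the CUDA Toolkit headers.
enum class ActivityKind {
    Profiler,        // corresponds to NVPA_ACTIVITY_KIND_PROFILER (range-based profiling)
    RealtimeSampled  // corresponds to NVPA_ACTIVITY_KIND_REALTIME_SAMPLED (PM Sampling)
};

// Hypothetical helper: drop the rollup/submetric suffix, e.g.
// "lts__t_requests_srcnode_gpc_realtime.avg" -> "lts__t_requests_srcnode_gpc_realtime".
std::string metricBaseName(const std::string& metric) {
    const std::string::size_type dot = metric.find('.');
    return dot == std::string::npos ? metric : metric.substr(0, dot);
}

// Assumption: a metric is a realtime (PM Sampling) metric if its base name
// contains "_realtime". This is a naming heuristic, not an official rule.
bool isRealtimeMetric(const std::string& metric) {
    return metricBaseName(metric).find("_realtime") != std::string::npos;
}

// Pick the activity kind to request when creating the raw metrics config.
ActivityKind activityKindFor(const std::string& metric) {
    return isRealtimeMetric(metric) ? ActivityKind::RealtimeSampled
                                    : ActivityKind::Profiler;
}
```

In the sample code, the result would be mapped to the matching NVPA constant and assigned to rawMetricsConfigCreateParams.activityKind before the raw metrics config is created. Metrics of the two kinds should not be mixed in one config.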
Since you’re already using the 12.9 Toolkit, I highly recommend switching to the new CUPTI Profiler Host APIs for all metric-related tasks. This includes enumerating metric properties, generating the config image (which includes scheduling info), and evaluating metrics from the counter data image. Note that the NVPW_MetricsEvaluator and NVPW_CUDA_RawMetricsConfig APIs are deprecated and will be removed in future releases.
Here’s a quick summary of the relevant CUPTI APIs:
- CUPTI Range Based Profiling (cuptiRangeProfiler*): Collects metrics data for user-defined code ranges or per-kernel, depending on your settings.
- CUPTI Periodic Sampling (cuptiPmSampling*): Collects metrics data at regular intervals, as specified by the sampling period.
- CUPTI Profiler Host API (cuptiProfilerHost*): Used on the host side for both range-based and periodic sampling workflows. You’ll use these APIs to create config images for each metric set and to evaluate metrics from the counter data image, which holds the profiling data. These tasks were previously handled by the now-deprecated PerfWorks Metrics APIs.
Hello @ssubudhi ,
Thank you for the reply and information.
As it stands, I am using the NVPW_MetricsEvaluator API; however, I am aware that I need to move to the Range Profiling API.
With the NVPW_MetricsEvaluator API, the lts__t_ metrics appear when I enumerate the available metrics on a device. To clear up some confusion on my end: from your reply, it seems that these metrics are restricted to the PM Sampling API. Do I have that correct?
Or can these metrics be used with the NVPW_MetricsEvaluator API if I set rawMetricsConfigCreateParams.activityKind to NVPA_ACTIVITY_KIND_REALTIME_SAMPLED?
Thank you for your help.