Unable to obtain the number of passes for lts__t_* metrics on a machine with 1 * RTX 5080

On a machine with a single NVIDIA GeForce RTX 5080 and CUDA Toolkit 12.9, I am unable to query the number of passes for lts__t_* metrics. The failing step in the workflow is the call to NVPW_RawMetricsConfig_AddMetrics(…), which returns the error code NVPA_STATUS_ERROR.

The metrics I have encountered this with are:

  • lts__t_requests_srcnode_gpc_realtime.avg
  • lts__t_requests_srcunit_ce_realtime.avg
  • syslts__t_requests_srcnode_gpc_realtime.avg

This behavior can be reproduced with the CUPTI sample code cupti_metric_properties.cpp that ships with CUDA Toolkit 12.9:

reproducer_cupti_metric_properties.cpp.txt (30.8 KB)

Are these metrics limited to the Range Profiling API, or can they be used with the MetricsEvaluator API that is part of the PerfWorks Metrics API?

The metrics you listed are for Periodic Sampling (PM Sampling). To get properties for these metrics and determine the number of passes needed for data collection, set rawMetricsConfigCreateParams.activityKind to NVPA_ACTIVITY_KIND_REALTIME_SAMPLED.

If you are doing range-based profiling, these metrics aren’t supported—you should use non-realtime metrics in that case.

Since you’re already using the 12.9 Toolkit, I highly recommend switching to the new CUPTI Profiler Host APIs for all metric-related tasks. This includes enumerating metric properties, generating the config image (which includes scheduling info), and evaluating metrics from the counter data image. Note that the NVPW_MetricsEvaluator and NVPW_CUDA_RawMetricsConfig APIs are deprecated and will be removed in future releases.

Here’s a quick summary of the relevant CUPTI APIs:

  • CUPTI Range Based Profiling (cuptiRangeProfiler*): Collects metrics data for user-defined code ranges or per-kernel, depending on your settings.
  • CUPTI Periodic Sampling (cuptiPmSampling*): Collects metrics data at regular intervals, as specified by the sampling period.
  • CUPTI Profiler Host API (cuptiProfilerHost*): Used on the host side for both range-based and periodic sampling workflows. You’ll use these APIs to create config images for each metric set and to evaluate metrics from the counter data image, which holds the profiling data. These tasks were previously handled by the now-deprecated PerfWorks Metrics APIs.

Hello @ssubudhi ,

Thank you for the reply and information.

As it stands, I am using the NVPW_MetricsEvaluator API; however, I am aware of the need to update to the Range Profiling API.

With the NVPW_MetricsEvaluator API, the lts__t_* metrics appear when I enumerate the available metrics on a device. To clear up some confusion on my end: from your reply it seems that these metrics are restricted to the PM Sampling API. Do I have that correct?

Or can these metrics be used with the NVPW_MetricsEvaluator API if I set rawMetricsConfigCreateParams.activityKind to NVPA_ACTIVITY_KIND_REALTIME_SAMPLED?

Thank you for your help.

From your reply it seems that these metrics are restricted to using the PM Sampling API, do I have that correct?

That’s correct—the realtime metrics are specifically for PM Sampling.

Or can these metrics be used with the NVPW_MetricsEvaluator API if I set rawMetricsConfigCreateParams.activityKind to NVPA_ACTIVITY_KIND_REALTIME_SAMPLED?

To determine the passes needed for a set of metrics, you first need to create a config image. Previously, this was done using the (now deprecated) NVPW_CUDA_RawMetricsConfig API. If you’re creating a config image for range profiling, set activityKind to NVPA_ACTIVITY_KIND_PROFILER; for PM Sampling, use NVPA_ACTIVITY_KIND_REALTIME_SAMPLED.

Note that config images for range profiling and PM sampling are not interchangeable—you need to use the appropriate one for your use case.

If you’re doing range profiling, you’ll need to skip these realtime metrics. For PM sampling, set activityKind to NVPA_ACTIVITY_KIND_REALTIME_SAMPLED and you’ll be able to use them.
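To make the mapping above concrete, here is a minimal sketch of how the choice could be encoded on the application side. It deliberately does not include any CUPTI or PerfWorks headers: the Workflow enum and both helper functions are hypothetical names introduced for illustration, and the returned strings are only the names of the activity kinds mentioned in this thread, not the enum values themselves.

```cpp
#include <string>

// Hypothetical application-side tag for the two workflows discussed above.
enum class Workflow { RangeProfiling, PmSampling };

// Returns the NAME of the activity kind to set on the config-creation
// parameters for the given workflow, as described in this thread.
// (Illustrative only: real code would assign the NVPA_ACTIVITY_KIND_* enum
// value to the activityKind field rather than use a string.)
inline const char* activityKindNameFor(Workflow workflow) {
    switch (workflow) {
        case Workflow::RangeProfiling:
            return "NVPA_ACTIVITY_KIND_PROFILER";
        case Workflow::PmSampling:
            return "NVPA_ACTIVITY_KIND_REALTIME_SAMPLED";
    }
    return "";  // unreachable for valid enum values
}

// Config images for range profiling and PM sampling are not interchangeable,
// so an image is only usable for the workflow it was built for.
inline bool configImageUsableFor(Workflow builtFor, Workflow usedFor) {
    return builtFor == usedFor;
}
```

Keeping the workflow tag next to each config image makes it harder to hand a range-profiling image to a PM sampling session by accident.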

Let us know if you have any other questions!

@ssubudhi Thank you for the reply and information.

I am currently using the PerfWorks Metrics API with the MetricsEvaluator sub API, but I will look to upgrade to the Range Profiling API.

If you’re doing range profiling, you’ll need to skip these realtime metrics.

For the PerfWorks Metrics API, is there a way to filter or skip these realtime metrics while enumerating the getMetricNamesParams.numMetrics entries obtained from the call to NVPW_MetricsEvaluator_GetMetricNames?

I do not see a member variable in NVPW_MetricsEvaluator_GetMetricNames_Params that would allow me to filter out the realtime metrics.

Options I have thought of include:

  1. Use strstr to check whether realtime appears in the metric name. However, further testing shows this does not hold, because the number of passes can be obtained for lts__t_sectors_srcunit_tex_lookup_miss_realtime.avg.
  2. For each metric, go through the NVPW_CUDA_RawMetricsConfig_Create_V2 workflow, which for a realtime metric would fail at NVPW_RawMetricsConfig_AddMetrics with error code NVPA_STATUS_ERROR.
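For reference, option 2 could be structured roughly as below. This is only a sketch under stated assumptions: MetricProbe is a hypothetical callback that stands in for the per-metric NVPW_CUDA_RawMetricsConfig_Create_V2 / NVPW_RawMetricsConfig_AddMetrics attempt and returns true when the metric was accepted. MetricPartition, partitionMetrics, and nameSuggestsRealtime are likewise names made up for illustration and are not part of any NVIDIA API.

```cpp
#include <functional>
#include <string>
#include <vector>

// Metric names split by whether the probe accepted them.
struct MetricPartition {
    std::vector<std::string> schedulable;  // probe returned true
    std::vector<std::string> skipped;      // probe returned false
};

// Hypothetical probe: wraps the per-metric config workflow and reports
// whether adding the metric succeeded for the chosen activity kind.
using MetricProbe = std::function<bool(const std::string&)>;

// Option 2: try every enumerated metric and keep only those the probe accepts.
inline MetricPartition partitionMetrics(const std::vector<std::string>& names,
                                        const MetricProbe& canSchedule) {
    MetricPartition result;
    for (const std::string& name : names) {
        (canSchedule(name) ? result.schedulable : result.skipped).push_back(name);
    }
    return result;
}

// Option 1: name-based heuristic, kept only to compare against the probe.
// As noted above, it can disagree with what the config workflow accepts.
inline bool nameSuggestsRealtime(const std::string& name) {
    return name.find("realtime") != std::string::npos;
}
```

The probe would be called once per name returned by NVPW_MetricsEvaluator_GetMetricNames, so the cost scales with the number of metrics on the device. Caching the resulting partition per chip would avoid repeating the work on every run.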

@tburgess Can you tell me which activity kind you are setting in your workflow?

@ssubudhi I am setting the activityKind to NVPA_ACTIVITY_KIND_PROFILER.