Documenting the number of Kernel passes required for CUPTI metrics/events

Hi,

Would you please provide easy-to-follow documentation about the number of kernel passes required for CUPTI metrics and events?

Thanks

Hi Pouya

  1. For events, the number of passes can be queried using the API cuptiEventGroupSetsCreate(). Profiling a single event always takes a single pass. Multiple passes might be required when several events are profiled together.

Code snippet showing how this can be done:

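CUpti_EventGroupSets *eventGroupSets = NULL;
// the size passed to the API is in bytes, not in number of events
size_t eventIdArraySize = sizeof(CUpti_EventID) * numEvents;
CUpti_EventID *eventIdArray = (CUpti_EventID *)malloc(eventIdArraySize);
// fill in event Ids
// check the returned CUptiResult against CUPTI_SUCCESS before using eventGroupSets
cuptiEventGroupSetsCreate(context, eventIdArraySize, eventIdArray, &eventGroupSets);

// the number of event group sets is the number of passes required
uint32_t passes = eventGroupSets->numSets;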
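
To give some intuition for why one event fits in a single pass while several events together may not, here is a toy model in plain C. It is only a sketch under stated assumptions, not how CUPTI actually schedules events: it assumes every event occupies one hardware counter in its domain and that every domain offers the same number of counters per pass. The function name passes_needed, MAX_DOMAINS, and the counter limit are all hypothetical; the real answer should always come from eventGroupSets->numSets.

```c
/* Toy model only: NOT the CUPTI scheduling algorithm.
 * Assumption: each event uses one counter in its domain, and each domain
 * has countersPerDomain counters available in a single pass. */
#define MAX_DOMAINS 8 /* hypothetical upper bound on domain indices */

/* Returns the modelled number of passes, or 0 on invalid input. */
static unsigned passes_needed(const unsigned *eventDomain, unsigned numEvents,
                              unsigned countersPerDomain)
{
    unsigned perDomain[MAX_DOMAINS] = {0};
    unsigned passes = 0;

    if (countersPerDomain == 0)
        return 0;

    /* Count how many requested events fall into each domain. */
    for (unsigned i = 0; i < numEvents; ++i) {
        if (eventDomain[i] >= MAX_DOMAINS)
            return 0;
        perDomain[eventDomain[i]]++;
    }

    /* The busiest domain decides the pass count: ceil(events / counters). */
    for (unsigned d = 0; d < MAX_DOMAINS; ++d) {
        unsigned need = (perDomain[d] + countersPerDomain - 1) / countersPerDomain;
        if (need > passes)
            passes = need;
    }
    return passes;
}
```

With 4 counters per domain in this model, a single event gives 1 pass, while five events in domain 0 plus one event in domain 1 give 2 passes, because domain 0 cannot hold all five at once.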

  2. For metrics, the number of passes can be queried using the API cuptiMetricCreateEventGroupSets(). Profiling a metric can take one or more passes, depending on the number and type of events it is calculated from.

Code snippet showing how this can be done:

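CUpti_EventGroupSets *eventGroupSets = NULL;
// the size passed to the API is in bytes, not in number of metrics
size_t metricIdArraySize = sizeof(CUpti_MetricID) * numMetrics;
// metricIdArray must be a pointer, since it holds the result of malloc()
CUpti_MetricID *metricIdArray = (CUpti_MetricID *)malloc(metricIdArraySize);
// fill in metric Ids
// check the returned CUptiResult against CUPTI_SUCCESS before using eventGroupSets
cuptiMetricCreateEventGroupSets(context, metricIdArraySize, metricIdArray, &eventGroupSets);

// the number of event group sets is the number of passes required
uint32_t passes = eventGroupSets->numSets;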