Difference between thread_inst_executed metrics

Hi there - I was wondering what the difference between the metrics smsp__thread_inst_executed and smsp__sass_thread_inst_executed is. I am guessing it is just the same?

These two metrics are equivalent, and their values are expected to be the same. What differs is how each one is collected:

  • smsp__thread_inst_executed - data is collected through a hardware performance monitor on the GPU
  • smsp__sass_thread_inst_executed - data is collected through software patching of the kernel instructions (this applies to all metrics which have “__sass” in the metric name). This method has a higher runtime collection overhead.
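
If you want to check this on your own kernel, one option is to collect both metrics in the same run and compare the values. Here is a minimal sketch, assuming the Nsight Compute CLI (`ncu`) is on your PATH. `./my_app` is a placeholder for your application, the helper name `build_ncu_command` is made up for illustration, and the `.sum` suffix is just one of the available rollups:

```python
import shlex

# The two metrics from this thread, using the .sum rollup.
METRICS = [
    "smsp__thread_inst_executed.sum",       # hardware performance monitor
    "smsp__sass_thread_inst_executed.sum",  # software patching of kernel instructions
]

def build_ncu_command(app, app_args=()):
    """Return an argv list asking ncu to collect both metrics for `app`.

    `build_ncu_command` is a hypothetical helper, not part of any NVIDIA tool.
    """
    return ["ncu", "--metrics", ",".join(METRICS), app, *app_args]

cmd = build_ncu_command("./my_app")  # "./my_app" is a placeholder
print(shlex.join(cmd))
# To actually profile, run the command, e.g.:
#   import subprocess; subprocess.run(cmd, check=True)
```

Expect the run to take longer than one that collects only the hardware metric, since the “__sass” metric requires patching the kernel.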

Thank you so much! This is really good to know.

Is there documentation on the specific metrics that says which ones are hardware counters and such?

All metrics which have “__sass” in the metric name are software based. Other metrics can be based on hardware counters, software counters, or both. This detail is currently not included in the documentation.
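
Until the documentation covers this, the naming rule can be applied mechanically to a list of metric names. A rough sketch; the function name `collection_method` and the label strings are my own, and the only thing it encodes is the “__sass” rule stated above:

```python
def collection_method(metric_name):
    """Classify a metric name using the "__sass" naming rule.

    Hypothetical helper: it only inspects the name and does not query the tool.
    """
    if "__sass" in metric_name:
        return "software (kernel patching)"
    # Everything else may be backed by hardware and/or software counters.
    return "hardware and/or software"

for name in ("smsp__thread_inst_executed", "smsp__sass_thread_inst_executed"):
    print(f"{name}: {collection_method(name)}")
```

The names to feed in can be taken from the output of `ncu --query-metrics`, which lists the metrics available on your GPU.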

Refer to the Metrics Guide section in the Kernel Profiling Guide.

The Overhead sub-section of the Kernel Profiling Guide briefly describes the different types of metrics.

Thank you so much!

You may want to look at this topic, too.
