Obtaining Shader Core utilization

Greetings all!

I am trying to analyze the performance of PageRank on an NVIDIA GPU. As a first step in characterizing this application, I would like to measure the utilization of the shader core (an average would do). In other words, I want to measure the percentage of time the shader core spends doing useful computation and the percentage of time it is stalled. I tried nvprof, but it only gives me the function-by-function breakdown. Can you suggest a way to measure this metric?


I would use nvprof. It's not clear what your objection to it is. For example, the sm_efficiency metric reports the percentage of time that at least one warp is active on an SM, averaged over all SMs on the GPU. That is the closest thing to an average "busy versus idle" figure for the shader cores.
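To get metrics rather than the default per-kernel timing summary, nvprof has to be asked for them with --metrics. Below is a minimal sketch in Python that only assembles such a command line. The binary name ./pagerank is a placeholder, and the metrics beyond sm_efficiency are my own picks for the "stalled" side of the question. Which metrics are available depends on the GPU and CUDA version, so check the output of nvprof --query-metrics first.

```python
# Sketch: build an nvprof command line that collects SM-level metrics.
# "./pagerank" is a hypothetical binary name; substitute your own application.
# Metric availability varies by GPU; confirm with `nvprof --query-metrics`.

DEFAULT_METRICS = (
    "sm_efficiency",            # % of time at least one warp is active on an SM
    "issue_slot_utilization",   # % of issue slots that issued an instruction
    "stall_memory_dependency",  # % of stalls waiting on memory operations
    "stall_exec_dependency",    # % of stalls waiting on an earlier instruction
)


def build_nvprof_command(app, app_args=(), metrics=DEFAULT_METRICS):
    """Return the argv list for profiling `app` with the given nvprof metrics."""
    if not metrics:
        raise ValueError("at least one metric is required")
    # nvprof takes the metric names as a single comma-separated argument.
    return ["nvprof", "--metrics", ",".join(metrics), app, *app_args]


def format_command(argv):
    """Join argv into a single line suitable for pasting into a shell."""
    return " ".join(argv)
```

Calling format_command(build_nvprof_command("./pagerank")) gives a line you can paste into a shell. nvprof then reports the minimum, maximum, and average of each metric per kernel, and the average column of sm_efficiency is the number the question asks for.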