Hi,
Is there a way to configure the GPU to collect a counter (instruction_executed, for example) and read it from inside the kernel?
- I know that CUPTI lets you configure which events (counters) you want to collect. Is there a way to do it directly, without CUPTI?
- I know that you can read the %pm (performance monitor) special registers from the kernel. Are these the registers that store the profiling data? If so, how do I know which of them holds the instruction_executed counter (after I have configured the GPU to collect this event)?
I wish I could read profiler counters the same way I read the %clock register, for example to see how many instructions have been executed on the current SM since the beginning of the invocation.
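To illustrate what I mean, here is a minimal sketch of the access pattern I have in mind. The kernel and helper names (probe, read_clock, read_pm0) are made up for illustration; only %clock and %pm0 are real PTX special registers. I am assuming that, without any configuration, the value read from %pm0 is not meaningful, and that is exactly the part I do not know how to set up.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: read the per-SM cycle counter via inline PTX.
__device__ __forceinline__ unsigned read_clock() {
    unsigned v;
    asm volatile("mov.u32 %0, %%clock;" : "=r"(v));
    return v;
}

// Hypothetical helper: read performance monitor register 0.
// Assumption: what %pm0 counts depends on how the GPU was configured
// beforehand; unconfigured, the value is not meaningful.
__device__ __forceinline__ unsigned read_pm0() {
    unsigned v;
    asm volatile("mov.u32 %0, %%pm0;" : "=r"(v));
    return v;
}

__global__ void probe(unsigned *out, float *sink) {
    unsigned c0 = read_clock();
    unsigned p0 = read_pm0();

    // Some work to measure.
    float acc = 0.0f;
    for (int i = 0; i < 1000; ++i) acc += i * 0.5f;

    unsigned c1 = read_clock();
    unsigned p1 = read_pm0();

    if (blockIdx.x == 0 && threadIdx.x == 0) {
        *sink = acc;          // keep the loop from being optimized away
        out[0] = c1 - c0;     // elapsed cycles on this SM
        out[1] = p1 - p0;     // delta of whatever %pm0 is counting
    }
}

int main() {
    unsigned *d_out, h_out[2] = {0, 0};
    float *d_sink;
    cudaMalloc(&d_out, sizeof(h_out));
    cudaMalloc(&d_sink, sizeof(float));

    probe<<<1, 32>>>(d_out, d_sink);
    cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);

    printf("clock delta: %u, pm0 delta: %u\n", h_out[0], h_out[1]);

    cudaFree(d_out);
    cudaFree(d_sink);
    return 0;
}
```

The %clock part gives a usable number as is. What I am missing is the step that makes the %pm0 delta correspond to instruction_executed.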
Thanks,
Natan