How to get the number of instructions executed by each thread?

Hi all,

I am wondering if there is a way to get the number of instructions executed by each thread. The CUDA profiler only reports the total instruction count (i.e., summed over all threads), not a per-thread count. I also checked CUPTI and found nothing. Thanks!

Bo

Try cuobjdump or nvdisasm?
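
For what it's worth, both tools only disassemble the binary, so what you get is a static SASS listing per kernel, not a runtime count. For straight-line code with no loops or divergent branches, that static count should be roughly what each thread executes; otherwise it is only a starting point. Below is a rough sketch of tallying the listing in Python. The function name count_static_instructions and the assumed text layout (a "Function : <name>" header, then lines starting with an address comment like /*0008*/) are my own assumptions for illustration, and the layout may differ between toolkit versions.

```python
import re
import sys
from collections import Counter

# Assumed layout of `cuobjdump -sass` text output: a "Function : <name>"
# header per kernel, then one line per instruction that starts with an
# address comment such as /*0008*/ and ends the instruction with ';'.
FUNC_RE = re.compile(r"^\s*Function\s*:\s*(\S+)")
INSN_RE = re.compile(r"^\s*/\*[0-9a-fA-F]{4,}\*/\s+(\S.*?);")


def count_static_instructions(sass_text):
    """Return {kernel_name: Counter(opcode -> static occurrence count)}."""
    per_kernel = {}
    current = None
    for line in sass_text.splitlines():
        m = FUNC_RE.match(line)
        if m:
            # Start (or resume) the tally for this kernel.
            current = per_kernel.setdefault(m.group(1), Counter())
            continue
        m = INSN_RE.match(line)
        if m and current is not None:
            tokens = m.group(1).split()
            # Skip a leading predicate guard such as @P0 or @!P1.
            if tokens[0].startswith("@") and len(tokens) > 1:
                tokens = tokens[1:]
            # Keep only the base opcode, e.g. "IADD.X" -> "IADD".
            current[tokens[0].split(".")[0]] += 1
    return per_kernel


if __name__ == "__main__" and len(sys.argv) > 1:
    # Reads a text file holding a saved SASS dump and prints per-kernel totals.
    with open(sys.argv[1]) as f:
        counts = count_static_instructions(f.read())
    for kernel, ops in counts.items():
        print(kernel, sum(ops.values()), dict(ops))
```

You would save the output of cuobjdump -sass on your binary to a text file and pass that file to the script. Keep in mind this says nothing about how many times each thread goes around a loop or which side of a branch it takes.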

Bo,

Did you ever figure out the answer to your question? Take a look at the sass_source_map example in CUPTI.

Bob