Hi,
I am trying to export the SM occupancy of each kernel together with its execution time. One thing I noticed in Nsight Systems is that the bins appear to be continuous across kernels: if a bin shows 10% at the end of kernel 1, kernel 2 also starts at 10%. Is this a coincidence, or should I skip the bins that carry over from the previous kernel?
Also, from this blog post:
"Formally, the quantity reported is an average of SM activity. 50% may mean that all SMs are active 50% of the time, or 50% of the SMs are active 100% of the time, or anything in between, but in practice, our earlier definition prevails."
This makes me wonder: how can I capture the execution time of a kernel together with the SM occupancy during that execution?
Thank you
Could you attach a screenshot showing the contiguous 10% utilization?
Regarding your second question - this thread seems to cover it: How to get dram throughput in Nsight system?
You could probably use the script I provided with a few changes. Let me know if that is what you’re looking for.
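In case it helps, here is a rough sketch of the pairing step itself. It is not the script from that thread. It assumes you have already pulled the kernel intervals and the metric samples out of the report (for instance from an SQLite export) into plain Python lists. The function name, the tuple layouts, and the assumption that each sample's timestamp marks the start of its sampling window are all hypothetical, so adjust them to whatever your export actually contains.

```python
def per_kernel_average(kernels, samples, period_ns, only_full=False):
    """Pair each kernel with the metric samples that overlap it.

    kernels:   list of (name, start_ns, end_ns)            -- hypothetical layout
    samples:   list of (window_start_ns, value_percent)    -- hypothetical layout
    period_ns: length of one sampling window (100_000 ns at 10 kHz)
    only_full: if True, ignore samples that are only partly inside the kernel
    """
    results = []
    for name, k_start, k_end in kernels:
        weighted_sum = 0.0
        total_overlap = 0
        for s_start, value in samples:
            s_end = s_start + period_ns
            # Length of the intersection between the sample window and the kernel
            overlap = min(k_end, s_end) - max(k_start, s_start)
            if overlap <= 0:
                continue
            if only_full and overlap < period_ns:
                continue  # boundary sample, may include idle time outside the kernel
            weighted_sum += value * overlap
            total_overlap += overlap
        results.append({
            "name": name,
            "duration_ns": k_end - k_start,
            "avg_pct": weighted_sum / total_overlap if total_overlap else None,
        })
    return results


# Made-up numbers: two kernels separated by a 40 us gap, sampled at 10 kHz.
kernels = [("kernel_1", 0, 250_000), ("kernel_2", 290_000, 600_000)]
samples = [(0, 100.0), (100_000, 100.0), (200_000, 60.0),
           (300_000, 100.0), (400_000, 100.0), (500_000, 100.0)]

all_samples = per_kernel_average(kernels, samples, period_ns=100_000)
full_only = per_kernel_average(kernels, samples, period_ns=100_000, only_full=True)
for row in all_samples:
    print(row)
```

With `only_full=True` the samples that straddle a kernel boundary are dropped, which is one way of "skipping" the carried-over bins; the price is that very short kernels may end up with no samples at all (`avg_pct` is `None`).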
Hi @pkovalenko,
Thank you for your answer, I will check the thread for my second question.
For the first question: as we can see, there is a gap between the kernel launches, but the occupancy at the start of the next kernel is somewhat low before it jumps back to high.
As you can see, the sample only partially covers the gap between kernel launches, and the second kernel launch is associated with 100% utilization in the first two samples. The value you observe in the sample covering the gap is a weighted average of 0% (between kernel launches) and 100% (at the start of the second kernel launch). Using a higher sampling rate should make the picture clearer. Alternatively, you can just keep this in mind and continue profiling at the default 10 kHz rate (if that is what you are using).
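To make the weighted average concrete, here is a small sketch with made-up numbers. It assumes an idealized model in which utilization is exactly 0% before the kernel starts and exactly 100% afterwards; the function name and the timings are hypothetical, not taken from the report.

```python
def sample_value(window_start_us, window_end_us, kernel_start_us,
                 idle_pct=0.0, busy_pct=100.0):
    """Value reported for one sampling window that may straddle a kernel start."""
    period = window_end_us - window_start_us
    # Clamp the kernel start into the window so fully idle/busy windows also work
    k = min(max(kernel_start_us, window_start_us), window_end_us)
    idle_time = k - window_start_us   # part of the window before the kernel
    busy_time = window_end_us - k     # part of the window inside the kernel
    return (idle_pct * idle_time + busy_pct * busy_time) / period


# 10 kHz sampling -> 100 us windows. The kernel starts 90 us into the window,
# so the window is idle for 90 us and busy for 10 us.
coarse = sample_value(0.0, 100.0, kernel_start_us=90.0)
print(coarse)  # 10.0

# 100 kHz sampling -> 10 us windows. The same instant now falls on a window
# boundary, so the neighbouring windows read fully idle and fully busy.
fine_before = sample_value(80.0, 90.0, kernel_start_us=90.0)
fine_after = sample_value(90.0, 100.0, kernel_start_us=90.0)
print(fine_before, fine_after)  # 0.0 100.0
```

So a low value such as 10% at the start of a kernel does not have to mean the kernel itself ran at 10%; in this model it only means the window was mostly idle time before the launch.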