Extract memory bandwidth and SM occupancy from ncu-rep file

I just ran a PyTorch model and collected profiling results with ncu. Right now I have to check SM occupancy and global memory bandwidth for each kernel manually during inference, which is very cumbersome! Is there a quick way to extract this information for all kernels so that I can visualize it with matplotlib? Thanks

You can use the Nsight Compute Python Report Interface to extract information from an ncu-rep file.
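A rough sketch of that route, in case it helps. The ncu_report module ships with Nsight Compute (typically under extras/python in the install directory, which has to be on your PYTHONPATH). The helper names load_report_file and collect_metrics are made up for illustration, and the report path and the dram__bytes.sum.per_second metric in the usage comment are assumptions; check that the metrics you ask for were actually collected in your report.

```python
def load_report_file(path):
    """Open an .ncu-rep file with the Nsight Compute Python Report Interface."""
    # Imported lazily: ncu_report is not on PyPI, it comes with Nsight Compute.
    import ncu_report
    return ncu_report.load_report(path)


def collect_metrics(report, metric_names):
    """Return one dict per profiled kernel with the requested metric values.

    A metric that is missing from a kernel's results is stored as None.
    """
    rows = []
    for range_idx in range(report.num_ranges()):
        rng = report.range_by_idx(range_idx)
        for action_idx in range(rng.num_actions()):
            action = rng.action_by_idx(action_idx)  # one profiled kernel launch
            row = {"kernel": action.name()}
            for name in metric_names:
                metric = action.metric_by_name(name)
                row[name] = metric.as_double() if metric is not None else None
            rows.append(row)
    return rows


# Usage (path and metric choice are assumptions, adjust to your report):
# report = load_report_file("profile.ncu-rep")
# rows = collect_metrics(report, [
#     "sm__warps_active.avg.pct_of_peak_sustained_active",  # achieved occupancy
#     "dram__bytes.sum.per_second",                         # DRAM throughput
# ])
```

The resulting list of dicts can be passed straight to matplotlib, e.g. one bar per kernel.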

The ncu --metrics option, combined with the --import and --page raw options, can also be used to get the required metric values for all kernels. Adding the --csv option produces CSV output, which is easier to parse.

e.g.

ncu  --metrics launch__grid_size,sm__warps_active.avg.pct_of_peak_sustained_active --page raw  --csv --import c:\temp\vectorAdd_set_full.ncu-rep
"ID","Process ID","Process Name","Host Name","Kernel Name","Kernel Time","Context","Stream","launch__grid_size","sm__warps_active.avg.pct_of_peak_sustained_active"
"","","","","","","","","","%"
"0","12440","vectorAdd.exe","127.0.0.1","vectorAdd(const float *, const float *, float *, int)","2021-Nov-08 19:56:53","1","7","196","77.969567"
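To get from that CSV output to a plot, something along these lines should work. This is a minimal sketch that assumes the layout shown above (a header row starting with "ID", then a units row, then one row per kernel); parse_ncu_csv and plot_metric are hypothetical helper names, not part of ncu.

```python
import csv


def parse_ncu_csv(text, metric_names):
    """Parse `ncu --page raw --csv` output into (units, rows).

    units maps column name -> unit string; rows holds one dict per kernel.
    """
    lines = text.splitlines()
    # Skip anything printed before the CSV header.
    start = next(i for i, line in enumerate(lines) if line.startswith('"ID"'))
    reader = csv.reader(lines[start:])
    header = next(reader)
    units = dict(zip(header, next(reader)))  # second row carries the units
    rows = []
    for record in reader:
        if len(record) != len(header):
            continue  # ignore blank or malformed lines
        row = dict(zip(header, record))
        out = {"Kernel Name": row["Kernel Name"]}
        for name in metric_names:
            # Assumption: values may contain thousands separators.
            out[name] = float(row[name].replace(",", ""))
        rows.append(out)
    return units, rows


def plot_metric(rows, units, metric):
    """Draw one horizontal bar per kernel for the given metric."""
    import matplotlib.pyplot as plt  # imported lazily, only needed for plotting

    labels = [f'{i}: {r["Kernel Name"][:40]}' for i, r in enumerate(rows)]
    plt.barh(labels, [r[metric] for r in rows])
    plt.xlabel(f"{metric} [{units.get(metric, '')}]")
    plt.tight_layout()
    plt.show()
```

You could save the ncu output to a file (redirect stdout), read it back with open(...).read(), and call parse_ncu_csv on the text followed by plot_metric for each metric you care about.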