How to profile part of a kernel function with Nsight Compute

Hi. I would like to know how to profile only part of a kernel function with Nsight Compute.
I want to exclude the reads from and writes to global memory, so I only need the remaining part of the kernel function to be profiled.
Many thanks!

Hi, @markusxwr

Thanks for using Nsight Compute. You can refer to "2. Kernel Profiling Guide" in the Nsight Compute 12.4 documentation.

The Source View supports per-instruction and per-source-line counters. The tool does not support collecting hardware performance metrics for a subset of instructions or a range of code at this time. The Source View supports export to CSV. The CSV can be loaded into Excel (or Google Sheets), where you can sum ranges or exclude specific instructions (LD, ST) from the summation.
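If a spreadsheet is inconvenient, the same filtering can be scripted. The following is only a minimal sketch, not an Nsight Compute feature: it assumes the exported CSV has a column holding the SASS instruction text (called "Source" here) and a numeric counter column (called "Instructions Executed" here). Both column names are assumptions, so check the header row of your own export and adjust them.

```python
import csv
import io

def sum_metric(csv_text, metric_col, source_col="Source", skip_prefixes=("LD", "ST")):
    """Sum one per-instruction counter, skipping selected opcodes.

    source_col and metric_col are ASSUMED column names; use the ones
    that appear in the header row of your exported CSV.
    """
    total = 0.0
    for row in csv.DictReader(io.StringIO(csv_text)):
        parts = row[source_col].split()
        # Drop a leading predicate such as "@P0" or "@!P1".
        if parts and parts[0].startswith("@"):
            parts = parts[1:]
        opcode = parts[0] if parts else ""
        # Prefix match: "LD" also catches LDG, LDS, LDL, LDC.
        # Pass ("LDG", "STG") to exclude only global memory accesses.
        if opcode.startswith(tuple(skip_prefixes)):
            continue
        value = (row[metric_col] or "0").replace(",", "")
        total += float(value)
    return total

# Hypothetical sample data standing in for a real export.
SAMPLE = """Source,Instructions Executed
"LDG.E R2, [R4.64]",128
"FFMA R0, R2, R3, R0",256
"@P0 STG.E [R6.64], R0",64
"IADD3 R8, R8, 0x1, RZ",32
"""

if __name__ == "__main__":
    print(sum_metric(SAMPLE, "Instructions Executed"))
```

For a real export, read the file with open(path, newline="").read() and pass the text in. Note that this only re-aggregates counters that were already collected per instruction; it does not make the profiler collect metrics for a sub-range of the kernel.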

Are there specific sets of metrics that you want to see for a function, a range of source code, a range of instructions, etc.?
