Total kernel execution time

I’m profiling an application executable on Windows, and Nsight Compute gives me all the per-kernel information I need. But is there a way to see which kernel takes the most time in total? Currently I see one record per kernel invocation, whereas I need the total time for a specific kernel. For example, if a kernel is launched 10 times, I want to see the combined time of all 10 invocations. It would also be great to sort such “kernel groups” by total execution time. Is there a way to get this information from Nsight Compute? I believe the Nsight integration for Visual Studio has this capability, but I don’t have the source code, so I’m using Nsight Compute by attaching to the process. This information would be very helpful because it shows which kernels account for most of the application’s execution time. Thanks in advance!

For understanding system- and application-level performance (such as which kernels or API functions take the most time in your application), you should use Nsight Systems.
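If you already have one record per kernel invocation exported to CSV, the grouping described in the question can also be done with a short script. The following is only a minimal sketch: the column names `kernel_name` and `duration_ns` and the sample data are assumptions made for illustration, not the actual export format of Nsight Compute or Nsight Systems, so adjust them to whatever your export contains.

```python
import csv
import io
from collections import defaultdict


def total_time_per_kernel(rows):
    """Sum durations per kernel name and sort by total time, largest first.

    `rows` is an iterable of dicts with the hypothetical keys
    "kernel_name" and "duration_ns" (one dict per kernel invocation).
    Returns a list of (kernel_name, total_ns, launch_count) tuples.
    """
    totals = defaultdict(lambda: [0, 0])  # name -> [total_ns, launch_count]
    for row in rows:
        entry = totals[row["kernel_name"]]
        entry[0] += int(row["duration_ns"])
        entry[1] += 1
    return sorted(
        ((name, total, count) for name, (total, count) in totals.items()),
        key=lambda item: item[1],
        reverse=True,
    )


def load_rows(text):
    """Parse CSV text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


# Made-up sample data standing in for a per-invocation export.
SAMPLE_CSV = """kernel_name,duration_ns
matmul,1200
reduce,300
matmul,1100
reduce,350
copy,900
"""

if __name__ == "__main__":
    for name, total_ns, launches in total_time_per_kernel(load_rows(SAMPLE_CSV)):
        print(f"{name}: {total_ns} ns over {launches} launches")
```

To use it on a real export, read the file with `csv.DictReader` instead of the sample string and rename the two keys to match your column headers.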
