Is there a way to use NVIDIA Nsight Compute to profile the whole application instead of profiling by kernels?
Nsight Compute is a GPU kernel profiler. You can profile one or more kernels.
To profile the application as a whole, you can use Nsight Systems.
Hi Sanjiv, I am interested in profiling the entire application and extracting information such as GPU Speed of Light, the Roofline chart, Compute Workload Analysis, and Memory Workload Analysis. Is there a way to do this with Nsight Systems? If not, what profiling tool do you recommend?
You cannot get the same sections or all of the same metrics with Nsight Systems. However, recent versions can collect a limited set of metrics over the runtime of the application. The feature is called GPU Metrics sampling and is available starting with Nsight Systems version 2021.2. You can find the documentation for it here.
These metrics and analysis results are mostly relevant for GPU kernels and you can collect them using Nsight Compute. You can profile the entire application using Nsight Compute and get these metrics for each kernel launch. Based on these you could do some equivalent analysis at the application level on your own (e.g. by exporting the Nsight Compute metric values to a CSV file).
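As a rough sketch of that kind of application-level analysis, the Python snippet below reads a per-kernel CSV export (for example, output captured from ncu with the --csv option) and computes a duration-weighted average of one metric across all kernel launches. The column names ("ID", "Metric Name", "Metric Value") and the metric labels ("Duration", "Compute (SM) Throughput") are assumptions about the export layout and may differ between Nsight Compute versions, so check them against your own file. The sample data is made up for illustration only.

```python
import csv
import io
from collections import defaultdict


def aggregate_app_level(csv_text, time_metric, value_metric,
                        id_col="ID", name_col="Metric Name",
                        value_col="Metric Value"):
    """Return (total_time, time-weighted mean of value_metric) over all launches.

    Column and metric names are assumptions about the CSV layout; adjust
    them to match the header of your actual export.
    """
    # Group the long-format rows (one row per metric) by kernel launch ID.
    per_launch = defaultdict(dict)
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            # Values may contain thousands separators such as "1,234.5".
            value = float(row[value_col].replace(",", ""))
        except ValueError:
            continue  # skip non-numeric metric values
        per_launch[row[id_col]][row[name_col]] = value

    total_time = 0.0
    weighted_sum = 0.0
    for metrics in per_launch.values():
        if time_metric in metrics and value_metric in metrics:
            t = metrics[time_metric]
            total_time += t
            weighted_sum += t * metrics[value_metric]

    mean = weighted_sum / total_time if total_time else 0.0
    return total_time, mean


# Made-up sample in the assumed layout. Note: this sketch assumes every
# duration row uses the same unit; a real export may mix units per row.
SAMPLE = '''"ID","Kernel Name","Metric Name","Metric Unit","Metric Value"
"0","kernel_a","Duration","usecond","100"
"0","kernel_a","Compute (SM) Throughput","%","80"
"1","kernel_b","Duration","usecond","300"
"1","kernel_b","Compute (SM) Throughput","%","40"
'''

total, mean = aggregate_app_level(SAMPLE, "Duration", "Compute (SM) Throughput")
print("total kernel time:", total)
print("time-weighted throughput:", mean)
```

Weighting by kernel duration keeps short kernels from dominating the average. Note that this only covers time spent inside kernels; gaps between launches, memory copies, and CPU-side work are not represented, which is where Nsight Systems is the better fit.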