Time column in "nsys stats"

mahmood.nt · September 14, 2020, 3:04pm

Hi,
I would like to know how “nsys stats” reports the GPU time percentage number for kernels. Please see the following line:

 Time(%)  Total Time (ns)  Instances   Average   Minimum  Maximum                                                  Name                 
 -------  ---------------  ---------  ---------  -------  -------  ----------------------------------------------------------------------------------------------------
    51.8    1,854,648,457      4,554  407,257.0  332,256  436,716  nbnxn_kernel_ElecEw_VdwLJFsw_F_cuda(cu_atomdata, cu_nbparam, Nbnxm::gpu_plist, bool)

So, it is 51.8%.
Now, looking at the picture below
https://pasteboard.co/Jr3CvTo.jpg
which is the output of nsys-ui, I guess the total time calculation should be

0.999 * 0.575 * 0.745 * 0.882 = 0.377
or 37.7%.
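
The arithmetic above can be checked with a short sketch. It only multiplies the four nested fractions quoted in the post; the variable names are made up for illustration, and nothing here is read from nsys itself.

```python
import math

# The four nested fractions quoted in the post, taken from the nsys-ui
# screenshot (99.9%, 57.5%, 74.5%, 88.2%). Names are illustrative only.
nested_fractions = [0.999, 0.575, 0.745, 0.882]

# Chaining "X% of a parent that is Y% of its parent ..." is a plain product.
overall = math.prod(nested_fractions)

print(f"{overall:.3f}")  # prints 0.377
```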

Am I right? What is missing here?

jkreibich · October 7, 2020, 4:31pm

This is saying that all executions of this kernel together account for 51.8% of the summed execution time of the items listed in this report, not 51.8% of the application's run time.

The help explains it better:

Note that the “Time(%)” column is calculated using a summation of the “Total Time” column, and represents that API call’s, kernel’s, or memory operation’s percent of the execution time of the APIs, kernels and memory operations listed, and not a percentage of the application wall or CPU execution time.
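
As a rough illustration of that note, here is a minimal sketch of how such a percentage could be computed. The row names, the durations, and the wall-clock value are all made-up numbers, and `time_percent` is a hypothetical helper, not part of nsys.

```python
def time_percent(total_times_ns):
    """Return each row's share of the summed "Total Time" column, in percent."""
    grand_total = sum(total_times_ns.values())
    return {name: 100.0 * t / grand_total for name, t in total_times_ns.items()}

# Hypothetical report rows (made-up durations in ns, not from the report above).
rows = {"kernel_a": 600, "kernel_b": 300, "memcpy_HtoD": 100}
shares = time_percent(rows)

# Hypothetical wall-clock duration of the whole run, also made up.
wall_time_ns = 2000
wall_share = 100.0 * rows["kernel_a"] / wall_time_ns

print(shares["kernel_a"])  # share of the listed rows: 60.0
print(wall_share)          # share of wall time: 30.0
```

The same kernel shows up as 60% in one view and 30% in the other purely because the denominators differ, which is the kind of gap the note warns about.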

Here is the full help output:

$ nsys stats --help-reports apigpusum

apigpusum[:base] -- CUDA API & GPU Summary (CUDA API + kernels + memory ops)

    base - Optional argument, if given, will cause summary to be over the
           base name of the kernel, rather than the templated name.

    Output: All time values given in nanoseconds
        Time(%) : Percentage of "Total Time"
        Total Time : The total time used by all executions of this kernel
        Instances: The number of executions of this object
        Average : The average execution time of this kernel
        Minimum : The smallest execution time of this kernel
        Maximum : The largest execution time of this kernel
        Category : The category of the operation
        Operation : The name of the kernel

    This report provides a summary of CUDA API calls, kernels and memory
    operations, and their execution times. Note that the "Time(%)"
    column is calculated using a summation of the "Total Time" column,
    and represents that API call's, kernel's, or memory operation's
    percent of the execution time of the APIs, kernels and memory
    operations listed, and not a percentage of the application wall or
    CPU execution time.

    This report combines data from the "cudaapisum", "gpukernsum", and
    "gpumemsizesum" reports.  It is very similar to profile section of
    "nvprof --dependency-analysis".

mahmood.nt · October 8, 2020, 5:48pm

But the difference between that number and what I see in nsys-ui is large. nsys-ui also says that a kernel takes X percent of a stream, that the stream takes Y percent, and …