What is the difference between 'GPU activities' and 'API calls' in the results of 'nvprof'?

I don’t understand why the same operation shows a different time in each section, so I don’t know which time is the right one.

For example (the picture would not attach, so I am pasting the output as text):

           Type  Time(%)      Time  Calls  Name
GPU activities:   47.04%  54.011ms     20  [CUDA memcpy HtoD]
                   0.02%  19.939us      2  [CUDA memcpy DtoH]
     API calls:   13.16%  60.844ms      2  cuMemcpyDtoH
                  12.01%  55.539ms     20  cuMemcpyHtoD

The ‘GPU activities’ section lists work that executes on the GPU, such as CUDA kernels, CUDA memcpy operations, and CUDA memset operations. The timing shown there is the execution time on the GPU.

The ‘API calls’ section lists CUDA Runtime/Driver API calls. The timing shown there is the time spent in each call on the host.

For example, CUDA kernel launches are asynchronous from the CPU’s point of view. The launch call returns immediately, before the kernel has completed and perhaps before it has even started. The time spent in that call is reported for the launch API, such as cuLaunchKernel, under ‘API calls’. The kernel eventually starts executing on the GPU and runs to completion, and that execution time is reported for the kernel under ‘GPU activities’. So neither number is wrong: they measure different things.
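To see the same split outside of nvprof, here is a minimal sketch that times one kernel launch both ways. It assumes the CUDA Runtime API rather than the Driver API shown in the output above. The kernel name `busy_kernel` and its parameters are made up for illustration. The host-side timer roughly corresponds to what ‘API calls’ reports for the launch, and the CUDA event pair roughly corresponds to what ‘GPU activities’ reports for the kernel.

```cuda
#include <chrono>
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: it only burns GPU time so the two measurements differ visibly.
__global__ void busy_kernel(float *data, int n, int iters) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float v = data[i];
        for (int k = 0; k < iters; ++k) v = v * 1.0001f + 0.5f;
        data[i] = v;
    }
}

int main() {
    const int n = 1 << 20;
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    float *d_data = nullptr;
    if (cudaMalloc(&d_data, n * sizeof(float)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }
    cudaMemset(d_data, 0, n * sizeof(float));

    // Warm-up launch so one-time initialization is not counted in either timer.
    busy_kernel<<<blocks, threads>>>(d_data, n, 1);
    cudaDeviceSynchronize();

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start);                        // GPU-side timestamp before the kernel
    auto t0 = std::chrono::steady_clock::now();    // host-side timer around the launch call only
    busy_kernel<<<blocks, threads>>>(d_data, n, 10000);
    auto t1 = std::chrono::steady_clock::now();    // launch has returned; kernel may still be running
    cudaEventRecord(stop);                         // GPU-side timestamp after the kernel
    cudaEventSynchronize(stop);                    // wait until the kernel and stop event complete

    float gpu_ms = 0.0f;
    cudaEventElapsedTime(&gpu_ms, start, stop);    // approximate kernel execution time on the GPU
    double host_ms = std::chrono::duration<double, std::milli>(t1 - t0).count();

    printf("host time in launch call: %.3f ms\n", host_ms);
    printf("GPU time for kernel:      %.3f ms\n", gpu_ms);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d_data);
    return 0;
}
```

On most systems the host time should come out much smaller than the GPU time, because the launch only enqueues the work. Blocking calls behave the other way around: the host call has to wait, so its time is at least the GPU time. In the output above, cuMemcpyHtoD at 55.539ms is close to the 54.011ms of [CUDA memcpy HtoD], which fits a call that waits for most of the copy. The much larger cuMemcpyDtoH time may include waiting for earlier GPU work to finish, although the output alone does not confirm that.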