[NSys Timeline] End - Start is not the same as latency

puneethnaik · December 27, 2022, 7:35am

Here is a screenshot of a profiling session.

In the yellow box that appears when hovering over a kernel call in stream 15, End - Start is 46.531 microseconds, but the latency is reported as 7.347 microseconds. Why are they not the same? What does End - Start capture that latency does not? I also noticed that the latency in the yellow box matches the value shown for the corresponding kernel launch call in the CUDA API events log in NSys.
Thanks

hwilper · December 28, 2022, 7:28pm

I think we have a terminology mismatch. Can you take a look at this blog post: https://developer.nvidia.com/blog/understanding-the-visualization-of-overhead-and-latency-in-nsight-systems/

thanks!

puneethnaik · December 30, 2022, 5:58am

A well-written piece indeed. Thanks for the article, it clears up my doubt. So latency is the time between when the API call was enqueued and when the GPU started executing the kernel, and the duration in the CUDA API trace is the CPU wrapper overhead.
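
To make the three quantities concrete, here is a minimal Python sketch of how they relate. It is only an illustration under stated assumptions: the `KernelLaunch` class and its field names are hypothetical and not part of Nsight Systems, the timestamps are made up so that they reproduce the 7.347 and 46.531 microsecond figures from the question, the API call duration of 5.000 microseconds is invented, and latency is assumed to be measured from the start of the launch API call.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KernelLaunch:
    """Hypothetical timestamps (integer nanoseconds) for one kernel launch."""
    api_start_ns: int     # CPU enters the kernel launch API call
    api_end_ns: int       # CPU returns from the kernel launch API call
    kernel_start_ns: int  # GPU begins executing the kernel
    kernel_end_ns: int    # GPU finishes executing the kernel

    def api_duration_ns(self) -> int:
        # Time spent on the CPU inside the launch call (wrapper overhead).
        return self.api_end_ns - self.api_start_ns

    def latency_ns(self) -> int:
        # Assumed definition: delay from the launch call to kernel start.
        return self.kernel_start_ns - self.api_start_ns

    def kernel_duration_ns(self) -> int:
        # End - Start of the kernel itself, i.e. GPU execution time.
        return self.kernel_end_ns - self.kernel_start_ns


# Made-up timestamps chosen to match the numbers in the question.
launch = KernelLaunch(
    api_start_ns=0,
    api_end_ns=5_000,               # invented API call duration
    kernel_start_ns=7_347,          # latency of 7.347 microseconds
    kernel_end_ns=7_347 + 46_531,   # kernel runs for 46.531 microseconds
)

print(f"API call duration: {launch.api_duration_ns() / 1000:.3f} us")
print(f"Latency:           {launch.latency_ns() / 1000:.3f} us")
print(f"Kernel End-Start:  {launch.kernel_duration_ns() / 1000:.3f} us")
```

The point of the sketch is that End - Start depends only on the two GPU-side timestamps, while latency spans from the CPU-side launch to the GPU-side start, so there is no reason for the two values to agree.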

hwilper · December 30, 2022, 3:30pm

Thanks, since I wrote it (with Jason and Bob).

Yup.
