Why is the time so different?
NVTX is a CPU API, so what you see on the CPU is the range of time it was active on the CPU.
We then project that range onto the GPU so that you can see which CUDA kernels are active during that range. The time you see there is the amount of time that any of those kernels were active on the GPU.
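So if the NVTX range covers not only CUDA code that ran on the GPU, but also CUDA operations that do no GPU work (or plain CPU-side work that never reaches the GPU), the range on the CPU will be longer than the projected range on the GPU. That is the difference you are seeing.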
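To make the difference concrete, here is a minimal sketch of the arithmetic in Python. It is a toy model, not how the tool computes anything internally: the timestamps are made up, and `union_duration` is a hypothetical helper that adds up the time during which at least one kernel is active.

```python
def union_duration(intervals):
    """Total time during which at least one (start, end) interval is active."""
    total = 0.0
    current_start = current_end = None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end:
            # Gap before this interval: close out the previous merged block.
            if current_end is not None:
                total += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlaps the current block: extend it if needed.
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start
    return total


# Hypothetical timestamps in milliseconds (assumed values, for illustration only).
cpu_range = (0.0, 10.0)             # NVTX range as seen on the CPU
kernels = [(4.0, 5.5), (5.0, 7.0)]  # GPU activity of kernels launched inside the range

cpu_time = cpu_range[1] - cpu_range[0]
gpu_time = union_duration(kernels)

print(f"NVTX range on CPU: {cpu_time} ms")
print(f"Kernels active on GPU: {gpu_time} ms")
```

With these numbers the range lasts 10 ms on the CPU, but the kernels are only active for 3 ms on the GPU. The remaining 7 ms is CPU-side work or CUDA calls that never put work on the GPU.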