Timing with the CUDA profiler

I used to time my CPU code from inside the host code, using the CUDA timer. Let’s say I got a 15X speedup.
Now I run my code through the CUDA profiler, and it reports the same time for both CPU and GPU. No speedup?

  1. How is that possible? Has anyone seen this before?
  2. How does the CUDA profiler measure CPU time?


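In the profiler, CPU time is the time from the start of a CUDA call until it returns to the CPU, and GPU time is the time from kernel start to kernel end on the GPU. So in general, CPU time = GPU time + a little overhead for transferring parameters to the GPU, etc. In other words, the profiler's CPU time is not the runtime of your CPU implementation; it is the host-side view of the same GPU call, which is why the two numbers come out nearly the same.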
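If it helps, here is a rough sketch of the two kinds of measurement, not what the profiler does internally. It assumes a made-up kernel `scale` and a hypothetical helper `measure_kernel`; the host-side number is wall-clock time around the launch plus the wait for completion, and the GPU-side number comes from CUDA events recorded around the kernel.

```cuda
#include <chrono>
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel, present only so there is something to time.
__global__ void scale(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

struct Timings {
    double cpu_ms;  // host wall-clock time: from the launch until the GPU has finished
    float gpu_ms;   // time between two events recorded on the GPU stream
};

// Hypothetical helper: times one launch of `scale` from both sides.
Timings measure_kernel(int n) {
    float *d_data = nullptr;
    cudaMalloc(&d_data, n * sizeof(float));
    cudaMemset(d_data, 0, n * sizeof(float));

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    int threads = 256;
    int blocks = (n + threads - 1) / threads;

    // Warm-up launch so one-time initialization cost is not measured.
    scale<<<blocks, threads>>>(d_data, n, 2.0f);
    cudaDeviceSynchronize();

    auto host_begin = std::chrono::steady_clock::now();
    cudaEventRecord(start);
    scale<<<blocks, threads>>>(d_data, n, 2.0f);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);  // block the host until the kernel and stop event are done
    auto host_end = std::chrono::steady_clock::now();

    Timings t;
    t.cpu_ms = std::chrono::duration<double, std::milli>(host_end - host_begin).count();
    cudaEventElapsedTime(&t.gpu_ms, start, stop);  // result is in milliseconds

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d_data);
    return t;
}

int main() {
    Timings t = measure_kernel(1 << 20);
    std::printf("host-side time: %.3f ms\n", t.cpu_ms);
    std::printf("GPU event time: %.3f ms\n", t.gpu_ms);
    return 0;
}
```

Both numbers describe the same GPU kernel. To get a speedup figure you would still have to time the CPU implementation separately, as in the original host-side measurement.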

Thank you very much for the reply. I don’t think that was made clear in the CUDA profiler README.