I would like to know why cputime differs from gputime. Shouldn't they be the same?
For example, on this line:
memcpyHtoD,10.880,8.877 → Does this mean that the GPU took 10.880 microseconds to execute the memcpy and the CPU took 8.877 microseconds?
Please help… I've looked through the readme, but it doesn't describe what those times mean.
GPU Time: the execution time of the method on the GPU.
CPU Time: the sum of the GPU time and the CPU overhead to launch that method. At the level of the driver-generated data, however, CPU time is only the CPU overhead to launch the method for non-blocking methods; for blocking methods it is the sum of the GPU time and the CPU overhead.

All kernel launches are non-blocking by default, but if any profiler counters are enabled, kernel launches become blocking. Asynchronous memory copy requests in different streams are non-blocking.
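
To make the blocking versus non-blocking distinction concrete, here is a minimal Python sketch that models the relationship described above and parses the row from the question. It is only an illustration, not part of the profiler: the names `ProfileRow`, `parse_row`, `reported_cputime` and `looks_non_blocking` are hypothetical, and the column order (method, gputime, cputime) is assumed from the way the question reads the row.

```python
from dataclasses import dataclass


@dataclass
class ProfileRow:
    method: str
    gputime_us: float
    cputime_us: float


def parse_row(line: str) -> ProfileRow:
    # Assumption: columns are method, gputime, cputime (in microseconds).
    method, gpu, cpu = [field.strip() for field in line.split(",")]
    return ProfileRow(method, float(gpu), float(cpu))


def reported_cputime(gputime_us: float, launch_overhead_us: float, blocking: bool) -> float:
    # Model of the description above:
    #   blocking method     -> cputime = gputime + CPU launch overhead
    #   non-blocking method -> cputime = CPU launch overhead only
    if blocking:
        return gputime_us + launch_overhead_us
    return launch_overhead_us


def looks_non_blocking(row: ProfileRow) -> bool:
    # Under this model a blocking call can never report cputime < gputime,
    # so a smaller cputime suggests the call was recorded as non-blocking.
    return row.cputime_us < row.gputime_us


row = parse_row("memcpyHtoD,10.880,8.877")
print(row.method, looks_non_blocking(row))
```

Under this model, a row whose cputime is smaller than its gputime, like the one in the question, is what you would expect when cputime covers only the launch overhead rather than the full execution on the GPU.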