Measure the kernel duration ...


Will this give the correct kernel run time if it is measured as follows?

gettimeofday(&begin, NULL);
mykernel<<<1,1>>>(dev_a, dev_b, …);
gettimeofday(&end, NULL);

Assume gettimeofday() is called properly …


I also measured the duration inside the kernel using the clock() function and got a larger time value than the kernel time measured above (which must be wrong …!). What might cause this timing error?

I am using CUDA 4.0 on Fermi …

Thank you.

Kernel launches are asynchronous: what you measure here is only the time to submit the job to the GPU, not the execution time. So the time value you got with clock() is not necessarily as inaccurate as you think. Try inserting cudaDeviceSynchronize() before the last gettimeofday() to get a more realistic number.
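
Here is a minimal sketch of that suggestion. The kernel body, its argument list, the buffer size n, and the helper elapsed_ms() are placeholders I made up for illustration, since the original kernel and its arguments are not shown. It takes one extra timestamp right after the launch returns, so you can compare the submission time with the time measured after synchronizing.

```cuda
#include <cstdio>
#include <sys/time.h>
#include <cuda_runtime.h>

// Hypothetical stand-in for mykernel; the real kernel is not shown in the post.
__global__ void mykernel(float *a, float *b, int n)
{
    // A single thread walks the whole array, matching the <<<1,1>>> launch.
    for (int i = 0; i < n; ++i)
        b[i] = a[i] * 2.0f;
}

// Hypothetical helper: difference between two timevals in milliseconds.
static double elapsed_ms(const timeval &begin, const timeval &end)
{
    return (end.tv_sec - begin.tv_sec) * 1e3 +
           (end.tv_usec - begin.tv_usec) * 1e-3;
}

int main()
{
    const int n = 1 << 20;  // assumed problem size
    float *dev_a = NULL, *dev_b = NULL;
    cudaMalloc((void **)&dev_a, n * sizeof(float));
    cudaMalloc((void **)&dev_b, n * sizeof(float));
    cudaMemset(dev_a, 0, n * sizeof(float));

    timeval begin, launched, end;

    // Make sure earlier GPU work (the memset) is done before starting the clock.
    cudaDeviceSynchronize();

    gettimeofday(&begin, NULL);
    mykernel<<<1, 1>>>(dev_a, dev_b, n);
    gettimeofday(&launched, NULL);  // launch has returned; kernel may still be running

    cudaDeviceSynchronize();        // block the host until the kernel has finished
    gettimeofday(&end, NULL);

    printf("launch only:       %.3f ms\n", elapsed_ms(begin, launched));
    printf("launch + execution: %.3f ms\n", elapsed_ms(begin, end));

    cudaFree(dev_a);
    cudaFree(dev_b);
    return 0;
}
```

The second number still includes launch overhead and the host-side cost of the synchronize call, so for very short kernels it remains an upper bound rather than the pure execution time.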

Thank you for the reply.