Measuring the time of a bunch of kernels: is there a simpler way?

The clock example from the CUDA sample projects shows how the time for executing a kernel is measured. I wonder if there is a simpler solution.

I assume that something like:


main(int argc, char** argv)
{
    c = clock();
    kernel<<<grid, block>>>(...);
    clocks = clock() - c;
}


yields no useful results, as the clock() function outside a kernel refers to CPU clocks and the CPU stalls until the kernel has finished execution.
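One detail worth noting: kernel launches are asynchronous, so without an explicit synchronization the second clock() call would measure little more than the launch overhead. A minimal host-side sketch with synchronization added (the dummy kernel, launch configuration, and buffer size are placeholders of my own):

```cuda
#include <cstdio>
#include <ctime>

__global__ void kernel(int* data)
{
    // placeholder work so the kernel takes measurable time
    data[threadIdx.x] = threadIdx.x;
}

int main()
{
    int* d_data;
    cudaMalloc(&d_data, 32 * sizeof(int));

    clock_t c = clock();
    kernel<<<1, 32>>>(d_data);
    cudaDeviceSynchronize();   // launches are asynchronous: without this,
                               // we would time only the launch, not the kernel
    clock_t clocks = clock() - c;

    printf("elapsed: %.3f ms\n", 1000.0 * clocks / CLOCKS_PER_SEC);
    cudaFree(d_data);
    return 0;
}
```

Even with the synchronization, the host clock() has coarse resolution, so this only makes sense for kernels that run for many milliseconds.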

In the clock example, only the thread with threadId 0 calls the clock() function. Is it therefore possible to measure time in the following way (I use only one block but several threads):

kernel() {

    if (threadIdx.x == 0) c = clock();

    // each thread copies some memory according to its threadIdx

    __syncthreads();

    if (threadIdx.x == 0) c = clock() - c;
}


The variable c would then hold approximately the execution time of all threads, since I call __syncthreads() and only thread 0 does its calculation after the __syncthreads().

Or is the example in the clocks project already the easiest way?



It is not only possible to call clock() in every thread - I strongly recommend doing so! The reason is that clock() returns a time stamp, so the differences you compute are wall-clock time, i.e. they include any stalls the thread experiences. To make sure you get a consistent view of what's going on, you really need to look at each thread individually.

For the purpose of getting only the maximum block execution time, you can use the code you showed above, but you need to add a __syncthreads() before the first clock() call to make sure that all threads have a common start time. Note, however, that each multiprocessor has only 8 ALUs, so for any block size larger than half a warp you won't get close to the real per-thread time.
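Following that recommendation, a per-thread variant might look like the sketch below, loosely modeled on the SDK clock sample; the timer-array layout and the doubling "work" are placeholder assumptions of mine:

```cuda
// Each thread records its own start and stop time stamps into global memory.
__global__ void timedKernel(clock_t* timer, int* data)
{
    int tid = threadIdx.x;

    __syncthreads();              // common start point for the whole block
    clock_t start = clock();

    data[tid] *= 2;               // stand-in for the real per-thread work

    clock_t stop = clock();
    timer[tid] = start;                // all start stamps first,
    timer[blockDim.x + tid] = stop;    // then all stop stamps
}
```

After copying timer back to the host, timer[blockDim.x + tid] - timer[tid] gives each thread's wall-clock cycle count, stalls included; the maximum over all threads of the block approximates the block execution time.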