how to measure the time elapsed (or no. of clock cycles) between the start and the end of a cuda thr

ghandurah · December 9, 2009, 10:44am

Hi everyone,

I am trying to compare the time taken by individual CUDA threads. I have an array of 1,000,000 floats and an array of 1,000,000 clock_t values, and I tried counting clock cycles with clock_t as follows:

__global__ void kernel(..........., clock_t* d_clocks) {
    int idx = ..........;

    clock_t start = clock();          // cycle counter when this thread starts

    // code

    d_clocks[idx] = clock() - start;  // elapsed cycles for this thread
}

I then copy the values of d_clocks to a host array called clocks, after calling cudaThreadSynchronize() right after the kernel call.

It outputs big numbers at the beginning, but after idx=1009 it always outputs 1!

Also, is there a way to measure time inside a kernel in milliseconds in Debug mode, not EmuDebug mode?

Thanks

CapJo · December 9, 2009, 12:22pm

The CUDA SDK contains examples that show how to do time measurement.

_PM · December 9, 2009, 10:27pm

Look at cudaEvent in the user’s manual.
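For readers unfamiliar with the event API mentioned here, the following is a minimal sketch of timing one kernel launch from the host with cudaEvent. The kernel `scale`, the helper `grid_size`, and the launch configuration are hypothetical and only exist to give the events something to measure. Note that this yields the time of the whole launch in milliseconds, not the time of an individual thread.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel, present only so there is something to time.
__global__ void scale(float* data, int n, float factor) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n) data[idx] *= factor;
}

// Hypothetical helper: number of blocks needed to cover n elements.
int grid_size(int n, int threads_per_block) {
    return (n + threads_per_block - 1) / threads_per_block;
}

int main() {
    const int n = 1000000;
    float* d_data = nullptr;
    cudaMalloc(&d_data, n * sizeof(float));
    cudaMemset(d_data, 0, n * sizeof(float));

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    const int threads = 256;
    const int blocks = grid_size(n, threads);

    cudaEventRecord(start, 0);                 // enqueue "start" before the kernel
    scale<<<blocks, threads>>>(d_data, n, 2.0f);
    cudaEventRecord(stop, 0);                  // enqueue "stop" after the kernel
    cudaEventSynchronize(stop);                // wait until "stop" has been reached

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);    // elapsed time in milliseconds
    printf("kernel time: %.3f ms\n", ms);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d_data);
    return 0;
}
```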

king1 · December 11, 2009, 7:24am

See the clock sample in the SDK to learn how to use it.

You need to record the start time and end time of each thread and find the min and max of those values. The final value is max - min.
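The max - min idea could be sketched roughly as below. This is not the SDK sample itself: the kernel `timed_kernel`, the helper `span_cycles`, and the per-block (rather than per-thread) recording are assumptions made for illustration, and `clock64()` assumes a device of compute capability 2.0 or later. The cycle counter is per multiprocessor, so values taken on different multiprocessors are not guaranteed to be synchronized and the result should be read as an estimate.

```cuda
#include <cstdio>
#include <vector>
#include <algorithm>
#include <cuda_runtime.h>

// Hypothetical kernel: thread 0 of each block records the block's start and end cycle count.
__global__ void timed_kernel(float* data, int n, long long* starts, long long* ends) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (threadIdx.x == 0) starts[blockIdx.x] = clock64();
    if (idx < n) data[idx] = sqrtf(data[idx] + 1.0f);   // placeholder work
    __syncthreads();                                     // all threads of the block are done
    if (threadIdx.x == 0) ends[blockIdx.x] = clock64();
}

// Latest end minus earliest start, in cycles.
long long span_cycles(const std::vector<long long>& starts,
                      const std::vector<long long>& ends) {
    long long first = *std::min_element(starts.begin(), starts.end());
    long long last  = *std::max_element(ends.begin(), ends.end());
    return last - first;
}

int main() {
    const int n = 1 << 20, threads = 256;
    const int blocks = (n + threads - 1) / threads;

    float* d_data = nullptr;
    long long *d_starts = nullptr, *d_ends = nullptr;
    cudaMalloc(&d_data, n * sizeof(float));
    cudaMalloc(&d_starts, blocks * sizeof(long long));
    cudaMalloc(&d_ends, blocks * sizeof(long long));
    cudaMemset(d_data, 0, n * sizeof(float));

    timed_kernel<<<blocks, threads>>>(d_data, n, d_starts, d_ends);
    cudaDeviceSynchronize();

    std::vector<long long> starts(blocks), ends(blocks);
    cudaMemcpy(starts.data(), d_starts, blocks * sizeof(long long), cudaMemcpyDeviceToHost);
    cudaMemcpy(ends.data(), d_ends, blocks * sizeof(long long), cudaMemcpyDeviceToHost);

    printf("max(end) - min(start) = %lld cycles\n", span_cycles(starts, ends));

    cudaFree(d_data); cudaFree(d_starts); cudaFree(d_ends);
    return 0;
}
```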

ghandurah · December 13, 2009, 8:14am

cudaEvent is used inside the main (host) function (e.g. to measure the time taken by a CUDA call), whereas what I am looking for is the time taken by each individual thread, even within the same block.

ghandurah · December 13, 2009, 8:18am

I took a look at the clock example. They are doing the same thing I am doing, so I guess it’s correct; I just have to understand what the numbers represent.

ghandurah · December 13, 2009, 8:20am

I understand, but in my case I need to measure the time taken by each individual thread (some threads should take a lot more time than others), so I don’t need the max - min step.
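A per-thread measurement along these lines might look like the sketch below. Everything named here is hypothetical: the kernel `per_thread_timer`, its artificial uneven workload, and the helper `cycles_to_ms`. It assumes `clock64()` is available (compute capability 2.0 or later) and converts cycles to milliseconds using the clock rate reported by `cudaDevAttrClockRate` (in kHz). Since the actual clock can vary at run time, and threads of one warp execute together, the per-thread figures are approximate. The result of the work is written back to memory so the compiler cannot discard the timed code, and threads past the end of the array return early.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel: each thread times its own, deliberately uneven, amount of work.
__global__ void per_thread_timer(float* data, long long* d_cycles, int n) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx >= n) return;                       // threads past the array do nothing

    long long start = clock64();
    float x = data[idx];
    int iters = 1 + (idx % 64);                 // artificial per-thread workload
    for (int i = 0; i < iters; ++i) x = x * 1.0001f + 0.5f;
    data[idx] = x;                              // keep the result so the work is not optimized away
    d_cycles[idx] = clock64() - start;          // elapsed cycles for this thread
}

// A clock rate of k kHz means k cycles per millisecond.
double cycles_to_ms(long long cycles, int clock_khz) {
    return static_cast<double>(cycles) / clock_khz;
}

int main() {
    const int n = 1 << 20, threads = 256;
    const int blocks = (n + threads - 1) / threads;

    float* d_data = nullptr;
    long long* d_cycles = nullptr;
    cudaMalloc(&d_data, n * sizeof(float));
    cudaMalloc(&d_cycles, n * sizeof(long long));
    cudaMemset(d_data, 0, n * sizeof(float));

    per_thread_timer<<<blocks, threads>>>(d_data, d_cycles, n);
    cudaDeviceSynchronize();

    std::vector<long long> cycles(n);
    cudaMemcpy(cycles.data(), d_cycles, n * sizeof(long long), cudaMemcpyDeviceToHost);

    int clock_khz = 0;
    cudaDeviceGetAttribute(&clock_khz, cudaDevAttrClockRate, 0);   // device 0, value in kHz

    if (clock_khz > 0) {
        for (int i = 0; i < 4; ++i)
            printf("thread %d: %lld cycles (~%.6f ms)\n",
                   i, cycles[i], cycles_to_ms(cycles[i], clock_khz));
    }

    cudaFree(d_data); cudaFree(d_cycles);
    return 0;
}
```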

ghandurah · December 13, 2009, 8:21am

Thanks all, I’ll try again to figure out the meaning of the numbers.
