Measuring the Time of a Data Transfer

Hi, I'm using a 9800GT and a 9500GT on Windows.

I'm trying to measure the time it takes to copy data from the host to the device. I'm currently using the code below.

#include <time.h>

clock_t start, end;
float time;   /* elapsed time in seconds */

start = clock();
cudaMemcpy(...);
end = clock();

time = (float)(end - start) / CLOCKS_PER_SEC;

The problem is that the variable time calculated this way is 0, and I'm pretty sure the copy would not be that short, since I'm copying an array of 1000 elements of type cuComplex. I'm also struggling to make my code run fast, but I'm not sure whether, in general, it is copying the data back and forth that takes long or displaying the data with bitmap.anim. Can somebody help me with this?

Thank you for your kindness.

Why don’t you try events?
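For reference, here is a minimal sketch of what timing the copy with CUDA events could look like. The function name time_h2d_with_events and the zero-filled test buffer are made up for illustration; only the cudaEvent* and cudaMemcpy calls come from the CUDA runtime API. It assumes a pageable host buffer and the default stream.

```cuda
#include <cstdlib>
#include <cuda_runtime.h>
#include <cuComplex.h>

// Hypothetical helper: copies `count` cuComplex elements from host to device
// and returns the elapsed time in milliseconds as measured by CUDA events.
// Returns -1.0f if allocation or the measurement fails.
float time_h2d_with_events(size_t count)
{
    size_t bytes = count * sizeof(cuComplex);
    cuComplex *h_data = (cuComplex *)calloc(count, sizeof(cuComplex));
    cuComplex *d_data = NULL;
    cudaEvent_t start, stop;
    float ms = -1.0f;

    if (h_data == NULL) return -1.0f;
    if (cudaMalloc((void **)&d_data, bytes) != cudaSuccess) {
        free(h_data);
        return -1.0f;
    }
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start, 0);                    // mark the start in stream 0
    cudaMemcpy(d_data, h_data, bytes, cudaMemcpyHostToDevice);
    cudaEventRecord(stop, 0);                     // mark the end in stream 0
    cudaEventSynchronize(stop);                   // wait until the stop event has completed

    if (cudaEventElapsedTime(&ms, start, stop) != cudaSuccess) ms = -1.0f;

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d_data);
    free(h_data);
    return ms;
}
```

Calling time_h2d_with_events(1000) from main and printing the result with printf("%f ms\n", ...) would give the time for an array like the one in the question. Because the events are timestamped on the device, the result does not depend on the resolution of clock().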

You probably want to use a timer with higher resolution. In addition, you might want to execute the memory copy several times and take the median, mean, minimum, or maximum. A higher-resolution timer for Windows can be found here, for example: http://stackoverflow.com/questions/1739259/how-to-use-queryperformancecounter/1739265#1739265
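As a rough sketch of the repeat-and-summarize idea, the code below times several copies and lets you take the median. Note that it uses std::chrono::steady_clock as a portable stand-in for QueryPerformanceCounter, which is my substitution and not what the linked answer shows. The names median_of and sample_h2d_copy_us are hypothetical, and a compiler with C++11 support is assumed.

```cuda
#include <algorithm>
#include <chrono>
#include <vector>
#include <cuda_runtime.h>
#include <cuComplex.h>

// Median of a list of samples (taken by value so it can be sorted locally).
// Returns 0.0 for an empty list.
double median_of(std::vector<double> samples)
{
    if (samples.empty()) return 0.0;
    std::sort(samples.begin(), samples.end());
    size_t mid = samples.size() / 2;
    return (samples.size() % 2 == 1) ? samples[mid]
                                     : 0.5 * (samples[mid - 1] + samples[mid]);
}

// Hypothetical helper: performs `repeats` host-to-device copies of `count`
// cuComplex elements and returns one wall-clock sample per copy, in
// microseconds. Returns an empty vector if the device allocation fails.
std::vector<double> sample_h2d_copy_us(size_t count, int repeats)
{
    std::vector<double> samples;
    std::vector<cuComplex> h_data(count, make_cuComplex(0.0f, 0.0f));
    cuComplex *d_data = NULL;
    size_t bytes = count * sizeof(cuComplex);

    if (cudaMalloc((void **)&d_data, bytes) != cudaSuccess) return samples;

    for (int i = 0; i < repeats; ++i) {
        auto t0 = std::chrono::steady_clock::now();
        cudaMemcpy(d_data, h_data.data(), bytes, cudaMemcpyHostToDevice);
        cudaDeviceSynchronize();   // wait for the device before stopping the timer
        auto t1 = std::chrono::steady_clock::now();
        samples.push_back(
            std::chrono::duration<double, std::micro>(t1 - t0).count());
    }

    cudaFree(d_data);
    return samples;
}
```

For example, median_of(sample_h2d_copy_us(1000, 20)) would give a median over 20 copies. The first sample often includes one-time setup cost, so the median or minimum tends to be more representative than the mean.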

Thanks, this helps me a lot. Is there a way to display the data directly on the graphics card instead of sending it back to the CPU and using bitmap.anim? I'm using CUDA 4.0, and it says it can send data directly to the displaying GPU, but I cannot find the method. Thank you.

Regards,
Jaehong Yoon

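If you are copying from the host to the device, copies of up to 64 KB are asynchronous with respect to the host: cudaMemcpy can return before the data has actually arrived on the device. Your array is 1000 cuComplex values at 8 bytes each, about 8 KB, so it falls under that limit, and a host-side timer may stop too early unless you synchronize first.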