How to measure total time for CPU and GPU

I have written code that uses both the CPU and the GPU, and I want to measure the total time spent on each.
How can I measure the total time? (I am using Visual Studio C++.)
Thanks :)


I work with C++ on Linux, and I use these lines to measure the GPU time:

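float elapsed = 0;
cudaEvent_t start, stop;

// Events must be created before they can be recorded
HANDLE_ERROR(cudaEventCreate(&start));
HANDLE_ERROR(cudaEventCreate(&stop));
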
HANDLE_ERROR( cudaEventRecord(start, 0));

>> here you launch the kernel

HANDLE_ERROR(cudaEventRecord(stop, 0));
HANDLE_ERROR(cudaEventSynchronize (stop) );

HANDLE_ERROR(cudaEventElapsedTime(&elapsed, start, stop) );


printf("The elapsed time in gpu was %.2f ms\n", elapsed);

To get the CPU time you can use:

clock_t cpu_startTime, cpu_endTime;

double cpu_ElapseTime=0;
cpu_startTime = clock();

>>here you have the cpu code

cpu_endTime = clock();

cpu_ElapseTime = ((cpu_endTime - cpu_startTime)/CLOCKS_PER_SEC);

Afterwards, you can subtract the GPU time from the CPU time to get the time actually spent on the CPU.

I don't know whether it is the same for Visual C++, but I hope this is helpful.

Thanks a lot TDiego :)
But I want to measure the time in milliseconds.
So I used windows.h with the QueryPerformanceFrequency and QueryPerformanceCounter functions.
When I included windows.h in my CUDA code, I got the error “windows.h is not defined”.
How can I measure the total time in milliseconds?

Hi skymoonraider,

With the code above, you get the time in milliseconds, both with CUDA and with C code.

As for windows.h, I'm sorry, but I have not used Visual C++ with CUDA.

Try using the code above.

Hi TDiego, I tried that code and the values (ms) are:

First column: matrix size (n x n)
Second column: QueryPerformanceFrequency (ms)
Third column: cpu_startTime (ms)

n x n    QueryPerformanceFrequency (ms)    cpu_startTime (ms)
10       0,0167                            0,000
20       0,0524                            0,000
30       0,1342                            0,000
40       0,2584                            0,000
50       0,4347                            0,000
250      23,3117                           0,000
260      25,5456                           0,000
270      28,3133                           0,000

So I used QueryPerformanceFrequency, but it did not work on the GPU.


This page explains how to get the CPU time in Visual C++

and to use the CUDA events you have to include the given namespace; check these pages

I hope this is helpful.

Use cpu_ElapseTime, not cpu_startTime for the last column.

Thanks oshkosher, but I already used cpu_ElapseTime :)

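I had the same issue and measured 0 with:
cpu_ElapseTime = ((cpu_endTime - cpu_startTime)/CLOCKS_PER_SEC);

Replace it with:
cpu_ElapseTime = ((cpu_endTime - cpu_startTime)/(double)CLOCKS_PER_SEC);
and you will get correct results. The first version divides one integer by another, so any interval shorter than one second is truncated to 0. Note that the result is in seconds; multiply by 1000 to get milliseconds.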