Stopwatch Class

I wrote a stopwatch class that I use to time the performance of the CUDA part of my application. It might be useful to others, so I’d like to share it.

Without the class you have to write:

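cudaEvent_t start, stop;

float t; //elapsed time in milliseconds

//create both events up front so that event creation is not part of the timed region
cudaEventCreate(&start);

cudaEventCreate(&stop);

cudaEventRecord(start, 0);

stuffYouWantToTime();

cudaEventRecord(stop, 0);

cudaEventSynchronize(stop); //wait until the stop event has actually been recorded

cudaEventElapsedTime(&t, start, stop);

printf("time: %1.2f\n", t);

cudaEventDestroy(start);

cudaEventDestroy(stop);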

With the class you can write:

#include "cudastopwatch.h"

cudaStopWatch sw(10); //create the class and allow 10 stacked timers

sw.start();

stuffYouWantToTime();

printf("time: %1.2f\n", sw.stop());

You can also stack (nest) timers; each stop() call ends the most recently started timer:

#include "cudastopwatch.h"

cudaStopWatch sw(10); //create the class and allow 10 stacked timers

sw.start();

sw.start();

firstPartYouWantToTime();

printf("first part: %1.2f\n", sw.stop());

sw.start();

secondPartYouWantToTime();

printf("second part: %1.2f\n", sw.stop());

printf("overall: %1.2f\n", sw.stop());
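The stacking behavior shown above can be sketched with a small LIFO of start times. This is only a host-side illustration, not the contents of the attached cudastopwatch.h: the class name StackedStopwatch and the depth() helper are hypothetical, and std::chrono::steady_clock stands in for CUDA events so that the sketch compiles without the CUDA toolkit. A real GPU version would presumably record a cudaEvent_t in start() and call cudaEventElapsedTime in stop().

```cpp
#include <chrono>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical host-only analog of a stacked stopwatch.
// steady_clock replaces CUDA events here; this is an assumption made
// for illustration, not the implementation in cudastopwatch.h.
class StackedStopwatch {
public:
    using Clock = std::chrono::steady_clock;

    // maxDepth limits how many timers may be running at the same time.
    explicit StackedStopwatch(std::size_t maxDepth) : maxDepth_(maxDepth) {
        starts_.reserve(maxDepth);
    }

    // Push a new start time onto the stack.
    void start() {
        if (starts_.size() >= maxDepth_)
            throw std::overflow_error("too many stacked timers");
        starts_.push_back(Clock::now());
    }

    // Pop the most recent start time and return the elapsed milliseconds.
    float stop() {
        if (starts_.empty())
            throw std::underflow_error("stop() without matching start()");
        const Clock::time_point end = Clock::now();
        const Clock::time_point begin = starts_.back();
        starts_.pop_back();
        return std::chrono::duration<float, std::milli>(end - begin).count();
    }

    // Number of timers that are currently running.
    std::size_t depth() const { return starts_.size(); }

private:
    std::size_t maxDepth_;
    std::vector<Clock::time_point> starts_;
};
```

Because stop() always pops the newest entry, the last stop() in a nested sequence reports the time since the first start(), which matches the "overall" line in the example above. Note that a host clock only measures GPU work correctly if the device has been synchronized before stop() is called.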

I tested it on Windows XP x64.

There’s still room for improvement: at the moment every stop() call synchronizes with the GPU. A future version may synchronize only when the user explicitly asks for it, to avoid unnecessary busy waiting and to fully exploit the asynchronous behavior of the new API.
cudastopwatch.h (1.77 KB)

I updated the class!

It now includes two kinds of timers: stream timers, which are inserted into the stream and used via sStart() and sStop(), and regular timers, which are used via tStart() and tStop().
cudastopwatch.h (2.7 KB)