Cutil Source Code

I have been trying to figure out the cutTimer functions.
It would be great if I could have a look at the source code for the cutil functions, but I am unable to locate it :(

cutil is part of the SDK.

I could only find the cutil header file, which has only a brief description of the functions. Is there a better explanation of what exactly the functions do?

I’m not sure what info you’re looking for about the timer functions that isn’t found in cutil.h. Basically, I do something like this whenever I’m using a timer:

unsigned int timer;

CUT_SAFE_CALL(cutCreateTimer(&timer));

CUT_SAFE_CALL(cutResetTimer(timer)); // might be unnecessary, but I haven't had a chance to ensure that it is always unnecessary

CUT_SAFE_CALL(cutStartTimer(timer));

// call kernel

...

CUT_SAFE_CALL(cutStopTimer(timer));

printf("The call took %f ms\n", cutGetTimerValue(timer)); // cutGetTimerValue returns the time as a float, so it is not wrapped in CUT_SAFE_CALL

CUT_SAFE_CALL(cutDeleteTimer(timer));

Does that help?

My doubt is this: does the timer wait for whatever GPU calls were made before the line where stop is called to complete, and only then does cutStopTimer execute?

Or could the kernel still be executing in the background while the host code keeps running and triggers cutStopTimer?

Also, does it in any way interfere with the normal execution of the code?

And can I also measure CPU performance with it?

No, the cutil timer will not wait or synchronize for anything; you need to do this yourself. See the SDK demos…

I’m new to this as well, but it seems that if you compile your code in RELEASE mode, which is the default for the projects, the kernel invocation will return immediately, leaving the kernel code to execute on the GPU. This means that the timer is really just timing how long it took to tell the GPU to run the kernel, not the kernel execution time. I believe there are two solutions to this problem.

First, you could call cudaThreadSynchronize() before calling cutStopTimer(). I’ve tried this, and it works.
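Here is a minimal sketch of that first approach, in case it helps. It assumes the cutil.h from the SDK used earlier in this thread (for CUT_SAFE_CALL and CUDA_SAFE_CALL); the kernel scaleKernel, the buffer d_data, and the launch configuration are made up purely for illustration and are not from any SDK sample.

```cuda
#include <stdio.h>
#include <cuda_runtime.h>
#include <cutil.h>   // SDK utility header, assumed to be on the include path

// Hypothetical kernel: only here to give the timer something to measure.
__global__ void scaleKernel(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        data[i] *= 2.0f;
}

int main(void)
{
    const int n = 1 << 20;   // illustrative problem size
    float *d_data = NULL;

    // Allocating first also keeps CUDA start-up cost out of the timed region.
    CUDA_SAFE_CALL(cudaMalloc((void **)&d_data, n * sizeof(float)));
    CUDA_SAFE_CALL(cudaMemset(d_data, 0, n * sizeof(float)));

    unsigned int timer = 0;
    CUT_SAFE_CALL(cutCreateTimer(&timer));
    CUT_SAFE_CALL(cutStartTimer(timer));

    // The launch returns to the host right away; the kernel runs on the GPU.
    scaleKernel<<<(n + 255) / 256, 256>>>(d_data, n);

    // Block the host until all previously issued GPU work has finished,
    // so the timer covers the kernel execution and not just the launch.
    CUDA_SAFE_CALL(cudaThreadSynchronize());

    CUT_SAFE_CALL(cutStopTimer(timer));
    printf("The call took %f ms\n", cutGetTimerValue(timer));

    CUT_SAFE_CALL(cutDeleteTimer(timer));
    CUDA_SAFE_CALL(cudaFree(d_data));
    return 0;
}
```

If you comment out the cudaThreadSynchronize() line, the printed time should drop to roughly the cost of issuing the launch, which is the effect described above.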

A much better solution is to run your code in DEBUG mode. My understanding is that the various cutil macros will call cudaThreadSynchronize() in DEBUG mode, but not in RELEASE mode. So, if you just want to get an idea of speedup, compile and run your code in DEBUG mode. When you’re satisfied that the code works, use RELEASE, and the various cutil macros will no longer call cudaThreadSynchronize(), and shouldn’t have much impact on normal execution.