CUDA emulation release: performance when running in emulation

Chen_Bilong · October 9, 2007, 4:02pm

Hi,

When the program is built in the EmuRelease configuration, what is the targeted device? When I call the cutGetTimerValue function to get the execution time, is it the execution time on the targeted device or on the host processor? I ran the same program several times, but every time it gives me different results, and the result also seems to depend on the host processor’s load.

Thanks.

paulius · October 9, 2007, 5:50pm

Everything runs on the CPU in emulation mode. I wouldn’t worry about the times in emulation, as the mode emulates execution by threads, blocks, etc., all sequentially.
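To make the point above concrete, here is a simplified sketch of what "threads and blocks executed sequentially on the CPU" can look like. This is not the emulator's actual code: `LaunchIndex`, `scale_kernel`, and `launch_sequential` are hypothetical names invented for illustration, and the sketch ignores barrier synchronization, which a real emulator has to handle.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for CUDA's blockIdx / threadIdx / blockDim (1-D only).
struct LaunchIndex {
    unsigned block;
    unsigned thread;
    unsigned block_dim;
};

// "Kernel" body: each logical thread scales at most one element.
inline void scale_kernel(const LaunchIndex& idx, std::vector<float>& data, float factor) {
    std::size_t i = static_cast<std::size_t>(idx.block) * idx.block_dim + idx.thread;
    if (i < data.size()) {  // guard threads that fall past the end of the array
        data[i] *= factor;
    }
}

// Runs every logical thread one after another on the host CPU.
// The elapsed time of this loop depends on the CPU and its load,
// not on any property of a GPU.
inline void launch_sequential(unsigned grid_dim, unsigned block_dim,
                              std::vector<float>& data, float factor) {
    for (unsigned b = 0; b < grid_dim; ++b) {
        for (unsigned t = 0; t < block_dim; ++t) {
            scale_kernel({b, t, block_dim}, data, factor);
        }
    }
}
```

Calling `launch_sequential(3, 4, v, 2.0f)` on a 10-element vector visits 12 logical threads in order; the last two do nothing because of the bounds check.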

Paulius

seb · October 9, 2007, 8:27pm

And also cutGetTimerValue() only returns the time passed between cutStartTimer and cutStopTimer calls. It doesn’t care what you did in between so you can’t say it is the execution time on the target or the host device.
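As a rough illustration of that start/stop behaviour, the following is a minimal wall-clock timer sketch. It is not the cutil implementation: `WallTimer` and `time_ms` are hypothetical names, and `std::chrono::steady_clock` is assumed as the time source.

```cpp
#include <chrono>
#include <thread>

// Hypothetical start/stop timer; accumulates host wall-clock time in milliseconds.
class WallTimer {
    using Clock = std::chrono::steady_clock;

public:
    void start() {
        begin_ = Clock::now();
        running_ = true;
    }

    void stop() {
        if (running_) {
            total_ms_ += std::chrono::duration<double, std::milli>(Clock::now() - begin_).count();
            running_ = false;
        }
    }

    // Time accumulated between all start()/stop() pairs so far.
    double value_ms() const { return total_ms_; }

private:
    Clock::time_point begin_{};
    double total_ms_ = 0.0;
    bool running_ = false;
};

// Times an arbitrary callable. The timer has no idea what the work is
// or where it runs; it only sees the two time stamps.
template <typename F>
double time_ms(F&& work) {
    WallTimer timer;
    timer.start();
    work();
    timer.stop();
    return timer.value_ms();
}
```

For example, `time_ms([] { std::this_thread::sleep_for(std::chrono::milliseconds(20)); })` reports roughly the sleep duration even though no computation happened, which is why such a reading varies with whatever else the host is doing.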

Chen_Bilong · October 11, 2007, 7:00am

Is there any way to get the execution time (or the number of cycles) that a segment of a program would take on a real GPU? I don’t have a CUDA-compatible graphics card at hand.

mfatica · October 11, 2007, 7:08am

No, it is an emulator, not a cycle-accurate simulator.

preetib · October 23, 2007, 7:27am

Hi,

Can you provide some information about the cutGetTimerValue function, along with sample code showing how to use it?

I want to estimate the time taken by the cublas and cufft APIs, but I don’t know how to do this.

Thanks in advance :)

MisterAnderson42 · October 23, 2007, 11:27am

You can read the source code for cutGetTimerValue in the CUDA SDK yourself, or look at any of the SDK projects for an example.