Why compute a float in cudaEventElapsedTime() instead of a long integer?

Anyone care to explain why the event stream elapsed time function computes a float instead of a 64-bit integer?

    cudaError_t cudaEventElapsedTime(float* ms, cudaEvent_t start, cudaEvent_t end)

    Computes the elapsed time between two events (in milliseconds with a resolution of around 0.5 microseconds). If either event has not been recorded yet, this function returns cudaErrorInvalidValue. If either event has been recorded with a non-zero stream, the result is undefined.
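
For reference, the usual calling pattern looks something like this (a minimal sketch; the placeholder kernel is just for illustration):

    #include <cstdio>
    #include <cuda_runtime.h>

    // Placeholder kernel for illustration; any real kernel is timed the same way.
    __global__ void dummyKernel() {}

    int main() {
        cudaEvent_t start, stop;
        cudaEventCreate(&start);
        cudaEventCreate(&stop);

        cudaEventRecord(start, 0);        // record on the default (0) stream
        dummyKernel<<<1, 1>>>();
        cudaEventRecord(stop, 0);
        cudaEventSynchronize(stop);       // wait until the stop event has actually completed

        float ms = 0.0f;                  // the API hands back a float, in milliseconds
        cudaEventElapsedTime(&ms, start, stop);
        printf("kernel took %.3f ms\n", ms);

        cudaEventDestroy(start);
        cudaEventDestroy(stop);
        return 0;
    }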

I’m guessing they use a float for convenience. Most GPU functions take a few milliseconds, so it gives you something human-readable without further conversion.

But IMO that’s moot, because the event interface is too cumbersome to use in the first place. I still use CPU clocks.

Yeah… using CPU clocks to time GPU performance is pretty useless. The resolution isn’t high enough, and there’s interference from the OS scheduler and PCIe transfers, etc.

It works fine for my purposes. I’m using QueryPerformanceCounter on Windows and clock_gettime(CLOCK_REALTIME, &t) on Linux, both of which have a resolution of at least 10^-5 s. I still use the CUDA profiler if I want to know the time without launch overhead.
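
For what it’s worth, the Linux variant of what I do is roughly this (a sketch; the placeholder kernel and the surrounding cudaDeviceSynchronize calls are my assumptions about a minimal measurement):

    #include <stdio.h>
    #include <time.h>
    #include <cuda_runtime.h>

    __global__ void dummyKernel() {}   // placeholder kernel for illustration

    int main() {
        struct timespec t0, t1;

        cudaDeviceSynchronize();                    // make sure nothing is still running
        clock_gettime(CLOCK_REALTIME, &t0);

        dummyKernel<<<1, 1>>>();
        cudaDeviceSynchronize();                    // wall-clock time, including launch overhead

        clock_gettime(CLOCK_REALTIME, &t1);
        long long ns = (t1.tv_sec - t0.tv_sec) * 1000000000LL
                     + (t1.tv_nsec - t0.tv_nsec);
        printf("kernel + launch overhead: %lld ns\n", ns);
        return 0;
    }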

Actually, I take back what I said about the event interface being too cumbersome. That’s also moot, because the SDK has a timer wrapper class, which I had totally forgotten about.
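
Something like this minimal wrapper covers the common case (just a sketch of the idea, not the actual SDK/cutil class):

    #include <cuda_runtime.h>

    // Minimal event-based timer sketch.
    class GpuTimer {
    public:
        GpuTimer()  { cudaEventCreate(&start_); cudaEventCreate(&stop_); }
        ~GpuTimer() { cudaEventDestroy(start_); cudaEventDestroy(stop_); }

        void start() { cudaEventRecord(start_, 0); }
        void stop()  { cudaEventRecord(stop_, 0); }

        // Blocks until the stop event completes, then returns the elapsed time in ms.
        float elapsedMs() {
            cudaEventSynchronize(stop_);
            float ms = 0.0f;
            cudaEventElapsedTime(&ms, start_, stop_);
            return ms;
        }

    private:
        cudaEvent_t start_, stop_;
    };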

If you ever send me code that uses cutil, I helpfully ignore it completely.

Basically, if you are timing what is happening on the card, you need to be using events; otherwise context switches will ruin your results.

I was just curious, because floats are hardly a great representation for storing and converting time durations, but, as you say, I bet they thought it would be more convenient.
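
To put a number on it: the advertised ~0.5 µs resolution only holds while the value is small, since a float’s 24-bit mantissa makes the quantization step grow with magnitude; past roughly 8 seconds of elapsed time the step already exceeds 0.5 µs. A quick illustration (plain C, nothing CUDA-specific):

    #include <math.h>
    #include <stdio.h>

    // How finely can a float resolve an elapsed time of t milliseconds?
    int main() {
        float t_ms[] = { 1.0f, 1000.0f, 100000.0f };   // 1 ms, 1 s, 100 s
        for (int i = 0; i < 3; ++i) {
            // Distance to the next representable float above t_ms[i].
            float ulp_ms = nextafterf(t_ms[i], 1e30f) - t_ms[i];
            printf("%10.1f ms: smallest step = %g us\n", t_ms[i], ulp_ms * 1000.0f);
        }
        return 0;
    }

At 100 s the smallest representable step is about 7.8 µs, well above the 0.5 µs the documentation quotes.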

I consider it a bug and hope someone will deprecate it (unless there is something I’m missing).

Perhaps the cudaEvent_t is not entirely opaque and there is an unofficial way to get an integral timestamp?