Anyone care to explain why the event stream elapsed time function computes a float instead of a 64-bit integer?
[indent][font="Courier New"]cudaError_t cudaEventElapsedTime (float* ms, cudaEvent_t start, cudaEvent_t end)[/font]
[font="Georgia"]Computes the elapsed time between two events (in milliseconds with a resolution of around 0.5 microseconds). If
either event has not been recorded yet, this function returns cudaErrorInvalidValue. If either event has been recorded
with a non-zero stream, the result is undefined.[/font][/indent]
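For reference, here is the usage pattern that prototype belongs to, as a minimal sketch with a placeholder kernel and no error checking (both events are recorded on the default stream, per the warning above about non-zero streams):

[code]
#include <stdio.h>

__global__ void busyKernel(void) { /* placeholder work */ }

int main(void)
{
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start, 0);      // record on the default (zero) stream
    busyKernel<<<1, 1>>>();
    cudaEventRecord(stop, 0);
    cudaEventSynchronize(stop);     // make sure 'stop' has actually been recorded

    float ms = 0.0f;                // the float in question
    cudaEventElapsedTime(&ms, start, stop);
    printf("elapsed: %f ms\n", ms);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return 0;
}
[/code]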
It works fine for my purposes. I’m using QueryPerformanceCounter on Windows and clock_gettime(CLOCK_REALTIME, &t) on Linux, which both have a resolution of at least 10^-5 s. I still use the CUDA profiler if I want to know the time without launch overhead.
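For the Linux case, this is roughly what I mean (a minimal sketch; the kernel is just a placeholder and error checks are omitted). The only subtlety is that kernel launches are asynchronous, so you have to synchronize before reading the clock again:

[code]
#include <stdio.h>
#include <time.h>

__global__ void busyKernel(void) { /* placeholder work */ }

static double diff_ms(struct timespec a, struct timespec b)
{
    return (b.tv_sec - a.tv_sec) * 1e3 + (b.tv_nsec - a.tv_nsec) * 1e-6;
}

int main(void)
{
    struct timespec t0, t1;

    clock_gettime(CLOCK_REALTIME, &t0);
    busyKernel<<<1, 1>>>();
    cudaDeviceSynchronize();        // wait for the kernel before stopping the clock
    clock_gettime(CLOCK_REALTIME, &t1);

    printf("elapsed: %.3f ms (includes launch overhead)\n", diff_ms(t0, t1));
    return 0;
}
[/code]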
Actually, I take back what I said about the event interface being too cumbersome. That’s also moot because the SDK has a timer wrapper class, which I had totally forgotten about.
I was just curious, because a float is hardly a great representation for a time duration or for conversions, but, as you say, I bet they thought it would be more convenient.
I consider it a bug and hope someone will deprecate it (unless there is something I’m missing).
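To make the float objection concrete, here is a rough back-of-the-envelope check (plain C, nothing CUDA-specific): the gap between adjacent float values grows with the size of the elapsed time, so the documented 0.5 µs resolution stops being representable once the measured interval gets past roughly a minute:

[code]
#include <math.h>
#include <stdio.h>

int main(void)
{
    /* Spacing between adjacent float values at a few elapsed times (in ms).
       Once the spacing exceeds 0.0005 ms (0.5 us), the documented
       resolution can no longer be expressed in the returned float. */
    const float samples_ms[] = { 1.0f, 1000.0f, 60000.0f, 3600000.0f };
    for (int i = 0; i < 4; ++i) {
        float ms   = samples_ms[i];
        float step = nextafterf(ms, INFINITY) - ms;
        printf("%12.1f ms -> smallest representable step %.9f ms\n", ms, step);
    }
    return 0;
}
[/code]

At one minute the step is already about 4 µs, and at an hour it is 0.25 ms, which is why a 64-bit integer (or at least a double) would preserve the stated resolution much better.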
Perhaps the cudaEvent_t is not entirely opaque and there is an unofficial way to get an integral timestamp?