getElapsedTime vs Profiler

wuninsu · June 29, 2011, 12:38am

I used both getElapsedTime and the profiler to measure the elapsed time of my kernel.
But the profiler's result is 10 times lower than the getElapsedTime result.
Which one should I believe?

And when I changed the kernel and memcpy code, the profiler reports that the kernel was called 2 times, but the code calls it 16 times.
For example:

for i (0… 16){
    MEMCPY()
    KERNEL()
    MEMCPY()
}

=> KERNEL 16 TIMES!

But
for i (0… 16){
    MEMCPY()
    KERNEL()
}
for i (0… 16){
    MEMCPY()
}

=> KERNEL 2 TIMES!

I think this is caused by pipelining. Am I right? :(

Sanjiv.Satoor · July 4, 2011, 8:21am

The time reported by getElapsedTime() includes host-side overhead for the kernel launch. The gputime reported by the profiler does not include host-side overhead; it is just the kernel execution time on the GPU. The profiler's cputime should be closer to the time obtained with getElapsedTime().
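
To see the two kinds of measurement side by side, here is a minimal sketch. It assumes getElapsedTime() is a host-side wall-clock timer (the thread does not say which timer it is), so std::chrono stands in for it. The kernel `scale` and the buffer size are made up for illustration. CUDA events are used for the GPU-side number; this is not the profiler's gputime itself, only a measurement that likewise leaves out most of the host-side launch cost.

```cuda
#include <chrono>
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel used only to have something to time.
__global__ void scale(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

int main() {
    const int n = 1 << 20;
    float *d = nullptr;
    cudaMalloc(&d, n * sizeof(float));
    cudaMemset(d, 0, n * sizeof(float));

    // Warm-up launch so one-time initialization is not counted below.
    scale<<<(n + 255) / 256, 256>>>(d, n);
    cudaDeviceSynchronize();

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    // Host wall-clock timer (stand-in for getElapsedTime): covers the
    // launch call, the kernel itself, and the wait for completion.
    auto t0 = std::chrono::steady_clock::now();
    cudaEventRecord(start);                      // GPU-side timestamp before the kernel
    scale<<<(n + 255) / 256, 256>>>(d, n);
    cudaEventRecord(stop);                       // GPU-side timestamp after the kernel
    cudaEventSynchronize(stop);                  // block the host until the kernel is done
    auto t1 = std::chrono::steady_clock::now();

    float gpu_ms = 0.0f;
    cudaEventElapsedTime(&gpu_ms, start, stop);  // time between the two events, in ms
    double host_ms = std::chrono::duration<double, std::milli>(t1 - t0).count();

    printf("host wall-clock: %.3f ms\n", host_ms);
    printf("GPU events:      %.3f ms\n", gpu_ms);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d);
    return 0;
}
```

For a short kernel the host number is usually noticeably larger than the event number, because launch and synchronization overhead is a bigger share of the total. If the host timer is stopped without waiting for the kernel, it measures only the launch call, since kernel launches return before the kernel finishes.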

The profiler output in the second case should also show the kernel being launched 16 times. The output could be incomplete for some other reason. You can try adding a cudaDeviceSynchronize() call after the second loop.
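
A rough sketch of the second loop structure with the suggested cudaDeviceSynchronize() call added. The kernel `addOne`, the chunked buffer layout, and the use of blocking cudaMemcpy are assumptions made for illustration; the thread does not show the real code. The final check confirms from the data that all 16 launches ran, independent of what the profiler reports.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel: adds 1 to every element of one chunk.
__global__ void addOne(float *chunk, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) chunk[i] += 1.0f;
}

int main() {
    const int iterations = 16;
    const int n = 1 << 16;                       // elements per chunk
    std::vector<float> host(iterations * n, 0.0f);
    float *dev = nullptr;
    cudaMalloc(&dev, host.size() * sizeof(float));

    // First loop: MEMCPY (host to device) followed by KERNEL, 16 times.
    for (int i = 0; i < iterations; ++i) {
        cudaMemcpy(dev + i * n, host.data() + i * n,
                   n * sizeof(float), cudaMemcpyHostToDevice);
        addOne<<<(n + 255) / 256, 256>>>(dev + i * n, n);
    }

    // Second loop: MEMCPY (device to host), 16 times.
    for (int i = 0; i < iterations; ++i) {
        cudaMemcpy(host.data() + i * n, dev + i * n,
                   n * sizeof(float), cudaMemcpyDeviceToHost);
    }

    // Suggested addition: wait for all outstanding GPU work before exiting.
    // The return value also surfaces any error from the earlier launches.
    cudaError_t err = cudaDeviceSynchronize();
    printf("sync status: %s\n", cudaGetErrorString(err));

    // Each chunk was processed by exactly one launch, so every element
    // should now be 1.0f if all 16 kernels actually ran.
    int wrong = 0;
    for (size_t k = 0; k < host.size(); ++k) {
        if (host[k] != 1.0f) ++wrong;
    }
    printf("elements not updated: %d\n", wrong);

    cudaFree(dev);
    return 0;
}
```

If the data check passes but the profiler still lists fewer launches, the kernels did run and the gap is in the profiler output rather than in the program.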

wuninsu · July 4, 2011, 1:01pm

Thanks. I still don't know why the profiler's CPU time differs so much from the getElapsedTime() result :(
