cuda visual profiler

gmtan · July 10, 2008, 9:38am

What do the terms “memcpy” and “CPU time” in the profiler output represent?
Is the “memcpy” a memcpy() executed by the CPU?
Does the execution time of a kernel equal the sum of “GPU time” and “CPU time”?

E.D_Riedijk · July 10, 2008, 5:33pm

No. CPU time is the time a kernel call takes as seen from the CPU. GPU time is the time the GPU is actually busy; kernel call overhead is not included in GPU time, but it is included in CPU time.
memcopy is the time the memcopy call takes (this is also CPU time).

Oceanian · July 16, 2008, 2:31am

So GPU time is the time between kernel start and return on the GPU, and CPU time is the time between the start and end of the kernel call on the CPU (kernel execution time not included). Do I understand you correctly?

E.D_Riedijk · July 16, 2008, 5:45am

CPU time is the time between the start and end of the kernel call on the CPU (kernel execution time included).

Oceanian · July 16, 2008, 9:50am

But isn’t a kernel call asynchronous? The programming guide says the host won’t wait for a global kernel to complete.

And I got one profiling result showing a kernel with GPU time less than CPU time.

My mistake, I was too careless reading the output…

E.D_Riedijk · July 16, 2008, 11:12am

In normal code yes, but in the profiler no.

CPU time = GPU time + overhead, so GPU time should always be less than CPU time. If you have a profile where CPU time < GPU time I would be very surprised and you should file a bug report ;)
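The asynchronous-launch point raised above can be seen outside the profiler with a plain host timer. The following is only a sketch, not a description of how the profiler computes its columns: the kernel `spin`, the helper `measureHostTimes`, and the problem sizes are all made up for illustration. It times the launch call alone and then the launch plus an explicit wait, which is roughly the "GPU time + overhead" view of CPU time described in this thread.

```cuda
#include <chrono>
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: does some arithmetic so the GPU stays busy for a while.
__global__ void spin(float *data, int n, int iters) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float x = data[i];
        for (int k = 0; k < iters; ++k) x = x * 1.0001f + 0.5f;
        data[i] = x;
    }
}

struct HostTimes {
    double launchOnlyUs;     // host time until the launch call returns
    double launchAndWaitUs;  // host time until the kernel has finished
};

typedef std::chrono::steady_clock Clock;

static double elapsedUs(Clock::time_point a, Clock::time_point b) {
    return std::chrono::duration<double, std::micro>(b - a).count();
}

// Returns {-1, -1} if the device buffer cannot be allocated.
HostTimes measureHostTimes(int n, int iters) {
    HostTimes t = {-1.0, -1.0};
    float *d = nullptr;
    if (cudaMalloc(&d, n * sizeof(float)) != cudaSuccess) return t;
    cudaMemset(d, 0, n * sizeof(float));

    int threads = 256;
    int blocks = (n + threads - 1) / threads;

    // Warm-up launch so one-time setup cost is not part of the measurement.
    spin<<<blocks, threads>>>(d, n, iters);
    cudaDeviceSynchronize();

    Clock::time_point t0 = Clock::now();
    spin<<<blocks, threads>>>(d, n, iters);
    Clock::time_point t1 = Clock::now();  // launch returned; kernel may still run
    cudaDeviceSynchronize();              // block the host until the kernel is done
    Clock::time_point t2 = Clock::now();

    t.launchOnlyUs = elapsedUs(t0, t1);
    t.launchAndWaitUs = elapsedUs(t0, t2);
    cudaFree(d);
    return t;
}

int main() {
    HostTimes t = measureHostTimes(1 << 20, 1000);
    if (t.launchOnlyUs < 0.0) {
        std::printf("no usable CUDA device\n");
        return 1;
    }
    std::printf("launch only:     %.1f us\n", t.launchOnlyUs);
    std::printf("launch and wait: %.1f us\n", t.launchAndWaitUs);
    return 0;
}
```

Without the `cudaDeviceSynchronize()` call, a host timer only sees the cost of queuing the launch, which is why timing a kernel from the CPU needs an explicit wait.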

jordyvaneijk · July 29, 2008, 7:49am

I know the CUDA Visual Profiler is a way to time the different parts of your program, but is it an accurate way to time your program, or is it better to use timers inside the program itself?

E.D_Riedijk · July 29, 2008, 11:19am

Well, you can accurately time your program between the first and last CUDA call, I guess. You can ask the profiler to save CPU time. Then you can see how much time has passed between your CUDA-related calls.

Sarnath · July 29, 2008, 1:36pm

Enabling the profiler reduces the GPU clock, so your app actually runs slower!

Don't use the profiler to time your code!

Read the “Release Notes” and search for “profiler”: if an application crashes while profiling is enabled, the GPU clocks remain reduced even for other GPU applications that don't need profiling. You have to reboot to fix it…

I experienced it just now! I was getting 57x, then I rebooted and got 65x, with no change in inputs or anything… The profiler just clocks down the GPU…

Beware…

senorbum · July 29, 2008, 3:27pm

Yeah, definitely don’t time your code with profiler. It is VERY useful for determining what parts of your program are worth optimizing though :)

E.D_Riedijk · July 29, 2008, 5:41pm

Hmm, that is good news indeed. My real performance will be even better than what I thought :)

CUDA events it shall be, I guess.
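For readers who want to follow the CUDA events route, here is a minimal sketch of timing one kernel launch with `cudaEventRecord` and `cudaEventElapsedTime`. The kernel `scale` and the helper `timeScaleKernelMs` are hypothetical names invented for this example; only the event calls themselves come from the CUDA runtime API.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel, present only so there is something to time.
__global__ void scale(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Returns the elapsed time of one launch in milliseconds, or -1 on error.
float timeScaleKernelMs(int n) {
    float *d_data = nullptr;
    if (cudaMalloc(&d_data, n * sizeof(float)) != cudaSuccess) return -1.0f;
    cudaMemset(d_data, 0, n * sizeof(float));

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    int threads = 256;
    int blocks = (n + threads - 1) / threads;

    cudaEventRecord(start, 0);                    // marker before the kernel
    scale<<<blocks, threads>>>(d_data, n, 2.0f);
    cudaEventRecord(stop, 0);                     // marker after the kernel
    cudaEventSynchronize(stop);                   // wait until 'stop' is reached

    float ms = 0.0f;
    cudaError_t timeErr = cudaEventElapsedTime(&ms, start, stop);
    cudaError_t launchErr = cudaGetLastError();   // catches a failed launch

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d_data);

    if (timeErr != cudaSuccess || launchErr != cudaSuccess) return -1.0f;
    return ms;
}

int main() {
    float ms = timeScaleKernelMs(1 << 20);
    if (ms < 0.0f) {
        std::printf("CUDA error or no usable device\n");
        return 1;
    }
    std::printf("kernel took %.3f ms\n", ms);
    return 0;
}
```

Because both events are recorded in the same stream as the kernel, the measured interval is taken on the GPU side and does not depend on when the host thread happens to read a clock.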

jarjar · July 29, 2008, 10:41pm

I am facing the following problems with the CUDA profiler:

(1)
I have an application where, after each kernel call, I use memcpy to copy some data from the GPU to the CPU. The memcpy does a very simple 4-byte (integer) copy from the GPU to the CPU. It is called as many times as the kernel is called.

When I used the CUDA Visual Profiler, I found that it correctly reports the number of times the memcpy is called, but somehow does not report the amount of time spent in the memcpy correctly. You can also see from the profiler that memcpy time is reported in GPU usec and not in CPU usec for any program.

So my question is: does the size of the data transfer (and hence the time spent) influence the result displayed by the Visual Profiler for memcpy operation time?

E.D_Riedijk · July 30, 2008, 5:26am

Yes, it differs. As far as I know, the reason you only see GPU time is that the GPU is doing the memcopy.
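One way to check how transfer size affects copy time independently of the profiler is to time device-to-host copies of different sizes with CUDA events. This is a rough sketch under stated assumptions: the helper `timeDeviceToHostCopyUs`, the repeat count, and the two sizes are made up for illustration, the host buffer is ordinary pageable memory, and the reported figure is an average that includes per-call overhead.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Returns the average time per device-to-host copy in microseconds,
// or -1 on invalid arguments or a CUDA error.
float timeDeviceToHostCopyUs(size_t bytes, int reps) {
    if (bytes == 0 || reps <= 0) return -1.0f;

    char *d_buf = nullptr;
    if (cudaMalloc(&d_buf, bytes) != cudaSuccess) return -1.0f;
    cudaMemset(d_buf, 0, bytes);
    std::vector<char> h_buf(bytes);  // pageable host memory

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    // Warm-up copy so one-time setup cost is not measured.
    cudaMemcpy(h_buf.data(), d_buf, bytes, cudaMemcpyDeviceToHost);

    cudaEventRecord(start, 0);
    for (int r = 0; r < reps; ++r) {
        cudaMemcpy(h_buf.data(), d_buf, bytes, cudaMemcpyDeviceToHost);
    }
    cudaEventRecord(stop, 0);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaError_t err = cudaEventElapsedTime(&ms, start, stop);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d_buf);

    if (err != cudaSuccess) return -1.0f;
    return ms * 1000.0f / reps;  // milliseconds total -> microseconds per copy
}

int main() {
    float smallUs = timeDeviceToHostCopyUs(4, 1000);            // one integer
    float largeUs = timeDeviceToHostCopyUs(4 * 1024 * 1024, 20);  // 4 MiB
    if (smallUs < 0.0f || largeUs < 0.0f) {
        std::printf("CUDA error or no usable device\n");
        return 1;
    }
    std::printf("4-byte copy: %.2f us per call\n", smallUs);
    std::printf("4 MiB copy:  %.2f us per call\n", largeUs);
    return 0;
}
```

Comparing the two numbers on your own hardware shows how much of a 4-byte copy is fixed per-call cost rather than actual data movement.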
