Could someone explain what the c2c_mradix_sp method covers? Does its time include executing cufftPlan3d, cufftExecC2C and cufftDestroy in total, or something more?
Also, does the memcopy time cover only copying data from CPU to GPU and vice versa, or something more than that?
I also need some information on analysing CUDA Visual Profiler output. I don't know if there is any documentation that can help. For example, I wanted to know what the unit for GPU time is: milliseconds, seconds, or something else?
There are release notes for the profiler; I think that is where I read a section about interpreting the profiler's results. GPU time (and CPU time) is in microseconds (which is shown when you make a graph in later versions, if I remember correctly).
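If it helps to put the microsecond values into more familiar units, here is a minimal sketch in plain C. It assumes you have read a GPU time value off the profiler by hand; the function names are my own and are not part of the profiler or any CUDA library.

```c
#include <assert.h>

/* Hypothetical helpers (not part of any CUDA tool): convert a GPU time
   value, which the profiler reports in microseconds, into other units. */

double gpu_time_us_to_ms(double us)
{
    assert(us >= 0.0);      /* a duration should not be negative */
    return us / 1000.0;     /* 1 millisecond = 1000 microseconds */
}

double gpu_time_us_to_s(double us)
{
    assert(us >= 0.0);
    return us / 1000000.0;  /* 1 second = 1,000,000 microseconds */
}
```

So a GPU time of 1500 in the profiler would be 1.5 milliseconds, i.e. gpu_time_us_to_ms(1500.0) gives 1.5.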