Timing functions in CUDA Fortran

Which timing functions can be called from outside and from inside a kernel subroutine in CUDA Fortran? I would like to separate the time spent in the actual computation from the time spent in data transfers.

Thank you

Hi Kaustubh,

You might find the following article I wrote helpful: http://www.pgroup.com/lit/articles/insider/v2n1a4.htm
My example program uses CUDA events to show the data transfer time as well as the compute kernel times.
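
As a rough idea of what event-based timing looks like from the host side, here is a minimal sketch. It is not the code from the article; the kernel `scale`, the array size, and the launch configuration are made up for illustration. It assumes a CUDA Fortran compiler and the `cudafor` module.

```fortran
module kernels
  use cudafor
contains
  ! Hypothetical kernel used only to have something to time.
  attributes(global) subroutine scale(a, n)
    real :: a(*)
    integer, value :: n
    integer :: i
    i = (blockIdx%x - 1) * blockDim%x + threadIdx%x
    if (i <= n) a(i) = 2.0 * a(i)
  end subroutine scale
end module kernels

program time_with_events
  use cudafor
  use kernels
  implicit none
  integer, parameter :: n = 1024 * 1024
  real, allocatable :: a(:)
  real, device, allocatable :: a_d(:)
  type(cudaEvent) :: t0, t1, t2, t3
  real :: ms_h2d, ms_kernel, ms_d2h
  integer :: istat

  allocate(a(n), a_d(n))
  a = 1.0

  istat = cudaEventCreate(t0)
  istat = cudaEventCreate(t1)
  istat = cudaEventCreate(t2)
  istat = cudaEventCreate(t3)

  ! Record an event before and after each phase, all in the default stream.
  istat = cudaEventRecord(t0, 0)
  a_d = a                                   ! host-to-device transfer
  istat = cudaEventRecord(t1, 0)
  call scale<<<(n + 255) / 256, 256>>>(a_d, n)
  istat = cudaEventRecord(t2, 0)
  a = a_d                                   ! device-to-host transfer
  istat = cudaEventRecord(t3, 0)

  ! Wait until the last event has completed before reading the timings.
  istat = cudaEventSynchronize(t3)

  ! Elapsed times are returned in milliseconds.
  istat = cudaEventElapsedTime(ms_h2d, t0, t1)
  istat = cudaEventElapsedTime(ms_kernel, t1, t2)
  istat = cudaEventElapsedTime(ms_d2h, t2, t3)

  print *, 'host-to-device (ms): ', ms_h2d
  print *, 'kernel         (ms): ', ms_kernel
  print *, 'device-to-host (ms): ', ms_d2h

  istat = cudaEventDestroy(t0)
  istat = cudaEventDestroy(t1)
  istat = cudaEventDestroy(t2)
  istat = cudaEventDestroy(t3)
  deallocate(a, a_d)
end program time_with_events
```

Because the events are recorded in the same stream as the transfers and the kernel launch, each pair of events brackets one phase, so the transfer time and the compute time come out separately. For real measurements you would also want to check the returned `istat` values and run the kernel once beforehand as a warm-up.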

Hope this helps,