Hi,
I would like to know which timing functions can be called from host code (outside a kernel) and from inside a kernel subroutine in CUDA Fortran. I want to measure how much time is spent in the actual computation and how much is spent in data transfers.
Thank you
Regards
Kaustubh