We are unable to profile a CUDA + MPI + Fortran program, and we have not found a complete example of profiling such a program.
These are the steps we used:
$ pgfortran -ta=nvidia -o myprog myprog.f90
$ pgcollect ./myprog
$ pgprof -text -exe ./myprog
However, the resulting profile contains no information about the CUDA kernels.
Could you please suggest some documentation, or describe the profiling process for such a program in more detail?