Text-based profiling for a CUDA Fortran program with MPI

Hi!

We are not able to profile a CUDA + MPI + Fortran program, and we have not found a complete example of profiling such a program.

These are the steps we have used:

$ pgfortran -ta=nvidia -o myprog myprog.f90
$ pgcollect ./myprog
$ pgprof -text -exe ./myprog
However, the output does not contain any information about the CUDA kernels.

Could you please point us to some documentation, or describe the profiling process for such a program?

Thanks

Hi Manoj_YADAV,

Try adding “-cuda” to your pgcollect options.

% pgcollect -help -cuda
Reading rcfile /proj/pgi/linux86-64/15.7/bin/.pgcollectrc

Usage: pgcollect [-time] program [program_args]
       pgcollect [-hwtime|<event_options>] program [program_args]
       pgcollect [-hwtime|<event_options>] -exe program script [script_args]
-help[=groups|basic|cpu|gpu|overall]
                Show profiler usage & switches
-cuda[=gmem|branch|cfg:|tesla|cc1x|fermi|cc2x|kepler|cc3x|list]
                Collect performance data from CUDA-enabled GPU
    gmem        Global memory access statistics
    branch      Branching and Warp statistics
    cfg:        Specifies as CUDA profile config file
    tesla       Use counters for Tesla architecture
    cc1x        Use counters for compute capability 1.x
    fermi       Use counters for Fermi architecture
    cc2x        Use counters for compute capability 2.x
    kepler      Use counters for Kepler architecture
    cc3x        Use counters for compute capability 3.x
    list        List cuda event names used in profile config file
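
Applied to the steps in the question, the change might look like the sketch below. It reuses the original pgcollect and pgprof commands and only adds -cuda. The PROG variable and the run helper are illustrative names of my own, not part of the PGI tools; the helper prints each command and runs it only if the tool is found on PATH, so the script is harmless on a machine without the PGI compilers. I have not run this against your program.

```shell
#!/bin/sh
# Sketch: the workflow from the question, with -cuda added to pgcollect.
# PROG is a placeholder for the executable built with pgfortran.
PROG=${PROG:-./myprog}

# Hypothetical helper: print the command, then run it only if the tool exists.
run() {
    echo "+ $*"
    if command -v "$1" >/dev/null 2>&1; then
        "$@"
    fi
}

# Collect CPU and GPU performance data, then print the text report.
run pgcollect -cuda "$PROG"
run pgprof -text -exe "$PROG"
```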

Alternatively, you can use nvprof instead of pgcollect.
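
For an MPI run, a minimal sketch of the nvprof route could look like this. It assumes Open MPI as the launcher (which sets OMPI_COMM_WORLD_RANK for each process) and 4 ranks; both are assumptions on my part, so adjust them for your MPI and job size. nvprof's --log-file option writes its text output to a file, and %q{VAR} in the file name is replaced by the value of that environment variable, giving one log per rank. The command is printed first and only executed if both mpirun and nvprof are on PATH.

```shell
#!/bin/sh
# Sketch: one nvprof text log per MPI rank.
# Assumptions: Open MPI launcher (sets OMPI_COMM_WORLD_RANK), 4 ranks.
PROG=${PROG:-./myprog}
NP=${NP:-4}

# %q{OMPI_COMM_WORLD_RANK} is expanded by nvprof, not by the shell.
CMD="mpirun -np $NP nvprof --log-file nvprof.%q{OMPI_COMM_WORLD_RANK}.log $PROG"

echo "+ $CMD"
if command -v mpirun >/dev/null 2>&1 && command -v nvprof >/dev/null 2>&1; then
    $CMD
fi
```

Each nvprof.<rank>.log file should then contain the kernel and memcpy summary for that rank.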

Hope this helps,
Mat