Pgprof and accelerators

TheMatt · January 25, 2010, 7:31pm

I decided to try out pgprof with an accelerated kernel mainly for my own education and to see if there are bottlenecks I’m missing. I followed the example in the PGI Tools document:

> make runsorad-vector32.exe
pgfortran -fast -r4 -Mextend -Mpreprocess -Ktrap=fp -ta=nvidia,time -Minfo=ccff -c src/sorad.vector32.f
pgfortran -fast -r4 -Mextend -Mpreprocess -Ktrap=fp -ta=nvidia,time -Minfo=ccff -c src/sorad.orig.noaero.donottouch.f
pgfortran -fast -r4 -Mextend -Mpreprocess -Ktrap=fp -ta=nvidia,time -Minfo=ccff -c src/driver-check.f90
pgfortran -fast -r4 -Mextend -Mpreprocess -Ktrap=fp -ta=nvidia,time -Minfo=ccff sorad.vector32.o sorad.orig.noaero.donottouch.o driver-check.o -o runsorad-vector32.exe
> pgcollect -time runsorad-vector32.exe
...output from program...
> ls pg*out
pgpacc.out  pgprof.out
> pgprof -exe runsorad-vector32.exe

At that point, things diverge. Instead of seeing the two Accelerator columns shown in Figure 15.11, I get the normal two-column view, so nothing about my GPU kernel is displayed. Likewise, the Accelerator “undertab” stays blank.

Yet, there is that non-zero-size pgpacc.out file with cryptic information in it. Is there an extra flag/switch I need to use to get pgprof to read the accelerator results?

Matt

MatColgrove · January 26, 2010, 1:59pm

Hi Matt,

It might be a conflict between the “time” option and pgcollect. Both use the same profiling routine to capture the GPU timing info, so instead of being directed to the pgprof.out file, the output may be going to stderr. Try removing “time” from “-ta”.
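As a rough sketch of that change, assuming the same flags as your build line and that plain “-ta=nvidia” is what you want left over, something like the following would strip the suboption and print the commands to rerun. The variable names FLAGS and NEW_FLAGS are just for illustration, and the commands are echoed rather than executed:

```shell
# Flags copied from the original build line in this thread.
FLAGS='-fast -r4 -Mextend -Mpreprocess -Ktrap=fp -ta=nvidia,time -Minfo=ccff'

# Drop the ",time" suboption from -ta so that only pgcollect gathers GPU timing.
NEW_FLAGS=$(printf '%s\n' "$FLAGS" | sed 's/-ta=nvidia,time/-ta=nvidia/')

# Print (not run) the rebuild and profiling steps with the adjusted flags.
echo "pgfortran $NEW_FLAGS -c src/sorad.vector32.f"
echo "pgcollect -time runsorad-vector32.exe"
echo "pgprof -exe runsorad-vector32.exe"
```

In practice you would make the same edit to the flags in your makefile, rebuild, and rerun pgcollect and pgprof as before.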

Thanks,
Mat

TheMatt · January 26, 2010, 2:13pm

Ayup, that was it! I guess that means a minor bug report for the PGI Tools document which specifically states you can do both pgcollect and “time”.

Thanks,
Matt

(Preview: My next post/thread will be on pgcollect and OProfile…though I’m guessing the fault is with OProfile and not pgcollect.)

MatColgrove · January 26, 2010, 10:08pm

Hi Matt,

TheMatt wrote: “I guess that means a minor bug report for the PGI Tools document which specifically states you can do both pgcollect and ‘time’.”

Well, the doc is correct; it’s pgcollect that has the bug.

Thanks,
Mat
