Pgprof and accelerators

I decided to try out pgprof with an accelerated kernel mainly for my own education and to see if there are bottlenecks I’m missing. I followed the example in the PGI Tools document:

> make runsorad-vector32.exe
pgfortran -fast -r4 -Mextend -Mpreprocess -Ktrap=fp -ta=nvidia,time -Minfo=ccff -c src/sorad.vector32.f
pgfortran -fast -r4 -Mextend -Mpreprocess -Ktrap=fp -ta=nvidia,time -Minfo=ccff -c src/sorad.orig.noaero.donottouch.f
pgfortran -fast -r4 -Mextend -Mpreprocess -Ktrap=fp -ta=nvidia,time -Minfo=ccff -c src/driver-check.f90
pgfortran -fast -r4 -Mextend -Mpreprocess -Ktrap=fp -ta=nvidia,time -Minfo=ccff sorad.vector32.o sorad.orig.noaero.donottouch.o driver-check.o -o runsorad-vector32.exe
> pgcollect -time runsorad-vector32.exe
...output from program...
> ls pg*out
pgpacc.out  pgprof.out
> pgprof -exe runsorad-vector32.exe

At that point, things diverge. Instead of the two Accelerator columns shown in Figure 15.11, I get the normal two-column mode, which means nothing from my GPU kernel is displayed. Likewise, the Accelerator “undertab” is always blank.

Yet, there is that non-zero-size pgpacc.out file with cryptic information in it. Is there an extra flag/switch I need to use to get pgprof to read the accelerator results?

Matt

Hi Matt,

It might be a conflict between the “time” suboption and pgcollect. Both use the same profiling routine to capture the GPU timing info, so instead of being directed to the pgprof.out file, that info may be going to stderr. Try removing “time” from “-ta”, i.e. compile and link with -ta=nvidia.
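
For reference, that suggestion would amount to something like the rebuild below. It is only a sketch: it reuses the flags from the build log in the first post with “time” dropped from -ta, and the FFLAGS variable is just an illustrative shorthand, not something from the original makefile. It needs the PGI toolchain on the PATH, and whether it restores the Accelerator columns on a given release is an assumption until tried.

```shell
# Same flags as the original build, except -ta=nvidia,time becomes -ta=nvidia.
# FFLAGS is an illustrative shorthand, not part of the original makefile.
FFLAGS="-fast -r4 -Mextend -Mpreprocess -Ktrap=fp -ta=nvidia -Minfo=ccff"

pgfortran $FFLAGS -c src/sorad.vector32.f
pgfortran $FFLAGS -c src/sorad.orig.noaero.donottouch.f
pgfortran $FFLAGS -c src/driver-check.f90
pgfortran $FFLAGS sorad.vector32.o sorad.orig.noaero.donottouch.o driver-check.o -o runsorad-vector32.exe

# Collect and view the profile exactly as before.
pgcollect -time runsorad-vector32.exe
pgprof -exe runsorad-vector32.exe
```

With “time” removed, only pgcollect should be using the GPU profiling routine, so the accelerator data should no longer be diverted away from the profile output.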

Thanks,
Mat

Ayup, that was it! I guess that means a minor bug report for the PGI Tools document which specifically states you can do both pgcollect and “time”.

Thanks,
Matt

(Preview: My next post/thread will be on pgcollect and OProfile…though I’m guessing the fault is with OProfile and not pgcollect.)

Hi Matt,

Matt wrote: "I guess that means a minor bug report for the PGI Tools document which specifically states you can do both pgcollect and “time”."

Well, the doc is correct; it’s pgcollect that has the bug.

Thanks,
Mat