I know that I shouldn’t have to do anything special to profile an OpenMP implementation. However, when I run pgcollect on my executable, the program itself runs on several threads (evident from the 5x speedup), while pgcollect profiles everything as single-threaded. pgprof confirms this: all of the profiled sections of code are shown as running on a single thread/processor.
I think there is something wrong with my environment setup, but I don’t know where to start looking, since this problem isn’t well documented on the forums or elsewhere on the internet.
Any suggestions would be greatly appreciated.
My OMP code implementation skeleton is:
SUBROUTINE f_name()
!$OMP PARALLEL DEFAULT(PRIVATE) &
!$OMP SHARED(...)
!$OMP DO
  DO IC = 1, NC
    .....
  ENDDO
!$OMP END DO
!$OMP END PARALLEL
  RETURN
END
My compilation flags are:
pgfortran -V13.4 -mp -fast -Mipa=fast,inline -Minfo=ccff
The only environment variable changed from the default was:
Ubuntu 12.04 (precise)
12 Intel Xeon CPU X5650
pgcollect -V prints:
pgcollect 13.4-0 64-bit target on x86-64 Linux
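So far the speedup is my only evidence that the run is multi-threaded. A minimal sketch of a more direct check is below. The program name check_threads is hypothetical and not part of my actual code, and I am assuming it would be built with the same -mp flag as above. It only reports how many threads the OpenMP runtime starts in a parallel region. It says nothing about what pgcollect records.

```fortran
PROGRAM check_threads
  ! Hypothetical stand-alone check, not part of the original code.
  USE omp_lib
  IMPLICIT NONE
  INTEGER :: nthreads

  nthreads = 1          ! value reported if no parallel region is created
!$OMP PARALLEL
!$OMP MASTER
  ! Only the master thread writes the shared variable, so there is no race.
  nthreads = omp_get_num_threads()
!$OMP END MASTER
!$OMP END PARALLEL

  PRINT *, 'Threads in parallel region: ', nthreads
END PROGRAM check_threads
```

If this prints more than one thread while the profile still shows a single thread, the problem is more likely in how the profile is collected or displayed than in the OpenMP build itself.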