Pgprof application

Hello,

For the last few years I have always installed the PGI “Community” version, because I manage some computers in two university labs. Two months ago, I installed the latest PGI “Community” version that I found on the NVIDIA site (http://www.pgroup.com/products/community.htm redirected me to https://developer.nvidia.com/hpc-sdk). However, this new version (hpc-2020) doesn’t include the “pgprof” application, and I need it (older “Community” versions like 2019-19.4 and 2019.19.10 included “pgprof”).

What can I do now?

Thanks


Hi,

PGI was re-branded as the NVIDIA HPC Compiler and is now included as part of the NVIDIA HPC SDK (https://developer.nvidia.com/hpc-sdk). While the PGI Community Edition is no longer available, the NVHPC SDK is available at no cost for all releases, not just two releases a year. The older PGI drivers (pgcc, pgc++, pgfortran) are available with the SDK, but you should consider moving to the new compiler drivers (nvc, nvc++, nvfortran).

Pgprof was a repackaged version of nvprof. Nvprof is available in the SDK; however, NVIDIA deprecated this profiler about a year ago, so please consider transitioning to the new Nsight Systems and Nsight Compute profilers. See: https://developer.nvidia.com/blog/migrating-nvidia-nsight-tools-nvvp-nvprof/

Hope this helps,
Mat

It looks like nsys is not able to profile CPU-only applications the way pgprof did.
What is the solution (and the command lines)?

Thanks

You actually can get the same function-level profiling of a CPU application with nsys as with pgprof. By default nsys uses the “Last Branch Record” (LBR) method for CPU backtraces (-b lbr), which doesn’t give great detail.

Instead, I’d recommend compiling your code with “-g” or “-gopt” to include DWARF debug information. (-g can inhibit some optimizations to make the code easier to debug, so in this case I use -gopt, which includes DWARF information without reducing optimization.)

Then, when collecting the profile, add “-b dwarf” to the nsys command. The CPU report is not shown in the command-line “stats” output, so you then need to open the profile in the GUI. There, below the timeline, is an “Events View” box. Use its drop-down menu to select one of the three views: Top-Down, Bottom-Up, or Flat. This gives you the same function-level profile as you’d see in pgprof.
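As a rough sketch of the two steps above, the small Python helper below assembles the compile and profile command lines and prints them instead of running them. The file names (myapp.c, ./myapp, myapp_cpu) are hypothetical placeholders, and the choice of nvc as the driver is an assumption; use nvc++ or nvfortran as appropriate.

```python
import shlex
from typing import List

# Hypothetical names, not from this thread: adjust to your project.
SOURCE = "myapp.c"
BINARY = "./myapp"
REPORT = "myapp_cpu"


def build_command(source: str, binary: str) -> List[str]:
    """Compile with -gopt so DWARF info is emitted without reducing optimization."""
    return ["nvc", "-gopt", "-o", binary, source]


def profile_command(binary: str, report: str) -> List[str]:
    """Collect a profile using DWARF-based backtraces instead of the default LBR."""
    return ["nsys", "profile", "-b", "dwarf", "-o", report, binary]


if __name__ == "__main__":
    # Print the commands rather than executing them, so the sketch
    # can be tried on a machine without the HPC SDK installed.
    for cmd in (build_command(SOURCE, BINARY), profile_command(BINARY, REPORT)):
        print(shlex.join(cmd))
```

With the placeholder names this prints `nvc -gopt -o ./myapp myapp.c` followed by `nsys profile -b dwarf -o myapp_cpu ./myapp`; the resulting report is then opened in the GUI as described.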

If needed, you can adjust “--sampling-period” from the default of 1000000. The accepted values range from 125000 to 4000000. Note that the smaller the sampling period, the bigger the profile and the higher the profiling overhead.
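To make the trade-off concrete, here is a minimal sketch that checks a candidate period against the limits quoted above and estimates how many samples it yields relative to the default. It assumes the sample count scales inversely with the period, and the limits are the ones stated in this post; they may differ in other nsys versions.

```python
DEFAULT_PERIOD = 1_000_000  # default quoted in this thread
MIN_PERIOD = 125_000        # smallest accepted value quoted in this thread
MAX_PERIOD = 4_000_000      # largest accepted value quoted in this thread


def sampling_flag(period: int) -> str:
    """Return the nsys option for a period, rejecting out-of-range values."""
    if not MIN_PERIOD <= period <= MAX_PERIOD:
        raise ValueError(
            f"period {period} outside [{MIN_PERIOD}, {MAX_PERIOD}]"
        )
    return f"--sampling-period={period}"


def relative_sample_count(period: int) -> float:
    """Estimate samples relative to the default (assumes inverse scaling)."""
    return DEFAULT_PERIOD / period


if __name__ == "__main__":
    for period in (MIN_PERIOD, 250_000, DEFAULT_PERIOD, MAX_PERIOD):
        print(sampling_flag(period), f"~{relative_sample_count(period):g}x samples")
```

Under that assumption, a period of 250000 collects roughly four times as many samples as the default, with correspondingly larger reports and more overhead.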

-Mat

Dear Mat, thanks for your reply, it’s very clear.

This part is quite a blocker when working in an HPC environment … it would be great to see this evolve. We do not only profile GPU kernels, but complete portions of software.

Best regards