I’m attempting to profile a large mixed-language application that links to many library packages. The primary source code for the application and many of the library packages are compiled with nvfortran, nvcc, and nvc++ (all products from nvhpc@22.7), but some packages are required to be compiled with gcc@9.4.0.
The issue I’m seeing is that the Top-Down View in nsys-ui@2022.2.1 attributes 94% of the runtime to “broken backtraces”. Are there debugging flags that I can pass to the respective compilers that would produce better reporting?
There is a bunch of information on troubleshooting backtraces at User Guide :: Nsight Systems Documentation (this is a direct link to the symbol troubleshooting section; this interface doesn’t seem to want me to edit the link text).
Are you running CLI or GUI and what architecture are you on? I’m wondering which backtrace method is being invoked.
Thanks for the link. This is being collected by CLI on an MPI-parallel job on a Cascade Lake + V100 Linux cluster. The code is mostly OpenACC accelerated Fortran.
The code is being compiled with -O3 -fast -gopt -Melf. I played around with some of the things in the user manual, but the command line flags it describes appear to be for GCC, not NVHPC, and ResolveSymbols didn’t change anything.
Okay, “-b fp” means that you are using frame pointers for your back trace. I am wondering if most of your libraries were not compiled with frame pointers.
Can you try switching to “-b dwarf” or “-b lbr” (DWARF unwinding or Intel Last Branch Record)? LBR is the fastest, but its depth is limited by the hardware.
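To make the comparison concrete, here is a minimal sketch that builds one nsys command line per backtrace mode so the resulting reports can be opened side by side. The binary name ./myexe, the report_<mode> output names, and the RUN/APP environment variables are placeholders for illustration, not values from this thread; --backtrace is the long form of -b.

```shell
#!/usr/bin/env bash
# Sketch only: APP, RUN, and the report names are hypothetical placeholders.

APP="${APP:-./myexe}"   # application binary to profile (assumed name)

# Print the nsys command line for one backtrace mode (fp, dwarf, or lbr).
build_nsys_cmd() {
    local mode="$1"
    local app="$2"
    echo "nsys profile --sample=cpu --backtrace=${mode} -o report_${mode} ${app}"
}

for mode in fp dwarf lbr; do
    cmd="$(build_nsys_cmd "$mode" "$APP")"
    if [ "${RUN:-0}" = "1" ]; then
        $cmd            # launch the profile (requires nsys on PATH)
    else
        echo "$cmd"     # dry run: only show what would be executed
    fi
done
```

By default this only prints the three commands; setting RUN=1 executes them. For an MPI job the printed command would normally go after the launcher (for example mpirun), with a per-rank output name so the ranks do not overwrite each other's reports.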
Thanks for the suggestions. -b lbr disables the Top-Down View, so not much help there. -b dwarf creates small traces (~2MB rather than 14MB for frame pointers), and most of the symbols are unresolved. ResolveSymbols -s myexe trace_file.nsys-rep wasn’t able to resolve any of them. Just for kicks, I tried running -b fp on a debug version: no improvement.
I just recompiled and profiled with -Mframe added to the compiler flags, with no change. Is that the correct nvhpc equivalent of -fno-omit-frame-pointer? Would -pg help?
I’ll note that the application that I’m profiling does have a very deep call chain: maybe that is causing problems. Also, the fragments listed under “Broken backtraces” are firmly within the application, not on the boundaries to calls to 3rd party libraries.
I’m having a hard time tracking down the answer to your compiler switch question. Can you point me at the documentation for the -Mframe and -pg switches?
Hey Paul, I would be happy to help with a reproducer. Alternatively, if you can share the code in question with me, I can work with @rknight to understand what’s going wrong on your systems.