You actually can get the same function-level profiling of the CPU application as pgprof with nsys. By default nsys uses the “Last Branch Record” (LBR) for CPU backtraces (-b lbr), which doesn’t give great detail.
Instead, I’d recommend compiling your code with “-g” or “-gopt” to include DWARF information. (-g can inhibit some optimizations to make the code easier to debug, so in this case I use -gopt, which includes DWARF without reducing optimization.)
Then when collecting the profile, add “-b dwarf” to nsys. The CPU report is not shown in the command line “stats” output, so you then need to open the profile in the GUI. There, below the timeline, there’s an “Events View” box. Use its drop-down menu to select one of the three views: Top-down, Bottom-up, or Flat. This gives you the same function-level profile as you’d see in pgprof.
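As a rough sketch, the two steps above would look something like this. The names myapp.c, myapp, and myapp_dwarf are placeholders I made up, and I'm assuming the NVHPC C compiler (nvc); swap in nvfortran or nvc++ and your own optimization flags as needed.

```shell
# Placeholder names: myapp.c / myapp / myapp_dwarf stand in for your own files.
# Build with DWARF debug info without lowering the optimization level.
nvc -gopt -O2 -o myapp myapp.c

# Collect the profile using DWARF-based backtraces instead of the default LBR.
nsys profile -b dwarf -o myapp_dwarf ./myapp
```

Then open the resulting report file in the Nsight Systems GUI to see the CPU views.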
If needed, you can adjust the “--sampling-period” from the default 1000000. The accepted values range between 125000 and 4000000. Though the smaller the sampling period, the more samples are taken, so the bigger the profile and the more profiling overhead.
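For example, to sample about twice as often as the default, something like the following should work. The value 500000 is just an illustration that falls inside the range quoted above, ./myapp is a placeholder for your own binary, and the accepted range may differ between nsys versions.

```shell
# Placeholder binary name; 500000 is an illustrative value, half the default period.
# A smaller period means more samples, a larger report, and more overhead.
nsys profile -b dwarf --sampling-period=500000 ./myapp
```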