Nsight Systems uses LD_PRELOAD to inject the MPI tracing. How is MPI loaded and set up in your execution environment? How does the execution of your “real” app differ from that of your test app?
One possibility is that MPI tracing fails silently. Let’s first check the stdout and stderr outputs (use the drop-down box “Timeline View” on the top left). If this doesn’t help, we should check NVLOG. To do so, save the following content to an nsys_nvlog.config file:
+ 75iewf 75IWEF global
$ /tmp/nsight-sys.log
ForceFlush
Format $sevc$time|${name:0}|${tid:5}|${file:0}:${line:0}[${sfunc:0}]: $text
Then prepend NVLOG_CONFIG_FILE= to the command line so that it is set as an environment variable for the run. After execution, there should be a log file at /tmp/nsight-sys.log.
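For convenience, the steps above can be sketched as a small shell snippet. This is only a sketch under a few assumptions: the config file is written to the current directory, NVLOG_CONFIG_FILE is assumed to take the path to that file, and the launcher, rank count, and `./my_app` in the commented command are hypothetical placeholders for your own setup.

```shell
# Write the NVLOG configuration from this post.
# The quoted heredoc ('EOF') keeps the $... tokens literal instead of expanding them.
cat > nsys_nvlog.config <<'EOF'
+ 75iewf 75IWEF global
$ /tmp/nsight-sys.log
ForceFlush
Format $sevc$time|${name:0}|${tid:5}|${file:0}:${line:0}[${sfunc:0}]: $text
EOF

# Hypothetical invocation (assumption: the variable points at the config file path).
# Adjust the launcher, rank count, and application to match your real run:
# NVLOG_CONFIG_FILE="$PWD/nsys_nvlog.config" nsys profile --trace=mpi mpirun -np 2 ./my_app
```

Once the run has finished, /tmp/nsight-sys.log is the file to look at or attach here.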
Note that MPI Fortran 2008 is not supported by Nsight Systems, so MPI calls made through it won’t show up.