I’m currently profiling a unified memory-based application running on a Grace Hopper architecture using nsys. However, I notice that the CLI output does not show any information about CPU and GPU page faults. Here is the command I use to run the profiler:
nsys profile --output=my_output ./my_exe
Version of nsys: 2024.2.3
Am I missing a specific configuration or flag to display page fault details? If there’s a recommended workflow or option to include this data in the output, please let me know.
A couple of things. Unified memory page faults cause a fair amount of overhead, so they are not turned on by default. You will need to explicitly request them.
Secondly, the information is not automatically printed to the screen after you run the CLI. There are some statistical analysis scripts that will put things on the screen, but none of the ones that run by default show unified memory page faults. It will also be much simpler to understand what is going on if you visualize them on the timeline.
I would go ahead and take a look at the examples at User Guide - nsight-systems 2025.1 documentation. (I know that looks like a link to the top of the documentation because the forum software munges it, but it is actually a direct link to unified memory transfer trace.)
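For anyone scripting this on a headless machine, here is a rough sketch of a wrapper that turns the page fault tracing on explicitly. The flag names `--cuda-um-cpu-page-faults` and `--cuda-um-gpu-page-faults` are what I would expect for this release, but please treat them as an assumption and confirm with `nsys profile --help` on your install. `build_profile_cmd` is just a hypothetical helper name.

```python
import shlex
import shutil
import subprocess


def build_profile_cmd(exe, output="my_output", extra_args=()):
    """Return the argv list for an nsys run with UM page-fault tracing enabled."""
    return [
        "nsys", "profile",
        # Assumed flag names; verify with `nsys profile --help` for your version.
        "--cuda-um-cpu-page-faults=true",
        "--cuda-um-gpu-page-faults=true",
        f"--output={output}",
        exe, *extra_args,
    ]


if __name__ == "__main__":
    cmd = build_profile_cmd("./my_exe")
    print(shlex.join(cmd))  # show the exact command line being used
    if shutil.which("nsys"):  # only launch when nsys is actually on PATH
        subprocess.run(cmd, check=True)
```

Since page fault tracing adds overhead, it is probably worth keeping a separate run without these flags for timing comparisons.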
I used the options --cuda-um-cpu-page-faults=true and --cuda-um-gpu-page-faults=true, but I still cannot see the page faults in the CLI output. Could you clarify why this might be happening?
What scripts are you referring to in this context? Could you please elaborate?
Are you suggesting that page faults can only be viewed on the GUI by default and not through the CLI? If so, could you confirm or explain this limitation?
Nsight Systems only outputs data to the command line post run (other than basic diagnostic information) if the user has asked for it. The command that you were showing did not ask for any statistical information.
What you want to do, if you do not want to look at the data in the GUI, is to use one of the post analysis stats scripts (User Guide — nsight-systems 2025.1 documentation). In particular, you probably want to use one of:
um_cpu_page_faults_sum – Unified Memory CPU Page Faults Summary
um_sum[:rows=<limit>] – Unified Memory Analysis Summary
um_total_sum – Unified Memory Totals Summary
In the documentation of statistical analysis you can get more details on each of these. The output can be directed to the screen or to a file in various formats.
The stats scripts are also written in Python, so if you want something slightly different from what we provide, you can make some alterations.
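If you would rather poke at the exported SQLite file directly, a minimal sketch like the one below can at least tell you whether any unified memory events were recorded. It assumes the export names its unified memory tables with `_UM_` in them (for example something like `CUDA_UM_CPU_PAGE_FAULT_EVENTS`); that naming is an assumption on my part, so check the schema of your own export. `um_table_row_counts` is a hypothetical helper, not part of nsys.

```python
import sqlite3
import sys


def um_table_row_counts(db_path):
    """Map each table whose name contains '_UM_' to its row count."""
    con = sqlite3.connect(db_path)
    try:
        tables = [row[0] for row in con.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
        # Filter in Python so that ENUM_* tables are not matched by accident.
        um_tables = [t for t in tables if "_UM_" in t.upper()]
        return {t: con.execute(f'SELECT COUNT(*) FROM "{t}"').fetchone()[0]
                for t in um_tables}
    finally:
        con.close()


if __name__ == "__main__":
    if len(sys.argv) > 1:
        for name, count in um_table_row_counts(sys.argv[1]).items():
            print(f"{name}: {count} rows")
```

An empty result, or tables with zero rows, would suggest the page fault tracing was not enabled for that run.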
The thing is, we have remote servers with no GUI, so I am limited to the CLI and wanted to know if I could get results similar to the GUI on the CLI. I am able to generate the .nsys-rep file and the sqlite3 file but using
If you just call “nsys stats my_output.nsys-rep” you will only get some default statistics, which do not include any unified memory information. You need to call:
nsys stats --report <one of the scripts I mentioned above> my_output.nsys-rep
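To make that concrete, here is a small sketch that asks for all three unified memory reports in one go. The report names are the ones listed earlier in this thread; `build_stats_cmd` and its `fmt` parameter are hypothetical names of mine, and `--format` is included on the assumption that you may want CSV instead of the default table (see `nsys stats --help`).

```python
import os
import shlex
import shutil
import subprocess

UM_REPORTS = ("um_sum", "um_total_sum", "um_cpu_page_faults_sum")


def build_stats_cmd(report_file, reports=UM_REPORTS, fmt=None):
    """Return the argv list for `nsys stats` with one --report per script."""
    cmd = ["nsys", "stats"]
    for name in reports:
        cmd += ["--report", name]
    if fmt:  # e.g. "csv"; omit to keep the default output format
        cmd += ["--format", fmt]
    cmd.append(report_file)
    return cmd


if __name__ == "__main__":
    cmd = build_stats_cmd("my_output.nsys-rep")
    print(shlex.join(cmd))
    # Only run when nsys is installed and the report file is present.
    if shutil.which("nsys") and os.path.exists("my_output.nsys-rep"):
        subprocess.run(cmd, check=True)
```

Keep in mind the reports will only have data if the page fault options were enabled when the profile was collected.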