I’d like to use the scripts in nsys/target-linux-x64/reports to read nsys reports. However, when I run a script with the exported SQLite file as its argument, it fails with the error “no such function: median”. Here are the steps I took.
I’ve added the /storage/users/yhao24/opt/nsys-2024.7.1/host-linux-x64/python/lib to PYTHONPATH.
For the first nsys stats command, I saw the following output.
Generating SQLite file /tmp/tritonbench/rope/nsys_traces/inductor_rotary_pos_emb_full_op_0/nsys_output.sqlite from /tmp/tritonbench/rope/nsys_traces/inductor_rotary_pos_emb_full_op_0/nsys_output.nsys-rep
Processing [/tmp/tritonbench/rope/nsys_traces/inductor_rotary_pos_emb_full_op_0/nsys_output.sqlite] with [/storage/users/yhao24/opt/nsys-2024.7.1/host-linux-x64/reports/nvtx_sum.py]...
** NVTX Range Summary (nvtx_sum):

Time (%) Total Time (ns) Instances  Avg (ns)  Med (ns) Min (ns) Max (ns) StdDev (ns)  Style  Range
-------- --------------- --------- --------- --------- -------- -------- ----------- ------- ------------------
   100.0         704,805         1 704,805.0 704,805.0  704,805  704,805         0.0 PushPop :tritonbench_range
It looks like this report script, nvtx_sum.py, can be executed successfully by nsys.
But if I run /storage/users/yhao24/opt/nsys-2024.7.1/host-linux-x64/reports/nvtx_sum.py nsys_output.sqlite directly, it fails with the “no such function: median” error.
Nsight Systems uses a custom build of the SQLite library that includes a number of additional functions, including the median() aggregate function (as well as the standard deviation function). Some of these build customizations are open-source packages, but some are Nsight-specific code. This custom SQLite library is also integrated into our build of Python, so those SQL functions are available in the built-in Python SQLite module.
In short, the only reliable way to run the reports is to use the Python that ships with Nsight Systems. The reports were never meant to be run as standalone Python scripts.
This opens up a bigger question of what problem you are trying to solve by running the reports outside of the Nsight environment. The nsys stats command has extensive options for outputting reports in different formats, including CSV and other “code friendly” formats. There are also options to redirect the output to a file, or even to pipe the output to another process. We’ve tried to cover an extensive number of use-cases, including integration into automated test environments (something that’s used extensively internally at NVIDIA). If there is something you’re attempting to do that we don’t already provide for, we’d like to understand what you’re trying to accomplish.
Re-reading your original post, I might have misunderstood the issue. If all you want to do is run the nvtx_sum report (and only that report) you can do that with the command:
If no --report is given, nsys runs a default subset of reports, but you can specify one or more reports on the CLI so only those reports are run. See nsys help stats for more info.
There are many reports that are not included in the default list. See nsys stats --help-reports for a full list, and nsys stats --help-report <name> for a more complete description of each report.
Thanks for your reply! I’m trying to implement an nsys report analyzer in tritonbench, similar to the existing ncu analyzer. I would prefer to use the scripts in nsys/target-linux-x64/reports rather than make another system call to nsys stats --report. I found that some scripts in nsys/target-linux-x64/reports that are marked as deprecated, such as nvtxpptrace.py, do work. But since they are marked as deprecated, I’m wondering whether the latest scripts can work without calling nsys. Since, as you replied, the scripts there are meant for a specific environment, I guess I’ll use the deprecated ones for now.
nvtxpptrace was replaced with nvtx_pushpop_trace. We had a mass renaming several versions ago. If those reports still exist in your install, you might consider upgrading.
Which reports will and won’t work depends on the SQL features they use. Most trace reports don’t aggregate data, so they don’t need the additional median and standard deviation functions, nor some of the other custom functions.
But yes, nsys stats can process several different reports in one pass, and can output the data to a CSV file or some other format that should be easy to bring into a numeric framework. A simple --output . will do that, or you can choose a custom name. Also see --format (and --help-format[s]) for other supported formats.
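For an analyzer written in Python, the CSV route described above could look roughly like the sketch below. It is only an illustration under a few assumptions: nsys is on PATH, passing --output - sends the report to stdout, and any status lines printed before the CSV header contain no comma. The function names (build_stats_command, parse_stats_csv, run_report) are made up for this example and are not part of Nsight Systems.

```python
import csv
import io
import subprocess


def build_stats_command(report_file, report="nvtx_sum", nsys="nsys"):
    """Build an `nsys stats` command that prints one report as CSV on stdout."""
    return [nsys, "stats", "--report", report,
            "--format", "csv", "--output", "-", str(report_file)]


def parse_stats_csv(text):
    """Parse CSV text into a list of dicts, one per report row.

    Assumption: the CSV header is the first line that contains a comma;
    anything before it (status messages, blank lines) is skipped.
    """
    lines = text.splitlines()
    start = next((i for i, line in enumerate(lines) if "," in line), len(lines))
    return list(csv.DictReader(io.StringIO("\n".join(lines[start:]))))


def run_report(report_file, report="nvtx_sum", nsys="nsys"):
    """Run `nsys stats` once and return the parsed rows of the chosen report."""
    result = subprocess.run(
        build_stats_command(report_file, report, nsys),
        capture_output=True, text=True, check=True,
    )
    return parse_stats_csv(result.stdout)
```

Calling run_report("nsys_output.sqlite") would then return rows keyed by the column headers of the report, with all values still as strings, so numeric columns need converting before use.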
Depending on the complexity of what you’re trying to do, you can also write your own stats report, or use the newer recipe system. You can also have your analysis system import the data directly from SQLite, or from one of the other supported export formats, such as Arrow.
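If you do read the exported SQLite file from a stock Python, one possible workaround for the missing median() is to register your own aggregate on the connection with the standard sqlite3 module. The sketch below is not the Nsight implementation and may not match its semantics exactly; the Median class, connect_with_median helper, and demo_ranges table are invented for illustration. Reports that depend on other custom functions would still fail.

```python
import sqlite3
import statistics


class Median:
    """User-defined aggregate: collects values and returns their median."""

    def __init__(self):
        self.values = []

    def step(self, value):
        if value is not None:  # ignore NULLs, as SQL aggregates normally do
            self.values.append(value)

    def finalize(self):
        return statistics.median(self.values) if self.values else None


def connect_with_median(path):
    """Open a SQLite database and register a one-argument median() aggregate."""
    conn = sqlite3.connect(path)
    conn.create_aggregate("median", 1, Median)
    return conn


# Demo on an in-memory database; `demo_ranges` is a made-up table,
# not part of the Nsight Systems export schema.
conn = connect_with_median(":memory:")
conn.execute("CREATE TABLE demo_ranges (name TEXT, duration_ns INTEGER)")
conn.executemany(
    "INSERT INTO demo_ranges VALUES (?, ?)",
    [("a", 10), ("a", 30), ("a", 20), ("a", 40)],
)
median_ns = conn.execute("SELECT median(duration_ns) FROM demo_ranges").fetchone()[0]
print(median_ns)
```

The same pattern would apply to a standard deviation aggregate, using whatever name the report's SQL actually calls.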
Hi @jkreibich, I just found that, for ranges longer than 1 ms, nvtx_sum seems to return execution times in us rather than ns, while cuda_gpu_kern_sum and nvtx_kern_sum return times in ns, as their table headers show. Could you help double-check this?
Are you using the GUI or CLI? I believe the GUI attempts to auto-scale values so they’re in “convenient” ranges. This makes the display units both report and file dependent. The CLI will default to showing all time values in ns, unless you specify otherwise.