The run above was very slow. I let it run for a long time and ended up with a 3 GB ncu-rep file. Loading that file in the UI needs more than 32 GB of RAM, so I am not able to view it; even after adding swap, the UI keeps going into a not-responding state.
So I tried an alternative:
ncu --import profile.ncu-rep --print-fp true --print-details all --section section --csv
but I am not able to get the report; I get errors (image attached).
Sorry for the issue you ran into. The screenshot shows that the required metric XXX could not be found.
Can you use --metrics to collect those metrics and see what error is reported?
Also, to reduce the report size, can you add the -c option to the command line as well?
I added --metrics and -c but still get the same report, i.e. the metric values are missing and the number of lines in the output file is not reduced either. It is the same with and without -c.
The full command I used:
ncu --import profile.ncu-rep --print-fp true --print-details all --section section --csv --metrics -c
The error I am getting is:
ERR Required metric sm__throughput.avg.pct_of_peak_sustained_elapsed could not be found.
----- --------------------------------------------------------------------------------------------------------------
ERR Required metric gpu__compute_memory_throughput.avg.pct_of_peak_sustained_elapsed could not be found.
----- --------------------------------------------------------------------------------------------------------------
ERR Rule Bottleneck returned an error: Some required metrics are missing; aborted rule execution.
----- --------------------------------------------------------------------------------------------------------------
ERR <built-in function raise_exception> returned a result with an exception set
/root/Documents/NVIDIA Nsight Compute/2023.3.0/Sections/SpeedOfLight.py:90
/opt/nvidia/hpc_sdk/Linux_x86_64/23.11/profilers/Nsight_Compute/target/linux-desktop-glibc_2_11_3-x64/../../sections/RequestedMetrics.py:231
/opt/nvidia/hpc_sdk/Linux_x86_64/23.11/profilers/Nsight_Compute/target/linux-desktop-glibc_2_11_3-x64/../../sections/NvRules.py:2834
I see, the two metrics above are required ones. I will try collecting without adding --metrics:
ncu --replay-mode application -o profile --launch-skip 900 --section SpeedOfLight --target-processes-filter regex:text-generation-server text-generation-launcher
Let me know if anything in the collection can be optimized; I need SpeedOfLight only.
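Since only the SpeedOfLight values are needed, one possible way to avoid the UI is to export the report as CSV and filter it with a small script that reads one row at a time. The following is a minimal Python sketch under stated assumptions, not a verified workflow for this report: it assumes the CSV header line starts with "ID" and has columns named "Kernel Name", "Section Name", "Metric Name" and "Metric Value", and that the SpeedOfLight section appears with a name containing "Speed Of Light". The function names and the sample rows are hypothetical; check the column names against the header of the actual file.

```python
import csv
import io
from itertools import dropwhile
from typing import Iterable, Iterator, List, Tuple

# Assumed column names of the CSV export; verify against your file's header.
KERNEL_COL = "Kernel Name"
SECTION_COL = "Section Name"
METRIC_COL = "Metric Name"
VALUE_COL = "Metric Value"


def iter_section_rows(lines: Iterable[str], section_substr: str) -> Iterator[dict]:
    """Yield CSV rows whose section name contains section_substr (case-insensitive)."""
    # Skip anything printed before the CSV header (assumed to start with "ID").
    csv_lines = dropwhile(lambda ln: not ln.startswith('"ID"'), lines)
    wanted = section_substr.lower()
    for row in csv.DictReader(csv_lines):
        if wanted in (row.get(SECTION_COL) or "").lower():
            yield row


def summarize(lines: Iterable[str],
              section_substr: str = "speed of light") -> List[Tuple[str, str, str]]:
    """Return (kernel, metric, value) tuples for the matching section only."""
    return [
        (row[KERNEL_COL], row[METRIC_COL], row[VALUE_COL])
        for row in iter_section_rows(lines, section_substr)
    ]


# Hypothetical sample input, made up for illustration only.
SAMPLE = '''some non-CSV line before the header
"ID","Kernel Name","Section Name","Metric Name","Metric Unit","Metric Value"
"0","kernel_a","GPU Speed Of Light Throughput","Compute (SM) Throughput","%","41.5"
"0","kernel_a","Launch Statistics","Block Size","","256"
'''

if __name__ == "__main__":
    for kernel, metric, value in summarize(io.StringIO(SAMPLE)):
        print(kernel, metric, value)
```

To try it on real data, the report could be exported with something like ncu --import profile.ncu-rep --page details --csv > details.csv and the open file passed to summarize(), e.g. summarize(open("details.csv", newline="")). Whether the import step itself fits in memory for a 3 GB report is not something this sketch addresses.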