Problem with nv-nsight-cu-cli

Greetings all,

I am trying to run the command-line version of Nsight Compute. I followed the steps in:

When I execute the following command:

/usr/local/NVIDIA-Nsight-Compute-2021.2/nv-nsight-cu-cli --query-metrics-mode suffix --metrics "regex:.*" --devices 0 ./bfs …/…/data/bfs/graph1MW_6.txt

This is the command given in the PDF file I shared above. I need the values of all metrics for the BFS benchmark in the Rodinia benchmark suite. However, executing this command produces the following error:

Device Quadro RTX 4000 (TU104)
==WARNING== The below metrics are either invalid or don’t have suffixes.
regex:.*

I am following the steps as outlined, but I am still getting the error. If anyone can help, that would be great.

Regards
Govind

--query-metrics-mode suffix does not support specifying metric base names via regex. You can simply use

ncu --query-metrics-mode all --devices 0

to achieve this. Note that specifying a target application binary when querying metrics is neither possible nor useful. First determine which metrics you need, then collect them for the specific application. I recommend against collecting every possible metric; instead, collect the “full” set and/or the subset of metrics that your analysis requires. You can combine --set full with --metrics.

ncu --set full ./bfs
ncu --metrics <comma-separated list> ./bfs
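
To avoid typing the comma-separated list by hand, one option is to keep the metric names in a text file and join them in the shell. This is only a sketch: the helper join_metrics, the temporary metric file, and the three metric names are placeholders chosen for illustration, so substitute names that the metric query reports for your own device. It prints the resulting ncu command instead of running it.

```shell
#!/bin/sh
# join_metrics FILE: hypothetical helper, not part of ncu.
# Reads one metric name per line, skips blank lines and '#' comments,
# and prints the names joined by commas (the form --metrics expects).
join_metrics() {
    grep -v -e '^[[:space:]]*$' -e '^[[:space:]]*#' "$1" | paste -sd, -
}

# Example metric file; the names are illustrative placeholders.
metrics_file=$(mktemp)
cat > "$metrics_file" <<'EOF'
# metrics of interest for the bfs run
sm__cycles_elapsed.avg
dram__bytes_read.sum

gpu__time_duration.sum
EOF

# Dry run: show the command instead of executing it.
echo "ncu --set full --metrics $(join_metrics "$metrics_file") ./bfs"
```

Once the printed command looks right, drop the echo and the surrounding quotes to run it for real.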

I am trying to collect all metrics when profiling. I was hoping to get something like --metrics "regex:.*" to work so that I could avoid typing out every metric, but I could not get it to work.

Currently I am using

nv-nsight-cu-cli              \
--target-processes all        \
--set full                    \
--kernel-id ::my_kernel:32    \
-fo quick-test                \
quick-test.py

However, it is not collecting some of the memory analysis metrics, as can be seen below.

Any suggestions on what I need to do?

I found that my issue might be related to the device I am using. The image above shows the metrics collected on a Tesla P40 (Pascal architecture, SM_61).

According to the documentation for nv-nsight-cu-cli v2019.5, several more metrics are listed under PerfWorks Metric or Formula (>= SM 7.0).

I tested profiling with a Tesla T4 (Turing architecture, SM_75), and the metrics I was looking for were there.
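
For anyone hitting the same thing, a quick way to tell which case a device falls into is to compare its compute capability against 7.0, the threshold quoted in the documentation. The sketch below assumes the capability is already known as a string such as 6.1 or 7.5; is_sm70_or_newer is a hypothetical helper, not something shipped with Nsight Compute.

```shell
#!/bin/sh
# is_sm70_or_newer CC: hypothetical helper.
# Succeeds when the major part of a compute capability string
# such as "7.5" is at least 7.
is_sm70_or_newer() {
    major=${1%%.*}
    [ "$major" -ge 7 ]
}

# The two devices from this thread: Tesla P40 (6.1) and Tesla T4 (7.5).
for cc in 6.1 7.5; do
    if is_sm70_or_newer "$cc"; then
        echo "SM $cc: >= SM 7.0 metric names apply"
    else
        echo "SM $cc: older (pre-SM 7.0) metric names apply"
    fi
done
```

Recent drivers should be able to report the capability with nvidia-smi --query-gpu=compute_cap --format=csv, though older drivers may not support that field.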