I am trying to use the Nsight Compute command line to collect all metrics at runtime. When I use --metrics "regex:.*", it lists all metrics, but every value is "n/a". When I instead run nv-nsight-cu-cli --metrics smsp__cycles_elapsed.avg.pct_of_peak_sustained_elapsed,sm__warps_active.avg.pct_of_peak_sustained_active, it reports actual values. How can I skip the metrics with "n/a" values, and why are all values "n/a" when I collect every metric?
Could you please let us know the OS, GPU, and version of Nsight Compute you are using (e.g. by running nv-nsight-cu-cli --version)?
For Pascal GPUs, I would not expect this behavior.
For Volta and later GPUs, there is a huge number of metrics, and collecting all at once is not an anticipated use case. Specifically, if collecting certain metrics causes issues, this might impact data collection for all metrics in the report, as you have seen. This behavior will be improved in future versions of the tool.
In the meantime, I recommend collecting more targeted subsets of metrics, e.g. as provided by the sections shipping with the tool, or as queried using the --query-metrics command line option.
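As a minimal sketch of such a targeted collection, the script below reuses the two metrics from the original post that returned real values. The application path ./my_app and the NCU and APP variables are placeholders introduced for illustration, not anything from this thread's setup. The script prints the command and only runs it if both the tool and the application are found.

```shell
#!/bin/sh
# Tool name as used in this thread; override NCU if the binary lives elsewhere.
NCU="${NCU:-nv-nsight-cu-cli}"
# Placeholder application path (an assumption, replace with your own binary).
APP="${APP:-./my_app}"

# The two metrics that reported values in the original post.
METRICS="smsp__cycles_elapsed.avg.pct_of_peak_sustained_elapsed,sm__warps_active.avg.pct_of_peak_sustained_active"

# Build the command as a string so it can be inspected before running.
CMD="$NCU --metrics $METRICS $APP"
echo "$CMD"

# Run only when both the tool and the application are available.
if command -v "$NCU" >/dev/null 2>&1 && [ -x "$APP" ]; then
    $CMD
fi
```

Growing the comma-separated list a few metrics at a time makes it easier to spot which metric, if any, turns the values into "n/a".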
Thanks a lot.
I am running it on an AWS P3 instance with CUDA 10.1 and driver 418.67, on Ubuntu.
So will --query-metrics display the metrics available on my instance, or the same set of metrics that regex:.* matches?
By default, --query-metrics will show the available metrics for all devices present in the current system. You can limit this to specific devices by additionally using the --devices parameter. You can also query for any specific chip by additionally using the --chip parameter. See --help or Nsight Compute CLI :: Nsight Compute Documentation for details.
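The three query forms described above can be sketched as follows. Device ID 0 and the chip name gv100 (assumed here to be the name for the V100 in a P3 instance) are illustrative values, so check them against your own system and the --help output. The script prints each command and only runs the per-device query if the tool is on the PATH.

```shell
#!/bin/sh
# Tool name as used in this thread; override NCU if the binary lives elsewhere.
NCU="${NCU:-nv-nsight-cu-cli}"

# Default: metrics for all devices present in the current system.
QUERY_ALL="$NCU --query-metrics"
# One device only; --devices takes a value (device ID 0 is an assumed example).
QUERY_DEV0="$NCU --devices 0 --query-metrics"
# A specific chip; gv100 is an assumed chip name, not confirmed in this thread.
QUERY_CHIP="$NCU --chip gv100 --query-metrics"

printf '%s\n' "$QUERY_ALL" "$QUERY_DEV0" "$QUERY_CHIP"

# Run the per-device query only when the tool is actually installed.
if command -v "$NCU" >/dev/null 2>&1; then
    $QUERY_DEV0
fi
```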
We pre-selected metrics for certain problem domains in what is called “sections”. By default, all sections are collected. You can use --list-sections to see which are available. You can also inspect which metrics are in which sections by looking at the .section files that ship with the tool. A list of those files can be found here: Nsight Compute :: Nsight Compute Documentation
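A minimal sketch of working with sections from the command line is shown below. The section identifier SpeedOfLight and the application path ./my_app are assumptions for illustration; take the real identifier from the --list-sections output on your installation. The script prints both commands and only runs them when the tool is found.

```shell
#!/bin/sh
# Tool name as used in this thread; override NCU if the binary lives elsewhere.
NCU="${NCU:-nv-nsight-cu-cli}"
# Placeholder application path (an assumption, replace with your own binary).
APP="${APP:-./my_app}"
# Assumed section identifier; confirm it with --list-sections first.
SECTION="${SECTION:-SpeedOfLight}"

# Show which sections are available.
LIST_CMD="$NCU --list-sections"
# Collect only the metrics belonging to one section.
COLLECT_CMD="$NCU --section $SECTION $APP"

printf '%s\n' "$LIST_CMD" "$COLLECT_CMD"

# Run only when the tool is installed; profile only if the app exists too.
if command -v "$NCU" >/dev/null 2>&1; then
    $LIST_CMD
    if [ -x "$APP" ]; then
        $COLLECT_CMD
    fi
fi
```

Restricting a run to one section keeps the metric set small, which fits the recommendation above to collect targeted subsets rather than everything at once.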
Thanks a lot.
I am not very familiar with these options. To use devices or sections, should I run something like this?
sudo /usr/local/cuda-10.1/NsightCompute-2019.3/nv-nsight-cu-cli --devices --query-metrics
When I do, I get the error "Invalid device --query-metrics in the --devices option".
The --devices option expects a value, so in your command it treated --query-metrics as the device. Please read the referenced documentation on these topics.