I tested the tool with my GV100 using the nv-nsight-cu-cli command line.
Then I viewed the profile file with the nv-nsight-cu tool.
In the Summary page I can see SOL Memory [%] and SOL SM [%].
However, when I run with my RTX 6000, the SOL Memory [%] and SOL SM [%] data is missing from the Summary page.
It is only available in the Details page for each kernel.
Hi, just to confirm my understanding: you are saying that when profiling on the RTX 6000, all the sections are properly filled in on the Details page (i.e. no “n/a” warnings), but the matching metrics are not available in the Summary page, correct?
Could you please let me know if you:
- Profiled GV100 and RTX6000 in the same or in different runs and reports?
- Used the default profiling settings (sections), or selected different metrics or sections for the runs?
As a note, the metric names are different between the GV100 and the RTX 6000, so collecting the same metric names on both might cause issues. However, if you are using the default settings, this doesn’t seem to be the problem here.
Thank you.
Hello,
Yes, in the Details page I get all the info for each kernel call, but not on the Summary page.
The GV100 and RTX 6000 were profiled in different runs with separate reports; I do not run them at the same time.
Here is the RTX 6000 output from the Summary page:
ID Time API Call ID Function Name Demangled Name Device Name
0 2018-Nov-11 13:46:14 23104 fusedConvolutionReluKernel void fused::fusedConvolutionReluKernel<fused::SrcChwcPtr_FltTex_Reader<float,int=1,int=1,int=1>,fused::KpqkPtrWriter<float,int=1,int=1>,float,float,int=3,int=4,int=1,int=7,int=7,int=2,int=2>(fused::ConvolutionParams<floatSrcType,int=1,int=1Type>,float) Quadro RTX 6000
It is missing the SOL Memory [%] and SOL SM [%] columns.
For comparison, here is the Summary page for the GV100:
ID Time API Call ID Function Name Demangled Name Device Name Cycles [cycle] SOL Memory [%] SOL SM [%]
0 2018-Nov-02 14:30:36 22840 fusedConvolutionReluKernel void fused::fusedConvolutionReluKernel<fused::SrcChwcPtr_FltTex_Reader<float,int=1,int=1,int=1>,fused::KpqkPtrWriter<float,int=1,int=1>,float,float,int=3,int=4,int=1,int=7,int=7,int=2,int=2>(fused::ConvolutionParams<floatSrcType,int=1,int=1Type>,float) Quadro GV100 95,544.67 36.98 32.73
I can get the info from the Details page, but I don’t want to go kernel by kernel to get the SOL Memory [%] and SOL SM [%] values.
I am using the command line to collect the metrics for both cards:
/usr/local/NVIDIA-Nsight-Compute-1.0/nv-nsight-cu-cli
Thank you for the details. We will be looking into this, and I will update here once I have new information.
Hi rafa, I don’t have an update on your issue yet. In the meantime, could you use the Raw page instead of the Summary page to check these metrics across all collected kernel instances? The Raw page shows the metric names rather than the labels used in the Summary page. On the RTX 6000, the metric name for SOL SM is sm__throughput.avg.pct_of_peak_sustained_elapsed and the metric name for SOL Memory is gpu__compute_memory_throughput.avg.pct_of_peak_sustained_elapsed. I hope that helps until the issue is resolved on our end.
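If the raw metrics can be saved as CSV text, the two values could be gathered for every kernel instance with a small script instead of reading them kernel by kernel. The following is only a minimal sketch: the column names "ID", "Kernel Name", "Metric Name" and "Metric Value", the one-row-per-metric layout, the helper name collect_sol, and the sample numbers are all assumptions made for illustration, not something confirmed in this thread. Only the two metric names come from the post above.

```python
import csv
import io
from collections import defaultdict

# Metric names quoted in this thread for the RTX 6000, mapped to the
# labels that the Summary page uses.
SOL_METRICS = {
    "sm__throughput.avg.pct_of_peak_sustained_elapsed": "SOL SM [%]",
    "gpu__compute_memory_throughput.avg.pct_of_peak_sustained_elapsed": "SOL Memory [%]",
}


def collect_sol(csv_text):
    """Return {(id, kernel name): {label: value}} for the two SOL metrics.

    Assumes (hypothetically) one CSV row per metric with the columns
    "ID", "Kernel Name", "Metric Name" and "Metric Value".
    """
    table = defaultdict(dict)
    for row in csv.DictReader(io.StringIO(csv_text)):
        label = SOL_METRICS.get(row["Metric Name"])
        if label is None:
            continue  # not one of the two SOL metrics
        # Drop thousands separators such as "95,544.67" before converting.
        value = float(row["Metric Value"].replace(",", ""))
        table[(row["ID"], row["Kernel Name"])][label] = value
    return dict(table)


# Made-up sample data in the assumed layout; the values are not real measurements.
SAMPLE = """ID,Kernel Name,Metric Name,Metric Value
0,fusedConvolutionReluKernel,sm__throughput.avg.pct_of_peak_sustained_elapsed,30.5
0,fusedConvolutionReluKernel,gpu__compute_memory_throughput.avg.pct_of_peak_sustained_elapsed,40.25
0,fusedConvolutionReluKernel,some__other_metric.sum,"1,234"
"""

if __name__ == "__main__":
    for (kernel_id, name), metrics in collect_sol(SAMPLE).items():
        print(kernel_id, name, metrics)
```

If the actual export uses different column names, only the dictionary keys in collect_sol would need to change.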
The issue will be fixed in the next release of Nsight Compute.
Good news: Nsight Compute 2019.1 is now available. Please download it from https://developer.nvidia.com/gameworksdownload#?dn=nsight-compute-2019-1