Unable to get metrics with prefix 'pmsampling:' on Jetson Orin NX

I am trying to measure unified memory throughput while running GEMM operations in PyTorch.
Following the manual, I ran this command:

ncu --query-metrics-collection pmsampling

and got a list of the metrics available on Orin.
I chose some that relate to unified memory by searching for the keyword ‘soc’:

mcc__dram_throughput_op_read                                                Throughput      %               soc_mcc             MCC read throughput                                                                   
mcc__dram_throughput_op_read_internal_activity                              Throughput      %               soc_mcc             MCC read throughput, internal activity                                                
mcc__dram_throughput_op_write                                               Throughput      %               soc_mcc             MCC write throughput                                                                  
mcc__dram_throughput_op_write_internal_activity                             Throughput      %               soc_mcc             MCC write throughput, internal activity                                               
mcc__dram_throughput_srcnode_cpu_op_read                                    Throughput      %               soc_mcc             MCC read throughput from CPU                                                          
mcc__dram_throughput_srcnode_cpu_op_read_internal_activity                  Throughput      %               soc_mcc             MCC read throughput from CPU, internal activity                                       
mcc__dram_throughput_srcnode_cpu_op_write                                   Throughput      %               soc_mcc             MCC write throughput from CPU                                                         
mcc__dram_throughput_srcnode_cpu_op_write_internal_activity                 Throughput      %               soc_mcc             MCC write throughput from CPU, internal activity                                      
mcc__dram_throughput_srcnode_dbb_op_read                                    Throughput      %               soc_mcc             MCC read throughput from DBB                                                          
mcc__dram_throughput_srcnode_dbb_op_read_internal_activity                  Throughput      %               soc_mcc             MCC read throughput from DBB, internal activity                                       
mcc__dram_throughput_srcnode_dbb_op_write                                   Throughput      %               soc_mcc             MCC write throughput from DBB                                                         
mcc__dram_throughput_srcnode_dbb_op_write_internal_activity                 Throughput      %               soc_mcc             MCC write throughput from DBB, internal activity                                      
mcc__dram_throughput_srcnode_gpu_op_read                                    Throughput      %               soc_mcc             MCC read throughput from GPU                                                          
mcc__dram_throughput_srcnode_gpu_op_read_internal_activity                  Throughput      %               soc_mcc             MCC read throughput from GPU, internal activity                                       
mcc__dram_throughput_srcnode_gpu_op_write                                   Throughput      %               soc_mcc             MCC write throughput from GPU                                                         
mcc__dram_throughput_srcnode_gpu_op_write_internal_activity                 Throughput      %               soc_mcc             MCC write throughput from GPU, internal activity
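
For picking these names out of the full query output without scrolling, a small filter can help. This is only a sketch: the helper name `filter_soc_mcc` is hypothetical, and it assumes the whitespace-separated column layout of the listing above, where the fourth column holds `soc_mcc`.

```shell
# Print the metric name (first column) of every row whose fourth
# whitespace-separated column is "soc_mcc".
# Assumption: rows look like the listing above, i.e.
#   <name>  <type>  <unit>  <group>  <description...>
filter_soc_mcc() {
    awk '$4 == "soc_mcc" { print $1 }'
}

# Typical use on the device (requires ncu on PATH):
#   ncu --query-metrics-collection pmsampling | filter_soc_mcc
```

The function reads from standard input, so it can also be run on a saved copy of the query output.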

I tried collecting some of them with the following command:

sudo /usr/local/cuda-12.6/bin/ncu \
      --nvtx \
      --set full \
      --metrics pmsampling:mcc__dram_throughput_op_read.avg.pct_of_peak_sustained_elapsed \
      --force-overwrite \
      -o "$ncu_report_file" \
      $CONDA_PYTHON_PATH $script_name

I also tried these metrics in the same way:

mcc__dram_throughput_op_read_internal_activity                              Throughput      %               soc_mcc             MCC read throughput, internal activity                                                
mcc__dram_throughput_op_write                                               Throughput      %               soc_mcc             MCC write throughput                                                                  
mcc__dram_throughput_op_write_internal_activity                             Throughput      %               soc_mcc             MCC write throughput, internal activity                                               
mcc__dram_throughput_srcnode_cpu_op_read                                    Throughput      %               soc_mcc             MCC read throughput from CPU
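
Since `--metrics` accepts a comma-separated list, several of these names can be requested in one run instead of one run per metric. The sketch below builds that list; `build_pm_metrics` is a hypothetical helper, and it assumes every metric takes the same `.avg.pct_of_peak_sustained_elapsed` suffix as in the command above.

```shell
# Join metric names into a single comma-separated value for --metrics,
# adding the "pmsampling:" prefix and the suffix used in the command above.
# Assumption: the same suffix is valid for every name passed in.
build_pm_metrics() {
    suffix=".avg.pct_of_peak_sustained_elapsed"
    out=""
    for name in "$@"; do
        # Prepend the existing list and a comma only when the list is non-empty.
        out="${out:+$out,}pmsampling:${name}${suffix}"
    done
    printf '%s\n' "$out"
}

# Example use (same flags as the command above):
#   sudo /usr/local/cuda-12.6/bin/ncu --set full \
#       --metrics "$(build_pm_metrics mcc__dram_throughput_op_read mcc__dram_throughput_op_write)" \
#       --force-overwrite -o "$ncu_report_file" $CONDA_PYTHON_PATH $script_name
```

This only shortens the command line. It does not change which metrics the tool is able to collect.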

Each run produced a ‘*.ncu-rep’ file. I downloaded these files and opened them with the Nsight Compute GUI on Windows 11.
But when I looked for the metrics, none were there, as in the following figure:


I tried other forms of the metric name, such as:

pmsampling:mcc__dram_throughput_op_read
pmsampling:mcc__dram_throughput_op_read.avg

but that didn’t work either.
I then tried a different metric, for example:

pmsampling:l1tex__data_pipe_lsu_wavefronts.avg

and this time the metrics were collected:


So how can I solve this problem?

Platform:
Jetson Orin NX
CUDA 12.6
ncu 2024.3.1.0

Hello,

Thanks for visiting the NVIDIA Developer forums.

Your topic will be best served in the Jetson category, so I have moved this post there for better visibility.

Cheers,
Tom

Thank you.


Hi,

Thanks for reporting this.

With your command, we also see “no data available” in Nsight Compute.
We need to check this further and will update you.

Thank you.
The following picture shows that metrics such as ‘dramc_…’ cannot be collected on a shared-memory SoC like the Jetson Orin NX. So how can I get equivalent metrics, such as shared-memory throughput, read bandwidth, and write bandwidth?