Hi all,
I want to measure the peak read and write (load/store) bandwidth to device memory separately during my kernel executions. Does Nsight have any parameter that can capture that?
If not, is there any other tool I could use to obtain it? I know NVML provides functionality to do that for data across PCIe, but I need it for GPU memory.
Thanks
I would suggest starting by collecting the MemoryWorkloadAnalysis* sections, either on their own or together with any other metrics and/or sections you are interested in. This should give you several tables and charts when you open the resulting report in the UI.
nv-nsight-cu-cli --section "MemoryWorkloadAnalysis.*" (app)
If you only want to collect individual metrics, you can start with
nv-nsight-cu-cli --metrics dram__bytes_write.sum,dram__bytes_read.sum,dram__bytes_write.sum.pct_of_peak_sustained_elapsed,dram__bytes_read.sum.pct_of_peak_sustained_elapsed (app)
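The byte counters above are totals per kernel launch, so to get a bandwidth figure you still need to divide by the kernel duration. Here is a minimal sketch of that arithmetic in Python. It assumes you have also collected the kernel duration in nanoseconds (for example via a metric such as gpu__time_duration.sum, which is not part of the command above). The function name and the sample numbers are made up for illustration. Note that this gives the average bandwidth over the kernel, not an instantaneous peak.

```python
def achieved_bandwidth_gbs(bytes_moved: float, duration_ns: float) -> float:
    """Average bandwidth in GB/s (1 GB = 1e9 bytes) for one kernel launch."""
    if duration_ns <= 0:
        raise ValueError("duration_ns must be positive")
    # bytes per nanosecond is numerically equal to GB per second
    return bytes_moved / duration_ns


# Hypothetical values, standing in for numbers read from the profiler output.
read_bytes = 4.0e9    # assumed value of dram__bytes_read.sum
write_bytes = 1.0e9   # assumed value of dram__bytes_write.sum
duration_ns = 2.0e7   # assumed kernel duration: 20 ms

read_gbs = achieved_bandwidth_gbs(read_bytes, duration_ns)
write_gbs = achieved_bandwidth_gbs(write_bytes, duration_ns)
print(f"read:  {read_gbs:.1f} GB/s")
print(f"write: {write_gbs:.1f} GB/s")
```

The pct_of_peak_sustained_elapsed variants in the command above already express the same counters relative to the hardware peak, so this calculation is only needed if you want absolute numbers.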
See Nsight Compute :: Nsight Compute Documentation for the list of available sections. The currently active set is also available via --list-sections or in the Sections/Rules Info window in the UI.
See the --query-metrics and --query-metrics-mode command line options in Nsight Compute CLI :: Nsight Compute Documentation for how to query individual metric names.