RdBW and WrBW measurement

dumbogeorge · April 8, 2021, 4:32am

Hi All

Our application runs many CUDA kernels that support multiple inference networks. They run on multiple GPUs (2080s) installed in a regular Skylake PC. We would like to measure the total data read from and written to DDR by each GPU, as well as the peer-to-peer (P2P) traffic over PCIe (we are not using NVLink), over a given duration of our run.

Which tools should we use? nvvs? Are there performance counters that we can read and report directly, so that we can estimate the read/write contribution of each internal module? If these counters are hidden and not exposed, which API or libraries would give us the information we want?

Thanks.
