How to get the compute and memory throughput of GPU from the perspective of the whole GPU system

user122022 · September 21, 2022, 11:51am

Hi. When I profile my CUDA program or DL inference workload, I can get the compute and memory throughput for each kernel, even when the kernels run in different processes or streams. But I want these metrics for the whole GPU rather than per kernel when I launch multiple kernels in different processes or streams. How can I do that? Thanks so much!

jmarusarz · September 21, 2022, 8:22pm

If you want to see activity from the entire device as multiple processes and kernels run, you may be looking for Nsight Systems. Take a look here and see if it’s what you are looking for: User Guide :: Nsight Systems Documentation

user122022 · September 22, 2022, 12:56am

I can get the DRAM throughput for each kernel, but it seems that I cannot get it for the whole system, even with Nsight Systems.

jmarusarz · September 22, 2022, 3:11pm

If you collect GPU Metrics with Nsight Systems, you will get a row in the timeline for DRAM Bandwidth and Throughput (see below). This is for the entire device. There is more information about this here: User Guide :: Nsight Systems Documentation

Are you able to collect those metrics or are you looking for something else?
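For readers who want to try the GPU Metrics collection described in this reply, here is a minimal sketch in Python that only assembles an `nsys profile` command line. It rests on several assumptions that are not from the thread: the flag is spelled `--gpu-metrics-device` (some Nsight Systems releases use `--gpu-metrics-devices`, so check `nsys profile --help` for your version), `./my_cuda_app` is a hypothetical application, and `gpu_metrics_report` is a hypothetical output name.

```python
import shlex


def build_nsys_command(app_cmd, output="gpu_metrics_report",
                       metrics_flag="--gpu-metrics-device", devices="all"):
    """Build an `nsys profile` command line with GPU Metrics sampling enabled.

    metrics_flag is an assumption: its exact spelling depends on the
    installed Nsight Systems version, so it is left configurable here.
    """
    if not app_cmd:
        raise ValueError("app_cmd must contain the program to profile")
    # Resulting layout: nsys profile <metrics flag>=<devices> -o <output> <app> [args...]
    return ["nsys", "profile", f"{metrics_flag}={devices}", "-o", output, *app_cmd]


if __name__ == "__main__":
    # "./my_cuda_app" is a hypothetical binary used only for illustration.
    cmd = build_nsys_command(["./my_cuda_app"])
    # Print the command instead of running it, so the sketch works without nsys installed.
    print(shlex.join(cmd))
```

To actually collect a report, the returned list could be passed to `subprocess.run(cmd, check=True)` on a machine where `nsys` is installed and the GPU supports GPU Metrics sampling. The DRAM rows mentioned above should then appear when the report is opened in the Nsight Systems GUI.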

user122022 · September 23, 2022, 1:17am

Got it, thanks so much!
