What is the lowest level of GPU application activity that can be monitored on a GPU/GPUs? Is it PID or thread ID?

avdhoot.joshi · June 20, 2024, 4:08pm

I want to monitor GPU activity from the application's perspective. Is there any object “below” the PID that can be monitored to get SM utilization? For example, if an LLM is executing inferences, can I monitor each inference's activity? Or, if there are parallel threads within a process, one running on GPU0 and another on GPU1, GPU2, and GPU3, can I get separate utilization metrics for these two threads? Or are all metrics available at the PID level only?

veraj · June 24, 2024, 5:11am

Hi, @avdhoot.joshi

This forum is for support of the developer tool CUDA-GDB.

From your description, I think Nsight Systems may meet your requirement. Please check it out.

avdhoot.joshi · June 24, 2024, 5:13am

Thanks @veraj, I will post in the other forum.

mhallock · June 25, 2024, 3:38pm

Greetings,

Metrics such as SMs Active, SM Instructions, and Warp Occupancy are collected at the device level, not at the PID level. Even if the GPU were quiescent with respect to the application being profiled, SM utilization would still be captured if other applications were using the GPU at that time. You can observe this by running:

sudo nsys profile --gpu-metrics-device=all

And then running a GPU application in another terminal.

Now, with that said, you can observe kernel launches and execution on a per-thread basis. Clicking on a kernel on the GPU timeline should highlight the correlated activity on the CPU thread timeline, which tells you where the launch came from. If your kernels are sufficiently long (so that the metrics sampling frequency yields enough samples during the kernel), you can empirically correlate the SM metric value with that kernel.
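To illustrate the empirical correlation described above, here is a minimal Python sketch. It assumes you have already obtained two lists by some means (for example, from an exported report): kernel intervals and timestamped metric samples. The function name, the data layout, the `min_samples` threshold, and all values below are hypothetical and are not part of any Nsight Systems API. Keep in mind that the samples are still device-level, so the attribution only holds if nothing else is using the GPU during the kernel.

```python
from statistics import mean

def mean_metric_per_kernel(kernels, samples, min_samples=3):
    """Average a device-level metric over each kernel's time interval.

    kernels: list of (name, start_ns, end_ns) tuples (hypothetical layout)
    samples: list of (timestamp_ns, value) tuples (hypothetical layout)
    Kernels covered by fewer than min_samples samples map to None,
    because too few samples fell inside them to say anything useful.
    """
    results = {}
    for name, start, end in kernels:
        inside = [value for ts, value in samples if start <= ts <= end]
        results[name] = mean(inside) if len(inside) >= min_samples else None
    return results

# Made-up data: one sample every 100,000 ns (10 kHz sampling).
samples = [
    (ts, 90.0 if 200_000 <= ts < 700_000 else 5.0)
    for ts in range(0, 1_000_000, 100_000)
]
kernels = [
    ("long_kernel", 200_000, 650_000),   # spans several samples
    ("short_kernel", 800_000, 850_000),  # shorter than the sampling period
]

result = mean_metric_per_kernel(kernels, samples)
for name, value in result.items():
    print(name, value)
```

The short kernel shows why kernel length matters: with only one sample inside its interval, the sketch declines to report a value rather than attribute a single reading to it.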

avdhoot.joshi · July 5, 2024, 4:21pm

Thanks @mhallock for the detailed reply! Will check further!
