Getting information about CUDA kernel executions from another process

luis.leon · July 12, 2024, 11:38am

Hi all,

I am working on research where we need a non-invasive meter. We want to retrieve information about the CUDA kernel names executed by a process of interest.

More or less, the idea is to have some sort of tool to get information about a process, such as the memory consumed, the GPU usage, and the CUDA kernels dispatched for execution.

For instance, if we have a program (./matmul) that launches a kernel named compute_matmul<<<1, 32, 0, stream>>>(...), we want our tool to print something like:

$ ./matmul &
$ ./query_process $!
Usage,Mem,Executed kernels
0,0,None
0,0,None
10,10,compute_matmul
5,10,compute_matmul
0,0,None

Do you know whether any available library can do this, and what would be a good starting point?

Cheers,
Leon

Sanjiv.Satoor · July 12, 2024, 12:35pm

Hello,

Just trying to understand your requirement better.

What are the rows with “None” in the sample output of the tool?

What do you mean by GPU usage? What would a value of “10” for GPU usage mean?

luis.leon · July 12, 2024, 12:55pm

Hi,

Sure. To give a bit of context, let’s assume that one line is generated every second.

None means that no kernels were executed during the sampling interval (one second).

GPU usage ranges from 0 to 100% and refers to SM utilization, similar to this one.

Cheers,
Leon

luis.leon · July 24, 2024, 5:43pm

Any news on this?

Apologies for bringing this up again.

Sanjiv.Satoor · July 29, 2024, 6:22am

You can use NVML APIs for GPU memory usage and GPU usage.

You can use the CUPTI Callback APIs to get the CUDA kernels launched. Refer to the Driver and Runtime API Callbacks section in the CUPTI documentation.

luis.leon · July 29, 2024, 10:55am

Thanks for your answer. I am already using NVML.

However, I’m fairly new to CUPTI. I have been trying to pass it a PID to analyse, but I haven’t found a way to do so.

Just to reiterate: I would like to analyse an external application, and the main assumption is that I don’t have access to the application code.

Thanks in advance,
Leon

Sanjiv.Satoor · August 1, 2024, 6:05am

You can write a CUPTI trace injection library. With this you do not need access to the application code.

To get kernel name information you can enable callbacks for a subscriber for domain CUPTI_CB_DOMAIN_DRIVER_API and callback ID CUPTI_DRIVER_TRACE_CBID_cuLaunchKernel.

Refer to the CUPTI sample cupti_trace_injection to get started on writing a CUPTI trace injection library. Before launching the CUDA application, the environment variable CUDA_INJECTION64_PATH needs to be set to point to the injection library.

Refer to the CUPTI sample callback_timestamp for how to get the kernel name in the callback function.

Note that the kernel information you get with this approach does not give the exact time at which the kernel starts executing on the GPU. It only gives the time at which the kernel is launched from the host. There is a gap between the kernel launch and the start of kernel execution on the GPU.

You will need to decide on how to combine the information obtained using NVML and CUPTI.
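
One possible way to combine the two, sketched under some assumptions: the tool collects one NVML sample per interval and a list of kernel launch records (kernel name plus host launch timestamp, as a CUPTI callback could provide), and assigns each launch to the interval in which it was issued. The types Sample and Launch, the function mergeRows, and the ';' separator for multiple kernels are all hypothetical. Because launches are bucketed by host launch time, a kernel may show up one row earlier than the interval in which it actually ran on the GPU.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <set>
#include <string>
#include <vector>

// One kernel launch seen on the host (hypothetical record layout).
struct Launch {
    std::int64_t timeNs;
    std::string kernel;
};

// One NVML reading taken at the end of a sampling interval (hypothetical).
struct Sample {
    std::int64_t endNs;
    unsigned smUtil;  // SM utilization in percent
    unsigned mem;     // memory figure, in whatever unit the tool reports
};

// Builds one "Usage,Mem,Executed kernels" row per sample. A launch belongs to
// the interval (previous sample end, this sample end]. Samples must be sorted
// by endNs; launches may be in any order.
std::vector<std::string> mergeRows(const std::vector<Sample>& samples,
                                   const std::vector<Launch>& launches) {
    std::vector<std::string> rows;
    std::int64_t begin = std::numeric_limits<std::int64_t>::min();
    for (const Sample& s : samples) {
        assert(s.endNs >= begin);     // precondition: samples are sorted
        std::set<std::string> names;  // de-duplicates repeated launches
        for (const Launch& l : launches) {
            if (l.timeNs > begin && l.timeNs <= s.endNs) {
                names.insert(l.kernel);
            }
        }
        std::string joined;
        for (const std::string& n : names) {
            if (!joined.empty()) joined += ';';
            joined += n;
        }
        if (joined.empty()) joined = "None";
        rows.push_back(std::to_string(s.smUtil) + "," +
                       std::to_string(s.mem) + "," + joined);
        begin = s.endNs;
    }
    return rows;
}
```

Scanning all launches for every sample keeps the sketch short; a real tool would more likely consume launches from a queue as they arrive.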

luis.leon · August 12, 2024, 8:56am

Thanks for the idea. I will give it a try.
