Hi all,
I am working on a research project where we need a non-invasive meter. We want to retrieve the names of the CUDA kernels executed by a process of interest.
More or less, the idea is to have some sort of tool to get information about a process, such as the memory consumed, the GPU usage, and the CUDA kernels dispatched for execution.
For instance, if we have a program (./matmul) that invokes a kernel named compute_matmul<<<1, 32, 0, stream>>>(...), we want our tool to print something like:
$ ./matmul &
$ ./query_process $!
Usage,Mem,Executed kernels
0,0,None
0,0,None
10,10,compute_matmul
5,10,compute_matmul
0,0,None
Do you know whether any available library could do this, and what a good starting point would be?
Cheers,
Leon
Hello,
Just trying to understand your requirement better.
What are the rows with “None” in the sample output of the tool?
What do you mean by GPU usage? What would a value of “10” for GPU usage mean?
Hi,
Sure. To give a bit of context, let’s assume that one line is generated every second.
None means that no kernels were executed during the sampling interval (one second).
GPU usage ranges from 0 to 100% and refers to SM utilization, similar to this one.
Cheers,
Leon
Any news on this?
Apologies for bringing this up again.
You can use NVML APIs for GPU memory usage and GPU usage.
You can use CUPTI Callback APIs to get the CUDA kernels launched. Refer to the Driver and Runtime API Callbacks section in the CUPTI documentation.
Thanks for your answer. I am already using NVML.
However, I’m fairly new to CUPTI. I have been trying to give it a PID to analyse, but I haven’t found a way to do so.
Just to recap: I would like to analyse an external application, and the main assumption is that I don’t have access to the application code.
Thanks in advance,
Leon
You can write a CUPTI trace injection library. With this you do not need access to the application code.
To get kernel name information you can enable callbacks for a subscriber for domain CUPTI_CB_DOMAIN_DRIVER_API and callback ID CUPTI_DRIVER_TRACE_CBID_cuLaunchKernel.
Refer to the CUPTI sample cupti_trace_injection to get started on writing a CUPTI trace injection library. Before launching the CUDA application, the environment variable CUDA_INJECTION64_PATH needs to be set to point to the injection library.
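In case it helps, here is a minimal sketch of launching the target with that environment variable set. It assumes the monitoring tool starts the application itself rather than attaching to an existing PID. The helper names (injection_env, launch_with_injection) and the library path in the usage note are hypothetical; only CUDA_INJECTION64_PATH comes from the description above.

```python
import os
import subprocess


def injection_env(library_path, base_env=None):
    """Return a copy of the environment with CUDA_INJECTION64_PATH set.

    library_path is the (hypothetical) injection library built from the
    cupti_trace_injection sample; base_env defaults to the current environment.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Use an absolute path so the variable stays valid if the child changes directory.
    env["CUDA_INJECTION64_PATH"] = os.path.abspath(library_path)
    return env


def launch_with_injection(command, library_path):
    """Start the target application with the injection library configured."""
    return subprocess.Popen(command, env=injection_env(library_path))
```

Something like launch_with_injection(["./matmul"], "./libkernel_trace.so") would then start the application, and the returned object's pid attribute could be used for the NVML queries.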
Refer to the CUPTI sample callback_timestamp for how to get the kernel name in the callback function.
Note that the kernel information you get with this approach will not give the accurate time when the kernel is started on the GPU. It only gives the time when the kernel is launched from the host. There is some time gap between the kernel launch and when kernel execution starts on the GPU.
You will need to decide on how to combine the information obtained using NVML and CUPTI.
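One possible way to combine them, as a rough sketch: bucket the launch timestamps reported by the injection library into the same one-second intervals used for the NVML samples. The function name, the tuple layouts, and the ";" separator for multiple kernels are assumptions made for illustration. Because the buckets are keyed by host launch time, a kernel may show up in an interval slightly before it actually runs on the GPU.

```python
from collections import defaultdict


def merge_samples(samples, launches):
    """Combine per-second usage samples with kernel launch events.

    samples:  list of (second, usage_percent, memory) tuples, one per interval
              (assumed to come from NVML polling).
    launches: list of (timestamp_seconds, kernel_name) tuples
              (assumed to be reported by the injection library).
    Returns CSV rows in the format shown in the first post.
    """
    kernels_by_second = defaultdict(list)
    for timestamp, name in launches:
        second = int(timestamp)  # bucket by whole second of the host launch time
        if name not in kernels_by_second[second]:
            kernels_by_second[second].append(name)

    rows = ["Usage,Mem,Executed kernels"]
    for second, usage, mem in samples:
        names = kernels_by_second.get(second)
        kernel_field = ";".join(names) if names else "None"
        rows.append(f"{usage},{mem},{kernel_field}")
    return rows
```

Each kernel name is listed once per interval, however many times it was launched in that second; an interval with no launches prints None.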
Thanks for the idea. I will give it a try.