CUPTI automatic callbacks

Hi,

I want to build my own tool to analyze the performance of GPU kernels automatically. The tool should be attachable to many applications. Since I would rather not modify each application's source code, I am wondering whether there is a runtime solution, such as LD_PRELOAD?

Thanks,
Bo

Hi Bo,

What do you mean by analyzing the performance automatically? Do you want to collect the timing information, or hardware performance counters/metrics or both?

You can write a GPU performance analysis tool based on the CUPTI interface. Users can then inject the CUPTI-based shared library into the target application using LD_PRELOAD. Please refer to the post "CUPTI activity API and child processes - #8 by mjain".
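As a rough illustration of the LD_PRELOAD approach, here is a minimal skeleton of the injected library, assuming GCC or Clang on Linux. It only shows the load-time hook: the constructor runs inside the target process before main(), which is where a real tool would typically set up CUPTI (for example with cuptiSubscribe() and cuptiEnableDomain() for callbacks, or cuptiActivityEnable() for tracing). Those calls are left as comments so the sketch builds without the CUDA toolkit. The file name, the TOOL_VERBOSE environment variable, and the function names are made up for this example.

```c
/* injected_tool.c: skeleton of a library meant to be loaded via LD_PRELOAD.
 *
 * Possible build and run commands (file and library names are hypothetical):
 *   gcc -shared -fPIC injected_tool.c -o libinjected_tool.so
 *   LD_PRELOAD=./libinjected_tool.so ./your_cuda_app
 */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Set to 1 once the constructor has run inside the target process. */
static int tool_initialized = 0;

/* TOOL_VERBOSE is an environment variable invented for this sketch.
 * Returns 1 when it is set to a value starting with '1', else 0. */
static int tool_verbose(void)
{
    const char *v = getenv("TOOL_VERBOSE");
    return v != NULL && v[0] == '1';
}

/* Runs when the dynamic loader maps the library, before the target's main(). */
__attribute__((constructor))
static void tool_init(void)
{
    tool_initialized = 1;

    /* A real tool would initialize CUPTI here, e.g. subscribe to callbacks
     * or enable activity records. Omitted so that this sketch compiles
     * without the CUDA toolkit installed. */

    if (tool_verbose())
        fprintf(stderr, "[tool] loaded into pid %d\n", (int)getpid());
}

/* Runs at normal process exit; a real tool would flush its buffers here. */
__attribute__((destructor))
static void tool_fini(void)
{
    if (tool_verbose())
        fprintf(stderr, "[tool] unloading from pid %d\n", (int)getpid());
}
```

Because the hook lives in the constructor, the target application needs no source changes or relinking. The same library can be preloaded into any dynamically linked CUDA application.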

@mjain Hi, sorry for bumping this thread in 2023. I have recently been investigating approaches to profiling GPU programs at runtime without modifying their code. So I'm curious whether CUPTI can be used in the following scenario: a GPU program is already running; can I get trace info with a command like profile-gpu --pid=$PID (where profile-gpu is implemented using CUPTI)?

Hi stricklandyyy,

This can be achieved with standard injection methods, such as using ptrace to inject a library or code into the running process.

Hi mjain. Thanks for your reply. After trying some other approaches, I have also found that ptrace is a feasible option. :D

mjain via NVIDIA Developer Forums <notifications@nvidia.discoursemail.com> wrote on Wednesday, January 10, 2024 at 23:31: