Can't get GPU activities when injecting CUPTI into a running process

Hi, I’m trying to implement a demo GPU profiler as a shared library that can be attached to a specified process. The profiler is based on CUPTI and libkineto.
Most of the profiler's implementation follows the Dynamic Attach and Detach code example. However, in order to attach to an already running process, I’m using a ptrace-based code injector to inject the profiler library into the target process and execute it (via __attribute__((constructor))), instead of setting CUDA_INJECTION64_PATH before the target starts.
The problem is that while host-side activities (like CUPTI_ACTIVITY_KIND_RUNTIME) work just fine, I can’t get GPU-related activities (CUPTI_ACTIVITY_KIND_CONCURRENT_KERNEL, CUPTI_ACTIVITY_KIND_MEMCPY) with this approach. When I switch to CUDA_INJECTION64_PATH with the same code, I get all activities.
Could you please clarify a little how CUDA injection works? Is injection only allowed during CUDA initialization, or is this code-injection approach also feasible?

I just noticed that CUPTI initialization happens in the main thread in the attach/detach code sample, but in a separate thread in my implementation.
Is it required to initialize CUPTI in the main thread? I tried that as well, but ran into some failures during code injection. I’ll update here once I find the cause.
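
For reference, a minimal sketch of the constructor-based entry point and a way to check which thread it runs on. It assumes Linux, where the main thread's thread ID equals the process ID. All names (profiler_on_load, profiler_is_loaded, profiler_loaded_on_main_thread) are made up for illustration and are not part of CUPTI or libkineto. The sketch contains no CUPTI calls; it only shows where such setup would start. It does not establish whether the initializing thread matters for CUPTI.

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* All names below are hypothetical; none come from CUPTI or libkineto. */
static int profiler_loaded = 0;       /* set once the constructor has run */
static int loaded_on_main_thread = 0; /* 1 if the constructor ran on the main thread */

/* Runs automatically when the library is loaded into a process
 * (for example through dlopen), on whichever thread performs the load. */
__attribute__((constructor))
static void profiler_on_load(void) {
    /* On Linux, the main thread's TID is equal to the PID. */
    pid_t tid = (pid_t)syscall(SYS_gettid);

    profiler_loaded = 1;
    loaded_on_main_thread = (tid == getpid());

    fprintf(stderr, "[profiler] loaded on %s thread (tid=%d)\n",
            loaded_on_main_thread ? "the main" : "a non-main", (int)tid);

    /* Assumption: a real profiler would begin its CUPTI setup here. */
}

/* Small accessors so the host program or a test can inspect the state. */
int profiler_is_loaded(void) { return profiler_loaded; }
int profiler_loaded_on_main_thread(void) { return loaded_on_main_thread; }
```

Built as a shared library (for example with gcc -shared -fPIC), the constructor fires at load time, so the log line shows whether the injector executed it on the main thread or on another one.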

Hi, @juju812

Sorry for the late response. Is there anything we can help with now?

Hi, @veraj

After some analysis with @mjain, we fixed this issue and got CUPTI activities working.

Thank you all for your kind help!

Thanks for letting me know! If you need help with anything else, please feel free to start another topic, and we’ll do our best to help!
