Nsight Compute not attaching to Python 3.9 application

I was recently attempting to profile PyFR with Nsight Compute, but it seemed unable to attach to the process.

After trying several things, I narrowed the problem down to Python 3.9. That is, with the same version of PyFR (1.12.1), Nsight Compute was able to attach to and successfully profile PyFR when run with Python 3.8, but not with 3.9.

I am using CUDA 11.2.67 on Ubuntu 20.04.2 LTS (5.8.0-59-generic x86_64), with a Titan V.

Does anyone have any idea what the cause of this issue may be and any potential fixes? This work is part of ongoing research and I am reluctant to update to CUDA 11.4 as I would probably have to rerun a lot of tests.

Will

There are known issues with tracking child processes for certain process launch calls, as documented in the Known Issues:

Profiling child processes launched via clone() is not supported.
Profiling child processes launched from Python using os.system() is not supported.

It is possible that a related implementation change between Python 3.8 and 3.9 is causing this.
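As a rough way to check whether the os.system() limitation above could apply, the application's source can be scanned for such calls. The sketch below is only an illustration: the helper name find_os_system_calls is made up here, it only catches calls written literally as os.system(...), and it does not detect clone() or anything done inside compiled extensions.

```python
import ast


def find_os_system_calls(source):
    """Return the line numbers of calls written literally as os.system(...).

    Hypothetical helper for illustration only. Aliased imports such as
    `from os import system` are not detected.
    """
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "system"
            and isinstance(node.func.value, ast.Name)
            and node.func.value.id == "os"
        ):
            hits.append(node.lineno)
    return sorted(hits)
```

To use it, read each .py file of the application and pass the text to find_os_system_calls. An empty result does not rule out the other known issue (clone()), so it only narrows things down.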

"I am reluctant to update to CUDA 11.4"

You haven’t specified which version of Nsight Compute you are using, but I presume it is the one shipped with CUDA 11.2.67. Note that Nsight Compute is backwards-compatible with older CUDA versions, meaning you can upgrade to a newer version of the tool, with bug fixes and new features, and still profile CUDA applications built against an older toolkit. The latest Nsight Compute standalone installer can be downloaded from the Nsight Compute page on NVIDIA Developer.
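To confirm which Nsight Compute version is actually being picked up, and to make sure child processes are targeted, something along these lines may help. It is a minimal sketch: it assumes the command-line tool is on PATH as ncu, the helper names and the default report name are made up for illustration, and the application command is whatever you normally run.

```python
import shutil
import subprocess


def ncu_version(ncu="ncu"):
    """Return the text printed by `ncu --version`, or None if ncu is not on PATH."""
    path = shutil.which(ncu)
    if path is None:
        return None
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    return result.stdout.strip()


def build_ncu_command(app_cmd, report="profile_report", ncu="ncu"):
    """Build an ncu command line that also targets child processes.

    `--target-processes all` asks ncu to profile the launched process and
    its children; `-o` sets the report file name (hypothetical default here).
    """
    return [ncu, "--target-processes", "all", "-o", report, *app_cmd]
```

For example, build_ncu_command(["python3.9", "my_script.py"]) returns a list that can be passed to subprocess.run, where my_script.py is a placeholder for the real entry point. Whether this works around the Python 3.9 problem is not confirmed; it only makes the setup explicit.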