When running a Python binary under ncu, I keep hitting this error message:
==ERROR== Failed to connect to process
This happens both in launch-and-attach mode and in launch mode (followed by an attach). To get more debug information, I created a config that flushes the log output, as described here (NSight Compute not finding kernels - #6 by veraj). The resulting logs point to a timeout when attaching to the process: the wait begins at 16:24:03 and ends with a Timeout result at 16:24:08. See the logs below.
I16:24:03:852|CmdlineProfiler|PID415001|TID415001|:294[]:Connecting to new process 414901
I16:24:03:852|profiler_multi_api_debugger_client|PID415001|TID415001|:270[]:Creating an API debugger and attaching...
I16:24:03:852|TPS_Comms|PID415001|TID415001|:87[]:Remaining transactions: 1
I16:24:03:852|tps_sync|PID415001|TID415001|:54[]:Begin WaitForCompletion
I16:24:03:852|tps_sync|PID415001|TID415001|:30[]:Create: 0x1f981080
I16:24:08:852|tps_sync|PID415001|TID415001|:40[]:result: Timeout
I16:24:08:852|tps_sync|PID415001|TID415001|:97[]:End WaitForCompletion
I16:24:08:852|tps_sync|PID415001|TID415001|:35[]:Destroy: 0x1f981080
E16:24:08:852|synchronous_multi_api_debugger_client|PID415001|TID415001|:285[]:Failed to attach to ApiDebugger.
E16:24:08:852|profiler_multi_api_debugger_client|PID415001|TID415001|:278[]:Failed to attach ApiDebugger.
E16:24:08:852|CmdlineProfiler|PID415001|TID415001|:331[]:Failed to create ProfilerMultiApiDebugger.
I16:24:08:852|CmdlineProfiler|PID415001|TID415001|:1307[]:End
I16:24:08:852|CmdlineProfiler|PID415001|TID415001|:1331[]:Profiler shutdown requested
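In case it helps anyone reading similar logs, here is a minimal Python sketch that measures how long the attach waited before giving up. The line layout is inferred only from the excerpt above, not from any documented format, and `parse_line` and `wait_durations` are hypothetical helper names. It also assumes the log does not cross midnight.

```python
import re
from datetime import timedelta

# Layout inferred from the log excerpt above (an assumption, not a documented format):
# <level letter><HH:MM:SS:mmm>|<component>|PID<n>|TID<n>|:<source line>[]:<message>
LOG_LINE = re.compile(
    r"^(?P<level>[A-Z])"
    r"(?P<h>\d{2}):(?P<m>\d{2}):(?P<s>\d{2}):(?P<ms>\d{3})"
    r"\|(?P<component>[^|]+)\|PID(?P<pid>\d+)\|TID(?P<tid>\d+)"
    r"\|:(?P<src>\d+)\[\]:(?P<msg>.*)$"
)


def parse_line(line):
    """Return (time_of_day, level, component, message), or None if the line does not match."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    time_of_day = timedelta(
        hours=int(match["h"]),
        minutes=int(match["m"]),
        seconds=int(match["s"]),
        milliseconds=int(match["ms"]),
    )
    return time_of_day, match["level"], match["component"], match["msg"]


def wait_durations(lines):
    """Pair each 'Begin WaitForCompletion' with the next 'result:' line.

    Returns a list of (result, seconds_waited) tuples.
    """
    waits = []
    started = None
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue  # skip anything that is not a log line
        time_of_day, _level, component, msg = parsed
        if component != "tps_sync":
            continue
        if msg == "Begin WaitForCompletion":
            started = time_of_day
        elif msg.startswith("result:") and started is not None:
            result = msg.split(":", 1)[1].strip()
            waits.append((result, (time_of_day - started).total_seconds()))
            started = None
    return waits
```

Applied to the lines above, `wait_durations` reports a single wait that ends in `Timeout` after 5 seconds, which matches the gap between the Begin WaitForCompletion and result lines.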
I am able to profile simple Python applications with my setup, but I'm having issues with one specific Python application (which I am unable to share more details about). I am using CUDA 12.4 with NCU 2023.1 and Python 3.9.