Eclipse debugger stops responding as soon as a cudaMalloc call is made (Jetson TX2)

Trying to reduce this to the simplest reproducible case.

Environment:

  • Ubuntu Linux on desktop running Eclipse with CUDA 10.0 (default JetPack 4.2 install on virgin Ubuntu 16.04 LTS)
  • Jetson TX2 connected over Gigabit Ethernet
  • Default CUDA C/C++ Project => CUDA Runtime Project (the sample that does the array reciprocal and sum)
  • PTX/GPU code set to 6.2 (Pascal)

Observations:

  • Remote run works successfully (and it’s FAST)
  • Remote debug starts. I can step over code and into code.
  • When I reach the line below and try to step over or step into it, the debugger hangs (size is 65530). A breakpoint set anywhere after this line is never reached either.
CUDA_CHECK_RETURN(cudaMalloc((void **)&gpuData, sizeof(float)*size));

Attempting to pause the debugger or stop the application from Eclipse does nothing. I have to ssh into the TX2 and kill the CUDA process to stop it.

Advice? I’d like to be able to use the debugger if I need it and I suspect something simple can fix this issue. I’ve broken the cudaMalloc call out into a single cudaError_t result = cudaMalloc(…) and it definitely fails on the cudaMalloc call and not the macro.
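For anyone trying to reproduce this, the split-out call can be reduced to something like the minimal sketch below. It is an illustration rather than the exact project code: the variable names follow the sample, the element count is the size quoted above, and the printf markers are an added assumption to show on the console whether execution ever gets past cudaMalloc.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

int main() {
    // Element count taken from the post; everything else is illustrative.
    const size_t size = 65530;
    float *gpuData = nullptr;

    // Marker printed before the first CUDA runtime call.
    printf("before cudaMalloc\n");
    fflush(stdout);

    // The call broken out of the CUDA_CHECK_RETURN macro.
    cudaError_t result = cudaMalloc((void **)&gpuData, sizeof(float) * size);
    if (result != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed: %s\n", cudaGetErrorString(result));
        return EXIT_FAILURE;
    }

    // If the debugger hangs inside cudaMalloc, this line is never printed.
    printf("after cudaMalloc\n");
    fflush(stdout);

    cudaFree(gpuData);
    return EXIT_SUCCESS;
}
```

Running this as a plain remote run and then under remote debug should show whether the hang depends on the rest of the sample or happens on the very first CUDA runtime call.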

I am having a similar problem when I try to run the profiler on my code. The code runs fine outside of the profiler, but when running under the profiler, the program bombs out at the first call to cudaMalloc.
I’m using JetPack 4.2, CUDA 10, and Ubuntu 18.04 on a Xavier.
Any help would be greatly appreciated…
Thanks

I added a call to cudaProfilerStart() before my first cudaMalloc(), and now the code stops at the call to cudaProfilerStart(). The console in Nsight Eclipse Edition shows:
APP_NAME on REMOTE_DEVICE(1)
started…
logout

APP_NAME is the name of my application, and REMOTE_DEVICE is the name of my remote Xavier.
“started…” is generated by my code using printf.

It appears that my code is stopping at the first call to a CUDA function. But this only happens when running in the profiler.

Any ideas why this is and how I can fix it?
Cheers

I’ve followed the instructions from https://askubuntu.com/questions/611528/why-cant-sudo-find-a-command-after-i-added-it-to-path to add the path to nvprof on the target device. Now I can run nvprof on the device, but I’m still not able to run the Visual Profiler in Nsight from a remote machine…