cuda-gdb hangs whenever I try to debug a CUDA program as any user other than root. (In fact, I only thought to try it as root right before reporting this.) Here's a small example. First, some trivial source code:
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>   // exit()
__global__ void Nil()
{
    printf("short and stout.\n");   // device-side printf
}

int main(int argc, char** argv)
{
    printf("I'm a little teapot,\n");
    Nil<<<1, 1>>>();
    // A launch failure is reported by cudaGetLastError(), not by the launch itself.
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        fprintf(stderr,
                "kernel launch failure: %s (%d)\n",
                cudaGetErrorString(err), err);
        exit(-1);
    }
    cudaDeviceSynchronize();   // replaces the deprecated cudaThreadSynchronize()
    exit(0);
}
Next, the cuda-gdb session:
$ cuda-gdb
... banners ...
(cuda-gdb) b 6
Breakpoint 1 at 0x10fc: file cudanil.cu, line 6.
(cuda-gdb) r
Starting program: /some/path/cudanil
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/aarch64-linux-gnu/libthread_db.so.1".
I'm a little teapot,
---HANG---
It appears that the cudbgprocess backend process crashes shortly after it is created, by which point it has already created the pipe it uses to communicate with cuda-gdb. cuda-gdb then hangs indefinitely trying to open that pipe, with no process on the other end.
I spotted this while testing our NightView debugger on our not-yet-released RedHawk kernel, which is based on L4T R31.1, with an L4T R31.1 userland. I was also able to reproduce it with the cuda-gdb on the vanilla L4T R31.1 kernel, and today I verified that it still happens with cuda-gdb on the vanilla L4T R32.1 too.