cuda-gdb hangs indefinitely on CUFFT API calls

Multi-GPU system (two Tesla P100s), CentOS 7.5, CUDA 11.6. CUDA applications appear to hang on CUFFT API calls when run inside the CUDA debugger (cuda-gdb).

 532> nvidia-smi
Mon Jul 11 10:00:17 2022

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.47.03    Driver Version: 510.47.03    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla P100-PCIE...  Off  | 00000000:07:00.0 Off |                    0 |
| N/A   48C    P0    32W / 250W |    261MiB / 12288MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  Tesla P100-PCIE...  Off  | 00000000:0A:00.0 Off |                    0 |
| N/A   45C    P0    24W / 250W |      2MiB / 12288MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

As in a previous post, all binaries of interest execute as expected outside the CUDA debugger. I've been able to reproduce the problem with the simpleCUFFT application provided in cuda-samples:

  1. Compile simpleCUFFT with debug options.
  2. Set a breakpoint on the cufftPlan1d call within simpleCUFFT (see the sketch after this list).
  3. Run the debugger and hit the breakpoint at line 124.
  4. Continue.
  5. No progress in the application after more than 10 minutes.
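
For context, the call at the breakpoint is the CUFFT plan creation. Below is a stripped-down sketch of the same API call outside of simpleCUFFT; the file name and SIGNAL_SIZE are my own placeholders, not values from the sample, and I have not isolated the hang to this minimal form. It is only meant to show which call the debugger stalls on.

// minimal_cufft_plan.cu -- illustrative sketch of the cufftPlan1d call at
// simpleCUFFT.cu:124; SIGNAL_SIZE is arbitrary, not the sample's value.
#include <cstdio>
#include <cufft.h>

int main() {
    const int SIGNAL_SIZE = 1024;   // arbitrary transform length
    cufftHandle plan;
    // The debugger makes no progress after continuing past a breakpoint
    // set on this plan-creation call.
    cufftResult res = cufftPlan1d(&plan, SIGNAL_SIZE, CUFFT_C2C, 1);
    if (res != CUFFT_SUCCESS) {
        fprintf(stderr, "cufftPlan1d failed: %d\n", res);
        return 1;
    }
    cufftDestroy(plan);
    printf("plan created and destroyed\n");
    return 0;
}

(For step 1, "debug options" means building with host and device debug symbols, e.g. nvcc -g -G, which the cuda-samples makefiles typically enable via dbg=1.)

The full cuda-gdb session: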
[usr@host] /path/to/samples/cuda-samples/Samples/4_CUDA_Libraries/simpleCUFFT/ 
 542> /usr/local/cuda/bin/cuda-gdb
NVIDIA (R) CUDA Debugger
11.6 release
Portions Copyright (C) 2007-2022 NVIDIA Corporation
GNU gdb (GDB) 10.2
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word".
(cuda-gdb) file simpleCUFFT
Reading symbols from simpleCUFFT...
(cuda-gdb) break simpleCUFFT.cu:124
Breakpoint 1 at 0x404739: file simpleCUFFT.cu, line 124.
(cuda-gdb) run
Starting program: /path/to/samples/cuda-samples/Samples/4_CUDA_Libraries/simpleCUFFT/simpleCUFFT 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[simpleCUFFT] is starting...
[Detaching after fork from child process 15452]
[New Thread 0x7fffe6907700 (LWP 15473)]
GPU Device 0: "Pascal" with compute capability 6.0

[New Thread 0x7fffe6106700 (LWP 15475)]

Thread 1 "simpleCUFFT" hit Breakpoint 1, runTest (argc=1, argv=0x7fffffffda68) at simpleCUFFT.cu:124
124       checkCudaErrors(cufftPlan1d(&plan, new_size, CUFFT_C2C, 1));
(cuda-gdb) c
Continuing

I've tried setting CUDA_VISIBLE_DEVICES to rule out anything related to the multi-GPU setup, with no apparent effect (a quick device-visibility check is sketched below). Other binaries without CUFFT dependencies run fine under cuda-gdb; I haven't yet checked whether the issue also affects applications that depend on other CUDA libraries.
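
In case it helps with reproduction, this is a minimal sketch (my own, not part of the sample) for confirming that CUDA_VISIBLE_DEVICES is honored before launching cuda-gdb; with CUDA_VISIBLE_DEVICES=0 it should report a single P100.

// visible_devices.cu -- sanity check that CUDA_VISIBLE_DEVICES is honored;
// prints the devices the CUDA runtime can see.
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("visible devices: %d\n", count);
    for (int i = 0; i < count; ++i) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);
        printf("  device %d: %s (compute capability %d.%d)\n",
               i, prop.name, prop.major, prop.minor);
    }
    return 0;
}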

Hi @patrick.m.wolf
Thank you for your report. We are investigating the issue.