cuda-memcheck failed on cufft library

Hi folks,

I get strange cuFFT-related errors when I run my program under cuda-memcheck. The results are correct and cuda-gdb detects no errors. nvprof works fine, with no privilege-related errors. I then decided to test
NVIDIA_CUDA-10.1_Samples/bin/x86_64/linux/release/simpleCUFFT
from the CUDA samples (NVIDIA_CUDA-10.1_Samples/7_CUDALibraries/simpleCUFFT) and got the same errors (output attached at the end).

Solutions I’ve tried include:

My laptop is running Ubuntu 18.04 with a “GeForce RTX 2070 with Max-Q Design” (compute capability 7.5) and driver 435.21. CUDA/cuda-memcheck version:
$ cuda-memcheck --version
CUDA-MEMCHECK version 10.1.243 ID:(46)

Any suggestions?

Thank you.

$ cuda-memcheck ./simpleCUFFT 
========= CUDA-MEMCHECK
[simpleCUFFT] is starting...
GPU Device 0: "GeForce RTX 2070 with Max-Q Design" with compute capability 7.5

========= Internal Memcheck Error: Initialization failed
=========     Saved host backtrace up to driver entry point at error
=========     Host Frame:/usr/lib/x86_64-linux-gnu/libcuda.so.1 [0x13ba7c]
=========     Host Frame:/usr/local/cuda/lib64/libcufft.so.10 [0x3d7e4a]
=========     Host Frame:/usr/local/cuda/lib64/libcufft.so.10 [0x3caf70]
=========     Host Frame:/usr/local/cuda/lib64/libcufft.so.10 [0x3d719a]
=========     Host Frame:/usr/local/cuda/lib64/libcufft.so.10 [0x3dae9f]
=========     Host Frame:/usr/local/cuda/lib64/libcufft.so.10 [0x3db60a]
=========     Host Frame:/usr/local/cuda/lib64/libcufft.so.10 [0x3cec3c]
=========     Host Frame:/usr/local/cuda/lib64/libcufft.so.10 [0x3bed7e]
=========     Host Frame:/usr/local/cuda/lib64/libcufft.so.10 [0x3f022c]
=========     Host Frame:/usr/local/cuda/lib64/libcufft.so.10 [0x379a2]
=========     Host Frame:/usr/local/cuda/lib64/libcufft.so.10 [0x37fa6]
=========     Host Frame:/usr/local/cuda/lib64/libcufft.so.10 [0x39af2]
=========     Host Frame:/usr/local/cuda/lib64/libcufft.so.10 (cufftXtMakePlanMany + 0x63a) [0x4d0ca]
=========     Host Frame:/usr/local/cuda/lib64/libcufft.so.10 (cufftMakePlanMany64 + 0xfd) [0x4e02d]
=========     Host Frame:/usr/local/cuda/lib64/libcufft.so.10 (cufftMakePlanMany + 0x193) [0x4aaf3]
=========     Host Frame:/usr/local/cuda/lib64/libcufft.so.10 (cufftPlanMany + 0xd2) [0x4b082]
=========     Host Frame:/usr/local/cuda/lib64/libcufft.so.10 (cufftPlan1d + 0x48) [0x4b1a8]
=========     Host Frame:./simpleCUFFT [0x7358]
=========     Host Frame:./simpleCUFFT [0x711e]
=========     Host Frame:/lib/x86_64-linux-gnu/libc.so.6 (__libc_start_main + 0xe7) [0x21b97]
=========     Host Frame:./simpleCUFFT [0x6efa]

Hi,

I am currently facing the same problem. Did you solve the issue?

Thanks,

Tobi

As much as I’d like to have it solved, I still don’t have a solution.

I am also seeing this on CentOS 8 with CUDA 10.2, on a 2080 Ti with driver 440.100.
In fact, I get a similar error in cuda-memcheck with just this:

// bug.cu
#include <cufft.h>

int main() {
    cufftHandle plan;
    // Creating a 1D plan is already enough to trigger the cuda-memcheck error.
    cufftPlan1d(&plan, 1, CUFFT_C2C, 1);
    cufftDestroy(plan);
    return 0;
}

Running cuda-memcheck on this results in:
$ nvcc bug.cu -lcufft && cuda-memcheck ./a.out

========= CUDA-MEMCHECK
========= Internal Memcheck Error: Initialization failed
=========     Saved host backtrace up to driver entry point at error
=========     Host Frame:/lib64/libcuda.so.1 [0x1403fc]
=========     Host Frame:/usr/local/cuda/lib64/libcufft.so.10 [0x3d887a]
=========     Host Frame:/usr/local/cuda/lib64/libcufft.so.10 [0x3cb9a0]
=========     Host Frame:/usr/local/cuda/lib64/libcufft.so.10 [0x3d7bca]
=========     Host Frame:/usr/local/cuda/lib64/libcufft.so.10 [0x3db8cf]
=========     Host Frame:/usr/local/cuda/lib64/libcufft.so.10 [0x3dc03a]
=========     Host Frame:/usr/local/cuda/lib64/libcufft.so.10 [0x3cf66c]
=========     Host Frame:/usr/local/cuda/lib64/libcufft.so.10 [0x3bf16e]
=========     Host Frame:/usr/local/cuda/lib64/libcufft.so.10 [0x3f138c]
=========     Host Frame:/usr/local/cuda/lib64/libcufft.so.10 [0x37b82]
=========     Host Frame:/usr/local/cuda/lib64/libcufft.so.10 [0x38186]
=========     Host Frame:/usr/local/cuda/lib64/libcufft.so.10 [0x39cd2]
=========     Host Frame:/usr/local/cuda/lib64/libcufft.so.10 (cufftXtMakePlanMany + 0x63a) [0x4d2aa]
=========     Host Frame:/usr/local/cuda/lib64/libcufft.so.10 (cufftMakePlanMany64 + 0xfd) [0x4e20d]
=========     Host Frame:/usr/local/cuda/lib64/libcufft.so.10 (cufftMakePlanMany + 0x193) [0x4acd3]
=========     Host Frame:/usr/local/cuda/lib64/libcufft.so.10 (cufftPlanMany + 0xd2) [0x4b262]
=========     Host Frame:/usr/local/cuda/lib64/libcufft.so.10 (cufftPlan1d + 0x48) [0x4b388]
=========     Host Frame:./a.out [0x33c5]
=========     Host Frame:/lib64/libc.so.6 (__libc_start_main + 0xf3) [0x236a3]
=========     Host Frame:./a.out [0x32be]
=========
========= ERROR SUMMARY: 1 error

This makes it virtually impossible to debug any CUDA code containing a cuFFT call…

As a last resort, running this example via cuda-gdb does work without error:

$ cuda-gdb -q ./a.out
Reading symbols from ./a.out...(no debugging symbols found)...done.
(cuda-gdb) set cuda memcheck on
(cuda-gdb) r
Starting program: ./a.out 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
warning: Cannot parse .gnu_debugdata section; LZMA support was disabled at compile time
warning: Cannot parse .gnu_debugdata section; LZMA support was disabled at compile time
[New Thread 0x7fffe63f4700 (LWP 44592)]
[New Thread 0x7fffe5bf3700 (LWP 44593)]
[New Thread 0x7fffe5371700 (LWP 44594)]
[Thread 0x7fffe5371700 (LWP 44594) exited]
[Thread 0x7fffe5bf3700 (LWP 44593) exited]
[Thread 0x7fffe63f4700 (LWP 44592) exited]
[Inferior 1 (process 44577) exited normally]
(cuda-gdb)

But this seems to fail with other examples with no obvious pattern…

Maybe this will help:
https://developer.nvidia.com/nvidia_bug/3050187

Seems I should have searched further: I see similar issues on dual RTX 2070s. Here's the thread: Trivial cuFFT causes cuda-memcheck errors on RTX 2070 SUPER

@RaulPPelaez, the link to NVIDIA bug #3050187 doesn’t appear to be publicly accessible. Can you post the relevant details/workaround?

Yeah, sorry. I am able to run cuda-memcheck on cuFFT code by setting this environment variable:
CUDA_MEMCHECK_PATCH_MODULE=1
According to the bug page, it is a known issue with this release and it is fixed in CUDA 11.
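
For anyone landing here later, a minimal sketch of how the variable can be applied from a shell. It assumes the bug.cu reproducer from earlier in the thread was built to ./a.out with nvcc bug.cu -lcufft; the guard around the run is only there so the snippet does nothing harmful on a machine without the tool or the binary. What the variable changes inside cuda-memcheck is not described in this thread, so treat it purely as the workaround reported above.

```shell
# Workaround sketch: set the variable reported above for this shell session.
export CUDA_MEMCHECK_PATCH_MODULE=1

# Run the reproducer under cuda-memcheck only if both the tool and the
# binary are present. ./a.out is an assumption (built from bug.cu).
if command -v cuda-memcheck >/dev/null 2>&1 && [ -x ./a.out ]; then
    cuda-memcheck ./a.out
else
    echo "cuda-memcheck or ./a.out not found; skipping run"
fi
```

To avoid leaving the variable set for the rest of the session, it can instead be given for a single command by prefixing it: CUDA_MEMCHECK_PATCH_MODULE=1 cuda-memcheck ./a.out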

Thanks for the info, I will try that. Yep, that worked, thanks!

On my dual RTX 2070 SUPERs I upgraded to the CUDA SDK 11 RC, rebuilt, and tried with the 11 driver and runtime. Same issue unless the patch-module variable above is set, so it looks like it is still broken in 11.