The convolutionFFT2D code compiles successfully, but gives the following CUFFT_EXEC_FAILED error when run:
[codebox]Using device 0: GeForce 8800 GT
Input data size : 1000 x 1000
Convolution kernel size : 7 x 7
Padded image size : 1006 x 1006
Aligned padded image size : 1024 x 1024
Allocating memory…
Generating random input data…
Creating FFT plan for 1024 x 1024…
Uploading to GPU and padding convolution kernel and input data…
…initializing padded kernel and data storage with zeroes…
…copying input data and convolution kernel from host to CUDA arrays
…binding CUDA arrays to texture references
…padding convolution kernel
…padding input data array
Transforming convolution kernel…
cufft: ERROR: /root/cuda-stuff/sw/rel/gpgpu/toolkit/r2.0/cufft/src/execute.cu, line 1038
cufft: ERROR: CUFFT_EXEC_FAILED
cufft: ERROR: /root/cuda-stuff/sw/rel/gpgpu/toolkit/r2.0/cufft/src/execute.cu, line 284
cufft: ERROR: CUFFT_EXEC_FAILED
cufft: ERROR: /root/cuda-stuff/sw/rel/gpgpu/toolkit/r2.0/cufft/src/cufft.cu, line 119
cufft: ERROR: CUFFT_EXEC_FAILED
Running GPU FFT convolution…
GPU time: 8.666000 msecs. //115.393487 MPix/s
Reading back GPU FFT results…
Checking GPU results…
…running reference CPU convolution
…comparing the results
Max delta / CPU value 1.000000E+00
L2 norm: 1.000000E+00
TEST FAILED
Shutting down…
Press ENTER to exit…
[/codebox]
I am using a GeForce 8800 GT with 1 GB of memory. I can also run simpleCUFFT without any problems.
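To rule out memory as the cause, here is a rough sketch of the size arithmetic behind the log above. It assumes the padded size is data + kernel - 1 and that the aligned size is the next power of two; both match the numbers printed here, but the sample's actual alignment rule may differ. The function names are my own for illustration, not taken from the sample.

```c
#include <stddef.h>

/* Padded size for a full linear convolution: data + kernel - 1. */
static int padded_size(int data, int kernel) {
    return data + kernel - 1;
}

/* Assumption: the aligned FFT size is the next power of two >= n. */
static int next_pow2(int n) {
    int p = 1;
    while (p < n) {
        p <<= 1;
    }
    return p;
}

/* Bytes for one w x h buffer of single-precision complex values
   (two floats per element, as in cufftComplex). */
static size_t complex_buffer_bytes(int w, int h) {
    return (size_t)w * (size_t)h * 2 * sizeof(float);
}
```

For the 1000 x 1000 input and 7 x 7 kernel, `padded_size(1000, 7)` gives 1006, `next_pow2(1006)` gives 1024, and with 4-byte floats `complex_buffer_bytes(1024, 1024)` gives 8388608 bytes (8 MB) per buffer. Even several buffers of that size are far below 1 GB, so running out of memory looks unlikely.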
The problem appears to be in the three calls to cufftExecC2C (i.e., in the two forward and one inverse transform calls).
Any pointers would be greatly appreciated!