CUDA Noob: "cufft: ERROR: CUFFT_INVALID_PLAN"

coec · December 4, 2008, 11:35pm

Hi all

I’m trying to help some geoscientists trial CUDA from Fortran. When I run the attached code (called from Fortran) I get these error messages:

In function ‘cu_fft_1d_r2c_’

plan created
about to execute plan
cufft: ERROR: /root/cuda-stuff/sw/rel/gpgpu/toolkit/r2.0/cufft/src/cufft.cu, line 115
cufft: ERROR: CUFFT_INVALID_PLAN
plan executed
cufft: ERROR: /root/cuda-stuff/sw/rel/gpgpu/toolkit/r2.0/cufft/src/cufft.cu, line 94
cufft: ERROR: CUFFT_INVALID_PLAN
Leaving function ‘cu_fft_1d_r2c_’
amplitude of bp filter at 0.5 transmitter base= (0.9961086,0.000000)
In function ‘cu_fft_1d_c2r_’
about to execute plan
plan created
cufft: ERROR: /root/cuda-stuff/sw/rel/gpgpu/toolkit/r2.0/cufft/src/cufft.cu, line 115
cufft: ERROR: CUFFT_INVALID_PLAN
plan executed
cufft: ERROR: /root/cuda-stuff/sw/rel/gpgpu/toolkit/r2.0/cufft/src/cufft.cu, line 94
cufft: ERROR: CUFFT_INVALID_PLAN
Leaving function ‘cu_fft_1d_c2r_’

I understand the ‘cufft: ERROR: /root/cuda-stuff/sw/rel/gpgpu/toolkit/r2.0/cufft/src/cufft.cu’ part: my code does not live in this directory, so I figure that’s where the NVIDIA coder was working.

The ‘line 115’ error occurs when executing the plan.

The ‘line 94’ error occurs when destroying the plan.

I’ve read the CUFFT Library 2.0 PDF but I can’t spot what I’m doing wrong with the plan.

In the Fortran code, I call my functions with ‘call cu_fft_1d_r2c(pt_series,dt_series,n)’ and ‘call cu_fft_1d_c2r(pt_series,dt_series,n)’.

I have two Quadro FX 5600s in my workstation (SLI bridge attached, but SLI not configured in X; I’m on a Red Hat EL 5.2 workstation).

Can someone please assist?

Thanks in advance.

CC
cudafunction.cu.txt (2.96 KB)

mfatica · December 5, 2008, 1:23am

You are mixing host memory and GPU memory.
The code should do something like:

allocate GPU memory
copy from CPU to GPU
call cuFFT ( input and output should be arrays in GPU memory)
copy result from GPU to CPU
free GPU memory
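
The steps above can be sketched roughly as follows. This is a minimal illustration, not the attached code: the transform size NX and the buffer names (h_in, h_out, d_in, d_out) are assumptions made for the example, and a real-to-complex transform is used only because the function names in the thread suggest one.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>
#include <cufft.h>

int main(void)
{
    const int NX = 1024;            // illustrative size, not taken from the thread
    const int NOUT = NX / 2 + 1;    // an R2C transform yields NX/2 + 1 complex values

    // Host (CPU) buffers. Names are hypothetical.
    float *h_in = (float *)malloc(sizeof(float) * NX);
    cufftComplex *h_out = (cufftComplex *)malloc(sizeof(cufftComplex) * NOUT);
    if (h_in == NULL || h_out == NULL) return 1;
    for (int i = 0; i < NX; ++i) h_in[i] = 1.0f;

    // 1. Allocate GPU memory.
    cufftReal *d_in = NULL;
    cufftComplex *d_out = NULL;
    if (cudaMalloc((void **)&d_in, sizeof(cufftReal) * NX) != cudaSuccess ||
        cudaMalloc((void **)&d_out, sizeof(cufftComplex) * NOUT) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }

    // 2. Copy from CPU to GPU (destination first, then source).
    cudaMemcpy(d_in, h_in, sizeof(cufftReal) * NX, cudaMemcpyHostToDevice);

    // 3. Create the plan and run the FFT on the device pointers only.
    cufftHandle plan;
    if (cufftPlan1d(&plan, NX, CUFFT_R2C, 1) != CUFFT_SUCCESS) {
        fprintf(stderr, "cufftPlan1d failed\n");
        return 1;
    }
    if (cufftExecR2C(plan, d_in, d_out) != CUFFT_SUCCESS) {
        fprintf(stderr, "cufftExecR2C failed\n");
        return 1;
    }

    // 4. Copy the result from GPU back to CPU.
    cudaMemcpy(h_out, d_out, sizeof(cufftComplex) * NOUT, cudaMemcpyDeviceToHost);
    printf("bin 0 = (%f, %f)\n", h_out[0].x, h_out[0].y);

    // 5. Free GPU (and host) memory.
    cufftDestroy(plan);
    cudaFree(d_in);
    cudaFree(d_out);
    free(h_in);
    free(h_out);
    return 0;
}
```

Something like `nvcc example.cu -lcufft` should build it, assuming the CUDA toolkit is installed. The point of the sketch is that cufftExecR2C only ever sees d_in and d_out, never the host arrays.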

coec · December 5, 2008, 1:24am

Ahh, my problem is/was that the transform size was a little over 18,000,000. The CUFFT Library doco states that it supports “1D transform sizes up to 8 million elements”. When I hardcoded NX to be 7999999, the code suddenly ran without errors.
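
A failed plan creation can be caught before the execute step by checking the value cufftPlan1d returns, rather than printing "plan created" unconditionally. The sketch below assumes a size of 18,000,000 (approximately the figure mentioned above) and an R2C plan; whether that size is rejected depends on the CUFFT version in use, since the 8 million limit is quoted from the 2.0 documentation.

```cuda
#include <cstdio>
#include <cufft.h>

int main(void)
{
    const int NX = 18000000;   // assumed size, roughly the one mentioned in the post

    cufftHandle plan;
    cufftResult res = cufftPlan1d(&plan, NX, CUFFT_R2C, 1);
    if (res != CUFFT_SUCCESS) {
        // Stop here: executing or destroying a plan that was never
        // created is what produces CUFFT_INVALID_PLAN later on.
        fprintf(stderr, "cufftPlan1d failed for NX=%d (cufftResult %d)\n",
                NX, (int)res);
        return 1;
    }

    printf("plan created for NX=%d\n", NX);
    cufftDestroy(plan);
    return 0;
}
```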

So my question now is, how can I overcome this problem? Could someone point me at an example?

Thanks

CC

coec · December 5, 2008, 2:27am

Cool, thx for the tip.

I’m now doing:

cudaMalloc((void**)&out, sizeof(cufftComplex)*NX);
cudaMemcpy(in, out, sizeof(cufftComplex)*NX, cudaMemcpyHostToDevice);

But now I’m getting ‘cudaErrorInvalidDevicePointer’ when cudaMalloc is called.

CC
cudafunction.cu.txt (3.22 KB)

Reimar · December 5, 2008, 10:04am

No surprise, the destination is the first argument for memcpy, not the other way round as you try to do it.
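
For reference, the signature is cudaMemcpy(dst, src, count, kind). A minimal sketch of the two calls with the arguments in that order follows; the names `in` (host buffer) and `out` (device buffer) are taken from the snippet above, while NX = 1024 and the error-reporting style are assumptions for the example. Checking each return value separately also shows which call actually reports the error.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>
#include <cufft.h>

int main(void)
{
    const int NX = 1024;                        // illustrative size
    const size_t bytes = sizeof(cufftComplex) * NX;

    // Host buffer ("in" in the post).
    cufftComplex *in = (cufftComplex *)malloc(bytes);
    if (in == NULL) return 1;
    for (int i = 0; i < NX; ++i) { in[i].x = (float)i; in[i].y = 0.0f; }

    // Device buffer ("out" in the post).
    cufftComplex *out = NULL;
    cudaError_t err = cudaMalloc((void **)&out, bytes);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMalloc: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Destination (device) first, source (host) second.
    err = cudaMemcpy(out, in, bytes, cudaMemcpyHostToDevice);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMemcpy: %s\n", cudaGetErrorString(err));
        return 1;
    }

    printf("copied %lu bytes to the device\n", (unsigned long)bytes);
    cudaFree(out);
    free(in);
    return 0;
}
```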