CUFFT problem invalid plan / internal error

I’m having a problem doing a 2D transform - sometimes it works, and sometimes it doesn’t, and I don’t know why! Here are the details:

My code creates a large matrix that I wish to transform. After clearing all memory apart from the matrix, I execute the following:

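[codebox]
cufftHandle plan;
cufftResult theresult;

/* Create a 2D complex-to-complex plan and print the returned status code. */
theresult = cufftPlan2d(&plan, t_step_h, z_step_h, CUFFT_C2C);
printf("\n\n\n\n error message:     %i\n\n\n\n", (int)theresult);

/* In-place forward transform; the return value is not checked here. */
cufftExecC2C(plan, zt_section_d, zt_section_d, CUFFT_FORWARD);
[/codebox]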

When the transform fails, I get the following error message:

[codebox]
cufft: ERROR: /root/cuda-stuff/sw/rel/gpgpu/toolkit/r2.2/cufft/src/, line 143
cufft: ERROR: /root/cuda-stuff/sw/rel/gpgpu/toolkit/r2.2/cufft/src/, line 122
[/codebox]

In addition, the error message integer is 5, which I believe is CUFFT_INTERNAL_ERROR.
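For anyone decoding the printed integer, here is a minimal sketch of a lookup from status value to name. The helper `cufft_result_name` is hypothetical (it is not part of CUFFT), and the table only covers the values 0 to 7 as I understand them from `cufft.h`; higher codes differ between toolkit versions, so check your own header for those.

```c
/* Hypothetical helper, not part of CUFFT: returns the enum name for a
   cufftResult value printed as an integer. Only 0..7 are listed here;
   anything else should be looked up in the cufft.h of your toolkit. */
static const char *cufft_result_name(int code)
{
    static const char *names[] = {
        "CUFFT_SUCCESS",        /* 0 */
        "CUFFT_INVALID_PLAN",   /* 1 */
        "CUFFT_ALLOC_FAILED",   /* 2 */
        "CUFFT_INVALID_TYPE",   /* 3 */
        "CUFFT_INVALID_VALUE",  /* 4 */
        "CUFFT_INTERNAL_ERROR", /* 5 */
        "CUFFT_EXEC_FAILED",    /* 6 */
        "CUFFT_SETUP_FAILED"    /* 7 */
    };
    int count = (int)(sizeof names / sizeof names[0]);

    if (code < 0 || code >= count)
        return "unknown cufftResult";
    return names[code];
}
```

With this, `printf("%s\n", cufft_result_name((int)theresult));` would print the name rather than the bare number, and a value of 5 comes out as CUFFT_INTERNAL_ERROR.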

As I say, this doesn’t happen all the time. I seem to be able to reproduce it reliably for a 10000x3000 matrix (I have a 4GB card btw). It may not be size related though - I’ve seen 2000x3000 matrices fail, but I’ve also seen 2000x5000 matrices work. I’ve tried using powers of 2 for matrix sizes, and the error still appears.
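To put those sizes next to the 4GB figure, here is a rough sketch of the arithmetic for the data array alone, assuming single-precision complex elements (two 4-byte floats each, as in `cufftComplex`). The helper names are made up for illustration, and whatever scratch memory CUFFT allocates internally for the plan is not counted, since nothing here says how large that is.

```c
#include <stdio.h>

/* Bytes occupied by an nx-by-ny single-precision complex (C2C) array.
   Each element is two 4-byte floats, matching the layout of cufftComplex.
   This counts the data only, not any workspace the library may allocate. */
static unsigned long long c2c_bytes(unsigned long long nx, unsigned long long ny)
{
    return nx * ny * 2ULL * sizeof(float);
}

/* Print the footprint of one matrix size in bytes and in MiB. */
static void print_footprint(unsigned long long nx, unsigned long long ny)
{
    unsigned long long bytes = c2c_bytes(nx, ny);

    printf("%llu x %llu: %llu bytes (%.1f MiB)\n",
           nx, ny, bytes, (double)bytes / (1024.0 * 1024.0));
}
```

For the 10000x3000 case this gives 240,000,000 bytes, roughly 229 MiB, so the matrix by itself is well under the card's capacity.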

I appreciate any help people can give.

…so nobody has any ideas? Has anybody seen a problem like this before? I’ve seen at least one other post on this forum to do with transform size issues and CUFFT, although the details were somewhat different.

The latest development is that I’ve managed to do a 5000 x 14000 transform without incident, but who knows - maybe tomorrow it will fail again. I’m almost afraid to ask, but does this sound like a hardware problem to anyone?

Did you call cudaThreadSynchronize() before executing the plan? This should rule out errors in previous calls…

Thanks for the response - I’ll add that to the code. As I say, this error is intermittent, and I haven’t seen it since the day after I first posted. Hopefully this addition will make sure it doesn’t come back, but we’ll see…


Apologies for resurrecting the post - I’m having the same problem.

What is your hardware setup? And are you using your CUDA compute card as your display card also?

Hopefully we can thrash this out!

Kind regards

Tom Clark


Did you try kingofthehill3’s idea? I added the thread sync call, and although it shouldn’t have made a difference, I don’t think I’ve had the problem since. Around the same time however, I also enforced a power of two condition on my matrices - the CUFFT manual seems to imply the possibility of stability issues with the transform algorithm if you don’t use a power of two or small prime - so maybe this solved the problem instead. I don’t know…
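In case it helps, this is roughly how a size condition like that can be checked or enforced on the host before creating the plan. It is only a sketch: the function names are hypothetical, and treating 2, 3, 5 and 7 as the "small primes" is an assumption of this example rather than something established in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if n is a power of two (1, 2, 4, 8, ...), 0 otherwise. */
static int is_power_of_two(size_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/* Smallest power of two that is >= n, for padding a dimension.
   Stops doubling before the value could overflow size_t. */
static size_t next_power_of_two(size_t n)
{
    size_t p = 1;

    while (p < n && p <= SIZE_MAX / 2)
        p <<= 1;
    return p;
}

/* Returns 1 if every prime factor of n is 2, 3, 5 or 7.
   The choice of these four primes is an assumption of this sketch. */
static int has_only_small_factors(size_t n)
{
    static const size_t primes[] = {2, 3, 5, 7};
    size_t i;

    if (n == 0)
        return 0;
    for (i = 0; i < sizeof primes / sizeof primes[0]; ++i)
        while (n % primes[i] == 0)
            n /= primes[i];
    return n == 1;
}
```

For example, `next_power_of_two(3000)` gives 4096 and `next_power_of_two(10000)` gives 16384, so padding to a power of two can cost a fair amount of extra memory. The dimensions quoted earlier in the thread (2000, 3000, 5000, 10000, 14000) all pass `has_only_small_factors`, for what that's worth.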

I use a Tesla C1060 and a separate display card. The only unusual thing about my setup is that I use an ATI display card, so I use the startup script method mentioned in the release notes to make everything work properly (I also had to remove all traces of Nvidia stuff from my xorg.conf.)

It’d be interesting to find out what the problem was, just in case it comes back to haunt me…