The CUFFT library doesn’t seem to be working properly on my new Fermi machine. I am trying to run a number of 1-d, in-place, double precision FFTs of varying sizes, but cufftExecD2Z keeps returning CUFFT_EXEC_FAILED for certain sizes. I’ve written a test program that tries every size up to 8192, and I see errors for the following sizes:
577 <= n <= 640
2308 <= n <= 2560, but only for multiples of 4, and some of those are skipped, seemingly at random
Furthermore, I do not observe any errors for single precision transforms (i.e. R2C). Here’s my test code:
[codebox]#define MAX_SIZE 8192
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime_api.h>
#include <cufft.h>

int main(int argc, char ** argv){
    int n;
    int i;
    int error_count = 0;
    cufftDoubleComplex * v;   /* device buffer, transformed in place */
    cufftDoubleReal * x;      /* host input */
    cufftHandle p;

    /* An in-place D2Z transform of size n needs room for n/2+1 complex outputs. */
    x = (cufftDoubleReal *) malloc(sizeof(cufftDoubleComplex)*(MAX_SIZE/2+1));
    cudaMalloc((void **)&v, sizeof(cufftDoubleComplex)*(MAX_SIZE/2+1));

    for (i=0; i<MAX_SIZE; ++i)
        x[i] = (cufftDoubleReal) i;

    for (n=4; n<=MAX_SIZE; ++n){
        cudaMemcpy((cufftDoubleReal *) v, x, sizeof(cufftDoubleReal) * n, cudaMemcpyHostToDevice);
        cufftPlan1d(&p, n, CUFFT_D2Z, 1);
        int result = (int) cufftExecD2Z(p, (cufftDoubleReal *) v, v);
        cufftDestroy(p);
        if (result){
            printf("Error %d for n=%d\n", result, n);
            error_count++;
        }
    }

    if (!error_count)
        printf("No errors...\n");

    cudaFree(v);   /* v is device memory, so cudaFree rather than free */
    free(x);
    return 0;
}[/codebox]
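To make the output easier to read, the integer printed by the test could be mapped to a name. This is only a minimal sketch: cufft_result_name is a hypothetical helper, not part of CUFFT, and the numeric values are assumed to match the cufftResult enum in the installed cufft.h, so they should be checked against that header.

```c
/* Hypothetical helper (not part of CUFFT): turns the integer result code
 * printed by the test into a readable name. The numeric values are assumed
 * to match the cufftResult enum in cufft.h; verify against your header. */
const char *cufft_result_name(int code)
{
    switch (code) {
    case 0:  return "CUFFT_SUCCESS";
    case 1:  return "CUFFT_INVALID_PLAN";
    case 2:  return "CUFFT_ALLOC_FAILED";
    case 3:  return "CUFFT_INVALID_TYPE";
    case 4:  return "CUFFT_INVALID_VALUE";
    case 5:  return "CUFFT_INTERNAL_ERROR";
    case 6:  return "CUFFT_EXEC_FAILED";
    case 7:  return "CUFFT_SETUP_FAILED";
    case 8:  return "CUFFT_INVALID_SIZE";
    default: return "unknown cufftResult";
    }
}
```

With this in place, the printf in the loop could become printf("Error %d (%s) for n=%d\n", result, cufft_result_name(result), n). Checking the return value of cufftPlan1d the same way would show whether the plan itself fails before cufftExecD2Z is reached.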
I’m running amd64 Ubuntu 10.04 LTS with the 2.6.32-24 kernel.
CUDA Toolkit 3.1
CUDA SDK 3.1
Compiled with gcc
AMD Phenom II X4 955 Black Edition, 8 GB RAM, EVGA GTX480 1.5 GB