The CUFFT library doesn’t seem to be working properly on my new Fermi machine. I am trying to run a number of 1-D, in-place, double-precision real-to-complex FFTs of varying sizes, but cufftExecD2Z keeps returning CUFFT_EXEC_FAILED for certain sizes. I’ve written a test program that tries every size from 4 to 8192, and I see errors for the following sizes:

577 <= n <= 640

2308 <= n <= 2560, but only for multiples of 4, and some of those are skipped, seemingly at random
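To pin down which sizes in a range actually fail (rather than "seemingly at random"), the failing values of n could be collected in an array and collapsed into runs. This is only a sketch: `summarize_failures` is a hypothetical helper, not part of CUFFT, and it assumes the sizes are passed in ascending order.

```c
#include <stdio.h>

/* Hypothetical helper (not part of CUFFT): writes runs of consecutive
   failing sizes into out as "a-b", and isolated sizes as "a", separated
   by commas. Assumes sizes[] is sorted in ascending order. */
static void summarize_failures(const int *sizes, int count, char *out, size_t cap)
{
    size_t used = 0;
    int i = 0;
    int w;

    if (cap > 0)
        out[0] = '\0';

    while (i < count && used < cap) {
        int start = sizes[i];
        int end = start;

        /* Extend the run while the next size is exactly one larger. */
        while (i + 1 < count && sizes[i + 1] == end + 1)
            end = sizes[++i];
        ++i;

        if (start == end)
            w = snprintf(out + used, cap - used, "%s%d", used ? "," : "", start);
        else
            w = snprintf(out + used, cap - used, "%s%d-%d", used ? "," : "", start, end);
        if (w < 0)
            break;
        used += (size_t) w;   /* the loop stops once the buffer is full */
    }
}
```

For example, the sizes 577, 578, 579, 2308, 2312 would be summarized as `577-579,2308,2312`, which makes gaps inside a failing range easy to spot.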

Furthermore, I do not observe any errors for single precision transforms (i.e. R2C). Here’s my test code:

[codebox]#define MAX_SIZE 8192

#include <cufft.h>
#include <cuda_runtime.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char ** argv){
    int n;
    int i;
    int error_count = 0;
    cufftDoubleComplex * v;   /* device buffer, used as both input and output (in-place) */
    cufftDoubleReal * x;      /* host input data */
    cufftHandle p;

    /* An in-place D2Z transform of size n needs room for n/2+1 complex values. */
    x = (cufftDoubleReal *) malloc(sizeof(cufftDoubleComplex)*(MAX_SIZE/2+1));
    cudaMalloc((void **)&v, sizeof(cufftDoubleComplex)*(MAX_SIZE/2+1));

    for (i=0; i<MAX_SIZE; ++i)
        x[i] = (cufftDoubleReal) i;

    for (n=4; n<=MAX_SIZE; ++n){
        cudaMemcpy((cufftDoubleReal *) v, x, sizeof(cufftDoubleReal) * n, cudaMemcpyHostToDevice);
        cufftPlan1d(&p, n, CUFFT_D2Z, 1);

        int result = (int) cufftExecD2Z(p, (cufftDoubleReal *) v, v);
        cufftDestroy(p);
        if (result){
            printf("Error %d for n=%d\n", result, n);
            error_count++;
        }
    }

    if (!error_count)
        printf("No errors...\n");

    /* v is device memory, so it must be released with cudaFree, not free. */
    cudaFree(v);
    free(x);
    return 0;
}[/codebox]
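The test code above prints the raw numeric result, which is harder to read than the enum name. A small lookup like the following could be used in the printf. It is only a sketch: `cufft_result_name` is a hypothetical helper, written standalone so that it compiles without the CUDA headers, and the numeric values are assumed to mirror the `cufftResult` enum in `cufft.h`, so they should be checked against the installed header.

```c
/* Hypothetical helper (not part of CUFFT): maps a numeric cufftResult
   value to its enum name. The values are assumed to match the
   cufftResult enum in cufft.h; verify against your toolkit's header. */
static const char *cufft_result_name(int code)
{
    switch (code) {
    case 0:  return "CUFFT_SUCCESS";
    case 1:  return "CUFFT_INVALID_PLAN";
    case 2:  return "CUFFT_ALLOC_FAILED";
    case 3:  return "CUFFT_INVALID_TYPE";
    case 4:  return "CUFFT_INVALID_VALUE";
    case 5:  return "CUFFT_INTERNAL_ERROR";
    case 6:  return "CUFFT_EXEC_FAILED";
    case 7:  return "CUFFT_SETUP_FAILED";
    case 8:  return "CUFFT_INVALID_SIZE";
    case 9:  return "CUFFT_UNALIGNED_DATA";
    default: return "unknown cufftResult";
    }
}
```

In the loop it would be called as `printf("Error %d (%s) for n=%d\n", result, cufft_result_name(result), n);`. Checking the return value of cufftPlan1d the same way would also show whether a failure comes from plan creation or from execution.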

I’m running amd64 Ubuntu 10.04 LTS with the 2.6.32-24 kernel.

CUDA Toolkit 3.1

CUDA SDK 3.1

Compiled with gcc

AMD Phenom II X4 955 Black Edition, 8 GB RAM, EVGA GTX 480 1.5 GB