cufftExecD2Z giving CUFFT_EXEC_FAILED errors for certain sizes

The CUFFT library doesn’t seem to be working properly on my new Fermi machine. I am trying to do a number of 1-d, in-place, double precision FFTs of varying sizes, but cufftExecD2Z keeps returning CUFFT_EXEC_FAILED for certain sizes. I’ve written a test program that tries every size from 4 to 8192, and I see errors for the following sizes:

577 <= n <= 640

2308 <= n <= 2560, but only for multiples of 4, and some of those are skipped, seemingly at random

Furthermore, I do not observe any errors for single precision transforms (i.e. R2C). Here’s my test code:

[codebox]#define MAX_SIZE 8192

#include <cufft.h>
#include <cuda_runtime_api.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char ** argv){
    int n;
    int i;
    int error_count = 0;
    cufftDoubleComplex * v;   /* device buffer, transformed in place */
    cufftDoubleReal * x;      /* host input data */
    cufftHandle p;

    /* Both buffers are sized for the largest transform: MAX_SIZE/2+1 complex values. */
    x = (cufftDoubleReal *) malloc(sizeof(cufftDoubleComplex)*(MAX_SIZE/2+1));
    cudaMalloc((void **)&v, sizeof(cufftDoubleComplex)*(MAX_SIZE/2+1));

    for (i=0; i<MAX_SIZE; ++i)
        x[i] = (cufftDoubleReal) i;

    for (n=4; n<=MAX_SIZE; ++n){
        /* Reload the input each time, since the in-place transform overwrites it. */
        cudaMemcpy((cufftDoubleReal *) v, x, sizeof(cufftDoubleReal) * n, cudaMemcpyHostToDevice);
        cufftPlan1d(&p, n, CUFFT_D2Z, 1);
        int result = (int) cufftExecD2Z(p, (cufftDoubleReal *) v, v);
        cufftDestroy(p);

        if (result){
            printf("Error %d for n=%d\n", result, n);
            error_count++;
        }
    }

    if (!error_count)
        printf("No errors...\n");

    cudaFree(v);   /* v is device memory, so it is released with cudaFree */
    free(x);
    return 0;
}[/codebox]
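
A side note on the buffer sizing in the test above: a real-to-complex transform of length n produces n/2+1 complex values, so an in-place buffer has to hold that many complex elements rather than just n doubles. Here is a minimal sketch of that arithmetic in plain C. The helper names are hypothetical (they are not part of CUFFT), and it assumes a double-precision complex value is stored as two doubles, as cufftDoubleComplex is.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of complex outputs produced by a
   1-D real-to-complex transform of length n. */
size_t d2z_complex_count(size_t n)
{
    assert(n > 0);
    return n / 2 + 1;
}

/* Hypothetical helper: bytes needed for an in-place D2Z buffer of
   length n, assuming each complex value is two doubles. */
size_t d2z_inplace_bytes(size_t n)
{
    return d2z_complex_count(n) * 2 * sizeof(double);
}
```

For n = MAX_SIZE = 8192 this gives 4097 complex values, which is the same count that the malloc and cudaMalloc calls in the test allocate.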

I’m running amd64 Ubuntu 10.04 LTS with the 2.6.32-24 kernel.

CUDA Toolkit 3.1

CUDA SDK 3.1

Compiled with gcc

AMD Phenom II X4 955 Black Edition, 8 GB RAM, EVGA GTX480 1.5 GB

Have you tried this with the CUDA Toolkit 3.2 RC?

Thanks,
Cliff

Sorry for the late reply, just started a new job.

No, once I get access to my machine again I’ll give it a whirl.
