# CUFFT and 2D array of complex numbers

Greetings,

I am a complete beginner in CUDA (I had never heard of it until a few weeks ago). I was given a project that requires using the CUFFT library to perform transforms in one and two dimensions. To test whether I had implemented CUFFT properly, I used a 1D array of 1’s, which should return 0’s after being transformed (apart from the first element, which holds the sum of the inputs). The data passed to the plan created by cufftPlan1d is a 1D array of complex numbers, as shown in the following code:

```
#include <stdio.h>
#include <stdlib.h>
#include <cufft.h>
#include <cutil_inline.h>   // cutil helpers from the CUDA SDK samples

void runTest(int argc, char** argv);

#define SIGNAL_SIZE 4096
#define REPEAT 5000

int main(int argc, char** argv)
{
    runTest(argc, argv);

    cutilExit(argc, argv);
}

void runTest(int argc, char** argv)
{
    if (cutCheckCmdLineFlag(argc, (const char**)argv, "device"))
        cutilDeviceInit(argc, argv);
    else
        cudaSetDevice(cutGetMaxGflopsDeviceId());

    // Allocate host memory for the signal (REPEAT batches of SIGNAL_SIZE points)
    cufftComplex* h_signal = (cufftComplex*)malloc(SIGNAL_SIZE * REPEAT * sizeof(cufftComplex));

    // Initialize every batch, not only the first SIGNAL_SIZE elements
    for (unsigned int i = 0; i < SIGNAL_SIZE * REPEAT; i++) {
        h_signal[i].x = 1.0f; // real
        h_signal[i].y = 0.0f; // imag
    }

    // Display the first batch of the signal
    for (unsigned int i = 0; i < SIGNAL_SIZE; i++) {
        printf("%g %g\n", h_signal[i].x, h_signal[i].y);
    }

    printf("End of signal\n");

    // Allocate device memory for the signal
    cufftComplex* d_signal;
    cutilSafeCall(cudaMalloc((void**)&d_signal, SIGNAL_SIZE * REPEAT * sizeof(cufftComplex)));

    // Copy host memory to device
    cutilSafeCall(cudaMemcpy(d_signal, h_signal, SIGNAL_SIZE * REPEAT * sizeof(cufftComplex),
                             cudaMemcpyHostToDevice));

    // Create a batched 1D FFT plan and check the returned status.
    // (Comparing the constant CUFFT_SETUP_FAILED with 0 is always true
    // and says nothing about whether the call succeeded.)
    cufftHandle plan;
    cufftResult planStatus = cufftPlan1d(&plan, SIGNAL_SIZE, CUFFT_C2C, REPEAT);
    if (planStatus == CUFFT_SUCCESS)
        printf("CUFFT plan created\n");
    else
        printf("cufftPlan1d failed with error %d\n", (int)planStatus);

    // Use the CUFFT plan to transform the signal in place
    cufftResult execStatus = cufftExecC2C(plan, d_signal, d_signal, CUFFT_FORWARD);
    if (execStatus == CUFFT_SUCCESS)
        printf("FFT successfully executed on the GPU\n");
    else
        printf("cufftExecC2C failed with error %d\n", (int)execStatus);

    // Copy result from device to host (reusing the host buffer)
    cufftComplex* h_transformed_signal = h_signal;
    cutilSafeCall(cudaMemcpy(h_transformed_signal, d_signal,
                             SIGNAL_SIZE * REPEAT * sizeof(cufftComplex), cudaMemcpyDeviceToHost));

    // Display the first batch of results
    for (unsigned int i = 0; i < SIGNAL_SIZE; i++) {
        printf("%g %g\n", h_transformed_signal[i].x, h_transformed_signal[i].y);
    }

    printf("End of result\n");

    // Destroy the CUFFT plan
    cufftDestroy(plan);

    // Free host and device memory
    free(h_signal);
    cutilSafeCall(cudaFree(d_signal));
}
```
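
As a sanity check on what the test above should print, the transform of a constant signal can be worked out by hand. This assumes the forward transform is the usual unnormalized DFT with kernel $e^{-2\pi i kn/N}$, which is how the CUFFT documentation describes `CUFFT_FORWARD`. For $x_n = 1$ and $N$ points:

$$
X_k = \sum_{n=0}^{N-1} e^{-2\pi i kn/N} =
\begin{cases}
N, & k = 0,\\[4pt]
\dfrac{1 - e^{-2\pi i k}}{1 - e^{-2\pi i k/N}} = 0, & 1 \le k \le N-1.
\end{cases}
$$

With $N$ = SIGNAL_SIZE = 4096, each batch should therefore start with roughly (4096, 0) and be zero everywhere else, up to floating-point rounding.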

I’ve been struggling to figure out how to initialize a 2D array of complex numbers and pass it to a 2D C2C CUFFT plan. I’ve read everything on the forums that I could, but it’s still not clear to me. I know most people say it is better to flatten multidimensional arrays, but even getting to this point is proving very frustrating. I’ve tried the following with no success:

```
// Allocate memory for host signal
cufftComplex *h_idata = (cufftComplex *)malloc(size);

for (unsigned int col = 0; col < NX; col++) {
    for (unsigned int row = 0; row < NY; row++) {
        h_idata[row][col].x = 1.0f; // real
        h_idata[row][col].y = 0.0f; // imag
    }
}
```

But, I do believe that CUDA flattens multidimensional arrays(?).

I sincerely appreciate any help.

Thanks

In CUDA, CUFFT takes complex numbers as input in the form of a flat array:

cufftComplex *a_h;
for (i=0; i < N; i++) {
    // set a_h[i].x (real part) and a_h[i].y (imaginary part) here
}
It can then be transferred to the GPU easily with cudaMalloc and cudaMemcpy.

cufftComplex *h_idata = (cufftComplex *)malloc(size);

// One flat loop over all elements, assuming size == NX * NY * sizeof(cufftComplex)
for (int i = 0; i < NX * NY; i++) {
    h_idata[i].x = 1.0f; // real
    h_idata[i].y = 0.0f; // imag
}

Hopefully this will work.
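
A minimal sketch of that flattening in plain C, assuming row-major order (the last index varies fastest), which is the layout the CUFFT documentation gives for `cufftPlan2d(&plan, nx, ny, CUFFT_C2C)`, where `ny` is the contiguous dimension. The `complex_f` struct is a stand-in for `cufftComplex` so that the sketch builds without the CUDA toolkit, and the helper names `flat_index` and `make_ones_2d` are hypothetical, not part of any library.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for cufftComplex (an assumption for this sketch);
   with CUFFT available, use cufftComplex from <cufft.h> instead. */
typedef struct { float x; float y; } complex_f;

/* Row-major offset of element (i, j) in an nx-by-ny array: j varies fastest. */
static size_t flat_index(size_t i, size_t j, size_t nx, size_t ny)
{
    assert(i < nx && j < ny);
    return i * ny + j;
}

/* Allocate an nx-by-ny signal as one contiguous buffer and fill it with 1 + 0i.
   Returns NULL if the allocation fails; the caller frees the buffer. */
static complex_f *make_ones_2d(size_t nx, size_t ny)
{
    complex_f *data = (complex_f *)malloc(nx * ny * sizeof *data);
    if (data == NULL)
        return NULL;
    for (size_t i = 0; i < nx; i++) {
        for (size_t j = 0; j < ny; j++) {
            data[flat_index(i, j, nx, ny)].x = 1.0f; /* real */
            data[flat_index(i, j, nx, ny)].y = 0.0f; /* imaginary */
        }
    }
    return data;
}
```

The expression `h_idata[row][col]` from the question cannot work on a single pointer; with this layout the same element is `data[flat_index(row, col, nx, ny)]`, and the whole buffer of `nx * ny` elements can be copied to the device in one cudaMemcpy.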

I am using something like this:

```
int count = 0;

for (int i = 0; i < nx; i++)
{
    for (int j = 0; j < ny; j++)
    {
        h_data[count].x = ...;
        h_data[count].y = ...;
        count = count + 1;
    }
}
```

You then transfer the data to the device as a 1D array of size nx*ny.
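
To check the result of a 2D transform once it has been copied back to the host, a small host-side helper along the following lines may be useful. It is only a sketch: it assumes the input was the all-ones array from the question and that the forward transform is unnormalized, so the output should hold nx*ny in element 0 and roughly zero everywhere else. The `complex_f` struct is a stand-in for `cufftComplex`, and `is_dc_only_spectrum` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for cufftComplex (an assumption for this sketch). */
typedef struct { float x; float y; } complex_f;

/* Absolute value without pulling in the math library. */
static float abs_f(float v) { return v < 0.0f ? -v : v; }

/* Returns 1 if `out` looks like the unnormalized forward DFT of an all-ones
   signal with `count` elements: (count, 0) in element 0 and (0, 0) elsewhere,
   each component within `tol`. Returns 0 otherwise. */
static int is_dc_only_spectrum(const complex_f *out, size_t count, float tol)
{
    assert(out != NULL && count > 0);
    if (abs_f(out[0].x - (float)count) > tol || abs_f(out[0].y) > tol)
        return 0;
    for (size_t k = 1; k < count; k++) {
        if (abs_f(out[k].x) > tol || abs_f(out[k].y) > tol)
            return 0;
    }
    return 1;
}
```

For the 2D case, `count` would be nx*ny; the tolerance has to be chosen by hand, since single-precision rounding grows with the transform size.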