I am trying to run an FFT in a CUDA program, but the results appear to be more or less the same whether the input data is a sine wave or an impulse, and whether I use 512 data points or some other number.
The first 3 points of the result are generally in the region of 10^8, and by point 10 they have dropped to a minimum of about 5000 (I am dividing the result by the number of data points). Any clues as to what might be the matter?

The code I am using to execute the FFT is as follows:

cufftHandle plan;
cufftPlan1d(&plan, NX, CUFFT_C2C, BATCH);
// Execute the FFT plan
cufftExecC2C(plan, d_signal, d_fft, CUFFT_FORWARD);
//Get data off device
CUDA_SAFE_CALL( cudaMemcpy( h_fft, d_fft, sizeof(cufftComplex)*NX*BATCH, cudaMemcpyDeviceToHost) );
//Destroy FFT plan
cufftDestroy(plan);

(I am currently using 512 data points (NX) and BATCH = 1. The input data is loaded as the real part of the input array (d_signal), and the imaginary part is set to zero.)
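For comparison, here is a rough host-side sketch of what I would expect the forward transform to produce for these inputs. It is a naive O(n^2) DFT, not cuFFT itself, and the names Complex32, reference_dft and magnitude are my own placeholders (Complex32 just mirrors the x/y layout of cufftComplex so the check builds without CUDA). Assuming input amplitudes of order 1, an impulse at sample 0 should give a magnitude of 1 in every bin, and a sine wave with a whole number of cycles m should give about NX/2 in bins m and NX-m, before any division by NX.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical stand-in for cufftComplex (x = real part, y = imaginary part),
// so this check builds without the CUDA headers.
struct Complex32 { float x; float y; };

// Naive O(n^2) forward DFT, unnormalised:
//   out[k] = sum over j of in[j] * exp(-2*pi*i*j*k/n)
void reference_dft(const Complex32 *in, Complex32 *out, std::size_t n)
{
    const double two_pi = 6.283185307179586;
    for (std::size_t k = 0; k < n; ++k) {
        double re = 0.0, im = 0.0;
        for (std::size_t j = 0; j < n; ++j) {
            // j*k is reduced modulo n so the angle magnitude stays below 2*pi.
            const double angle = -two_pi * double((j * k) % n) / double(n);
            const double c = std::cos(angle);
            const double s = std::sin(angle);
            // Complex multiply: (x + i*y) * (c + i*s)
            re += in[j].x * c - in[j].y * s;
            im += in[j].x * s + in[j].y * c;
        }
        out[k].x = float(re);
        out[k].y = float(im);
    }
}

// Magnitude of one complex bin.
float magnitude(Complex32 v)
{
    return std::sqrt(v.x * v.x + v.y * v.y);
}
```

Running reference_dft on the same host array that gets copied to d_signal, and comparing magnitudes bin by bin against h_fft, should show whether the problem is in the transform call or in how the data is set up and copied.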

Also, if I run an inverse FFT on the results of the FFT, I do not get anything close to the original data back.
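One thing I want to rule out for the round trip: as far as I understand, cuFFT transforms are unnormalised, so a forward transform followed by an inverse one should return the input multiplied by NX, and the division by NX should happen exactly once. Below is a small host-side sketch of the comparison I have in mind. Complex32 and max_roundtrip_error are placeholder names of mine, not part of cuFFT, and it assumes the round-trip output has not been scaled yet.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical stand-in for cufftComplex (x = real part, y = imaginary part).
struct Complex32 { float x; float y; };

// Scales the unnormalised round-trip result by 1/n and returns the largest
// absolute difference from the original signal.
float max_roundtrip_error(const Complex32 *original,
                          const Complex32 *roundtrip,
                          std::size_t n)
{
    const float scale = 1.0f / float(n);
    float worst = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
        const float dx = roundtrip[i].x * scale - original[i].x;
        const float dy = roundtrip[i].y * scale - original[i].y;
        const float err = std::sqrt(dx * dx + dy * dy);
        if (err > worst) {
            worst = err;
        }
    }
    return worst;
}
```

If the value returned is small compared with the signal amplitude, the transforms themselves are probably fine and only the scaling was off; a large value would point back at the plan, the buffers, or the copy.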

Daniel Schofield