CUFFT on image

Hi, I’m new to CUDA, and I’m trying to use cuFFT for image filtering.

I wrote a program that just performs a forward FFT (R2C) followed by an inverse FFT (C2R).
But the result is weird: the values are very large (e.g. 169622.67, 249807.91, 487279.47…)

What is wrong with this?
The code is below (textureData is just an RGB image, converted to float values):

cufftHandle plan_fwd, plan_inv;

cufftReal *dev_real;
CUDA_HANDLE(cudaMalloc((void **)&dev_real, sizeof(cufftReal) * textureData.size()));
CUDA_HANDLE(cudaMemcpy(dev_real, &textureData.front(), sizeof(cufftReal) * textureData.size(), cudaMemcpyHostToDevice));

cufftComplex *dev_complex;
CUDA_HANDLE(cudaMalloc((void **)&dev_complex, sizeof(cufftComplex) * width * height * (3 / 2 + 1)));

CUDA_HANDLE(cufftPlan3d(&plan_fwd, height, width, 3, CUFFT_R2C));
CUDA_HANDLE(cufftExecR2C(plan_fwd, dev_real, dev_complex));
CUDA_HANDLE(cufftDestroy(plan_fwd));

CUDA_HANDLE(cufftPlan3d(&plan_inv, height, width, 3, CUFFT_C2R));
CUDA_HANDLE(cufftExecC2R(plan_inv, dev_complex, dev_real));
CUDA_HANDLE(cufftDestroy(plan_inv));

CUDA_HANDLE(cudaMemcpy(&textureData.front(), dev_real, sizeof(GLfloat) * textureData.size(), cudaMemcpyDeviceToHost));

CUDA_HANDLE(cudaFree(dev_real));
CUDA_HANDLE(cudaFree(dev_complex));
CUDA_HANDLE(cudaDeviceSynchronize()); // cudaThreadSynchronize() is deprecated

You have to divide the result by the size of the transform, i.e. the number of elements (here height * width * 3).

From the cuFFT documentation:

http://docs.nvidia.com/cuda/cufft/index.html#cufft-transform-directions

“cuFFT performs un-normalized FFTs; that is, performing a forward FFT on an input data set followed by an inverse FFT on the resulting set yields data that is equal to the input, scaled by the number of elements. Scaling either transform by the reciprocal of the size of the data set is left for the user to perform as seen fit.”
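To see the scaling in isolation, here is a minimal host-only C++ sketch. It does not call cuFFT; it uses a naive O(N^2) DFT with the same un-normalized convention, and the names naive_dft and round_trip_unscaled are made up for this illustration.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Naive O(N^2) DFT with no normalization, mirroring the convention quoted above.
// sign = -1 gives the forward transform, sign = +1 the inverse.
std::vector<std::complex<double>> naive_dft(
        const std::vector<std::complex<double>>& in, int sign) {
    assert(sign == 1 || sign == -1);
    const std::size_t n = in.size();
    const double pi = std::acos(-1.0);
    std::vector<std::complex<double>> out(n);
    for (std::size_t k = 0; k < n; ++k) {
        std::complex<double> sum(0.0, 0.0);
        for (std::size_t j = 0; j < n; ++j) {
            const double angle = sign * 2.0 * pi *
                static_cast<double>(j * k) / static_cast<double>(n);
            sum += in[j] * std::polar(1.0, angle);
        }
        out[k] = sum;
    }
    return out;
}

// Forward then inverse without any scaling:
// every value comes back multiplied by N = x.size().
std::vector<double> round_trip_unscaled(const std::vector<double>& x) {
    const std::vector<std::complex<double>> c(x.begin(), x.end());
    const auto back = naive_dft(naive_dft(c, -1), +1);
    std::vector<double> out;
    out.reserve(back.size());
    for (const auto& v : back) {
        out.push_back(v.real());
    }
    return out;
}
```

Calling round_trip_unscaled on a 4-element input should return each input value multiplied by 4 (up to rounding); dividing by 4 recovers the original. The very large values in the question are the same effect with N = height * width * 3.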

Thanks. I solved the problem by appending this code:

const float n = static_cast<float>(textureData.size()); // height * width * 3
for (size_t i = 0; i < textureData.size(); ++i) {
    textureData[i] /= n;
}
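Dividing by textureData.size() works here because the buffer holds exactly height * width * 3 floats, which is the size of the 3D transform in the plan. A small host-side sketch that makes the two element counts explicit is below; the helper names (transform_size, r2c_output_size, normalize) are hypothetical and not part of cuFFT, and textureData is assumed to behave like a std::vector<float>.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Number of real elements in an n0 x n1 x n2 transform. An un-normalized
// forward + inverse round trip scales the data by this factor.
std::size_t transform_size(std::size_t n0, std::size_t n1, std::size_t n2) {
    return n0 * n1 * n2;
}

// Number of complex elements a 3D real-to-complex transform produces:
// only the non-redundant half of the last dimension is stored.
std::size_t r2c_output_size(std::size_t n0, std::size_t n1, std::size_t n2) {
    return n0 * n1 * (n2 / 2 + 1);
}

// Divide every element by n, in place, after the inverse transform.
void normalize(std::vector<float>& data, std::size_t n) {
    assert(n > 0);
    const float scale = 1.0f / static_cast<float>(n);
    for (float& v : data) {
        v *= scale;
    }
}
```

With the plan dimensions from the question, one would call normalize(textureData, transform_size(height, width, 3)) after copying the result back to the host, and r2c_output_size(height, width, 3) gives the same count as the width * height * (3 / 2 + 1) expression used for the cudaMalloc of dev_complex.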