1D CUFFT results not matching FFTWF results (source code attached)


In the simple program I have attached, the CUFFT and FFTWF results do not match. Is this a byproduct of the way I am calculating the magnitude, abs() for the std::complex value vs. cuCabsf() for the cuComplex value?

I plotted this and it seems like the CUFFT bars are scaled somewhat larger than the FFTWF bars across the 50 rows.
Any idea what’s going on there?
driver.cpp (1.89 KB)

Upon looking at my results…the CUFFT values are 1.41x the FFTWF values.

1.41 ~ sqrt(2).

There is probably a difference between how FFTWF and CUFFT normalize the forward transform (CUFFT doesn’t do any normalization…)

So what do people do to fix this? I am working from some existing code that is “correct” and I am trying to write a GPU implementation of it. Do I have to fix the normalization manually?

Does this have anything to do with the compatibilityMode? I tried a couple of those and it didn’t seem to make a difference.

Also, I modified the output to print the real and imaginary parts of each result. They differ between the two libraries. In the FFTW case, entries 2-49 have a real component of -25. The values are different in the CUFFT version. The DC component is (1225,0) for FFTW and (1225,1225) for CUFFT.
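As a reference point for those numbers, here is a minimal sketch of an unnormalized forward DFT written out by hand. It assumes the test input is the real ramp 0, 1, ..., 49 with zero imaginary parts; that is a guess that happens to reproduce the FFTW values quoted above, since driver.cpp is not shown. The names `naive_forward_dft` and `real_ramp` are made up for illustration and are not part of FFTW or CUFFT.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

// Unnormalized forward DFT in O(n^2): X[k] = sum_j x[j] * exp(-2*pi*i*j*k/n).
// No 1/n or 1/sqrt(n) factor is applied anywhere.
std::vector<cplx> naive_forward_dft(const std::vector<cplx>& x) {
    assert(!x.empty());
    const double pi = std::acos(-1.0);
    const std::size_t n = x.size();
    std::vector<cplx> out(n);
    for (std::size_t k = 0; k < n; ++k) {
        cplx sum(0.0, 0.0);
        for (std::size_t j = 0; j < n; ++j) {
            // Reduce j*k modulo n first so the angle stays small.
            const double angle = -2.0 * pi * static_cast<double>((j * k) % n) /
                                 static_cast<double>(n);
            sum += x[j] * std::polar(1.0, angle);
        }
        out[k] = sum;
    }
    return out;
}

// Assumed test signal (hypothetical): real ramp 0..n-1, imaginary part zero.
std::vector<cplx> real_ramp(std::size_t n) {
    std::vector<cplx> x(n);
    for (std::size_t j = 0; j < n; ++j) {
        x[j] = cplx(static_cast<double>(j), 0.0);
    }
    return x;
}
```

Calling `naive_forward_dft(real_ramp(50))` gives a DC bin of about (1225, 0) and a real part of about -25 in every other bin, which matches the FFTW output. Note also that |(1225,1225)| = 1225*sqrt(2), so a DC bin of (1225,1225) has exactly the 1.41x magnitude seen in the plot.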

And from the FFTW site:

The DFT results are stored in-order in the array out, with the zero-frequency (DC) component in out[0]. The array in is not modified. Users should note that FFTW computes an unnormalized DFT, the sign of whose exponent is given by the dir parameter of fftw_create_plan. Thus, computing a forward followed by a backward transform (or vice versa) results in the original array scaled by n. See Section What FFTW Really Computes, for the definition of DFT.

So maybe it is CUFFT doing the normalization, or FFTW doing none?

CUFFT does not normalize the results:
IFFT(FFT(A)) = len(A)*A

You just need to scale the coefficients by 1/len(A) to get the same results.
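To make the scaling concrete, here is a small sketch that uses a hand-written O(n^2) DFT in place of either library, so it shows only the math and not the CUFFT or FFTW API. The function names are hypothetical. Neither direction applies a normalization factor, so the round trip comes back multiplied by len(A) until it is scaled by 1/len(A).

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

// Unnormalized DFT in O(n^2). sign = -1 is the forward transform,
// sign = +1 is the inverse; neither direction applies a 1/n factor.
std::vector<cplx> naive_dft(const std::vector<cplx>& x, int sign) {
    assert(!x.empty());
    assert(sign == -1 || sign == 1);
    const double pi = std::acos(-1.0);
    const std::size_t n = x.size();
    std::vector<cplx> out(n);
    for (std::size_t k = 0; k < n; ++k) {
        cplx sum(0.0, 0.0);
        for (std::size_t j = 0; j < n; ++j) {
            const double angle = sign * 2.0 * pi *
                                 static_cast<double>((j * k) % n) /
                                 static_cast<double>(n);
            sum += x[j] * std::polar(1.0, angle);
        }
        out[k] = sum;
    }
    return out;
}

// Forward then inverse with no scaling: the result is len(A) * A.
std::vector<cplx> round_trip_unscaled(const std::vector<cplx>& a) {
    return naive_dft(naive_dft(a, -1), +1);
}

// Same round trip, then every element is multiplied by 1/len(A) to recover A.
std::vector<cplx> round_trip_scaled(const std::vector<cplx>& a) {
    std::vector<cplx> out = round_trip_unscaled(a);
    const double scale = 1.0 / static_cast<double>(a.size());
    for (cplx& v : out) {
        v *= scale;
    }
    return out;
}
```

The 1/len(A) factor only restores the original data after a forward plus inverse pair. Applying it to the forward output alone just shrinks every bin by len(A); two unnormalized forward transforms of the same input should already agree without any scaling.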

I guess I am a little confused by what you mean there. Are you saying to scale the output data from the CUFFT?

As in, make a scaleFactor cuComplex like so:

cuComplex scaleFactor = make_cuComplex(1.0f/cols, 0.0);

and change my print statement to something like this:

sprintf(tmp, "%f\n", cuCabsf(cuCmulf(h_outdata[i], scaleFactor)));

This is what I tried, and it did not make the results correct. It just made the CUFFT output smaller by a factor of 50 (since I had 50 cols in my arrays).

Is it the case that the normalization factor only applies after doing the inverse FFT?

In other words - should I even bother checking these intermediate results, or should I run the FFT and IFFT and then check the results?


It appears that when I run IFFT(CUFFT(A)) I get a value that is scaled by sqrt(2)*50. This is contrary to the documentation, which says this will give me something scaled by the number of elements (which is 50 for my test case).
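One thing that would produce exactly these factors, sqrt(2) on the forward transform and sqrt(2)*50 on the round trip, is an input array where the imaginary part of each complex sample holds a copy of the real part instead of zero. That is only a guess, since driver.cpp is not shown here, but it is cheap to check against how the cuComplex input is filled. The sketch below tests the idea with a hand-written unnormalized DFT rather than CUFFT; the ramp input and all function names are assumptions for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

// Unnormalized O(n^2) DFT; sign = -1 forward, sign = +1 inverse.
std::vector<cplx> naive_dft(const std::vector<cplx>& x, int sign) {
    assert(!x.empty());
    assert(sign == -1 || sign == 1);
    const double pi = std::acos(-1.0);
    const std::size_t n = x.size();
    std::vector<cplx> out(n);
    for (std::size_t k = 0; k < n; ++k) {
        cplx sum(0.0, 0.0);
        for (std::size_t j = 0; j < n; ++j) {
            const double angle = sign * 2.0 * pi *
                                 static_cast<double>((j * k) % n) /
                                 static_cast<double>(n);
            sum += x[j] * std::polar(1.0, angle);
        }
        out[k] = sum;
    }
    return out;
}

// Hypothetical input: ramp 0..n-1. With duplicate_imag set, the imaginary
// part copies the real part (the suspected mistake); otherwise it is zero.
std::vector<cplx> ramp(std::size_t n, bool duplicate_imag) {
    std::vector<cplx> x(n);
    for (std::size_t j = 0; j < n; ++j) {
        const double v = static_cast<double>(j);
        x[j] = cplx(v, duplicate_imag ? v : 0.0);
    }
    return x;
}

// Magnitude of element j after forward + inverse, divided by the intended
// real sample value j. Requires 1 <= j < n.
double round_trip_gain(std::size_t n, bool duplicate_imag, std::size_t j) {
    assert(j >= 1 && j < n);
    const std::vector<cplx> x = ramp(n, duplicate_imag);
    const std::vector<cplx> back = naive_dft(naive_dft(x, -1), +1);
    return std::abs(back[j]) / static_cast<double>(j);
}
```

With n = 50, `round_trip_gain` comes out at about 50 for the clean input and about 70.71 (sqrt(2)*50) when the imaginary part duplicates the real part, and the forward DC bin for the duplicated input is about (1225, 1225). If the real input array turns out to be filled correctly, this explanation does not apply.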
