I am doing a 1D FFT. I feed CUFFT the same input data that I would feed FFTW, but the output from CUFFT does not seem to be “aligned” the same way FFTW's is. That is, in my FFTW code, I could calculate the center of the zero padding, then do some shifting to “left-align” all my data and end up with trailing zeros.

In CUFFT, the result of the FFT looks like the same data, but the zeros are not “centered” in the output, so the rest of my algorithm breaks. (After the shift that is supposed to left-align the data, there is still a “gap” in it.)
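One way to make the left-align step independent of where the zeros land is to locate the zero run instead of assuming it is centered. The following is only a sketch in plain Python, not the poster's code: `left_align` is a hypothetical helper, and it assumes the output can be treated as a circular buffer in which the zero padding forms the single longest run of near-zero values.

```python
def left_align(values, tol=1e-9):
    """Rotate `values` circularly so the longest run of near-zeros ends up trailing.

    Hypothetical helper: assumes the padding is the longest contiguous
    (possibly wrapping) run of entries with magnitude <= tol.
    """
    n = len(values)
    is_zero = [abs(v) <= tol for v in values]
    if all(is_zero) or not any(is_zero):
        return list(values)  # nothing to align

    best_start, best_len = 0, 0
    run_start, run_len = 0, 0
    # Walk the buffer twice so a run that wraps past the end is seen whole.
    for i in range(2 * n):
        if is_zero[i % n]:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0

    # The data begins right after the zero run.
    data_start = (best_start + best_len) % n
    return [values[(data_start + i) % n] for i in range(n)]


centered = [3, 4, 0, 0, 0, 0, 1, 2]    # zeros centered in the buffer
off_center = [4, 0, 0, 0, 0, 1, 2, 3]  # same data, zeros shifted by one
print(left_align(centered))    # [1, 2, 3, 4, 0, 0, 0, 0]
print(left_align(off_center))  # [1, 2, 3, 4, 0, 0, 0, 0]
```

If both libraries pass through a routine like this and still disagree, the difference is in the data itself rather than in the position of the padding.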

Can anyone give me any insight? I thought it had something to do with the compatibility flags, but even with cufftSetCompatibilityMode(plan, CUFFT_COMPATIBILITY_FFTW_ALL); I am still getting a bad result.

Here's a screenshot of the data in the first row. It is a plot of the magnitude of the first row, right after the inverse FFT has been taken. On the left is CUFFT, on the right is FFTW.

One suggestion is to input a signal with a known transform (for example, sin or cos) and see where the non-zero values end up.
Could you post your code? Is your input data real or complex?

This gave me a magnitude of 100 in the 76th element when I dumped the results to file, and when I tried it with the forward FFT, I got the 100 in the 26th element. Zeros everywhere else. Also, this was exactly the same between CUFFT and FFTW. Now I guess I am even more stumped as to why my other code isn't working.

There is nothing mathematically incorrect in the non-zero element coming out at different locations for the forward and inverse transforms; it follows directly once you write down the DFT expression for the input signal.
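To make this concrete, here is a minimal sketch using a plain O(N^2) DFT in Python (not CUFFT or FFTW). The input is an assumption: a unit-amplitude complex exponential of length 100 with frequency 25, chosen because it would reproduce the numbers reported above if elements are counted from 1. The function names are illustrative only.

```python
import cmath


def dft(x, sign):
    """Unnormalized DFT. sign=-1 is the forward transform, sign=+1 the inverse
    (the exponent convention used by FFTW and CUFFT; neither divides by N)."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(sign * 2j * cmath.pi * ((j * k) % n) / n) for j in range(n))
        for k in range(n)
    ]


def peak_bins(spectrum, tol=1e-6):
    """Return (zero-based index, magnitude) for every bin above tol."""
    return [(k, round(abs(v), 6)) for k, v in enumerate(spectrum) if abs(v) > tol]


N, K0 = 100, 25  # assumed length and frequency, not taken from the original code
signal = [cmath.exp(2j * cmath.pi * K0 * j / N) for j in range(N)]

forward_peaks = peak_bins(dft(signal, -1))
inverse_peaks = peak_bins(dft(signal, +1))
print(forward_peaks)  # [(25, 100.0)]  -> 26th element when counting from 1
print(inverse_peaks)  # [(75, 100.0)]  -> 76th element when counting from 1
```

The forward transform sums exp(2*pi*i*j*(K0 - k)/N), which is non-zero only at k = K0. The inverse sums exp(2*pi*i*j*(K0 + k)/N), which is non-zero only at k = N - K0.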

Here is a small code sample I got by modifying the cufft_library.pdf example. It takes a signal whose real part is cos(2*pi*i/16) (zero imaginary part) and computes its Fourier transform. The transform has two non-zero points, one at 16 and one at 240. The first half of the output contains the values for positive k, while the second half contains the values for negative k.
Here is the output
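For reference, the behaviour described here can be checked without a GPU. This is a minimal sketch, not the CUFFT example itself: it uses a plain O(N^2) DFT in Python and assumes a 256-point transform, which is what peaks at 16 and 240 imply for a cosine with a period of 16 samples. `signed_frequency` is a hypothetical helper that maps an output index to its positive or negative k.

```python
import cmath
import math

N = 256  # assumed transform length (16 + 240)

# Real part cos(2*pi*i/16), zero imaginary part.
signal = [complex(math.cos(2 * math.pi * i / 16), 0.0) for i in range(N)]


def forward_dft(x):
    """Unnormalized forward DFT with the exp(-2*pi*i*j*k/N) convention."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * ((j * k) % n) / n) for j in range(n))
        for k in range(n)
    ]


def signed_frequency(k, n):
    """First half of the output holds positive k, second half negative k."""
    return k if k < n // 2 else k - n


spectrum = forward_dft(signal)
peaks = [
    (k, signed_frequency(k, N), round(abs(v), 6))
    for k, v in enumerate(spectrum)
    if abs(v) > 1e-6
]
print(peaks)  # [(16, 16, 128.0), (240, -16, 128.0)]
```

Each peak has magnitude N/2 = 128 because the cosine splits evenly into a positive-frequency and a negative-frequency exponential.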

No, that was not the original code. That was just a contrived example to see whether the FFTW and CUFFT code had matching output, which they did. For some reason they don't match for my actual data, though.