What I am noticing is that d_fft_img is different in the second print call, leading me to believe that cuFFT is destroying its input data. However, I can’t find this mentioned anywhere in the documentation, and it seems like the kind of thing you would want to document. I scoured the forums and found only one other person mentioning the same issue, without a definitive answer.
Does anyone know for certain, one way or another, if the CUFFT functions (namely, C2R) ruin their input data?
Or can you offer any other plausible explanation as to why this would be happening?
Thanks in advance for any help, suggestions, or ideas.
[edit]
First, thanks to the people who have commented so far.
A few clarifications I should have made the first time around:
the results of each transform are correct
there’s nothing before, after, or in between these two calls - something else modifying the data is a non-issue.
“print” is a simplification here - I’m not accidentally printing the address of the pointer, or cpu mem instead of gpu mem, or anything like that : )
(Sorry, my mistake - I should’ve checked the manual first - padding only applies to in-place transforms, which you don’t use - plus you say it works the first time)
Since your second call is a C2R transform, most likely your data wasn’t in the expected format (the manual doesn’t make this very clear). For R2C and C2R, you need to have the correct number of elements in each row:
C2R:
n complex / row => 2n - 2 reals / row
R2C:
n reals / row => n / 2 + 1 complex / row
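The two row-size rules above can be written as small helpers. This is only a sketch: the function names are made up for illustration and are not part of cuFFT, and the divisions are integer divisions. Note that the C2R rule 2n - 2 recovers the real length only when that length was even; an odd real length maps to the same complex count and cannot be recovered from it.

```c
#include <assert.h>

/* Hypothetical helper (not a cuFFT function): number of complex elements
   per row produced by an R2C transform of n_real real samples.
   Uses integer division, i.e. floor(n_real / 2) + 1. */
static int r2c_complex_per_row(int n_real)
{
    assert(n_real > 0);
    return n_real / 2 + 1;
}

/* Hypothetical helper (not a cuFFT function): number of real samples
   per row produced by a C2R transform of n_complex complex elements,
   assuming the original real length was even. */
static int c2r_real_per_row(int n_complex)
{
    assert(n_complex > 1);
    return 2 * n_complex - 2;
}
```

For a row of 8 reals, r2c_complex_per_row(8) gives 5, and c2r_real_per_row(5) gives 8 again. For a row of 7 reals the complex count is 4, and c2r_real_per_row(4) gives 6, which is why the real length has to be tracked separately when it can be odd.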
The Fourier transform doesn’t change the degrees of freedom, so the “padding” is probably to simplify addressing. Intel IPP’s complex transform format doesn’t have any padding.
This is a very old post, but it affects me. I see similar behaviour. My code is a set of iterations of the same task.
I start with a matrix psi and its Fourier transform psik.
I apply the following algorithm nsteps times:
1) calculate with a kernel a matrix nt[i]=psi[i]^3
2) take the FT of the matrix nt → ntk
3) update in k space psik[i]=psik[i]*f1[i]+ntk[i]*fk[i]
4) take the IFT of psik → psi to obtain the new result
Now we go back to step 1. Since nothing is done in between, the matrices psi and psik should both survive.
When I go back to step 1, the psik matrix should still be in memory, but it gets lost somehow. So the algorithm only works if another step is added:
5) FT of psi to psik
Hi all, I know this is an old post, but I had the same question, so I am posting the answer for anyone who might run into the same issue.
This is the expected behavior of cufftExecC2R, and it is documented:
The complex-to-real transform is implicitly inverse. For in-place complex-to-real FFTs where FFTW compatible output is selected (default padding mode), the input size is assumed to be ⌊N/2⌋+1 cufftComplex elements. Note that in-place complex-to-real FFTs may overwrite arbitrary imaginary input point values when non-unit input and output strides are chosen. Out-of-place complex-to-real FFT will always overwrite input buffer. For out-of-place transforms, input and output sizes match the logical transform non-redundant size ⌊N/2⌋+1 and size N, respectively.