cufftExecC2R and cufftExecR2C Memory Layout


I'm currently trying to implement some Fourier filters for 2D data. I recently implemented them with the complex-to-complex transform functions, and those work the way I want ;). But I think I have misunderstood something about the real-to-complex functions. I'll try to show what I am doing with a small 2x2 image example.

My image looks like:
I1 I2
I3 I4

and is stored in GPU memory as [I1 I2 I3 I4] (where Ii, 1<=i<=4, is a pixel value of the input image).

Now comes the tricky part.
I'm doing the out-of-place Fourier transform (cufftExecR2C) and get an array of interleaved complex data (page 3 in the manual).

The result array should look like [R1 C1 R2 C2] (with Ri and Ci the real and imaginary parts of the resulting Fourier coefficients).
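For checking the expected layout on the CPU, here is a minimal sketch in plain Python (standard library only). It assumes the layout described in the cuFFT documentation for a 2D real-to-complex transform: an nx x ny real input yields nx x (ny/2 + 1) complex coefficients, each stored as an interleaved (real, imaginary) pair. The names r2c_2d and interleave are made up for illustration and are not cuFFT calls.

```python
import cmath

def r2c_2d(image, nx, ny):
    """Naive CPU reference for a 2D real-to-complex DFT (not a cuFFT call).

    `image` is a flat row-major list of nx*ny real values. Only the
    non-redundant half of the spectrum is returned: nx * (ny // 2 + 1)
    complex coefficients, flattened row-major.
    """
    nyh = ny // 2 + 1
    out = []
    for kx in range(nx):
        for ky in range(nyh):
            acc = 0j
            for x in range(nx):
                for y in range(ny):
                    # Forward transform uses the negative exponent sign.
                    phase = -2 * cmath.pi * (kx * x / nx + ky * y / ny)
                    acc += image[x * ny + y] * cmath.exp(1j * phase)
            out.append(acc)
    return out

def interleave(coeffs):
    """Flatten complex values to [re0, im0, re1, im1, ...]."""
    flat = []
    for c in coeffs:
        flat.extend([c.real, c.imag])
    return flat

# 2x2 example with I1..I4 = 1..4 (values chosen only for illustration).
spectrum = r2c_2d([1.0, 2.0, 3.0, 4.0], 2, 2)
print(len(spectrum))         # number of complex coefficients
print(interleave(spectrum))  # two floats per coefficient
```

Under that assumption the 2x2 example produces four complex coefficients, so the output buffer holds eight floats rather than four.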

But I'm not too sure about that.
According to the manual, I also have to do some padding because my transform is not in-place.
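One point worth double-checking here: going by the cuFFT documentation, the padding requirement applies to in-place real-to-complex transforms, where the real input shares its buffer with the larger complex output, while an out-of-place transform reads the plain nx x ny array. The sketch below shows what that padded layout would look like, assuming each row is padded to 2*(ny/2 + 1) real values. The helper pad_rows_for_inplace_r2c is a hypothetical name, not part of cuFFT.

```python
def pad_rows_for_inplace_r2c(image, nx, ny):
    """Hypothetical helper: copy a flat row-major nx-by-ny real image into
    a buffer whose rows are padded to 2 * (ny // 2 + 1) reals, which is
    the row width assumed here for an in-place R2C transform."""
    padded_width = 2 * (ny // 2 + 1)
    buf = []
    for x in range(nx):
        row = list(image[x * ny:(x + 1) * ny])
        # Pad values are placeholders; the transform overwrites the buffer.
        buf.extend(row + [0.0] * (padded_width - ny))
    return buf

# 2x2 example: each row of 2 reals is padded to 4 reals.
print(pad_rows_for_inplace_r2c([1.0, 2.0, 3.0, 4.0], 2, 2))
```

For the out-of-place case in this post, the input would then stay as the unpadded [I1 I2 I3 I4].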

Now I would do some re-sorting:
R1 C1 -> C2 R2
R2 C2 -> C1 R1

But after performing the inverse transform (cufftExecC2R), my output array/image doesn't look like the input.
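One thing worth ruling out when the round trip does not reproduce the input: cuFFT transforms are unnormalized, so a forward transform followed by the inverse returns the input scaled by the number of elements. Here is a minimal CPU sketch of that behaviour in plain Python, assuming the half-spectrum layout of nx x (ny/2 + 1) coefficients and no re-sorting in between. The names r2c_2d and c2r_2d are illustrative and are not cuFFT functions.

```python
import cmath

def r2c_2d(image, nx, ny):
    """Naive forward DFT of a real nx-by-ny image (CPU reference only).
    Returns the nx * (ny // 2 + 1) non-redundant coefficients, row-major."""
    nyh = ny // 2 + 1
    out = []
    for kx in range(nx):
        for ky in range(nyh):
            acc = 0j
            for x in range(nx):
                for y in range(ny):
                    phase = -2 * cmath.pi * (kx * x / nx + ky * y / ny)
                    acc += image[x * ny + y] * cmath.exp(1j * phase)
            out.append(acc)
    return out

def c2r_2d(half, nx, ny):
    """Naive unnormalized inverse of r2c_2d (CPU reference only).
    The missing half of the spectrum is rebuilt from the symmetry
    X[kx][ky] = conj(X[(-kx) % nx][(-ky) % ny]) of real-input DFTs."""
    nyh = ny // 2 + 1

    def coeff(kx, ky):
        if ky < nyh:
            return half[kx * nyh + ky]
        return half[((nx - kx) % nx) * nyh + (ny - ky)].conjugate()

    out = []
    for x in range(nx):
        for y in range(ny):
            acc = 0j
            for kx in range(nx):
                for ky in range(ny):
                    phase = 2 * cmath.pi * (kx * x / nx + ky * y / ny)
                    acc += coeff(kx, ky) * cmath.exp(1j * phase)
            out.append(acc.real)  # no 1/(nx*ny) factor applied here
    return out

image = [1.0, 2.0, 3.0, 4.0]                    # illustrative pixel values
round_trip = c2r_2d(r2c_2d(image, 2, 2), 2, 2)  # input scaled by nx*ny
restored = [v / (2 * 2) for v in round_trip]    # undo the scaling
print(round_trip)
print(restored)
```

If the GPU output matches the input only after dividing by nx*ny, the transforms themselves are fine and the remaining difference comes from the re-sorting step or the buffer sizes.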

Can anybody help me and tell me which step is wrong? I have been reading the manuals for hours and have tried nearly everything.
Greetings, XLRO