Trouble with CUFFT real-to-complex (R2C) transform

I am trying to use cufftExecR2C and print out the result. The complex FFT result it shows for the input

0       1       2       3
 4       5       6       7
 8       9      10      11
12      13      14      15

is

1.2e+02+ 0i -8+ 8i -8+ 0i -32+ 32i
0+ 0i 0+ 0i -32+ 0i 0+ 0i
0+ 0i -32+ -32i 0+ 0i 0+ 0i
-32+ -32i 0+ 0i 0+ 0i 0+ 0i

However, the correct result, which I get with cufftExecC2C, is

1.2e+02+ 0i -8+ 8i -8+ 0i -8+ -8i
-32+ 32i 0+ 0i 0+ 0i 0+ 0i
-32+ 0i 0+ 0i 0+ 0i 0+ 0i
-32+ -32i 0+ 0i 0+ 0i 0+ 0i

which is the same as the result I got from MATLAB.
Can someone explain what is happening with cufftExecR2C? Is it a bug or something else?
How can I use it for FFT convolution?

Thank you

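For real-to-complex transforms the output array contains only the non-redundant coefficients, i.e. N/2+1 complex values for an input of size N. For a 2D transform only the last dimension is halved, so a 4x4 real input produces a 4x3 complex output (12 values); printing those 12 values as rows of 4 instead of rows of 3 is what shifts the entries.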

This is explained in the CUFFT documentation.
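
To make the layout concrete, here is a minimal CPU-only sketch in plain Python. It does not call CUFFT; it assumes a naive DFT as a stand-in and simply keeps the first N/2+1 columns, which is the layout described above. The names dft2, rebuild_full, as_rows_of_4 and as_rows_of_3 are made up for this illustration.

```python
import cmath

def dft2(x):
    """Naive unnormalized 2D forward DFT; O(N^4), fine for a 4x4 demo."""
    n1, n2 = len(x), len(x[0])
    return [[sum(x[r][c] * cmath.exp(-2j * cmath.pi * (k1 * r / n1 + k2 * c / n2))
                 for r in range(n1) for c in range(n2))
             for k2 in range(n2)]
            for k1 in range(n1)]

def rebuild_full(half, n2):
    """Recover all n2 columns from the n2//2 + 1 stored ones using
    Hermitian symmetry: X[k1][k2] == conj(X[-k1 mod n1][-k2 mod n2])."""
    n1 = len(half)
    return [[half[k1][k2] if k2 <= n2 // 2
             else half[(-k1) % n1][n2 - k2].conjugate()
             for k2 in range(n2)]
            for k1 in range(n1)]

def show(rows):
    """Print rows of complex values rounded to integers."""
    for row in rows:
        print("  ".join(f"{round(v.real):g}{round(v.imag):+g}i" for v in row))

n = 4
x = [[float(n * r + c) for c in range(n)] for r in range(n)]  # 0..15, row-major

full = dft2(x)                             # 4x4 spectrum, as a C2C transform gives
half = [row[:n // 2 + 1] for row in full]  # 4x3 non-redundant part (R2C layout)
flat = [v for row in half for v in row]    # the 12 contiguous complex values

# Reading the 12 values with a row length of 4 instead of 3 scrambles them.
as_rows_of_4 = [flat[i:i + 4] for i in range(0, len(flat), 4)]
as_rows_of_3 = [flat[i:i + 3] for i in range(0, len(flat), 3)]

# The "missing" coefficients come back from the conjugate symmetry.
rebuilt = rebuild_full(half, n)

print("read as rows of 4 (misleading):")
show(as_rows_of_4)
print("read as rows of 3 (actual layout):")
show(as_rows_of_3)
print("rebuilt full 4x4 spectrum:")
show(rebuilt)
```

Read as rows of 4, the values match the first three rows of the puzzling printout in the question; read as rows of 3, they are the first three columns of the C2C result. The coefficient -8 - 8i is not lost: it is the complex conjugate of the stored -8 + 8i.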

Can you explain this more clearly? I read the documentation; it mentions non-redundant coefficients but does not really explain what that means or how the data is organized.

For example, in the full complex FFT result I can see the value -8 - 8i, but it does not appear in the non-redundant result.

If I want to perform FFT convolution, I have to multiply the FFTs of two signals. Can I do that directly on the non-redundant coefficients? If not, what is the point of the non-redundant output if I cannot perform such a simple operation on it?

Many thanks
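
On the convolution question: the product of two Hermitian-symmetric spectra is itself Hermitian-symmetric, so the two half spectra can be multiplied element by element and handed straight to the inverse transform. In CUFFT that inverse would be cufftExecC2R, and since CUFFT transforms are unnormalized the result has to be divided by the number of elements. Below is a minimal 1D sketch of that idea in plain Python, with a naive DFT standing in for the library; rdft, irdft, circular_convolve and circular_convolve_direct are made-up names, not CUFFT functions.

```python
import cmath

def rdft(x):
    """Forward DFT of a real sequence, keeping only bins 0..N/2
    (the same non-redundant layout an R2C transform stores)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n // 2 + 1)]

def irdft(half, n):
    """Unnormalized inverse from a half spectrum (the role of a C2R transform)."""
    # Rebuild the redundant bins from Hermitian symmetry: X[k] = conj(X[n - k]).
    full = list(half) + [half[n - k].conjugate() for k in range(n // 2 + 1, n)]
    return [sum(full[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)).real
            for t in range(n)]

def circular_convolve(a, b):
    """Circular convolution via pointwise product of the two half spectra."""
    n = len(a)
    product = [p * q for p, q in zip(rdft(a), rdft(b))]
    # Forward and inverse are both unnormalized, so scale by 1/n once.
    return [v / n for v in irdft(product, n)]

def circular_convolve_direct(a, b):
    """Reference implementation straight from the definition."""
    n = len(a)
    return [sum(a[j] * b[(t - j) % n] for j in range(n)) for t in range(n)]

a = [0.0, 1.0, 2.0, 3.0]
b = [1.0, 0.0, -1.0, 0.0]

fast = circular_convolve(a, b)
slow = circular_convolve_direct(a, b)

print("via half spectra:", [round(v, 6) for v in fast])
print("direct          :", slow)
```

Note that this gives a circular convolution. For an ordinary (linear) convolution, both inputs need to be zero-padded to at least the combined length minus one before transforming. The 2D case works the same way on the 4x3 arrays.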