Hello, I am new to CUDA and I have some questions about how cuFFT works:

Shouldn’t cufftExecR2C and cufftExecC2C give the same output when the input is a real array? I ran both on the same array and the results were nowhere close.

If not, in which cases do I use the first and in which the second?

I noticed that cufftExecC2C is the equivalent of MATLAB’s fft, but when there are many output values of large magnitude (e.g. on the order of e+05), there is a small deviation. Why is that, and how can I prevent it?

cufftExecR2C and cufftExecC2C will give the same results. However, the R2C transform returns only the non-redundant half of the inverse (frequency) space, because the transform of real input is conjugate-symmetric. Please note that if you define a float matrix dev_rdata and do the R2C transform, you cannot use the same matrix for the C2C transform. You have to define a complex matrix dev_cdata in which the imaginary parts are 0.
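To compare the two outputs element by element, the half spectrum can be expanded back to the full one. The following is only a minimal sketch in plain C for the 1D case, not cuFFT code: complex_f is a hypothetical stand-in for cufftComplex (real part in x, imaginary part in y), and expand_half_spectrum is an illustrative helper name. It assumes n >= 1, that half holds n/2+1 bins, and that full has room for n bins.

```c
#include <stddef.h>

/* Stand-in for cufftComplex (real part in x, imaginary part in y),
   defined here so the sketch compiles without CUDA. */
typedef struct { float x, y; } complex_f;

/* Rebuild the full n-point spectrum of a real signal from the n/2 + 1
   bins a real-to-complex transform returns, using X[n-k] = conj(X[k]). */
static void expand_half_spectrum(const complex_f *half, complex_f *full, size_t n)
{
    size_t n_half = n / 2 + 1;

    /* The first n/2 + 1 bins are stored explicitly. */
    for (size_t k = 0; k < n_half; k++) {
        full[k] = half[k];
    }
    /* The remaining bins are complex conjugates of stored bins. */
    for (size_t k = n_half; k < n; k++) {
        full[k].x =  half[n - k].x;
        full[k].y = -half[n - k].y;
    }
}
```

For a 2D matrix the same idea applies, but the mirrored element is taken at the negated index along both dimensions.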

I defined a complex matrix with the imaginary parts set to zero, and used it for both transforms (casting it to cufftReal* for R2C). I used the same matrix as input and output in order to get an in-place transform, which I had mistaken for the usual full-space FFT. So, thank you.
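The effect of that cast can be seen without a GPU. A real-input transform of length n reads n consecutive floats, so a complex array viewed through a float pointer is read as re0, 0, re1, 0, ... rather than re0, re1, re2, ... The sketch below is plain C for illustration only; complex_f is a hypothetical stand-in for cufftComplex and view_as_reals is an illustrative helper name.

```c
#include <stddef.h>

/* Stand-in for cufftComplex (real part in x, imaginary part in y). */
typedef struct { float x, y; } complex_f;

/* Copy the first n floats that a real-input transform would read when it
   is handed a complex array through a float pointer. */
static void view_as_reals(const complex_f *data, float *seen, size_t n)
{
    const float *as_real = (const float *)data;  /* same bytes, viewed as floats */
    for (size_t t = 0; t < n; t++) {
        seen[t] = as_real[t];
    }
}
```

With the complex input (1,0), (2,0), (3,0), (4,0) and n = 4, the values seen are 1, 0, 2, 0, which is a different signal from 1, 2, 3, 4, so the two transforms cannot agree.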

If you do an R2C transform, the result is stored in a matrix that is only slightly larger than the real one. If you have a matrix (lx by ly), then to do an in-place transform you need to define the input as a matrix of (lx by 2*(ly/2+1)) real elements. When you fill the input with loops:

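ccc = 0;
for (i = 0; i < lx; i++)
{
    /* each padded row holds 2*(ly/2+1) reals (equal to ly+2 when ly is even) */
    for (j = 0; j < 2 * (ly / 2 + 1); j++)
    {
        if (j < ly)
        {
            input[ccc] = ...;   /* real value for row i, column j */
        }
        /* elements with j >= ly are padding and are not part of the input */
        ccc = ccc + 1;          /* advance for every element, padding included */
    }
}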

the output will be a normal complex matrix of size (lx by ly/2+1) complex elements.
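The padded layout described above can also be written with explicit index arithmetic. This is only a sketch in plain C under stated assumptions: padded_row_length and pack_padded are illustrative helper names, the dense matrix is assumed to be stored row-major, and the padding is set to zero here purely for tidiness.

```c
#include <stddef.h>

/* Number of floats per row in the padded in-place layout. */
static size_t padded_row_length(size_t ly)
{
    return 2 * (ly / 2 + 1);
}

/* Copy a dense row-major lx-by-ly matrix into a buffer of
   lx * padded_row_length(ly) floats, zeroing the padding columns. */
static void pack_padded(const float *dense, float *padded, size_t lx, size_t ly)
{
    size_t stride = padded_row_length(ly);
    for (size_t i = 0; i < lx; i++) {
        for (size_t j = 0; j < stride; j++) {
            padded[i * stride + j] = (j < ly) ? dense[i * ly + j] : 0.0f;
        }
    }
}
```

After an in-place transform, the same buffer would be read as lx rows of ly/2+1 complex elements, with complex element k of row i occupying floats i*stride + 2*k and i*stride + 2*k + 1.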
