I use the CUDA 1D FFT (real to complex) function with the following parameters: n = 1024 sample points; batch = 512.
The 1D FFT (real to complex) function calculates a 1D array with a complex data type. I get (n/2 + 1) real and (n/2 + 1) imaginary results. But I need the point-symmetric (full) FFT result, and I don't know how to create this array.
I think I must first create an empty array of 1024x512 complex values. Then I copy the (n/2 + 1) FFT results into this array. Then I must copy the point-symmetric results into this array. This step must be repeated 512 times, once per batch.
What do you mean by point symmetric? Instead of a real transform you can define a complex array with all imaginary parts set to zero and then do a complex-to-complex transform. The result will be a complex array with n elements. Elements 0 to n/2 correspond to k from 0 to kmax, element n/2 corresponds to both kmax and -kmax, and the remaining elements n/2+1 to n-1 correspond to the negative k from just above -kmax up to the last negative value before 0.
If I need the negative k values I must use a complex-to-complex FFT. At the moment I use only the real-to-complex FFT. How can I convert a 1D array with a real data type (float) to a complex data type with imaginary parts = 0?
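A minimal sketch of that conversion in plain C. The `cpx` struct and the `real_to_complex` name are assumptions made for illustration; the struct only mirrors the x/y (real/imaginary) layout that cufftComplex uses, and this is not a CUFFT call.

```c
#include <stddef.h>

/* Hypothetical stand-in for a float2-style complex type:
   x is the real part, y is the imaginary part. */
typedef struct { float x, y; } cpx;

/* Copy n real samples into a complex array, setting every
   imaginary part to zero. */
void real_to_complex(const float *in, cpx *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        out[i].x = in[i];  /* real part: the original sample */
        out[i].y = 0.0f;   /* imaginary part: zero */
    }
}
```

For a batch, the same loop can simply run over all n * batch samples if the rows are stored back to back.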
It is not really necessary to do a complex-to-complex transform. If you do a real transform you get the values corresponding to the non-negative k. For real input, if you have the value of the k component psik(k), then the component corresponding to -k is just its complex conjugate. So psik(-k)=complex_conjugate(psik(k)).
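To see the conjugate relation on a small case, here is a hand-expanded 4-point DFT of a real signal in C. It is only an illustration: `cpx` and `dft4_real` are made-up names, and n = 4 is chosen because the twiddle factors are then exactly 1, -i, -1 and i, so no trigonometry is needed.

```c
/* Hypothetical complex type: x = real part, y = imaginary part. */
typedef struct { float x, y; } cpx;

/* 4-point DFT of a real signal, X[k] = sum_j in[j] * exp(-2*pi*i*j*k/4),
   written out term by term. */
void dft4_real(const float in[4], cpx X[4])
{
    X[0].x = in[0] + in[1] + in[2] + in[3];  /* k = 0 (DC), purely real */
    X[0].y = 0.0f;
    X[1].x = in[0] - in[2];                  /* k = +1 */
    X[1].y = in[3] - in[1];
    X[2].x = in[0] - in[1] + in[2] - in[3];  /* k = 2 (both +kmax and -kmax), purely real */
    X[2].y = 0.0f;
    X[3].x = in[0] - in[2];                  /* k = -1: same real part as k = +1 */
    X[3].y = in[1] - in[3];                  /* and the negated imaginary part */
}
```

X[3] is exactly the complex conjugate of X[1], so a real-to-complex transform that returns only X[0], X[1] and X[2] has not lost any information.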
Thanks for the example. I think I can use the real-to-complex FFT, but then I must insert the negative k values into the 1D array myself. I think the example is the best method.
Does the complex-to-complex FFT need more execution time than the real-to-complex FFT?
Yes, if you do a real-to-complex transform you need to construct a new array and insert the values for the negative k. The complex-to-complex transform will automatically contain all the values, but it will be roughly 2 times slower than the real-to-complex transform. It is now up to you to choose between comfort and speed. To begin with I would suggest going the simple way (complex-to-complex transform).
What do you mean you have no idea how to copy the negative k values? Do you not know how to do it in CUDA, or do you not know how to do it at all, even in simple C?
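Here is simple code in C for batch=1 (vec and newvec are assumed to be float complex arrays from <complex.h>):
for(int i=0;i<513;i++)
{
    newvec[512+i]=vec[i];        /* k = +i: copy as is */
    newvec[512-i]=conjf(vec[i]); /* k = -i: complex conjugate of the +i value */
}
The newvec array will be of size 1025, with newvec[0] corresponding to k=-kmax, newvec[512] corresponding to k=0 and newvec[1024] to k=kmax.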
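For the batched case from the first post, here is a sketch that fills an n x batch array in the order a complex-to-complex transform would return it (index 0 is k=0, index n/2 is kmax, indices above n/2 are the negative k), rather than the centered 1025-element layout of the loop above. The `cpx` struct and the function name `expand_half_spectrum` are assumptions for illustration, and n is assumed to be even.

```c
#include <stddef.h>

/* Hypothetical complex type: x = real part, y = imaginary part. */
typedef struct { float x, y; } cpx;

/* half: batch rows of (n/2 + 1) values, as a real-to-complex FFT returns them.
   full: batch rows of n values in complex-to-complex output order.
   Assumes n is even. */
void expand_half_spectrum(const cpx *half, cpx *full, size_t n, size_t batch)
{
    size_t h = n / 2 + 1;
    for (size_t row = 0; row < batch; row++) {
        const cpx *src = half + row * h;
        cpx *dst = full + row * n;
        for (size_t k = 0; k < h; k++) {
            dst[k] = src[k];               /* k >= 0: copy as is */
            if (k > 0 && k < n - k) {      /* skip k = 0 and k = n/2 */
                dst[n - k].x = src[k].x;   /* -k: same real part */
                dst[n - k].y = -src[k].y;  /* -k: negated imaginary part */
            }
        }
    }
}
```

With n = 1024 and batch = 512 this produces the 1024x512 array. Every (row, k) pair is written independently of all the others, so the two loops should map naturally onto one GPU thread per pair, although that port is not shown here.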
Maybe if you give more details we can see why you need the full spectrum. Maybe I can suggest a way around the need to copy the redundant data.
I need the whole FFT result for a cross-correlation of 2 images. The cross-correlation is based on an FFT algorithm. Doing the copy in C is no problem, but at the moment I have a problem doing it in CUDA.
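One possible way around the copy, sketched under assumptions: in an FFT-based cross-correlation the two spectra are multiplied element by element, with one of them conjugated. The product of two conjugate-symmetric spectra is itself conjugate-symmetric, so if the inverse transform is complex-to-real, the (n/2 + 1) values per row may already be enough. The `cpx` struct and the `cross_spectrum` name are made up for illustration, and which of the two spectra gets conjugated depends on the correlation convention you use.

```c
#include <stddef.h>

/* Hypothetical complex type: x = real part, y = imaginary part. */
typedef struct { float x, y; } cpx;

/* out[k] = a[k] * conj(b[k]) for k = 0 .. count-1.
   For half spectra, count would be batch * (n/2 + 1). */
void cross_spectrum(const cpx *a, const cpx *b, cpx *out, size_t count)
{
    for (size_t k = 0; k < count; k++) {
        /* (ax + i*ay) * (bx - i*by) */
        float re = a[k].x * b[k].x + a[k].y * b[k].y;
        float im = a[k].y * b[k].x - a[k].x * b[k].y;
        out[k].x = re;  /* temporaries keep this safe when out aliases a or b */
        out[k].y = im;
    }
}
```

Whether this fits depends on what else is done with the spectrum. If a later step really needs all n values, the expansion is still required.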