I need the point symmetric FFT results

Pimbolie1979 · October 31, 2011, 12:34pm

I use the CUDA 1D FFT (real-to-complex) function with the following parameters: n = 1024 sample points; batch = 512.

The 1D FFT (real-to-complex) function calculates a 1D array with a complex data type. I get (n/2 + 1) real and (n/2 + 1) imaginary results. But I need the point-symmetric FFT result, and I don't know how to create this array.

I think I must first create an empty array of 1024 x 512 complex values. Then I copy the (n/2 + 1) FFT results into this array. Then I must copy the point-symmetric results into this array. This step must be repeated 512 times.

Can somebody help me, please?

pasoleatis · October 31, 2011, 4:25pm

Hello,

What do you mean by point symmetric? Instead of a real transform you can define a complex array with the imaginary parts set to zero and then do a complex-to-complex transform. The result will be a complex array with n elements. With 0-based indexing, elements 0 to n/2 correspond to k from 0 to kmax, element n/2 corresponds to both kmax and -kmax, and the remaining elements n/2+1 to n-1 correspond to the negative k from -(kmax-1) to -1.
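To make the index layout concrete, here is a minimal sketch in plain C of how an output index of an n-point complex-to-complex transform maps to a signed k, assuming the standard FFT output order and an even n. The name bin_to_k is a hypothetical helper for illustration and not part of cuFFT.

```c
#include <assert.h>

/* Signed wavenumber k for output index idx of an n-point
   complex-to-complex FFT in standard order (n even).
   bin_to_k is a hypothetical helper, not a cuFFT function. */
int bin_to_k(int idx, int n)
{
    assert(n > 0 && n % 2 == 0);
    assert(idx >= 0 && idx < n);
    /* indices 0 .. n/2 hold k = 0 .. kmax;
       indices n/2+1 .. n-1 wrap around to k = -(kmax-1) .. -1 */
    return (idx <= n / 2) ? idx : idx - n;
}
```

For n = 1024 this gives k = 512 at index 512 (which is also k = -512) and k = -511 at index 513.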

Pimbolie1979 · October 31, 2011, 5:10pm

If I need the negative k from -kmax to kmax, I must use a complex-to-complex FFT. At the moment I use only the real-to-complex FFT. How can I convert a 1D array of a real data type (float) to a complex data type with imaginary parts = 0?

pasoleatis · October 31, 2011, 8:29pm

Something like this:

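// N = total number of samples in the arrays (e.g. n * batch)
__global__ void realtocomplex(const cufftReal *in, cufftComplex *out, int N)
{
    // one thread per sample
    int idx = threadIdx.x + blockDim.x * blockIdx.x;
    if (idx < N)
    {
        out[idx].x = in[idx];  // real part
        out[idx].y = 0.0f;     // imaginary part is zero
    }
}

call with:

realtocomplex<<<grid,threads>>>(in, out, N);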

This is rough code; you might have to adjust it to make it work.

pasoleatis · October 31, 2011, 8:33pm

It is not really necessary to do a complex-to-complex transform. If you do a real transform you get the values corresponding to positive k. If you have the value of the k component psik(k), then the component corresponding to -k is just the complex conjugate of psik(k). So psik(-k)=complex_conjugate(psik(k)).
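As an illustration of that relation, the sketch below reads the value for any signed k directly from the n/2+1 stored results of a real-to-complex transform. It is plain C under stated assumptions: cplx is a stand-in for cufftComplex (x = real part, y = imaginary part), and spectrum_at is a hypothetical helper name.

```c
#include <assert.h>

/* Stand-in for cufftComplex: x = real part, y = imaginary part. */
typedef struct { float x, y; } cplx;

/* Spectrum value for a signed k in -n/2 .. n/2, read from the n/2+1
   values of a real-to-complex transform (hypothetical helper). */
cplx spectrum_at(const cplx *half, int n, int k)
{
    assert(n > 0 && n % 2 == 0);
    assert(k >= -n / 2 && k <= n / 2);
    if (k >= 0)
        return half[k];                    /* stored directly */
    cplx c = { half[-k].x, -half[-k].y };  /* conjugate of the +k value */
    return c;
}
```

If a later step only needs individual values, a lookup like this may be enough and no second array has to be filled.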

Pimbolie1979 · October 31, 2011, 10:27pm

Thanks for the example. I think I can use the real-to-complex FFT, but then I must insert the negative k from -kmax to kmax into the 1D array. I think the example is the best method.

Does the complex-to-complex FFT need more execution time than the real-to-complex FFT?

pasoleatis · November 1, 2011, 5:22am

Yes, if you do a real-to-complex transform you need to construct a new array and insert the values for the negative k. The complex-to-complex transform will automatically contain all the values, but it will be about 2 times slower than the real-to-complex transform. It is now up to you to choose between comfort and speed. To begin with, I would suggest going the simple way (complex-to-complex transform).

Pimbolie1979 · November 1, 2011, 1:34pm

But I need the fast version because I have a large data stream.

1. First I allocate a new array in GPU memory.
2. Then I calculate the FFT of 1024 sample points (batch = 512).
3. After that I copy the 513 complex FFT results to the new array. Then I must copy the 511 negative k values to the array.
4. Repeat step 3 512 times because batch = 512.

But I have no idea how to copy the negative k values and the FFT results to the new array.

Can somebody post an example of the copy process?

pasoleatis · November 1, 2011, 2:04pm

What do you mean you have no idea how to copy the negative k values? You do not know how to do it in CUDA, or you do not know how to do it at all even in simple C?

Here is some simple code in C for batch=1:

// complex_conjg stands for the complex conjugate
for(int i=0;i<513;i++)
{
    newvec[512+i]=vec[i];                 // k = 0 ... kmax
    newvec[512-i]=complex_conjg(vec[i]);  // k = 0 ... -kmax
}

The newvec array will be of size 1025, with newvec[0] corresponding to k=-kmax, newvec[512] corresponding to k=0 and newvec[1024] to k=kmax.

For batch > 1:

for(int offset=0;offset<batch;offset++)
{
    for(int i=0;i<513;i++)
    {
        // each batch holds 513 values in vec and 1025 values in newvec
        newvec[offset*1025+512+i]=vec[offset*513+i];
        newvec[offset*1025+512-i]=complex_conjg(vec[offset*513+i]);
    }
}

It should be straightforward to convert it to CUDA, taking into account that you have complex numbers.
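The loops above build a centred layout of 1025 values per batch. If the goal is instead the 1024 x 512 array that a complex-to-complex transform would return, the same idea can be written for the standard FFT order. The following is a minimal sketch in plain C, not a tested CUDA kernel; cplx is a stand-in for cufftComplex and expand_half_spectrum is a hypothetical name.

```c
#include <assert.h>

/* Stand-in for cufftComplex: x = real part, y = imaginary part. */
typedef struct { float x, y; } cplx;

/* Expand `batch` half spectra (n/2+1 values each, as produced by a
   real-to-complex transform) into full n-point spectra in standard
   FFT order. Hypothetical helper; n must be even. */
void expand_half_spectrum(const cplx *half, cplx *full, int n, int batch)
{
    assert(n > 0 && n % 2 == 0 && batch > 0);
    const int nh = n / 2 + 1;             /* stored values per batch */
    for (int b = 0; b < batch; b++) {
        const cplx *src = half + b * nh;
        cplx *dst = full + b * n;
        for (int i = 0; i < nh; i++)      /* k = 0 .. n/2: plain copy */
            dst[i] = src[i];
        for (int i = nh; i < n; i++) {    /* k = -(n/2-1) .. -1 */
            dst[i].x =  src[n - i].x;     /* conjugate of the +k value */
            dst[i].y = -src[n - i].y;
        }
    }
}
```

With n = 1024 and batch = 512 this copies 513 values and fills in 511 conjugates per batch. Every output element depends on a single input element, so in a CUDA version each (b, i) pair could presumably be handled by its own thread.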

pasoleatis · November 1, 2011, 2:25pm

Maybe if you give more details we can see why you need the full spectrum. Maybe I can suggest a way around the need to copy the redundant data.

Pimbolie1979 · November 1, 2011, 6:09pm

I need the whole FFT result for a cross-correlation of 2 pictures. The cross-correlation is based on an FFT algorithm. Solving the copy process in C is no problem. But at the moment I have a problem solving this in CUDA.
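For real input data the product A[k] * conj(B[k]) used in an FFT-based cross-correlation is itself conjugate-symmetric. One possible way to avoid the copy is therefore to multiply only the n/2+1 stored values and pass the result to a complex-to-real inverse transform (cufftExecC2R in cuFFT). Below is a minimal sketch of the multiplication step in plain C. cplx is a stand-in for cufftComplex, cross_spectrum is a hypothetical name, and which factor is conjugated depends on the correlation convention in use.

```c
#include <assert.h>

/* Stand-in for cufftComplex: x = real part, y = imaginary part. */
typedef struct { float x, y; } cplx;

/* out[i] = a[i] * conj(b[i]) for `count` stored spectrum values,
   e.g. count = (n/2 + 1) * batch. Hypothetical helper. */
void cross_spectrum(const cplx *a, const cplx *b, cplx *out, int count)
{
    assert(count >= 0);
    for (int i = 0; i < count; i++) {
        /* (ax + i*ay) * (bx - i*by) */
        float re = a[i].x * b[i].x + a[i].y * b[i].y;
        float im = a[i].y * b[i].x - a[i].x * b[i].y;
        out[i].x = re;   /* temporaries allow out to alias a or b */
        out[i].y = im;
    }
}
```

Note that cuFFT transforms are unnormalized, so the result of a forward plus inverse transform is scaled by n.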

Pimbolie1979 · November 1, 2011, 8:35pm

I'm sick today. I will test it in 1 or 2 days.