CUDA FFT different from Matlab FFT

Manugal · September 21, 2010, 9:04am

Any other suggestions?

jan.heckman · September 23, 2010, 12:01am

[attachment=18316:simpleCUFFT2D.zip]

I worked out an example. The example has a signal size of 1013, as specified. The 8 signals are multiples of a 50 Hz sine, which makes the desired outcome obvious.

The CUDA code and the Matlab script are attached. My Matlab skills are so-so, so it could be that I didn’t use Matlab quite right, but the CUDA results appear irreproachable.

Please see for yourself - and tell me about it!

The CUDA example, as the name says, is derived in a straightforward manner from the SDK example.

The matlab script is mostly copy-paste as well…

Jan

pavanky · September 23, 2010, 6:09am

Hi!

I’m porting a Matlab application to CUDA. I need to compute an FFT with the cuFFT library, but the results of Matlab fft() and the CUDA FFT differ. I’ve seen in this forum ( http://forums.nvidia.com/index.php?showtop…mp;#entry589016 ) and elsewhere that the problem resides in the different matrix layout (row-major order in CUDA vs. column-major order in Matlab) and in the different representation of complex values (in CUDA we have an array of structs, while in Matlab we have a struct of arrays). The Matlab instruction that I’m trying to convert is:
% A is a 1013*256-matrix

% B will be a 1024*256-matrix

B = fft(A, 1024, 1);
So I’ve made these trials:

n-point cuFFT on A in 1 dimension:
cufftHandle plan;
cufftPlan1d(&plan,1024*256,CUFFT_C2C,1);
cufftExecC2C(plan,A,B,CUFFT_FORWARD);

cuFFT on A in 2 dimensions:
cufftHandle plan;
cufftPlan2d(&plan,1024,256,CUFFT_C2C);
cufftExecC2C(plan,A,B,CUFFT_FORWARD);

n-point cuFFT on A^t (the transpose of A) in 1 dimension:
cufftHandle plan;
cufftPlan1d(&plan,1024*256,CUFFT_C2C,1);
cufftExecC2C(plan,At,B,CUFFT_FORWARD);

cuFFT on A in 2 dimensions with args N and M swapped:
cufftHandle plan;
cufftPlan2d(&plan,256,1024,CUFFT_C2C);
cufftExecC2C(plan,A,B,CUFFT_FORWARD);

None of these gave me results more or less equal to Matlab fft(). What can I do?

It involves the following steps:

Always convert your data to complex.
Use only Z2Z / C2C plans.
If you use the inverse Fourier transform, normalize the result (divide every element by the number of elements in the output).

Also see http://developer.download.nvidia.com/compu…Library_1.1.pdf

Page 7 talks about how to use the cuFFT library with column-major data.
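The normalization step above can be sketched in plain host-side C. cuFFT leaves its transforms unscaled, whereas Matlab's ifft() divides by the transform length, so an inverse result needs one extra scaling pass to match. In this sketch `complex_f` is only a stand-in for `cufftComplex`, and `normalize_inverse` is a hypothetical helper, not part of the cuFFT API. For a batched 1D transform the divisor is assumed to be the length of one transform, not the total element count.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for cufftComplex (a float2: .x = real part, .y = imaginary part). */
typedef struct { float x, y; } complex_f;

/* Hypothetical helper: scale the output of an unnormalized inverse FFT so it
 * matches Matlab's ifft(). total_elems is the size of the whole buffer;
 * transform_len is the length of ONE transform (e.g. 1024 in a batch of 256). */
void normalize_inverse(complex_f *data, size_t total_elems, size_t transform_len)
{
    assert(data != NULL && transform_len > 0);
    const float scale = 1.0f / (float)transform_len;
    for (size_t i = 0; i < total_elems; ++i) {
        data[i].x *= scale;
        data[i].y *= scale;
    }
}
```

On a GPU the same scaling would more likely be done in a small kernel on device memory; the host loop here only illustrates the arithmetic.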

Philipp82 · September 23, 2010, 11:31am

Do you know that CUDA uses the row-major format for arrays? I don’t know what Matlab uses.
In your 2D FFT you have to read/write your arrays with index [Y + Ny * X],
where Y is your y-index, X is your x-index and Ny is the number of points in the y-direction.

Manugal · September 23, 2010, 12:12pm

I’m using complex data natively.

I’m using only Z2Z or C2C, as you can see.

I’m not using inverse FFT, so I don’t need to normalize it.

I’ve already read it, but it didn’t help me much.

@jan.heckman

Thank you, I’ve seen your program. However, it’s an example that doesn’t fit my program at all (and I don’t know how to adjust it for my needs). I “simply” need to understand why my CUDA instructions don’t produce the same results as Matlab, and how to resolve my situation.

However, now I’m waiting for an answer from my professor about this problem. Stay tuned! :)

pavanky · September 23, 2010, 8:36pm

Hi,

It looks like you want to convert A (1013 x 256) to B (1024 x 256)

You will need to pad each column with 11 elements (all zeros) before you send it into the FFT.
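A minimal host-side sketch of that padding step, assuming the input arrives Matlab-style as separate real and imaginary arrays in column-major order. `complex_f` is a stand-in for `cufftComplex`, and `pad_columns` is a hypothetical helper name. It writes an interleaved buffer in which every column occupies `padded_rows` contiguous elements, with the tail of each column zero-filled.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for cufftComplex (a float2: .x = real part, .y = imaginary part). */
typedef struct { float x, y; } complex_f;

/* Hypothetical helper: copy a rows x cols column-major matrix, given as split
 * real/imag arrays, into an interleaved padded_rows x cols column-major
 * buffer. Rows rows..padded_rows-1 of every column are set to zero.
 * im may be NULL for purely real input. */
void pad_columns(const float *re, const float *im,
                 size_t rows, size_t cols, size_t padded_rows,
                 complex_f *out)
{
    assert(re != NULL && out != NULL && padded_rows >= rows);
    for (size_t c = 0; c < cols; ++c) {
        for (size_t r = 0; r < padded_rows; ++r) {
            complex_f v = {0.0f, 0.0f};
            if (r < rows) {
                v.x = re[r + c * rows];
                v.y = im ? im[r + c * rows] : 0.0f;
            }
            out[r + c * padded_rows] = v;
        }
    }
}
```

With rows = 1013, cols = 256 and padded_rows = 1024, each column becomes a contiguous run of 1024 values. A batched 1D plan such as cufftPlan1d(&plan, 1024, CUFFT_C2C, 256) would then transform every column separately, which is presumably what fft(A, 1024, 1) computes. This has not been checked against the original poster's data.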

sirmark · March 29, 2011, 12:13pm

Have you solved it somehow?

wanderine · March 29, 2011, 1:31pm

My guess is that you have forgotten to change between column-major order in Matlab (y + x * DATA_H) and row-major order in CUDA (x + y * DATA_W).
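The two index formulas can be put side by side in a short C sketch. The helper names are hypothetical. `data_h` and `data_w` play the role of DATA_H and DATA_W above, with y as the row index and x as the column index.

```c
#include <assert.h>
#include <stddef.h>

/* Matlab-style column-major index: the row index y varies fastest. */
size_t idx_col_major(size_t y, size_t x, size_t data_h)
{
    return y + x * data_h;
}

/* C-style row-major index: the column index x varies fastest. */
size_t idx_row_major(size_t y, size_t x, size_t data_w)
{
    return x + y * data_w;
}

/* Rearrange a column-major h x w matrix into a row-major h x w matrix.
 * src and dst must not overlap. */
void col_to_row_major(const float *src, float *dst, size_t h, size_t w)
{
    assert(src != NULL && dst != NULL && src != dst);
    for (size_t y = 0; y < h; ++y)
        for (size_t x = 0; x < w; ++x)
            dst[idx_row_major(y, x, w)] = src[idx_col_major(y, x, h)];
}
```

A column-major h x w buffer is element for element the same memory as a row-major w x h buffer holding the transpose. When the FFT runs along Matlab columns, it may therefore be enough to treat each column as one contiguous batch entry instead of copying the data.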