CUDA FFT different from Matlab FFT

Any other suggestions?

[attachment=18316:simpleCUFFT2D.zip]

I worked out an example. It uses a signal size of 1013, as specified. The 8 signals are sines at multiples of 50 Hz, which makes the expected outcome obvious.

The CUDA code and the MATLAB script are attached. My MATLAB skills are so-so, so it could be that I didn’t use MATLAB quite right, but the CUDA results look correct.

Please see for yourself and tell me what you find!

The CUDA example, as the name says, is derived in a straightforward manner from the SDK example.

The MATLAB script is mostly copy-paste as well…

Jan

It involves the following steps:

  1. Always convert your data to complex.

  2. Use only Z2Z / C2C plans.

  3. If you use the inverse Fourier transform, normalize it (divide every element by the number of elements in the output).

See also http://developer.download.nvidia.com/compu…Library_1.1.pdf

Page 7 explains how to use the cuFFT library with column-major data.
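Steps 1 and 3 above can be sketched in plain host-side C. This is only an illustration: `complex_f` is a hypothetical stand-in for `cufftComplex` (a pair of floats, real part then imaginary part), and the function names are made up for this example. The normalization matters because cuFFT's inverse transform is unscaled, whereas MATLAB's `ifft` divides by the number of elements.

```c
#include <stddef.h>

/* Hypothetical stand-in for cufftComplex: x = real part, y = imaginary part. */
typedef struct { float x; float y; } complex_f;

/* Step 1: promote real samples to complex by zeroing the imaginary part. */
void real_to_complex(const float *in, complex_f *out, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        out[i].x = in[i];
        out[i].y = 0.0f;
    }
}

/* Step 3: scale an unnormalized inverse-transform result by 1/n,
   where n is the number of elements in the output. */
void normalize_inverse(complex_f *data, size_t n)
{
    const float scale = 1.0f / (float)n;
    for (size_t i = 0; i < n; ++i) {
        data[i].x *= scale;
        data[i].y *= scale;
    }
}
```

In a real program the scaling would more likely run in a small kernel on the device, right after the inverse transform, rather than on the host.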

Do you know that CUDA uses the row-major format for arrays? I don’t know what MATLAB uses.
In your 2D FFT you have to read/write your arrays with index [Y + Ny * X],
where Y is your y-index, X is your x-index, and Ny is the number of points in the y-direction.
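For reference, MATLAB stores matrices in column-major order, so the index [Y + Ny * X] is exactly how a MATLAB array is laid out in memory. The two conventions can be written as a pair of small helpers; the function names are made up for this sketch.

```c
#include <stddef.h>

/* Column-major (MATLAB, Fortran): the y index varies fastest in memory.
   ny is the number of points in the y-direction. */
size_t col_major_index(size_t y, size_t x, size_t ny)
{
    return y + ny * x;
}

/* Row-major (C and CUDA arrays): the x index varies fastest in memory.
   nx is the number of points in the x-direction. */
size_t row_major_index(size_t y, size_t x, size_t nx)
{
    return x + nx * y;
}
```

For example, in a matrix with 3 rows and 4 columns, the element at y = 2, x = 1 sits at offset 5 in column-major storage and at offset 9 in row-major storage.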

I’m using complex data natively.

I’m using only Z2Z or C2C, as you can see.

I’m not using the inverse FFT, so I don’t need to normalize.

I’ve already read it, but it didn’t help me much.

@jan.heckman

Thank you, I’ve seen your program. However, it’s an example that doesn’t fit my program at all (and I don’t know how to adapt it to my needs). I “simply” need to understand why my CUDA code doesn’t produce the same results as MATLAB, and how to resolve this.

For now, I’m waiting for an answer from my professor about this problem. Stay tuned! :)

Hi,

It looks like you want to convert A (1013 x 256) to B (1024 x 256).

You will need to pad each column with 11 elements (all zeros) before you send it into the FFT.
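The padding can be done on the host before the data is copied to the GPU. The sketch below assumes the matrix is stored column by column (as MATLAB does) and that `padded_rows >= rows`; `complex_f` and `pad_columns` are hypothetical names for this example. It mirrors what MATLAB's `fft(X, n)` does when `n` exceeds the column length: trailing zeros are appended to each column.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for cufftComplex: x = real part, y = imaginary part. */
typedef struct { float x; float y; } complex_f;

/* Copy a column-major (rows x cols) matrix into a column-major
   (padded_rows x cols) buffer, appending zeros to the end of each column.
   Assumes padded_rows >= rows and that dst does not overlap src. */
void pad_columns(const complex_f *src, size_t rows,
                 complex_f *dst, size_t padded_rows, size_t cols)
{
    for (size_t c = 0; c < cols; ++c) {
        /* Original samples of column c. */
        memcpy(dst + c * padded_rows, src + c * rows,
               rows * sizeof(complex_f));
        /* Zero-fill the remainder of column c. */
        memset(dst + c * padded_rows + rows, 0,
               (padded_rows - rows) * sizeof(complex_f));
    }
}
```

For the sizes in this thread the call would be `pad_columns(A, 1013, B, 1024, 256)`, after which each 1024-element column of `B` can be fed to a batched 1D transform.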

Have you solved it somehow?

My guess is that you have forgotten to change between column-major order in MATLAB (y + x * DATA_H) and row-major order in CUDA (x + y * DATA_W).
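If that is the cause, one way to check is to reorder the MATLAB data explicitly before comparing results. A minimal sketch, assuming real-valued `float` data and using `data_h` / `data_w` for the matrix height and width (the function name is made up for this example):

```c
#include <stddef.h>

/* Reorder a data_h x data_w matrix from column-major (MATLAB) storage
   into row-major (C/CUDA) storage. src and dst must not overlap. */
void col_major_to_row_major(const float *src, float *dst,
                            size_t data_h, size_t data_w)
{
    for (size_t y = 0; y < data_h; ++y) {
        for (size_t x = 0; x < data_w; ++x) {
            /* Read with the MATLAB index, write with the C index. */
            dst[x + y * data_w] = src[y + x * data_h];
        }
    }
}
```

For the 2 x 3 matrix [1 2 3; 4 5 6], MATLAB holds the values in memory as 1 4 2 5 3 6, and the reordered buffer holds 1 2 3 4 5 6.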