simple CUFFT problem / time to frequency domain

Hi,

I 've been trying to transform an audio input array that holds the amplitude values to an array that holds the corresponding frequency values. But what I’m getting is just a copy of the input array at the end. Here is some simple code Im using:

[codebox]

cufftResult result;

cufftHandle plan;

cufftComplex* COMPLEX_DATA, *COMPLEX_DATA2;

cutilSafeCall(cudaMalloc((void**)&COMPLEX_DATA,

				sizeof(cufftComplex)*len));

cutilSafeCall(cudaMalloc((void**)&COMPLEX_DATA2,

				sizeof(cufftComplex)*len));

for(int i = 0;i<len;i++){

COMPLEX_DATA[i].x = AUDIO_INPUT_ARRAY[i];

COMPLEX_DATA[i].y = 0;}

result = cufftPlan1d(&plan, len, CUFFT_C2C, 1);

result = cufftExecC2C(plan, COMPLEX_DATA, COMPLEX_DATA2, CUFFT_FORWARD);

for(int i = 0;i<len;i++){

AUDIO_INPUT_ARRAY[i] = COMPLEX_DATA2[i].x;}

[/codebox]

So the AUDIO_INPUT_ARRAY now contains the same values it did before the transform. What am I getting wrong here?

I thought that it would be fairly simple to do this… Just want a transformation from time to frequency domain. Also, how would that be implemented for an input array with size much larger than 512? Would I have to use a 2d plan?

Thanks

Question 1: AUDIO_INPUT_ARRAY now contains the same values it did before the transform
Ans: you must copy data to device memory and copy data back. The general procedures are

// step 1: allocate host memory and set up data

cufftComplex* h_COMPLEX_DATA ;

h_COMPLEX_DATA = (cufftComplex*)malloc( sizeof(cufftComplex)*len ) ;
for(int i = 0;i<len;i++){
h_COMPLEX_DATA[i].x = AUDIO_INPUT_ARRAY[i];
h_COMPLEX_DATA[i].y = 0;
}

// step 2: allocate device memory and create plane
cufftComplex* d_COMPLEX_DATA, *d_COMPLEX_DATA2;
cufftHandle plan;

cutilSafeCall(cudaMalloc((void**)&d_COMPLEX_DATA , sizeof(cufftComplex)len));
cutilSafeCall(cudaMalloc((void
*)&d_COMPLEX_DATA2,sizeof(cufftComplex)*len));
cufftPlan1d(&plan, len, CUFFT_C2C, 1);

// step 3: copy data from host to device (you miss this part)

cutilSafeCall( cudaMemcpy( d_COMPLEX_DATA, h_COMPLEX_DATA, sizeof(cufftComplex)*len, cudaMemcpyHostToDevice) );

// step 4: do FFT
cufftExecC2C(plan, d_COMPLEX_DATA, d_COMPLEX_DATA2, CUFFT_FORWARD);

// step 5: copy data back from device to host
cufftComplex* h_COMPLEX_DATA2 ;
h_COMPLEX_DATA2 = (cufftComplex*)malloc( sizeof(cufftComplex)*len ) ;
cutilSafeCall( cudaMemcpy( h_COMPLEX_DATA2, d_COMPLEX_DATA2, sizeof(cufftComplex)*len, cudaMemcpyDeviceToHost) );

Question 2: how would that be implemented for an input array with size much larger than 512?
Ans: 1D transform sizes up to 8 million elements, see CUFFT_Library_2.3.pdf

Thanks for the reply. I did what you told me to… now its not giving me a copy of the input array. Which is good.

What I get in the output array is positive and negative values. No matter what frequency the input signal is, the result I get is an array that contains values in the first (more or less) 100 elements (0-100) and then the rest of the elements in the array have the same value .

Here is what I get in the console if I printf the first 100 elements of the output array:

[codebox]

-213743.812500

-176505.718750

-115979.562500

345738.625000

439145.875000

-65905.585938

-518402.437500

-287294.156250

345113.562500

544822.625000

56839.781250

-449380.968750

-297208.812500

277916.437500

467816.093750

13720.697266

-441029.562500

-250939.437500

329600.687500

494040.093750

2007.912598

-480179.687500

-292709.468750

308756.875000

503257.937500

32308.912109

-448932.562500

-279286.562500

297578.750000

475534.500000

5035.835938

-460362.468750

-269870.468750

320328.906250

496866.625000

12095.294922

-471240.437500

-291524.156250

301058.437500

490825.750000

21761.238281

-452810.375000

-276111.031250

304008.687500

479906.468750

3855.109863

-466959.687500

-278453.062500

314138.000000

495632.000000

15149.760742

-467023.062500

-289711.218750

298603.000000

485155.125000

15784.248047

-456199.000000

-275788.562500

306821.687500

482510.718750

3822.507324

-470168.625000

-283326.593750

310090.750000

494290.812500

16624.316406

-464499.812500

-288506.468750

297085.718750

481328.656250

11616.010742

-458595.531250

-275391.718750

309216.656250

484694.437500

3746.154297

-473114.718750

-287792.875000

306560.750000

493535.031250

18746.214844

-461231.968750

-286756.156250

295499.687500

476829.093750

6518.214355

-461432.812500

-274290.312500

313239.093750

488824.968750

4850.279297

-476520.843750

-294269.031250

300632.312500

491726.468750

22211.029297

-454718.812500

-281607.625000

295470.718750

470599.750000

[/codebox]

This is what I get when I printf the next 100 elements (100-200) of the array :

[codebox]

-2845.558105

-468292.562500

-274327.750000

320492.156250

499066.375000

11242.167969

-479138.000000

-305618.218750

286485.312500

483671.406250

26349.957031

-439682.843750

-264424.906250

302933.437500

460902.875000

-26848.669922

-493156.218750

-282722.312500

338926.937500

538223.375000

48154.726563

-474002.250000

-350902.093750

200838.984375

402640.468750

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

-0.001327

[/codebox]

After that all elements have the same value (-0.001327)…

First of all I wanted to ask why I m getting these negative number results? Also why are those values so large?

I thought that I should end up with an array that holds frequency values …or something like that. Definitely not values up to 500000 and -500000. The signal I m using for input in this case is at 500Hz and has a size of 1000 elements.

Also, when I try with a larger input array size (e.g 5000 elements) I get this ( these are the first 10 elements printed… but the rest of the array is the same too) :

[codebox]

1.#QNAN0

1.#QNAN0

1.#QNAN0

1.#QNAN0

1.#QNAN0

1.#QNAN0

1.#QNAN0

1.#QNAN0

1.#QNAN0

1.#QNAN0

[/codebox]

I thought the FFT should be returning something like an array that holds the values of the frequencies contained in the input signal… But it doesn’t obviously. Or I m getting it completely wrong.

I could upload the whole files with the code but I don’t really do anything more than what I posted in my previous post. Its very simple code… The only difference is that I m sending the input array from a c++ file… But still I don’t think it matters as I do copy the host array to a device array properly before I execute the FFT transform and vice versa afterwards.

Can anyone help with this?

you can upload your code then we can check it.

However you can check it yourself with harmonic vector.

for example, if you have X(0:N-1) data element, then you can check

X(j) = cos( 2pikj/N) for k = 0, 1, 2, …, N-1 or
X(j) = sin(2
pikj/N) for k = 1,2,…, N-1

frequency component of harmonic vector is very simple, then you may find the error easily

Thanks very much again for the reply LSChien. I cannot upload right for some reason I don’t know…

but the code is not a lot so I ll just post it here.

This is the main.cpp file from which I m sending the audio array:

[codebox]

// includes, system

#include

#include <stdlib.h>

#include

#include

#include <cufft.h>

#include <cuda.h>

// Required to include CUDA vector types

#include <vector_types.h>

#include “cutil_inline.h”

const float PI = 3.14159265;

extern “C” void runTest(unsigned int len, cufftComplex* c1, cufftComplex* c2);

int main(int argc, char** argv)

{

// input data

const int len = 1000;

int memSize = len * sizeof(float);

float* floatArray;

cufftComplex* h_COMPLEX_DATA, *h_COMPLEX_DATA2;

// Memory allocation

floatArray = (float *) malloc(memSize);

h_COMPLEX_DATA = (cufftComplex*)malloc( sizeof(cufftComplex)*len ) ;

h_COMPLEX_DATA2 = (cufftComplex*)malloc( sizeof(cufftComplex)*len ) ;

for(int i=0;i<len;i++)

{

	floatArray[i] = sin(2*PI*500/len*i) + 1;	// create sine wave at 500Hz and convert -1 - 1 range to 0 - 1 range

	floatArray[i] *= 0.5f;						

	h_COMPLEX_DATA[i].x = floatArray[i];

	h_COMPLEX_DATA[i].y = 0;

}

// run the device part of the program

runTest(len, h_COMPLEX_DATA, h_COMPLEX_DATA2);

for(int x=0;x<len;x++)

{

	floatArray[x] = h_COMPLEX_DATA2[x].x;

}

for(int i=0;i<100;i++)

{

	printf("%f\n", floatArray[i]);

}

cutilExit(argc, argv);

free(floatArray);

return 0;

}

[/codebox]

And this is the .cu file thats receives the array, transforms it and sends it back :

[codebox]

#include <stdlib.h>

#include <stdio.h>

#include <string.h>

#include <math.h>

#include <cufft.h>

#include <cutil_inline.h>

const float PI = 3.14159265;

extern “C” void

runTest(unsigned int len, cufftComplex* c1, cufftComplex* c2)

{

cudaSetDevice( cutGetMaxGflopsDeviceId() );

cufftResult result;

cufftHandle plan;

cufftComplex* COMPLEX_DATA, *COMPLEX_DATA2;

cutilSafeCall(cudaMalloc((void**)&COMPLEX_DATA, 

				sizeof(cufftComplex)*len));

cutilSafeCall(cudaMalloc((void**)&COMPLEX_DATA2, 

				sizeof(cufftComplex)*len));

// copy data from host to device

cutilSafeCall(cudaMemcpy(COMPLEX_DATA, c1, len, cudaMemcpyHostToDevice));

result = cufftPlan1d(&plan, len, CUFFT_C2C, 1);

result = cufftExecC2C(plan, COMPLEX_DATA, COMPLEX_DATA2, CUFFT_FORWARD);

// copy results from device to host

cutilSafeCall(cudaMemcpy(c2, COMPLEX_DATA2, len, cudaMemcpyDeviceToHost));

// cleanup memory

cutilSafeCall(cudaFree(COMPLEX_DATA2));

cutilSafeCall(cudaFree(COMPLEX_DATA));

cufftDestroy(plan);

cudaThreadExit();

}

[/codebox]

Well thats about it really. Thanks for all the help again.

Your code for cudaMemcpy is incorrect, the function is expecting the number of bytes, not the number of elements.
It should be:
cudaMemcpy(COMPLEX_DATA, c1, lensizeof(cufftComplex), cudaMemcpyHostToDevice);
cudaMemcpy(c2, COMPLEX_DATA2, len
sizeof(cufftComplex), cudaMemcpyDeviceToHost);

OK. I changed the size parameter in cudaMemcpy and I m getting some results that look more correct.

These are the values I get for an input signal that holds the following frequencies:500,510,520,530 Hz. These values are in the X axis of the transformed cufftComplex variable. Here I printf the elements from 490 to 550 of the array. The total size of the array is 2856 elements:

[codebox]

0.000092

-0.000064

-0.000046

0.000048

-0.000065

-0.000087

0.000075

0.000056

-0.000145

0.000044

0.003866

0.000050

-0.000002

0.000117

-0.000051

0.000072

0.000054

-0.000038

0.000094

0.000104

0.007853

0.000213

0.000079

0.000087

0.000079

-0.000274

-0.000020

-0.000203

0.000289

0.000190

0.016487

0.000383

-0.000011

0.000258

-0.000081

-0.000254

-0.000189

0.000267

0.000127

0.000261

0.033529

0.000275

0.000483

-0.000092

0.000189

0.000648

-0.000386

0.000277

0.000726

-0.000005

0.001615

0.001314

-0.001143

-0.000569

-0.000141

0.000359

-0.000197

-0.001105

-0.002979

0.000835

[/codebox]

And these are the results I get in the Y axis in the same elements:

[codebox]

-0.000803

-0.001028

-0.000822

-0.000748

-0.001049

-0.000942

-0.001055

-0.001198

-0.001385

-0.002090

-89.250854

0.000447

-0.000306

-0.000743

-0.000852

-0.000956

-0.001202

-0.001505

-0.002023

-0.003520

-178.500946

0.001507

0.000316

-0.000436

-0.000977

-0.001039

-0.001864

-0.002051

-0.003430

-0.005864

-357.000427

0.003887

0.001179

0.000646

0.000599

-0.000579

-0.002479

-0.002260

-0.004177

-0.010168

-713.998657

0.011294

0.005692

0.004234

0.003320

0.002610

0.002332

0.001629

0.002315

0.001429

0.001440

0.003169

0.002326

0.001524

0.001356

0.000504

0.004754

0.000965

0.001553

-0.000439

[/codebox]

Basically it looks right because the higher values appear on the elements with an

index equivalent to the frequencies contained in the input signal. Why are they negative tho?

I m doing a CUFFT_FORWARD type transform. If I do an inverse fft then negative

becomes positive and vice versa.

So with all that I think I m on the right path…

The problem is that I still don’t understand what these numbers represent… How am I going

to convert from Hertz to angular frequency? Because this is what I want to do at the end :

[codebox]

for(int i =0;i<len;i++)

{

omega[i] = 2 * PI * Frequency[i];  // this is the array containing results from the fft converted to angular freq

GAIN[i] = 1 / sqrt(1 + (pow(omega[i] * RC, 2)));

}

[/codebox]

The formula above is a simple LP filter. Poles depend on the power at the end.

The variable RC is actually “Resistance * Capacitance” which I’ve calculated using

this formula : “Fc = 1 / 2 * PI * R *C”. The cut-off frequency is a known parameter.

It will be entered by the user. So I just took R*C to the left side of the equation and Fc to the right.

So how am I going to get that angular frequency there. The results I have are obviously not in Hz. And also

how am I going to find frequencies in the input signal that are higher than the size of the array? For example,

in this case I ve got an array of 2856 elements ( which I think is the number of sampleFrames in VST plugins).

If an element number is equivalent to a frequency then how am i going to get frequencies up to 20kHz?

Thanks very much

Filippos