simple CUFFT problem / time to frequency domain

FLag · September 20, 2009, 11:34am

Hi,

I 've been trying to transform an audio input array that holds the amplitude values to an array that holds the corresponding frequency values. But what I’m getting is just a copy of the input array at the end. Here is some simple code Im using:

[codebox]

cufftResult result;

cufftHandle plan;

cufftComplex* COMPLEX_DATA, *COMPLEX_DATA2;

cutilSafeCall(cudaMalloc((void**)&COMPLEX_DATA,

				sizeof(cufftComplex)*len));

cutilSafeCall(cudaMalloc((void**)&COMPLEX_DATA2,

				sizeof(cufftComplex)*len));

for(int i = 0;i<len;i++){

COMPLEX_DATA[i].x = AUDIO_INPUT_ARRAY[i];

COMPLEX_DATA[i].y = 0;}

result = cufftPlan1d(&plan, len, CUFFT_C2C, 1);

result = cufftExecC2C(plan, COMPLEX_DATA, COMPLEX_DATA2, CUFFT_FORWARD);

for(int i = 0;i<len;i++){

AUDIO_INPUT_ARRAY[i] = COMPLEX_DATA2[i].x;}

[/codebox]

So the AUDIO_INPUT_ARRAY now contains the same values it did before the transform. What am I getting wrong here?

I thought that it would be fairly simple to do this… Just want a transformation from time to frequency domain. Also, how would that be implemented for an input array with size much larger than 512? Would I have to use a 2d plan?

Thanks

LSChien · September 20, 2009, 2:18pm

Question 1: AUDIO_INPUT_ARRAY now contains the same values it did before the transform
Ans: you must copy data to device memory and copy data back. The general procedures are

// step 1: allocate host memory and set up data

cufftComplex* h_COMPLEX_DATA ;

h_COMPLEX_DATA = (cufftComplex*)malloc( sizeof(cufftComplex)*len ) ;
for(int i = 0;i<len;i++){
h_COMPLEX_DATA[i].x = AUDIO_INPUT_ARRAY[i];
h_COMPLEX_DATA[i].y = 0;
}

// step 2: allocate device memory and create plane
cufftComplex* d_COMPLEX_DATA, *d_COMPLEX_DATA2;
cufftHandle plan;

cutilSafeCall(cudaMalloc((void**)&d_COMPLEX_DATA , sizeof(cufftComplex)len));
cutilSafeCall(cudaMalloc((void*)&d_COMPLEX_DATA2,sizeof(cufftComplex)*len));
cufftPlan1d(&plan, len, CUFFT_C2C, 1);

// step 3: copy data from host to device (you miss this part)

cutilSafeCall( cudaMemcpy( d_COMPLEX_DATA, h_COMPLEX_DATA, sizeof(cufftComplex)*len, cudaMemcpyHostToDevice) );

// step 4: do FFT
cufftExecC2C(plan, d_COMPLEX_DATA, d_COMPLEX_DATA2, CUFFT_FORWARD);

// step 5: copy data back from device to host
cufftComplex* h_COMPLEX_DATA2 ;
h_COMPLEX_DATA2 = (cufftComplex*)malloc( sizeof(cufftComplex)*len ) ;
cutilSafeCall( cudaMemcpy( h_COMPLEX_DATA2, d_COMPLEX_DATA2, sizeof(cufftComplex)*len, cudaMemcpyDeviceToHost) );

Question 2: how would that be implemented for an input array with size much larger than 512?
Ans: 1D transform sizes up to 8 million elements, see CUFFT_Library_2.3.pdf

FLag · September 21, 2009, 1:49pm

Thanks for the reply. I did what you told me to… now its not giving me a copy of the input array. Which is good.

What I get in the output array is positive and negative values. No matter what frequency the input signal is, the result I get is an array that contains values in the first (more or less) 100 elements (0-100) and then the rest of the elements in the array have the same value .

Here is what I get in the console if I printf the first 100 elements of the output array:

[codebox]

-213743.812500

-176505.718750

-115979.562500

345738.625000

439145.875000

-65905.585938

-518402.437500

-287294.156250

345113.562500

544822.625000

56839.781250

-449380.968750

-297208.812500

277916.437500

467816.093750

13720.697266

-441029.562500

-250939.437500

329600.687500

494040.093750

2007.912598

-480179.687500

-292709.468750

308756.875000

503257.937500

32308.912109

-448932.562500

-279286.562500

297578.750000

475534.500000

5035.835938

-460362.468750

-269870.468750

320328.906250

496866.625000

12095.294922

-471240.437500

-291524.156250

301058.437500

490825.750000

21761.238281

-452810.375000

-276111.031250

304008.687500

479906.468750

3855.109863

-466959.687500

-278453.062500

314138.000000

495632.000000

15149.760742

-467023.062500

-289711.218750

298603.000000

485155.125000

15784.248047

-456199.000000

-275788.562500

306821.687500

482510.718750

3822.507324

-470168.625000

-283326.593750

310090.750000

494290.812500

16624.316406

-464499.812500

-288506.468750

297085.718750

481328.656250

11616.010742

-458595.531250

-275391.718750

309216.656250

484694.437500

3746.154297

-473114.718750

-287792.875000

306560.750000

493535.031250

18746.214844

-461231.968750

-286756.156250

295499.687500

476829.093750

6518.214355

-461432.812500

-274290.312500

313239.093750

488824.968750

4850.279297

-476520.843750

-294269.031250

300632.312500

491726.468750

22211.029297

-454718.812500

-281607.625000

295470.718750

470599.750000

[/codebox]

This is what I get when I printf the next 100 elements (100-200) of the array :

[codebox]

-2845.558105

-468292.562500

-274327.750000

320492.156250

499066.375000

11242.167969

-479138.000000

-305618.218750

286485.312500

483671.406250

26349.957031

-439682.843750

-264424.906250

302933.437500

460902.875000

-26848.669922

-493156.218750

-282722.312500

338926.937500

538223.375000

48154.726563

-474002.250000

-350902.093750

200838.984375

402640.468750

-0.001327

[/codebox]

After that all elements have the same value (-0.001327)…

First of all I wanted to ask why I m getting these negative number results? Also why are those values so large?

I thought that I should end up with an array that holds frequency values …or something like that. Definitely not values up to 500000 and -500000. The signal I m using for input in this case is at 500Hz and has a size of 1000 elements.

Also, when I try with a larger input array size (e.g 5000 elements) I get this ( these are the first 10 elements printed… but the rest of the array is the same too) :

[codebox]

1.#QNAN0

[/codebox]

I thought the FFT should be returning something like an array that holds the values of the frequencies contained in the input signal… But it doesn’t obviously. Or I m getting it completely wrong.

I could upload the whole files with the code but I don’t really do anything more than what I posted in my previous post. Its very simple code… The only difference is that I m sending the input array from a c++ file… But still I don’t think it matters as I do copy the host array to a device array properly before I execute the FFT transform and vice versa afterwards.

Can anyone help with this?

LSChien · September 21, 2009, 3:06pm

you can upload your code then we can check it.

However you can check it yourself with harmonic vector.

for example, if you have X(0:N-1) data element, then you can check

X(j) = cos( 2pikj/N) for k = 0, 1, 2, …, N-1 or
X(j) = sin(2pikj/N) for k = 1,2,…, N-1

frequency component of harmonic vector is very simple, then you may find the error easily

FLag · September 21, 2009, 7:34pm

Thanks very much again for the reply LSChien. I cannot upload right for some reason I don’t know…

but the code is not a lot so I ll just post it here.

This is the main.cpp file from which I m sending the audio array:

[codebox]

// includes, system

#include

#include <stdlib.h>

#include

#include <cufft.h>

#include <cuda.h>

// Required to include CUDA vector types

#include <vector_types.h>

#include “cutil_inline.h”

const float PI = 3.14159265;

extern “C” void runTest(unsigned int len, cufftComplex* c1, cufftComplex* c2);

int main(int argc, char** argv)

{

// input data

const int len = 1000;

int memSize = len * sizeof(float);

float* floatArray;

cufftComplex* h_COMPLEX_DATA, *h_COMPLEX_DATA2;

// Memory allocation

floatArray = (float *) malloc(memSize);

h_COMPLEX_DATA = (cufftComplex*)malloc( sizeof(cufftComplex)*len ) ;

h_COMPLEX_DATA2 = (cufftComplex*)malloc( sizeof(cufftComplex)*len ) ;

for(int i=0;i<len;i++)

{

	floatArray[i] = sin(2*PI*500/len*i) + 1;	// create sine wave at 500Hz and convert -1 - 1 range to 0 - 1 range

	floatArray[i] *= 0.5f;						

	h_COMPLEX_DATA[i].x = floatArray[i];

	h_COMPLEX_DATA[i].y = 0;

}

// run the device part of the program

runTest(len, h_COMPLEX_DATA, h_COMPLEX_DATA2);

for(int x=0;x<len;x++)

{

	floatArray[x] = h_COMPLEX_DATA2[x].x;

}

for(int i=0;i<100;i++)

{

	printf("%f\n", floatArray[i]);

}

cutilExit(argc, argv);

free(floatArray);

return 0;

}

[/codebox]

And this is the .cu file thats receives the array, transforms it and sends it back :

[codebox]

#include <stdlib.h>

#include <stdio.h>

#include <string.h>

#include <math.h>

#include <cufft.h>

#include <cutil_inline.h>

const float PI = 3.14159265;

extern “C” void

runTest(unsigned int len, cufftComplex* c1, cufftComplex* c2)

{

cudaSetDevice( cutGetMaxGflopsDeviceId() );

cufftResult result;

cufftHandle plan;

cufftComplex* COMPLEX_DATA, *COMPLEX_DATA2;

cutilSafeCall(cudaMalloc((void**)&COMPLEX_DATA, 

				sizeof(cufftComplex)*len));

cutilSafeCall(cudaMalloc((void**)&COMPLEX_DATA2, 

				sizeof(cufftComplex)*len));

// copy data from host to device

cutilSafeCall(cudaMemcpy(COMPLEX_DATA, c1, len, cudaMemcpyHostToDevice));

result = cufftPlan1d(&plan, len, CUFFT_C2C, 1);

result = cufftExecC2C(plan, COMPLEX_DATA, COMPLEX_DATA2, CUFFT_FORWARD);

// copy results from device to host

cutilSafeCall(cudaMemcpy(c2, COMPLEX_DATA2, len, cudaMemcpyDeviceToHost));

// cleanup memory

cutilSafeCall(cudaFree(COMPLEX_DATA2));

cutilSafeCall(cudaFree(COMPLEX_DATA));

cufftDestroy(plan);

cudaThreadExit();

}

[/codebox]

Well thats about it really. Thanks for all the help again.

mfatica · September 21, 2009, 8:28pm

Your code for cudaMemcpy is incorrect, the function is expecting the number of bytes, not the number of elements.
It should be:
cudaMemcpy(COMPLEX_DATA, c1, lensizeof(cufftComplex), cudaMemcpyHostToDevice);
cudaMemcpy(c2, COMPLEX_DATA2, lensizeof(cufftComplex), cudaMemcpyDeviceToHost);

FLag · September 22, 2009, 11:08am

OK. I changed the size parameter in cudaMemcpy and I m getting some results that look more correct.

These are the values I get for an input signal that holds the following frequencies:500,510,520,530 Hz. These values are in the X axis of the transformed cufftComplex variable. Here I printf the elements from 490 to 550 of the array. The total size of the array is 2856 elements:

[codebox]

0.000092

-0.000064

-0.000046

0.000048

-0.000065

-0.000087

0.000075

0.000056

-0.000145

0.000044

0.003866

0.000050

-0.000002

0.000117

-0.000051

0.000072

0.000054

-0.000038

0.000094

0.000104

0.007853

0.000213

0.000079

0.000087

0.000079

-0.000274

-0.000020

-0.000203

0.000289

0.000190

0.016487

0.000383

-0.000011

0.000258

-0.000081

-0.000254

-0.000189

0.000267

0.000127

0.000261

0.033529

0.000275

0.000483

-0.000092

0.000189

0.000648

-0.000386

0.000277

0.000726

-0.000005

0.001615

0.001314

-0.001143

-0.000569

-0.000141

0.000359

-0.000197

-0.001105

-0.002979

0.000835

[/codebox]

And these are the results I get in the Y axis in the same elements:

[codebox]

-0.000803

-0.001028

-0.000822

-0.000748

-0.001049

-0.000942

-0.001055

-0.001198

-0.001385

-0.002090

-89.250854

0.000447

-0.000306

-0.000743

-0.000852

-0.000956

-0.001202

-0.001505

-0.002023

-0.003520

-178.500946

0.001507

0.000316

-0.000436

-0.000977

-0.001039

-0.001864

-0.002051

-0.003430

-0.005864

-357.000427

0.003887

0.001179

0.000646

0.000599

-0.000579

-0.002479

-0.002260

-0.004177

-0.010168

-713.998657

0.011294

0.005692

0.004234

0.003320

0.002610

0.002332

0.001629

0.002315

0.001429

0.001440

0.003169

0.002326

0.001524

0.001356

0.000504

0.004754

0.000965

0.001553

-0.000439

[/codebox]

Basically it looks right because the higher values appear on the elements with an

index equivalent to the frequencies contained in the input signal. Why are they negative tho?

I m doing a CUFFT_FORWARD type transform. If I do an inverse fft then negative

becomes positive and vice versa.

So with all that I think I m on the right path…

The problem is that I still don’t understand what these numbers represent… How am I going

to convert from Hertz to angular frequency? Because this is what I want to do at the end :

[codebox]

for(int i =0;i<len;i++)

{

omega[i] = 2 * PI * Frequency[i];  // this is the array containing results from the fft converted to angular freq

GAIN[i] = 1 / sqrt(1 + (pow(omega[i] * RC, 2)));

}

[/codebox]

The formula above is a simple LP filter. Poles depend on the power at the end.

The variable RC is actually “Resistance * Capacitance” which I’ve calculated using

this formula : “Fc = 1 / 2 * PI * R *C”. The cut-off frequency is a known parameter.

It will be entered by the user. So I just took R*C to the left side of the equation and Fc to the right.

So how am I going to get that angular frequency there. The results I have are obviously not in Hz. And also

how am I going to find frequencies in the input signal that are higher than the size of the array? For example,

in this case I ve got an array of 2856 elements ( which I think is the number of sampleFrames in VST plugins).

If an element number is equivalent to a frequency then how am i going to get frequencies up to 20kHz?

Thanks very much

Filippos

Topic		Replies	Views
[SOLVED] cuFFT data storage, maybe I'm operating on the wrong elements CUDA Programming and Performance	16	2399	June 11, 2022
Cufft 1D transform CUDA Programming and Performance	6	3135	January 8, 2015
cufft problem with real-to-complex 1d transform CUDA Programming and Performance	3	1371	July 7, 2010
Wrong results in CUFFT! CUDA Programming and Performance	4	5553	March 22, 2011
Printing out values after cufft GPU-Accelerated Libraries cufft	5	1112	June 28, 2021
cufftExecR2C only gives half the answer..?! CUDA Programming and Performance	2	4308	July 24, 2009
Some part out CUFFT output is missing.(or inaccurate.) CUDA Programming and Performance	2	780	March 2, 2012
Problem with cufft test (computing forward and inverse fft) CUDA Programming and Performance	1	474	July 6, 2020
about frequency spectrum computed from cuFFT GPU-Accelerated Libraries	9	1646	January 5, 2018
about cufft for 2D array CUDA Programming and Performance	0	1192	January 9, 2018

simple CUFFT problem / time to frequency domain

Related topics