Parallel array processing

Hello everyone,

I have 1000 arrays (let's call them G1, G2, G3, …) of length 256, stored back to back in memory as a single 1D array, and I need to multiply each of the "G" arrays element by element with array F.

I currently have this kernel:

[codebox]

__global__ void MulArray(cufftComplex *A_d, cufftComplex *B_d, cufftComplex *C_d)
{
    // index of element along array B_d (G1, G2, G3... are all in memory back to back)
    int idx = blockIdx.x + threadIdx.x;

    // index of element in array A_d (array F)
    int idx2 = threadIdx.x;

    C_d[idx].x = (A_d[idx2].y * B_d[idx].y) + (A_d[idx2].x * B_d[idx].x);
    C_d[idx].y = (A_d[idx2].y * B_d[idx].x) + (A_d[idx2].x * (-1 * B_d[idx].y));
}

[/codebox]

Now the problem is that I only get results for the first 256 elements…

I think this might be because multiple threads try to read the same element at the same time…

What could I do to fix this? Would I have to make enough copies of F to match up with every G?

Any pointers appreciated.

Thanks

Actually, ignore this… I screwed up elsewhere…
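
For anyone who lands here with the same symptom, here is a rough sketch of how the indexing could look. It is not necessarily what went wrong in my case. It assumes the kernel is launched with one block per G array and one thread per element (1000 blocks of 256 threads). Under that assumption `blockIdx.x + threadIdx.x` makes neighbouring blocks overlap, so the global index needs `blockIdx.x * blockDim.x + threadIdx.x`. Many threads reading the same element of F is fine, so no extra copies of F are needed. The `total` parameter, the bounds check and the host helper `mulAllByF` are my own additions for illustration, and error checking on the CUDA calls is left out to keep it short.

```cuda
#include <cuda_runtime.h>
#include <cufft.h>   // only for the cufftComplex type (.x = real, .y = imaginary)
#include <vector>

// One block per G array, one thread per element of that array.
__global__ void MulArray(const cufftComplex *A_d,  // F, blockDim.x elements
                         const cufftComplex *B_d,  // G1, G2, ... back to back
                         cufftComplex *C_d,        // output, same layout as B_d
                         int total)                // total number of G elements
{
    int idx2 = threadIdx.x;                            // index into F
    int idx  = blockIdx.x * blockDim.x + threadIdx.x;  // index into the packed G arrays
    if (idx >= total) return;                          // guard against a partial last block

    cufftComplex a = A_d[idx2];
    cufftComplex b = B_d[idx];

    // Same arithmetic as the kernel above: C = A * conj(B)
    cufftComplex c;
    c.x = a.y * b.y + a.x * b.x;
    c.y = a.y * b.x - a.x * b.y;
    C_d[idx] = c;
}

// Hypothetical host helper: copies F and the packed G arrays to the device,
// runs the kernel and returns the result.
// Assumes G.size() is a multiple of F.size() and F.size() <= 1024.
std::vector<cufftComplex> mulAllByF(const std::vector<cufftComplex> &F,
                                    const std::vector<cufftComplex> &G)
{
    const int len   = static_cast<int>(F.size());
    const int total = static_cast<int>(G.size());
    std::vector<cufftComplex> C(G.size());

    cufftComplex *A_d = nullptr, *B_d = nullptr, *C_d = nullptr;
    cudaMalloc(&A_d, len   * sizeof(cufftComplex));
    cudaMalloc(&B_d, total * sizeof(cufftComplex));
    cudaMalloc(&C_d, total * sizeof(cufftComplex));

    cudaMemcpy(A_d, F.data(), len   * sizeof(cufftComplex), cudaMemcpyHostToDevice);
    cudaMemcpy(B_d, G.data(), total * sizeof(cufftComplex), cudaMemcpyHostToDevice);

    // total / len blocks (one per G array), len threads per block
    MulArray<<<total / len, len>>>(A_d, B_d, C_d, total);

    // This blocking copy also waits for the kernel to finish
    cudaMemcpy(C.data(), C_d, total * sizeof(cufftComplex), cudaMemcpyDeviceToHost);

    cudaFree(A_d);
    cudaFree(B_d);
    cudaFree(C_d);
    return C;
}
```

With F of length 256 and G holding 1000 * 256 elements, `mulAllByF(F, G)` launches 1000 blocks of 256 threads, so every element of every G array gets its own thread and F is simply re-read by each block.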