Parallel array processing

Hello everyone,

I have 1000 arrays (let's call them G1, G2, G3…), each of length 256, stored back to back in memory as a single 1D array, and I need to multiply each of the “G” arrays by array F.

I currently have this kernel:


__global__ void MulArray(cufftComplex *A_d, cufftComplex *B_d, cufftComplex *C_d)
{
    int idx = blockIdx.x + threadIdx.x; // index of element along array B_d (G1, G2, G3... are all in memory back to back)
    int idx2 = threadIdx.x;             // index of element in array A_d (array F)
}

Now the problem is that I only get results for the first 256 elements…

I think this might be because multiple threads try to read the same element of F at the same time…
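For what it's worth, many threads reading the same global-memory element at the same time is allowed in CUDA; only unsynchronized writes to the same location cause trouble. Below is a minimal sketch of that, not the original kernel. It assumes one block of 256 threads per G array, that "multiply" means element-wise complex multiplication (cuCmulf from cuComplex.h), and that the G arrays are packed back to back. The name MulArraySketch, the constants and the test data are hypothetical. Note that with 256 threads per block the offset into the packed array is blockIdx.x * blockDim.x + threadIdx.x; with blockIdx.x + threadIdx.x neighbouring blocks would overlap.

```cuda
#include <cmath>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>
#include <cufft.h>      // cufftComplex (typedef of cuComplex)
#include <cuComplex.h>  // cuCmulf, make_cuFloatComplex

// Hypothetical sizes taken from the post: 1000 arrays of 256 elements.
constexpr int kLen = 256;
constexpr int kNumArrays = 1000;

// One block per G array, one thread per element.
// Every block reads the same 256 elements of F; concurrent reads are fine.
__global__ void MulArraySketch(const cufftComplex *F, const cufftComplex *G,
                               cufftComplex *out)
{
    int idx2 = threadIdx.x;                            // element of F
    int idx  = blockIdx.x * blockDim.x + threadIdx.x;  // element of packed G
    out[idx] = cuCmulf(G[idx], F[idx2]);
}

int main()
{
    const int total = kLen * kNumArrays;
    std::vector<cufftComplex> hF(kLen), hG(total), hOut(total);
    for (int i = 0; i < kLen; ++i)  hF[i] = make_cuFloatComplex(1.0f + i, 0.5f);
    for (int i = 0; i < total; ++i) hG[i] = make_cuFloatComplex(float(i % 7), 1.0f);

    cufftComplex *dF, *dG, *dOut;
    cudaMalloc(&dF, kLen * sizeof(cufftComplex));
    cudaMalloc(&dG, total * sizeof(cufftComplex));
    cudaMalloc(&dOut, total * sizeof(cufftComplex));
    cudaMemcpy(dF, hF.data(), kLen * sizeof(cufftComplex), cudaMemcpyHostToDevice);
    cudaMemcpy(dG, hG.data(), total * sizeof(cufftComplex), cudaMemcpyHostToDevice);

    MulArraySketch<<<kNumArrays, kLen>>>(dF, dG, dOut);

    // This blocking copy also waits for the kernel to finish.
    cudaError_t err = cudaMemcpy(hOut.data(), dOut, total * sizeof(cufftComplex),
                                 cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        std::printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Compare every element against a host-side reference.
    int mismatches = 0;
    for (int i = 0; i < total; ++i) {
        cufftComplex ref = cuCmulf(hG[i], hF[i % kLen]);
        if (std::fabs(ref.x - hOut[i].x) > 1e-3f || std::fabs(ref.y - hOut[i].y) > 1e-3f)
            ++mismatches;
    }
    std::printf("mismatches: %d of %d\n", mismatches, total);

    cudaFree(dF); cudaFree(dG); cudaFree(dOut);
    return mismatches != 0;
}
```

The host loop checks all 256,000 outputs, not just the first 256, so a wrong offset calculation would show up as a nonzero mismatch count.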

What could I do to fix this? Would I have to make enough copies of F to match up with every G?
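Copies of F should not be needed: a single F can be indexed with the position modulo 256. Here is a hedged sketch of that idea using a grid-stride loop, so the launch configuration does not have to match the array length. It assumes element-wise complex multiplication via cuCmulf. The kernel name MulArrayStrided, the parameters len and total, the launch size and the test data are all hypothetical.

```cuda
#include <cmath>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>
#include <cufft.h>      // cufftComplex
#include <cuComplex.h>  // cuCmulf, make_cuFloatComplex

// Grid-stride loop: each thread handles elements i, i + stride, i + 2*stride, ...
// The single array F is reused for every G through i % len.
__global__ void MulArrayStrided(const cufftComplex *F, const cufftComplex *G,
                                cufftComplex *out, int len, int total)
{
    int stride = gridDim.x * blockDim.x;
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < total; i += stride)
        out[i] = cuCmulf(G[i], F[i % len]);
}

int main()
{
    const int len = 256, count = 1000, total = len * count;
    std::vector<cufftComplex> hF(len), hG(total), hOut(total);
    for (int i = 0; i < len; ++i)   hF[i] = make_cuFloatComplex(float(i), 1.0f);
    for (int i = 0; i < total; ++i) hG[i] = make_cuFloatComplex(2.0f, float(i % 5));

    cufftComplex *dF, *dG, *dOut;
    cudaMalloc(&dF, len * sizeof(cufftComplex));
    cudaMalloc(&dG, total * sizeof(cufftComplex));
    cudaMalloc(&dOut, total * sizeof(cufftComplex));
    cudaMemcpy(dF, hF.data(), len * sizeof(cufftComplex), cudaMemcpyHostToDevice);
    cudaMemcpy(dG, hG.data(), total * sizeof(cufftComplex), cudaMemcpyHostToDevice);

    // Arbitrary launch size: far fewer threads than elements on purpose.
    MulArrayStrided<<<64, 128>>>(dF, dG, dOut, len, total);

    cudaError_t err = cudaMemcpy(hOut.data(), dOut, total * sizeof(cufftComplex),
                                 cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        std::printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Host-side reference check over all elements.
    int mismatches = 0;
    for (int i = 0; i < total; ++i) {
        cufftComplex ref = cuCmulf(hG[i], hF[i % len]);
        if (std::fabs(ref.x - hOut[i].x) > 1e-3f || std::fabs(ref.y - hOut[i].y) > 1e-3f)
            ++mismatches;
    }
    std::printf("mismatches: %d of %d\n", mismatches, total);

    cudaFree(dF); cudaFree(dG); cudaFree(dOut);
    return mismatches != 0;
}
```

The modulo costs one extra integer operation per element, which is a small price compared with storing and transferring 1000 copies of F.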

Any pointers appreciated.


Actually, ignore this… I screwed up elsewhere…