Array accesses

Hi All,

Suppose you have the following situation:

Two arrays reside in shared memory: one of 4 KB (array_4K) and one of 2 KB (array_2K). Assume, for the sake of example, that the data in both is of type char. Threads are launched to initialize the 4 KB array, with each thread accessing independent locations of that array (i.e. no bank conflicts). Now I want to take two elements from the 4K array, add them, and place the result in the 2K array. For example, array_2K[0] = array_4K[0] + array_4K[1]; array_2K[1] = array_4K[2] + array_4K[3]; etc.

I was thinking I could launch threads that each read independent locations from the 4K array and write to independent locations in the 2K array, thus avoiding bank conflicts. Is this okay to do? Is there a better way to do it in terms of performance, or will I already be getting the maximum possible bandwidth? I am under the impression that, for best performance, a thread should access only one memory location (as opposed to two), but since there are no bank conflicts, I think this solution should be okay?
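
To make the question concrete, here is a minimal sketch of the kernel I have in mind. It assumes a single thread block, unsigned char data, and an arbitrary fill pattern for the 4K array; the names pairwise_sum, out and count_mismatches are made up for illustration, and the copy to global memory is only there so the result can be checked on the host.

```cuda
#include <cuda_runtime.h>

#define N_IN  4096   // bytes in array_4K
#define N_OUT 2048   // bytes in array_2K

// Assumed to be launched with one block; the strided loops let any
// block size cover both arrays.
__global__ void pairwise_sum(unsigned char *out)
{
    __shared__ unsigned char array_4K[N_IN];
    __shared__ unsigned char array_2K[N_OUT];

    // Initialize: consecutive threads touch consecutive bytes.
    // The fill value (low 8 bits of the index) is an arbitrary choice.
    for (int i = threadIdx.x; i < N_IN; i += blockDim.x)
        array_4K[i] = (unsigned char)(i & 0xFF);

    // Threads read elements that other threads wrote, so every write to
    // array_4K must be complete before the sums start.
    __syncthreads();

    // Each thread reads two adjacent inputs and writes one output.
    for (int i = threadIdx.x; i < N_OUT; i += blockDim.x)
        array_2K[i] = array_4K[2 * i] + array_4K[2 * i + 1];

    // Each thread copies out only the elements it wrote itself,
    // so no second barrier is needed here.
    for (int i = threadIdx.x; i < N_OUT; i += blockDim.x)
        out[i] = array_2K[i];
}

// Hypothetical host-side check: runs the kernel with 256 threads and
// returns the number of outputs that differ from the expected sums.
// CUDA error checking is omitted to keep the sketch short.
int count_mismatches()
{
    static unsigned char h_out[N_OUT];
    unsigned char *d_out = nullptr;

    cudaMalloc(&d_out, N_OUT);
    pairwise_sum<<<1, 256>>>(d_out);
    cudaMemcpy(h_out, d_out, N_OUT, cudaMemcpyDeviceToHost);
    cudaFree(d_out);

    int bad = 0;
    for (int i = 0; i < N_OUT; ++i) {
        // Same arithmetic as the kernel, truncated to 8 bits on store.
        unsigned char expect =
            (unsigned char)(((2 * i) & 0xFF) + ((2 * i + 1) & 0xFF));
        if (h_out[i] != expect)
            ++bad;
    }
    return bad;
}
```

Calling count_mismatches() from main and compiling with nvcc should give 0 if the kernel behaves as described. The sketch only shows the access pattern (thread i reads bytes 2i and 2i+1 and writes byte i); it does not by itself show whether that pattern is free of bank conflicts on a given GPU, which is the part I am asking about.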

Thanks,
Ashu