Shared memory access question

Hi!
I’ve got a question regarding shared memory access. Let’s say that I have an instruction that reads two input operands from two different locations in shared memory and stores the result in another location, also in shared memory.
I know that a shared memory request is split into two memory requests, one for each half-warp (for CC 1.x). What happens, though, when one instruction needs multiple accesses to shared memory, as described above (such a situation appears, for example, when implementing reduction algorithms)? Will there be a separate transaction for each input and output operand? My main concern is how to reason about possible bank conflicts.

Thanks a lot.

Can you give an example? Are you thinking of something like smem[i] += smem[y]? In this case there will be two reads (smem[i] and smem[y]) and one write (smem[i]) for each half-warp, and you have to worry about bank conflicts for each of those three accesses separately.
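To make the "check each access separately" advice concrete, here is a rough host-side sketch in plain C++ (not device code) that counts the conflict degree of each access in a statement of the form smem[i] += smem[y]. It assumes the CC 1.x layout of 16 banks, each one 32-bit word wide, with requests issued per half-warp of 16 threads. It treats threads that touch the same word as one broadcast access, which is a simplification of the real hardware. All names (conflict_degree, StatementCost, interleaved_cost, sequential_cost) and the two indexing schemes are made up for illustration.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Assumed CC 1.x parameters: 16 banks of 32-bit words, 16 threads per half-warp.
constexpr int kBanks = 16;
constexpr int kHalfWarp = 16;

// Serialization degree of ONE access by one half-warp.
// word_index[t] is the 32-bit word index touched by thread t.
// Returns 1 if conflict-free, n for an n-way bank conflict.
int conflict_degree(const std::vector<int>& word_index) {
    assert(word_index.size() <= static_cast<std::size_t>(kHalfWarp));
    std::array<std::set<int>, kBanks> words_per_bank;
    for (int w : word_index) {
        assert(w >= 0);
        // Distinct words in the same bank must be served one after another;
        // identical words are counted once (simplified broadcast model).
        words_per_bank[w % kBanks].insert(w);
    }
    std::size_t worst = 1;
    for (const auto& words : words_per_bank) {
        worst = std::max(worst, words.size());
    }
    return static_cast<int>(worst);
}

// Cost of the three accesses in smem[i] += smem[y], each judged on its own.
struct StatementCost {
    int read_dst;   // read of smem[i]
    int read_src;   // read of smem[y]
    int write_dst;  // write of smem[i]
};

// Hypothetical interleaved scheme: thread t runs smem[2*s*t] += smem[2*s*t + s].
StatementCost interleaved_cost(int s) {
    std::vector<int> dst, src;
    for (int t = 0; t < kHalfWarp; ++t) {
        dst.push_back(2 * s * t);
        src.push_back(2 * s * t + s);
    }
    int d = conflict_degree(dst);
    return {d, conflict_degree(src), d};
}

// Hypothetical sequential scheme: thread t runs smem[t] += smem[t + s].
StatementCost sequential_cost(int s) {
    std::vector<int> dst, src;
    for (int t = 0; t < kHalfWarp; ++t) {
        dst.push_back(t);
        src.push_back(t + s);
    }
    int d = conflict_degree(dst);
    return {d, conflict_degree(src), d};
}
```

Under this model, interleaved_cost(1) gives a 2-way conflict on all three accesses, because threads t and t + 8 land in the same bank, while sequential_cost(s) stays conflict-free for any s, because consecutive threads always touch consecutive banks.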
