Shared memory access question

Ionica · October 13, 2010, 10:08am

Hi!
I have a question about shared memory access. Suppose a single instruction reads two input operands from two different locations in shared memory and stores the result in a third location, also in shared memory.
I know that, for CC 1.x, a shared memory request for a warp is split into two memory requests, one for each half-warp. But what happens when one instruction needs multiple accesses to shared memory, as described above (this situation comes up when implementing reduction algorithms, for example)? Will there be a separate transaction for each input and output operand? My main concern is how to reason about possible bank conflicts.

Thanks a lot.

mkaushik · October 20, 2010, 11:05pm

Can you give an example? Are you thinking of something like smem[i] += smem[y]? In that case there will be two reads and one write for each half-warp, and you have to check each of the three accesses for bank conflicts.
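
To make the "check each access separately" advice concrete, here is a minimal host-side sketch in Python (not GPU code) that models one half-warp executing a statement of that form. It assumes the CC 1.x layout of 16 banks of 32-bit words, with successive words in successive banks. The helper names (`conflict_degree`, `analyze`) and the two indexing patterns are made up for illustration. The model counts distinct words per bank, so it treats threads that touch the same word as a single access; real CC 1.x hardware broadcasts only one word per request, so take the result as a rough guide for patterns where every thread uses a different word.

```python
# Simplified model of shared-memory bank conflicts for one half-warp.
# Assumptions (CC 1.x): 16 banks, 32-bit words, word w lives in bank w % 16.
NUM_BANKS = 16
HALF_WARP = 16


def conflict_degree(word_indices):
    """Return n for an 'n-way bank conflict' (1 means conflict-free).

    word_indices holds the 32-bit word index each thread accesses.
    """
    words_per_bank = {}
    for w in word_indices:
        words_per_bank.setdefault(w % NUM_BANKS, set()).add(w)
    # Distinct words that fall into the same bank are serviced one after another.
    return max(len(words) for words in words_per_bank.values())


def analyze(i_of, y_of, half_warp=0):
    """Check the three accesses of smem[i] += smem[y] for one half-warp.

    i_of and y_of map a thread index to the word index it uses.
    """
    first = half_warp * HALF_WARP
    tids = range(first, first + HALF_WARP)
    i_idx = [i_of(t) for t in tids]
    y_idx = [y_of(t) for t in tids]
    return {
        "read smem[i]": conflict_degree(i_idx),
        "read smem[y]": conflict_degree(y_idx),
        "write smem[i]": conflict_degree(i_idx),
    }


# Hypothetical reduction step with interleaved indexing: i = 2*tid, y = 2*tid + 1.
interleaved = analyze(lambda t: 2 * t, lambda t: 2 * t + 1)

# Hypothetical reduction step with sequential indexing: i = tid, y = tid + 16.
sequential = analyze(lambda t: t, lambda t: t + 16)

if __name__ == "__main__":
    print("interleaved:", interleaved)
    print("sequential: ", sequential)
```

Under these assumptions, the interleaved pattern gives a 2-way conflict on each of the three accesses, while the sequential pattern is conflict-free on all three. Each operand is examined on its own, which is the way of thinking the reply suggests.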
