shared memory accesses for different compute capabilities

xinwu · July 27, 2011, 8:42am

Hi, everyone!

I’m not sure whether my understanding of shared memory accesses is correct.

For devices of compute capability 1.x (16 banks), only a half-warp (16 threads) can access shared memory at a time.
For devices of compute capability 2.x (32 banks), a full warp (32 threads) can access shared memory at a time. For 64-bit accesses, however, only a half-warp (16 threads) can access shared memory at a time, so there’s no bank conflict.

Could someone please confirm whether my understanding is correct? Thanks!

hyqneuron · July 27, 2011, 3:18pm

Yes, you are correct. The appendix of the programming guide explains things quite clearly.
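To make the bank arithmetic concrete, here is a rough host-side sketch in plain C++ (no GPU needed). It assumes the usual simplified model in which successive 32-bit words map to successive banks, and that threads read consecutive elements. The function name `conflict_degree` and its parameters are made up for illustration; it is not an NVIDIA API and it does not model every hardware detail (broadcast rules, for instance, differ between 1.x and 2.x).

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <set>

// Simplified model (an assumption, not a hardware simulator):
// 32-bit word w lives in bank (w % num_banks).
// Thread t reads bytes_per_thread bytes (4 or 8) starting at byte offset
// t * bytes_per_thread, i.e. a linear array access.
// Returns the largest number of distinct words that land in a single bank:
// 1 means conflict-free, n means an n-way conflict under this model.
int conflict_degree(int threads, int bytes_per_thread, int num_banks) {
    std::map<int, std::set<std::size_t>> words_in_bank;
    const int words_per_thread = std::max(1, bytes_per_thread / 4);
    for (int t = 0; t < threads; ++t) {
        const std::size_t first_word =
            static_cast<std::size_t>(t) * bytes_per_thread / 4;
        for (int w = 0; w < words_per_thread; ++w) {
            const std::size_t word = first_word + w;
            const int bank = static_cast<int>(word % num_banks);
            words_in_bank[bank].insert(word);  // same word twice counts once
        }
    }
    int worst = 0;
    for (const auto& entry : words_in_bank) {
        worst = std::max(worst, static_cast<int>(entry.second.size()));
    }
    return worst;
}
```

Under this model, `conflict_degree(32, 8, 32)` comes out as 2, while `conflict_degree(16, 8, 32)` comes out as 1: a full warp of 64-bit reads would need 64 words from 32 banks, but a half-warp needs only 32. That matches the reasoning in the question about why 64-bit accesses on 2.x are handled per half-warp.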

xinwu · July 29, 2011, 8:00am

Thanks, hyqneuron! I’ll take a look!
