About shared memory banks

user144648 · April 24, 2023, 5:19pm

I have a float array A[5].
The total size in bytes is 5 * 4 = 20 B.
In compute capability 5.2 we have 8 B per bank (32 banks in total, so 256 B).

How is the total size of shared memory per SM calculated (in compute capability 5.2 it is ~98 KB)?
How are the bytes of each array element distributed across the banks?
e.g.
A[0] and A[1] are stored in bank 0,
A[2] and A[3] are stored in bank 1,
etc.
or
byte 0 of A[0] is in bank 0,
byte 1 of A[0] is in bank 1,
byte 2 of A[0] is in bank 2,
etc.
or something else?

Thanks to all.

Robert_Crovella · April 24, 2023, 6:00pm

I don’t think so. See here. Compute capability 5.x is 4 bytes/bank.

For:

__shared__ float A[size];

A[0] is in bank 0.
A[1] is in bank 1.
A[2] is in bank 2.
…
A[31] is in bank 31.
A[32] is in bank 0.
A[33] is in bank 1.
etc.
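
The mapping above can be written as a small piece of arithmetic. The following is only a sketch, assuming the 32 banks of 4 bytes each stated in this reply and an array that starts at byte offset 0 of shared memory. The helper names `bank_of_byte` and `bank_of_float` are hypothetical and are not part of any CUDA API.

```cpp
#include <cstddef>

// Assumed parameters for compute capability 5.x, taken from the reply above:
// 32 banks, each one 32-bit word (4 bytes) wide.
constexpr std::size_t kNumBanks = 32;
constexpr std::size_t kBankWidthBytes = 4;

// Hypothetical helper: bank that holds the byte at `byte_offset`,
// measured from the start of shared memory.
constexpr std::size_t bank_of_byte(std::size_t byte_offset) {
    return (byte_offset / kBankWidthBytes) % kNumBanks;
}

// Hypothetical helper: bank that holds element `i` of a float array
// assumed to start at byte offset 0.
constexpr std::size_t bank_of_float(std::size_t i) {
    return bank_of_byte(i * sizeof(float));
}
```

Under these assumptions all four bytes of A[0] fall in bank 0, so an element is never split byte-by-byte across banks, and A[32] wraps around to bank 0 again.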

user144648 · April 24, 2023, 7:17pm

Thanks for your reply.
In this case, if A is a float array (4 B per element),
each bank holds one 32-bit word (4 B/bank),
and the elements of A are distributed as above,
then if thread 0 and thread 1 both access A[0],
there is no conflict.

Is this true?

Robert_Crovella · April 24, 2023, 7:25pm

Yes, true. Two threads in a warp accessing the same bank may have a bank conflict. However, two or more threads accessing the same location are covered by the broadcast rule. The value will be broadcast to all threads requesting it, with no conflicts arising from the broadcast.
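
The difference between a conflict and a broadcast can be illustrated with a simplified host-side model. This is a sketch under stated assumptions (32 threads per warp, 32 banks, one 32-bit word per access), not a description of the actual hardware logic. The function `conflict_degree` is a hypothetical name: it counts how many distinct words map to the busiest bank, so threads reading the same word count only once.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <set>

constexpr std::size_t kWarpSize = 32;  // assumed threads per warp
constexpr std::size_t kBanks = 32;     // assumed number of banks

// Hypothetical model: word_index[t] is the 32-bit word that thread t reads.
// Returns the largest number of DISTINCT words that fall in any one bank.
// 1 means conflict-free; n means an n-way bank conflict in this model.
inline std::size_t conflict_degree(
    const std::array<std::size_t, kWarpSize>& word_index) {
    std::array<std::set<std::size_t>, kBanks> words_per_bank;
    for (std::size_t w : word_index) {
        // Identical words collapse into one set entry (the broadcast case).
        words_per_bank[w % kBanks].insert(w);
    }
    std::size_t worst = 0;
    for (const auto& words : words_per_bank) {
        worst = std::max(worst, words.size());
    }
    return worst;
}
```

In this model, every thread reading A[0] gives a degree of 1 (broadcast), thread t reading A[t] also gives 1, and thread t reading A[2 * t] gives 2 because words t and t + 16 then share a bank.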

user144648 · April 25, 2023, 2:35pm

Thank you for your answer.

Another question:
In compute capability 5.2, shared memory is 64 KB.
Does this mean that the 64 KB is divided among 32 banks?
e.g.
The word indices would be
Bank0: 0, 32, 64, 96, 128, …
Bank1: 1, 33, 65, 97, 129, …
…
Bank31: 31, 63, 95, 127, 159, … 15999

Robert_Crovella · April 25, 2023, 2:49pm

Yes, correct. (Well, I don’t know about the 15999 number, but the rest of it is the way I would describe it.)
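
The last index can be checked with a little arithmetic. The sketch below assumes that 64 KB means 64 * 1024 = 65,536 bytes and that words are 4 bytes wide and interleaved across 32 banks as described in this thread. The helper names are hypothetical.

```cpp
#include <cstddef>

constexpr std::size_t kBankCount = 32;  // assumed number of banks
constexpr std::size_t kWordBytes = 4;   // assumed bank width: one 32-bit word

// Number of 32-bit words in a shared memory region of `bytes` bytes.
constexpr std::size_t total_words(std::size_t bytes) {
    return bytes / kWordBytes;
}

// Number of words that land in each bank when words are interleaved.
constexpr std::size_t words_per_bank(std::size_t bytes) {
    return total_words(bytes) / kBankCount;
}

// Index of the k-th word (k = 0, 1, 2, ...) stored in bank `bank`.
constexpr std::size_t word_in_bank(std::size_t bank, std::size_t k) {
    return bank + kBankCount * k;
}
```

Under these assumptions 64 KB holds 16,384 words, or 512 per bank, so the last word in bank 31 would have index 31 + 32 * 511 = 16383.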

user144648 · April 26, 2023, 9:01am

Thanks again.
