Sonia
December 3, 2009, 7:42am
1
Hi,
Bank conflicts are avoidable in most CUDA computations if care is taken accessing shared memory arrays.
In convolution, for example this is just a matter of padding the 2D array to a width that is not evenly divisible by the number of shared memory banks.
I am unable to understand this padding funda related to avoiding bank conflicts.
Any pointers here ??
Thanks in advance
Hi,
Bank conflicts are avoidable in most CUDA computations if care is taken accessing shared memory arrays.
[b]In convolution, for example this is just a matter of padding the 2D array to a width that is not evenly divisible by the number of shared memory banks.
[/b]
I am unable to understand this padding funda related to avoiding bank conflicts.
Any pointers here ??
Thanks in advance
consider shared A[16][16],
A[i][j] belongs to bank-j for any i = 0:15
suppose you use 16 threads, then
A[threadIdx.x][0] is bank-conflict since A[0][0], A[1][0], … A[15][0] fall into bank-0
and cannnot be broadcasted.
however if you declare shared A[16][16+1]
then A[threadIdx.x][0] is not bank-conflict since
A[0][0] is bank-0
A[1][0] is bank-1
…
A[15][0] is bank-15
General speaking, under shared A[16][16+1], row-major index of A[i][j] is (i * 17 + j)
(i * 17 + j) = 16 * k + bank_id
or say bank_id = (i * 17 + j) mod 16
for j = 0, bank_id = (i * 17) mod 16
gcd(17,16) = 1, so A[i][0], i=0:15 is distributed into 16 bank.
1 Like
Sonia
December 3, 2009, 10:08am
3
consider shared A[16][16],
A[i][j] belongs to bank-j for any i = 0:15
suppose you use 16 threads, then
A[threadIdx.x][0] is bank-conflict since A[0][0], A[1][0], … A[15][0] fall into bank-0
and cannnot be broadcasted.
however if you declare shared A[16][16+1]
then A[threadIdx.x][0] is not bank-conflict since
[b][b]A[0][0] is bank-0
A[1][0] is bank-1
…
A[15][0] is bank-15
[/b][/b]
General speaking, under shared A[16][16+1], row-major index of A[i][j] is (i * 17 + j)
(i * 17 + j) = 16 * k + bank_id
or say bank_id = (i * 17 + j) mod 16
for j = 0, bank_id = (i * 17) mod 16
gcd(17,16) = 1, so A[i][0], i=0:15 is distributed into 16 bank.
Thanks for your reply!
A[1][0] is bank 1 but the statement:
A[i][j] belongs to bank-j for any i = 0:15
Says that A[1][0] is bank 0… as you said : A[ i ][ j ] belongs to bank-j for any i = 0:15
I am little bit confused!
sorry, I correct my setting, define type of matrix A is “float”,
shared float A[16][16]
and relationship between bank id and A[i][j] is shown in the following
bank 0 bank 1 bank 2 bank 3 .... bank 15
A[0][0] A[0][1] A[0][2] A[0][3] .... A[0][15]
A[1][0] A[1][1] A[1][2] A[1][3] .... A[1][15]
A[2][0] A[2][1] A[2][2] A[2][3] .... A[2][15]
...
A[15][0] A[15][1] A[15][2] A[15][3] .... A[15][15]
that’s why I say A[i][j] belongs to bank-j for any i = 0:15