Laue
August 19, 2009, 9:44am
1
Hi,
I’m confused by the section “Shared Memory Bank Conflicts”.
As I understand it, if two or more threads use the same bank, the result is a conflict.
But what is a bank?! Are there only banks 0…15? How large is a bank (32 bits)?
— snip —
// case 1
__shared__ char shared1[8]; // char = 1 byte = 8 bits
// case 2
__shared__ short shared2[8]; // short = 2 bytes = 16 bits
// case 3
__shared__ int shared3[8]; // int = 4 bytes = 32 bits
— snip —
Case 1
Are shared1[0], shared1[1], shared1[2], shared1[3] stored in the same bank?
Case 2
Are shared2[0], shared2[1] stored in the same bank? Are shared2[2], shared2[3] stored in different banks?
Case 3
Are shared3[0], shared3[1], shared3[2], shared3[3] stored in different banks?
Big Thanks,
L.
// case 4
__shared__ int shared3[32]; // int = 4 bytes = 32 bits
Now I need 32 banks, don’t I? But only 16 exist! I don’t understand that! Which bank does shared3[16] belong to? The first bank again, for the second half-warp, or what? → no conflict.
Please help me!
Is there a conflict in this case:
[code]__shared__ float array[32];
unsigned int tid = threadIdx.x;
float data = array[tid] + array[tid + 16];[/code]
Please explain that!
A short, possibly helpful diagram of memory banks (assuming 32-bit elements):
Bank 0     | Bank 1     | ... | Bank 15
shared[0]  | shared[1]  | ... | shared[15]
shared[16] | shared[17] | ... | shared[31]
...etc.
From this you can see that array[tid] and array[tid + 16] fall into the same memory bank and therefore cause a conflict.
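The mapping in the diagram can also be written down as a small host-side helper. This is only a sketch under the assumptions discussed in this thread (16 banks, each 32 bits wide, successive 32-bit words in successive banks); `bank_of`, `kNumBanks` and `kBankWidthBytes` are hypothetical names, not part of CUDA, and the array is assumed to start at a bank 0 boundary.

```cpp
#include <cstddef>

// Assumption: 16 banks, each 32 bits (4 bytes) wide, with successive
// 32-bit words assigned to successive banks, as in the diagram above.
constexpr std::size_t kNumBanks = 16;
constexpr std::size_t kBankWidthBytes = 4;

// Hypothetical helper: bank that holds element `index` of a shared array
// whose elements are `elem_bytes` wide. Assumes the array starts at a
// bank 0 boundary.
constexpr std::size_t bank_of(std::size_t index, std::size_t elem_bytes) {
    // Byte offset -> 32-bit word number -> bank (wraps around after 16 words).
    return (index * elem_bytes / kBankWidthBytes) % kNumBanks;
}
```

Under these assumptions, `bank_of(i, 1)` is 0 for i = 0…3 (case 1), `bank_of(0, 2)` and `bank_of(1, 2)` are both 0 while `bank_of(2, 2)` and `bank_of(3, 2)` are both 1 (case 2), `bank_of(i, 4)` is i for i = 0…3 (case 3), and `bank_of(16, 4)` wraps around to 0 (case 4).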
Thank you very much! Now, I understand it!
Laue
August 19, 2009, 3:17pm
8
Is this correct?
__shared__ short shared[64]; // short int = 2 bytes = 16 bits
[codebox]Bank 0                | Bank 1                | … | Bank 15
shared[0]  shared[1]  | shared[2]  shared[3]  | … | shared[30] shared[31]
shared[32] shared[33] | …[/codebox]
How large can this get (how many variables can be defined)?
Yes, I think it’s correct. Shared memory has a size of 16 KB per multiprocessor.
Laue
August 19, 2009, 3:31pm
10
Ahh, OK. I had read this, but forgotten… :)
BUT! The reads of array[tid] and array[tid+16] are separate machine instructions, hence… NO conflict.
Both memory accesses may appear in a single line of C code, but the memory reads happen sequentially.
Christian
Right, sorry about that. Conflicts can only happen when different threads read from conflicting addresses in the same instruction.
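To make the per-instruction point concrete, here is a rough host-side sketch that counts how many distinct 32-bit words of one half-warp fall into the same bank for a single read instruction. It assumes the 16-bank layout discussed above; `conflict_degree`, `kNumBanks` and `kHalfWarp` are hypothetical names, and counting distinct words per bank is a simplification of what the hardware does, not a model of it.

```cpp
#include <array>
#include <cstddef>
#include <set>

// Assumptions: 16 banks of 32-bit words, and conflicts are resolved
// per half-warp (16 threads) for each instruction separately.
constexpr std::size_t kNumBanks = 16;
constexpr std::size_t kHalfWarp = 16;

// Hypothetical helper: `word_index(tid)` gives the 32-bit word that thread
// `tid` reads in ONE instruction. Returns the largest number of distinct
// words that land in the same bank; 1 means conflict-free.
template <typename IndexFn>
std::size_t conflict_degree(IndexFn word_index) {
    std::array<std::set<std::size_t>, kNumBanks> words_per_bank;
    for (std::size_t tid = 0; tid < kHalfWarp; ++tid) {
        const std::size_t word = word_index(tid);
        words_per_bank[word % kNumBanks].insert(word);
    }
    std::size_t worst = 0;
    for (const auto& words : words_per_bank) {
        if (words.size() > worst) worst = words.size();
    }
    return worst;
}
```

With this sketch, the read of array[tid] and the read of array[tid + 16] each give a degree of 1 when checked on their own, which matches the "no conflict" conclusion. A stride-2 pattern such as array[2 * tid] (an illustrative case, not one from this thread) gives a degree of 2.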