Shared Memory "Bank Conflicts": I'm confused...

Hi,

I'm confused by the section “Shared Memory Bank Conflicts”.

I understand this much: if two or more threads use the same bank, the result is a conflict.

But what is a bank?! Are there only banks 0…15? How wide is a bank (32 bits)?

— snip —

// case 1
__shared__ char shared1[8]; // char = 1 byte = 8 bits

// case 2
__shared__ short shared2[8]; // short = 2 bytes = 16 bits

// case 3
__shared__ int shared3[8]; // int = 4 bytes = 32 bits

— snip —

Case 1

  • Are shared1[0], shared1[1], shared1[2], shared1[3] stored in the same bank?

Case 2

  • Are shared2[0], shared2[1] stored in the same bank? Are shared2[2], shared2[3] stored in a different bank?

Case 3

  • Are shared3[0], shared3[1], shared3[2], shared3[3] stored in different banks?

Big Thanks,
L.

Yes, yes, and yes.
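
As a rough sketch of why all three answers are yes, the bank of an element can be computed from its byte offset. The helper name `bank_of` and the two constants are made up for illustration. The sketch assumes 16 banks of 4 bytes each (the hardware discussed in this thread) and that each array starts at the beginning of bank 0, which is not guaranteed in a real kernel. It is plain C++, so it also compiles as CUDA host code.

```cpp
#include <cstddef>

// Assumed hardware parameters: 16 banks, each 4 bytes (32 bits) wide.
constexpr std::size_t kNumBanks = 16;
constexpr std::size_t kBankWidthBytes = 4;

// Hypothetical helper: bank that holds element `index` of an array whose
// elements are `elem_size` bytes each. Assumes the array starts at byte
// offset 0 of shared memory, i.e. at the beginning of bank 0.
constexpr std::size_t bank_of(std::size_t index, std::size_t elem_size) {
    // Byte offset -> 32-bit word index -> bank (words are spread round-robin).
    return (index * elem_size / kBankWidthBytes) % kNumBanks;
}
```

With these assumptions, `bank_of(i, 1)` is 0 for i = 0…3 (case 1), `bank_of(i, 2)` is 0 for i = 0, 1 and 1 for i = 2, 3 (case 2), and `bank_of(i, 4)` is simply i for i = 0…3 (case 3). Note that in case 2, shared2[2] and shared2[3] share a bank with each other; it is just a different bank from the one holding shared2[0] and shared2[1].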

OK, thanks again :)!

// case 4
__shared__ int shared3[32]; // int = 4 bytes = 32 bits

Now I need 32 banks, don't I? But only 16 exist! I don't understand that. Which bank does shared3[16] belong to? The first bank, for the second half-warp, or what? → No conflict.

Please help me!

Is there a conflict in this case?

[code]__shared__ float array[32];

unsigned int tid = threadIdx.x;

float data = array[tid] + array[tid + 16];[/code]

Please explain that!

A short, possibly helpful diagram of the memory banks:

Bank 0     | Bank 1     | ... | Bank 15

shared[0]  | shared[1]  | ... | shared[15]

shared[16] | shared[17] | ... | shared[31]

... etc.

From this you can see that array[tid] and array[tid + 16] fall into the same memory bank and therefore cause a conflict.
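
The layout in the diagram can be sketched in a few lines. `Slot` and `slot_of` are hypothetical names, and the sketch assumes 4-byte elements, 16 banks, and an array that starts at the beginning of bank 0.

```cpp
#include <cstddef>

constexpr std::size_t kNumBanks = 16;  // assumed: 16 banks, as in the diagram

// Position of a 4-byte element in the diagram: the row it is drawn in and
// the bank (column) it occupies.
struct Slot {
    std::size_t row;
    std::size_t bank;
};

// Hypothetical helper; assumes the array starts at the beginning of bank 0.
constexpr Slot slot_of(std::size_t index) {
    return Slot{index / kNumBanks, index % kNumBanks};
}

// shared[16] wraps around to bank 0, one row below shared[0].
static_assert(slot_of(16).row == 1 && slot_of(16).bank == 0, "wrap-around");
```

For any i, `slot_of(i).bank` equals `slot_of(i + 16).bank`: the two elements sit in the same column, one row apart.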

Thank you very much! Now I understand it!

Is this correct?

__shared__ short shared[64]; // short int = 2 bytes = 16 bits

[codebox]Bank 0                | Bank 1                | … | Bank 15

shared[0]  shared[1]  | shared[2]  shared[3]  | … | shared[30] shared[31]

shared[32] shared[33] | …[/codebox]

How much shared memory is there (how many variables can be declared)?

Yes, I think it’s correct. Shared memory has a size of 16 KB per multiprocessor.

Ahh, OK. I had read this, but forgotten… :)

BUT! The read accesses for array[tid] and array[tid+16] are in separate machine instructions, hence… NO conflict.

Both memory accesses may appear in a single line of C code, but the memory reads happen sequentially.

Christian

Right, sorry about that. Conflicts can only happen when different threads, executing the same instruction, read different addresses that map to the same bank.
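
To make that concrete, here is a minimal sketch that looks at one load instruction at a time. The name `conflict_degree` is made up for illustration, and the model is deliberately simple: it assumes 16 banks, a half-warp of 16 threads served together, 4-byte elements, an array starting at bank 0, and a stride of at least 1 so that every thread reads a different element.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t kNumBanks = 16;  // assumed: banks on the GPU discussed here
constexpr std::size_t kHalfWarp = 16;  // assumed: threads served by one request

// Hypothetical model of ONE load instruction in which thread `tid` reads the
// 4-byte element array[offset + stride * tid]. Returns the largest number of
// threads that land in a single bank; 1 means the load is conflict-free.
inline std::size_t conflict_degree(std::size_t offset, std::size_t stride) {
    assert(stride >= 1);  // every thread must read a different element
    std::array<std::size_t, kNumBanks> hits{};  // per-bank counters, start at 0
    for (std::size_t tid = 0; tid < kHalfWarp; ++tid) {
        ++hits[(offset + stride * tid) % kNumBanks];
    }
    return *std::max_element(hits.begin(), hits.end());
}
```

Under this model, the load of array[tid] is `conflict_degree(0, 1)` and the load of array[tid + 16] is `conflict_degree(16, 1)`; both come out as 1, because each instruction on its own touches every bank exactly once. A load such as array[2 * tid], i.e. `conflict_degree(0, 2)`, would give 2.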

Right, but does the first pattern create a 4-way bank conflict? With this exact pattern I guess there is just one 32-bit transaction, no?

All recent architectures have 32 banks, each 4 bytes wide.

It is only a bank conflict if different addresses in the same bank are read. Reads of the identical address don’t count as a bank conflict.
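
A small sketch of that rule, under stated assumptions: 32 banks of 4 bytes each, "address" taken to mean the 32-bit word (the byte address with its two lowest bits dropped), and a hypothetical helper `conflict_degree_bytes` that takes the byte address read by each thread of one warp in one instruction.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

constexpr std::size_t kNumBanks = 32;  // per the post above: 32 banks, 4 bytes wide

// Hypothetical helper: `byte_addrs` holds the shared-memory byte address read
// by each thread of one warp in a single instruction. Returns the largest
// number of DISTINCT 32-bit words requested from any one bank
// (1 = conflict-free, 0 = no accesses at all).
inline std::size_t conflict_degree_bytes(const std::vector<std::uint32_t>& byte_addrs) {
    assert(byte_addrs.size() <= 32);  // at most one address per thread of a warp
    std::vector<std::set<std::uint32_t>> words_per_bank(kNumBanks);
    for (std::uint32_t addr : byte_addrs) {
        std::uint32_t word = addr >> 2;  // drop the two LSBs: byte -> word address
        words_per_bank[word % kNumBanks].insert(word);  // same word is counted once
    }
    std::size_t degree = 0;
    for (const auto& words : words_per_bank) {
        degree = std::max(degree, words.size());
    }
    return degree;
}
```

In this model, four threads reading bytes 0, 1, 2, 3 give a degree of 1, since all four bytes belong to word 0. Two threads reading bytes 0 and 128 give a degree of 2, since words 0 and 32 are different words in bank 0.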

Thanks! Just one clarification: when the first 4 threads read bytes [0], [1], [2], [3], they actually access the same “address” in shared memory, so there is no conflict. So by “address” you mean the address of a 32-bit chunk, i.e. the byte address with its two LSBs dropped, right?