Does this have a bank conflict?

Hi all,

I have an array A in shared memory, and each thread reads two neighboring elements of it. For example, thread 0 reads A[0] and A[1], thread 1 reads A[1] and A[2]. Does this cause a bank conflict?

Thank you.

Casy

Yes, you have a bank conflict at A[1] …

When I access memory like this (for example, in a reduction), I use indexes like the following.

n is the number of elements:
Thread[ i ]: A[ i ] and A[ i + (n >> 1) ]
… ( n >> 1 is the same as integer n / 2 for non-negative n; the parentheses are needed because + binds tighter than >> in C )

You can also use something like

unsigned int i = blockIdx.x * ( blockDim.x * 2 ) + threadIdx.x; // TODO adapt
unsigned int ib = i + blockDim.x;
Thread[ i ] accesses A[ i ] and A[ ib ]

These options don’t have bank conflicts, but the number of elements should be even.
( Have a look at SDK’s reduction sample! ;) )
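The two indexing schemes above can be written out as plain host-side functions to check which elements a given thread touches. This is only a sketch: the names `split_half_indices` and `paired_block_indices` are hypothetical, and the functions just reproduce the arithmetic from the post rather than any particular SDK code.

```cpp
#include <cstddef>
#include <utility>

// Hypothetical helper: the two elements read by thread `tid` when the array
// of `n` elements is split into a lower and an upper half.
// Note the parentheses: tid + n >> 1 would parse as (tid + n) >> 1.
inline std::pair<std::size_t, std::size_t>
split_half_indices(std::size_t tid, std::size_t n) {
    return {tid, tid + (n >> 1)};
}

// Hypothetical helper: the two elements read by thread `tid` of block `block`
// when each block covers 2 * block_dim consecutive elements.
inline std::pair<std::size_t, std::size_t>
paired_block_indices(std::size_t block, std::size_t block_dim, std::size_t tid) {
    std::size_t i  = block * (block_dim * 2) + tid;  // first element
    std::size_t ib = i + block_dim;                  // partner element
    return {i, ib};
}
```

In both schemes, consecutive threads read consecutive elements in each of the two reads, which is the access pattern the post is aiming for.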

This response makes some assumptions which are incorrect.

Let's say we have this example:

__shared__ uint foo[BLOCKSIZE + 1];

uint val1, val2;

val1 = foo[threadIdx.x];

val2 = foo[threadIdx.x + 1];

There is no bank conflict.
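One way to see why neither read collides is to model the bank mapping on the host. The sketch below assumes 32 banks of 4-byte words, with successive 32-bit words mapped to successive banks, and a warp of 32 threads; these numbers are assumptions about the hardware (older GPUs use 16 banks and resolve conflicts per half-warp, where the same reasoning applies). The names `bank_of` and `conflict_degree` are made up for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Assumed hardware parameters (not taken from the thread).
constexpr std::size_t kNumBanks = 32;  // 4-byte-wide banks
constexpr std::size_t kWarpSize = 32;  // threads issuing one access together

// Bank that holds the 32-bit word at index `word_index` of a shared array.
constexpr std::size_t bank_of(std::size_t word_index) {
    return word_index % kNumBanks;
}

// Largest number of distinct words that fall into one bank when thread t of
// a warp reads word t * stride + offset. A result of 1 means conflict-free.
inline std::size_t conflict_degree(std::size_t stride, std::size_t offset) {
    // stride 0 would mean every thread reads the same word, which is a
    // broadcast rather than a conflict, so it is excluded here.
    assert(stride >= 1);
    std::array<std::size_t, kNumBanks> hits{};
    std::size_t worst = 0;
    for (std::size_t t = 0; t < kWarpSize; ++t) {
        std::size_t b = bank_of(t * stride + offset);
        ++hits[b];
        if (hits[b] > worst) worst = hits[b];
    }
    return worst;
}
```

Under these assumptions, `conflict_degree(1, 0)` models `foo[threadIdx.x]` and `conflict_degree(1, 1)` models `foo[threadIdx.x + 1]`; both return 1, because each instruction is serviced separately and within one instruction every thread lands in a different bank. A stride of 2, by contrast, returns 2.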

Now if you were doing something along the lines of:

__shared__ uint foo[BLOCKSIZE + 1];

uint2 val;

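val = *(uint2 *) &foo[threadIdx.x]; // dereference the cast pointer; note that a uint2 load needs 8-byte alignment, so odd threadIdx.x is misaligned here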

This would create a bank conflict.

There would be a quote-unquote “bank conflict.” But in fact both scenarios will simply execute in two cycles.