Hi,
I’m a little confused. I thought I understood bank conflicts, but after reading Programming Guide section F.4.3.2, "Larger Than 32-Bit Access", I’m no longer sure.
As I understand it, a bank conflict occurs when two threads access the same memory bank. Example:
// 64 threads per block
__shared__ float data[64];
data[threadIdx.x] = 1.0f;
This causes a 2-way bank conflict, yes?
Accessing data[0] and data[1] doesn’t cause a bank conflict, but accessing data[0] and data[32] does.
The example in the Programming Guide says there is no bank conflict, but I think that shared[0] and shared[10] are in conflict.
Bank conflicts never occur between threads of different warps (sets of 32 threads), because memory accesses (and instructions in general) are executed for one warp at a time.
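To make that concrete for the data[threadIdx.x] example, here is a small host-side sketch (plain C++, no GPU needed) that models the banks. It assumes 32 banks of 32-bit words and 32 threads per warp, as discussed in this thread; the names bank_of, warp_of and max_conflict_degree are made up for illustration and are not part of CUDA.

```cpp
// Host-side model only. Assumes 32 banks, each one 32-bit word wide,
// and 32 threads per warp.
constexpr int kBanks = 32;
constexpr int kWarpSize = 32;

// Bank that holds the 32-bit word at index `word` in shared memory.
constexpr int bank_of(int word) { return word % kBanks; }

// Warp that thread `tid` of a 1D block belongs to.
constexpr int warp_of(int tid) { return tid / kWarpSize; }

// Largest number of threads of the same warp that hit one bank when
// thread `tid` accesses data[tid] in a block of `block_size` threads.
// A result of 1 means the access is conflict-free.
constexpr int max_conflict_degree(int block_size) {
    int worst = 0;
    for (int first = 0; first < block_size; first += kWarpSize) {
        int hits[kBanks] = {};  // per-warp counters, reset for every warp
        for (int tid = first; tid < first + kWarpSize && tid < block_size; ++tid) {
            int h = ++hits[bank_of(tid)];
            if (h > worst) worst = h;
        }
    }
    return worst;
}

// data[0] and data[32] do live in the same bank ...
static_assert(bank_of(0) == bank_of(32), "same bank");
// ... but threads 0 and 32 belong to different warps ...
static_assert(warp_of(0) != warp_of(32), "different warps");
// ... so within each warp every thread hits a distinct bank.
static_assert(max_conflict_degree(64) == 1, "no conflict for data[threadIdx.x]");
```

The static_asserts are checked at compile time (C++14 or later), so if the file compiles, the model agrees that the 64-thread example has no bank conflict.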
Yes, but that actually gets compiled to three separate accesses, each touching 32 banks (float3 has no hardware support, so .x, .y and .z are accessed separately). And because 3 (the stride in 32-bit words) and 32 (the number of banks) are relatively prime, there is no bank conflict: each of the 32 threads accesses a different bank. If you are still unsure, work out the number of the bank that each thread accesses.
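If you would rather let the compiler work out the bank numbers, a rough host-side sketch could look like the following (plain C++, no GPU needed). It assumes 32 banks of 32-bit words and a 32-thread warp, and models thread t accessing the word at index stride * t + offset; the helper name max_degree_for_stride is hypothetical, not a CUDA function.

```cpp
// Host-side model only. Assumes 32 banks, each one 32-bit word wide,
// and 32 threads per warp.
constexpr int kBanks = 32;
constexpr int kWarpSize = 32;

// Largest number of threads in one warp that hit the same bank when
// thread t accesses the 32-bit word at index stride * t + offset.
// A result of 1 means the access is conflict-free.
constexpr int max_degree_for_stride(int stride, int offset) {
    int hits[kBanks] = {};
    int worst = 0;
    for (int t = 0; t < kWarpSize; ++t) {
        int bank = (stride * t + offset) % kBanks;
        int h = ++hits[bank];
        if (h > worst) worst = h;
    }
    return worst;
}

// float3 array: stride of 3 words; .x, .y and .z are offsets 0, 1 and 2
// and are modelled as three separate accesses.
static_assert(max_degree_for_stride(3, 0) == 1, ".x access is conflict-free");
static_assert(max_degree_for_stride(3, 1) == 1, ".y access is conflict-free");
static_assert(max_degree_for_stride(3, 2) == 1, ".z access is conflict-free");

// shared[10].z (word 32) shares bank 0 with shared[0].x (word 0), but the two
// belong to different accesses (.z versus .x), so they never collide.
static_assert((3 * 10 + 2) % kBanks == (3 * 0 + 0) % kBanks, "same bank, different access");

// For contrast: a stride of 2 words is not coprime with 32 and gives a
// 2-way conflict (threads t and t + 16 hit the same bank).
static_assert(max_degree_for_stride(2, 0) == 2, "2-way conflict");
```

All checks run at compile time (C++14 or later). Changing the stride to other values shows the pattern: in this model, odd strides stay conflict-free, while even strides share a factor with 32 and produce conflicts.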