I don’t quite understand the bank conflict example on page 57 of the CUDA Programming Guide. It says there is no bank conflict if the structure contains three floats, but there is a conflict if the structure contains two floats.
If the structure has three floats, then:
Thread 0 - Bank 0 (0.x), Bank 1 (0.y), Bank 2 (0.z)
Thread 1 - Bank 3 (1.x), Bank 4 (1.y), Bank 5 (1.z)
Thread 2 - Bank 6 (2.x), Bank 7 (2.y), Bank 8 (2.z)
But Thread 6 will wrap around and access Banks 1, 2, and 3, which are already accessed by Thread 0 and Thread 1. So there should be a bank conflict…
Can anybody please explain this? Thanks.
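
One way to check the wrap-around worry is to compute which bank each thread hits during a single member read. The sketch below is plain host-side C++ (no GPU needed) and rests on assumptions that are not stated in the post: 16 banks of 32-bit words, conflicts evaluated per half-warp of 16 threads (as on compute capability 1.x devices), and x, y and z read by separate instructions, so only threads reading the same member at the same time compete for a bank. The names bank_of and conflict_degree are made up for illustration.

```cpp
#include <algorithm>
#include <vector>

// Bank hit by thread `tid` when it reads member `member` (0 = x, 1 = y, ...)
// of element `tid` in an array of structs that are `stride_words` 32-bit
// words wide. Assumes the array starts at word 0 of shared memory.
int bank_of(int tid, int member, int stride_words, int num_banks) {
    return (tid * stride_words + member) % num_banks;
}

// Largest number of threads that land on the same bank during ONE member
// read. 1 means conflict-free; n means an n-way bank conflict.
int conflict_degree(int stride_words, int member, int num_threads, int num_banks) {
    std::vector<int> hits(num_banks, 0);
    for (int tid = 0; tid < num_threads; ++tid) {
        ++hits[bank_of(tid, member, stride_words, num_banks)];
    }
    return *std::max_element(hits.begin(), hits.end());
}
```

Under these assumptions, conflict_degree(3, 0, 16, 16) evaluates to 1: the stride of three words shares no common factor with 16, so the 16 threads reading .x land on 16 different banks. conflict_degree(2, 0, 16, 16) evaluates to 2, because with a stride of two words threads t and t + 8 map to the same bank. The wrap-around itself is real (with 16 banks, thread 5 already reaches banks 15, 0 and 1), but it would only matter if thread 0's .x read and thread 5's .y read were issued at the same time.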