I just need someone to confirm that my understanding of shared memory bank conflicts on Fermi is correct. The CUDA Programming Guide 3.1 says that on GPUs of compute capability 2.0 there can now be bank conflicts between threads in different half-warps. Is this because the number of banks has increased to 32, so that both half-warps of a warp access the banks at the same time, and hence 2 threads in different half-warps can access the same bank? Or is there another explanation that I have missed?
I’m also a little unsure about why doubles are subject to 2-way bank conflicts in shared memory for compute capability 1.3. If anyone is a whizz at this and can explain it to me I would be very grateful indeed.
Actually, I’ve just worked out why doubles suffer 2-way bank conflicts. It’s because each double is split into two 32-bit words that are placed in successive banks. So, for a half-warp accessing 16 doubles, the 32 words are spread over only 16 banks, and 2 threads end up accessing the same bank.
Still unclear on the first point however, so please reply with possible explanations.
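To check the arithmetic for the double case, here is a minimal host-side sketch in C++ (not device code). It assumes the compute capability 1.x layout of 16 banks, each 32 bits wide, with successive 32-bit words in successive banks, and it assumes that a double access is served as 32-bit requests with a stride of two words. The name `conflictDegree` is made up for illustration.

```cpp
#include <algorithm>
#include <array>

// Assumed model of compute capability 1.x shared memory:
// 16 banks, each 32 bits wide, successive 32-bit words in successive banks.
constexpr int kBanks = 16;
constexpr int kHalfWarp = 16;

// Largest number of threads in a half-warp that land in the same bank when
// thread t reads the 32-bit word at index t * strideWords.
// Assumes strideWords >= 1, so every thread reads a different word and the
// broadcast mechanism does not apply.
int conflictDegree(int strideWords) {
    std::array<int, kBanks> hits{};  // threads per bank, zero-initialised
    for (int t = 0; t < kHalfWarp; ++t) {
        ++hits[(t * strideWords) % kBanks];
    }
    return *std::max_element(hits.begin(), hits.end());
}
```

Under these assumptions `conflictDegree(1)` (one float per thread) gives 1, while `conflictDegree(2)` (the low word of one double per thread) gives 2, because threads t and t + 8 fall in the same bank.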
You are correct on the first point. Pre-Fermi, shared memory accesses were handled per half-warp, so it didn't matter if, say, threads 0 and 16 accessed the same bank. This has changed with Fermi, as you said.
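A rough way to see the difference is to model it on the host. The sketch below is plain C++, not device code, and the names `worstConflict` and `examplePattern` are hypothetical. It assumes 16 banks with requests handled per 16 threads for compute capability 1.x, and 32 banks with requests handled per 32 threads for compute capability 2.0. It counts distinct 32-bit words per bank, which simplifies the 1.x broadcast rules.

```cpp
#include <algorithm>
#include <cstddef>
#include <set>
#include <vector>

// Largest number of distinct 32-bit words that fall in one bank within any
// group of `groupSize` consecutive threads. A result of 1 means no conflict.
// Threads reading the same word are counted once (simplified broadcast).
int worstConflict(const std::vector<int>& wordIndex, int banks, int groupSize) {
    const std::size_t group = static_cast<std::size_t>(groupSize);
    int worst = 0;
    for (std::size_t start = 0; start < wordIndex.size(); start += group) {
        std::vector<std::set<int>> wordsInBank(banks);
        const std::size_t end = std::min(wordIndex.size(), start + group);
        for (std::size_t t = start; t < end; ++t) {
            wordsInBank[wordIndex[t] % banks].insert(wordIndex[t]);
        }
        for (const auto& words : wordsInBank) {
            worst = std::max(worst, static_cast<int>(words.size()));
        }
    }
    return worst;
}

// Hypothetical access pattern for one 32-thread warp: thread t reads word
// (t % 16) + 32 * (t / 16), so threads 0 and 16 read words 0 and 32.
std::vector<int> examplePattern() {
    std::vector<int> idx(32);
    for (int t = 0; t < 32; ++t) {
        idx[t] = (t % 16) + 32 * (t / 16);
    }
    return idx;
}
```

With this pattern, `worstConflict(examplePattern(), 16, 16)` gives 1, since each half-warp touches 16 different banks. `worstConflict(examplePattern(), 32, 32)` gives 2, since words 0 and 32 both map to bank 0 and are now part of the same request.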