Ncu detects bank conflicts in matrix transposition after padding

mstrengert · January 30, 2023, 5:34pm

Hi zhaopeng_eng,

There is indeed a difference between the bank conflicts reported on the Source Page and those reported on the Details Page.

On the Source Page, the reported bank conflicts originate solely from the memory access pattern of the corresponding source line. For every executed shared memory access, we calculate the conflicts that the access pattern of the warp's active threads causes within that warp. For your updated code sample, this is now reduced to zero.
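
To illustrate the kind of per-warp calculation described above, here is a rough host-side C++ sketch. It is not how Nsight Compute implements it; it is a simplified model that assumes 32 banks of 4-byte words, a fully active warp of 32 threads, and a float tile read column-wise as in a typical transpose kernel. The function names and the tile layout are made up for this example.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <set>

// Assumptions for this sketch (not taken from the post above):
// 32 shared memory banks, 4-byte bank width, 32 active threads per warp.
constexpr int kNumBanks = 32;
constexpr int kWarpSize = 32;

// Given the 4-byte word index each thread accesses, return the number of
// extra passes needed for this one warp-wide access. Threads that touch the
// same word are served together, so only distinct words per bank count.
int conflicts_for_access(const std::array<int, kWarpSize>& word_index) {
    std::array<std::set<int>, kNumBanks> words_per_bank;
    for (int w : word_index) {
        words_per_bank[w % kNumBanks].insert(w);
    }
    std::size_t worst = 1;
    for (const auto& words : words_per_bank) {
        worst = std::max(worst, words.size());
    }
    return static_cast<int>(worst) - 1;
}

// Hypothetical transpose-style column read: thread t reads tile[t][column],
// where each tile row holds row_stride floats (32 unpadded, 33 padded).
int conflicts_for_column_read(int row_stride, int column) {
    std::array<int, kWarpSize> word_index{};
    for (int t = 0; t < kWarpSize; ++t) {
        word_index[t] = t * row_stride + column;
    }
    return conflicts_for_access(word_index);
}
```

Under these assumptions, a row stride of 32 puts every thread of the warp in the same bank, while a padded stride of 33 spreads the threads over all 32 banks, so the pattern-only conflict count drops to zero. This model only covers conflicts caused by the access pattern within one warp; it says nothing about conflicts that come from other clients of the same memory banks.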

The bank conflicts reported on the Details Page include all of these conflicts plus additional conflicts that are caused by multiple clients trying to access the memory banks at the same time. For more details, please also have a look at the NVIDIA On-Demand talk "How to Understand and Optimize Shared Memory Accesses using Nsight Compute". The difference and its root cause are briefly discussed around minute 21 of the recording. In short, because the L1 Cache and Shared Memory are both backed by the same physical memory banks, there may be additional conflicts across warps when different clients access this physical memory. The numbers on the Details Page include these additional conflicts.
