Avoiding Bank Conflics in SCAN


I am writing a program for Pre-fix Sum i.e scan. I have 128 elements in the array. I am not able to identify the proper value of offset that I should use for avoiding Bank Conflicts. I have seen one example scan code on SDK (please refer to scan in the SDK), and one paper by Mark Harris here on page number 11, that uses the following macro for offset:

1-In Paper by Mark Harris on page number 11.

#define NUM_BANKS 16
#define LOG_NUM_BANKS 4
((index) >> NUM_BANKS + (index) >> (2 * LOG_NUM_BANKS))
#define CONFLICT_FREE_OFFSET(index) ((index) >> LOG_NUM_BANKS)

I have grasp the meaning of offset

((index) >> LOG_NUM_BANKS)
in else part of above macro. It is simply because we are adding to the index, Index/NUM_BANKS, which when used with >> converts to (index) >> LOG_NUM_BANKS. This is quite clear why we are adding this value.

However, I am not able to understand the purpose/meaning of

((index) >> NUM_BANKS + (index) >> (2 * LOG_NUM_BANKS))

I am sure some of you might have done this algorithm before. So I need some pointers here.

Thanks guys !!

Got it guys!!

It removes additional bank conflicts. But you need to assign more shared memory, which is the disadvantage of this option.