Question about the histogram64 sample

When I read the histogram64 sample, several things puzzled me. In the kernel, threadPos is computed with three bit operations:

//Encode thread index in order to avoid bank conflicts in s_Hist[] access:
//each half-warp accesses consecutive shared memory banks
//and the same bytes within the banks
const int threadPos =
    //[31 : 6] <== [31 : 6]
    ((threadIdx.x & (~63)) >> 0) |
    //[5  : 2] <== [3  : 0]
    ((threadIdx.x &    15) << 2) |
    //[1  : 0] <== [5  : 4]
    ((threadIdx.x &    48) >> 4);

Why is it done this way? Is there a reason?

My other question is this:

In the main program, the h_Data array is filled with values in the range [0, 255]. Why, then, does the kernel comment say that only a 64-bit histogram of the 8-bit input data array is calculated, and that only the highest 6 bits of each 8-bit data element are extracted?

//Cycle through current block, update per-thread histograms
//Since only 64-bit histogram of 8-bit input data array is calculated,
//only highest 6 bits of each 8-bit data element are extracted,
//leaving out 2 lower bits.
for(int pos = threadIdx.x; pos < dataSize; pos += blockDim.x){
    unsigned int data4 = d_Data[baseIndex + pos];
    addPixel64(s_Hist, threadPos, (data4 >>  2) & 0x3FU);
    addPixel64(s_Hist, threadPos, (data4 >> 10) & 0x3FU);
    addPixel64(s_Hist, threadPos, (data4 >> 18) & 0x3FU);
    addPixel64(s_Hist, threadPos, (data4 >> 26) & 0x3FU);
}

I am confused by this code and hope someone can give me a detailed explanation. I have read histogram.pdf, but could not find the answer there.

Any reply would be appreciated.

The first (the threadPos encoding) serves to work around shared memory bank conflicts in the s_Hist[] accesses; if you remove it and use threadIdx.x directly, the code still produces the same result, but will be (a bit) slower.
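To make the bank-conflict point concrete, here is a small host-side sketch that replays the threadPos bit shuffle and counts how many threads of one half-warp land in the same bank. It rests on assumptions that are not stated in this thread: shared memory has 16 banks of 4 bytes each (as on the hardware this sample targets), each thread owns one byte counter per bin, and the per-bin stride is a multiple of 64 bytes so that only threadPos decides the bank. The function names are made up for illustration.

```c
/* Same bit shuffle as threadPos in the kernel, written for the host. */
unsigned int encode_thread_pos(unsigned int tid)
{
    return ((tid & ~63u) >> 0)   /* bits [31:6] stay where they are  */
         | ((tid &  15u) << 2)   /* bits [3:0]  move up to [5:2]     */
         | ((tid &  48u) >> 4);  /* bits [5:4]  move down to [1:0]   */
}

/* ASSUMPTION: 16 banks, each 4 bytes wide. Returns the bank of a byte offset. */
unsigned int bank_of_byte(unsigned int byte_offset)
{
    return (byte_offset >> 2) & 15u;
}

/* For the 16 threads starting at first_tid (one half-warp), return the
 * number of threads that hit the busiest bank. 1 means conflict-free. */
int max_bank_load(unsigned int first_tid, int use_encoding)
{
    int load[16] = {0};
    int worst = 0;
    for (unsigned int t = first_tid; t < first_tid + 16; ++t) {
        unsigned int pos = use_encoding ? encode_thread_pos(t) : t;
        int n = ++load[bank_of_byte(pos)];
        if (n > worst) {
            worst = n;
        }
    }
    return worst;
}
```

Under those assumptions, max_bank_load(0, 0) gives 4 (four neighbouring threads share each 4-byte bank when threadIdx.x is used directly), while max_bank_load(0, 1) gives 1. The shuffle puts the low 4 bits of the thread index into bits [5:2], which are exactly the bits that select the bank, so the 16 threads of a half-warp spread over 16 different banks.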

Then why is it that the gray levels are 0-255, but there are only 64 bins, and only the highest 6 bits of each h_Data value are extracted? Also, we know that the shared memory access would otherwise generate a 4-way bank conflict, but how do we know how the bits should be shifted?
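On the 64-bin part, the four shifts in the quoted loop can be replayed on the host. The sketch below is only an illustration of what those expressions compute, assuming (as the kernel comment suggests) that each 32-bit word data4 packs four 8-bit samples; the function names are hypothetical and not part of the sample.

```c
/* Split one 32-bit word holding four 8-bit samples into four bin indices,
 * using the same shift-and-mask expressions as the kernel loop. */
void bins_from_word(unsigned int data4, unsigned int bins[4])
{
    bins[0] = (data4 >>  2) & 0x3FU;  /* top 6 bits of byte 0 */
    bins[1] = (data4 >> 10) & 0x3FU;  /* top 6 bits of byte 1 */
    bins[2] = (data4 >> 18) & 0x3FU;  /* top 6 bits of byte 2 */
    bins[3] = (data4 >> 26) & 0x3FU;  /* top 6 bits of byte 3 */
}

/* Equivalent view for a single sample: dropping the 2 low bits maps
 * gray levels 0..255 onto bins 0..63, four gray levels per bin. */
unsigned int bin_of_sample(unsigned char sample)
{
    return (unsigned int)sample >> 2;
}
```

For example, the word 0xFF804003 holds the samples 0x03, 0x40, 0x80 and 0xFF, which fall into bins 0, 16, 32 and 63. So the input still covers 0-255, but the histogram has 64 bins and each bin counts four adjacent gray levels.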