h_Kernel[kernelR - k] or h_Kernel[kernelR + k]? Please help

The "convolutionSeparable" sample bundled with CUDA SDK 2.3 includes a file named convolutionSeparable_gold.cpp. The convolutionRowCPU function in that file reads:

extern "C" void convolutionRowCPU(
    float *h_Result,
    float *h_Data,
    float *h_Kernel,
    int dataW,
    int dataH,
    int kernelR
){
    int x, y, k, d;
    float sum;

for(y = 0; y < dataH; y++)
    for(x = 0; x < dataW; x++){
        sum = 0;
        for(k = -kernelR; k <= kernelR; k++){
            d = x + k;
            if(d >= 0 && d < dataW)
                sum += h_Data[y * dataW + d] * h_Kernel[kernelR - k];
        }
        h_Result[y * dataW + x] = sum;
    }

}

I am not sure about the line "sum += h_Data[y * dataW + d] * h_Kernel[kernelR - k];".
I think it should be "sum += h_Data[y * dataW + d] * h_Kernel[kernelR + k];", that is, h_Kernel[kernelR - k] should be h_Kernel[kernelR + k].
Could anyone kindly explain why the author uses h_Kernel[kernelR - k] here instead of h_Kernel[kernelR + k]?
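
To make the question concrete, here is a minimal sketch for comparing the two indexings. It is not from the SDK: filterRow and its flip parameter are hypothetical names of mine, and it processes a single row only. As far as I understand, kernelR - k walks the kernel in the opposite direction to the data (the textbook definition of convolution), while kernelR + k walks it in the same direction (correlation), so the two should differ only when the kernel is asymmetric.

```c
#include <assert.h>

/* Hypothetical helper, not part of the CUDA SDK: filters one row of dataW
 * samples with a kernel of 2 * kernelR + 1 taps.
 *   flip != 0 : uses kernel[kernelR - k], as in the SDK's convolutionRowCPU
 *   flip == 0 : uses kernel[kernelR + k], the variant proposed above
 * Samples outside the row are treated as zero, like the SDK code does. */
void filterRow(float *result, const float *data, const float *kernel,
               int dataW, int kernelR, int flip)
{
    assert(dataW > 0 && kernelR >= 0);

    for (int x = 0; x < dataW; x++) {
        float sum = 0.0f;
        for (int k = -kernelR; k <= kernelR; k++) {
            int d = x + k;              /* data index touched by tap k */
            if (d >= 0 && d < dataW) {
                int tap = flip ? kernelR - k : kernelR + k;
                sum += data[d] * kernel[tap];
            }
        }
        result[x] = sum;
    }
}
```

Feeding this the unit impulse {0, 0, 1, 0, 0} with the asymmetric kernel {1, 2, 3} and kernelR = 1 gives 0 1 2 3 0 with flip = 1 (the kernel appears in its stored order) and 0 3 2 1 0 with flip = 0 (the kernel appears reversed). With a symmetric kernel such as {1, 2, 1}, both variants produce the same row.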
Thanks in advance.