The “convolutionSeparable” sample bundled with CUDA SDK 2.3 includes a convolutionSeparable_gold.cpp file. The convolutionRowCPU function in this file is written as follows:
extern "C" void convolutionRowCPU(
    float *h_Result,
    float *h_Data,
    float *h_Kernel,
    int dataW,
    int dataH,
    int kernelR
){
    int x, y, k, d;
    float sum;
    for(y = 0; y < dataH; y++)
        for(x = 0; x < dataW; x++){
            sum = 0;
            for(k = -kernelR; k <= kernelR; k++){
                d = x + k;
                if(d >= 0 && d < dataW)
                    sum += h_Data[y * dataW + d] * h_Kernel[kernelR - k];
            }
            h_Result[y * dataW + x] = sum;
        }
}
I am not sure about the line “sum += h_Data[y * dataW + d] * h_Kernel[kernelR - k];”.
I think h_Kernel[kernelR - k] should be h_Kernel[kernelR + k], i.e. the line should read sum += h_Data[y * dataW + d] * h_Kernel[kernelR + k];
Could anyone be so kind as to explain why the author uses h_Kernel[kernelR - k] here instead of h_Kernel[kernelR + k]?
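To make the difference between the two indexings concrete, here is a minimal single-row sketch. It is not part of the SDK: rowFilter, demoImpulse and the sign parameter are names I made up for illustration, and the kernel {1, 2, 3} is deliberately asymmetric so the two variants cannot give the same result.

```c
#include <assert.h>

/* Hypothetical helper, not from the SDK: filters one row.
 * sign = -1 indexes the kernel as kernel[kernelR - k] (as in the SDK code),
 * sign = +1 indexes it as kernel[kernelR + k] (the variant I expected). */
static void rowFilter(float *result, const float *data, const float *kernel,
                      int dataW, int kernelR, int sign)
{
    assert(dataW > 0 && kernelR >= 0);
    for (int x = 0; x < dataW; x++) {
        float sum = 0.0f;
        for (int k = -kernelR; k <= kernelR; k++) {
            int d = x + k;
            /* Samples outside the row are treated as zero, as in the SDK code. */
            if (d >= 0 && d < dataW)
                sum += data[d] * kernel[kernelR + sign * k];
        }
        result[x] = sum;
    }
}

/* Hypothetical demo: feeds a unit impulse through both variants.
 * Both output arrays must hold 5 floats. */
static void demoImpulse(float *minusK, float *plusK)
{
    const float data[5]   = {0.0f, 0.0f, 1.0f, 0.0f, 0.0f}; /* unit impulse */
    const float kernel[3] = {1.0f, 2.0f, 3.0f};             /* asymmetric on purpose */
    rowFilter(minusK, data, kernel, 5, 1, -1);
    rowFilter(plusK,  data, kernel, 5, 1, +1);
}
```

Working this through by hand, the kernelR - k variant gives {0, 1, 2, 3, 0}, so the impulse reproduces the kernel in its stored order, which is what the usual definition of convolution (with a flipped kernel) produces. The kernelR + k variant gives {0, 3, 2, 1, 0}, the kernel mirrored, which looks like correlation rather than convolution. If the kernel is symmetric about its center, as a Gaussian would be, both variants should give identical results.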
Thanks in advance.