ConvolutionSeparable: Gaussian blur to mean filter

I was looking for a quick way to apply a mean filter to an image and found this website (http://suja-cuda-optimization.synthasite.com), which takes the ConvolutionSeparable example from the SDK and optimizes it with some tweaks, such as separating the color channels into 3 arrays and then applying the filter to each of them separately (http://suja-cuda-optimization.synthasite.c…—version2.php). I understand all of the code with the exception of the filter algorithm, which I find very confusing.

This generates the Gaussian blur filter kernel:

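// kernelSum is assumed to be initialized to 0 before this loop
for ( i = 0; i < KERNEL_W; i++ )
{
	// dist is the tap's offset from the kernel center, in units of KERNEL_RADIUS
	float dist = (float)(i - KERNEL_RADIUS) / (float)KERNEL_RADIUS;
	h_Kernel[i] = expf (-dist * dist / 2);
	kernelSum += h_Kernel[i];
}

// Normalize so that the weights sum to 1
for ( i = 0; i < KERNEL_W; i++ )
	h_Kernel[i] /= kernelSum;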

This is the reference row convolution filter, to compare with its GPU counterpart:

extern "C" void convolutionRowCPU(float *h_Result, float *h_Data, float *h_Kernel, int dataW, int dataH, int kernelR)
{
	int x, y, k, d;
	float sum;

	for(y = 0; y < dataH; y++)
		for(x = 0; x < dataW; x++){
			sum = 0;
			for(k = -kernelR; k <= kernelR; k++){
				d = x + k;
				// Taps that fall outside the row are skipped (treated as zero)
				if(d >= 0 && d < dataW)
					// kernelR - k flips the kernel; no effect for a symmetric kernel
					sum += h_Data[y * dataW + d] * h_Kernel[kernelR - k];
			}
			h_Result[y * dataW + x] = sum;
		}
}

And the column one:

extern "C" void convolutionColumnCPU(float *h_Result, float *h_Data, float *h_Kernel, int dataW, int dataH, int kernelR)
{
	int x, y, k, d;
	float sum;

	for(y = 0; y < dataH; y++)
		for(x = 0; x < dataW; x++){
			sum = 0;
			for(k = -kernelR; k <= kernelR; k++){
				d = y + k;
				// Taps that fall outside the column are skipped (treated as zero)
				if(d >= 0 && d < dataH)
					// kernelR - k flips the kernel; no effect for a symmetric kernel
					sum += h_Data[d * dataW + x] * h_Kernel[kernelR - k];
			}
			h_Result[y * dataW + x] = sum;
		}
}

Initially, I thought modifying the filter kernel (not the CUDA kernel) would be sufficient, but it seems it’s more complicated than that. Perhaps someone can help me figure out how to “downgrade” the code to a mean filter.

Thank you.

Anyone?

  1. Check what separable means, exactly, on page 4 of convolutionSeparable.pdf (doc folder in the src of the SDK). Btw, the doc says that the code example in the SDK uses a separable Gaussian filter; however, it uses random values for the (identical) row and column vector.

  2. Separable means that you need one vector for the row convolution and one for the column convolution, and together they apply the filter. It works iff the 2D kernel has rank 1, i.e. every row of the kernel is a scalar multiple of the same vector.

The advantage is that the complexity of filtering is reduced, see the pdf. It is questionable whether a 3x3 filter is worth all the trouble.
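
To make the separability point concrete, here is a rough sketch (not from the SDK; outerProduct, taps2D and tapsSeparable are names made up for illustration). The 2D kernel that a row vector and a column vector apply together is their outer product, and the per-pixel cost drops from (2r+1)^2 multiply-adds to 2*(2r+1):

```c
#include <assert.h>

/* Build the 2D kernel that a column vector and a row vector apply together:
   kernel2D[r * w + c] = col[r] * row[c]. kernel2D must hold w * w floats. */
void outerProduct(const float *col, const float *row, int w, float *kernel2D)
{
    int r, c;
    assert(col != 0 && row != 0 && kernel2D != 0 && w > 0);
    for (r = 0; r < w; r++)
        for (c = 0; c < w; c++)
            kernel2D[r * w + c] = col[r] * row[c];
}

/* Multiply-adds per output pixel for one full 2D pass of radius kernelR. */
int taps2D(int kernelR)
{
    int w = 2 * kernelR + 1;
    return w * w;
}

/* Multiply-adds per output pixel for a row pass plus a column pass. */
int tapsSeparable(int kernelR)
{
    return 2 * (2 * kernelR + 1);
}
```

For radius 1 (a 3x3 kernel) that is 9 taps against 6, which is why the gain is small there; for radius 8 it is 289 against 34.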

  1. Assuming that mean filter means a 3x3 filter with every value 1/9, you can separate it into a row vector (1/3, 1/3, 1/3) and the same vector as column vector; in general, every entry is 1/(2*KERNEL_RADIUS+1).

  2. The rest is straightforward?
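
As a minimal sketch, the Gaussian generation loop could be swapped for something like the following. fillMeanKernel is a hypothetical helper name, not part of the SDK sample, and it assumes KERNEL_W equals 2*KERNEL_RADIUS+1, as the loop in the question suggests:

```c
#include <assert.h>

/* Fill a 1D mean (box) kernel of radius kernelR.
   h_Kernel must have room for 2 * kernelR + 1 floats
   (assumed to match KERNEL_W in the sample). */
void fillMeanKernel(float *h_Kernel, int kernelR)
{
    int i;
    int kernelW = 2 * kernelR + 1;
    float weight = 1.0f / (float)kernelW;

    assert(h_Kernel != 0 && kernelR >= 0);

    /* Every tap gets the same weight, so the weights already sum to 1
       and no separate normalization loop is needed. */
    for (i = 0; i < kernelW; i++)
        h_Kernel[i] = weight;
}
```

Calling fillMeanKernel(h_Kernel, KERNEL_RADIUS) in place of the two loops should be enough; the same vector is then used for both the row and the column pass.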

Let me know if it works.

Indeed, setting each value of the kernel to 1/(2*KERNEL_RADIUS+1) worked fine. Thanks.
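
For anyone who wants to double-check this on the CPU, here is a rough sketch that compares the two-pass result against a direct 2D mean. rowPass and columnPass follow the logic of the reference functions quoted above; directMean and maxMeanFilterError are made-up names for illustration, and the 64-tap limit is an arbitrary assumption:

```c
#include <assert.h>

/* Row pass, same logic as convolutionRowCPU: out-of-range taps are skipped. */
static void rowPass(float *dst, const float *src, const float *kernel, int w, int h, int r)
{
    int x, y, k;
    for (y = 0; y < h; y++)
        for (x = 0; x < w; x++) {
            float sum = 0.0f;
            for (k = -r; k <= r; k++) {
                int d = x + k;
                if (d >= 0 && d < w)
                    sum += src[y * w + d] * kernel[r - k];
            }
            dst[y * w + x] = sum;
        }
}

/* Column pass, same logic as convolutionColumnCPU. */
static void columnPass(float *dst, const float *src, const float *kernel, int w, int h, int r)
{
    int x, y, k;
    for (y = 0; y < h; y++)
        for (x = 0; x < w; x++) {
            float sum = 0.0f;
            for (k = -r; k <= r; k++) {
                int d = y + k;
                if (d >= 0 && d < h)
                    sum += src[d * w + x] * kernel[r - k];
            }
            dst[y * w + x] = sum;
        }
}

/* Direct 2D mean over a (2r+1) x (2r+1) window; pixels outside the image
   count as zero, matching the behavior of the two passes above. */
static float directMean(const float *src, int w, int h, int r, int x, int y)
{
    int dx, dy;
    float sum = 0.0f;
    for (dy = -r; dy <= r; dy++)
        for (dx = -r; dx <= r; dx++) {
            int xx = x + dx, yy = y + dy;
            if (xx >= 0 && xx < w && yy >= 0 && yy < h)
                sum += src[yy * w + xx];
        }
    return sum / (float)((2 * r + 1) * (2 * r + 1));
}

/* Runs row pass then column pass with a mean kernel (result left in dst)
   and returns the largest absolute difference from the direct 2D mean.
   tmp and dst must each hold w * h floats. */
float maxMeanFilterError(const float *src, float *tmp, float *dst, int w, int h, int r)
{
    float kernel[64];               /* arbitrary upper bound for this sketch */
    float maxErr = 0.0f;
    int i, x, y;
    int kernelW = 2 * r + 1;

    assert(r >= 0 && kernelW <= 64);
    for (i = 0; i < kernelW; i++)
        kernel[i] = 1.0f / (float)kernelW;

    rowPass(tmp, src, kernel, w, h, r);
    columnPass(dst, tmp, kernel, w, h, r);

    for (y = 0; y < h; y++)
        for (x = 0; x < w; x++) {
            float diff = dst[y * w + x] - directMean(src, w, h, r, x, y);
            if (diff < 0.0f)
                diff = -diff;
            if (diff > maxErr)
                maxErr = diff;
        }
    return maxErr;
}
```

One thing to keep in mind: because the reference code skips out-of-range taps without renormalizing, pixels near the border are averaged with implicit zeros and come out darker than a true local mean.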
