Launching kernels with “replaceable” constants
I am trying to write a convolution filter that I will use in several places in my project, each time with a different operator mask, and the masks will also differ in size.
My first approach looks like the following code. NVCC compiles it without error, but the kernel execution fails:
__constant__ float OperatorMask[25];

__global__ void ImageConvolution(uchar4* pResult, int rX, int rY, float OperatorMask[] )
{
    const int ix = blockDim.x * blockIdx.x + threadIdx.x;
    const int iy = blockDim.y * blockIdx.y + threadIdx.y;
    const float x = (float)ix + 0.5f;
    const float y = (float)iy + 0.5f;
    float4 tempResVal = {0, 0, 0, 0};
    int index = 0;
    for(float j = -rY; j <= rY; j++)
    {
        for(float i = -rX; i <= rX; i++)
        {
            float4 aktTexel = FromUchar4(tex2D(texRef_Image_1, x+i , y+j ));
            tempResVal.x += ( aktTexel.x * OperatorMask[index] );
            tempResVal.y += ( aktTexel.y * OperatorMask[index] );
            tempResVal.z += ( aktTexel.z * OperatorMask[index] );
            index++;
        }
    }
    *(pResult + iy * 704 + ix) = FromFloat4(tempResVal);
}
...
float op[5][5] =
    { { 0, 0, 1, 0, 0},
      { 0, 1, 2, 1, 0},
      { 1, 2, 6, 2, 1},
      { 0, 1, 2, 1, 0},
      { 0, 0, 1, 0, 0} };
cudaMemcpyToSymbol(OperatorMask, op, sizeof(op));
ImageConvolution<<<gridDim_2D_2,blockDim_2D_2>>>
    (tempIamge_2_uchar4, 2, 2, OperatorMask);
...
Does anyone have an idea how to create a kernel into which I can “insert” different operator masks (e.g. of size 3x3 or 6x6) without changing the kernel code?
Thanks for your help!
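
One possible direction, as a minimal sketch rather than a definitive fix: in the code above, the kernel parameter named OperatorMask shadows the __constant__ symbol, and the launch passes the host-side symbol where the kernel expects a device pointer. That is a likely reason the kernel fails. The sketch below drops that parameter, sizes the constant array for the largest mask expected, and lets the kernel read the symbol directly. Everything not in the original post is an assumption made for illustration: the MAX_MASK_ELEMENTS bound, the helpers MaskElementCount and SetOperatorMask, the pSource/width/height parameters (plain global memory with clamped borders instead of the texture reference), and the clamp-to-0..255 output conversion that stands in for FromFloat4.

```cuda
#include <cuda_runtime.h>

// Assumed upper bound on the mask size (here up to 7x7); not from the original post.
#define MAX_MASK_ELEMENTS 49

// The mask lives in constant memory and is read directly by the kernel.
// It is NOT passed as a kernel argument.
__constant__ float OperatorMask[MAX_MASK_ELEMENTS];

// Number of coefficients of a mask with radii rX and rY.
__host__ __device__ inline int MaskElementCount(int rX, int rY)
{
    return (2 * rX + 1) * (2 * rY + 1);
}

// Hypothetical host helper: uploads a row-major mask with (2*rY+1) rows and
// (2*rX+1) columns. Rejects masks that do not fit before touching the device.
inline cudaError_t SetOperatorMask(const float* mask, int rX, int rY)
{
    if (mask == nullptr || rX < 0 || rY < 0 ||
        rX > MAX_MASK_ELEMENTS || rY > MAX_MASK_ELEMENTS ||
        MaskElementCount(rX, rY) > MAX_MASK_ELEMENTS)
    {
        return cudaErrorInvalidValue;
    }
    return cudaMemcpyToSymbol(OperatorMask, mask,
                              MaskElementCount(rX, rY) * sizeof(float));
}

// Convert an accumulated channel value to a byte by clamping to 0..255
// (an assumed stand-in for the original FromFloat4 helper).
__device__ inline unsigned char ToByte(float v)
{
    return (unsigned char)fminf(fmaxf(v, 0.0f), 255.0f);
}

__global__ void ImageConvolution(uchar4* pResult, const uchar4* pSource,
                                 int width, int height, int rX, int rY)
{
    const int ix = blockDim.x * blockIdx.x + threadIdx.x;
    const int iy = blockDim.y * blockIdx.y + threadIdx.y;
    if (ix >= width || iy >= height)
        return;  // threads outside the image do nothing

    float3 sum = make_float3(0.0f, 0.0f, 0.0f);
    int index = 0;
    for (int j = -rY; j <= rY; ++j)
    {
        const int sy = min(max(iy + j, 0), height - 1);  // clamp at the border
        for (int i = -rX; i <= rX; ++i)
        {
            const int sx = min(max(ix + i, 0), width - 1);
            const uchar4 texel = pSource[sy * width + sx];
            const float w = OperatorMask[index++];
            sum.x += texel.x * w;
            sum.y += texel.y * w;
            sum.z += texel.z * w;
        }
    }
    // Alpha is set to 255 here, which is an assumption of this sketch.
    pResult[iy * width + ix] =
        make_uchar4(ToByte(sum.x), ToByte(sum.y), ToByte(sum.z), 255);
}
```

With this layout, switching masks would mean calling SetOperatorMask(&op[0][0], 2, 2) before the launch and passing the same radii to the kernel, for example ImageConvolution<<<grid, block>>>(dst, src, width, height, 2, 2). The kernel code itself stays unchanged. Note that radii can only describe odd-sized masks (3x3, 5x5, 7x7); an even-sized mask such as 6x6 has no center element, so it would need explicit mask width and height parameters plus a chosen anchor position.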