Extract a part of an image


I would like to apply a filter of size wk by hk to an image.
To do that, I first copy the part of the image handled by each block into a local (shared memory) array, and then apply my filter to that array. Along the border of the image I pad with zeros.
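To make the idea concrete, the zero-padding rule and the size of the local array could be sketched as below. The helper names `fetchZeroPadded` and `tileBytes` are hypothetical and only for illustration, and I am assuming an 8-bit, row-major image of size w by h. Strictly, a block of blockW by blockH threads needs (blockW + wk - 1) by (blockH + hk - 1) pixels; my kernel uses a row pitch of BLOCK_SIZE_X + wk, which is one column more than that and should also be fine as long as the allocation matches.

```cuda
#include <cstddef>
#include <cstdint>

// Hypothetical helper: read pixel (x, y) of a row-major w*h image,
// returning 0 for any coordinate outside the image (zero padding).
__host__ __device__ inline uint8_t fetchZeroPadded(const uint8_t* img,
                                                   int x, int y, int w, int h)
{
    return (x >= 0 && y >= 0 && x < w && y < h) ? img[x + y * w] : 0;
}

// Hypothetical helper: number of bytes of shared memory needed to hold one
// block's pixels plus the border required by a wk*hk filter.
__host__ __device__ inline size_t tileBytes(int blockW, int blockH, int wk, int hk)
{
    return size_t(blockW + wk - 1) * size_t(blockH + hk - 1) * sizeof(uint8_t);
}
```

For example, a 16 by 16 block with a 3 by 3 filter needs an 18 by 18 tile, so `tileBytes(16, 16, 3, 3)` is what would be passed as the dynamic shared memory size at launch.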

Here is my code:

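    extern __shared__ u_int8_t LocalStore[];

    // Top-left pixel of this block, and this thread's own pixel, in image coordinates
    int x = blockIdx.x*blockDim.x;
    int y = blockIdx.y*blockDim.y;
    int xglobal = x+threadIdx.x;
    int yglobal = y+threadIdx.y;
    // Half size of the filter (hard-coded for a 3x3 filter while testing)
    int iMidX = 1;//(wk-1)/2;
    int iMidY = 1;//(hk-1)/2;

    // Threads outside the image have nothing to do
    if( (xglobal)  >= w  || (yglobal)  >= h)
        return;

    // Copy the block's pixels plus the border into the local store
    int ygloball = yglobal-1;
    for(int j=threadIdx.y;j<(BLOCK_SIZE_Y+hk) ;j=j+blockDim.y)
    {
        int xgloball = xglobal-1;
        for(int i=threadIdx.x;i<(BLOCK_SIZE_X+wk) ;i=i+blockDim.x)
        {
            if(xgloball>=0 && ygloball>=0 && xgloball<w && ygloball<h)
                LocalStore[i+(j)*(BLOCK_SIZE_X+wk)] = (u8_ImageIn)[xgloball + (ygloball)*w];
        }
    }

    // For now, just write this thread's centre pixel back from the local store
    int ptrl = threadIdx.x+iMidX + (threadIdx.y+iMidY) *(BLOCK_SIZE_X+wk);
    u8_ImageOut[xglobal+yglobal*w] =  LocalStore[ptrl];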

The problem is that the output image is not exactly the same as the input image, so the content of my local store must be wrong.
Can someone help me with this?
And is there a better way to do that?
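
One possible direction, as a minimal sketch rather than a verified fix for the code above: compute the global coordinate of every local-store element from the loop indices, write an explicit 0 outside the image, let every thread of the block take part in the load (no early return), and place a `__syncthreads()` between the load and the read. The names `copyThroughTile` and `roundTripMatches` are hypothetical, and the sketch assumes odd filter sizes, a 16 by 16 block, and a row pitch of blockDim.x + wk - 1 instead of BLOCK_SIZE_X + wk.

```cuda
#include <cstdint>
#include <vector>
#include <cuda_runtime.h>

// Sketch: copy an image through a zero-padded shared-memory tile.
// wk and hk are the filter sizes (assumed odd).
__global__ void copyThroughTile(const uint8_t* in, uint8_t* out,
                                int w, int h, int wk, int hk)
{
    extern __shared__ uint8_t tile[];

    const int midX  = (wk - 1) / 2;
    const int midY  = (hk - 1) / 2;
    const int tileW = blockDim.x + wk - 1;
    const int tileH = blockDim.y + hk - 1;

    // Image coordinates of the tile's top-left corner, border included.
    const int x0 = blockIdx.x * blockDim.x - midX;
    const int y0 = blockIdx.y * blockDim.y - midY;

    // Every thread of the block helps with the load, including threads whose
    // own pixel is outside the image, so all of them reach the barrier.
    for (int j = threadIdx.y; j < tileH; j += blockDim.y) {
        for (int i = threadIdx.x; i < tileW; i += blockDim.x) {
            const int gx = x0 + i;
            const int gy = y0 + j;
            const bool inside = gx >= 0 && gy >= 0 && gx < w && gy < h;
            tile[i + j * tileW] = inside ? in[gx + gy * w] : 0;
        }
    }
    __syncthreads();

    // Only threads that map to a real pixel write a result.
    const int xg = blockIdx.x * blockDim.x + threadIdx.x;
    const int yg = blockIdx.y * blockDim.y + threadIdx.y;
    if (xg < w && yg < h) {
        out[xg + yg * w] = tile[(threadIdx.x + midX) + (threadIdx.y + midY) * tileW];
    }
}

// Hypothetical host wrapper: runs the kernel on a test pattern and reports
// whether the output equals the input. Error checking is omitted for brevity.
bool roundTripMatches(int w, int h, int wk, int hk)
{
    std::vector<uint8_t> src(size_t(w) * h), dst(size_t(w) * h, 0);
    for (size_t k = 0; k < src.size(); ++k) src[k] = uint8_t(k * 7 + 3);

    uint8_t *dIn = nullptr, *dOut = nullptr;
    cudaMalloc(&dIn, src.size());
    cudaMalloc(&dOut, dst.size());
    cudaMemcpy(dIn, src.data(), src.size(), cudaMemcpyHostToDevice);

    const dim3 block(16, 16);
    const dim3 grid((w + block.x - 1) / block.x, (h + block.y - 1) / block.y);
    const size_t shmem = size_t(block.x + wk - 1) * (block.y + hk - 1);
    copyThroughTile<<<grid, block, shmem>>>(dIn, dOut, w, h, wk, hk);

    cudaMemcpy(dst.data(), dOut, dst.size(), cudaMemcpyDeviceToHost);
    cudaFree(dIn);
    cudaFree(dOut);
    return src == dst;
}
```

Calling `roundTripMatches` with an image size that is not a multiple of the block size (for example 37 by 29 with a 3 by 3 filter) exercises the partial blocks on the right and bottom edges, where threads outside the image still have to help load the tile.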