I wrote a CUDA/C++ program that scans an image with a 20×20 block. It currently advances 20 pixels at a time in both columns and rows, but I want it to advance only 1 pixel at a time in each direction. For example, the first 20×20 block starts at (0,0) and the next one starts at (20,0), and the same happens along the other axis. What I want instead is that, after reading the 20×20 block at (0,0), the next block starts at (1,0).
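To make the intended traversal concrete, here is a minimal CPU sketch of the block positions described above. It is only an illustration under assumptions: the helper `windowOrigins` and its parameters `window` and `stride` are hypothetical names that do not appear in my program, and it assumes every block must stay fully inside the image.

```cpp
#include <utility>
#include <vector>

// Hypothetical helper (not part of the original program): returns the
// top-left (col, row) of every window x window block when the block advances
// by `stride` pixels in both directions and must stay fully inside the image.
std::vector<std::pair<int, int>> windowOrigins(int rows, int cols, int window, int stride)
{
    std::vector<std::pair<int, int>> origins;
    for (int row = 0; row + window <= rows; row += stride)
        for (int col = 0; col + window <= cols; col += stride)
            origins.emplace_back(col, row);  // stored as (col, row)
    return origins;
}
```

With `stride = 20` this gives the non-overlapping positions my kernel launch covers now; with `stride = 1` it gives the (rows - 19) × (cols - 19) overlapping positions I am after.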
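GPU function:
__global__ void _TEST_GPU(uchar* mt, uchar* motion, size_t step, int h, int w)
{
    // Global pixel coordinates handled by this thread
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    // step is the row pitch in bytes; for uchar that is also the pitch in elements
    int index = col + row * (step / sizeof(uchar));
    mt[index] = 255;
}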
Calling the GPU function:
dim3 block(20, 20);
// Integer division: each thread block covers one non-overlapping 20x20 tile
dim3 grid(image.cols / block.x, image.rows / block.y);
_TEST_GPU<<<grid, block>>>((uchar *)GMat.data, (uchar *)GMotionMat.data, GMat.step, dst.rows, dst.cols);