Hello everyone.
I am a beginner in CUDA programming. I am using a GeForce GTX 650 GPU for image processing, and I am experimenting with copying an image from the host to the GPU and back. The problem I am facing is in the indexing part of the code. I have shared a snippet of the code below.
***************************************** HOST *********************
int img_dimx=257;
int img_dimy=257;
// the image size is 257 * 257 = 66049 elements
int input[257][257];
int output[257][257];
dim3 threadID(32,32); // I want to use all the threads in a block, in a 2D layout.
dim3 blockID( (img_dimx + (threadID.x -1))/threadID.x , (img_dimy + (threadID.y -1))/threadID.y);
// from the above line, the grid size is 9x9 blocks
// total threads is 32*32*9*9 = 82944 threads
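The launch arithmetic above can be double-checked on the host. The following is a minimal plain C++ sketch, not CUDA code; `blocks_needed` and `total_threads` are hypothetical helper names introduced only for illustration. It reproduces the ceiling division used for the grid size and shows how many threads are launched for a 257x257 image with 32x32 blocks.

```cpp
#include <cassert>

// Hypothetical helper: number of blocks needed to cover n pixels with
// `block` threads per block (integer ceiling division), matching
// (img_dim + (threads - 1)) / threads from the host code.
int blocks_needed(int n, int block) {
    assert(block > 0);
    return (n + block - 1) / block;
}

// Hypothetical helper: total number of threads launched by a 2D grid
// that covers a width x height image with block_x x block_y blocks.
int total_threads(int width, int height, int block_x, int block_y) {
    return blocks_needed(width, block_x) * block_x *
           blocks_needed(height, block_y) * block_y;
}
```

With these helpers, `blocks_needed(257, 32)` gives 9 and `total_threads(257, 257, 32, 32)` gives 82944, so 82944 - 66049 = 16895 of the launched threads have no pixel of their own.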
***************************************** DEVICE *******************
int x = threadIdx.x + (blockIdx.x * blockDim.x);
int y = threadIdx.y + (blockIdx.y * blockDim.y);
int Gloc = x + (y * img_dimx); // ----------(1)
// int Gloc = x + (y * blockDim.x * gridDim.x); ----------(2)
dev_out[Gloc] = dev_in[Gloc];
When I use Gloc = x + (y * img_dimx); ----------(1)
if I have x=0 and y=1, then I get Gloc as 257. So in the output image it will be stored at location (0,1), i.e. 1st row and 0th column.
if I have x=3 and y=2, then I get Gloc as 517. So in the output image it will be stored at location (3,2), i.e. 2nd row and 3rd column.
But when I use Gloc = x + (y * blockDim.x * gridDim.x); ----------(2)
if I have x=0 and y=1, then I get Gloc as 288. So in the output image it will be stored at location (31,1), i.e. 1st row and 31st column.
if I have x=3 and y=2, then I get Gloc as 579. So in the output image it will be stored at location (65,2), i.e. 2nd row and 65th column.
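To make the two index formulas easier to compare, here is a small host-side C++ sketch of the same arithmetic. The function names and the `Pixel` struct are hypothetical, introduced only for this illustration, and the decoding step assumes a row-major image that is 257 elements wide.

```cpp
#include <cassert>

// Equation (1): the row stride is the image width.
int gloc_image_stride(int x, int y, int img_dimx) {
    return x + y * img_dimx;
}

// Equation (2): the row stride is the padded grid width,
// blockDim.x * gridDim.x (32 * 9 = 288 for this launch).
int gloc_grid_stride(int x, int y, int block_dim_x, int grid_dim_x) {
    return x + y * block_dim_x * grid_dim_x;
}

// Hypothetical helper type for a decoded pixel position.
struct Pixel {
    int col;
    int row;
};

// Decode a linear index into (column, row) of a row-major image
// that is img_dimx elements wide.
Pixel decode(int gloc, int img_dimx) {
    assert(img_dimx > 0);
    return Pixel{gloc % img_dimx, gloc / img_dimx};
}
```

For x=3, y=2, equation (1) gives 517, which decodes to column 3, row 2. Equation (2) gives 3 + 2 * 288 = 579, which decodes to column 65, row 2 in a 257-wide image, so the two formulas only agree when the image width equals blockDim.x * gridDim.x.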
The problem is that when I use eq. 1 the CUDA program hangs, and when I use eq. 2 it works correctly. Why is that? Shouldn't it be the other way around?
Also, eq. 2 doesn't hold if the image size is not a multiple of 32 or 16.
Why is this happening? Is there a standard way to get correct indexing regardless of whether the image size is odd or even?
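One commonly used pattern for image sizes that are not a multiple of the block size is to keep the image-width stride of eq. 1 and have each thread check that its (x, y) lies inside the image before touching memory. The sketch below emulates that pattern on the host in plain C++; it is not the kernel from this post. `guarded_copy` is a hypothetical name, and it assumes the input and output buffers hold exactly width * height elements. Under that assumption, any thread with x >= 257 or y >= 257 would index past the end of the buffers or into a neighbouring row if it were not skipped.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side emulation of a guarded copy "kernel". The four loops stand in
// for blockIdx.y, blockIdx.x, threadIdx.y and threadIdx.x.
// Returns the number of emulated threads that fell outside the image.
int guarded_copy(const std::vector<int>& in, std::vector<int>& out,
                 int width, int height, int block_x, int block_y) {
    assert(in.size() == static_cast<std::size_t>(width) * height);
    out.assign(in.size(), 0);

    const int grid_x = (width + block_x - 1) / block_x;   // ceiling division
    const int grid_y = (height + block_y - 1) / block_y;
    int skipped = 0;

    for (int by = 0; by < grid_y; ++by)
        for (int bx = 0; bx < grid_x; ++bx)
            for (int ty = 0; ty < block_y; ++ty)
                for (int tx = 0; tx < block_x; ++tx) {
                    const int x = tx + bx * block_x;
                    const int y = ty + by * block_y;
                    // Guard: threads past the right or bottom edge do nothing.
                    if (x >= width || y >= height) {
                        ++skipped;
                        continue;
                    }
                    const int gloc = x + y * width;  // stride = image width
                    out[gloc] = in[gloc];
                }
    return skipped;
}
```

In an actual CUDA kernel the same guard would be an `if (x < img_dimx && y < img_dimy)` around the copy, with the image dimensions passed in as kernel arguments. The guard does not depend on the block size, so the same indexing works for odd and even image sizes.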
Sincerely, Demonferrari