The code is below. THETA is a constant, and the kernel simply sets every element of outImage to 1. The expected output is all 1s, but I get NaN instead.
I suspect there is an error in the 3D thread ID computation, but I can't see where. Could you point it out?
dim3 dimBlock(BLOCK_SIZE, BLOCK_SIZE, BLOCK_SIZE);
dim3 dimGrid(iRotated / dimBlock.x , iRotated / dimBlock.y, THETA/dimBlock.z);
rotationD<<<dimGrid, dimBlock>>> (imageRotatedDevice, iRotated, iRotated);
__global__ void rotationD(float* outImage, int widthR, int heightR)
{
    int ix = blockIdx.x * BLOCK_SIZE + threadIdx.x;
    int iy = blockIdx.y * BLOCK_SIZE + threadIdx.y;
    int iz = blockIdx.z * BLOCK_SIZE + threadIdx.z;
    int idx = ix + iy * widthR + iz * widthR * heightR; // linear index into outImage
    outImage[idx] = 1.0f;
}
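
The index arithmetic itself looks consistent with a row-major 3D layout, so the NaNs may come from elements that are never written rather than from a wrong idx. Two possible causes, both guesses because the values of BLOCK_SIZE, THETA and iRotated are not shown: the integer divisions in dimGrid round down, so if a dimension is not a multiple of BLOCK_SIZE the trailing elements keep whatever was in the uninitialized device buffer; and BLOCK_SIZE cubed must not exceed the per-block thread limit (1024 on current GPUs), otherwise the launch fails and nothing is written at all. The sketch below rounds the grid up, guards against out-of-range threads, and checks the launch for errors. The names fillOnes, ceilDiv, fillAndCheck and depthR, and the value BLOCK_SIZE = 8, are assumptions made for illustration and are not from the original code.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Assumed value: 8 * 8 * 8 = 512 threads per block, within the 1024-thread limit.
#define BLOCK_SIZE 8

// Integer division rounded up, so partial blocks at the edges are still launched.
static int ceilDiv(int a, int b) { return (a + b - 1) / b; }

// Same indexing as the kernel above, plus a depth argument and a bounds check.
__global__ void fillOnes(float* outImage, int widthR, int heightR, int depthR)
{
    int ix = blockIdx.x * blockDim.x + threadIdx.x;
    int iy = blockIdx.y * blockDim.y + threadIdx.y;
    int iz = blockIdx.z * blockDim.z + threadIdx.z;

    // Threads in the padded part of the grid must not write.
    if (ix >= widthR || iy >= heightR || iz >= depthR) return;

    size_t idx = ix + (size_t)iy * widthR + (size_t)iz * widthR * heightR;
    outImage[idx] = 1.0f;
}

// Hypothetical host helper: runs the kernel and returns the number of
// elements that are not 1.0f, or -1 if any CUDA call failed.
long fillAndCheck(int widthR, int heightR, int depthR)
{
    size_t count = (size_t)widthR * heightR * depthR;
    float* d = nullptr;
    if (cudaMalloc(&d, count * sizeof(float)) != cudaSuccess) return -1;

    dim3 dimBlock(BLOCK_SIZE, BLOCK_SIZE, BLOCK_SIZE);
    dim3 dimGrid(ceilDiv(widthR, BLOCK_SIZE),
                 ceilDiv(heightR, BLOCK_SIZE),
                 ceilDiv(depthR, BLOCK_SIZE));
    fillOnes<<<dimGrid, dimBlock>>>(d, widthR, heightR, depthR);

    // An invalid launch configuration is reported here, not by the <<<>>> statement.
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) err = cudaDeviceSynchronize();

    std::vector<float> h(count);
    if (err == cudaSuccess)
        err = cudaMemcpy(h.data(), d, count * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d);

    if (err != cudaSuccess) {
        std::fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return -1;
    }

    long bad = 0;
    for (float v : h)
        if (v != 1.0f) ++bad;
    return bad;
}
```

Calling fillAndCheck(100, 100, 37) from main exercises sizes that are not multiples of BLOCK_SIZE; on a working device it should report no mismatches. In the original code, adding the same cudaGetLastError check right after the rotationD launch would show whether the kernel ran at all.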