#include <cstdio>

// Kernel: forwards the runtime bool to the __device__ function getCellIdx.
__global__ void kTest(GridInfo gridInfo, int3 test, unsigned int* res, bool cond) {
    *res = getCellIdx(gridInfo, test.x, test.y, test.z, cond);
}

void test(GridInfo gridInfo) {
    int3 test = make_int3(283, 10, 418);

    // Device buffer for the single result value.
    unsigned int* d_res;
    cudaMalloc(reinterpret_cast<void**>(&d_res), sizeof(unsigned int));

    kTest<<<1, 1>>>(gridInfo, test, d_res, true);

    // Blocking copy on the default stream; it waits for the kernel to finish.
    unsigned int h_res;
    cudaMemcpy(&h_res, d_res, sizeof(unsigned int), cudaMemcpyDeviceToHost);

    printf("%u\n", h_res);  // %u, since h_res is unsigned
    cudaFree(d_res);
}

I've run into a strange bug when passing parameters to a kernel function. kTest is the kernel, and its last parameter is a bool. kTest calls a __device__ function, getCellIdx, whose last parameter is the bool passed in from the kernel.
What I found is that if I pass true to the kernel (as in the code above), the result is incorrect. But if I hard-code the cond argument in the call to getCellIdx (i.e., *res = getCellIdx(gridInfo, test.x, test.y, test.z, true);), I get the correct result. Functionally, the two should be equivalent, right? Or am I missing something here?
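A side-by-side comparison of the two variants can be sketched as below, with error checks after every CUDA call so that a failed launch is not mistaken for a wrong result. This is only a sketch: the GridInfo struct, its dim field, the grid size, and the body of getCellIdx are hypothetical placeholders, since the real definitions are not shown here, and kCompare and CUDA_CHECK are names made up for this example.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Abort with a readable message if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                               \
    do {                                                               \
        cudaError_t err_ = (call);                                     \
        if (err_ != cudaSuccess) {                                     \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,         \
                    cudaGetErrorString(err_));                         \
            exit(EXIT_FAILURE);                                        \
        }                                                              \
    } while (0)

// HYPOTHETICAL stand-in: the real GridInfo is not shown in the post.
struct GridInfo {
    int3 dim;  // assumed: number of cells along each axis
};

// HYPOTHETICAL stand-in for the real getCellIdx: a linear cell index,
// where cond selects between two assumed memory layouts.
__device__ unsigned int getCellIdx(GridInfo g, int x, int y, int z, bool cond) {
    return cond ? static_cast<unsigned int>((z * g.dim.y + y) * g.dim.x + x)
                : static_cast<unsigned int>((x * g.dim.y + y) * g.dim.z + z);
}

// res[0]: cond arrives as a kernel argument; res[1]: cond is the literal true.
__global__ void kCompare(GridInfo gridInfo, int3 p, unsigned int* res, bool cond) {
    res[0] = getCellIdx(gridInfo, p.x, p.y, p.z, cond);
    res[1] = getCellIdx(gridInfo, p.x, p.y, p.z, true);
}

int main() {
    GridInfo gridInfo;
    gridInfo.dim = make_int3(512, 512, 512);  // assumed grid size
    int3 p = make_int3(283, 10, 418);         // same test point as in the post

    unsigned int* d_res = nullptr;
    CUDA_CHECK(cudaMalloc(reinterpret_cast<void**>(&d_res), 2 * sizeof(unsigned int)));

    kCompare<<<1, 1>>>(gridInfo, p, d_res, true);
    CUDA_CHECK(cudaGetLastError());       // reports launch failures
    CUDA_CHECK(cudaDeviceSynchronize());  // reports errors raised during execution

    unsigned int h_res[2] = {0, 0};
    CUDA_CHECK(cudaMemcpy(h_res, d_res, sizeof(h_res), cudaMemcpyDeviceToHost));
    CUDA_CHECK(cudaFree(d_res));

    const bool same = (h_res[0] == h_res[1]);
    printf("runtime cond: %u, literal true: %u (%s)\n",
           h_res[0], h_res[1], same ? "match" : "MISMATCH");
    return same ? 0 : 1;
}
```

Swapping the placeholders for the real GridInfo and getCellIdx would show whether the two calls disagree within a single launch, or whether one of the CUDA calls is failing silently.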
I see this bug on a machine with an RTX 2080 Ti GPU, nvcc V10.0.130, driver version 470.42.01, and CUDA version 11.4, but not on another machine with an RTX 2080, nvcc V11.3.109, driver version 465.19.01, and CUDA version 11.3.
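Since the two machines differ in both toolkit and driver, it may help to print what the program itself sees at run time. A minimal sketch, assuming device 0 is the GPU in question:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int runtimeVer = 0, driverVer = 0;
    // Versions are encoded as 1000 * major + 10 * minor.
    if (cudaRuntimeGetVersion(&runtimeVer) != cudaSuccess ||
        cudaDriverGetVersion(&driverVer) != cudaSuccess) {
        fprintf(stderr, "failed to query CUDA versions\n");
        return 1;
    }

    cudaDeviceProp prop;
    // Device 0 is an assumption; adjust on multi-GPU machines.
    cudaError_t err = cudaGetDeviceProperties(&prop, 0);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceProperties: %s\n", cudaGetErrorString(err));
        return 1;
    }

    printf("runtime (toolkit the binary was built with): %d.%d\n",
           runtimeVer / 1000, (runtimeVer % 1000) / 10);
    printf("driver supports up to CUDA:                  %d.%d\n",
           driverVer / 1000, (driverVer % 1000) / 10);
    printf("device 0: %s, compute capability %d.%d\n",
           prop.name, prop.major, prop.minor);
    return 0;
}
```

The runtime version printed here reflects the toolkit used to compile the binary, while the driver version is the newest CUDA release the installed driver supports, so the two can legitimately differ.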