Hello everyone!
I’d like to understand the differences between the following two implementations for iterating over an array.
Both kernels are launched like this:
array<<<blocks, threads, 0, 0>>>(input, tab_size);
Implementation 1
static __global__ void array(
    int* input,
    const unsigned int input_size)
{
    unsigned int index = blockIdx.x * blockDim.x + threadIdx.x;
    // Grid-stride loop: each thread handles every (blockDim.x * gridDim.x)-th element.
    while (index < input_size)
    {
        input[index] = 1; // apply something to the array
        index += blockDim.x * gridDim.x;
    }
}
Implementation 2
static __global__ void array(
    int* input,
    const unsigned int input_size)
{
    unsigned int index = blockIdx.x * blockDim.x + threadIdx.x;
    // One element per thread: threads past the end of the array do nothing.
    if (index < input_size)
    {
        input[index] = 1; // apply something to the array
    }
}
I already know that both work, but I’d like to know whether one implementation is more efficient than the other and, more importantly, why. I ran a few benchmarks, but the results seem quite similar.
The application I’m developing is very resource-intensive, so every performance improvement, even a small one, matters.
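For reference, a comparison like this could be timed with CUDA events roughly as follows. This is only a minimal sketch: the kernel names fill_grid_stride and fill_one_per_thread, the array size, the block size, and the run count are assumptions chosen for illustration, not values from the actual application.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Implementation 1 (hypothetical name): grid-stride loop.
__global__ void fill_grid_stride(int* input, const unsigned int input_size)
{
    unsigned int index = blockIdx.x * blockDim.x + threadIdx.x;
    while (index < input_size)
    {
        input[index] = 1;
        index += blockDim.x * gridDim.x;
    }
}

// Implementation 2 (hypothetical name): one element per thread.
__global__ void fill_one_per_thread(int* input, const unsigned int input_size)
{
    unsigned int index = blockIdx.x * blockDim.x + threadIdx.x;
    if (index < input_size)
    {
        input[index] = 1;
    }
}

int main()
{
    // Assumed sizes, for illustration only.
    const unsigned int tab_size = 1u << 24;
    const unsigned int threads = 256;
    const unsigned int blocks = (tab_size + threads - 1) / threads; // covers the whole array
    const int runs = 100;

    int* input = nullptr;
    if (cudaMalloc(&input, tab_size * sizeof(int)) != cudaSuccess)
    {
        std::printf("cudaMalloc failed\n");
        return 1;
    }

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    // Warm-up launches so one-time startup costs are not measured.
    fill_grid_stride<<<blocks, threads, 0, 0>>>(input, tab_size);
    fill_one_per_thread<<<blocks, threads, 0, 0>>>(input, tab_size);
    cudaDeviceSynchronize();

    // Time the grid-stride kernel over several launches.
    float ms_stride = 0.0f;
    cudaEventRecord(start, 0);
    for (int i = 0; i < runs; ++i)
        fill_grid_stride<<<blocks, threads, 0, 0>>>(input, tab_size);
    cudaEventRecord(stop, 0);
    cudaEventSynchronize(stop);
    cudaEventElapsedTime(&ms_stride, start, stop);

    // Time the one-element-per-thread kernel the same way.
    float ms_single = 0.0f;
    cudaEventRecord(start, 0);
    for (int i = 0; i < runs; ++i)
        fill_one_per_thread<<<blocks, threads, 0, 0>>>(input, tab_size);
    cudaEventRecord(stop, 0);
    cudaEventSynchronize(stop);
    cudaEventElapsedTime(&ms_single, start, stop);

    if (cudaGetLastError() != cudaSuccess)
        std::printf("a kernel launch failed\n");

    std::printf("grid-stride:    %f ms per launch\n", ms_stride / runs);
    std::printf("one-per-thread: %f ms per launch\n", ms_single / runs);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(input);
    return 0;
}
```

Note that when blocks * threads is at least tab_size, as in this sketch, each thread in the first kernel executes the loop body at most once, so both kernels do the same amount of work per thread.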
Thanks,
Kawa