Thread return on conditional

Hello. Is it safe to let threads that map outside an array return early on a conditional? For example, is it safe to do

int tid = threadIdx.x + blockDim.x * blockIdx.x;
if (tid >= size) return;
// do some useful job

instead of

int tid = threadIdx.x + blockDim.x * blockIdx.x;
if (tid < size)
{
   // do some useful job
}

How about __syncthreads()? It is unsafe to place a block synchronization point inside a conditional that evaluates to false for some threads of the same block. What happens if we do a block sync after some threads of the block have returned? For example:

int tid = threadIdx.x + blockDim.x * blockIdx.x;
if (tid >= size) return;
// do some useful job
__syncthreads();
// do some other useful job

Is this unsafe? What about multiple sync points after the return?

Thanks in advance.

As you suggested, it is fine to conditionally return from a kernel, unless you use __syncthreads() after some threads have returned (regardless of whether the remaining threads encounter __syncthreads() just once or multiple times). The rule is that each __syncthreads() must be encountered either by all threads of a block or by none of them.
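
To make that rule concrete, here is a minimal sketch of one way to restructure the kernel so that out-of-range threads skip the work but still reach the barrier. The kernel name add_right_neighbor, the host helper add_right_neighbor_host, and the block size kBlock are hypothetical names made up for this illustration, and error checking of the CUDA runtime calls is omitted for brevity.

```cuda
#include <cuda_runtime.h>
#include <vector>

constexpr int kBlock = 256;  // hypothetical block size; the shared tile is sized to match

// Each in-range thread adds its right-hand neighbour within the block's tile.
// Out-of-range threads do no useful work, but they do NOT return early,
// so every thread of the block reaches __syncthreads().
__global__ void add_right_neighbor(const float* in, float* out, int size)
{
    __shared__ float tile[kBlock];

    int tid = threadIdx.x + blockDim.x * blockIdx.x;
    bool in_range = tid < size;

    // Guard the load with a predicate instead of returning.
    tile[threadIdx.x] = in_range ? in[tid] : 0.0f;

    __syncthreads();  // encountered by all threads of the block

    if (in_range) {
        int next = threadIdx.x + 1;
        // The last thread of a block has no neighbour inside the tile.
        float right = (next < blockDim.x) ? tile[next] : 0.0f;
        out[tid] = tile[threadIdx.x] + right;
    }
}

// Host-side helper (an assumption for this sketch): copies the input to the
// device, launches with kBlock threads per block, and copies the result back.
std::vector<float> add_right_neighbor_host(const std::vector<float>& h_in)
{
    int size = static_cast<int>(h_in.size());
    std::vector<float> h_out(size);
    if (size == 0) return h_out;

    float* d_in = nullptr;
    float* d_out = nullptr;
    size_t bytes = size * sizeof(float);
    cudaMalloc(&d_in, bytes);
    cudaMalloc(&d_out, bytes);
    cudaMemcpy(d_in, h_in.data(), bytes, cudaMemcpyHostToDevice);

    int blocks = (size + kBlock - 1) / kBlock;  // round up, so the last block may be partial
    add_right_neighbor<<<blocks, kBlock>>>(d_in, d_out, size);

    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(h_out.data(), d_out, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return h_out;
}
```

With 300 input elements the second block is only partly in range; its surplus threads write a padding value to shared memory, wait at the barrier with everyone else, and then fall through the final if without touching global memory.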

If my memory serves me correctly, compute capability 1.x and 2.x devices behave slightly differently: 1.x devices may execute a return instruction conditionally, while on 2.x devices a conditional return is implemented as a conditional branch to the end of the kernel. This should not affect the correctness of your kernel, but it may have performance implications, as reconvergence points may be chosen differently.