strange loop problem

I have the following loop:

    const int tid      = blockDim.x * blockIdx.x + threadIdx.x;
    const int THREAD_N = blockDim.x * gridDim.x;
    int numiter = 16200;

    for (int iRng = tid; iRng < numiter; iRng += THREAD_N) {...}

THREAD_N is 8192 (= 64 * 128), and apparently not all the iterations are executed: it stops at 12196. If I instead launch 16384 threads (so THREAD_N = 16384), all the iterations are executed.

What could be the possible reasons? Running out of memory, perhaps?


How do you determine when the loops stop? For 16384 threads, some of the threads won't execute the loop at all (tid 16200 through 16383). For 8192 threads, some threads will execute 2 iterations (tid 0 through 8007), while the rest execute just 1. Which threads don't do the expected number of iterations in your case?
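To make the expected counts concrete, here is a small host-side sketch in plain C++ (no GPU needed) that replays the same strided loop for every tid. The function names `iterationsForThread` and `countMissedIndices` are made up for illustration and are not from your code. It only models the loop arithmetic, so it cannot reproduce a kernel that aborts early for some other reason.

```cpp
#include <cassert>
#include <vector>

// Number of iterations thread `tid` performs in
//   for (int i = tid; i < numIter; i += threadN) { ... }
// (hypothetical helper, host-side only)
int iterationsForThread(int tid, int numIter, int threadN) {
    assert(tid >= 0 && threadN > 0);
    if (tid >= numIter) return 0;                    // thread never enters the loop
    return (numIter - tid + threadN - 1) / threadN;  // ceil((numIter - tid) / threadN)
}

// Replays the loop for all threads and returns how many indices in
// [0, numIter) were NOT visited exactly once (0 means full coverage).
int countMissedIndices(int numIter, int threadN) {
    std::vector<int> hits(numIter, 0);
    for (int tid = 0; tid < threadN; ++tid)
        for (int i = tid; i < numIter; i += threadN)
            ++hits[i];
    int missed = 0;
    for (int h : hits)
        if (h != 1) ++missed;
    return missed;
}
```

With numiter = 16200, `countMissedIndices` returns 0 for both 8192 and 16384 threads, so the loop itself covers every index in either configuration. If the kernel really stops early, the per-thread counts from `iterationsForThread` are what I would compare against whatever each thread actually records on the device.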