Synchronizing cooperative groups inside a while loop

I have the code structure below, which uses cooperative groups.

The number of thread blocks launched is smaller than the total problem size, so the code does a grid-stride loop over the entire problem.

Each block performs a different number of computations (size).

When the total number of blocks of computation is a multiple of the number of thread blocks launched, the program runs without error.

However, when it is not a multiple, the execution returns garbage results.

All the threads in a block execute the same computations, so what is the mistake here?

while (timeStep < limit) {
	for (int l = blockIdx.x; l < totalNumberOfLs; l += gridDim.x) {
		int size;
		if (threadIdx.x == 0) {
			size = sizeStruct[l];
		}
		if (threadIdx.x < size) {
			// load to shared memory
		}
		__syncthreads();

		if (threadIdx.x < size) {
			// computation
		}
	}
	coopGrid.sync();

	for (int l2 = blockIdx.x; l2 < totalNumberOfLs; l2 += gridDim.x) {
		if (threadIdx.x == 0) {
			// load to shared memory
		}
		__syncthreads();

		if (threadIdx.x == 0) {
			// computation
		}
	}
	coopGrid.sync();

	timeStep++;
}
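Two things in the structure above may be worth ruling out first. `size` is an ordinary local variable, so only thread 0 ever assigns it and every other thread compares against an uninitialized value. There is also no barrier at the end of a loop iteration, so a fast thread can start overwriting shared memory for the next `l` while others are still computing. Below is a minimal, self-contained sketch of the first loop with both points addressed. It is not a confirmed diagnosis of the multiple/non-multiple symptom. The names `stepKernel`, `runDemo`, `kMaxSize`, `tile` and `data`, as well as the "reverse and increment" computation, are hypothetical stand-ins for the parts elided in the question.

```cuda
#include <cooperative_groups.h>
#include <cuda_runtime.h>
#include <cstdio>
#include <vector>

namespace cg = cooperative_groups;

constexpr int kMaxSize = 64;  // assumed upper bound on sizeStruct[l]; also used as block size

// Hypothetical stand-in kernel: work item l owns kMaxSize slots of `data`.
// Each time step reverses the first sizeStruct[l] entries and adds 1 to each.
__global__ void stepKernel(const int* sizeStruct, int* data,
                           int totalNumberOfLs, int limit) {
    cg::grid_group coopGrid = cg::this_grid();
    __shared__ int size;            // one copy per block, readable by every thread
    __shared__ int tile[kMaxSize];

    for (int timeStep = 0; timeStep < limit; ++timeStep) {
        for (int l = blockIdx.x; l < totalNumberOfLs; l += gridDim.x) {
            if (threadIdx.x == 0) size = sizeStruct[l];
            __syncthreads();        // make size visible to the whole block

            if (threadIdx.x < size) tile[threadIdx.x] = data[l * kMaxSize + threadIdx.x];
            __syncthreads();        // tile is fully loaded

            if (threadIdx.x < size)
                data[l * kMaxSize + threadIdx.x] = tile[size - 1 - threadIdx.x] + 1;
            __syncthreads();        // finish with size/tile before the next l overwrites them
        }
        // Outside the grid-stride loop, so every block reaches it once per time step
        // no matter how many work items that block processed.
        coopGrid.sync();
    }
}

// Launches the sketch with numBlocks blocks over 10 work items and checks the result.
bool runDemo(int numBlocks) {
    const int totalNumberOfLs = 10;
    const int limit = 4;  // even, so the reversals cancel and each active entry grows by limit
    std::vector<int> hSize(totalNumberOfLs), hData(totalNumberOfLs * kMaxSize);
    for (int l = 0; l < totalNumberOfLs; ++l) {
        hSize[l] = 1 + (l * 7) % kMaxSize;  // a different size per work item
        for (int t = 0; t < kMaxSize; ++t) hData[l * kMaxSize + t] = t;
    }

    int* dSize = nullptr;
    int* dData = nullptr;
    cudaMalloc(&dSize, hSize.size() * sizeof(int));
    cudaMalloc(&dData, hData.size() * sizeof(int));
    cudaMemcpy(dSize, hSize.data(), hSize.size() * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemcpy(dData, hData.data(), hData.size() * sizeof(int), cudaMemcpyHostToDevice);

    int n = totalNumberOfLs;
    int lim = limit;
    void* args[] = {&dSize, &dData, &n, &lim};
    // grid.sync() requires a cooperative launch; a plain <<<...>>> launch is not enough.
    cudaError_t err = cudaLaunchCooperativeKernel((void*)stepKernel, dim3(numBlocks),
                                                  dim3(kMaxSize), args, 0, nullptr);
    if (err == cudaSuccess) err = cudaDeviceSynchronize();
    cudaMemcpy(hData.data(), dData, hData.size() * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(dSize);
    cudaFree(dData);
    if (err != cudaSuccess) return false;

    for (int l = 0; l < totalNumberOfLs; ++l)
        for (int t = 0; t < kMaxSize; ++t) {
            int expected = (t < hSize[l]) ? t + limit : t;
            if (hData[l * kMaxSize + t] != expected) return false;
        }
    return true;
}

int main() {
    // 3 does not divide 10, 5 does; both cases are exercised.
    std::printf("3 blocks: %s\n", runDemo(3) ? "ok" : "mismatch");
    std::printf("5 blocks: %s\n", runDemo(5) ? "ok" : "mismatch");
    return 0;
}
```

This assumes a device that supports cooperative launch (compute capability 6.0 or newer) and a grid small enough for all blocks to be resident at once. The second loop in the question reuses shared memory across iterations in the same way, so it would probably want a matching `__syncthreads()` at the end of its body if more than thread 0 ever touches that memory.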