In a kernel where each iteration requires fewer threads than the last, is there any issue with some threads returning from the kernel while the other threads in the block continue to execute? Can the remaining threads still call __syncthreads()? Are the threads that return freed up to be used by other blocks?
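To make this concrete, here is a minimal sketch of the kind of kernel I have in mind: a block-wide sum where half as many threads do work on each pass of the loop. The names (`blockSum`, `BLOCK_SIZE`, `buf`) are just placeholders for illustration, and the block size is assumed to be a power of two. In this version the idle threads do not return; they stay alive and still reach every `__syncthreads()`. What I would like to know is whether they could return instead.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

#define BLOCK_SIZE 256  // assumption: a power of two

// Hypothetical example kernel: each block sums BLOCK_SIZE inputs.
// The number of threads doing useful work halves on every pass.
__global__ void blockSum(const float *in, float *out, unsigned int n)
{
    __shared__ float buf[BLOCK_SIZE];

    unsigned int tid = threadIdx.x;
    unsigned int gid = blockIdx.x * blockDim.x + tid;

    // Load one element per thread, padding with zero past the end.
    buf[tid] = (gid < n) ? in[gid] : 0.0f;
    __syncthreads();

    for (unsigned int stride = blockDim.x / 2; stride > 0; stride >>= 1) {
        if (tid < stride)
            buf[tid] += buf[tid + stride];
        // Threads with tid >= stride have nothing left to do, but they
        // are kept alive here so that every thread reaches the barrier.
        __syncthreads();
    }

    // Thread 0 writes the block's partial sum.
    if (tid == 0)
        out[blockIdx.x] = buf[0];
}

int main()
{
    const unsigned int blocks = 4;
    const unsigned int n = blocks * BLOCK_SIZE;

    std::vector<float> h_in(n, 1.0f);   // all ones, so sums are easy to check
    std::vector<float> h_out(blocks, 0.0f);

    float *d_in = nullptr, *d_out = nullptr;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_out, blocks * sizeof(float));
    cudaMemcpy(d_in, h_in.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    blockSum<<<blocks, BLOCK_SIZE>>>(d_in, d_out, n);

    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(h_out.data(), d_out, blocks * sizeof(float), cudaMemcpyDeviceToHost);

    for (unsigned int b = 0; b < blocks; ++b)
        printf("block %u partial sum = %f\n", b, h_out[b]);

    cudaFree(d_in);
    cudaFree(d_out);
    return 0;
}
```

The variant I am asking about would replace the `if (tid < stride)` guard with an early `return` for the threads that are no longer needed, leaving only the remaining threads to hit the barrier.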
Thanks,
/Chris