Is it possible to make one dimension of thread index wait for the other?

If I have a 2D thread index, is it possible to get the kernel to go through all of the x threads for the first y thread before moving on to the second y thread and going through all of the x threads again?

Many thanks,

I'm not quite sure what you mean. Threads mainly run in parallel, not in sequence, so I don't know why you would want to run groups of threads one after another. But if you need to do this, you could use if-statements so that all threads with “unwanted” IDs stay idle while another group of threads, say those with threadIdx.y == 0, does the work. Then call __syncthreads() after the if (outside it, so that every thread in the block reaches the call) to wait until all threads in the block have arrived at that point, and repeat the if for the next group. Keep in mind that __syncthreads() only synchronizes the threads within one block.
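Here is a minimal sketch of that if-plus-__syncthreads() pattern, in case it helps. The kernel name rows_in_order, the host helper run_rows_in_order and the toy computation are all made up for illustration; it assumes a single block whose width * height fits within the block size limit of your device. Each row reads what the previous row wrote in a neighbouring column, so the result is only correct if the rows really run one after another.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical kernel: row r of the block may only run after row r-1 has
// finished, because it reads a value written by a different thread of row r-1.
__global__ void rows_in_order(int *out)
{
    const int x = threadIdx.x;
    const int width = blockDim.x;

    for (int row = 0; row < blockDim.y; ++row) {
        if (threadIdx.y == row) {
            if (row == 0) {
                out[x] = x;  // first row just stores its x index
            } else {
                // read the neighbouring column of the previous row
                out[row * width + x] =
                    out[(row - 1) * width + (x + 1) % width] + 1;
            }
        }
        // Outside the if: every thread of the block must reach the barrier,
        // including the ones that stayed idle for this row.
        __syncthreads();
    }
}

// Hypothetical host helper: launches ONE block of width x height threads and
// returns the results, or an empty vector if a CUDA call fails.
std::vector<int> run_rows_in_order(int width, int height)
{
    std::vector<int> host(width * height, -1);
    const size_t bytes = host.size() * sizeof(int);

    int *dev = nullptr;
    if (cudaMalloc(&dev, bytes) != cudaSuccess) return {};

    rows_in_order<<<1, dim3(width, height)>>>(dev);

    // cudaMemcpy waits for the kernel to finish before copying back.
    cudaError_t err = cudaMemcpy(host.data(), dev, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dev);
    if (err != cudaSuccess) return {};
    return host;
}
```

Calling run_rows_in_order(32, 4) from your main() should give out[row * 32 + x] == (x + row) % 32 + row for every thread. This only orders threads inside one block; it says nothing about the order in which different blocks run.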
As for which warp a given thread belongs to, it works as follows:
Threads are numbered linearly, e.g. id = threadIdx.y * blockDim.x + threadIdx.x, and each run of 32 consecutive ids forms one warp. Say you set blockDim.x to 32.
Then all threads with threadIdx.y == 0 belong to the same warp and have a threadIdx.x ranging from 0 to 31.
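To make the mapping concrete, here is a small sketch that lets every thread write down the warp index it computes from its linear id. The names record_warps and warp_ids are made up for this example; the kernel only applies the id formula above together with the built-in warpSize, it does not query the hardware for the warp actually used.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical kernel: each thread stores the warp index derived from its
// linear thread id within the block.
__global__ void record_warps(int *warp_of)
{
    const int id = threadIdx.y * blockDim.x + threadIdx.x;
    warp_of[id] = id / warpSize;  // warpSize is 32 on current hardware
}

// Hypothetical host helper: launches ONE block of width x height threads and
// returns the warp index per linear thread id (empty vector on failure).
std::vector<int> warp_ids(int width, int height)
{
    std::vector<int> host(width * height, -1);
    const size_t bytes = host.size() * sizeof(int);

    int *dev = nullptr;
    if (cudaMalloc(&dev, bytes) != cudaSuccess) return {};

    record_warps<<<1, dim3(width, height)>>>(dev);

    // cudaMemcpy waits for the kernel to finish before copying back.
    cudaError_t err = cudaMemcpy(host.data(), dev, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dev);
    if (err != cudaSuccess) return {};
    return host;
}
```

With warp_ids(32, 4) each row of threads lands in its own warp (the warp index equals threadIdx.y). With warp_ids(16, 8) one warp spans two rows, so rows 0 and 1 share warp 0 and rows 2 and 3 share warp 1.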

Just have a look at the first few pages of the CUDA Programming Guide. Pages 9 and 10 of the Programming Guide 3.1 show an example.