Resident threads are the threads the GPU can hold in its on-chip memory at one time.
The GPU has a limited number of cores. It cannot execute millions of threads at the same time; at any instant it can only execute as many threads as it has cores.
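To put rough numbers on the difference between cores and resident threads, here is a small back-of-the-envelope sketch. The grouping of cores into multiprocessors and all three figures in `example` are assumptions made up for illustration; they do not describe any particular GPU.

```python
from dataclasses import dataclass


@dataclass
class GpuSpec:
    """Hypothetical device description; every field is an illustrative assumption."""
    multiprocessors: int
    cores_per_multiprocessor: int
    resident_threads_per_multiprocessor: int

    def total_cores(self) -> int:
        # Upper bound on threads executing in the same instant.
        return self.multiprocessors * self.cores_per_multiprocessor

    def max_resident_threads(self) -> int:
        # Upper bound on threads held on the chip, executing or not.
        return self.multiprocessors * self.resident_threads_per_multiprocessor


example = GpuSpec(multiprocessors=80,
                  cores_per_multiprocessor=64,
                  resident_threads_per_multiprocessor=2048)

print(example.total_cores())           # 5120
print(example.max_resident_threads())  # 163840
```

With these made-up figures the chip holds 32 times more threads than it can execute at once.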
However, the threads that are executing may stall for various reasons, for example while waiting on a memory access. So the GPU uses a little trick: it has additional on-chip storage that holds more threads than it has cores. These extra threads are not executing yet, but they are already initialized, so they can start executing at a moment's notice.
These are what is referred to as resident threads. Think of them as “on-chip threads”. A CPU typically saves a thread's context to memory when it switches threads; the GPU keeps the context of every resident thread on the chip instead.
So the GPU does not have to load threads from main memory. It can quickly switch to another resident thread and execute that one, a sort of hardware thread context switching.
Later it can return to the stalled threads and execute them once they are unblocked.
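The switching trick can be illustrated with a toy model of a single core. This is only a sketch under simplifying assumptions: `simulate_core`, the fixed stall length, and the "pick the first ready thread" policy are all invented for the example and are not how any real scheduler is specified.

```python
def simulate_core(resident_threads, instructions_per_thread=3, stall_cycles=4):
    """Toy model of one core switching between resident threads.

    Each thread issues one instruction, then stalls for `stall_cycles`
    cycles. Every cycle the core issues from the first resident thread
    that is ready; if none is ready, the cycle is wasted.
    Returns (busy_cycles, total_cycles).
    """
    remaining = [instructions_per_thread] * resident_threads
    ready_at = [0] * resident_threads  # cycle at which each thread is ready again
    cycle = 0
    busy = 0
    while any(r > 0 for r in remaining):
        for t in range(resident_threads):
            if remaining[t] > 0 and ready_at[t] <= cycle:
                remaining[t] -= 1
                ready_at[t] = cycle + 1 + stall_cycles
                busy += 1
                break  # at most one instruction issued per cycle
        cycle += 1
    return busy, cycle


print(simulate_core(1))  # (3, 11): the core idles while its only thread stalls
print(simulate_core(5))  # (15, 15): other resident threads fill every stall
```

With one resident thread the core is busy for 3 of 11 cycles; with five it is never idle, because there is always another resident thread ready to run.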
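If every resident thread is blocked waiting on something that only a non-resident thread can provide, the GPU will dead-lock.
So think of the GPU as a batch-based processor. Resident threads have to exit to make room for the rest, so the kernel's threads must all be able to run to completion. No GPU thread should wait on the result of another thread unless that thread is guaranteed to be resident, or it may dead-lock.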
For example, threads 1 to 10000 must not wait on the result of thread 1000000.
That would fill the GPU with threads 1 to 10000 (or whatever its maximum number of resident threads is), and it would never get to execute thread 1000000.
Thus threads 1 to 10000 would be waiting forever ;)
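The same scenario can be played out in a scaled-down toy model. This is a sketch, not a description of real hardware: `run_kernel` is a made-up helper, it assumes threads become resident one by one in index order (real GPUs schedule whole blocks and promise no particular order), and thread ids start at 0.

```python
def run_kernel(total_threads, max_resident, waits_on):
    """Toy model of resident-thread slots.

    `waits_on` maps a thread id to the id of the thread whose result it
    needs before it can exit. Returns True if every thread exits and
    False if the resident threads dead-lock.
    """
    finished = set()
    resident = set()
    next_thread = 0
    while len(finished) < total_threads:
        # Fill free slots with the next threads in launch order.
        while len(resident) < max_resident and next_thread < total_threads:
            resident.add(next_thread)
            next_thread += 1
        # A resident thread can exit if it waits on nothing,
        # or if the thread it waits on has already exited.
        can_exit = {t for t in resident
                    if waits_on.get(t) is None or waits_on[t] in finished}
        if not can_exit:
            return False  # every slot is held by a waiting thread
        finished |= can_exit
        resident -= can_exit
    return True


# Threads 0..3 all wait on thread 19, and only 4 threads fit on the chip.
blocked = {t: 19 for t in range(4)}
print(run_kernel(20, 4, blocked))   # False: thread 19 never gets a slot
print(run_kernel(20, 20, blocked))  # True: everything is resident at once
```

In the first call the four waiting threads occupy every slot, so thread 19 can never become resident. In the second call all 20 threads fit on the chip, thread 19 runs, and the waiters are released.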