What are the implications of having a potentially huge grid?
I ask because, if there is a limited number of registers that is divided equally among all threads, a large enough grid could exceed that limit, right?
Let’s say, for instance, that I have only 1 MP, which means a total of 8192 available registers. If I launch my kernel with <<<4,256>>>, that is 1024 threads, so each thread has 8 registers available. What if I launch it with <<<40,256>>>, meaning a total of 10240 threads? Since that number exceeds the 8192 registers available on my MP, will it start using local memory for what would otherwise be allocated in registers?
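To make the numbers concrete, here is the arithmetic I have in mind as a small host-side C++ sketch. The names `kRegistersPerMP`, `totalThreads`, and `registersPerThread` are made up for illustration. The even split across every thread in the grid is my assumption, not something I have confirmed about how the hardware allocates registers.

```cpp
// Worked arithmetic for the two launches in the question.
// Assumption (taken from the question, not verified): the register file of
// one multiprocessor is split evenly across every thread in the grid.

constexpr int kRegistersPerMP = 8192;  // figure quoted in the question

// Total threads created by a <<<blocks, threadsPerBlock>>> launch.
constexpr int totalThreads(int blocks, int threadsPerBlock) {
    return blocks * threadsPerBlock;
}

// Registers each thread would get under the even-split assumption.
// Integer division: a result of 0 means fewer than one register per thread.
constexpr int registersPerThread(int blocks, int threadsPerBlock) {
    return kRegistersPerMP / totalThreads(blocks, threadsPerBlock);
}
```

Under that assumption, `registersPerThread(4, 256)` comes out to 8, while `registersPerThread(40, 256)` rounds down to 0, which is the case I am asking about.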