I am a little confused about the numbers.
In the programming guide, Appendix A.1 (General Specifications), it says:
- The maximum number of threads per block is 512
- The maximum number of active threads per multiprocessor is 768
If I am not mistaken, a block is executed on a single multiprocessor. So why are these two numbers different? My guess is that a multiprocessor can run several blocks at once, for example 2 blocks, one with 512 threads and the other with 256 threads (768 in total). Is that right?
What happens if a block contains more than 512 threads, e.g. a 64x64 block? Are its threads scheduled to run serially?
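To make the two questions above concrete, here is the arithmetic I have in mind as a small host-side sketch. The two constants are the numbers quoted from the guide; the helper names `blockSizeIsWithinLimit` and `blocksThatFit` are my own, and the code assumes that the thread count is the only thing limiting how many blocks fit on a multiprocessor, which may well not be the whole story.

```cpp
#include <cassert>

// Limits quoted from appendix A.1 of the programming guide.
constexpr int kMaxThreadsPerBlock = 512;
constexpr int kMaxActiveThreadsPerMultiprocessor = 768;

// Hypothetical helper: does a block of this size respect the per-block limit?
bool blockSizeIsWithinLimit(int threadsPerBlock) {
    assert(threadsPerBlock > 0);
    return threadsPerBlock <= kMaxThreadsPerBlock;
}

// Hypothetical helper: how many equally sized blocks fit under the
// active-thread limit of one multiprocessor.
// Assumption: thread count is the only constraint considered here.
int blocksThatFit(int threadsPerBlock) {
    if (!blockSizeIsWithinLimit(threadsPerBlock)) {
        return 0;  // the block alone already exceeds the per-block limit
    }
    return kMaxActiveThreadsPerMultiprocessor / threadsPerBlock;
}
```

Under these assumptions, `blocksThatFit(256)` gives 3 and `blocksThatFit(512)` gives 1, while a 64x64 block has 4096 threads and fails the per-block check. The sketch only restates the limits; it does not tell me what actually happens when such a block is launched.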
Also, it says:
- The number of registers per multiprocessor is 8192
Can any processor access all of the registers, or does each processor have its own 8192/8 = 1024 registers?
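The two readings I am trying to choose between lead to quite different numbers, as this small sketch shows. The register count is the figure from the guide and 8 is the divisor from my question; the function names are my own, and the second function assumes the registers form one pool divided evenly among the active threads, which is exactly the part I am unsure about.

```cpp
#include <cassert>

// Figure quoted from appendix A.1 of the programming guide.
constexpr int kRegistersPerMultiprocessor = 8192;
// Divisor used in the question above (processors per multiprocessor).
constexpr int kProcessorsPerMultiprocessor = 8;

// Reading 1 (assumption): each processor owns a fixed slice of the registers.
int registersPerProcessor() {
    return kRegistersPerMultiprocessor / kProcessorsPerMultiprocessor;
}

// Reading 2 (assumption): the registers are one pool shared evenly
// among all threads active on the multiprocessor.
int registersPerThread(int activeThreads) {
    assert(activeThreads > 0);
    return kRegistersPerMultiprocessor / activeThreads;  // integer division
}
```

With the first reading each processor would have 1024 registers. With the second, 768 active threads would leave 10 registers per thread, and 512 active threads would leave 16.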