Hard limit of 64 regs/thread on Fermi?


Is there a hard limit of 64 registers per thread on Fermi? I have a program that uses more than 64 registers per thread on the GTX 285, but it appears to be limited to 64 registers per thread on Fermi (I’m using the GTX 470). Does anyone know more about this, or is there any documentation available? I see a reference to it here: http://forums.nvidia.com/index.php?showtopic=167735 but I can’t find anything official.


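Yes, this is correct. Strictly speaking, the compiler allocates at most 63 registers per thread on Fermi (compute capability 2.x), and anything beyond that is spilled to local memory. On the other hand, spilling registers to local memory is a lot faster on Fermi than on earlier GPUs, because local memory accesses go through the L1 cache.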
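If you want to see how close a kernel is to the limit and whether it spills, one option is to ask the runtime. The sketch below uses `cudaFuncGetAttributes` to read the compiled register count and per-thread local memory size, then requests the larger L1 configuration with `cudaFuncSetCacheConfig`. The kernel `heavy_kernel` is a hypothetical stand-in, not the original poster's code, and a trivial kernel like this one will of course report far fewer registers than a real one.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: a placeholder for a real, register-hungry kernel.
__global__ void heavy_kernel(const float* in, float* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float x = in[i];
        out[i] = x * x + 1.0f;
    }
}

int main()
{
    // Query what the compiler actually allocated for this kernel.
    cudaFuncAttributes attr;
    cudaError_t err = cudaFuncGetAttributes(&attr, heavy_kernel);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaFuncGetAttributes failed: %s\n",
                     cudaGetErrorString(err));
        return 1;
    }

    // numRegs is the register count per thread; localSizeBytes is the
    // per-thread local memory, which includes any spilled registers.
    std::printf("registers per thread   : %d\n", attr.numRegs);
    std::printf("local memory per thread: %zu bytes\n", attr.localSizeBytes);

    // Prefer the larger L1 split (48 KB L1 / 16 KB shared memory on Fermi),
    // which gives spilled values more cache to live in.
    err = cudaFuncSetCacheConfig(heavy_kernel, cudaFuncCachePreferL1);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaFuncSetCacheConfig failed: %s\n",
                     cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```

The same numbers are available at build time: compiling with `nvcc -Xptxas -v` prints register usage and spill stores/loads per kernel, and `-maxrregcount=N` lowers the cap if you want to trade registers for occupancy.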

Thanks for the response. Is there a reason for this limit, and might it change in future releases or architectures?

It certainly might change, although I don’t imagine it would go lower. It’s a constant balancing act between the sizes of the on-chip memories (the register file in particular), the registers available to each thread, and the number of threads resident on each multiprocessor.