GPU architecture and warp scheduling

txbob, it may be easier for you to ask inside NVIDIA.

It’s not clearly documented. In the Fermi whitepaper, the register file (page 8) was pictured as a monolith, but page 10 showed that the left scheduler executes only even warps, while the right scheduler executes only odd warps: http://www.nvidia.com/content/PDF/fermi_white_papers/NVIDIA_Fermi_Compute_Architecture_Whitepaper.pdf
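To make the even/odd split concrete, here is a minimal sketch of a static warp-to-scheduler assignment by warp ID parity. The function name `scheduler_for_warp` and the modulo rule are my own illustration of what the whitepaper picture suggests, not something NVIDIA documents as the actual hardware logic.

```python
# Hypothetical model: a warp is statically bound to a scheduler by its ID.
# With two schedulers this reproduces the Fermi picture:
# even warps -> scheduler 0 (left), odd warps -> scheduler 1 (right).
def scheduler_for_warp(warp_id: int, num_schedulers: int = 2) -> int:
    return warp_id % num_schedulers


if __name__ == "__main__":
    for warp_id in range(6):
        side = "left" if scheduler_for_warp(warp_id) == 0 else "right"
        print(f"warp {warp_id} -> {side} scheduler")
```

Under this model a warp never changes its scheduler for its whole lifetime, which is what the whitepaper diagram implies.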

The Kepler whitepaper also pictured the register file as a monolith: http://www.geforce.com/Active/en_US/en_US/pdf/GeForce-GTX-680-Whitepaper-FINAL.pdf

Fortunately, hardware.fr publications contained a more exact picture of each SM, in particular for Kepler: http://www.hardware.fr/medias/photos_news/00/44/IMG0044011_1.jpg

Finally, starting with Maxwell, NVIDIA fixed their picture and started to show the register file as separate for each scheduler. See page 8 in the document below:

http://international.download.nvidia.com/geforce-com/international/pdfs/GeForce_GTX_980_Whitepaper_FINAL.PDF

An individual register file per scheduler means that a warp cannot be quickly moved to another scheduler: that would require copying all of its register contents. The alternative is to make all registers available to all schedulers, but that would require 4x more read/write ports (a very precious resource) in the register file, and in that case it would be more logical to keep showing the register file as shared by all the schedulers.
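A rough back-of-the-envelope sketch of both costs, in case it helps. The 32 threads per warp and 32-bit registers are standard; the 64 registers per thread and the 2 ports per scheduler are illustrative assumptions of mine, and both helper names are hypothetical.

```python
WARP_SIZE = 32           # threads per warp
BYTES_PER_REGISTER = 4   # registers are 32-bit


def warp_register_bytes(registers_per_thread: int) -> int:
    """Bytes of register state that would have to be copied to migrate one warp."""
    return registers_per_thread * WARP_SIZE * BYTES_PER_REGISTER


def shared_file_ports(ports_per_scheduler: int, num_schedulers: int = 4) -> int:
    """Ports needed if one register file had to serve every scheduler at once."""
    return ports_per_scheduler * num_schedulers


if __name__ == "__main__":
    # Assumed kernel using 64 registers per thread.
    print(warp_register_bytes(64), "bytes to copy per migrated warp")
    # Assumed 2 ports per scheduler, purely for illustration.
    print(shared_file_ports(2), "ports for a file shared by 4 schedulers")
```

So even a modest kernel carries several kilobytes of register state per warp, which is why migrating a warp between schedulers would not be a cheap operation.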