Scheduler concept inside FERMI

sir_underground · March 24, 2011, 8:14pm

Hi,
I need confirmation of one thing. On page 10 of the Fermi white paper there is a conceptual diagram of the scheduler. Going by that diagram, does each instruction rectangle represent a package of instructions for every CUDA core? Or is it an instruction package for one CUDA core at a time? I'm trying to figure out exactly what happens.

hamster143 · March 24, 2011, 10:41pm

Each rectangle represents 16 copies of the same instruction sent to 16 cores within the same SM.

Because there are 32 cores in the SM, both schedulers can send a total of 32 instructions to cores at the same time.

To quote the programming guide: “At every instruction issue time, each scheduler issues:
- One instruction for devices of compute capability 2.0,
- Two instructions for devices of compute capability 2.1,
for some warp that is ready to execute, if any. The first scheduler is in charge of the warps with an odd ID and the second scheduler is in charge of the warps with an even ID. Note that when a scheduler issues a double-precision floating-point instruction, the other scheduler cannot issue any instruction. A warp scheduler can issue an instruction to only half of the CUDA cores. To execute an instruction for all threads of a warp, a warp scheduler must therefore issue the instruction over two clock cycles for an integer or floating-point arithmetic instruction.”
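
To make the arithmetic in that quote concrete, here is a rough toy model in Python of the compute capability 2.0 case. It is only a sketch: it assumes every warp is always ready, ignores double precision, latency, and the load/store and special function units, and treats the "first" scheduler as index 0. The names (`scheduler_for`, `cycles_to_issue`) are made up for illustration and are not part of any CUDA API.

```python
WARP_SIZE = 32        # threads per warp
CORES_PER_SM = 32     # CUDA cores per SM on compute capability 2.0
NUM_SCHEDULERS = 2    # warp schedulers per SM

# Each scheduler can reach only half of the CUDA cores.
CORES_PER_SCHEDULER = CORES_PER_SM // NUM_SCHEDULERS  # 16

# Cycles one scheduler needs to issue one integer or single-precision
# instruction for all 32 threads of a warp.
CYCLES_PER_WARP_INSTRUCTION = WARP_SIZE // CORES_PER_SCHEDULER  # 2


def scheduler_for(warp_id):
    """Index 0 handles odd warp IDs, index 1 handles even warp IDs (assumed numbering)."""
    return 0 if warp_id % 2 == 1 else 1


def cycles_to_issue(warp_ids):
    """Cycles to issue one arithmetic instruction for each listed warp.

    Toy model: both schedulers work in parallel, so the busier one
    determines the total.
    """
    busy = [0] * NUM_SCHEDULERS
    for warp_id in warp_ids:
        busy[scheduler_for(warp_id)] += CYCLES_PER_WARP_INSTRUCTION
    return max(busy)


if __name__ == "__main__":
    print(cycles_to_issue([0, 1]))  # one odd and one even warp
    print(cycles_to_issue([0, 2]))  # two even warps share a scheduler
```

In this model warps 0 and 1 go to different schedulers and finish issuing in 2 cycles, while warps 0 and 2 are both even, land on the same scheduler, and take 4 cycles. Real hardware has more going on than this, so treat it as a way to read the diagram rather than a performance model.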

sir_underground · March 25, 2011, 5:50pm

I understand! Thank you very much!

