Instruction scheduling in Ampere

Robert_Crovella · February 22, 2021, 10:16pm

I’m not aware that the details of this are fully exposed. The GA10X architecture (cc8.6) has 128 FP32 cores per SM, whereas the GA100 architecture (cc8.0) has 64 FP32 cores per SM. The split into two datapaths per SM partition dates from the Volta/Turing generation, where only one of the two could process FP32 and the other was limited to integer work; GA10X gets its doubled FP32 count by allowing FP32 on both datapaths. I think this statement from the reference rs277 gave you is trustworthy:

" * 4 warp schedulers.

An SM statically distributes its warps among its schedulers. Then, at every instruction issue time, each scheduler issues one instruction for one of its assigned warps that is ready to execute, if any."

Yes, I realize that doesn’t provide a complete description of how the SM works, exactly. Please see my statement here which governs how I respond to some questions.
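
To make the quoted statement a bit more concrete, here is a toy model of it in Python. It is only a sketch and not a description of the real hardware. The 4 schedulers and the per-SM FP32 core counts come from the text above. Everything else is an assumption for illustration: the `warp_id % 4` assignment, the "lowest ready warp first" pick, the fixed `latency` after which a warp becomes ready again, and the function names themselves.

```python
NUM_SCHEDULERS = 4  # "4 warp schedulers" from the quoted statement

# FP32 cores per SM for the two architectures mentioned above.
FP32_CORES_PER_SM = {"8.0": 64, "8.6": 128}


def fp32_cores_per_scheduler(cc):
    """FP32 cores available to each of the 4 schedulers (SM partitions)."""
    return FP32_CORES_PER_SM[cc] // NUM_SCHEDULERS


def assign_warps(num_warps):
    """Statically distribute warps among schedulers.

    ASSUMPTION: warp w goes to scheduler w % NUM_SCHEDULERS. The quoted
    text only says the distribution is static, not how it is chosen.
    """
    groups = [[] for _ in range(NUM_SCHEDULERS)]
    for w in range(num_warps):
        groups[w % NUM_SCHEDULERS].append(w)
    return groups


def simulate(num_warps, instrs_per_warp, latency):
    """Issue all instructions; return (cycles_used, per_cycle_issue_log).

    Each cycle, every scheduler issues at most one instruction, for one of
    its own warps that is ready. ASSUMPTIONS: a warp is ready again
    `latency` cycles after it issues, and the scheduler picks the
    lowest-numbered ready warp.
    """
    groups = assign_warps(num_warps)
    remaining = [instrs_per_warp] * num_warps
    ready_at = [0] * num_warps
    log = []
    cycle = 0
    while any(remaining):
        issued = []
        for sched, warps in enumerate(groups):
            for w in warps:
                if remaining[w] > 0 and ready_at[w] <= cycle:
                    remaining[w] -= 1
                    ready_at[w] = cycle + latency
                    issued.append((sched, w))
                    break  # one instruction per scheduler per cycle
        log.append(issued)
        cycle += 1
    return cycle, log


if __name__ == "__main__":
    for n in (4, 16):
        cycles, _ = simulate(num_warps=n, instrs_per_warp=2, latency=4)
        print(n, "warps:", cycles, "cycles")
```

In this model, one warp per scheduler leaves issue slots empty while the warp waits out its latency, whereas four warps per scheduler keep every scheduler issuing each cycle. That is the usual latency-hiding argument; it says nothing about how the real scheduler chooses among ready warps.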
