GPU architecture and warp scheduling

Robert_Crovella · February 9, 2018, 2:52pm

I said “most”.

Volta (sm_70) went back to single-issue, I believe (and also doubled the number of warp schedulers per SM compared to an sm_60 SM). NVIDIA discusses the reasons for this in presentations such as the GTC 2017 “Inside Volta” talk (you may have to listen to the recording).

Fermi cc 2.0 was not dual-issue either, although cc 2.1 was dual-issue capable. Kepler, Maxwell, and Pascal should all be dual-issue capable, I believe.

The Kepler description in the programming guide certainly indicates this:
Programming Guide :: CUDA Toolkit Documentation

Here’s a comment from Greg at NV indicating Kepler and Maxwell are dual-issue capable:

Understanding CUDA scheduling - CUDA Programming and Performance - NVIDIA Developer Forums

I admit that seems to contradict the wording of the programming guide's description of cc 5.0.

I don’t have a crisp explanation for every reference you have found, but thanks for pointing those out.
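
To keep the per-architecture recollections in this post straight, here is a small lookup sketch. It is only a restatement of the hedged statements in this post, not an authoritative table: `Issue` and `issue_capability` are hypothetical names I made up for illustration (they are not part of any CUDA API), and the cc 5.x entry in particular is subject to the programming guide contradiction mentioned in this post.

```cpp
// Sketch only: encodes the issue capability per compute capability as
// recalled in this thread. Names are hypothetical, not a CUDA API.
enum class Issue { Single, Dual, Unknown };

inline Issue issue_capability(int major, int minor) {
    // Fermi: cc 2.0 single-issue, cc 2.1 dual-issue capable (per this post).
    if (major == 2) {
        return (minor == 0) ? Issue::Single : Issue::Dual;
    }
    // Kepler (3.x), Maxwell (5.x), Pascal (6.x): believed dual-issue capable.
    // Assumption: the whole major version behaves the same way.
    if (major == 3 || major == 5 || major == 6) {
        return Issue::Dual;
    }
    // Volta sm_70: believed to be back to single-issue.
    if (major == 7 && minor == 0) {
        return Issue::Single;
    }
    // Anything not discussed in this post is left unclassified.
    return Issue::Unknown;
}
```

On a real device, the `major` and `minor` values would come from the fields of the same name in `cudaDeviceProp`, as filled in by `cudaGetDeviceProperties`; for example, `issue_capability(prop.major, prop.minor)`.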

I certainly would like to retract my statement about warp assignment. I agree that in Volta the indications are that it is static, with no migration. At some point I think this must have changed, but I’m not really sure. Maybe it has been static assignment all the way back to Fermi.
