I’m new to CUDA and confused about the relationship between the warp size and the number of CUDA cores.
It’s known that one SM of a GT200 GPU has 8 CUDA cores, and each core executes 4 threads of a warp (one instruction issued over 4 clock cycles), so the warp size is 8 × 4 = 32.
In a GTX 460 GPU, an SM has 48 CUDA cores. Why does it have the same warp size as the GT200?
Are there any idle CUDA cores in an SM?
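In case it helps, this is the minimal query I would use to read these numbers off the device (standard cudaGetDeviceProperties from the CUDA runtime API; device 0 is just an assumption):

    // query_warp.cu -- print warp size and SM count for device 0
    #include <stdio.h>
    #include <cuda_runtime.h>

    int main(void)
    {
        cudaDeviceProp prop;
        cudaError_t err = cudaGetDeviceProperties(&prop, 0);
        if (err != cudaSuccess) {
            fprintf(stderr, "cudaGetDeviceProperties: %s\n", cudaGetErrorString(err));
            return 1;
        }
        printf("Device:             %s\n", prop.name);
        printf("Compute capability: %d.%d\n", prop.major, prop.minor);
        printf("Warp size:          %d\n", prop.warpSize);   // 32 on both GPUs
        printf("SM count:           %d\n", prop.multiProcessorCount);
        return 0;
    }

As I understand it, both cards report a warp size of 32 even though the per-SM core counts differ, which is exactly what confuses me.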
Compute capability 2.1 devices are able to issue multiple (actually, 2) arithmetic instructions from the same warp in parallel.
Starting with compute capability 2.0 (e.g. GTX 480), instructions from two warps are issued in parallel, each to 16 cores (a half-warp at a time, so a full warp’s instruction executes over 2 cycles). In 2.1, one of the two warps executing in parallel is allowed to issue two arithmetic instructions, so that all 48 cores can be saturated.
Issuing multiple instructions per cycle isn’t actually new, by the way: 1.x devices could already issue an fmad and an fmul in parallel.
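To make the saturation arithmetic concrete, here is a rough host-only sketch (plain C). The per-SM core counts and issue widths are just the figures for the three parts discussed above, with 2.1’s third issue slot reflecting the dual-issue behaviour described earlier. It’s an illustration, not a hardware model:

    /* cores_vs_issue.c -- back-of-the-envelope check of the saturation
     * argument above; the numbers cover only these three parts. */
    #include <stdio.h>

    struct sm_config {
        const char *name;
        int cores;       /* CUDA cores per SM */
        int lane_width;  /* cores one issued warp instruction occupies per cycle */
        int issue_slots; /* warp instructions issuable per cycle */
    };

    int main(void)
    {
        const int warp = 32; /* identical on all three architectures */
        struct sm_config cfg[] = {
            { "GT200 (CC 1.3)",  8,  8, 1 }, /* one scheduler                */
            { "GF100 (CC 2.0)", 32, 16, 2 }, /* two schedulers, single-issue */
            { "GF104 (CC 2.1)", 48, 16, 3 }, /* one scheduler may dual-issue */
        };

        for (int i = 0; i < 3; ++i) {
            int cycles = warp / cfg[i].lane_width;         /* cycles per warp instruction */
            int needed = cfg[i].cores / cfg[i].lane_width; /* concurrent instructions to fill all cores */
            printf("%-15s %2d cores, %d cycles/warp instr., needs %d concurrent (can issue %d): %s\n",
                   cfg[i].name, cfg[i].cores, cycles, needed, cfg[i].issue_slots,
                   cfg[i].issue_slots >= needed ? "saturable" : "cores idle");
        }
        return 0;
    }

Running it shows that 2.1 only stays saturable because of that third issue slot: without dual-issue, 16 of the 48 cores would sit idle. So the cores aren’t meant to be idle, but keeping them all busy on 2.1 depends on the scheduler finding a second independent instruction in the warp.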