Unexpected behavior from maxrregcount

Cuda_Libre · July 19, 2010, 8:11pm

Hi

I have a kernel that uses 10 registers. If I compile it with --maxrregcount=16, it uses 9 registers instead.
Is this normal? I'd rather keep it at 10 registers: as I understand it, the optimizations triggered by maxrregcount often add a little overhead (and I'm not talking about register spilling).

Thanks!

PS: I am using CUDA 3.1.

tera · July 19, 2010, 11:28pm

Does the __launch_bounds__() directive (see appendix B.16 of the Programming Guide) produce a better result?
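
For reference, a minimal sketch of what that suggestion could look like. The kernel name scaleKernel, its body, and the bounds (256, 4) are made up for illustration and are not taken from the original poster's code. The qualifier applies per kernel, whereas --maxrregcount applies to every kernel in the file. The host code reads back the register count the compiler actually settled on through cudaFuncGetAttributes; compiling with nvcc -Xptxas -v reports the same figure at build time.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel, for illustration only.
// __launch_bounds__(256, 4) tells the compiler that the kernel is launched
// with at most 256 threads per block, and asks it to limit register use so
// that at least 4 blocks can be resident per multiprocessor.
__global__ void __launch_bounds__(256, 4)
scaleKernel(float *data, float factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        data[i] *= factor;
}

int main()
{
    // Query the compiled kernel's resource usage at run time.
    cudaFuncAttributes attr;
    cudaError_t err = cudaFuncGetAttributes(&attr, scaleKernel);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaFuncGetAttributes failed: %s\n",
                cudaGetErrorString(err));
        return 1;
    }

    printf("registers per thread:    %d\n", attr.numRegs);
    // A non-zero local size can indicate spilling (or local arrays).
    printf("local memory per thread: %zu bytes\n", attr.localSizeBytes);
    return 0;
}
```

Building the same file with and without the qualifier, or with different --maxrregcount values, and comparing the printed numbers is one way to see which setting the compiler actually honoured.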

Jimmy_Pettersson · July 19, 2010, 11:36pm

Here is one strange result among others, seen with CUDA 2.1:

regcount = 25, maxrregcount = 33 => kernel time = 80 ms
regcount = 25, maxrregcount = 54 => kernel time = 100 ms

I never had time to investigate further. Maybe someone knows why?
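
When comparing builds like the two above, it may help to rule out measurement noise first. The following is only a sketch of one common way to time a kernel with CUDA events: busyKernel, the problem size, and the number of runs are placeholders, not the code behind the figures quoted here. The idea is to do one warm-up launch, then average over several launches, and to build the same file once per --maxrregcount value.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Placeholder kernel standing in for the real one being measured.
__global__ void busyKernel(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float x = data[i];
        for (int k = 0; k < 1000; ++k)
            x = x * 1.0001f + 0.5f;
        data[i] = x;
    }
}

int main()
{
    const int n = 1 << 20;
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    const int runs = 20;

    float *d_data = NULL;
    if (cudaMalloc(&d_data, n * sizeof(float)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }
    cudaMemset(d_data, 0, n * sizeof(float));

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    // Warm-up launch so one-time initialization costs are not timed.
    busyKernel<<<blocks, threads>>>(d_data, n);
    cudaDeviceSynchronize();

    // Time several launches back to back and report the average.
    cudaEventRecord(start);
    for (int r = 0; r < runs; ++r)
        busyKernel<<<blocks, threads>>>(d_data, n);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    printf("average kernel time: %.3f ms\n", ms / runs);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d_data);
    return 0;
}
```

If the gap between two builds persists under this kind of measurement, comparing the nvcc -Xptxas -v output of both builds (registers and local memory per thread) would be a reasonable next step.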
