reducing the number of used registers

xargon · September 17, 2009, 2:28pm

Hello,

What are the strategies for reducing the number of registers used by a kernel? I have some code that I ported to CUDA, and even after much tinkering I cannot get the register count down (it currently stands at 38!). That restricts the number of concurrent threads quite severely, so I am unable to make full use of my card.

I packed a lot of variables into shared memory but I am still unable to get the register usage to an acceptable range. I looked at breaking up my kernel into multiple kernels but it seems quite impossible, I am afraid.

So, what should one look for when attempting to reduce register usage? Sometimes I am flummoxed by how the compiler decides to use registers (and there is a difference in optimization between Windows and Linux as well, for the same CUDA version).

Cheers,

/x
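To put the 38-register figure in context, here is a rough sketch of the arithmetic behind the complaint. The card is not named in the post, so the register file sizes used in the note below (8192 registers per multiprocessor on compute capability 1.0/1.1, 16384 on 1.2/1.3) are assumptions about typical 2009 hardware, and the function name is made up for illustration. The bound ignores register allocation granularity, shared memory, and block size, so the real limit can be lower.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper: upper bound on resident threads per multiprocessor
// when the register file is the only limiting resource.
//   regs_per_sm        - 32-bit registers available on one multiprocessor
//   regs_per_thread    - registers the compiler reports for the kernel
//   max_threads_per_sm - hardware cap on resident threads
inline int max_threads_by_registers(int regs_per_sm, int regs_per_thread,
                                    int max_threads_per_sm) {
    assert(regs_per_thread > 0);
    return std::min(regs_per_sm / regs_per_thread, max_threads_per_sm);
}
```

Under those assumptions, `max_threads_by_registers(16384, 38, 1024)` gives 431 and `max_threads_by_registers(8192, 38, 768)` gives 215, well below the hardware caps of 1024 and 768 threads.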

eyalhir74 · September 17, 2009, 2:45pm

Looks like you’ve mentioned most of the options :)

Maybe you can post the kernel code… Are you using doubles?

eyal

xargon · September 18, 2009, 8:26am

Nah, not using doubles. Currently, I am looking into maybe rewriting the whole thing to somehow split the damn thing into multiple kernels, but it is quite tough.

Tigga · September 18, 2009, 9:03am

I take it you already know about the -maxrregcount compiler flag (you don’t mention it). I have used compressed integer types with some success before (I seem to recall I had 4 8-bit integers packed into an int data type). Apart from that my only other suggestion would be rearranging code.

xargon · September 18, 2009, 11:05am

Yeah, I used the -maxrregcount option but that just kills my kernel :(

Yes, I will try and rearrange code. Would be a good exercise.

/x

Ailleur · September 18, 2009, 12:41pm

Reusing variables seems like it has helped some in the past.

nitin.life · September 19, 2009, 2:47am

Declare the per-thread variables (registers and local memory arrays) as volatile. It has helped me reduce register usage by a significant amount… every time.

NA

hdinh · September 21, 2009, 9:02pm

I thought bit arrays were not allowed? Is there an advantage to this over having four individual int8_t variables?

Tigga · September 22, 2009, 9:40am

I seem to recall that using one variable and doing bitwise operations to access it used fewer registers than declaring many variables - I think I tried many variables and it didn’t make a difference. Can’t find the code, unfortunately. Might have just been my special case.
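Since the original code was not found, the following is only a minimal sketch of what packing four 8-bit values into one 32-bit word with bitwise access might look like. The names `get_byte`, `set_byte`, and the `HD` macro are hypothetical; the macro just lets the same helpers compile as plain C++ or, under nvcc, as `__host__ __device__` functions. Whether this actually lowers the register count depends on the compiler and the kernel, as noted above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical convenience macro: adds CUDA qualifiers only when compiling
// with nvcc, so the helpers also build as ordinary C++.
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

// Read the 8-bit field at position idx (0..3) from a packed 32-bit word.
HD inline std::uint32_t get_byte(std::uint32_t packed, unsigned idx) {
    assert(idx < 4);
    return (packed >> (8u * idx)) & 0xFFu;
}

// Return a copy of packed with field idx replaced by the low 8 bits of value.
HD inline std::uint32_t set_byte(std::uint32_t packed, unsigned idx,
                                 std::uint32_t value) {
    assert(idx < 4);
    const unsigned shift = 8u * idx;
    return (packed & ~(0xFFu << shift)) | ((value & 0xFFu) << shift);
}
```

A kernel would then keep a single `std::uint32_t` per thread and call `set_byte` and `get_byte` wherever it previously touched four separate small counters.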
