Decreased register usage: a proposal
Easy way to decrease register usage

erlebach · May 26, 2008, 5:27pm

Hi,

I am having serious problems with register usage explosion. I have read
all the messages on the subject, and it is clear that optimizing register
usage is complex and takes many things into account (bank conflicts, etc.).
However, while you decrease bank conflicts, you might increase bandwidth
use in other parts of the code.

I would like to propose a simple solution to give the users “some” control.
Perhaps there could be compiler directives inserted into the CUDA code, such as

//BEGIN_REGISTER_OPTIMIZE

//END_REGISTER_OPTIMIZE

The compiler would only optimize register usage between such pairs of statements.
In the absence of such statements, the compiler would optimize the entire
code. This should be fairly easy to do, since the optimizer more than likely
already optimizes code between some initial statement and some end statement.
The point is that optimization could not cross these “barrier” statements.

Could somebody please comment on this idea’s feasibility? This would help many people
working with complex simulations. Thanks.

Gordon

erlebach · May 27, 2008, 5:16pm

Can anybody tell me why my proposal is impractical, not doable, etc.? Some feedback from some of the experts would be nice. Or some support :-)

Gordon

seibert · May 27, 2008, 11:30pm

While you wait for an expert to take notice of this thread, you might want to see if you can construct a kernel (or pair of kernels) which shows this huge jump in number of registers. I can almost guarantee that the NVIDIA folks will want to see that first. :)
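A minimal skeleton for such a reproducer might look like the sketch below. It is only an illustration: `step`, `callOnce`, and `callTwice` are hypothetical names, and `step` is a tiny stand-in for whatever large device function actually triggers the jump. The fallback macros at the top are an assumption added so the helper can also be compiled and checked on the host without nvcc.

```cpp
// Build with: nvcc --ptxas-options=-v repro.cu
// ptxas then prints the number of registers used by each kernel, so the
// two kernels below can be compared directly.

// Fallback so this sketch also compiles as plain C++ on the host.
// With nvcc, __CUDACC__ is defined and the real qualifiers are used.
#ifndef __CUDACC__
#define __host__
#define __device__
#define __global__
#endif

// Hypothetical stand-in for the real, much larger device function.
__host__ __device__ inline float step(float x)
{
    return 0.5f * x * (3.0f - x * x);
}

#ifdef __CUDACC__
// Calls the device function once per thread.
__global__ void callOnce(const float* in, float* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = step(in[i]);
}

// Calls the same device function twice per thread.
__global__ void callTwice(const float* in, float* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = step(step(in[i]));
}
#endif
```

With a function this small the two register counts will probably be close; the point is only the shape of the comparison, so the real function can be dropped in for `step`.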

Mu-Chi_Sung · May 28, 2008, 12:02am

Well, maybe it’s practical, but it’s not common practice to put such compiler control statements in comments… :D
Usually we would use #pragma blablabla to control compiler optimization flags. You can specify the optimization level this way, so I believe register allocation optimization could be controlled the same way. However, the actual problem here is not how to define or control such optimization but how to do this kind of register optimization. I believe they have some smart way to solve this already and just need some time to implement it. Let’s wait for CUDA 3.0! (Is it too far, far, far away? :D)
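For comparison, here is a rough sketch of pragma-style and attribute-style controls that CUDA does offer. None of them is the region-scoped register directive proposed in this thread. `#pragma unroll` is a real CUDA pragma, `-maxrregcount` is a real nvcc flag, and `__launch_bounds__` is a real qualifier, though it comes from toolkits later than this discussion. The names `sum4` and `sumKernel` and the bound of 256 threads are hypothetical, and the fallback macros are an assumption added so the helper can be checked on the host.

```cpp
// Whole-file cap on registers per thread (a real nvcc flag; 32 is only an
// example value): nvcc -maxrregcount=32 kernel.cu

// Fallback so this sketch also compiles as plain C++ on the host.
#ifndef __CUDACC__
#define __host__
#define __device__
#define __global__
#endif

// Hypothetical helper: sums four consecutive floats.
__host__ __device__ inline float sum4(const float* v)
{
    float s = 0.0f;
#ifdef __CUDA_ARCH__
#pragma unroll 1  // device code only: ask the compiler not to unroll this loop
#endif
    for (int k = 0; k < 4; ++k) s += v[k];
    return s;
}

#ifdef __CUDACC__
// __launch_bounds__(256) promises at most 256 threads per block, which lets
// the compiler choose a register budget for this kernel alone.
__global__ void __launch_bounds__(256)
sumKernel(const float* in, float* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = sum4(in + 4 * i);  // assumes in holds 4 * n floats
}
#endif
```

Whether keeping a loop rolled or bounding the launch size actually lowers the register count depends on the kernel and the compiler version, so the ptxas output is worth checking each time.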

erlebach · May 28, 2008, 4:09am

Thanks for the two replies. I don’t mind constructing an example kernel, but by definition it will have to be a rather large kernel. Register explosion typically happens only with large problems. It has been reported in this forum before. When calling a method multiple times, register usage is often greater than when calling it once. There has been discussion of using volatile to decrease usage (it has for me), and I even got a 30 percent speedup with the NVCC compiler option.

However, these techniques are very hardware dependent.
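For readers who have not seen the volatile trick mentioned above, a minimal sketch follows. It assumes the usual form of the trick, a `volatile` local holding a shared subexpression; `poly` and `polyKernel` are hypothetical names, not code from any actual simulation. The fallback macros are an assumption added so the helper can be checked on the host.

```cpp
// Fallback so this sketch also compiles as plain C++ on the host.
// With nvcc, __CUDACC__ is defined and the real qualifiers are used.
#ifndef __CUDACC__
#define __host__
#define __device__
#define __global__
#endif

// Hypothetical helper: evaluates a * x^2 + b * x^4.
__host__ __device__ inline float poly(float x, float a, float b)
{
    // Storing the shared subexpression in a volatile local asks the compiler
    // to compute it once and read it back at each use, rather than folding
    // the expression into every place it appears. Whether this lowers the
    // register count depends on the compiler version and the hardware.
    volatile float x2 = x * x;
    return a * x2 + b * x2 * x2;
}

#ifdef __CUDACC__
// Hypothetical kernel that applies poly() to every element.
__global__ void polyKernel(const float* in, float* out, int n, float a, float b)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = poly(in[i], a, b);
}
#endif
```

Comparing the ptxas register report with and without the `volatile` qualifier is the only reliable way to tell whether it helps for a given kernel.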

I know that optimization comments (or pragmas) inside code are not common
practice. However, languages like CUDA, with explicit control of so many
threads, are also not common practice. So common practice is not the reference.
There is precedent: OpenMP works through compiler directives. Basically,
I was just offering a quick solution.

Thanks,

Gordon
