Using shared memory as extra register space: doesn't work?

Ailleur · October 21, 2008, 6:40pm

I've tried using shared memory as a source of extra registers…

Assigning a “private” block of shared memory to every thread in a block and using it as an extension of register space.

Something like :

__shared__ float shValues[DIMBLOCK*4];

would hold 4 variables in shared memory instead of registers.

With no __syncthreads() whatsoever, so it wouldn't slow me down.

I figure I could run into some bank conflicts, but it'd still be faster than only running one block per MP.

But when I try to do that, the register usage actually goes up.

Does anybody know why?
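
For reference, a minimal, self-contained sketch of the layout described above. The block size (DIMBLOCK = 128), the grid size (NBLOCKS), the kernel name scratchDemo and the arithmetic are assumptions made up for illustration; only the shValues declaration and the threadIdx.x + k*DIMBLOCK indexing come from the post.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define DIMBLOCK 128   // assumed block size; the post does not give a value
#define NBLOCKS  4     // assumed grid size, for this demo only

// Slot k of thread t lives at shValues[t + k * DIMBLOCK], as in the post.
__global__ void scratchDemo(const float *in, float *out)
{
    __shared__ float shValues[DIMBLOCK * 4];

    const int t = threadIdx.x;
    const int g = blockIdx.x * DIMBLOCK + t;

    // Four per-thread values parked in shared memory instead of registers.
    shValues[t + 0 * DIMBLOCK] = in[g];
    shValues[t + 1 * DIMBLOCK] = in[g] * 2.0f;
    shValues[t + 2 * DIMBLOCK] = in[g] * 3.0f;
    shValues[t + 3 * DIMBLOCK] = in[g] * 4.0f;

    // No __syncthreads(): each thread reads back only what it wrote itself.
    out[g] = shValues[t + 0 * DIMBLOCK] + shValues[t + 1 * DIMBLOCK]
           + shValues[t + 2 * DIMBLOCK] + shValues[t + 3 * DIMBLOCK];
}

int main()
{
    const int n = DIMBLOCK * NBLOCKS;
    float h_in[n], h_out[n];
    for (int i = 0; i < n; ++i) h_in[i] = (float)i;

    float *d_in = NULL, *d_out = NULL;
    cudaMalloc((void **)&d_in,  n * sizeof(float));
    cudaMalloc((void **)&d_out, n * sizeof(float));
    cudaMemcpy(d_in, h_in, n * sizeof(float), cudaMemcpyHostToDevice);

    scratchDemo<<<NBLOCKS, DIMBLOCK>>>(d_in, d_out);
    cudaMemcpy(h_out, d_out, n * sizeof(float), cudaMemcpyDeviceToHost);

    // Each output should be in + 2*in + 3*in + 4*in = 10*in.
    int bad = 0;
    for (int i = 0; i < n; ++i)
        if (h_out[i] != 10.0f * h_in[i]) ++bad;
    printf("%d mismatches out of %d\n", bad, n);

    cudaFree(d_in);
    cudaFree(d_out);
    return bad != 0;
}
```

With this slot-major indexing, consecutive threads touch consecutive 32-bit words, so these particular accesses should not cause bank conflicts. Whether the register count actually drops is a separate question and depends on what the compiler does with the index arithmetic.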

–edit

Some more volatile fun.

If instead of accessing a shared register with

shValues[threadIdx.x+2*DIMBLOCK]

I use

volatile int two = 2;

shValues[threadIdx.x+two*DIMBLOCK]

then the register usage doesn't go up, or down. It's the same as if I had used a register.

So the “2” was stored in a register for the whole execution while the “two” is stored in some constant (hence the …volatile?!?) segment somewhere…

Which brings us back to… what the hell does volatile do?

–edit2

Some more register allocation fun!

volatile float alphamin = 0;

imin=ceilf((x1+alphamin*(x2-x1)-bx)/dx);

repeat for y and z and for x1<x2 and x2<x1…

This way I get 27 registers.

Stupid way to do things, right? alphamin is 0, just skip the whole calculation, right?

Wrong, it seems.

If I do

imin=ceilf((x1-bx)/dx);

then I hike to 30 registers…

Reimar · October 22, 2008, 6:46am

volatile on a local variable does not make much sense; better to make your shared memory array volatile.
Also, unless you are using -maxrregcount and/or the compiler is already using local memory (lmem), optimizing away one register might just mean the compiler optimizes some other code even more by using additional registers.
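
A minimal sketch of that suggestion, assuming a block size of 128 and a made-up kernel (volatileScratch) with placeholder arithmetic. The volatile qualifier sits on the shared array itself, so the compiler may not cache those values in registers. The build lines in the comment use nvcc's -Xptxas -v (prints per-kernel register usage) and -maxrregcount; the limit of 16 is an arbitrary example value.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define DIMBLOCK 128   // assumed block size, not given in the thread

// Build and inspect register usage, for example:
//   nvcc -Xptxas -v volatile_scratch.cu
//   nvcc -maxrregcount=16 -Xptxas -v volatile_scratch.cu   (16 is arbitrary)
__global__ void volatileScratch(const float *in, float *out)
{
    // volatile on the array: each access below must go to shared memory
    // rather than being kept in a register by the compiler.
    __shared__ volatile float shValues[DIMBLOCK * 2];

    const int t = threadIdx.x;

    shValues[t]            = in[t];           // slot 0 of this thread
    shValues[t + DIMBLOCK] = in[t] * in[t];   // slot 1 of this thread

    out[t] = shValues[t] + shValues[t + DIMBLOCK];
}

int main()
{
    float h_in[DIMBLOCK], h_out[DIMBLOCK];
    for (int i = 0; i < DIMBLOCK; ++i) h_in[i] = (float)i;

    float *d_in = NULL, *d_out = NULL;
    cudaMalloc((void **)&d_in,  sizeof(h_in));
    cudaMalloc((void **)&d_out, sizeof(h_out));
    cudaMemcpy(d_in, h_in, sizeof(h_in), cudaMemcpyHostToDevice);

    volatileScratch<<<1, DIMBLOCK>>>(d_in, d_out);
    cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);

    // Each output should be in + in*in.
    int bad = 0;
    for (int i = 0; i < DIMBLOCK; ++i)
        if (h_out[i] != h_in[i] + h_in[i] * h_in[i]) ++bad;
    printf("%d mismatches out of %d\n", bad, DIMBLOCK);

    cudaFree(d_in);
    cudaFree(d_out);
    return bad != 0;
}
```

Comparing the -Xptxas -v output with and without the volatile qualifier is probably the quickest way to see whether it makes any difference for a given kernel and compiler version.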

JHHPC · October 22, 2008, 9:19am

In my experience, the only way to force the kernel to make better use of registers is to explicitly reuse variables.
It produces very inconvenient code, but it works.
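
A sketch of what explicit reuse could look like, assuming a hypothetical kernel (firstIndex) that repeats the ceilf((x1 - bx) / dx) calculation from the first post for x, y and z. The kernel name, the float3/int3 parameters and the test values are all invented for illustration. Whether this changes the register count depends on the compiler version, so it is worth checking with nvcc -Xptxas -v rather than assuming.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: one index per axis, in the style of
// imin = ceilf((x1 - bx) / dx) from the first post.
__global__ void firstIndex(const float3 *p1, float3 b, float3 d,
                           int3 *out, int n)
{
    const int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    float tmp;   // a single temporary, explicitly reused for x, y and z
    int3 r;

    tmp = (p1[i].x - b.x) / d.x;  r.x = (int)ceilf(tmp);
    tmp = (p1[i].y - b.y) / d.y;  r.y = (int)ceilf(tmp);
    tmp = (p1[i].z - b.z) / d.z;  r.z = (int)ceilf(tmp);

    out[i] = r;
}

int main()
{
    const int n = 4;
    float3 h_p1[n];
    int3 h_out[n];
    for (int i = 0; i < n; ++i)
        h_p1[i] = make_float3(i + 0.25f, i + 1.25f, i + 2.25f);

    float3 *d_p1 = NULL;
    int3 *d_out = NULL;
    cudaMalloc((void **)&d_p1,  sizeof(h_p1));
    cudaMalloc((void **)&d_out, sizeof(h_out));
    cudaMemcpy(d_p1, h_p1, sizeof(h_p1), cudaMemcpyHostToDevice);

    // Assumed grid origin (0,0,0) and cell size 0.5 along every axis.
    firstIndex<<<1, 32>>>(d_p1, make_float3(0.0f, 0.0f, 0.0f),
                          make_float3(0.5f, 0.5f, 0.5f), d_out, n);
    cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);

    for (int i = 0; i < n; ++i)
        printf("point %d -> (%d, %d, %d)\n",
               i, h_out[i].x, h_out[i].y, h_out[i].z);

    cudaFree(d_p1);
    cudaFree(d_out);
    return 0;
}
```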
