Reuse of registers for variables that are no longer needed

Hi,
I have a newbie question. Suppose I have two arrays in a kernel, for example:

__global__ void Kernel(...){
double a [3];
double b [3];
...
}

I assume they are stored in registers. Now, the block of code that operates on array a is separate from the block of code that operates on b. Simply:

__global__ void Kernel(...){
double a [3];
double b [3];

OperationsOnA(a);
OperationsOnB(b);
}

Now the question is: is the register space allocated for both arrays equal to 3*sizeof(double) (the space of variables that are no longer used is recycled, so the space that held a[] is then occupied by b[]), or to 6*sizeof(double) (the space is not recycled, and a and b are stored independently in registers until the kernel terminates)?

Thanks,

Dalibor

Either and/or both. The register allocator is a very complicated beast and is combined with a lot of other heuristics in the compiler. In general, it is fairly aggressive at minimizing register usage.

This type of array declaration works fine in registers, but only if you always index with compile-time constants. If you index with a variable (other than the index of an unrolled loop), the arrays will be placed in local memory. I have one kernel with two 10-element arrays, and the sm_35 compiler puts both of them in registers. On sm_20, it spills them to local memory because the kernel goes over the 63-register limit.
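To make the difference concrete, here is a minimal sketch. The kernel names and the reportUsage helper are made up for illustration; cudaFuncGetAttributes and its numRegs and localSizeBytes fields come from the CUDA runtime API. Whether the second kernel really ends up in local memory depends on the compiler version and target architecture, so read the comments as the usual outcome, not a guarantee.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// After full unrolling every index into v[] is a compile-time constant,
// so the compiler is free to keep the three elements in registers.
__global__ void constantIndex(const double* in, double* out)
{
    double v[3];
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    #pragma unroll
    for (int k = 0; k < 3; ++k) v[k] = in[3 * i + k];
    out[i] = v[0] + v[1] + v[2];
}

// The index j is only known at run time, so v[] usually has to be
// addressable and is placed in local memory.
__global__ void runtimeIndex(const double* in, const int* pick, double* out)
{
    double v[3];
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    #pragma unroll
    for (int k = 0; k < 3; ++k) v[k] = in[3 * i + k];
    int j = pick[i];   // assumed to hold 0, 1, or 2
    out[i] = v[j];
}

// Prints registers and local memory per thread for both kernels.
// Returns 0 on success, -1 if the attributes cannot be queried
// (for example when no CUDA device is available).
int reportUsage()
{
    cudaFuncAttributes c, r;
    if (cudaFuncGetAttributes(&c, constantIndex) != cudaSuccess) return -1;
    if (cudaFuncGetAttributes(&r, runtimeIndex) != cudaSuccess) return -1;
    printf("constantIndex: %d regs, %zu bytes local\n", c.numRegs, c.localSizeBytes);
    printf("runtimeIndex:  %d regs, %zu bytes local\n", r.numRegs, r.localSizeBytes);
    return 0;
}
```

Calling reportUsage() from main prints the numbers for whatever architecture the file was built for. Compiling with nvcc -Xptxas -v should show similar per-kernel figures at build time.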

I got it, thanks for the reply.
I am going to try Nsight Eclipse for the first time for further register optimization. Is it a good tool for this task? Or is there another method for optimizing register usage?

I have a kernel where recycling the register space of unused variables is essential. That is why I ask.

You don’t need anything more than nvcc. It performs all of the register allocation with no input from the programmer. The only thing you can do is cap the number of registers it may use, which forces spills to local memory, either with the -maxrregcount option or with the __launch_bounds__ qualifier (see the programming guide for info on that).
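For reference, a minimal sketch of what the launch-bounds route looks like. The kernel boundedKernel, the helper runBounded, and the numbers 256 and 2 are made up for illustration; only the qualifier itself and the runtime calls are real.

```cuda
#include <cuda_runtime.h>

// Promise the compiler that blocks have at most 256 threads and ask for at
// least 2 resident blocks per multiprocessor. To honor that, the compiler may
// limit registers per thread and spill to local memory if it has to.
__global__ void __launch_bounds__(256, 2)
boundedKernel(const double* in, double* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i] * in[i];
}

// Squares 512 values on the device and checks them on the host.
// Returns 0 on success, 1 on a wrong result, -1 on any CUDA error.
int runBounded()
{
    const int n = 512;
    double h[n];
    for (int i = 0; i < n; ++i) h[i] = i;

    double* dIn = nullptr;
    double* dOut = nullptr;
    if (cudaMalloc(&dIn, n * sizeof(double)) != cudaSuccess) return -1;
    if (cudaMalloc(&dOut, n * sizeof(double)) != cudaSuccess) { cudaFree(dIn); return -1; }
    cudaMemcpy(dIn, h, n * sizeof(double), cudaMemcpyHostToDevice);

    // The block size must not exceed the 256 threads promised above,
    // otherwise the launch fails.
    boundedKernel<<<(n + 255) / 256, 256>>>(dIn, dOut, n);

    cudaError_t err = cudaMemcpy(h, dOut, n * sizeof(double), cudaMemcpyDeviceToHost);
    cudaFree(dIn);
    cudaFree(dOut);
    if (err != cudaSuccess) return -1;

    for (int i = 0; i < n; ++i)
        if (h[i] != static_cast<double>(i) * i) return 1;
    return 0;
}
```

The command-line alternative applies one cap to every kernel in the file, for example nvcc -maxrregcount=32 -Xptxas -v file.cu, where 32 is an arbitrary value and -Xptxas -v just prints the resulting register and spill figures.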