How can this code be amended to reduce register usage?

By commenting out sections of code and compiling with the ptxas option that reports register and memory use, I'm gaining insight into how many registers particular sections of my code require.

The following section apparently requires 11 registers: without it I need only 2, but with it I need 13.

How can this code be amended to reduce the number of registers and why?


u_min[i] = u[i];
temp_u = 0.0;
temp_u = -NSYM*p[i]*vx[i]/x[i]/rho[i];
if(u[i]<0.0) u[i] = 0.0;

v_min[i] = vx[i];
vx[i]+= dvx[i]*dt/2.0;

Be careful drawing conclusions this way. The optimizer is very good at dead-code removal: when calculations do not contribute in any way to a result that is written to global memory, all of those calculations are thrown away as well. Commenting out a section can therefore remove more from the compiled kernel than just that section.
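The dead-code effect can be sketched with two small kernels. This is only an illustration: the kernel names, the parameter lists and the file name dce_demo.cu are made up for the example, and the exact register counts that ptxas reports depend on the compiler version and target architecture.

```cuda
// Compile with verbose ptxas output to see per-kernel register use:
//   nvcc -Xptxas -v -c dce_demo.cu
// (dce_demo.cu is a hypothetical file name for this sketch)

// The product is computed but never written anywhere, so the compiler
// is free to drop both loads and the multiply. The reported register
// count for this kernel can be misleadingly low.
__global__ void not_stored(const float *p, const float *vx, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float t = p[i] * vx[i];   // dead: nothing depends on t
        (void)t;
    }
}

// Same arithmetic, but the result is written to global memory,
// so the loads and the multiply have to be kept.
__global__ void stored(const float *p, const float *vx, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = p[i] * vx[i];
    }
}
```

Comparing the two lines of ptxas output gives a rough idea of how much of a "saving" from commenting code out is really just eliminated work.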

I would do the following:

u_min[i] = u[i];

v_min[i] = vx[i];

temp_u = -NSYM*p[i]*vx[i]/x[i]/rho[i];

float half_dt = dt/2.0f; // keep the f suffix: a plain 2.0 is a double literal, so the division would be done in double precision on GT200


vx[i]+= dvx[i]*half_dt;

u[i] = fmaxf(u[i], 0.0f);
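For reference, here is how the suggested lines might look inside a complete kernel. This is a sketch under several assumptions that are not in the thread: all arrays are float, NSYM is a float constant (the value 2.0f is invented), temp_u is stored to a hypothetical output array du so that it is not optimized away, and there is one thread per element. The host code uses managed memory for brevity, which needs a much newer GPU and toolkit than GT200. Whether the register count actually drops has to be checked with ptxas.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define NSYM 2.0f   // assumed value; the thread does not say what NSYM is

__global__ void half_step(float *u, float *u_min, float *vx, float *v_min,
                          const float *dvx, const float *p, const float *x,
                          const float *rho, float *du, float dt, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    // Read each reused value once, then work on the local copy.
    float ui = u[i];
    float vi = vx[i];

    u_min[i] = ui;
    v_min[i] = vi;

    // du is a hypothetical output array standing in for temp_u.
    du[i] = -NSYM * p[i] * vi / x[i] / rho[i];

    float half_dt = 0.5f * dt;        // float literal, no double arithmetic
    vx[i] = vi + dvx[i] * half_dt;
    u[i]  = fmaxf(ui, 0.0f);          // clamp negative values to zero
}

int main()
{
    const int n = 4;
    float *buf;   // one managed allocation holding 9 arrays of n floats
    if (cudaMallocManaged(&buf, 9 * n * sizeof(float)) != cudaSuccess) return 1;
    float *u = buf,           *u_min = buf + n,     *vx  = buf + 2 * n,
          *v_min = buf + 3 * n, *dvx = buf + 4 * n, *p   = buf + 5 * n,
          *x = buf + 6 * n,     *rho = buf + 7 * n, *du  = buf + 8 * n;

    for (int i = 0; i < n; ++i) {
        u[i] = (i % 2) ? 2.0f : -1.0f;
        vx[i] = 8.0f; dvx[i] = 4.0f;
        p[i] = 1.0f;  x[i] = 2.0f;  rho[i] = 4.0f;
    }

    half_step<<<1, 32>>>(u, u_min, vx, v_min, dvx, p, x, rho, du, 0.5f, n);
    if (cudaDeviceSynchronize() != cudaSuccess) return 1;

    for (int i = 0; i < n; ++i)
        printf("u=%g u_min=%g vx=%g v_min=%g du=%g\n",
               u[i], u_min[i], vx[i], v_min[i], du[i]);

    cudaFree(buf);
    return 0;
}
```

Keeping u[i] and vx[i] in local variables makes it explicit that each is loaded once, and every store goes to global memory, so none of the arithmetic is a candidate for dead-code removal when measuring.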