Hi,

As I understand it, my kernel is supposed to store the result of a calculation in a register variable. Instead, it seems to redo the same calculation every time I refer to that variable.

My real calculations are much more complex, but to illustrate the issue I'll use a simplified example:

```
__global__ void myKernel(float arg1, float arg2, float arg3, float *globalArray)
{
    int idx = … ;

    // Intermediate results that should be computed once and reused
    float reg1 = __cosf(arg1) + arg2 * __sinf(arg1) / -arg3;
    float reg2 = __sinf(arg1) + arg3 * __cosf(arg1) / -arg2;

    float reg3 = reg1 * reg2 + 128.0f;
    float reg4 = reg1 / reg2 - 128.0f;
    globalArray[idx] = reg3 * reg4;
}
```

This kernel should compute two intermediate results and store them in reg1 and reg2. To calculate reg3 and reg4, I just want to reuse those precalculated results.

But for some reason there is no performance difference compared to the following kernel:

```
__global__ void myKernel(float arg1, float arg2, float arg3, float *globalArray)
{
    int idx = … ;

    // float reg1 = __cosf(arg1) + arg2 * __sinf(arg1) / -arg3;
    // float reg2 = __sinf(arg1) + arg3 * __cosf(arg1) / -arg2;

    // Same expressions as reg1 and reg2, written out in full each time
    float reg3 = (__cosf(arg1) + arg2 * __sinf(arg1) / -arg3) * (__sinf(arg1) + arg3 * __cosf(arg1) / -arg2) + 128.0f;
    float reg4 = (__cosf(arg1) + arg2 * __sinf(arg1) / -arg3) / (__sinf(arg1) + arg3 * __cosf(arg1) / -arg2) - 128.0f;
    globalArray[idx] = reg3 * reg4;
}
```

As mentioned, it seems that my kernel treats the variables reg1 and reg2 as some sort of inline functions rather than as stored results.
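For anyone who wants to reproduce the comparison, this is roughly how the two versions could be timed side by side with CUDA events. It is only a sketch: the names kernelWithTemps, kernelInlined, timeKernel and compareKernels, the 1D index computation, the launch configuration and the argument values are all assumptions for illustration, not my real code.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Version A: intermediate results kept in named temporaries.
__global__ void kernelWithTemps(float arg1, float arg2, float arg3, float *globalArray)
{
    // Assumed 1D indexing (hypothetical; the real index computation is not shown).
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    float reg1 = __cosf(arg1) + arg2 * __sinf(arg1) / -arg3;
    float reg2 = __sinf(arg1) + arg3 * __cosf(arg1) / -arg2;
    float reg3 = reg1 * reg2 + 128.0f;
    float reg4 = reg1 / reg2 - 128.0f;
    globalArray[idx] = reg3 * reg4;
}

// Version B: the same expressions written out in full each time.
__global__ void kernelInlined(float arg1, float arg2, float arg3, float *globalArray)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;  // same assumed indexing
    float reg3 = (__cosf(arg1) + arg2 * __sinf(arg1) / -arg3) * (__sinf(arg1) + arg3 * __cosf(arg1) / -arg2) + 128.0f;
    float reg4 = (__cosf(arg1) + arg2 * __sinf(arg1) / -arg3) / (__sinf(arg1) + arg3 * __cosf(arg1) / -arg2) - 128.0f;
    globalArray[idx] = reg3 * reg4;
}

// Launches a kernel once and returns the elapsed time in milliseconds,
// or -1.0f if the launch or synchronization failed.
template <typename Kernel>
float timeKernel(Kernel kernel, float *d_out, int blocks, int threads)
{
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start);
    kernel<<<blocks, threads>>>(0.5f, 2.0f, 3.0f, d_out);  // arbitrary test values
    cudaEventRecord(stop);
    cudaError_t err = cudaEventSynchronize(stop);
    if (err == cudaSuccess) err = cudaGetLastError();

    float ms = -1.0f;
    if (err == cudaSuccess) cudaEventElapsedTime(&ms, start, stop);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return ms;
}

// Times both versions and prints the results; returns true if both ran.
bool compareKernels()
{
    const int threads = 256;   // assumed launch configuration
    const int blocks  = 4096;
    float *d_out = nullptr;
    if (cudaMalloc(&d_out, sizeof(float) * blocks * threads) != cudaSuccess) return false;

    // Warm-up launches so one-time startup cost is not included in the timing.
    timeKernel(kernelWithTemps, d_out, blocks, threads);
    timeKernel(kernelInlined, d_out, blocks, threads);

    float msTemps   = timeKernel(kernelWithTemps, d_out, blocks, threads);
    float msInlined = timeKernel(kernelInlined, d_out, blocks, threads);
    printf("with temporaries: %f ms\n", msTemps);
    printf("inlined:          %f ms\n", msInlined);

    cudaFree(d_out);
    return msTemps >= 0.0f && msInlined >= 0.0f;
}
```

Calling compareKernels() from main prints one timing per version. If the compiler merges the repeated subexpressions on its own, the two numbers would be expected to come out about the same, which would match what I am seeing.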

Does anyone know how to avoid that?

Is it possible that this happens when I use too many register variables in my kernel?
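To check the register theory, the runtime API can report how many registers a compiled kernel actually uses. The sketch below queries that with cudaFuncGetAttributes; the helper name reportRegisterUsage and the 1D index computation are assumptions for illustration only.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__global__ void myKernel(float arg1, float arg2, float arg3, float *globalArray)
{
    // Assumed 1D indexing (hypothetical; the real index computation is not shown).
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    float reg1 = __cosf(arg1) + arg2 * __sinf(arg1) / -arg3;
    float reg2 = __sinf(arg1) + arg3 * __cosf(arg1) / -arg2;
    float reg3 = reg1 * reg2 + 128.0f;
    float reg4 = reg1 / reg2 - 128.0f;
    globalArray[idx] = reg3 * reg4;
}

// Prints the per-thread register and local-memory usage of myKernel.
// Returns the register count, or -1 if the query failed.
int reportRegisterUsage()
{
    cudaFuncAttributes attr;
    cudaError_t err = cudaFuncGetAttributes(&attr, myKernel);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaFuncGetAttributes failed: %s\n", cudaGetErrorString(err));
        return -1;
    }
    printf("registers per thread:    %d\n", attr.numRegs);
    // Non-zero local memory can indicate that values were spilled out of registers.
    printf("local memory per thread: %zu bytes\n", attr.localSizeBytes);
    return attr.numRegs;
}
```

Compiling with `nvcc -Xptxas -v` should print similar per-kernel register and spill information at build time, without running anything.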

Best regards, Rob