CUDA oddness

I recently ran into the following issue:

Testing was done on a single thread only, with no shared data whatsoever. The following code snippet works fine:

pColor->x = ( maxAge - age ) / maxAge;
pColor->y = ( maxAge - age ) / maxAge;
pColor->z = ( maxAge - age ) / maxAge;

But the following does not:

float temp = ( maxAge - age ) / maxAge;
pColor->x = temp;
pColor->y = temp;
pColor->z = temp;

The second version seems to write its result 12 bytes to the left of where it should, i.e. at a lower address. Has anyone seen the same behaviour?
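
For anyone who wants to check this, here is a minimal host-side sketch in plain C++, not the original kernel. In standard C++ the two forms should store the same values to the same addresses, so it can serve as a reference to compare device results against. All names here (Color3, Guarded, kGuard and the helper functions) are hypothetical and made up for illustration. Color3 is assumed to be laid out like CUDA's float3. The 12 bytes of guard values in front of the color are there to reveal a write that lands 12 bytes too early.

```cpp
#include <cassert>

// Hypothetical stand-in for the color type in the post (assumed float3-like).
struct Color3 { float x, y, z; };

// The color is surrounded by guard floats so a stray write on either side shows up.
struct Guarded {
    float  before[3];  // the 12 bytes directly in front of the color
    Color3 color;
    float  after[3];   // the 12 bytes directly behind the color
};

// Arbitrary sentinel value that the color computation should never produce.
const float kGuard = -12345.0f;

// First form from the post: the expression is repeated for each component.
void writeDirect(Color3* pColor, float age, float maxAge) {
    pColor->x = (maxAge - age) / maxAge;
    pColor->y = (maxAge - age) / maxAge;
    pColor->z = (maxAge - age) / maxAge;
}

// Second form from the post: the expression is evaluated once into a temporary.
void writeViaTemp(Color3* pColor, float age, float maxAge) {
    float temp = (maxAge - age) / maxAge;
    pColor->x = temp;
    pColor->y = temp;
    pColor->z = temp;
}

// Returns a block with all guards set to the sentinel and the color zeroed.
Guarded makeGuarded() {
    Guarded g;
    for (int i = 0; i < 3; ++i) {
        g.before[i] = kGuard;
        g.after[i]  = kGuard;
    }
    g.color.x = g.color.y = g.color.z = 0.0f;
    return g;
}

// True if no guard value on either side of the color was overwritten.
bool guardsIntact(const Guarded& g) {
    for (int i = 0; i < 3; ++i) {
        if (g.before[i] != kGuard || g.after[i] != kGuard) return false;
    }
    return true;
}
```

To use it, fill one Guarded block with each function and compare the results. The same guard layout could be copied into device memory and checked after the kernel runs; if the guards in front of the color change only for the second form, that would support the "12 bytes to the left" observation.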