Look at this code:
if (threadIdx.x == 0)
{
    temp1 = h * con[v];
    temp2 = 1.0 + con[v + 2];
    temp3 = -con[v + 2];
}
__syncthreads();
// ... other code that does not modify these variables
__syncthreads();
y[threadIdx.x] = temp1 * y[threadIdx.x] + temp2 * y2[threadIdx.x + 1] + temp3 * y3[threadIdx.x];
Here temp1, temp2 and temp3 are __shared__ double variables, con is a double array in constant memory, and y, y2 and y3 are __shared__ double arrays.
When I run this, y[i] is not calculated correctly: every y[i] comes out too large by, let's say, 0.0135. That seems to hint that one of the temp variables has not been calculated correctly. y, y2 and y3 are initialized correctly. I tried a lot of things, and now the strange effect: when I read the con[v] values from constant memory BEFORE the temp calculation reads them, everything is computed correctly:
if (threadIdx.x == 0)
{
    _y[0] = h * con[v];
    _y[1] = 1.0 + con[v + 2];
    _y[2] = -con[v + 2];
    temp1 = h * con[v];
    temp2 = 1.0 + con[v + 2];
    temp3 = -con[v + 2];
}
__syncthreads();
// ... other code that does not modify these variables
__syncthreads();
y[threadIdx.x] = temp1 * y[threadIdx.x] + temp2 * y2[threadIdx.x + 1] + temp3 * y3[threadIdx.x];
This version works, where _y is a double array in global memory. Strange, eh?
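In case it helps to narrow this down, here is a minimal host-side sketch of the same update that the kernel output could be compared against. It is plain C++, so it should also compile as host code under nvcc. The names reference_update, c0 and c2 are hypothetical; c0 and c2 stand in for con[v] and con[v + 2].

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side reference for the per-thread update done in the kernel.
// Hypothetical helper: c0 and c2 stand for con[v] and con[v + 2].
std::vector<double> reference_update(double h, double c0, double c2,
                                     const std::vector<double>& y,
                                     const std::vector<double>& y2,
                                     const std::vector<double>& y3)
{
    // y2 is read at index i + 1, so it needs one more element than y.
    assert(y2.size() >= y.size() + 1 && y3.size() >= y.size());

    // Same three coefficients that thread 0 writes to shared memory.
    const double temp1 = h * c0;
    const double temp2 = 1.0 + c2;
    const double temp3 = -c2;

    // One loop iteration corresponds to one thread (i == threadIdx.x).
    std::vector<double> out(y.size());
    for (std::size_t i = 0; i < y.size(); ++i)
        out[i] = temp1 * y[i] + temp2 * y2[i + 1] + temp3 * y3[i];
    return out;
}
```

When comparing against the device result, a small tolerance is probably safer than exact equality, since the GPU may fuse the multiplies and adds.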
Does anybody have an idea?