Hello,
I was playing around with the code for oclTriDiagonal and found strange behavior when I fill the d-vector with zeros. The result should be an x-vector of all zeros, but I get different, nonzero results depending on how many systems I calculate.
If I set the variables to this:
int num_systems = 1;
int system_size = 155;
I get this result (just a snippet):
***The following is the result of the equation set 0
153300889072603609041622800665673728.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
9581306186007745207791562491166720.000000
0.000000
0.000000
0.000000
0.000000
0.000000
.
.
.
If I set the number of systems to 2, I get this result instead:
***The following is the result of the equation set 0
-33.781109
33.781109
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
-5.340002
-0.825961
.
.
.
The CPU version that is used for comparison gets the correct (all-zero) results.
To fill the d-vector with zeros, I changed the choice==0 branch of the test_gen_cyclic function to:
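for (int j = 0; j < system_size; j++)
{
    a[j] = (float)j;        // sub-diagonal
    b[j] = (float)(j + 1);  // main diagonal
    c[j] = (float)(j + 1);  // super-diagonal
    d[j] = 0.0f;            // right-hand side, all zeros
    x[j] = 0.0f;
}
a[0] = 0.0f;                // first row has no sub-diagonal entry
c[system_size - 1] = 0.0f;  // last row has no super-diagonal entry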
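To double-check my expectation independently of the SDK sample, here is a minimal serial sketch of the Thomas algorithm using the same coefficients. This is not the SDK's CPU reference code. The names fill_system, thomas_solve and all_zero are made up for this post, and the solver does no pivoting. As long as no pivot is zero, a zero d-vector stays zero through the forward sweep, so the back substitution can only produce zeros.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: fills a, b, c, d with the same values as the
// modified choice==0 branch shown above.
void fill_system(std::size_t n, std::vector<float>& a, std::vector<float>& b,
                 std::vector<float>& c, std::vector<float>& d)
{
    a.assign(n, 0.0f); b.assign(n, 0.0f); c.assign(n, 0.0f); d.assign(n, 0.0f);
    for (std::size_t j = 0; j < n; j++)
    {
        a[j] = (float)j;
        b[j] = (float)(j + 1);
        c[j] = (float)(j + 1);
        d[j] = 0.0f;
    }
    if (n > 0)
    {
        a[0] = 0.0f;
        c[n - 1] = 0.0f;
    }
}

// Hypothetical serial Thomas solver without pivoting.
// Returns false if the sizes do not match or a pivot is exactly zero.
bool thomas_solve(const std::vector<float>& a, const std::vector<float>& b,
                  const std::vector<float>& c, const std::vector<float>& d,
                  std::vector<float>& x)
{
    const std::size_t n = b.size();
    if (n == 0 || a.size() != n || c.size() != n || d.size() != n) return false;

    std::vector<float> cp(n), dp(n);  // modified super-diagonal and right-hand side
    float denom = b[0];
    if (denom == 0.0f) return false;
    cp[0] = c[0] / denom;
    dp[0] = d[0] / denom;

    // Forward sweep: eliminate the sub-diagonal.
    for (std::size_t i = 1; i < n; i++)
    {
        denom = b[i] - a[i] * cp[i - 1];
        if (denom == 0.0f) return false;
        cp[i] = c[i] / denom;
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom;
    }

    // Back substitution.
    x.assign(n, 0.0f);
    x[n - 1] = dp[n - 1];
    for (std::size_t i = n - 1; i-- > 0;)
        x[i] = dp[i] - cp[i] * x[i + 1];
    return true;
}

// Hypothetical helper: true if every entry compares equal to zero.
bool all_zero(const std::vector<float>& v)
{
    for (float f : v)
        if (f != 0.0f) return false;
    return true;
}
```

Calling fill_system(155, a, b, c, d) followed by thomas_solve(a, b, c, d, x) and all_zero(x) gives a serial reference to hold the GPU output against.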
Has anyone else run into the same problem with oclTriDiagonal, or does anyone know of another implementation for solving tridiagonal systems of equations on the GPU?
Greetings,
Stefan