Hello,

I was playing around with the code for oclTriDiagonal and found some strange behavior when I fill the d-vector with zeros. The result should be an x-vector of zeros, but I get different results depending on how many systems I solve.

If I set the variables to this:

```
int num_systems = 1;
int system_size = 155;
```

I get this result (just a snippet):

```
***The following is the result of the equation set 0
153300889072603609041622800665673728.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
9581306186007745207791562491166720.000000
0.000000
0.000000
0.000000
0.000000
0.000000
.
.
.
```

If I set the number of systems to 2, I get this result:

```
***The following is the result of the equation set 0
-33.781109
33.781109
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
-5.340002
-0.825961
.
.
.
```

The CPU version that is used for comparison gets the correct results.

To fill the d-vector with zeros, I changed the choice==0 branch of the test_gen_cyclic function to:

```
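for (int j = 0; j < system_size; j++)
{
    a[j] = (float)j;
    b[j] = (float)(j + 1);
    c[j] = (float)(j + 1);
    d[j] = 0.0f;   // right-hand side: all zeros
    x[j] = 0.0f;   // clear the solution vector
}
// the first and last rows have only two entries each
a[0] = 0.0f;
c[system_size - 1] = 0.0f;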
```
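For anyone who wants to reproduce the expected result without the SDK, here is a minimal sketch of a serial solver using the Thomas algorithm (forward elimination and back substitution, no pivoting). It is not the CPU reference code that ships with the sample. The names `fill_test_system` and `thomas_solve` are made up for illustration, and the sketch assumes that a is the sub-diagonal, b the main diagonal, c the super-diagonal and d the right-hand side.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: fills one system the same way as the modified
 * generator above (assumed layout: a = sub-diagonal, b = main diagonal,
 * c = super-diagonal, d = right-hand side). */
void fill_test_system(float *a, float *b, float *c, float *d, int system_size)
{
    for (int j = 0; j < system_size; j++) {
        a[j] = (float)j;
        b[j] = (float)(j + 1);
        c[j] = (float)(j + 1);
        d[j] = 0.0f;
    }
    a[0] = 0.0f;
    c[system_size - 1] = 0.0f;
}

/* Hypothetical serial Thomas solver without pivoting.
 * Returns 0 on success, -1 on allocation failure or an exactly zero pivot.
 * The input arrays are left unmodified; the solution is written to x. */
int thomas_solve(const float *a, const float *b, const float *c,
                 const float *d, float *x, int n)
{
    assert(n > 0);

    float *cp = malloc((size_t)n * sizeof *cp);  /* modified super-diagonal */
    float *dp = malloc((size_t)n * sizeof *dp);  /* modified right-hand side */
    if (cp == NULL || dp == NULL) {
        free(cp);
        free(dp);
        return -1;
    }

    int ok = (b[0] != 0.0f);
    if (ok) {
        cp[0] = c[0] / b[0];
        dp[0] = d[0] / b[0];
    }

    /* Forward elimination: remove the sub-diagonal row by row. */
    for (int i = 1; ok && i < n; i++) {
        float denom = b[i] - a[i] * cp[i - 1];
        if (denom == 0.0f) {
            ok = 0;
            break;
        }
        cp[i] = c[i] / denom;
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom;
    }

    /* Back substitution: solve from the last row upwards. */
    if (ok) {
        x[n - 1] = dp[n - 1];
        for (int i = n - 2; i >= 0; i--)
            x[i] = dp[i] - cp[i] * x[i + 1];
    }

    free(cp);
    free(dp);
    return ok ? 0 : -1;
}
```

Calling `fill_test_system` and then `thomas_solve` on the same arrays gives the serial result to compare against. With an all-zero d-vector, every intermediate right-hand-side value in this scheme stays zero as long as no pivot is exactly zero, so the x-vector comes out as all zeros.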

Has anyone else run into the same problem with oclTriDiagonal, or does anyone know of another implementation for solving tridiagonal systems of equations on the GPU?

Greetings,

Stefan