Problem with oclTridiagonal

Hello,

I was playing around with the code for oclTriDiagonal and found strange behavior when I fill the d-vector with zeros. The result should be an x-vector of zeros, but I get different results depending on how many systems I solve.

If I set the variables to this:

int num_systems = 1;
int system_size = 155;

I get this result (just a snippet):

***The following is the result of the equation set 0
153300889072603609041622800665673728.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
9581306186007745207791562491166720.000000
0.000000
0.000000
0.000000
0.000000
0.000000
.
.
.

If I set the number of systems to 2 I get this result:

***The following is the result of the equation set 0
-33.781109
33.781109
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
-5.340002
-0.825961
.
.
.

The CPU version that is used for comparison gets the correct results.

To fill the d-vector with zeros, I changed the choice==0 branch of the test_gen_cyclic function to:

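for (int j = 0; j < system_size; j++)
{
    a[j] = (float)j;        // lower diagonal
    b[j] = (float)(j+1);    // main diagonal
    c[j] = (float)(j+1);    // upper diagonal
    d[j] = 0.0f;            // right-hand side: all zeros
    x[j] = 0.0f;
}
a[0] = 0.0f;                // no lower-diagonal entry in the first row
c[system_size-1] = 0.0f;    // no upper-diagonal entry in the last row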
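
For anyone who wants to reproduce the comparison without the SDK sample, here is a minimal serial sketch. It is not the sample's own CPU code, just a plain Thomas algorithm without pivoting, and the names fill_zero_rhs and thomas_solve are made up for illustration. With d all zero, the forward sweep keeps the modified right-hand side at zero, so x should come out as zeros for any system size, as long as no pivot is zero or close to it.

```c
#include <assert.h>
#include <stdlib.h>

/* Fill one system the same way as the modified choice==0 branch above. */
void fill_zero_rhs(float *a, float *b, float *c, float *d, float *x, int n)
{
    for (int j = 0; j < n; j++) {
        a[j] = (float)j;        /* lower diagonal */
        b[j] = (float)(j + 1);  /* main diagonal */
        c[j] = (float)(j + 1);  /* upper diagonal */
        d[j] = 0.0f;            /* right-hand side: all zeros */
        x[j] = 0.0f;
    }
    a[0] = 0.0f;
    c[n - 1] = 0.0f;
}

/* Serial Thomas algorithm (no pivoting).
 * Returns 0 on success, -1 on allocation failure or an exactly zero pivot. */
int thomas_solve(const float *a, const float *b, const float *c,
                 const float *d, float *x, int n)
{
    assert(n > 0);

    float *cp = malloc((size_t)n * sizeof *cp);  /* modified upper diagonal */
    float *dp = malloc((size_t)n * sizeof *dp);  /* modified right-hand side */
    int status = -1;

    if (cp != NULL && dp != NULL && b[0] != 0.0f) {
        status = 0;
        cp[0] = c[0] / b[0];
        dp[0] = d[0] / b[0];

        /* Forward elimination. */
        for (int i = 1; i < n; i++) {
            float denom = b[i] - a[i] * cp[i - 1];
            if (denom == 0.0f) {
                status = -1;
                break;
            }
            cp[i] = c[i] / denom;
            dp[i] = (d[i] - a[i] * dp[i - 1]) / denom;
        }

        /* Back substitution, only if every pivot was non-zero. */
        if (status == 0) {
            x[n - 1] = dp[n - 1];
            for (int i = n - 2; i >= 0; i--)
                x[i] = dp[i] - cp[i] * x[i + 1];
        }
    }

    free(cp);
    free(dp);
    return status;
}
```

Calling fill_zero_rhs and then thomas_solve for one system gives a serial baseline to compare the GPU output against. Note that this matrix is not diagonally dominant, so a solver without pivoting is not guaranteed to be stable on it for a non-zero right-hand side.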

Has anyone else run into the same problem with oclTriDiagonal, or does anyone know of another implementation for solving tridiagonal equation systems on the GPU?

Greetings,

Stefan