Help please!

Blue_caterpillar · November 3, 2009, 8:20am

I tried to accelerate my program using CUDA. The routine listed below should calculate the cost of a route:

__global__ void gputest(int *x, float *y, int *z, float *c, float *d, float *p, int N)
{
  int idx = blockIdx.x * blockDim.x + threadIdx.x;
  int i, ths, next;
  float t = 0;
  if (idx < N)
    {
       ths = 0;
       for (i = 1; i <= 33; i++)
         {
            next = x[idx*33 + ths]; // x are routes
            t = t + c[ths*33 + next]; // c is cost matrix
            ths = next;
         }
      y[idx] = t; // y is array of costs of routes
   }
}

The same routine on the CPU, called in a loop, fills the y array with costs. On the GPU this routine fills the array with -431602080 values. When I reduce the number of “for” loop iterations to 20, I sometimes get correct costs in some elements of y, but other elements contain values like -35659499650496332000 and #QNAN0’s. Could somebody explain what is happening?
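A note on the reported value, offered as a hypothesis rather than a confirmed diagnosis: -431602080 is exactly what the 32-bit pattern 0xCDCDCDCD reads as when interpreted as a float. The Microsoft debug runtime fills freshly allocated heap memory with 0xCD bytes, so, assuming a Windows debug build (the thread does not state the platform), this would suggest that the host copy of y was never overwritten, for example because the kernel launch or the copy back from the device failed. The thread never says how the problem was solved. The helper name floatFromBits below is made up for illustration.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: reinterpret a 32-bit pattern as an
// IEEE-754 single-precision float.
float floatFromBits(std::uint32_t bits)
{
    float f;
    std::memcpy(&f, &bits, sizeof f);  // well-defined way to copy the raw bits
    return f;
}
```

Calling floatFromBits(0xCDCDCDCDu) gives -431602080. Checking the status returned by cudaMemcpy and calling cudaGetLastError() after the kernel launch would help confirm or rule out a failed launch or copy.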

Cygnus_X1 · November 3, 2009, 10:25am

Frankly, I do not understand what your code is doing.
What do the x and c arrays hold exactly? What is
x[p*33+q] (for example, the index of the target of the q’th road from node p?)
and
c[r*33+s]?

Blue_caterpillar · November 3, 2009, 10:44am

This code goes through a route and calculates its cost.

Yes, x[p*33+q] is the index of target where to go from node q on route p.

c contains the costs of steps. c[r*33+s] is the cost of the step from node r to node s.
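The layout described above can be sketched as a small host-side reference, useful for comparing against the GPU results. This is only a sketch under assumptions: the name routeCost, the node-count parameter n (33 in this thread), and the bounds check on the successor index are not from the original code. The check is there because a successor outside 0..n-1 would make the lookup into c read out of range.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical CPU reference for one route.
// x[route * n + node] = node visited after `node` on that route
// c[from * n + to]    = cost of the step from `from` to `to`
// n                   = number of nodes (33 in the thread)
// Returns NaN if the route contains a successor outside 0..n-1.
float routeCost(const std::vector<int>& x, const std::vector<float>& c,
                int route, int n)
{
    assert(route >= 0 && n > 0);
    assert(x.size() >= static_cast<std::size_t>(route + 1) * n);
    assert(c.size() >= static_cast<std::size_t>(n) * n);

    float total = 0.0f;
    int ths = 0;                        // every route starts at node 0
    for (int i = 0; i < n; ++i) {       // n steps, like the kernel's loop
        int next = x[route * n + ths];
        if (next < 0 || next >= n) {    // would index outside the cost matrix
            return std::numeric_limits<float>::quiet_NaN();
        }
        total += c[ths * n + next];
        ths = next;
    }
    return total;
}
```

For example, with three nodes, the route x = {1, 2, 0} and step costs c[0*3+1] = 1, c[1*3+2] = 2, c[2*3+0] = 4, the call routeCost(x, c, 0, 3) sums 1 + 2 + 4 = 7.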

Cygnus_X1 · November 3, 2009, 10:54am

So what if there are more than 33 nodes in your graph? Did you mean a cost of step from node r using route s?

Blue_caterpillar · November 3, 2009, 11:06am

There are exactly 33 nodes in this version.

No, I mean a cost of step from node r to node s using any route.

This routine is part of a population search algorithm for a kind of vehicle routing problem. Actually, it is more complex, but on the GPU it does not work even in this simple version and I want to understand why.

There is a population of routes and each should be tested. Testing usually takes about 50% of total CPU time. I thought it would be a smart idea to use the GPU to accelerate these tests.

Blue_caterpillar · November 5, 2009, 3:19pm

Problem is solved, this topic can be deleted.