I tried to accelerate my program using CUDA. The routine listed below should calculate the cost of a route:
The same routine on the CPU, called in a loop, fills the y array with costs. On the GPU it fills the array with the value -431602080. When I reduce the number of “for” loop iterations to 20, I sometimes get correct costs in some elements of y, but other elements contain values such as -35659499650496332000 and #QNAN0. Could somebody explain what is happening?
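One observation that may help narrow this down: -431602080 is exactly what the 32-bit pattern 0xCDCDCDCD looks like when read as a float, and 0xCD is the byte the Microsoft debug runtime writes into freshly allocated heap memory. If the host program is a Visual C++ debug build (an assumption on my part, although the #QNAN0 output format also points that way), the y array on the host may simply never have been overwritten, for example because the kernel launch or the copy back from the device failed. The helper names in this small sketch are made up for illustration:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reinterpret a 32-bit pattern as an IEEE-754 single-precision float.
float floatFromBits(std::uint32_t bits) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "expects a 32-bit float");
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// True if a result still holds the 0xCDCDCDCD fill pattern, i.e. it was
// probably never written after the host buffer was allocated.
bool looksUninitialized(float v) {
    std::uint32_t bits;
    std::memcpy(&bits, &v, sizeof bits);
    return bits == 0xCDCDCDCDu;
}
```

Calling floatFromBits(0xCDCDCDCDu) gives -431602080, and running looksUninitialized over y after the copy back shows which elements the GPU never touched. If every element matches, checking the error codes returned by the CUDA runtime calls would be the next thing to look at.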
Frankly, I do not understand what your code is doing.
What do the x and c arrays hold exactly? What is
x[p*33+q] (for example, the index of the target of the q’th road from node p?)
and
c[r*33+s]?
No, I mean the cost of a step from node r to node s, using any route.
This routine is part of a population search algorithm for a kind of vehicle routing problem. The real routine is more complex, but on the GPU it does not work even in this simplified version, and I want to understand why.
There is a population of routes, and each one has to be tested. Testing usually takes about 50% of the total CPU time, so I thought it would be a smart idea to use the GPU to accelerate these tests.
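For comparing GPU output against known-good values, a plain CPU reference of the test step can be handy. The sketch below is only my guess at the data layout, since the original routine is not shown here: it assumes both arrays are row-major with a row length of 33, that x holds the q’th node of route p at offset p*33+q, and that c holds the cost of a step from node r to node s at offset r*33+s. The names kStride, routeCost and populationCosts are hypothetical.

```cpp
#include <cassert>
#include <vector>

constexpr int kStride = 33;  // assumed row length of both x and c

// Cost of route p: the sum of step costs between consecutive nodes.
// x[p*kStride + q] is assumed to be the q'th node visited by route p.
float routeCost(const int* x, const float* c, int p, int steps) {
    assert(steps <= kStride);  // a route must fit into one row of x
    float cost = 0.0f;
    for (int q = 0; q + 1 < steps; ++q) {
        int r = x[p * kStride + q];      // current node
        int s = x[p * kStride + q + 1];  // next node
        cost += c[r * kStride + s];      // assumed cost of the step r -> s
    }
    return cost;
}

// Fill y with the cost of every route in the population (CPU reference).
std::vector<float> populationCosts(const std::vector<int>& x,
                                   const std::vector<float>& c,
                                   int routes, int steps) {
    std::vector<float> y(routes);
    for (int p = 0; p < routes; ++p) {
        y[p] = routeCost(x.data(), c.data(), p, steps);
    }
    return y;
}
```

Each route is independent of the others, which is what makes one GPU thread per route a natural mapping. Comparing the GPU result element by element against populationCosts on a small population should show whether all elements are wrong or only some of them.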