concurrent float read/writes

grigor · January 21, 2010, 2:47am

Hello everyone!

I am a CUDA beginner and this is a question about the basics, so I’d appreciate any help. I’m trying to use concurrent ant colony optimization to solve the TSP.

I have a graph with n cities and n^2 edge weights. The matrix of edge weights is kept in global memory. I launch a kernel with m threads, one per ant (the grid may contain one block or many, whichever helps). Every ant has generated a tour, and now it’s time to update the pheromone matrix P[n][n] (like the edge matrix, it’s allocated in global device memory). The pheromone value for every edge is a float. Every ant, running as a separate thread, needs to perform an addition on some values of P.

Now, how can many ants safely add to the same value P[i][j] at the same time? If I could, I’d use a float version of atomicAdd(), but apparently there is no such function.

I’ve found a suggestion on the forum for doing atomicAdd on floats, but it’s a very expensive workaround. Could you point me to an elegant and fast solution? I’ve read the Programming Guide, but I can’t tell which of its tools leads to the simplest solution.

Regards,
grigor

SPWorley · January 21, 2010, 3:46am

The weights are linear, so use a fixed-point representation… all you need for those is the fire-and-forget integer atomics.
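
A minimal sketch of that idea, under some assumptions that are not from the thread: the pheromone matrix is stored as unsigned 32-bit integers with 16 fractional bits (the FIXED_SCALE constant), each ant deposits one fixed amount on every edge of its tour, and the names depositPheromone, tours and deposit are made up for illustration. The scale has to be chosen so that the accumulated values never overflow 32 bits.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Assumption: 16 fractional bits. Resolution is 1/65536, and any
// accumulated value must stay below 65536 to avoid 32-bit overflow.
#define FIXED_SCALE 65536.0f

// One thread per ant. tours holds m tours of n city indices each;
// deposit[a] is the amount ant a adds to every edge of its tour.
__global__ void depositPheromone(unsigned int *P, const int *tours,
                                 const float *deposit, int n, int m)
{
    int ant = blockIdx.x * blockDim.x + threadIdx.x;
    if (ant >= m) return;

    // Convert the float deposit to fixed point once per ant.
    unsigned int delta = __float2uint_rn(deposit[ant] * FIXED_SCALE);
    const int *tour = tours + ant * n;

    for (int k = 0; k < n; ++k) {
        int i = tour[k];
        int j = tour[(k + 1) % n];        // wrap around to close the tour
        atomicAdd(&P[i * n + j], delta);  // fire and forget: result unused
    }
}

int main()
{
    const int n = 4, m = 3;
    // Toy data: all three ants walk the same tour 0 -> 1 -> 2 -> 3 -> 0.
    int hTours[m * n] = {0, 1, 2, 3,  0, 1, 2, 3,  0, 1, 2, 3};
    float hDeposit[m] = {0.5f, 0.25f, 0.125f};
    unsigned int hP[n * n];

    int *dTours; float *dDeposit; unsigned int *dP;
    cudaMalloc(&dTours, sizeof(hTours));
    cudaMalloc(&dDeposit, sizeof(hDeposit));
    cudaMalloc(&dP, sizeof(hP));
    cudaMemcpy(dTours, hTours, sizeof(hTours), cudaMemcpyHostToDevice);
    cudaMemcpy(dDeposit, hDeposit, sizeof(hDeposit), cudaMemcpyHostToDevice);
    cudaMemset(dP, 0, sizeof(hP));

    depositPheromone<<<1, 32>>>(dP, dTours, dDeposit, n, m);
    cudaMemcpy(hP, dP, sizeof(hP), cudaMemcpyDeviceToHost);

    // Convert back to float for inspection: 0.5 + 0.25 + 0.125 = 0.875.
    printf("P[0][1] = %f\n", hP[0 * n + 1] / FIXED_SCALE);

    cudaFree(dTours); cudaFree(dDeposit); cudaFree(dP);
    return 0;
}
```

Because integer addition is associative, the result does not depend on the order in which the ants hit a given P[i][j], which is not true of float sums. For a symmetric TSP you would probably also add delta to P[j * n + i].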