Multiple Data Access on C870


I am working on SPH/CFD and need to process two large arrays, pair_i and pair_j, which hold pairs of particle indices. To process them in parallel I was thinking of assigning one thread to each index of the two arrays (that is, one thread per pair). However, several threads would then write to the same element of another array, rho, because each of them has to add a value to that element. Is there a way to avoid this?

Another method would be to assign one thread to each particle i. Each thread would read through the two large arrays, pick out the entries relevant to its particle, and accumulate the value of rho it is processing in its own local register, rho_i. It would write the final result to rho[i] once at the end, and the rho array would then be returned to the host.
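To make the second method concrete, here is a minimal sketch of the work one thread would do, written as plain C++ so it can be checked on the host. It is only an illustration under assumptions: the array w (the value each pair contributes) and the function names gather_rho and compute_rho are hypothetical, and I assume each pair adds the same value to both of its particles.

```cpp
#include <cstddef>
#include <vector>

// Work of ONE thread in the "gather" method: the thread owns particle p,
// scans every pair, and sums the contributions that involve p.
// Assumption: w[k] is a hypothetical per-pair value added to BOTH particles.
float gather_rho(int p,
                 const std::vector<int>& pair_i,
                 const std::vector<int>& pair_j,
                 const std::vector<float>& w)
{
    float rho_p = 0.0f;  // private accumulator (a register on the GPU)
    for (std::size_t k = 0; k < pair_i.size(); ++k) {
        if (pair_i[k] == p) rho_p += w[k];
        if (pair_j[k] == p) rho_p += w[k];
    }
    return rho_p;  // the caller stores this once into rho[p]
}

// Host-side stand-in for launching one thread per particle.
// Each rho[p] is written exactly once, by exactly one "thread".
std::vector<float> compute_rho(int n_particles,
                               const std::vector<int>& pair_i,
                               const std::vector<int>& pair_j,
                               const std::vector<float>& w)
{
    std::vector<float> rho(n_particles, 0.0f);
    for (int p = 0; p < n_particles; ++p) {
        rho[p] = gather_rho(p, pair_i, pair_j, w);
    }
    return rho;
}
```

In a kernel, the loop over p in compute_rho would become the thread index and the body of gather_rho would be the kernel body. No two threads ever write to the same element of rho, at the cost of every thread reading the whole pair list.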

So I would like to know:

  1. What exactly happens when several threads try to read from the same element of an array on a C870?

  2. What exactly happens when several threads on a C870 try to add a float to the same element of an array of floats? Is this a major problem, a minor problem, or no problem at all?


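If multiple threads add to the same global memory location, the outcome is undefined: the final value may include the contribution of only one thread, or of several of them. Each add is a separate read, add, and write, so threads that read the old value at the same time overwrite each other's results.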
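As an illustration of how a contribution gets lost, here is a small host-side sketch in plain C++ that replays one unlucky ordering of two threads step by step. It is not GPU code and the function names are made up for the example; a real run may pick a different ordering each time. As far as I know, the C870 is a compute capability 1.0 device and has no atomic functions that would serialize these adds for you.

```cpp
// Sequential replay of ONE possible interleaving of two threads that both
// execute "rho0 += value" on the same location without synchronization.
float racy_add_replay(float rho0, float a, float b)
{
    float seen_by_t0 = rho0;   // thread 0 reads the old value
    float seen_by_t1 = rho0;   // thread 1 reads before thread 0 has written
    rho0 = seen_by_t0 + a;     // thread 0 writes old + a
    rho0 = seen_by_t1 + b;     // thread 1 overwrites with old + b: a is lost
    return rho0;
}

// What you wanted instead: the two adds applied one after the other.
float serialized_add(float rho0, float a, float b)
{
    rho0 += a;
    rho0 += b;
    return rho0;
}
```

Starting from 10 and adding 1 and 2, the replay ends at 12 while the serialized version ends at 13, so the result depends on timing you do not control.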

What exactly are you trying to do? Do you have some pseudocode, or an example for a small array size? I think you might want something similar to what I did a while ago.