I work in the SPH/CFD area and need to process two large arrays, pair_i and pair_j, which hold pairs of particle indices. To process these arrays in parallel I was thinking of assigning one thread to each entry of the two arrays. The problem is that several threads would then write to the same element of another array, rho, because each of them has to add a value to that element. Is there a way to avoid this?
Another method would be to use one thread per particle i: each thread reads the two large arrays in full, picks out the entries relevant to its own particle, and accumulates the value of rho in a local register, rho_i. It then writes the final result once to its element rho[i] of the rho array, which is returned to the host.
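To make the second method concrete, here is a minimal sketch of what I have in mind. The names gather_rho, run_gather and the per-pair contribution array w are hypothetical and only there for illustration (my real contribution would come from the kernel function of the pair), and I am assuming each pair adds the same value to both of its particles. The kernel uses only plain loads and stores, no atomic operations.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// One thread per particle: thread i scans every pair and keeps only the
// contributions that belong to particle i, so no two threads ever write
// to the same element of rho.
__global__ void gather_rho(const int *pair_i, const int *pair_j,
                           const float *w,  // hypothetical per-pair contribution
                           float *rho, int n_pairs, int n_particles)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n_particles) return;

    float rho_i = 0.0f;  // per-thread accumulator
    for (int k = 0; k < n_pairs; ++k) {
        // Assumption: a pair contributes w[k] to both of its particles.
        if (pair_i[k] == i) rho_i += w[k];
        if (pair_j[k] == i) rho_i += w[k];
    }
    rho[i] = rho_i;  // exactly one write per thread
}

// Hypothetical host wrapper: copies the inputs to the device, launches the
// kernel and copies rho back. Error checking is omitted to keep it short.
void run_gather(const int *h_pair_i, const int *h_pair_j, const float *h_w,
                float *h_rho, int n_pairs, int n_particles)
{
    int *d_pair_i, *d_pair_j;
    float *d_w, *d_rho;
    cudaMalloc((void **)&d_pair_i, n_pairs * sizeof(int));
    cudaMalloc((void **)&d_pair_j, n_pairs * sizeof(int));
    cudaMalloc((void **)&d_w, n_pairs * sizeof(float));
    cudaMalloc((void **)&d_rho, n_particles * sizeof(float));

    cudaMemcpy(d_pair_i, h_pair_i, n_pairs * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemcpy(d_pair_j, h_pair_j, n_pairs * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemcpy(d_w, h_w, n_pairs * sizeof(float), cudaMemcpyHostToDevice);

    const int threads = 64;
    const int blocks = (n_particles + threads - 1) / threads;
    gather_rho<<<blocks, threads>>>(d_pair_i, d_pair_j, d_w, d_rho,
                                    n_pairs, n_particles);

    cudaMemcpy(h_rho, d_rho, n_particles * sizeof(float), cudaMemcpyDeviceToHost);

    cudaFree(d_pair_i);
    cudaFree(d_pair_j);
    cudaFree(d_w);
    cudaFree(d_rho);
}

int main()
{
    // Toy input: 4 particles, 3 pairs (0,1), (0,2), (1,2).
    const int n_particles = 4, n_pairs = 3;
    const int h_pair_i[n_pairs] = {0, 0, 1};
    const int h_pair_j[n_pairs] = {1, 2, 2};
    const float h_w[n_pairs] = {1.0f, 2.0f, 4.0f};
    float h_rho[n_particles];

    run_gather(h_pair_i, h_pair_j, h_w, h_rho, n_pairs, n_particles);

    for (int i = 0; i < n_particles; ++i)
        printf("rho[%d] = %g\n", i, h_rho[i]);
    return 0;
}
```

For this toy input the sums work out to rho = 3, 5, 6, 0. The obvious cost is that every thread reads every pair, so the total work grows with the number of particles times the number of pairs, and all threads end up reading the same pair_i[k] and pair_j[k] at the same time.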
So I would like to know:
What exactly happens when several threads try to read from the same element of an array on a C870?
What exactly happens when several threads on a C870 try to add a value of type float to the same element of an array of floats? Is this a major problem, a minor problem, or no problem at all?