How to avoid race conditions without using atomic functions

Hello, I'm studying physical simulation and I'm having trouble with sorting.

When I ported my source code to CUDA, I ran into a race condition problem.

I had been looking for a sorting algorithm for CUDA, and I decided that I have to develop the parallel algorithm myself in order to understand my code.

So my question is: how can I avoid a race condition without using atomic functions?

Or how can I implement a read-after-write operation with simple code or a device function?

ArrayFire will be helpful to you. Use this:

Also, is there any hardware restriction? My GPU board is a Tesla C870, and it can't use some of the additional functions in the latest version of CUDA.

You can also do reductions. Google a little:
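As a rough illustration of a reduction that needs no atomic functions, here is a minimal sketch of a per-block sum in shared memory. Each thread writes its own slot, and `__syncthreads()` acts as the barrier that makes every write visible before any thread reads a neighbour's slot, which is the read-after-write ordering within one block. The names `block_sum`, `sum_on_gpu` and `kBlockSize` are made up for this example, and the host code assumes a reasonably recent CUDA toolkit; error checking is left out for brevity.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical block size for this sketch; must be a power of two.
constexpr int kBlockSize = 256;

// Each block sums its slice of `in` and writes one partial sum.
// No two threads ever write the same address, so no atomics are needed.
__global__ void block_sum(const float* in, float* block_out, int n) {
    __shared__ float tile[kBlockSize];
    int tid = threadIdx.x;
    int i = blockIdx.x * blockDim.x + tid;

    tile[tid] = (i < n) ? in[i] : 0.0f;  // each thread writes only its own slot
    __syncthreads();                     // all writes done before anyone reads

    // Tree reduction: halve the number of active threads each step.
    for (int stride = blockDim.x / 2; stride > 0; stride >>= 1) {
        if (tid < stride) {
            tile[tid] += tile[tid + stride];
        }
        __syncthreads();  // finish this step before the next one reads
    }

    if (tid == 0) {
        block_out[blockIdx.x] = tile[0];
    }
}

// Host helper: launches the kernel and adds the per-block partial sums on the CPU.
float sum_on_gpu(const std::vector<float>& host) {
    int n = static_cast<int>(host.size());
    if (n == 0) return 0.0f;
    int blocks = (n + kBlockSize - 1) / kBlockSize;

    float* d_in = nullptr;
    float* d_out = nullptr;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_out, blocks * sizeof(float));
    cudaMemcpy(d_in, host.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    block_sum<<<blocks, kBlockSize>>>(d_in, d_out, n);

    std::vector<float> partial(blocks);
    cudaMemcpy(partial.data(), d_out, blocks * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);

    float total = 0.0f;
    for (float p : partial) total += p;
    return total;
}

int main() {
    std::vector<float> data(1000, 1.0f);
    printf("sum = %f\n", sum_on_gpu(data));
    return 0;
}
```

Note that `__syncthreads()` only synchronizes threads inside one block. In this sketch the results of different blocks are combined on the host after the kernel has finished; a second kernel launch would be another way to do it.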

CUDA provides several useful functions. You need to take a look at the Programming Guide to see the list of available functions.

There is no restriction. ArrayFire will run on the Tesla C870.

You can use shared memory, or just use the Thrust library.
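For the Thrust route, a minimal sketch of a key-value sort could look like the following. It assumes the simulation wants to sort particle indices by some integer key such as a grid cell id; that use case, and the name `sort_particles_by_cell`, are guesses for illustration and not taken from the original code. It also assumes a toolkit version that ships Thrust and still supports your GPU.

```cuda
#include <cstdio>
#include <vector>
#include <thrust/copy.h>
#include <thrust/device_vector.h>
#include <thrust/sort.h>

// Hypothetical helper: returns particle ids reordered by ascending cell id.
// Thrust handles the parallel sort internally, so the caller writes no
// atomics and no synchronization code.
std::vector<int> sort_particles_by_cell(const std::vector<int>& cell_ids,
                                        const std::vector<int>& particle_ids) {
    // Copy keys and values to the device.
    thrust::device_vector<int> keys(cell_ids.begin(), cell_ids.end());
    thrust::device_vector<int> values(particle_ids.begin(), particle_ids.end());

    // Stable variant: particles in the same cell keep their relative order.
    thrust::stable_sort_by_key(keys.begin(), keys.end(), values.begin());

    // Copy the reordered particle ids back to the host.
    std::vector<int> sorted(values.size());
    thrust::copy(values.begin(), values.end(), sorted.begin());
    return sorted;
}

int main() {
    std::vector<int> cells = {2, 0, 1, 0};
    std::vector<int> particles = {0, 1, 2, 3};
    std::vector<int> sorted = sort_particles_by_cell(cells, particles);
    for (int p : sorted) printf("%d ", p);
    printf("\n");
    return 0;
}
```

With the sample input above, particles 1 and 3 (cell 0) come first, then particle 2 (cell 1), then particle 0 (cell 2).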

