Possible problem with global memory reads

I am trying to get a crowd simulation working on the GPU … I have skeleton data that I reuse for all the characters, plus several different sets of animation data.

When I run the simulation in EmuDebug mode, everything seems to work fine. In release mode, however, the characters seem to swap between different animations.

In my program, each thread handles a single character and reads skeleton and animation data from global device memory. My guess is that simultaneous access to the same animation data by different threads is causing the problem.

Any pointers to possible solutions would be very helpful.

Simultaneous loads are not a problem, but simultaneous stores to the same address are a race condition. EmuDebug runs on the CPU and serializes everything, so it hides a lot of race conditions. That’s not true on the GPU: thread scheduling is not fixed, which will cause the issues you’re seeing.

It’s hard to suggest a solution without seeing code, but the easy fix is to write back to memory only if you’re sure that no other thread will write to the same address. Atomics might help you out here.
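To make that advice concrete, here is a minimal sketch. It is not the original poster's code (none was shown); the kernel `samplePoses`, the helper `countCharactersOnLastFrame`, and the idea of reducing a "pose" to a single float are all made up for illustration. Every thread reads the shared animation data freely, writes only to its own output slot, and uses `atomicAdd` for the one location that several threads really do share. Error checking is left out to keep it short.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical kernel: one thread per character.
// animData is shared by all threads but only read, which is safe.
__global__ void samplePoses(const float* animData, int numFrames,
                            const int* frameOfCharacter, float* poseOut,
                            int* lastFrameCount, int numCharacters)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= numCharacters) return;

    int frame = frameOfCharacter[i] % numFrames;

    // Private write: only thread i ever stores to poseOut[i], so no race.
    poseOut[i] = animData[frame];

    // Shared write: many threads may update the same counter,
    // so the update goes through an atomic.
    if (frame == numFrames - 1)
        atomicAdd(lastFrameCount, 1);
}

// Hypothetical host driver: fills in toy data, launches the kernel,
// copies the poses back and returns the atomic counter.
int countCharactersOnLastFrame(int numCharacters, int numFrames,
                               std::vector<float>& poses)
{
    std::vector<float> anim(numFrames);
    for (int f = 0; f < numFrames; ++f) anim[f] = 10.0f * f;

    std::vector<int> frames(numCharacters);
    for (int i = 0; i < numCharacters; ++i) frames[i] = i;

    float *dAnim = nullptr, *dPose = nullptr;
    int *dFrames = nullptr, *dCount = nullptr;
    cudaMalloc(&dAnim, numFrames * sizeof(float));
    cudaMalloc(&dPose, numCharacters * sizeof(float));
    cudaMalloc(&dFrames, numCharacters * sizeof(int));
    cudaMalloc(&dCount, sizeof(int));

    cudaMemcpy(dAnim, anim.data(), numFrames * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dFrames, frames.data(), numCharacters * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemset(dCount, 0, sizeof(int));

    int threads = 128;
    int blocks = (numCharacters + threads - 1) / threads;
    samplePoses<<<blocks, threads>>>(dAnim, numFrames, dFrames, dPose,
                                     dCount, numCharacters);

    int count = 0;
    poses.resize(numCharacters);
    // cudaMemcpy on the default stream waits for the kernel to finish.
    cudaMemcpy(poses.data(), dPose, numCharacters * sizeof(float), cudaMemcpyDeviceToHost);
    cudaMemcpy(&count, dCount, sizeof(int), cudaMemcpyDeviceToHost);

    cudaFree(dAnim); cudaFree(dPose); cudaFree(dFrames); cudaFree(dCount);
    return count;
}
```

If a kernel like this still misbehaves in release mode, the usual suspect is an index computation that lets two threads end up with the same output slot.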

Solved it … thanks! Race condition it was

Would you mind sharing the solution with us?
I have a similar problem with multiple writes to the same address in global memory, and atomics don’t help me, as they only work on integers and I’m using chars.
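For what it's worth, one commonly used workaround (nobody in this thread confirmed using it) is to build a byte-sized atomic on top of the 32-bit `atomicCAS`: operate on the aligned word that contains the char and retry until the swap succeeds. The sketch below does this for a "max" update. The names `atomicMaxByte`, `scatterMax` and `runByteMaxDemo` are hypothetical, and it assumes the buffer comes from `cudaMalloc` with a size that is a multiple of 4 bytes, so the containing word always lies inside the allocation.

```cuda
#include <cuda_runtime.h>
#include <cstddef>
#include <vector>

// Hypothetical helper: atomic max on a single byte, emulated with a
// compare-and-swap loop on the aligned 32-bit word containing it.
__device__ unsigned char atomicMaxByte(unsigned char* address, unsigned char value)
{
    size_t addr = (size_t)address;
    unsigned int* wordPtr = (unsigned int*)(addr & ~(size_t)3);
    // NVIDIA GPUs are little-endian: byte k of the word sits at bits 8k..8k+7.
    unsigned int shift = (unsigned int)(addr & 3) * 8;

    unsigned int old = *wordPtr;
    unsigned int assumed;
    do {
        assumed = old;
        unsigned char current = (unsigned char)((assumed >> shift) & 0xFFu);
        unsigned char updated = current > value ? current : value;
        unsigned int newWord = (assumed & ~(0xFFu << shift))
                             | ((unsigned int)updated << shift);
        // Succeeds only if nobody changed the word since we read it.
        old = atomicCAS(wordPtr, assumed, newWord);
    } while (old != assumed);

    // Return the byte's value from before this thread's update.
    return (unsigned char)((old >> shift) & 0xFFu);
}

// Many threads deliberately hit the same few bytes.
__global__ void scatterMax(unsigned char* cells, int numCells, int numThreads)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= numThreads) return;
    atomicMaxByte(&cells[i % numCells], (unsigned char)(i % 200));
}

// Hypothetical host driver: 1024 threads update 8 bytes.
std::vector<unsigned char> runByteMaxDemo()
{
    const int numCells = 8;      // multiple of 4, see the assumption above
    const int numThreads = 1024;

    unsigned char* dCells = nullptr;
    cudaMalloc(&dCells, numCells);
    cudaMemset(dCells, 0, numCells);

    scatterMax<<<numThreads / 256, 256>>>(dCells, numCells, numThreads);

    std::vector<unsigned char> cells(numCells);
    cudaMemcpy(cells.data(), dCells, numCells, cudaMemcpyDeviceToHost);
    cudaFree(dCells);
    return cells;
}
```

The same loop works for other read-modify-write operations on a char: only the line that computes `updated` changes. It is slower than a native atomic when many threads contend for the same word, so giving each thread its own output slot is still preferable when the algorithm allows it.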