I just started using CUDA and got some tests running. Now I'm working on a simple physics simulation. The engine only uses particles which are connected through springs.
I wanted to simulate like this:
- calc all spring forces and add forces to the particles
- apply forces and move particles
In step 1 I ran into a problem I have not yet found a way to fix: the forces do not seem to be applied correctly, because when I walk through the array of springs and calculate each force, two springs (say spring[i] and spring[j]) can both want to add a force to the same particle. I think what happens is a typical “write-after-write error”. As a result the simulation doesn’t work at all. None of the forces seems to be anywhere close to correct.
My question: Is it possible to synchronize or lock access to certain particles / array entries in an efficient way? I searched the web but did not find an answer. I read about shared memory. But I don’t know how to use it in my simulation.
You can use atomicAdd() to ensure the forces are correctly updated. Preferably you would use one thread per particle (instead of one thread per spring) which (unlike atomicAdd()) will give bit-for-bit identical results on each run and likely also run faster (avoids unnecessary writes of intermediate results to global memory).
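For illustration, here is a minimal sketch of the one-thread-per-spring variant with atomicAdd(). The Spring struct, the kernel name and the little host demo are made up for this example and are not the original poster's code. Note that atomicAdd() on float needs compute capability 2.0 or later.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical spring record: endpoints a and b, rest length, stiffness.
struct Spring { int a, b; float rest, k; };

// One thread per spring. Several springs may share a particle, so every
// accumulation into force[] goes through atomicAdd.
__global__ void springForcesAtomic(const Spring* springs, int numSprings,
                                   const float3* pos, float3* force)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= numSprings) return;

    Spring s = springs[i];
    float dx = pos[s.b].x - pos[s.a].x;
    float dy = pos[s.b].y - pos[s.a].y;
    float dz = pos[s.b].z - pos[s.a].z;
    float len = sqrtf(dx * dx + dy * dy + dz * dz);
    if (len == 0.0f) return;               // direction undefined, skip

    float f = s.k * (len - s.rest) / len;  // Hooke's law, pre-divided by len
    atomicAdd(&force[s.a].x,  f * dx);     // a is pulled toward b when stretched
    atomicAdd(&force[s.a].y,  f * dy);
    atomicAdd(&force[s.a].z,  f * dz);
    atomicAdd(&force[s.b].x, -f * dx);     // b gets the opposite force
    atomicAdd(&force[s.b].y, -f * dy);
    atomicAdd(&force[s.b].z, -f * dz);
}

// Host-side demo (error checking omitted): three particles on the x axis,
// two springs that both touch particle 1.
std::vector<float3> runAtomicDemo()
{
    std::vector<float3> pos = { make_float3(0.0f, 0.0f, 0.0f),
                                make_float3(1.5f, 0.0f, 0.0f),
                                make_float3(3.0f, 0.0f, 0.0f) };
    std::vector<Spring> springs = { {0, 1, 1.0f, 2.0f}, {1, 2, 1.0f, 2.0f} };
    int n = (int)pos.size(), m = (int)springs.size();

    float3 *dPos = nullptr, *dForce = nullptr;
    Spring* dSprings = nullptr;
    cudaMalloc(&dPos, n * sizeof(float3));
    cudaMalloc(&dForce, n * sizeof(float3));
    cudaMalloc(&dSprings, m * sizeof(Spring));
    cudaMemcpy(dPos, pos.data(), n * sizeof(float3), cudaMemcpyHostToDevice);
    cudaMemcpy(dSprings, springs.data(), m * sizeof(Spring), cudaMemcpyHostToDevice);
    cudaMemset(dForce, 0, n * sizeof(float3));  // forces must start at zero each step

    springForcesAtomic<<<(m + 127) / 128, 128>>>(dSprings, m, dPos, dForce);

    std::vector<float3> force(n);
    cudaMemcpy(force.data(), dForce, n * sizeof(float3), cudaMemcpyDeviceToHost);
    cudaFree(dPos); cudaFree(dForce); cudaFree(dSprings);
    return force;
}
```

In this toy setup both springs are stretched by 0.5, so the outer particles should be pulled inward with equal and opposite forces and the two contributions on the middle particle should cancel. With many springs per particle the order of the atomic additions is not fixed, which is why the results can differ in the last bits from run to run.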
Thanks for your reply.
I tried using atomicAdd functions but it didn’t work. (Is “-arch=compute_20” the correct parameter to set for nvcc?)
I managed to work around atomicAdd and got the simulation running correctly.
How did you ‘work around’ atomicAdd, if I may ask?
I calculate the force each spring would apply, and then in the particle update step I add up the forces for each particle. (Each particle knows which springs are connected to it.)
That means: I calculate all springs in parallel and afterwards all particles in parallel.
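A rough sketch of how such a two-pass scheme could look. The data layout is an assumption: the adjacency arrays adjStart/adjSpring (a CSR-style list of the springs touching each particle), the Spring struct, the unit mass and the simple Euler step are all invented for this example and not taken from the actual code.

```cuda
#include <cuda_runtime.h>
#include <vector>

struct Spring { int a, b; float rest, k; };  // hypothetical layout

// Pass 1: one thread per spring. Each thread writes only its own slot,
// so no two threads ever write to the same address.
__global__ void computeSpringForces(const Spring* springs, int numSprings,
                                    const float3* pos, float3* springForce)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= numSprings) return;
    Spring s = springs[i];
    float dx = pos[s.b].x - pos[s.a].x;
    float dy = pos[s.b].y - pos[s.a].y;
    float dz = pos[s.b].z - pos[s.a].z;
    float len = sqrtf(dx * dx + dy * dy + dz * dz);
    float f = (len > 0.0f) ? s.k * (len - s.rest) / len : 0.0f;
    springForce[i] = make_float3(f * dx, f * dy, f * dz);  // force on a; b gets the negative
}

// Pass 2: one thread per particle. It sums the forces of the springs
// attached to it and then moves the particle (semi-implicit Euler).
__global__ void updateParticles(const Spring* springs, const float3* springForce,
                                const int* adjStart, const int* adjSpring,
                                float3* pos, float3* vel, int numParticles,
                                float invMass, float dt)
{
    int p = blockIdx.x * blockDim.x + threadIdx.x;
    if (p >= numParticles) return;
    float3 f = make_float3(0.0f, 0.0f, 0.0f);
    for (int j = adjStart[p]; j < adjStart[p + 1]; ++j) {
        int s = adjSpring[j];
        float sign = (springs[s].a == p) ? 1.0f : -1.0f;  // flip for endpoint b
        f.x += sign * springForce[s].x;
        f.y += sign * springForce[s].y;
        f.z += sign * springForce[s].z;
    }
    vel[p].x += f.x * invMass * dt;  pos[p].x += vel[p].x * dt;
    vel[p].y += f.y * invMass * dt;  pos[p].y += vel[p].y * dt;
    vel[p].z += f.z * invMass * dt;  pos[p].z += vel[p].z * dt;
}

// Small upload helper for the demo (error checking omitted).
template <typename T>
T* upload(const std::vector<T>& v)
{
    T* d = nullptr;
    cudaMalloc(&d, v.size() * sizeof(T));
    cudaMemcpy(d, v.data(), v.size() * sizeof(T), cudaMemcpyHostToDevice);
    return d;
}

// Demo: three particles on the x axis, two springs, one time step.
std::vector<float3> runTwoPassDemo()
{
    std::vector<float3> pos = { make_float3(0.0f, 0.0f, 0.0f),
                                make_float3(1.5f, 0.0f, 0.0f),
                                make_float3(3.0f, 0.0f, 0.0f) };
    std::vector<float3> vel(3, make_float3(0.0f, 0.0f, 0.0f));
    std::vector<Spring> springs = { {0, 1, 1.0f, 2.0f}, {1, 2, 1.0f, 2.0f} };
    std::vector<int> adjStart  = { 0, 1, 3, 4 };  // particle p owns adjSpring[adjStart[p] .. adjStart[p+1])
    std::vector<int> adjSpring = { 0, 0, 1, 1 };
    int n = 3, m = 2;

    float3* dPos = upload(pos);
    float3* dVel = upload(vel);
    Spring* dSprings = upload(springs);
    int* dAdjStart = upload(adjStart);
    int* dAdjSpring = upload(adjSpring);
    float3* dSpringForce = nullptr;
    cudaMalloc(&dSpringForce, m * sizeof(float3));

    // Two launches on the same stream run one after the other.
    computeSpringForces<<<(m + 127) / 128, 128>>>(dSprings, m, dPos, dSpringForce);
    updateParticles<<<(n + 127) / 128, 128>>>(dSprings, dSpringForce, dAdjStart, dAdjSpring,
                                              dPos, dVel, n, 1.0f, 0.1f);

    cudaMemcpy(pos.data(), dPos, n * sizeof(float3), cudaMemcpyDeviceToHost);
    cudaFree(dPos); cudaFree(dVel); cudaFree(dSprings);
    cudaFree(dAdjStart); cudaFree(dAdjSpring); cudaFree(dSpringForce);
    return pos;
}
```

Because every thread writes only to locations it owns, this needs no atomics, and each particle adds its spring forces in a fixed order, so repeated runs should give the same result. The price is one extra global-memory array holding the per-spring forces.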
Just wrote the same code for the CPU, and in comparison the GPU is a lot faster. Just need to speed up rendering now.
If you can reformulate your problem from a per-spring basis to a per-particle basis, then the usual approach is not to use Newton’s third law, but to calculate each spring force twice, once separately for each of the two particles it connects.
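A minimal sketch of that per-particle formulation, assuming each particle stores a list of its neighbours together with the rest length and stiffness of the connecting spring. The arrays nbrStart, nbrIndex, nbrRest and nbrK are hypothetical names for this example, so every spring appears in two lists.

```cuda
#include <cuda_runtime.h>
#include <vector>

// One thread per particle. Each thread evaluates every spring attached to
// its particle and writes only its own force entry, so there is no race.
// A spring between a and b is evaluated twice: once by a and once by b.
__global__ void particleForces(const int* nbrStart, const int* nbrIndex,
                               const float* nbrRest, const float* nbrK,
                               const float3* pos, float3* force, int numParticles)
{
    int p = blockIdx.x * blockDim.x + threadIdx.x;
    if (p >= numParticles) return;

    float3 me = pos[p];
    float3 f = make_float3(0.0f, 0.0f, 0.0f);
    for (int j = nbrStart[p]; j < nbrStart[p + 1]; ++j) {
        float3 other = pos[nbrIndex[j]];
        float dx = other.x - me.x, dy = other.y - me.y, dz = other.z - me.z;
        float len = sqrtf(dx * dx + dy * dy + dz * dz);
        if (len == 0.0f) continue;                     // direction undefined
        float s = nbrK[j] * (len - nbrRest[j]) / len;  // Hooke's law, pre-divided by len
        f.x += s * dx;  f.y += s * dy;  f.z += s * dz;
    }
    force[p] = f;  // accumulated in a register, written to global memory once
}

// Small upload helper for the demo (error checking omitted).
template <typename T>
T* upload(const std::vector<T>& v)
{
    T* d = nullptr;
    cudaMalloc(&d, v.size() * sizeof(T));
    cudaMemcpy(d, v.data(), v.size() * sizeof(T), cudaMemcpyHostToDevice);
    return d;
}

// Demo: chain 0 - 1 - 2 on the x axis; the middle particle has two neighbours.
std::vector<float3> runPerParticleDemo()
{
    std::vector<float3> pos = { make_float3(0.0f, 0.0f, 0.0f),
                                make_float3(1.5f, 0.0f, 0.0f),
                                make_float3(3.0f, 0.0f, 0.0f) };
    std::vector<int>   nbrStart = { 0, 1, 3, 4 };
    std::vector<int>   nbrIndex = { 1, 0, 2, 1 };
    std::vector<float> nbrRest(4, 1.0f);
    std::vector<float> nbrK(4, 2.0f);
    int n = 3;

    float3* dPos = upload(pos);
    int* dStart = upload(nbrStart);
    int* dIndex = upload(nbrIndex);
    float* dRest = upload(nbrRest);
    float* dK = upload(nbrK);
    float3* dForce = nullptr;
    cudaMalloc(&dForce, n * sizeof(float3));

    particleForces<<<(n + 127) / 128, 128>>>(dStart, dIndex, dRest, dK, dPos, dForce, n);

    std::vector<float3> force(n);
    cudaMemcpy(force.data(), dForce, n * sizeof(float3), cudaMemcpyDeviceToHost);
    cudaFree(dPos); cudaFree(dStart); cudaFree(dIndex);
    cudaFree(dRest); cudaFree(dK); cudaFree(dForce);
    return force;
}
```

The duplicated arithmetic is usually a fair trade on the GPU: there are no atomics and no intermediate per-spring array, and each particle's sum is formed in a fixed order.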