As a school project I ran an experiment: an attempt to speed up game physics calculations using CUDA.
I'd like to share the results and get some feedback on them.
The experiment is built on the code of the Cyclone Physics Engine, a physics engine made for learning purposes that is
the accompanying code for Ian Millington's book Game Physics Engine Development.
Info can be found here:
I converted the rigid body integration function and the collision detection functions to CUDA kernels and compared
their running times on the GPU and on the CPU.
In the first test I compared rigid body integration performance. The workflow involved copying forces and torques
to the device every frame (two arrays, 4 floats per body in each array), running the kernel, and copying the transform
matrices back to the host for rendering the bodies with GLUT (16 floats per body, a single array of matrices).
The execution times were as follows:
In the next test I compared the overall performance of a workflow that included rigid body integration as before,
sphere-halfspace collision detection, and contact resolution (which runs on the CPU). The memory copies per frame were
the forces and torques as before, the generated contact data (an array of structures, 36 bytes per contact), the body
state data copied to the host (16 floats per body), and then the body state adjustments copied to the device (16 floats per body).
The performance comparison gave the following results:
I am still a CUDA newbie, and I know there are still a lot of optimizations and further work to be done, so
all this is just the beginning.
Anyway, I'd like to hear your suggestions, comments and feedback.