I have a fairly complex cellular automaton application which uses a 154x154 grid.
In standard C++, 500 iterations take 258.22 seconds to complete.
Using CUDA, these 500 iterations only take 17.802 seconds!
On each iteration, a calculation takes place for every grid cell that involves locating all of its neighbours within a certain radius (a circular neighbourhood). Since each cell can be processed independently, I suppose this application is very well suited to GPU programming!
Have any other users found similar speed-ups (roughly 14.5x in my case)?! I'm double-checking my code, as this seems an unbelievable increase.