I’ve written a simple “Game of Life” program in CUDA, just to run some benchmarks and gain some experience. It computes 10 generations of the game on a 4K x 4K board (4096 x 4096 = 16777216 cells). The program runs fine on my laptop and desktop (8400M GS and 8600 GTS). However, when I tried it on a friend’s 8800 GTX, I got incorrect results; it looked as if the kernel had not executed at all. I noticed that the 8800 GTX is a compute capability 1.0 device, while my two cards are 1.1. However, I have not used any of the atomic functions, which require 1.1 or higher.
Do you have any idea what’s wrong? I have attached my program so you can test it if you like.
Normal output should be like this:
Game of life
5853405 total live points of 16777216
Total blocks 256
3.97 seconds total time
5416 total live points of 16777216
Note: the last line is the one that confirms correct execution. It should report “5416” live points.
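For anyone who wants to cross-check the GPU result on the host, here is a minimal CPU reference sketch in C. It is not the code from the attached archive: the function names (`gol_step`, `gol_count_live`), the one-byte-per-cell layout, and the wrap-around edges are all assumptions made for illustration, so its counts will only match the output above if the attached program uses the same rules and the same initial board.

```c
#include <assert.h>
#include <stddef.h>

/* Count the live neighbours of cell (x, y).
   Assumption: the board wraps around at the edges (toroidal). */
int gol_live_neighbours(const unsigned char *board, int width, int height,
                        int x, int y)
{
    int count = 0;
    for (int dy = -1; dy <= 1; ++dy) {
        for (int dx = -1; dx <= 1; ++dx) {
            if (dx == 0 && dy == 0)
                continue; /* skip the cell itself */
            int nx = (x + dx + width) % width;
            int ny = (y + dy + height) % height;
            count += board[(size_t)ny * width + nx];
        }
    }
    return count;
}

/* Compute one generation: reads `cur`, writes `next`.
   Cells are stored one byte each, 1 = live, 0 = dead (assumed layout). */
void gol_step(const unsigned char *cur, unsigned char *next,
              int width, int height)
{
    assert(width > 0 && height > 0 && cur != next);
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            size_t idx = (size_t)y * width + x;
            int n = gol_live_neighbours(cur, width, height, x, y);
            /* Standard rules: birth on 3, survival on 2 or 3. */
            next[idx] = (n == 3 || (cur[idx] && n == 2)) ? 1 : 0;
        }
    }
}

/* Count live cells, comparable to the "total live points" lines. */
size_t gol_count_live(const unsigned char *board, int width, int height)
{
    size_t total = 0;
    for (size_t i = 0; i < (size_t)width * height; ++i)
        total += board[i];
    return total;
}
```

To use it, copy the initial board to two host buffers, call `gol_step` 10 times while swapping the buffers, and compare `gol_count_live` with the number the GPU run prints. If the counts differ only on the 8800 GTX, the problem is more likely in the kernel launch than in the game logic.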
gol.zip (1.78 KB)