I have just started developing with CUDA and still have a lot to learn, but I desperately need help with a problem I can't explain.
Here’s the situation:
As a first try with CUDA, I have developed a Game of Life. Everything works fine. However, when I tried to collect information on execution time and number of operations per second, a strange thing happened.
I am currently measuring the total execution time of the kernel only, not the whole program. What is strange is that if I disable (in the code) everything that has to do with displaying the game and measure solely the kernel execution time, my performance drops by 10%… Whereas when I leave all the display code in, the performance is better, which is counter-intuitive…
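To make the measurement concrete, here is a minimal sketch of kernel-only timing, assuming CUDA events are used for it. The kernel `lifeStep`, the grid size, and the step count are hypothetical stand-ins for illustration, not my actual code.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for a Game of Life step kernel (wrap-around borders).
__global__ void lifeStep(const unsigned char* in, unsigned char* out, int w, int h)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= w || y >= h) return;

    // Count the eight neighbours of cell (x, y).
    int n = 0;
    for (int dy = -1; dy <= 1; ++dy) {
        for (int dx = -1; dx <= 1; ++dx) {
            if (dx == 0 && dy == 0) continue;
            int nx = (x + dx + w) % w;
            int ny = (y + dy + h) % h;
            n += in[ny * w + nx];
        }
    }
    unsigned char alive = in[y * w + x];
    out[y * w + x] = (n == 3 || (alive && n == 2)) ? 1 : 0;
}

int main()
{
    const int w = 1024, h = 1024, steps = 100;  // assumed sizes
    unsigned char *a = nullptr, *b = nullptr;
    cudaMalloc(&a, w * h);
    cudaMalloc(&b, w * h);
    cudaMemset(a, 0, w * h);
    cudaMemset(b, 0, w * h);

    dim3 block(16, 16);
    dim3 grid((w + block.x - 1) / block.x, (h + block.y - 1) / block.y);

    // Warm-up launch so one-time initialization is not included in the timing.
    lifeStep<<<grid, block>>>(a, b, w, h);
    cudaDeviceSynchronize();

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start);
    for (int i = 0; i < steps; ++i) {
        lifeStep<<<grid, block>>>(a, b, w, h);
        unsigned char* tmp = a; a = b; b = tmp;  // ping-pong the buffers
    }
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);  // wait until all timed kernels have finished

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    printf("average kernel time: %f ms over %d steps\n", ms / steps, steps);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(a);
    cudaFree(b);
    return 0;
}
```

Averaging over many launches after a warm-up keeps a single slow first launch from dominating the figure.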
Is there something I didn’t get about NVIDIA’s architecture?
The display is done by updating a VBO with the current data from inside the kernel.
Any help appreciated.