CUDA samples too slow on 980 Ti

I purchased a 980 Ti and set up CUDA on my machine. I am trying the nbody sample with Visual Studio 2013. I get:

17 GFLOP/s, which is way too slow. My laptop GPU with 192 CUDA cores seems to do better! Any ideas?

Thank you

Did you build a debug project or a release project when you built the nbody sample?

If you built a debug project, try building a release project.

I love you! I am running at 2.8 TFlops (274 frames per second!).

Why would that be, though? Was the bottleneck the host?

And thank you very much!

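For debug builds, all code optimizations are turned off in the compiler. In the Visual Studio debug configuration of the CUDA samples this typically applies to the device code as well (nvcc is passed -G), so the slowdown is in the GPU kernel itself rather than only on the host. The performance difference to a fully optimized release build can be huge, as you have found out. Make sure to use the -benchmark switch of the nbody application if you want to use this as a performance stress test; otherwise performance may be limited by the visualization.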