You do not need to tell it that you will be using doubles; the code will just run slower than it would with 32-bit floats (as long as you have a GPU which can handle doubles, which you do).
The GTX 690 was designed mainly for games and does not offer the same double-precision (DP) speed as the Teslas, which were designed for scientific calculations.
Also, the 690 is essentially two 680-class GPUs on one card, I believe.
Please show the output from the deviceQuery and bandwidthTest samples. Both are in the SDK.
What is the actual problem when you run the code? You did not make that clear. If there is an error message, please show it.
Have you included all the necessary libraries and set the dependencies? Are you using C++ or some other language? These things obviously matter.
Which Visual Studio version and which operating system are you using? Are you compiling as 64 bit?