The GTX 295 card has 2 separate GPUs on it, each with 240 cores. I was able to get my code working with the default setting, i.e. without calling cudaSetDevice(), which uses GPU #0. However, when I run the same code after adding cudaSetDevice(1) to use GPU #1, I get a significantly different numerical result. I am currently using GPU #1 as my display device in X11 as well. There are known issues with running CUDA on a display device, but the ones I've seen are performance issues (slower computation, watchdog timeouts). Could using the display GPU also affect numerical precision and accuracy? Or could there be something wrong with the card?
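In case it helps anyone reproduce this, here is a minimal sanity-check sketch of the setup I'm describing: the same trivial kernel launched on device 0 and device 1, with the outputs compared bitwise. The kernel itself is just a placeholder (any deterministic kernel should do); the two halves of a GTX 295 are identical hardware, so identical code on identical inputs should match exactly.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Placeholder kernel -- any deterministic elementwise computation works here.
__global__ void scale(const float *in, float *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i] * 2.0f + 1.0f;
}

int main() {
    const int n = 1 << 20;
    float *h_in = new float[n];
    float *h_out0 = new float[n];
    float *h_out1 = new float[n];
    for (int i = 0; i < n; ++i) h_in[i] = i * 0.001f;

    float *results[2] = { h_out0, h_out1 };
    for (int dev = 0; dev < 2; ++dev) {   // GTX 295 exposes two CUDA devices
        cudaSetDevice(dev);               // select device before any allocation on it
        float *d_in, *d_out;
        cudaMalloc(&d_in,  n * sizeof(float));
        cudaMalloc(&d_out, n * sizeof(float));
        cudaMemcpy(d_in, h_in, n * sizeof(float), cudaMemcpyHostToDevice);
        scale<<<(n + 255) / 256, 256>>>(d_in, d_out, n);
        cudaMemcpy(results[dev], d_out, n * sizeof(float), cudaMemcpyDeviceToHost);
        cudaFree(d_in);
        cudaFree(d_out);
    }

    // Bitwise comparison: same hardware + same code + same input should agree exactly.
    int mismatches = 0;
    for (int i = 0; i < n; ++i)
        if (h_out0[i] != h_out1[i]) ++mismatches;
    printf("mismatches: %d\n", mismatches);

    delete[] h_in; delete[] h_out0; delete[] h_out1;
    return 0;
}
```

One caveat: on newer runtimes a single host thread can switch devices like this, but on the older toolkits contemporary with the GTX 295 a host thread was bound to one context, so cudaSetDevice() had to be called before any other CUDA call, and testing both GPUs meant two separate runs (or two host threads).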