I built the solution in Release mode and ran the .exe file on the same computer, and the results were correct.
But when I copied the .exe file to another computer and ran it there, the results were wrong…
I am actually comparing CPU computation with GPU computation, so I run the same calculation on both the CPU and the GPU. Both CPUs calculate correctly, but the two GPUs give different results. Why is this happening?
The machine that calculates correctly has an NVIDIA Quadro NVS 290; the one that calculates incorrectly has an NVIDIA Quadro FX 4600.