I have upgraded to CUDA 6.5. When I profile my application compiled with CUDA 6.0 using nvprof 6.0, it reports 3.6 seconds of GPU computation time, while nvprof 6.5 reports 4.1 seconds. I also tried recompiling the program with CUDA 6.5, and nvprof 6.5 still reports the same result.
What could be causing this difference?
Thanks in advance.