I run the same program on a GTX 970 and a GTX 1070 Ti. The GTX 970 is faster than the GTX 1070 Ti. What shall I do?
The easiest solution is obviously to just keep using the GTX 970 for this program.
(1) Is this a controlled experiment, meaning all hardware and software is the same, and you swapped only the GPU?
(2) How was “faster” determined? Elapsed time? If so, what are the times measured for the two GPUs? What is included in the timed portion of the code?
(3) Did you specify the correct GPU architecture target when you built the code? (The two GPUs belong to different architectures.)
(4) Any chance one build was a debug build and the other a release build?
(5) Is the same amount of work performed on either GPU (e.g. different configuration settings, or auto-configuration based on GPU)?
(6) Have you profiled the code with the CUDA profiler? If so, what are salient differences between the profiles on the two GPUs?
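Regarding point (3): a common way to rule out an architecture mismatch is to build a fat binary that contains native code for both GPUs. The GTX 970 is Maxwell (compute capability 5.2) and the GTX 1070 Ti is Pascal (compute capability 6.1). A sketch of such a build, assuming the source file is named `app.cu` (a placeholder):

```shell
# Build a fat binary with native code for both GPUs:
#   sm_52 = Maxwell (GTX 970), sm_61 = Pascal (GTX 1070 Ti)
nvcc -O3 \
  -gencode arch=compute_52,code=sm_52 \
  -gencode arch=compute_61,code=sm_61 \
  -o app app.cu

# Verify which architectures are actually embedded in the binary:
cuobjdump --list-elf app
```

If the binary only contains sm_52 code, the 1070 Ti would run via JIT compilation from PTX (or not at all, if no PTX is embedded), which could easily skew a timing comparison.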