I am working on a project in which I have to make a flow chart of the same code running on different GPUs (all NVIDIA GPUs). The code was originally written for a GeForce 310, which has only 16 CUDA cores and can accommodate 512 threads per block, and its execution time was noted. The same code, without any modification, was then executed on a GTX 680, which has 1536 CUDA cores and can accommodate 1024 threads per block. The execution time on the GTX 680 was approximately the same as on the GeForce 310, even though the GTX 680 has far more CUDA cores and much higher specifications overall.
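To make the comparison concrete, here is a rough sketch of the arithmetic involved. It is not the actual project code. The core counts and per-block limits are the ones quoted above. The problem size `N_ELEMENTS`, the one-thread-per-element launch, and the block size fixed at 512 are assumptions for illustration only, and the helper names are hypothetical.

```python
import math

# Figures quoted in the post above.
GPUS = {
    "GeForce 310": {"cuda_cores": 16, "max_threads_per_block": 512},
    "GTX 680": {"cuda_cores": 1536, "max_threads_per_block": 1024},
}


def blocks_needed(n_elements, threads_per_block):
    """Blocks required so that every element gets its own thread."""
    return math.ceil(n_elements / threads_per_block)


def threads_per_core(n_elements, threads_per_block, cuda_cores):
    """Launched threads divided by CUDA cores.

    A crude indicator of how much work each core receives; it ignores
    memory bandwidth, transfers, and launch overhead.
    """
    launched = blocks_needed(n_elements, threads_per_block) * threads_per_block
    return launched / cuda_cores


N_ELEMENTS = 4096  # hypothetical problem size, not taken from the project
BLOCK_SIZE = 512   # assumption: block size left at the GeForce 310 limit

for name, spec in GPUS.items():
    # The same unmodified launch configuration is used on both devices.
    assert BLOCK_SIZE <= spec["max_threads_per_block"]
    blocks = blocks_needed(N_ELEMENTS, BLOCK_SIZE)
    ratio = threads_per_core(N_ELEMENTS, BLOCK_SIZE, spec["cuda_cores"])
    print(f"{name}: {blocks} blocks, {ratio:.1f} threads per core")
```

If the real launch is similarly small, the GTX 680 would have only a few threads per core, so its extra cores might sit mostly idle and fixed costs could dominate the measured time. That is only a guess until the actual problem size and launch configuration are checked.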
So my questions are:
- What factors does the performance of a GPU depend on?
- Am I doing this correctly, or does the code need to be modified for the new GPU?
- The code performs 2 FFTs using the cuFFT library. Should the performance of cuFFT differ between the two GPUs, or will it be the same?
I am a newbie to the GPU computing world, so I am not sure how much sense these questions make, but I am looking forward to learning from the answers.