I am trying to use a multi-GPU system and I am seeing some strange behaviour. I hope someone here can help me solve this.
The machine on which I am running the code has 4 C2050 cards. I am using OpenMP (OMP) to manage the threads.
The code divides the work equally among the 4 GPUs, one host thread per GPU. Each thread calculates one part of a big array. All of this happens after some preliminary calculations that are done on all the GPUs (these results stay on the GPU afterwards and are used again).
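A minimal sketch of this kind of equal split, host side only. It is not taken from the attached file: `Range` and `chunk_range` are hypothetical names, and handing leftover elements to the first workers is an assumption for the case where the array size is not a multiple of the number of GPUs.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, not from intentoMultiGPU.cu.
// Half-open index range [begin, end) of the big array handled by one worker.
struct Range {
    std::size_t begin;
    std::size_t end;
};

// Split n elements among `parts` workers as evenly as possible.
// Assumption: if n is not divisible by parts, the first (n % parts)
// workers each take one extra element.
Range chunk_range(std::size_t n, int parts, int idx) {
    assert(parts > 0 && idx >= 0 && idx < parts);
    const std::size_t p = static_cast<std::size_t>(parts);
    const std::size_t i = static_cast<std::size_t>(idx);
    const std::size_t base = n / p;  // elements every worker gets
    const std::size_t rem = n % p;   // leftover elements to spread out
    Range r;
    r.begin = i * base + (i < rem ? i : rem);
    r.end = r.begin + base + (i < rem ? 1 : 0);
    return r;
}
```

In a setup like the one described, each OpenMP thread would select its own card (for example with `cudaSetDevice(omp_get_thread_num())`) and then work only on `chunk_range(n, 4, omp_get_thread_num())`.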
In each thread I measure the execution time of the divided function using gettimeofday. 2 of the 4 GPUs give me the same time, and the other two give me a different one. It is as if 2 of the cards run faster than the other 2. Because of the 2 “slow cards”, the code takes almost the same amount of time whether it runs on 2 or 4 GPUs.
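For reference, a minimal sketch of per-thread timing with gettimeofday. The helper names `elapsed_ms` and `time_ms` are hypothetical and not from the attached file. One assumption is stated in the comments: kernel launches return to the host before the GPU has finished, so the timed function has to wait for the device before the second clock reading.

```cpp
#include <sys/time.h>

// Hypothetical helper: milliseconds between two gettimeofday readings.
// Differences are taken field by field to avoid losing precision.
double elapsed_ms(const timeval& start, const timeval& stop) {
    return (stop.tv_sec - start.tv_sec) * 1000.0 +
           (stop.tv_usec - start.tv_usec) / 1000.0;
}

// Hypothetical helper: wall-clock time of one call to `work`, in ms.
// Assumption: for GPU work, `work` ends with a device synchronization
// (e.g. cudaDeviceSynchronize()); otherwise only the launch overhead
// is measured, not the kernel itself.
template <typename F>
double time_ms(F work) {
    timeval t0, t1;
    gettimeofday(&t0, 0);
    work();
    gettimeofday(&t1, 0);
    return elapsed_ms(t0, t1);
}
```

Each OpenMP thread would call `time_ms` around its own part of the work and print the result together with its device number, so the per-card times can be compared directly.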
I don't really know what is causing this difference.
I have attached the source code:
intentoMultiGPU.cu (10.8 KB)