Hi,
I have written a function that selects the device with the maximum theoretical GFLOPS throughput, but I am not sure whether my calculation is correct.
//Returns the device index of the device with the maximal theoretical
//throughput of floating-point multiplies and adds. This number is
//estimated from several device properties and can be seen as a hint
//for choosing a suitable device. This function should be rewritten
//for devices more recent than the GeForce GTX 480.
int deviceManager::getMaxGFLOPSDevice()
{
    //Query devices
    std::vector<cudaDeviceProp> devices;
    deviceQuery(devices);
    //Get the number of devices
    int devcount = getNumberOfDevices();
    //Calculate theoretical GFLOPS and keep track of the best device
    int flops = 0;
    int bestdevice = 0;
    for (int i = 0; i < devcount; i++)
    {
        int tempflops = (devices[i].major == 1 ? 8 : 32)  //FMADs per clock cycle, taken from Programming Guide chapter 5.4.1
                      * (devices[i].major == 1 ? 1 : 2)   //Warps per multiprocessor
                      * devices[i].multiProcessorCount
                      * devices[i].clockRate;             //clock rate is in kHz
        if (tempflops > flops)
        {
            flops = tempflops;
            bestdevice = i;
        }
    }
    //Return index of best device found
    return bestdevice;
}
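As a rough sanity check: for a GeForce GTX 480 (compute capability 2.0, 15 multiprocessors, clockRate reported as roughly 1,401,000 kHz) the expression gives 32 * 2 * 15 * 1,401,000 ≈ 1.34e9. Since clockRate is in kHz, I read that as about 1345 GFLOPS, which is in the range usually quoted for that card, but I am not sure the per-clock factors are right in general.

For context, here is a minimal sketch of how I intend to use the function (the devMgr instance name is just an example; the chosen index is simply handed to cudaSetDevice):

#include <cuda_runtime.h>

int main()
{
    deviceManager devMgr;                    //example instance; construction details omitted
    int best = devMgr.getMaxGFLOPSDevice();  //index of the device with the highest estimated throughput
    cudaSetDevice(best);                     //make it the current device for subsequent CUDA calls
    //...launch kernels on the selected device...
    return 0;
}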
Any suggestions?
Regards,
Kwyjibo