The measured performance of a single kernel running on a single GPU drops if the system has more than one GPU board installed. Searching suggests the cause might be an automatic reduction in clock rate when a system has multiple boards. How can this be avoided? The OS is Linux, and our system has enough power and cooling to support multiple GPU cards – we want every card to run at full clock speed.
Could this be the reason the same kernel function runs faster on my GT200 GPU than on my Fermi one? They are both on the same motherboard, and I am trying to measure the performance of each GPU.
If what you say is true, then I will remove one GPU before measuring performance.
No; this is the performance of a single kernel running on a single GPU – only one GPU is active during the measurement. The measurement does not include data movement to or from the host; what is being measured is the kernel execution time only. The performance drops when the system has two GPU cards installed. There is only one user on the system, the GPU cards are used for computation only (not for display), and there is no other activity on the system during the measurements.
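One way to check whether the clocks really do drop when the second board is installed is to log the SM clock while the kernel runs. A minimal sketch using nvidia-smi's loop mode is below; flag support varies between driver generations, so verify the options against `nvidia-smi --help` on your driver, and note that `./my_kernel_benchmark` is a hypothetical stand-in for your own benchmark binary:

```shell
# Log clocks and temperature once per second in the background while
# the kernel benchmark runs.
#   -q -d CLOCK,TEMPERATURE : restrict the query to those two sections
#   -l 1                    : repeat the query every second
nvidia-smi -q -d CLOCK,TEMPERATURE -l 1 > clock_log.txt &
LOGGER=$!

./my_kernel_benchmark   # hypothetical: your single-GPU kernel test

kill $LOGGER
# Inspect the logged clock sections; compare the one-board and
# two-board runs to see whether the SM clock was reduced.
grep -A 4 "Clocks" clock_log.txt
```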
What was the power setting on your system? Some people are saying the power setting is being set to "adaptive" rather than "maximum performance" – how can one verify whether this is the cause? I don't know how to change the setting to "maximum performance" without using the driver GUI, and I don't have access to the GUI. What is the command-line or configuration-file method for ensuring the power setting is permanently set to maximum performance?
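On Linux the usual command-line route is nvidia-smi for persistence mode and nvidia-settings for the PowerMizer ("Adaptive" vs. "Prefer Maximum Performance") setting; nvidia-settings talks to the X driver but does not need an interactive GUI session if you point it at the display. A sketch, where the GPU indices and the `:0` display are assumptions about your setup:

```shell
# Enable persistence mode so the driver keeps the GPUs initialized
# even when no client is attached (needs root).
sudo nvidia-smi -pm 1

# Switch PowerMizer from "Adaptive" (0/2) to "Prefer Maximum
# Performance" (1). nvidia-settings needs a running X server, so set
# DISPLAY explicitly when invoked from a non-GUI shell; repeat the
# command for each GPU index in the system.
DISPLAY=:0 nvidia-settings -a "[gpu:0]/GPUPowerMizerMode=1"
DISPLAY=:0 nvidia-settings -a "[gpu:1]/GPUPowerMizerMode=1"
```

Neither setting survives a reboot or driver reload on its own, so to make them permanent you would put these commands in a boot-time script (e.g. an init/rc script) on your distribution.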
The reported temperature is 74 °C – full details are in the attached nvidia_smi_a.txt file. I ran the nvidia-smi command just before and just after the kernel executed; the attached file is the output from the run just after kernel execution. nvidia_smi_a.txt (9.34 KB)
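To see whether 74 °C is triggering any throttling, the performance section of the nvidia-smi query reports the current performance state (P-state) and, on drivers that support it, the clock throttle reasons. A sketch; the available `-d` sections differ across driver versions, so check this against your nvidia-smi version:

```shell
# Dump performance state, throttle reasons (where supported), clocks,
# and temperature for every GPU in the system.
nvidia-smi -q -d PERFORMANCE,CLOCK,TEMPERATURE

# In the output, a performance state of P0 means full clocks during
# compute; P8/P12 indicate idle or reduced-clock states.
```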