I have two GeForce 8800 GTX cards in a Windows machine (Intel Q6600 CPU, 2x1 GB DDR2 RAM, Asus Striker Extreme nForce 680i SLI motherboard).
I have a speed problem with the “multiGPU” example from the CUDA SDK. When I run it in 2-GPU mode, the processing time is about 10x slower than in 1-GPU mode:
2 GPUs found
Processing time: 472.811005 (ms)
Only one GPU found
Processing time: 40.686001 (ms)
I have two 600 W PSUs, one attached to each card, so I don't think power is the problem.
Has anyone ever tried to run this application in 2-GPU mode? What output should I see after running this “multiGPU” application? Any help or suggestions would be useful.
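For reference, the sample's structure is one host thread per GPU, with each thread binding its own CUDA context via cudaSetDevice() before doing any other CUDA work. Below is my own minimal sketch of that pattern (placeholder kernel, sizes, and names, not the SDK source), in case it helps someone confirm that both cards are actually being driven:

// Minimal sketch of the one-host-thread-per-GPU pattern the sample uses.
// My reconstruction with a placeholder kernel, not the SDK source.
#include <cstdio>
#include <thread>
#include <vector>
#include <cuda_runtime.h>

__global__ void scaleKernel(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;  // stand-in for the sample's real work
}

static void gpuWorker(int device, int n)
{
    cudaSetDevice(device);  // bind this thread's CUDA context to one GPU

    float *d = 0;
    cudaMalloc((void**)&d, n * sizeof(float));
    cudaMemset(d, 0, n * sizeof(float));

    scaleKernel<<<(n + 255) / 256, 256>>>(d, n);
    cudaDeviceSynchronize();  // wait for this GPU's work to finish

    cudaFree(d);
}

int main()
{
    int count = 0;
    cudaGetDeviceCount(&count);
    printf("%d GPUs found\n", count);

    std::vector<std::thread> workers;
    for (int dev = 0; dev < count; ++dev)
        workers.emplace_back(gpuWorker, dev, 1 << 20);
    for (size_t i = 0; i < workers.size(); ++i)
        workers[i].join();
    return 0;
}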
I ran the original NVIDIA “multiGPU” sample code from SDK 1.0, with no modifications to the code. I suspect there is a problem with my hardware/software configuration, but I cannot figure it out. Any help will be appreciated…
I am using an Asus Striker Extreme with the NVIDIA nForce 680i SLI chipset. According to the user guide, this motherboard supports SLI technology at full x16/x16 speed.
The board also has one PCI Express x16 slot running at x8 speed and one PCI Express x1 slot. I have installed the 8800 GTX cards in the blue slots, which run at x16 speed according to the guide.
I’m seeing the same problem on a Dell XPS H2C (680i SLI chipset) with dual 8800 GTX cards. The multiGPU example takes about 10x longer with two GPUs than with one, although the Monte Carlo example does run about twice as fast with dual GPUs (why the difference?). When I run a larger multi-GPU application, each thread seems to spend about half its time idle, even though the workload is definitely compute-bound (very little I/O).
Have you discovered the cause of, or a solution to, this problem?
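For anyone else debugging this: one way to tell whether the extra time comes from context creation and thread startup rather than the kernels themselves is to time only the computation inside each worker thread with CUDA events. A rough sketch, again with a placeholder kernel and my own names; cudaFree(0) is just the usual idiom to force context creation before the timed region:

// Sketch: time only the kernel on each GPU, excluding context creation.
#include <cstdio>
#include <thread>
#include <vector>
#include <cuda_runtime.h>

__global__ void busyKernel(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = data[i] * 1.5f + 1.0f;  // placeholder work
}

static void timedWorker(int device, int n)
{
    cudaSetDevice(device);
    cudaFree(0);  // create the context now, outside the timed region

    float *d = 0;
    cudaMalloc((void**)&d, n * sizeof(float));

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start, 0);
    busyKernel<<<(n + 255) / 256, 256>>>(d, n);
    cudaEventRecord(stop, 0);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);  // reports milliseconds
    printf("GPU %d kernel time: %f ms\n", device, ms);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d);
}

int main()
{
    int count = 0;
    cudaGetDeviceCount(&count);

    std::vector<std::thread> workers;
    for (int dev = 0; dev < count; ++dev)
        workers.emplace_back(timedWorker, dev, 1 << 20);
    for (size_t i = 0; i < workers.size(); ++i)
        workers[i].join();
    return 0;
}

If the per-kernel times look sane on both GPUs, then the overhead is happening outside the kernels (context setup, thread creation, or where the sample places its timer), rather than in the compute itself.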