What is better for developing CUDA applications: one GTX 690 or two GTX 680s? (in terms of how much a sequential application can be sped up)
I would like to know the answer to this as well. I have 2 GTX 680s, but so far have only been using one for my calculations. When you have two separate GPUs, I know you can break up the task in the code so that each GPU solves its own sub-problem independently, but I'm not sure if you can do the same with the 690.
I have also seen that for video game benchmarks the 690 outperforms the 680, but not by a factor of 2 as one might hope.
The GTX 690 contains two GPU chips on the same card, while the GTX 680 contains only one. Performance-wise this should give the GTX 690 an advantage in computation by a factor of 2. The factor of 2 is almost true, but there are some differences in clock speed between the two cards: the GTX 690 has 2x the CUDA cores to perform work on, but each GPU is clocked slightly lower than a GTX 680.
In order to compare the two GPUs fairly, you should compare 1 GTX 690 vs. 2x GTX 680. Comparing them this way, there is an advantage to using 2x GTX 680: each GTX 680 has its own PCI-e x16 gen3 bus to move data to/from the host, while the two GPUs on a GTX 690 share one. The aggregate bus bandwidth of two GTX 680s is thus twice the bandwidth of one GTX 690.
When using a GTX 690 with CUDA, CUDA will report two GPUs, and you program them the same way you would program 2 GTX 680s.
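To illustrate: each GPU of the 690 is just another CUDA device you select with cudaSetDevice(). A minimal sketch (the kernel, names, and sizes here are made up for illustration, not anyone's actual code):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Toy kernel: scale each element of an array (illustration only).
__global__ void scale(float *data, int n, float factor)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main()
{
    int deviceCount = 0;
    cudaGetDeviceCount(&deviceCount);   // a GTX 690 shows up as 2 devices

    const int n = 1 << 20;              // elements per GPU (arbitrary)
    for (int dev = 0; dev < deviceCount; ++dev) {
        cudaSetDevice(dev);             // subsequent calls target this GPU
        float *d_data;
        cudaMalloc(&d_data, n * sizeof(float));
        scale<<<(n + 255) / 256, 256>>>(d_data, n, 2.0f);
        cudaDeviceSynchronize();        // wait for this device's work
        cudaFree(d_data);
        printf("finished work on device %d\n", dev);
    }
    return 0;
}
```

The same loop works unchanged whether the two devices are two GTX 680s or the two halves of a GTX 690.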
Conclusion: if you have money to spend, your system supports 2x PCI-e x16 gen3, and you are looking for maximum performance, go for 2x GTX 680; otherwise go for the GTX 690.
Thank you very much Brano, I agree.
I am using 2 GTX 690s for CUDA. Each of the 4 GPUs is only slightly slower than one GTX 680 without overclocking. However, I have noticed the following behavior:
When one GPU within a GTX 690 is also used to drive the display, the computational results are somewhat unstable; that is, the results are not always repeatable from run to run. However, when the display is switched to a different GPU, the original GPU becomes numerically stable while the other GPU (now driving the display) becomes numerically unstable.
I don't know whether this problem is unique to the GTX 690, or whether it also occurs with the GTX 680. Also, so far I have only tested the EVGA brand, so I don't know whether the problem occurs with the ASUS brand as well.
Are you checking return codes from your CUDA calls on the host to make sure they are not failing? Issues with “numerical stability” sound suspiciously like kernels running on the display GPU getting terminated prematurely by the watchdog for taking too long…
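A common way to check this (a sketch; the macro name here is made up, and the SDK's old `CUT_CHECK_ERROR` does something similar) is to query both the launch error and the asynchronous error raised by the kernel itself, which is where a watchdog abort would show up:

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Check both the launch error (bad configuration, etc.) and the
// asynchronous error from the kernel body (e.g. a watchdog abort,
// which is only reported after the kernel has been terminated).
#define CHECK_KERNEL(msg)                                                    \
    do {                                                                     \
        cudaError_t err = cudaGetLastError();                                \
        if (err == cudaSuccess)                                              \
            err = cudaDeviceSynchronize(); /* wait, pick up async errors */  \
        if (err != cudaSuccess) {                                            \
            fprintf(stderr, "%s: %s\n", msg, cudaGetErrorString(err));       \
            exit(EXIT_FAILURE);                                              \
        }                                                                    \
    } while (0)

// Usage after a launch:
//     myKernel<<<grid, block>>>(args);
//     CHECK_KERNEL("myKernel failed");
```

Without the synchronize step, an error from a kernel that was killed mid-execution can go unnoticed until some later, unrelated call.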
Yes, I have checked return codes using the macro CUT_CHECK_ERROR("Kernel execution failed").
CUDA did not report any error, and so I assume that kernels did not abort prematurely.
The instability problem I encounter is intermittent, and affects only the 4th or higher decimal places.
It is somewhat alleviated (occurs less frequently) by increasing the memory clock and reducing the GPU clock.
I am currently doing more tests on this problem, including using the ASUS brand.
The instability problem is resolved!
It turns out that there was a race condition in my code. After fixing it, the results are now completely stable from run to run, independent of which GPU has the display or which brand is used.
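For reference, the classic pattern behind this kind of intermittent, low-decimal-place drift (a made-up illustration, not the poster's actual code) is an unguarded read-modify-write shared across threads:

```cuda
// Race: many threads read, add, and write back concurrently,
// so updates are lost non-deterministically from run to run.
__global__ void sum_bad(const float *x, int n, float *result)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        *result += x[i];          // unguarded read-modify-write
}

// Fixed: atomicAdd serializes the updates so none are lost.
// (float atomicAdd requires compute capability 2.0+; GK104 is 3.0.)
__global__ void sum_ok(const float *x, int n, float *result)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        atomicAdd(result, x[i]);
}
```

Note that floating-point atomics still accumulate in a non-deterministic order, so sums can vary in the last bits between runs; for bitwise-reproducible results a fixed-order tree reduction is the usual approach.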
I tried to install up to 4 GTX 690s on my PC motherboard, a GA-990FXA-UD7, but it would not boot with more than 2 cards installed. (My power supply is 1600W, so that is not the problem.) I asked Gigabyte about this and they said there is a BIOS shadow memory limitation on the motherboard, which limits the number of GTX 690s to 2, even though it has 4 PCIE slots.
Does anyone know of a way to get around this limitation, or of a PC motherboard that has been proven to support up to 4 GTX 690s without the BIOS limitation?
I believe I've seen this board mentioned as being capable of 4-way SLI, so presumably 4 GPUs would be OK:
It is quite expensive though (and huge), and your current processor wouldn't work in it… Not sure if there is an AMD equivalent of the board I mentioned… perhaps others can comment.
Edit: I can’t find any 990FX boards that can FIT 4 GTX 690s!
Edit 2: This person is doing 4-way of GTX 480s on that same mobo you have… http://www.youtube.com/watch?v=Qa3cszgFrrE, presumably 4 GTX 680s would be okay… the problem is that you have GTX 690s…
Also an idea… time to get a GTX Titan? Is this DP CUDA code? If so, even 1 Titan equals the DP performance of those 4 GTX 690s combined.
Note you need a mainboard whose BIOS is capable of 8-way SLI in order to drive four GTX 690 cards.
Good catch, also, I don’t think a board capable of that even exists yet:
That is just for SLI, though… if nasacort just wants the 4 GTX 690s w/o SLI, I believe it is possible… just looking for some board that has been tested to work at the moment.
here are your choices (one choice includes the board I already mentioned):
You might want to verify that the AsRock X79 board indeed supports 4 GTX 690s with the manufacturer… all I’ve found is reports of 4x GTX 680s.
Or just sell those things and get 1 or 2 Titans… easier :)
If the goal is to use the GTX 690 cards for CUDA (and not graphics), then SLI support doesn’t matter. There are plenty of PCI Express 2.0 boards that support 8 GPUs (for example, most X58 workstation motherboards that have enough PCI-Express slots). I’m not sure about PCI-Express 3.0, however.
Thank you all for your input.
The AsRock board looks interesting, but I hesitate to get it unless someone has successfully used it with 4 dual-GPU cards. I prefer to use the PC rather than the X58 server platform.
I use the cards only for CUDA with single precision, but I still like the 6GB and 384-bit interface of Titan. Maybe I’ll sell my GTX 690s.
I've been using an ASUS P8Z77-V Premium with 4 GTX 690s and it has been working well. My main problem has been all the heat the cards put out during long computations. Be sure to use a case with good airflow to the cards; I've been using a CoolerMaster HAF X. When busy, the cards report temperatures between 80 and 98 degrees C, which seems pretty abusive, but after running this way for over six months, none have failed or automatically downclocked themselves.
Thanks for the info. I’ll keep that in mind for my next computer build.