Optimal multi-GPU system

OldMonk · September 6, 2017, 4:00pm

Dear all,
I am building an 8-GPU system for machine learning (SVM) and am looking for guidance on, or confirmation of, whether I am doing this right.
So far, I am thinking the system would consist of the following:
GPUs:
Titan Xp (1)
1080 (5)
1030 (2)
Motherboard:
One that supports NVLink (thinking SuperMicro)

My questions are:

Is it OK to have different types of GPUs, or should they all be the same?
What is the appropriate way to connect the GPUs? Would I need SLI or something similar, given that the motherboard and the cards are all Pascal and would support NVLink?

cbuchner1 · September 6, 2017, 4:04pm

This article seems to imply that you need a Pascal P100 or Volta V100 GPU to use NVLink.

njuffa · September 6, 2017, 4:14pm

All the GPUs on your list are consumer cards with a PCIe gen3 interface, not an NVLink interface.

What kind of system do you plan to build that can support eight GPUs, and where is that system going to be installed? A residential power outlet in the US will typically support up to 1600W and, unsurprisingly, that is the maximum nominal power rating of commonly available PSUs. In practice this means you can reliably drive a load of about 1000W worth of electronic equipment (accounting for spikes in power consumption etc.), which more likely equates to four high-end GPUs than to eight. Note that NVIDIA’s DGX-1 system, which comes with 8 GPUs, has a 3200W power rating.
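As a rough sketch of the power arithmetic above, the following Python snippet adds up nominal board power for the GPU list from the first post and compares it with the roughly 1000W reliable-load figure. The per-card wattages (250W, 180W, 30W) are assumed nominal TDP values for a Titan Xp, GTX 1080, and GT 1030, not measurements, and the rest of the system (CPUs, drives, fans) is ignored.

```python
# Assumed nominal board power per card in watts (TDP-style figures, not measured).
BOARD_POWER_W = {"Titan Xp": 250, "GTX 1080": 180, "GT 1030": 30}

# GPU counts from the original post ("1080" and "1030" are assumed to mean
# GTX 1080 and GT 1030).
planned = {"Titan Xp": 1, "GTX 1080": 5, "GT 1030": 2}

# Load that a single 1600W-class supply can drive reliably, per the post above.
RELIABLE_LOAD_W = 1000


def total_gpu_power(build, power_table):
    """Sum nominal board power over all cards in the build."""
    return sum(power_table[name] * count for name, count in build.items())


gpu_watts = total_gpu_power(planned, BOARD_POWER_W)
fits = gpu_watts <= RELIABLE_LOAD_W
print(f"GPUs alone: {gpu_watts} W; fits within {RELIABLE_LOAD_W} W: {fits}")
# prints "GPUs alone: 1210 W; fits within 1000 W: False"
```

Even before counting the host, the GPUs in this list exceed the budget of a single supply, which is consistent with the point about four versus eight cards.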

OldMonk · September 6, 2017, 4:18pm

This will be installed in a server room, not in a residence. My IT guys do not have experience with such a system, and I have not worked with more than 1 GPU.
So I am trying to figure out how to do this.

njuffa · September 6, 2017, 4:44pm

Danger, Will Robinson! In that case, my recommendation is: Do not build your own system. Buy one from an integrator that partners with NVIDIA: http://www.nvidia.com/object/partner-locator.html.

There are numerous pitfalls when building such ambitious systems. Best of luck to you if you decide to go down the path of building your own.

Just one item to consider: consumer cards typically come with active cooling (built-in fans), which is not what you want in a server enclosure, where one typically uses passively cooled GPUs.

OldMonk · September 6, 2017, 4:58pm

Thank you cbuchner1 and njuffa, this is really helpful.

So is it even worth building a multi-GPU system with consumer cards?
Or would the money be better spent on a system with 1-2 high-end cards (e.g. Quadro GP100)?

As an aside, I just changed my card from an old Quadro FX1800 to a Titan Xp, and I am not seeing any gain in performance.
I suspect that is because the Quadro was not a consumer card …
Would love to hear if you have any comments.

njuffa · September 6, 2017, 5:15pm

That’s a pretty good indication that either

(1) Your application is not actually using GPU acceleration

or

(2) Your application uses GPU acceleration, but is completely bottlenecked by host system performance.
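One rough way to tell these two cases apart is to sample GPU utilization while the application is running. The sketch below is only a heuristic: it parses the CSV output of `nvidia-smi --query-gpu=index,name,utilization.gpu,memory.used --format=csv,noheader,nounits` and lists GPUs that look idle. The helper names, the 5% threshold, and the sample text are all made up for illustration; the sample is not a real measurement.

```python
import subprocess

# nvidia-smi query that prints one CSV line per GPU, without header or units.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,name,utilization.gpu,memory.used",
    "--format=csv,noheader,nounits",
]


def parse_gpu_csv(text):
    """Turn the CSV lines into a list of dicts (hypothetical helper)."""
    rows = []
    for line in text.strip().splitlines():
        index, name, util, mem = [field.strip() for field in line.split(",")]
        rows.append({
            "index": int(index),
            "name": name,
            "util_pct": int(util),       # assumes the GPU reports utilization
            "mem_used_mib": int(mem),
        })
    return rows


def idle_gpus(rows, util_threshold=5):
    """Indices of GPUs whose utilization is below an assumed threshold."""
    return [r["index"] for r in rows if r["util_pct"] < util_threshold]


def query_gpus():
    """Run nvidia-smi once and parse its output (needs an NVIDIA driver)."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_csv(out.stdout)


# Illustrative sample only, shaped like the query output.
SAMPLE = "0, TITAN Xp, 0, 312\n1, GeForce GTX 1080, 97, 7420\n"
print(idle_gpus(parse_gpu_csv(SAMPLE)))  # prints [0]
```

If a GPU stays near 0% for the whole run, case (1) is the likelier explanation. If utilization is nonzero but low or bursty, that points more toward case (2). Calling `query_gpus()` repeatedly during a run gives a better picture than a single sample.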

If you look at the list of NVIDIA partners that I pointed to, you can filter by those who are specializing in deep-learning systems. You could also check out forums dedicated to deep learning frameworks to see what kind of systems people there recommend from first-hand experience.

My general host-system recommendation for applications that are well optimized for the GPU is to go relatively easy on the CPU core count but aggressive on CPU single-thread performance (basically: high base frequency), because you want to avoid getting bottlenecked on the non-parallel portions of your workload (Amdahl’s law). Four CPU cores per GPU is usually sufficient. System memory size should ideally be 4x the total GPU memory, and it should be as fast as you can make it, e.g. four-channel DDR4. For example, NVIDIA’s DGX-1 has 128 GB of GPU memory and 512 GB of system memory. NVMe SSDs often make sense, but could be expensive. For the PSU(s), look for units compliant with 80 PLUS Titanium, or at least 80 PLUS Platinum.
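The rules of thumb above can be written down as a small Python sketch. The function names and defaults are hypothetical and simply encode the figures from this post (four cores per GPU, system memory at 4x total GPU memory). The Amdahl's law helper uses the standard formula, speedup = 1 / ((1 - p) + p / n), where p is the parallel fraction of the work and n is the speedup applied to that fraction.

```python
def host_sizing(num_gpus, gpu_mem_gb, cores_per_gpu=4, mem_ratio=4):
    """Rule-of-thumb host sizing: cores per GPU and system RAM vs. GPU RAM."""
    total_gpu_mem = num_gpus * gpu_mem_gb
    return {
        "cpu_cores": num_gpus * cores_per_gpu,
        "gpu_mem_gb": total_gpu_mem,
        "system_mem_gb": mem_ratio * total_gpu_mem,
    }


def amdahl_speedup(parallel_fraction, n):
    """Overall speedup when only `parallel_fraction` of the work is sped up n times."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / n)


# 8 GPUs with 16 GB each gives the 128 GB / 512 GB split quoted for the DGX-1.
print(host_sizing(8, 16))

# With 10% serial work, an 8x faster parallel part yields well under 8x overall.
print(round(amdahl_speedup(0.9, 8), 2))  # prints 4.71
```

The second number shows why single-thread CPU performance matters: the serial 10% quickly dominates once the parallel part has been accelerated.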

CudaaduC · September 6, 2017, 8:14pm

IMO one is usually better off with 2 high-end GPUs rather than 4-8 moderate performance GPUs. It is generally more reliable and less difficult to troubleshoot problems. This is mainly true if you are building your own system, as a system from a qualified vendor will most likely be configured correctly.

Also, SVMs really do not map as well to parallel architectures as deep learning approaches such as Convolutional Neural Networks. SVMs are also highly susceptible to adversarial inputs, much more so than CNNs.
