Can I link 3090 using nvlink and nvswitch?

202476410arsmart · April 29, 2023, 9:35am

Hi! My group has 128 RTX 3090s. How can we connect them together? I know that something like a DGX A100 connects 8 GPUs together and uses NVSwitch to link them, but can we do this with 3090s? Thanks!!!

Robert_Crovella · April 29, 2023, 1:35pm

RTX 3090 GPUs can be connected pairwise with an NVLink bridge. You cannot use NVSwitch with the RTX 3090.

202476410arsmart · April 29, 2023, 2:54pm

I see! So the 3090 also cannot use NVLink? That means that if I have 128 3090s, the only way to connect them is to use an NV bridge within each pair, and between the 64 pairs we can only use PCIe?
Thanks!!!

Robert_Crovella · April 29, 2023, 6:36pm

NVBridge is NVLink.

Correct. The two GPUs in a pair can communicate with each other using NVLink (over the NV Bridge). When a GPU in a pair wants to communicate with a GPU in another pair (or any other GPU or entity besides its “pair partner”), it will use PCIe.
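To make the pairing rule concrete, here is a minimal Python sketch of it. This is only a toy model, not anything queried from real hardware: the numbering scheme (GPUs 2k and 2k+1 share one bridge) and the names `link_between` and `count_links` are assumptions made up for illustration. It also ignores that 128 GPUs would be spread over several machines, where traffic between machines needs a network hop in addition to PCIe.

```python
from itertools import combinations

NUM_GPUS = 128  # from the question above


def link_between(a: int, b: int) -> str:
    """Return the link two distinct GPUs would use in this toy model.

    Assumption (hypothetical numbering): GPUs 2k and 2k+1 share one
    NVLink bridge; every other combination falls back to PCIe.
    """
    if a == b:
        raise ValueError("a GPU does not need a link to itself")
    return "NVLink" if a // 2 == b // 2 else "PCIe"


def count_links(num_gpus: int) -> dict:
    """Count how many unordered GPU pairs fall on each kind of link."""
    counts = {"NVLink": 0, "PCIe": 0}
    for a, b in combinations(range(num_gpus), 2):
        counts[link_between(a, b)] += 1
    return counts


if __name__ == "__main__":
    print(link_between(0, 1))     # same bridged pair
    print(link_between(1, 2))     # different pairs
    print(count_links(NUM_GPUS))  # how the 128 GPUs split up
```

Under these assumptions, only 64 of the 8128 possible GPU-to-GPU combinations get NVLink, which is why the PCIe path dominates for anything that has to talk across the whole group.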

202476410arsmart · May 3, 2023, 12:02pm

Emmm… So do you know how we should train an LLM on these 128 3090s? Do you have any suggestions? I mean… it seems the communication cost would be huge…

202476410arsmart · May 3, 2023, 12:40pm

Like… with such a large communication cost, how can we use computation to hide it? Are there any solid solutions? Thanks!

Robert_Crovella · May 3, 2023, 3:02pm

Training an LLM is certainly an advanced topic. Nobody at NVIDIA would recommend that you lash together 128 consumer GPUs for serious work in that space. However, the problem of communication hiding is present regardless of the underlying hardware platform.

The current primary NVIDIA solution in this space is NeMo Framework (and, somewhat related, NeMo Service). You can learn more about it using many of the resources already available such as here and here. To get another “view” of it, you can take a look at the foundational work done by NVIDIA research in this space, some of which is published here.

LLM training takes advantage of a number of characteristics of the underlying model training to exploit various avenues of parallelism. Several of these avenues allow computation to overlap with communication, which is a key aspect of communication hiding. You can read more about it in the papers linked from the last link above.
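As a rough illustration of what overlapping computation with communication buys, here is a small Python timing model. It is a sketch under stated assumptions, not how NeMo or any real framework schedules work: it assumes a backward pass that processes layers one after another, a single communication channel, and that a layer's gradient transfer can start as soon as that layer's compute finishes. The function names and the per-layer times are made up for illustration.

```python
def serial_step_time(compute, comm):
    """No overlap: all communication waits until all compute is done."""
    return sum(compute) + sum(comm)


def overlapped_step_time(compute, comm):
    """Overlap: layer i's transfer runs while later layers are computed.

    Assumptions (hypothetical model): one compute stream, one
    communication channel, transfers issued in layer order.
    """
    compute_done = 0.0  # time at which the current layer's compute ends
    comm_done = 0.0     # time at which the channel becomes free again
    for c, m in zip(compute, comm):
        compute_done += c
        # A transfer needs both its gradients and a free channel.
        comm_done = max(comm_done, compute_done) + m
    return comm_done


if __name__ == "__main__":
    compute = [4, 4, 4, 4]    # made-up per-layer compute times
    fast_link = [3, 3, 3, 3]  # made-up transfer times, cheaper than compute
    slow_link = [6, 6, 6, 6]  # made-up transfer times, costlier than compute
    for name, comm in (("fast", fast_link), ("slow", slow_link)):
        print(name, serial_step_time(compute, comm),
              overlapped_step_time(compute, comm))
```

In this model, when each transfer is shorter than the compute it hides behind, only the last transfer is exposed. When transfers are longer than compute, the step becomes bound by the link, and overlap can hide at most the compute time. That second case is the concern with a slow interconnect.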
