4x RTX 3090 setup in a single machine


Can I ask regarding hardware setup in this forum?

In a 2x RTX 3090 setup, I can use a 3- or 4-slot (mainly 4-slot) NVLink bridge to transfer data between the cards.

But suppose I want to set up 4 RTX cards for very heavy rendering computation. I know it is possible with a distributed system: two machines (each with 2 RTX cards) connected over an Ethernet cable.

But if I want to fit all 4 RTX cards in a single machine, is there a specific motherboard available for that? If so, how can I connect all 4 cards? Will NVLink work in this case?

Hello @_Bi2022 ,

I don’t think it is possible to give a generic recommendation on a perfect 4x RTX GPU setup; it completely depends on your use case and budget. While there are, for example, single-CPU enterprise desktop mainboards with 4 PCIe x16 slots available, you might run into power and cooling issues if you try to build the system to normal desktop standards with four RTX 3090s. NVLink is not the limiting factor, as long as all PCIe slots provide full bandwidth. Of course, the CPU itself needs to support the required number of PCIe lanes, which limits your choices to the very high end.
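To illustrate the power concern, here is a back-of-the-envelope sketch. The board power, system allowance, and headroom figures are assumptions for illustration, not specs for any particular build; check your exact components:

```python
# Rough power-budget sketch for a 4x RTX 3090 build.
# Assumptions: ~350 W board power per RTX 3090 (transient spikes can be
# much higher), ~400 W for a high-lane-count CPU, board, drives, and fans,
# and a common ~25% PSU headroom rule of thumb.
GPU_TDP_W = 350
NUM_GPUS = 4
CPU_AND_REST_W = 400
PSU_HEADROOM = 1.25

steady_state = NUM_GPUS * GPU_TDP_W + CPU_AND_REST_W
recommended_psu = steady_state * PSU_HEADROOM
print(steady_state)     # -> 1800 (watts, steady state)
print(recommended_psu)  # -> 2250.0 (watts of PSU capacity)
```

Even with these conservative numbers, the result is well beyond what a normal desktop power supply (and a single household circuit in some countries) comfortably delivers, which is why such builds drift toward workstation or server territory.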

That is why most developers look at high-end graphics workstations or server setups for multi-GPU solutions. Those are sold as enterprise solutions with proper vendor support.

I am sorry if I cannot offer a specific recommendation, but maybe someone else here in the forums had to make that same decision before.



As there are rumors everywhere that RTX prices are dropping, I was thinking of something like a crypto-mining rig, but for real-time ray tracing. Not limited to 4 GPUs; maybe 6 or 8 GPUs. From your answer, I now realize it would not be as easy as I thought. Setting aside the power supply and cooling, I did not clearly understand the data transmission between different GPUs. Did you mean that generic PCIe data transmission would be enough for a multi-GPU setup?

NVLink is not the limiting factor, as long as all PCIe slots provide full bandwidth. Of course, the CPU itself needs to support the required number of PCIe lanes, which limits your choices to the very high end.

Actually, I don’t clearly know yet. I currently have 2 RTX cards and a 4-slot NVLink bridge. I thought NVLink was the only way for two RTX cards to communicate. If I do not use NVLink, can I still transfer data (maybe at a lower transmission rate)?

Hi again!

I am sorry if I was not clear in my mention of NVLink. NVLink for consumer GPUs, that is, the RTX 30xx generation, is limited to the RTX 3090 and to only 2 GPUs connected at a time. That means if you use more GPUs you may run into bandwidth bottlenecks depending on the underlying motherboard and the PCIe lanes, but not because of NVLink.

That said, if you plan to create a multi-GPU system with 4 or more GPUs, inter-GPU communication is of course still possible, but your bandwidth will be limited by PCIe speeds, since that bus is what will be used for data transfer.

For example, on a typical current desktop system with a consumer CPU, you would need to split your PCIe lanes between the GPUs, meaning with 2 GPUs you would get PCIe x8 each, and with 4 GPUs PCIe x4 each. Depending on your workloads, that can still be good enough. For real-time ray tracing, it becomes a question of whether the software actually supports workload balancing across multiple GPUs.
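The lane split above is simple arithmetic. A small sketch, assuming roughly 2 GB/s per PCIe 4.0 lane (an approximation; real usable bandwidth per lane is slightly lower) and the 16 lanes a typical desktop CPU dedicates to graphics:

```python
# Approximate per-GPU PCIe bandwidth when a fixed pool of lanes is split.
# Assumption: PCIe 4.0 at roughly 2 GB/s per lane, 16 CPU lanes for graphics.
PCIE4_GBPS_PER_LANE = 2.0
TOTAL_LANES = 16

for num_gpus in (1, 2, 4):
    lanes_each = TOTAL_LANES // num_gpus
    bw_each = lanes_each * PCIE4_GBPS_PER_LANE
    print(f"{num_gpus} GPU(s): x{lanes_each} each, ~{bw_each:.0f} GB/s per GPU")
# -> 1 GPU(s): x16 each, ~32 GB/s per GPU
# -> 2 GPU(s): x8 each, ~16 GB/s per GPU
# -> 4 GPU(s): x4 each, ~8 GB/s per GPU
```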


It’s all clear now. Thanks a lot for all the explanation.

Just so I understand: suppose there is one machine with 4 GPUs, each at PCIe x4. On the other side, there are two machines with 2 GPUs each (connected via NVLink), set up as a distributed system connected over high-speed Ethernet cables for data transmission. In your experience, which one should be faster for a real-time ray tracing application? I think the second setup would be faster than the first. What do you think?

If we look again at normal desktop machines, the typical network connection is Gigabit Ethernet, meaning 1 Gbit/s, or roughly 0.125 GB/s. In comparison, PCIe 4.0 x16 achieves about 32 GB/s. So no, distributed systems connected over normal Ethernet are rarely faster than systems connected via the PCIe bus. To achieve that, you would need to go to server systems with specialized NICs and switches, but then you would also not be looking at GeForce GPUs.


Thanks a lot, I have a clear idea now.

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.