Ethernet vs. Infiniband

heilaw · January 7, 2019, 9:03pm

I am trying to understand the performance difference between Ethernet and Infiniband.

I am testing a program that requires a lot of communication between GPUs on different nodes. I tested it on two clusters: one uses Infiniband as the interconnect, while the other uses Ethernet. My program ran much slower on the cluster with Ethernet. Will upgrading the Ethernet to 10GbE help? Or is the communication more sensitive to latency, where Infiniband has the advantage?

njuffa · January 7, 2019, 9:58pm

Where two-way communication is required (i.e. pretty much all practical use cases in compute applications), high latency will have a negative impact on effective throughput. This is one reason why low-latency interconnects, and Infiniband in particular, dominate at the high end of the supercomputer space.
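As a rough illustration of why latency eats into effective throughput, here is a minimal sketch using a simple "latency plus size over bandwidth" cost model. The model itself and all numbers in it (the latencies, the link bandwidth, the message sizes) are assumptions chosen for illustration, not measurements of any particular Ethernet or Infiniband hardware.

```python
def effective_throughput(message_bytes, latency_s, bandwidth_bytes_per_s):
    """Bytes per second actually achieved for one message, assuming the
    transfer time is a fixed latency plus size divided by link bandwidth."""
    transfer_time_s = latency_s + message_bytes / bandwidth_bytes_per_s
    return message_bytes / transfer_time_s


# Hypothetical values, for illustration only (not measured on real hardware).
BANDWIDTH = 1.25e9     # 10 Gbit/s expressed in bytes per second
HIGH_LATENCY = 50e-6   # assumed per-message latency of the slower interconnect
LOW_LATENCY = 2e-6     # assumed per-message latency of the faster interconnect

for size in (1_024, 65_536, 4_194_304):
    slow = effective_throughput(size, HIGH_LATENCY, BANDWIDTH)
    fast = effective_throughput(size, LOW_LATENCY, BANDWIDTH)
    print(f"{size:>9} B  high-latency: {slow / 1e6:8.1f} MB/s"
          f"  low-latency: {fast / 1e6:8.1f} MB/s")
```

Under these assumptions, small messages get only a small fraction of the nominal bandwidth on the high-latency link, while large messages come close to it on either link. That is why the message sizes your application actually uses matter.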

Any real-life case will likely also involve cost (or cost effectiveness) as a deciding factor, so as a first step you might want to profile your application on the two clusters you mentioned, paying particular attention to communication patterns. There may be various node characteristics that significantly impact the relative performance of the two clusters (e.g. amount of system memory per node, CPU/GPU balance in a node), so make sure you control for such effects as tightly as possible instead of simply chalking up performance differences to the characteristics of the interconnect.

This is outside my area of expertise, but the following (slightly dated) comparison of 10GbE and IB by the HPC Advisory Council shows how the impact of lower-latency IB can differ substantially by application, which is why it is important to understand the characteristics of your app(s):

http://www.hpcadvisorycouncil.com/pdf/IB_and_10GigE_in_HPC.pdf

As far as I know, the HPC Advisory Council is an organization that counts pretty much all major equipment suppliers in the HPC space (including NVIDIA) among its members, so this likely provides a fair comparison.
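To make the "differs by application" point concrete, here is a small sketch comparing two made-up applications that move the same total amount of data, one with many small messages and one with a few large ones. Everything in it is an assumption for illustration: the cost model (compute time plus, per message, a fixed latency and size over bandwidth, with no overlap of compute and communication), the application profiles, and the latency and bandwidth figures. None of it comes from the linked comparison.

```python
def runtime_s(compute_s, n_messages, message_bytes, latency_s, bandwidth):
    """Total run time, assuming communication does not overlap with compute
    and each message costs latency + size / bandwidth."""
    per_message_s = latency_s + message_bytes / bandwidth
    return compute_s + n_messages * per_message_s


# Hypothetical interconnect parameters (same bandwidth, different latency).
BANDWIDTH = 1.25e9     # bytes per second
HIGH_LATENCY = 50e-6   # seconds
LOW_LATENCY = 2e-6     # seconds

# Hypothetical applications: (compute seconds, message count, message size).
apps = {
    "many small messages": (10.0, 1_000_000, 1_024),
    "few large messages": (10.0, 1_000, 1_048_576),
}

slowdown = {}
for name, (compute_s, n_messages, message_bytes) in apps.items():
    slow = runtime_s(compute_s, n_messages, message_bytes, HIGH_LATENCY, BANDWIDTH)
    fast = runtime_s(compute_s, n_messages, message_bytes, LOW_LATENCY, BANDWIDTH)
    slowdown[name] = slow / fast
    print(f"{name}: {slow:.2f} s vs {fast:.2f} s (ratio {slowdown[name]:.2f})")
```

In this toy model the application with many small messages slows down by several times on the high-latency link, while the one with a few large messages barely notices. Profiling shows which of these regimes your own application is closer to.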
