Poor performance with ConnectX5

torkil · December 16, 2022, 11:49am

Hi

I have two SuperMicro servers running RHEL, each with one of these cards:

Device type: ConnectX5
Name: MCX516A-CCA_Ax
Description: ConnectX-5 EN network interface card; 100GbE dual-port QSFP28; PCIe3.0 x16; tall bracket; ROHS R6

They are configured as an LACP bond on the hosts, with 40G optics:

"

ethtool bond0

Settings for bond0:
Supported ports: [ ]
Supported link modes: Not reported
Supported pause frame use: No
Supports auto-negotiation: No
Supported FEC modes: Not reported
Advertised link modes: Not reported
Advertised pause frame use: No
Advertised auto-negotiation: No
Advertised FEC modes: Not reported
Speed: 80000Mb/s
Duplex: Full
Auto-negotiation: off
Port: Other
PHYAD: 0
Transceiver: internal
Link detected: yes
"

iperf3 -i 5 -s
iperf3 -i 5 -t 60 -c beast.drcmr

"
[root@beauty ~]# iperf3 -i 5 -t 60 -c beast.drcmr
Connecting to host beast.drcmr, port 5201
[  5] local 172.21.15.51 port 36664 connected to 172.21.15.72 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-5.00   sec  2.04 GBytes  3.50 Gbits/sec    0   2.34 MBytes
[  5]   5.00-10.00  sec  2.06 GBytes  3.55 Gbits/sec    0   3.11 MBytes
[  5]  10.00-15.00  sec  2.06 GBytes  3.53 Gbits/sec    0   3.11 MBytes
[  5]  15.00-20.00  sec  1.96 GBytes  3.37 Gbits/sec    0   3.11 MBytes
[  5]  20.00-25.00  sec  1.96 GBytes  3.38 Gbits/sec    0   3.11 MBytes
[  5]  25.00-30.00  sec  1.96 GBytes  3.37 Gbits/sec    0   3.11 MBytes
[  5]  30.00-35.00  sec  1.96 GBytes  3.37 Gbits/sec    0   3.11 MBytes
[  5]  35.00-40.00  sec  1.93 GBytes  3.32 Gbits/sec    0   3.11 MBytes
[  5]  40.00-45.00  sec  2.01 GBytes  3.45 Gbits/sec    0   3.11 MBytes
[  5]  45.00-50.00  sec  2.00 GBytes  3.44 Gbits/sec    0   3.11 MBytes
[  5]  50.00-55.00  sec  1.98 GBytes  3.40 Gbits/sec    0   3.11 MBytes
[  5]  55.00-60.00  sec  1.96 GBytes  3.37 Gbits/sec    0   3.11 MBytes

[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-60.00  sec  23.9 GBytes  3.42 Gbits/sec    0             sender
[  5]   0.00-60.04  sec  23.9 GBytes  3.42 Gbits/sec                  receiver
"

Any suggestions as to why? Those numbers seem way off.

Thanks.

Mvh.

Torkil
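
For scale, the iperf3 summary in the post above can be compared with the link speed. This is a minimal sketch, assuming the bond uses one of the usual per-flow transmit hash policies, so that a single TCP stream is carried by one 40G member link rather than by the 80000Mb/s aggregate that ethtool reports. The function names are illustrative, and the two summary lines are copied from the output above.

```python
import re

# Summary lines copied from the iperf3 run in the post above.
SUMMARY = """\
[ 5] 0.00-60.00 sec 23.9 GBytes 3.42 Gbits/sec 0 sender
[ 5] 0.00-60.04 sec 23.9 GBytes 3.42 Gbits/sec receiver
"""


def receiver_gbits(text):
    """Return the receiver-side bitrate in Gbits/sec from iperf3 text output."""
    for line in text.splitlines():
        if line.rstrip().endswith("receiver"):
            match = re.search(r"([\d.]+)\s+Gbits/sec", line)
            if match:
                return float(match.group(1))
    raise ValueError("no receiver summary line found")


def percent_of_link(gbits, link_gbits):
    """Express a measured bitrate as a percentage of a nominal link speed."""
    return 100.0 * gbits / link_gbits


if __name__ == "__main__":
    rate = receiver_gbits(SUMMARY)
    # Assumption: one TCP flow hashes onto a single 40G bond member.
    print(f"{rate} Gbits/sec is {percent_of_link(rate, 40):.1f}% of one 40G member link")
    print(f"{rate} Gbits/sec is {percent_of_link(rate, 80):.1f}% of the 80G bond aggregate")
```

Either way, the single-stream result is under a tenth of what one member link can nominally carry.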

ssimcoejr · December 19, 2022, 10:37pm

Hello Torkil,

Thank you for posting your inquiry to the NVIDIA Developer Forums.

We do not recommend using iperf3 for TCP benchmarking on Linux hosts.
iperf3 lacks several features that iperf2 contains, such as multithreading (and multicast test capabilities).
Multithreaded (parallel) testing across multiple cores is a much more realistic indication of the real-world throughput you should expect than single-stream, single-thread performance.

A quick example of iperf2 testing, using 8 cores, can be found here:
https://enterprise-support.nvidia.com/s/article/howto-install-iperf-and-test-mellanox-adapters-performance
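
As a rough illustration of such a parallel run, the sketch below builds an iperf2 client command line with several streams. It assumes iperf2 is installed as `iperf` on both hosts and that the server side is already running `iperf -s`. The helper name and its defaults are hypothetical and are not taken from the linked article; the 8 streams simply mirror the "8 cores" mentioned above, and the host name comes from the original post.

```python
import shlex


def iperf2_client_cmd(server, streams=8, seconds=60, interval=5):
    """Build an iperf2 client command that runs several parallel TCP streams.

    -c  server to connect to
    -P  number of parallel client streams
    -t  test duration in seconds
    -i  reporting interval in seconds
    """
    if streams < 1:
        raise ValueError("streams must be >= 1")
    return [
        "iperf",
        "-c", server,
        "-P", str(streams),
        "-t", str(seconds),
        "-i", str(interval),
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(iperf2_client_cmd("beast.drcmr")))
```

The resulting list can be passed to `subprocess.run` unchanged. With `-P`, iperf2 reports each stream separately and adds a `[SUM]` line for the aggregate.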

If you are still experiencing lower-than-expected throughput while using iperf2, we would recommend reviewing our comprehensive host tuning guide, available here:
https://enterprise-support.nvidia.com/s/article/performance-tuning-for-mellanox-adapters

That guide covers general OS tuning guidelines as well as Mellanox-specific tuning guidelines.
We also discuss the importance of NUMA-locality and provide instructions for pinning your applications to local CPU cores (https://enterprise-support.nvidia.com/s/article/understanding-numa-node-for-performance-benchmarks).
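
To make the NUMA point concrete, the sketch below looks up which CPU cores are local to a network adapter by reading Linux sysfs. It is a hedged illustration, not the procedure from the linked article: the function names are hypothetical, and the interface name in the usage comment (`ens1f0`) is a placeholder for a physical bond member, since a bond device such as `bond0` has no PCI device entry of its own.

```python
from pathlib import Path


def parse_cpulist(text):
    """Expand a kernel cpulist string such as '0-3,8,10-11' into a list of ints."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def local_cpus(iface, sysfs=Path("/sys")):
    """Return the CPUs on the NUMA node of a physical NIC, or None if unknown."""
    node_file = sysfs / "class" / "net" / iface / "device" / "numa_node"
    node = int(node_file.read_text().strip())
    if node < 0:
        # The kernel reports -1 when no NUMA affinity is known for the device.
        return None
    cpulist = sysfs / "devices" / "system" / "node" / f"node{node}" / "cpulist"
    return parse_cpulist(cpulist.read_text())


# Example usage (placeholder interface name, requires a Linux host):
#   cpus = local_cpus("ens1f0")
#   if cpus:
#       print(",".join(map(str, cpus)))
```

The printed list can then be handed to a pinning tool, for example `taskset -c <list> iperf -s`, so the benchmark runs on cores local to the adapter.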

If after following these guidelines you are still not able to reach line rate (or near line rate), and you have valid Enterprise support entitlement, we would recommend engaging our Enterprise support team via the NVIDIA Enterprise support portal (https://enterprise-support.nvidia.com/s/create-case).

Thanks, and have a great day;
NVIDIA Enterprise Support

torkil · December 20, 2022, 12:54pm

Hi

Thanks for the links. iperf2 does indeed show reasonable numbers for all but 2 hosts.

Can you also provide a link to the mlnx_tune script? I find it referenced a lot but with broken links.

Mvh.

Torkil

ssimcoejr · December 20, 2022, 2:07pm

Hi Torkil,

The mlnx_tune script is bundled with MLNX_OFED. This is our proprietary driver stack.

You can also find mlnx_tune on the Mellanox userland tools and scripts GitHub:

(It’s within the Python directory)

HTH,
NVIDIA Enterprise Support

daslolo · January 2, 2023, 2:41pm

What’s the Windows version of mlnx_tune?
I’m seeing 56 Gb/s on a 100GbE adapter with iperf2.

system · January 16, 2023, 2:42pm

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
