I'm trying to reach 100G over 2 directly connected MCX556A cards. I am using OFED 5.4.1 on CentOS 7.9 with the stock kernel (3.10.0-1160).
I have executed mlnx_tune and set additional parameters:
sysctl net.ipv4.tcp_rmem="4096 87380 2147483647"
sysctl net.ipv4.tcp_wmem="4096 65536 2147483647"
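To be sure the values actually took effect, I read them back afterwards:
sysctl net.ipv4.tcp_rmem net.ipv4.tcp_wmem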
The hosts are 2x AMD EPYC 7542 with 1 TB of memory; htop and top show 1-2% utilization during the tests. The CPUs are configured for 4 NUMA nodes, and the adapter is bound to the corresponding node. The adapter is connected via PCIe 4.0 x16. RPS and XPS CPU masks are pinned.
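This is how I verified the slot and the NUMA binding (the PCI address 0000:41:00.0 is a placeholder for the card's address from lspci):
lspci -s 0000:41:00.0 -vv | grep LnkSta            # should report Speed 16GT/s, Width x16 for PCIe 4.0 x16
cat /sys/bus/pci/devices/0000:41:00.0/numa_node    # NUMA node the adapter is attached to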
The eth interfaces are set to MTU 9000.
I'm testing with iperf, iperf3 and raw_ethernet_bw. The maximum I was able to achieve was 73 Gbit/s. iperf and iperf3 are run as separate processes; I have tried from 2 to 8 processes, with the same result every time. iperf3 does report some retries, but only around 200-500 for a 30-second test.
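For reference, the parallel iperf3 tests look roughly like this (the ports, the address 192.168.100.2 and NUMA node 1 are placeholders for my setup; iperf3 needs one server process per port):
for p in 5201 5202 5203 5204; do iperf3 -s -p $p & done                                   # on the receiver
for p in 5201 5202 5203 5204; do numactl -N 1 iperf3 -c 192.168.100.2 -p $p -t 30 & done  # on the sender, pinned to the NIC's NUMA node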
I did the same test with a switch in between (Dell S5232F-ON); there the retries were much higher, around 50k.
I have also tested with the link speed reduced to 50G and 25G, and both times I can reach close to the maximum (46.3 Gbit/s and 23.2 Gbit/s respectively) - so at 100G I would expect 4x 23.2 Gbit/s, i.e. around 92.8 Gbit/s.
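The link speed change itself was done with ethtool (the interface name ens1f0 is a placeholder; 25000 instead of 50000 for the 25G test):
ethtool -s ens1f0 speed 50000 autoneg off   # force 50G
ethtool ens1f0 | grep -i speed              # verify the negotiated speed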
Locally (over the lo interface) I can easily reach 190 Gbit/s send/receive.
I have followed the performance tuning guidelines.
I still plan to test a different cable, but mlxlink doesn't report any issues.
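The link check I ran is roughly this (mt4119_pciconf0 is what the ConnectX-5 shows up as under mst on my system; the name may differ, and ens1f0 is again a placeholder):
mst start
mlxlink -d /dev/mst/mt4119_pciconf0 -m -c          # module/cable info and physical-layer counters
ethtool -S ens1f0 | grep -iE 'discard|pause|err'   # cross-check for drops and pause frames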
What else can be checked? How can I find out WHAT is limiting the performance here?
How can I test a loopback configuration?