Hello,
We are trying to set up two Linux servers with Mellanox ConnectX-4 NICs. The servers are connected to each other via a VXLAN tunnel, and we are having issues with throughput.
In tests with standard IP forwarding we reached up to 39 Gbit/s, but over the VXLAN tunnel we only get around 10 Gbit/s.
CPU usage with IP forwarding is less than 5%. During the VXLAN test, a single core on the receiving server is utilized at 100%; the process consuming the CPU is “ksoftirqd/14”.
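In case it helps anyone reproduce the observation, here is a minimal sketch for checking how receive softirqs are spread across cores. It assumes the usual /proc/softirqs layout with columns in CPU0, CPU1, ... order. The helper name net_rx_per_cpu is ours, purely for illustration.

```shell
# Print one "CPU<n> <count>" line per core for the NET_RX softirq.
# The optional argument is the file to read; it defaults to /proc/softirqs.
# Assumption: the columns appear in CPU0, CPU1, ... order.
net_rx_per_cpu() {
    awk '$1 == "NET_RX:" { for (i = 2; i <= NF; i++) printf "CPU%d %s\n", i - 2, $i }' "${1:-/proc/softirqs}"
}

# Example: take two samples a second apart while the VXLAN test runs and compare.
# net_rx_per_cpu; sleep 1; net_rx_per_cpu
```

If only one counter grows between the two samples, the tunnel traffic is presumably being processed on a single receive queue.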
We noticed that VXLAN offload is enabled on both servers, as per the documentation:
[root@frr-lab ~]# ls /sys/kernel/debug/mlx5/0000:05:00.0/VXLAN/
4789
[root@frr-lab2 ~]# ls /sys/kernel/debug/mlx5/0000:05:00.0/VXLAN/
4789
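The kernel-side tunnel offload flags can also be checked with ethtool. Below is a rough sketch that filters the output of `ethtool -k` down to the UDP-tunnel and GRO lines. The helper name tunnel_offloads and the interface name ens1f0 are placeholders, and the exact feature names may differ between kernel and driver versions.

```shell
# Filter `ethtool -k` output (read from stdin) down to the UDP-tunnel and GRO lines.
# Assumption: feature names contain "udp_tnl" or "udp_tunnel", as on recent kernels.
tunnel_offloads() {
    grep -E 'udp_tnl|udp_tunnel|generic-receive-offload'
}

# Example (ens1f0 is a placeholder interface name):
# ethtool -k ens1f0 | tunnel_offloads
```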
We also ran “mlnx_tune -p HIGH_THROUGHPUT” on both servers, disabled irqbalance, and used the set_irq_affinity.sh script to bind multiple cores to the NIC.
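To double-check that the affinity settings took effect, something like the following sketch could be used. It assumes the NIC's interrupt lines in /proc/interrupts contain the string "mlx5"; the helper name irqs_for is hypothetical.

```shell
# List the IRQ numbers whose /proc/interrupts line matches the given pattern.
# $1: pattern (e.g. mlx5), $2: file to read, defaulting to /proc/interrupts.
irqs_for() {
    awk -v pat="$1" '$0 ~ pat { sub(/:$/, "", $1); print $1 }' "${2:-/proc/interrupts}"
}

# Example: show which cores each matching IRQ is allowed to run on.
# for irq in $(irqs_for mlx5); do
#     printf '%s: %s\n' "$irq" "$(cat /proc/irq/"$irq"/smp_affinity_list)"
# done
```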
Below you can find some more information regarding our servers:
OS: Fedora 28
Kernel: 4.16.3-301.fc28.x86_64
Mellanox OFED Driver: mlnx-en-4.5-1.0.1.0-fc28-x86_64
ConnectX-4 Firmware: 14.24.1000
System Resources (per server):
2 x Intel(R) Xeon(R) CPU E5-2690 0 @ 2.90GHz
128 GB of Memory
VXLAN is configured as per your documentation via a standard Linux Bridge.
Thank you very much for your time