Can't achieve 400Gbps using ConnectX-7

Hi,

I have two Nvidia ConnectX-7 200GbE / NDR200 / MCX755106AC-HEAT network cards
installed in two Lenovo ThinkSystem SR630 V3 servers (PCIe Gen 5).
These cards are connected directly to each other (back-to-back) on both ports using two QSFP56 200Gbps cables (link: https://marketplace.nvidia.com/en-us/enterprise/networking/200gbeqsfp56cables/).

I am running Ubuntu 24.04 on both servers and transmitting traffic (iperf2, 16 TCP streams) from server 1 to server 2 on both ports (2 network interfaces) in parallel.
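
Roughly, the test looks like this (the addresses and the 60-second duration are placeholders to illustrate the setup; each port sits on its own subnet):

# on server 2, one iperf2 server bound to each port's address
iperf -s -B 192.168.1.2 &
iperf -s -B 192.168.2.2 &

# on server 1, 16 TCP streams per port, both ports in parallel
iperf -c 192.168.1.2 -P 16 -t 60 &
iperf -c 192.168.2.2 -P 16 -t 60 &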

I can’t seem to exceed a total of 100Gbps across the two ports:

  • 1 network interface on its own achieves 100 Gbps
  • 2 network interfaces together achieve 50 Gbps each (100 Gbps total)

According to the spec, the two 200GbE ports together should be able to deliver 400Gbps over Ethernet.

Things I’ve done:

  1. Updated the firmware to the latest versions:

    • FW: 28.43.2026
    • PXE: 3.7.0500
    • UEFI: 14.36.0021
  2. Increased the MTU on both interfaces to 9000 (see the sketch after this list)

  3. sysfs indicates the correct speed on both network interfaces:

$ cat /sys/class/net/ens1f0np0/speed
200000
  4. ethtool indicates a 200Gbps link speed on each network interface:
$ ethtool ens2f0np0
Settings for ens2f0np0:
        Supported ports: [ Backplane ]
...
        Supports auto-negotiation: Yes
...
        Advertised pause frame use: Symmetric
        Advertised auto-negotiation: Yes
        Advertised FEC modes: RS
        Link partner advertised link modes:  Not reported
        Link partner advertised pause frame use: No
        Link partner advertised auto-negotiation: Yes
        Link partner advertised FEC modes: Not reported
        Speed: 200000Mb/s
        Duplex: Full
        Auto-negotiation: on
        Port: Direct Attach Copper
        PHYAD: 0
        Transceiver: internal
        Supports Wake-on: d
        Wake-on: d
        Link detected: yes
  5. mlxconfig output: https://pastebin.com/raw/RLKckxH8
  6. Checked all tips from ESPCommunity
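
For reference, increasing the MTU (step 2 above) can be done along these lines; this is just a minimal sketch using the interface names from the outputs above:

ip link set dev ens1f0np0 mtu 9000
ip link set dev ens2f0np0 mtu 9000
ip link show ens1f0np0 | grep -o 'mtu [0-9]*'   # should report mtu 9000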

Any insights or suggestions would be greatly appreciated.

Thanks!

Hi,

I do not have access to exactly the same gear, but it seems I can achieve ‘almost full’ throughput, roughly 88% (~350 Gbit/s), with the config below:

MCX755106AS-HEA_Ax/FW 28.39.3004
AlmaLinux release 9.4 (Shamrock Pampas Cat)

The cards are configured as an LACP bond with a layer3+4 hash policy to suit the multiple parallel TCP streams:

grep -P 'Mode|Hash|Speed' /proc/net/bonding/bond2 
Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer3+4 (1)
Speed: 200000 Mbps
Speed: 200000 Mbps

MTU set at 9k
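
In case it helps, here is a rough nmcli sketch of that kind of bond on AlmaLinux; the connection names, interface names, and address are only examples (not my exact setup), and the equivalent bond has to exist on both hosts:

# bond with 802.3ad (LACP) and a layer3+4 transmit hash
nmcli con add type bond con-name bond2 ifname bond2 \
    bond.options "mode=802.3ad,xmit_hash_policy=layer3+4,miimon=100"
nmcli con mod bond2 802-3-ethernet.mtu 9000 ipv4.method manual ipv4.addresses 192.168.10.1/24

# enslave the two ConnectX-7 ports
nmcli con add type ethernet con-name bond2-p0 ifname ens1f0np0 master bond2 slave-type bond
nmcli con add type ethernet con-name bond2-p1 ifname ens1f1np1 master bond2 slave-type bond

# bring everything up
nmcli con up bond2-p0
nmcli con up bond2-p1
nmcli con up bond2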

As to how I tested, I used iperf3 as follows:

for i in {1..30}; do echo "iperf3 -s -p 500$i &"; done|sh
vs
for i in {1..30}; do echo "iperf3 -i1 -P 16 -t 300 -p 500$i -c $IPADDRESS &"; done|sh

You could also consider using RDMA to test the connection(s). Something like ib_send_bw might do nicely, as the TCP test above can easily be CPU-bound.
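
For example, a basic ib_send_bw run from the perftest package could look roughly like this; the device name, address, and the choice of -R (rdma_cm connection setup, usually the easiest with RoCE) are my assumptions, so adjust as needed:

# on the receiving host
ib_send_bw -d mlx5_0 -R --report_gbits

# on the sending host, pointing at the receiver's address on the ConnectX-7 interface
ib_send_bw -d mlx5_0 -R --report_gbits 192.168.10.2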


Vesa