How to aggregate CX5 VF bandwidth in KVM

I have two x86_64 servers with identical hardware configurations. Each has a 100Gb CX5 in PCIe slot 1. Both servers and the VMs are running OEL7U8 with the UEK5 kernel. I intend to aggregate the bandwidth of the two CX5 ports in both the hosts and the VMs, so I enabled SR-IOV and created VFs on both hosts:

[rpmem@scaoda8m020 ~]$ lspci |grep Mella
af:00.0 Ethernet controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex]
af:00.1 Ethernet controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex]
af:00.2 Ethernet controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex Virtual Function]
af:00.3 Ethernet controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex Virtual Function]
af:00.6 Ethernet controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex Virtual Function]
af:00.7 Ethernet controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex Virtual Function]
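For reference, the VFs were created through the standard sysfs SR-IOV interface, roughly as follows (a sketch; on CX5 the firmware also needs SR-IOV enabled, e.g. via mlxconfig, before this works):

# two VFs per PF, using the PCI addresses shown above
echo 2 > /sys/bus/pci/devices/0000:af:00.0/sriov_numvfs
echo 2 > /sys/bus/pci/devices/0000:af:00.1/sriov_numvfs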

While creating the VFs I noticed that:

af:00.2 and af:00.3 are the VFs of the PF at af:00.0

af:00.6 and af:00.7 are the VFs of the PF at af:00.1
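That mapping can be confirmed from sysfs, since each PF exposes virtfn* symlinks pointing at its own VFs:

ls -l /sys/bus/pci/devices/0000:af:00.0/virtfn*   # -> 0000:af:00.2, 0000:af:00.3
ls -l /sys/bus/pci/devices/0000:af:00.1/virtfn*   # -> 0000:af:00.6, 0000:af:00.7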

So I passed af:00.2 and af:00.6 through to the VM on each host via PCI passthrough, and in the VM they look like this:

[root@scaoda8m020c2n1 network-scripts]# lspci |grep Mella
00:08.0 Ethernet controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex Virtual Function]
00:09.0 Ethernet controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex Virtual Function]

So I believe ens8 in the VM is af:00.2 (port 1) on the host, and ens9 is af:00.6 (port 2).
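For reference, the passthrough was done with a libvirt hostdev definition, roughly like this for af:00.2 (the VM name is a placeholder; the second VF uses function='0x6'):

cat > vf-port1.xml <<'EOF'
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x0000' bus='0xaf' slot='0x00' function='0x2'/>
  </source>
</hostdev>
EOF
virsh attach-device <vm-name> vf-port1.xml --config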

I have QSFP28 cables directly connecting port 1 to port 1 and port 2 to port 2 of the CX5s on the two physical servers.

I want to aggregate the bandwidth of the two VF ports to get better throughput (~200Gb?), so I bonded them together with mode=4: BONDING_OPTS="mode=4 miimon=100 lacp_rate=1"
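For reference, the full bond configuration in each VM is the usual ifcfg setup, roughly as below (the IP address is just an example; ifcfg-ens9 mirrors ifcfg-ens8):

# /etc/sysconfig/network-scripts/ifcfg-bond0
DEVICE=bond0
TYPE=Bond
BONDING_MASTER=yes
BONDING_OPTS="mode=4 miimon=100 lacp_rate=1"
BOOTPROTO=none
IPADDR=192.168.100.11
PREFIX=24
ONBOOT=yes

# /etc/sysconfig/network-scripts/ifcfg-ens8
DEVICE=ens8
TYPE=Ethernet
MASTER=bond0
SLAVE=yes
BOOTPROTO=none
ONBOOT=yes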

But with this configuration the two VFs (one in each VM) cannot ping each other. In fact, it also kills the bonded interface on the KVM hosts. However, if I bond them with mode=1:

BONDING_OPTS="mode=active-backup miimon=100 primary=ens8"

then they can ping each other successfully. Since mode=1 is active-backup, I believe only ens8 is actually carrying traffic. I also checked by bringing up only ens8 or only ens9 (without bonding) in both VMs, and they can ping each other successfully, so I think the individual links are fine. That leaves the aggregation algorithm as the suspect. Is 802.3ad aggregation supported on the CX5? If this mode is not supported, please suggest which mode I should use to aggregate the bandwidth of the two CX5 ports. Please help me configure the VFs (from the two ports) in the VMs to get the maximum throughput.
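For what it's worth, with mode=4 the state of the LACP negotiation can be checked from inside each VM via the bonding proc file:

cat /proc/net/bonding/bond0
# in 802.3ad mode this shows an "802.3ad info" section and per-slave
# aggregator/partner details; if the link partner never answers the
# LACPDUs, no aggregator forms and the bond passes no traffic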

Thanks,

Ken

Hello Ken,

Thank you for posting your inquiry on the NVIDIA Networking Community.

Based on the information provided, we would like to continue handling this issue through a regular support case, as you have a valid support contract.

We will reach out to you through the new support ticket to continue with this request.

Thank you and regards,

~NVIDIA Networking Technical Support