Could different flows impact each other when using ConnectX-3 Pro SR-IOV?

I am currently using a ConnectX-3 Pro and have enabled SR-IOV with 12 VFs. I used 2 of the VFs, each bound to a Virtual Network Function (VNF). I sent 2 flows to the NIC, each with a different destination MAC address, so each flow went to a different VNF. I found two things that I could not understand:

  1. When I slowed down the processing of one VNF and made it experience packet loss, the other VNF also experienced packet loss.
  2. When I increased the traffic rate of one flow, and made it large enough to experience packet loss, the other flow also experienced packet loss.

I was wondering whether SR-IOV does not fully isolate traffic between VFs. Could anyone help me understand how SR-IOV works on the Mellanox ConnectX-3 Pro NIC? Thanks!
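
For reference, this is roughly how SR-IOV VFs are enabled on a ConnectX-3 Pro; the exact mst device path and config file name below are illustrative, not copied from my setup:

$ sudo mlxconfig -d /dev/mst/mt4103_pciconf0 set SRIOV_EN=1 NUM_OF_VFS=12   # enable SR-IOV in the NIC firmware
$ echo "options mlx4_core num_vfs=12 probe_vf=12" | sudo tee /etc/modprobe.d/mlx4_sriov.conf
$ sudo /etc/init.d/openibd restart   # reload the mlx4 driver so the VFs appear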

Hello Junzhi -

Please provide exactly:

  • OS version.
  • Mellanox NIC FW version.
  • MOFED/WINOF version.
  • The test commands you used.
  • The output you viewed.
  • How your servers are connected.
  • Any other pertinent information.
  • ETH or IB

Also, you may want to consider opening a case with Mellanox support: Contact NVIDIA Networks | NVIDIA

Thank you

~Steve

Hi Steve,

  • OS version is Ubuntu 18.04.1, Linux 4.15.0-43-generic
  • Mellanox NIC FW version is 2.42.5000
  • MOFED version is 4.4-2.0-7.0
  • I am using ETH

Here are details of my test:

I have two servers, A and B. I set the number of VFs on server A to 12. I ran 2 network functions (X and Y) on A on different cores, each bound to a VF. I ran a packet generator on server B and sent traffic to server A through a switch (I have verified that the switch is not the problem). The packet generator sent two flows, with destination MAC addresses set to the MAC addresses of the two VFs corresponding to the two network functions on server A. Each network function also prints the number of packets it has processed.
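
In case it is useful, this is how the MAC address of each VF can be checked on server A (the PF interface name "ens1" is a placeholder):

$ ip link show ens1   # the PF; the "vf 0 MAC ...", "vf 1 MAC ..." lines list each VF and its MAC address
$ ip link show        # the VF netdevs themselves (while still bound to the kernel driver) also show their MAC addresses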

In the beginning, I sent the 2 flows at a low traffic rate, and neither network function experienced any packet loss. Then I lowered the processing throughput of network function Y so that it could no longer handle the traffic rate it received. The results showed that both flows experienced packet loss before being processed by network functions X and Y.

I also tried increasing the traffic rate of the flow to network function Y instead. The results were the same.

Furthermore, I also tried binding network function X to another NIC on server A, so that X and Y were bound to 2 different physical NICs. I repeated the 2 scenarios above, and no packet loss was detected for X.
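
To narrow down where the packets are dropped, the interface counters on server A can be compared before and after a run (the interface name is again a placeholder, and the exact counter names depend on the mlx4_en driver version):

$ ip -s link show ens1                          # PF statistics; watch the RX "dropped" column
$ ethtool -S ens1 | grep -i -E "drop|discard"   # per-port/per-queue drop counters, if the driver exposes them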

Thanks,

Junzhi

Hello Junzhi -

Please provide exactly:

  • The test commands you used.
  • The output you viewed.
  • How your servers are connected.
  • Any other pertinent information.
  • ETH or IB

Many thanks -

~Steve

Hi Steve,

Sorry for the late reply.

I am using ETH.

I have 2 servers, A and B, connected by a switch. I am using a Mellanox ConnectX-3 Pro NIC on server A, and 2 separate NICs on server B.

I run two NFs (X and Y) on server A, each bound to a VF of the Mellanox NIC.

I run MoonGen on server B and make it send two flows: one flow to NF X, and the other to NF Y.

Both NFs send the packets back to MoonGen.

I start with a low traffic rate: each flow is 1500 Mbps.

$ sudo ./build/MoonGen examples/l2-id.lua 0 1 --rate1 1500 --rate2 1500

[INFO] Initializing DPDK. This will take a few seconds…

EAL: Detected 12 lcore(s)

EAL: No free hugepages reported in hugepages-1048576kB

EAL: Probing VFIO support…

EAL: PCI device 0000:04:00.0 on NUMA socket 0

EAL: probe driver: 8086:10fb net_ixgbe

EAL: PCI device 0000:04:00.1 on NUMA socket 0

EAL: probe driver: 8086:10fb net_ixgbe

[INFO] Found 2 usable devices:

Device 0: 90:E2:BA:86:78:D8 (Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection)

Device 1: 90:E2:BA:86:78:D9 (Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection)

PMD: ixgbe_dev_link_status_print(): Port 0: Link Down

PMD: ixgbe_dev_link_status_print(): Port 1: Link Up - speed 0 Mbps - half-duplex

[INFO] Waiting for devices to come up…

[INFO] Device 1 (90:E2:BA:86:78:D9) is up: 10000 MBit/s

[INFO] Device 0 (90:E2:BA:86:78:D8) is up: 10000 MBit/s

[INFO] 2 devices are up.

[Device: id=0] RX: 2.17 Mpps, 1459 Mbit/s (1807 Mbit/s with framing)

[Device: id=1] RX: 2.15 Mpps, 1444 Mbit/s (1788 Mbit/s with framing)

[Device: id=0] TX: 2.17 Mpps, 1456 Mbit/s (1803 Mbit/s with framing)

[Device: id=1] TX: 2.15 Mpps, 1444 Mbit/s (1788 Mbit/s with framing)

[Device: id=0] RX: 2.23 Mpps, 1498 Mbit/s (1855 Mbit/s with framing)

[Device: id=1] RX: 2.24 Mpps, 1503 Mbit/s (1860 Mbit/s with framing)

[Device: id=0] TX: 2.24 Mpps, 1503 Mbit/s (1860 Mbit/s with framing)

[Device: id=1] TX: 2.24 Mpps, 1503 Mbit/s (1860 Mbit/s with framing)

[Device: id=0] RX: 2.24 Mpps, 1503 Mbit/s (1860 Mbit/s with framing)

[Device: id=1] RX: 2.24 Mpps, 1503 Mbit/s (1860 Mbit/s with framing)

[Device: id=0] TX: 2.24 Mpps, 1503 Mbit/s (1860 Mbit/s with framing)

[Device: id=1] TX: 2.24 Mpps, 1503 Mbit/s (1860 Mbit/s with framing)

As you can see, there is no packet loss. The flow to NF X is the trace with id=0, and the flow to NF Y is the trace with id=1.

Then I increase the rate of the flow to NF Y to 7000 Mbps.

$ sudo ./build/MoonGen examples/l2-id.lua 0 1 --rate1 1500 --rate2 7000

[INFO] Initializing DPDK. This will take a few seconds…

EAL: Detected 12 lcore(s)

EAL: No free hugepages reported in hugepages-1048576kB

EAL: Probing VFIO support…

EAL: PCI device 0000:04:00.0 on NUMA socket 0

EAL: probe driver: 8086:10fb net_ixgbe

EAL: PCI device 0000:04:00.1 on NUMA socket 0

EAL: probe driver: 8086:10fb net_ixgbe

[INFO] Found 2 usable devices:

Device 0: 90:E2:BA:86:78:D8 (Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection)

Device 1: 90:E2:BA:86:78:D9 (Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection)

PMD: ixgbe_dev_link_status_print(): Port 0: Link Down

PMD: ixgbe_dev_link_status_print(): Port 1: Link Down

[INFO] Waiting for devices to come up…

[INFO] Device 1 (90:E2:BA:86:78:D9) is up: 10000 MBit/s

[INFO] Device 0 (90:E2:BA:86:78:D8) is up: 10000 MBit/s

[INFO] 2 devices are up.

[Device: id=0] RX: 1.58 Mpps, 1060 Mbit/s (1313 Mbit/s with framing)

[Device: id=1] RX: 6.61 Mpps, 4439 Mbit/s (5496 Mbit/s with framing)

[Device: id=0] TX: 2.17 Mpps, 1460 Mbit/s (1808 Mbit/s with framing)

[Device: id=1] TX: 10.21 Mpps, 6859 Mbit/s (8492 Mbit/s with framing)

[Device: id=0] RX: 1.62 Mpps, 1089 Mbit/s (1348 Mbit/s with framing)

[Device: id=1] RX: 6.81 Mpps, 4577 Mbit/s (5667 Mbit/s with framing)

[Device: id=0] TX: 2.24 Mpps, 1503 Mbit/s (1860 Mbit/s with framing)

[Device: id=1] TX: 10.50 Mpps, 7054 Mbit/s (8733 Mbit/s with framing)

[Device: id=0] RX: 1.64 Mpps, 1100 Mbit/s (1362 Mbit/s with framing)

[Device: id=1] RX: 6.85 Mpps, 4600 Mbit/s (5695 Mbit/s with framing)

[Device: id=0] TX: 2.24 Mpps, 1503 Mbit/s (1860 Mbit/s with framing)

[Device: id=1] TX: 10.47 Mpps, 7039 Mbit/s (8715 Mbit/s with framing)

^C[Device: id=0] RX: 1.63 (StdDev 0.01) Mpps, 1094 (StdDev 8) Mbit/s (1355 Mbit/s with framing), total 5948487 packets with 499674480 bytes (incl. CRC)

[Device: id=1] RX: 6.83 (StdDev 0.02) Mpps, 4589 (StdDev 16) Mbit/s (5681 Mbit/s with framing), total 24908500 packets with 2092315310 bytes (incl. CRC)

[Device: id=0] TX: 2.24 (StdDev 0.00) Mpps, 1503 (StdDev 0) Mbit/s (1860 Mbit/s with framing), total 8178471 packets with 686991564 bytes (incl. CRC)

[Device: id=1] TX: 10.49 (StdDev 0.02) Mpps, 7046 (StdDev 11) Mbit/s (8724 Mbit/s with framing), total 38359881 packets with 3222230004 bytes (incl. CRC)

As you can see, both NFs suffer from packet loss.
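
From the totals in the summary lines: the flow to NF X (id=0) had 8,178,471 packets sent but only 5,948,487 received back, i.e. roughly 27% loss, and the flow to NF Y (id=1) had 38,359,881 sent and 24,908,500 received back, i.e. roughly 35% loss.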

Then I also decrease the processing throughput of NF Y significantly, while keeping both traffic rates at 1500 Mbps.

$ sudo ./build/MoonGen examples/l2-id.lua 0 1 --rate1 1500 --rate2 1500

[INFO] Initializing DPDK. This will take a few seconds…

EAL: Detected 12 lcore(s)

EAL: No free hugepages reported in hugepages-1048576kB

EAL: Probing VFIO support…

EAL: PCI device 0000:04:00.0 on NUMA socket 0

EAL: probe driver: 8086:10fb net_ixgbe

EAL: PCI device 0000:04:00.1 on NUMA socket 0

EAL: probe driver: 8086:10fb net_ixgbe

[INFO] Found 2 usable devices:

Device 0: 90:E2:BA:86:78:D8 (Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection)

Device 1: 90:E2:BA:86:78:D9 (Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection)

PMD: ixgbe_dev_link_status_print(): Port 0: Link Up - speed 0 Mbps - half-duplex

PMD: ixgbe_dev_link_status_print(): Port 1: Link Down

[INFO] Waiting for devices to come up…

[INFO] Device 1 (90:E2:BA:86:78:D9) is up: 10000 MBit/s

[INFO] Device 0 (90:E2:BA:86:78:D8) is up: 10000 MBit/s

[INFO] 2 devices are up.

[Device: id=0] RX: 2.10 Mpps, 1411 Mbit/s (1748 Mbit/s with framing)

[Device: id=1] RX: 0.01 Mpps, 9 Mbit/s (11 Mbit/s with framing)

[Device: id=0] TX: 2.17 Mpps, 1459 Mbit/s (1806 Mbit/s with framing)

[Device: id=1] TX: 2.16 Mpps, 1453 Mbit/s (1799 Mbit/s with framing)

[Device: id=0] RX: 2.10 Mpps, 1410 Mbit/s (1746 Mbit/s with framing)

[Device: id=1] RX: 0.01 Mpps, 9 Mbit/s (11 Mbit/s with framing)

[Device: id=0] TX: 2.24 Mpps, 1503 Mbit/s (1860 Mbit/s with framing)

[Device: id=1] TX: 2.24 Mpps, 1503 Mbit/s (1860 Mbit/s with framing)

[Device: id=0] RX: 2.12 Mpps, 1423 Mbit/s (1761 Mbit/s with framing)

[Device: id=1] RX: 0.01 Mpps, 9 Mbit/s (11 Mbit/s with framing)

[Device: id=0] TX: 2.24 Mpps, 1503 Mbit/s (1860 Mbit/s with framing)

[Device: id=1] TX: 2.24 Mpps, 1503 Mbit/s (1860 Mbit/s with framing)

[Device: id=0] RX: 2.13 Mpps, 1429 Mbit/s (1769 Mbit/s with framing)

[Device: id=1] RX: 0.01 Mpps, 9 Mbit/s (11 Mbit/s with framing)

[Device: id=0] TX: 2.24 Mpps, 1503 Mbit/s (1860 Mbit/s with framing)

[Device: id=1] TX: 2.24 Mpps, 1503 Mbit/s (1860 Mbit/s with framing)

^C[Device: id=0] RX: 2.11 (StdDev 0.01) Mpps, 1421 (StdDev 9) Mbit/s (1759 Mbit/s with framing), total 9350839 packets with 785471028 bytes (incl. CRC)

[Device: id=1] RX: 0.01 (StdDev 0.00) Mpps, 9 (StdDev 0) Mbit/s (11 Mbit/s with framing), total 58978 packets with 4954704 bytes (incl. CRC)

[Device: id=0] TX: 2.24 (StdDev 0.00) Mpps, 1503 (StdDev 0) Mbit/s (1860 Mbit/s with framing), total 9839781 packets with 826541604 bytes (incl. CRC)

[Device: id=1] TX: 2.24 (StdDev 0.00) Mpps, 1503 (StdDev 0) Mbit/s (1860 Mbit/s with framing), total 9830898 packets with 825795432 bytes (incl. CRC)

As you can see, the packet loss rate of NF Y (id=1) is very high, and NF X (id=0) also shows a non-zero packet loss rate.
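
In numbers: the flow to NF Y (id=1) had 9,830,898 packets sent but only 58,978 received back (about 99% loss), while the flow to NF X (id=0) had 9,839,781 sent and 9,350,839 received back (about 5% loss), even though its rate never exceeded 1500 Mbps.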

Thanks,

Junzhi