Intermittent Packet Loss During VRRP MAC Reconfiguration on ConnectX-6 Dx VF (SR-IOV, DPDK 23.11.3, vSRX VM)

Problem Description

When any VRRP group changes state (Master ↔ Backup), the corresponding virtual MAC address is added to or removed from the VF. Each such transition consistently drops 1–2 ICMP packets destined to the virtual IPs of other VRRP groups that remain in a stable state.

The packet loss occurs only on the Mellanox ConnectX-6 Dx SR-IOV VF path.
When the VF is configured in promiscuous mode, the issue does not occur.
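
For readers unfamiliar with the MAC addresses involved, here is a minimal sketch of how a VRRP virtual MAC is derived from the group ID. It assumes the groups use the standard RFC 5798 format (00-00-5E-00-01-{VRID} for IPv4, 00-00-5E-00-02-{VRID} for IPv6). The post does not list the actual VRIDs or MACs, so the function name and the VRID range 1..30 below are illustrative only.

```python
def vrrp_virtual_mac(vrid: int, ipv6: bool = False) -> str:
    """Return the RFC 5798 virtual MAC for a VRRP group ID (VRID 1..255)."""
    if not 1 <= vrid <= 255:
        raise ValueError(f"VRID must be in 1..255, got {vrid}")
    # Fifth octet distinguishes the address family: 0x01 = IPv4, 0x02 = IPv6.
    family_octet = 0x02 if ipv6 else 0x01
    return f"00:00:5e:00:{family_octet:02x}:{vrid:02x}"


if __name__ == "__main__":
    # Assumption: roughly 30 groups numbered 1..30; the real VRIDs are not
    # given in the post.
    for vrid in range(1, 31):
        print(vrid, vrrp_virtual_mac(vrid))
```

Each of these addresses is one entry that the guest adds to or removes from the VF's unicast MAC filter when the matching group changes state.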


Environment

  • Virtualization: SR-IOV enabled — VFs directly assigned to VM (no representor)

  • Guest OS: FreeBSD 15 / Junos OS (vSRX)

  • Application Stack: DPDK 23.11.3 (used by vSRX data plane)

  • Feature in Use: VRRP (~30 groups) using Virtual MACs on VF interface

  • VF Configuration: ALLMULTICAST enabled; promiscuous mode disabled


Hardware and Software Configuration

  • NIC: Mellanox BlueField-2 ConnectX-6 Dx (MT42822)

  • Interface: ens3f1np1 (PF)

  • VFs: 4 active (VF 0–3) — trust mode on, spoof check off

  • Driver: MLX5 25.01-0.6.0

  • Firmware: 24.39.8002

  • Host Kernel: 5.4.0-216-generic

# ethtool -i ens3f1np1
driver: mlx5_core
version: 25.01-0.6.0
firmware-version: 24.39.8002 (MT_0000000540)
bus-info: 0000:19:00.1
supports-statistics: yes
supports-priv-flags: yes

# ip link show ens3f1np1
13: ens3f1np1: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 ...
    vf 0  link/ether aa:bb:cc:19:01:01 ..., spoof checking off, trust on
    vf 1  link/ether aa:bb:cc:19:01:02 ..., spoof checking off, trust on
    vf 2  link/ether aa:bb:cc:19:01:03 ..., spoof checking off, trust on
    vf 3  link/ether aa:bb:cc:19:01:04 ..., spoof checking off, trust on


Observation

  • ICMP packet loss correlates exactly with VRRP state transitions on any group.

  • Loss is visible on unrelated VRRP groups’ VIPs.

  • ethtool -S <pf_name> | grep rx_steered shows incrementing error counters during MAC add/remove operations.

  • Enabling VF promiscuous mode fully mitigates the issue.
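
To make the counter observation easier to reproduce, the following is a rough sketch that snapshots `ethtool -S` on the PF before and after a VRRP transition and reports which matching counters moved. The helper names are hypothetical, the default filter string simply mirrors the grep pattern used above, and the parser assumes the usual `name: value` layout of `ethtool -S` output.

```python
import subprocess


def parse_ethtool_stats(text: str) -> dict[str, int]:
    """Parse 'name: value' lines from `ethtool -S` output into a dict."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.rpartition(":")
        name, value = name.strip(), value.strip()
        # Skip the header line and anything that is not an integer counter.
        if sep and name and value.lstrip("-").isdigit():
            stats[name] = int(value)
    return stats


def changed_counters(before: dict[str, int], after: dict[str, int],
                     pattern: str = "rx_steered") -> dict[str, int]:
    """Return the delta of every counter containing `pattern` that changed."""
    return {
        name: after[name] - before[name]
        for name in after
        if pattern in name and name in before and after[name] != before[name]
    }


def snapshot(ifname: str) -> dict[str, int]:
    """Run `ethtool -S <ifname>` on the host and parse its counters."""
    result = subprocess.run(["ethtool", "-S", ifname],
                            capture_output=True, text=True, check=True)
    return parse_ethtool_stats(result.stdout)


# Example use on the host (interface name taken from the post):
#   before = snapshot("ens3f1np1")
#   ... trigger a VRRP state transition in the guest ...
#   after = snapshot("ens3f1np1")
#   print(changed_counters(before, after))
```

Passing `pattern=""` reports every counter that changed, which may help if the relevant counter does not contain the string used in the grep.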


Request

Please investigate whether this issue is related to VF MAC add/delete filtering behavior in the MLX5 driver or firmware when VRRP virtual MACs are dynamically updated on VFs in non-promiscuous mode.

Kindly confirm whether this is a known issue and whether a fix or workaround is available in later OFED or firmware releases. Note that with DPDK 23.11.x, enabling promiscuous mode resolves the issue: the 1–2 packet drops during VRRP failover no longer occur.

Hi ksavinash,

Thank you for posting your query on NVIDIA Community Forum.

Based on the issue described, this is fairly complex and requires in-depth troubleshooting, as further logs and data collection are needed.

Unfortunately, there are no known issues reported that directly match the behavior described.

This requires an active support entitlement. If there is an active entitlement or support contract in place, please do not hesitate to open a support ticket by logging into the ESP Portal and submitting a new case.

For contracts, please reach out to Networking-Contracts@nvidia.com

Thanks,
Namrata.