MPESW mode is broken on DOCA Flow 3.1.0+

Starting with version 3.1.0, DOCA Flow appears to have broken SF setup in MPESW mode. For example, I have a dual-port Cx6DX (or Cx7) in esw_multiport mode, with one SF (index 0) on each of the two PFs. I run the doca_flow application with -a pci/0000:01:00.0 -r pci/01:00.0,pf[0-1]sf0 and get the following error:

[15:58:40:239952][4247071104][DOCA][INF][doca_log.cpp:628] DOCA version 3.1.0105
[15:58:40:240013][4247071104][DOCA][INF][flow_ecmp_main.c:78][main] Starting the sample
EAL: Detected CPU lcores: 128
EAL: Detected NUMA nodes: 2
EAL: Detected shared linkage of DPDK
EAL: Selected IOVA mode 'VA'
EAL: No free 1048576 kB hugepages reported on node 0
EAL: No free 1048576 kB hugepages reported on node 1
[15:58:40:453482][4247071104][DOCA][WRN][linux_device_adapter.cpp:1551] Failed to extract number from file. failed to open file
[15:58:40:453509][4247071104][DOCA][WRN][linux_device_adapter.cpp:1673] Failed to extract vf_cnt
[15:58:40:453519][4247071104][DOCA][WRN][linux_device_adapter.cpp:1551] Failed to extract number from file. failed to open file
[15:58:40:453523][4247071104][DOCA][WRN][linux_device_adapter.cpp:1678] Failed to extract total_vf_cnt
[15:58:40:453533][4247071104][DOCA][WRN][linux_device_adapter.cpp:1551] Failed to extract number from file. failed to open file
[15:58:40:453536][4247071104][DOCA][WRN][linux_device_adapter.cpp:1683] Failed to extract vf_stride
[15:58:40:453545][4247071104][DOCA][WRN][linux_device_adapter.cpp:1551] Failed to extract number from file. failed to open file
[15:58:40:453548][4247071104][DOCA][WRN][linux_device_adapter.cpp:1688] Failed to extract first_vf_offset
EAL: Probe PCI driver: mlx5_pci (15b3:101d) device: 0000:01:00.0 (socket 0)
EAL: Probe PCI driver: mlx5_pci (15b3:101d) device: 0000:01:00.0 (socket 0)
[15:58:42:057743][4247071104][DOCA][INF][flow_switch_common.c:137][get_dpdk_nb_ports] Port ID 0 is valid DPDK port
[15:58:42:057779][4247071104][DOCA][INF][flow_switch_common.c:137][get_dpdk_nb_ports] Port ID 1 is valid DPDK port
[15:58:42:057788][4247071104][DOCA][INF][flow_switch_common.c:137][get_dpdk_nb_ports] Port ID 2 is valid DPDK port
[15:58:42:057795][4247071104][DOCA][INF][flow_switch_common.c:137][get_dpdk_nb_ports] Port ID 3 is valid DPDK port
[15:58:42:057818][4247071104][DOCA][INF][flow_ecmp_main.c:107][main] Number of DPDK ports is 4, E-switch manager and 3 SF representors
mlx5_net: port 0 cannot enable promiscuous mode in flow isolation mode
[15:58:43:710346][4247071104][DOCA][INF][dpdk_utils.c:126][port_init] Skip represent port 1 init in switch mode
[15:58:43:710391][4247071104][DOCA][INF][dpdk_utils.c:126][port_init] Skip represent port 2 init in switch mode
[15:58:43:710410][4247071104][DOCA][INF][dpdk_utils.c:126][port_init] Skip represent port 3 init in switch mode
[15:58:43:711004][4247071104][DOCA][WRN][engine_model.c:88]adapt_queue_depth adapting queue depth to 128.
[15:58:43:718639][4247071104][DOCA][ERR][flow_switch_common.c:111][init_doca_flow_switch_ports] Insufficient DOCA devices: required 4 for 4 ports, but only 3 provided
[15:58:43:718653][4247071104][DOCA][ERR][flow_ecmp_sample.c:635][flow_ecmp] Failed to init DOCA ports: Invalid input
[15:58:43:719608][4247071104][DOCA][ERR][flow_ecmp_main.c:120][main] flow_ecmp() encountered an error: Invalid input
Device with port_id=3 already stopped
Device with port_id=2 already stopped
Device with port_id=1 already stopped

This problem did not exist in DOCA 3.0.0, and it is still present in the recently released DOCA 3.2.0.

Hi Kamillaova

From your logs:

[15:58:42:057818][4247071104][DOCA][INF][flow_ecmp_main.c:107][main] Number of DPDK ports is 4, E-switch manager and 3 SF representors
...
[15:58:43:718639][4247071104][DOCA][ERR][flow_switch_common.c:111][init_doca_flow_switch_ports] Insufficient DOCA devices: required 4 for 4 ports, but only 3 provided

it looks like DOCA Flow switch-mode initialization expects one DOCA device per DPDK port, but in this topology only 3 DOCA devices are provided while DPDK exposes 4 ports. This also matches the behavior difference you see: DOCA 3.0.0 works, while the exact same setup fails on 3.1.x / 3.2.0.
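
If you want to compare runs across DOCA versions, a small script can pull these two numbers out of a saved log. This is only a rough sketch: the regular expressions are based solely on the two log lines quoted above and may not match other DOCA versions, and the function name summarize is my own, not part of DOCA.

```python
import re

# Patterns taken from the two log lines quoted in this thread (an assumption:
# other DOCA versions may word these messages differently).
PORTS_RE = re.compile(r"Number of DPDK ports is (\d+)")
DEVS_RE = re.compile(r"required (\d+) for (\d+) ports, but only (\d+) provided")


def summarize(log_text):
    """Return (dpdk_ports, required_devs, provided_devs); None where not found."""
    ports = PORTS_RE.search(log_text)
    devs = DEVS_RE.search(log_text)
    dpdk_ports = int(ports.group(1)) if ports else None
    required = int(devs.group(1)) if devs else None
    provided = int(devs.group(3)) if devs else None
    return dpdk_ports, required, provided


# The two relevant lines from the log in the original post.
SAMPLE = (
    "[15:58:42:057818][4247071104][DOCA][INF][flow_ecmp_main.c:107][main] "
    "Number of DPDK ports is 4, E-switch manager and 3 SF representors\n"
    "[15:58:43:718639][4247071104][DOCA][ERR][flow_switch_common.c:111]"
    "[init_doca_flow_switch_ports] Insufficient DOCA devices: "
    "required 4 for 4 ports, but only 3 provided\n"
)

if __name__ == "__main__":
    dpdk_ports, required, provided = summarize(SAMPLE)
    print(f"DPDK ports: {dpdk_ports}, DOCA devices required: {required}, provided: {provided}")
    if required is not None and provided is not None and provided < required:
        print(f"Mismatch: {required - provided} DOCA device(s) missing")
```

Running the same script on a log from DOCA 3.0.0 should show whether the port count itself changed between versions or only the number of DOCA devices provided.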

To narrow this down, you could try the following:

  1. Compare what DPDK sees vs. what DOCA sees

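    First list the network devices and their PCI BDFs (mlx5 devices stay bound to the kernel mlx5_core driver, so they show up in the kernel-driver section of the output):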

    dpdk-devbind.py -s

    Identify the 4 mlx5 ports that the sample is using (port IDs 0..3 in your logs) and note their PCI addresses / names. Then list what DOCA sees:

    # list DOCA-capable devices
    doca_caps --list-devs
    # list DOCA-capable representor devices
    doca_caps --list-rep-devs

    The goal is to check whether one of the DPDK ports (the E-switch manager or a representor) does not appear as a DOCA device or representor device. If DPDK consistently reports 4 ports while DOCA only shows 3 usable devices, that would explain the “required 4 for 4 ports, but only 3 provided” error.

  2. Version behavior

    Since you already confirmed that:

    • DOCA 3.0.0: MPESW + SF0 on both PFs works
    • DOCA 3.1.x / 3.2.0: same setup fails with “required 4 but only 3 provided”

    it is reasonable to suspect a behavior change in DOCA Flow’s switch-mode initialization for MPESW + multi-PF SF topologies, rather than a configuration mistake on your side.

    If staying with DOCA 3.0.0 is acceptable for your use case, that is the safest way to avoid this behavior change until a newer version explicitly addresses it.

Thanks