I am trying to run the doca_flow_ct_tcp sample on my BlueField DPU with DOCA SDK 3.1.0105. I successfully compiled the sample with Meson/Ninja, and hugepages are correctly allocated (HugePages_Total: 4096).
However, when I run the sample, I get the following error:
root@localhost:/opt/mellanox/doca/samples/doca_flow/flow_ct_tcp/build# ./doca_flow_ct_tcp
[21:13:32:762759][3135553568][DOCA][INF][doca_log.cpp:628] DOCA version 3.1.0105
[21:13:32:762884][3135553568][DOCA][INF][flow_ct_tcp_main.c:76][main] Starting the sample
EAL: Detected CPU lcores: 8
EAL: Detected NUMA nodes: 1
EAL: Detected shared linkage of DPDK
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'PA'
TELEMETRY: No legacy callbacks, legacy socket not created
[21:13:32:789456][3135553568][DOCA][ERR][flow_ct_tcp_main.c:101][main] Failed to init flow switch common: Unrecognized error code
[21:13:32:790770][3135553568][DOCA][INF][flow_ct_tcp_main.c:131][main] Sample finished with errors
Environment details:
DOCA SDK: 3.1.0105
DPU: BlueField-2
Hugepages: 4096 total, 3943 free
Executing as root
Interfaces: pf0hpf, pf1hpf (representors)
I have verified that the interfaces are up and bound to DPDK, but the sample still fails.
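For anyone reproducing the environment check, here is a minimal sketch that reads the hugepage counters quoted above from /proc/meminfo. The helper name `parse_hugepages` is made up for illustration, and the sketch only reports the numbers; it does not say how many hugepages the sample actually needs.

```python
import re
from pathlib import Path


def parse_hugepages(meminfo_text):
    """Return the HugePages_* counters and Hugepagesize (in kB) as integers."""
    fields = {}
    pattern = r"^(HugePages_\w+|Hugepagesize):\s+(\d+)"
    for match in re.finditer(pattern, meminfo_text, re.MULTILINE):
        fields[match.group(1)] = int(match.group(2))
    return fields


if __name__ == "__main__":
    meminfo = Path("/proc/meminfo")
    # /proc/meminfo only exists on Linux; skip quietly elsewhere.
    if meminfo.exists():
        for name, value in parse_hugepages(meminfo.read_text()).items():
            print(f"{name}: {value}")
```

Run it on the DPU's Arm OS as any user; the values should match the HugePages_Total and HugePages_Free figures listed in the environment details.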
My questions:
Are there additional steps required to initialize the flow switch before running doca_flow_ct_tcp?
Are there specific configurations in OVS-DOCA or DPDK that must be applied for the sample to work?
Could this error be related to DPU firmware or DOCA SDK version mismatches?
Any guidance or pointers would be highly appreciated.
Build log:
root@localhost:/opt/mellanox/doca/samples/doca_flow/flow_ct_tcp/build# meson . ..
The Meson build system
Version: 1.9.1
Source dir: /opt/mellanox/doca/samples/doca_flow/flow_ct_tcp
Build dir: /opt/mellanox/doca/samples/doca_flow/flow_ct_tcp/build
Build type: native build
Program cat found: YES (/usr/bin/cat)
Project name: DOCA_SAMPLE
Project version: 3.1.0105
C compiler for the host machine: cc (gcc 11.4.0 "cc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0")
C linker for the host machine: cc ld.bfd 2.38
C++ compiler for the host machine: c++ (gcc 11.4.0 "c++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0")
C++ linker for the host machine: c++ ld.bfd 2.38
Host machine cpu family: aarch64
Host machine cpu: aarch64
Found pkg-config: YES (/usr/bin/pkg-config) 0.29.2
Run-time dependency doca-common found: YES 3.1.0105
Run-time dependency doca-flow found: YES 3.1.0105
Run-time dependency doca-dpdk-bridge found: YES 3.1.0105
Run-time dependency doca-argp found: YES 3.1.0105
Run-time dependency libdpdk found: YES 22.11.2507.1.0
Build targets in project: 1
Found ninja-1.13.0.git.kitware.jobserver-pipe-1 at /usr/local/bin/ninja
WARNING: Running the setup command as `meson [options]` instead of `meson setup [options]` is ambiguous and deprecated.
root@localhost:/opt/mellanox/doca/samples/doca_flow/flow_ct_tcp/build# ninja
[8/8] Linking target doca_flow_ct_tcp
Did you run this sample application with arguments as presented in /opt/mellanox/doca/samples/doca_flow/flow_ct_tcp/flow_ct_tcp_sample.yaml? Like /tmp/build/doca_flow_ct_tcp -- -l 60 -r pci/03:00.0,pf0vf0.
Log:
root@localhost:/opt/mellanox/doca/samples/doca_flow/flow_ct_tcp# /tmp/build/doca_flow_ct_tcp -- -l 60 -r pci/03:00.0,pf0vf0
[05:47:28:690652][2418720800][DOCA][INF][doca_log.cpp:628] DOCA version 3.1.0105
[05:47:28:690784][2418720800][DOCA][INF][flow_ct_tcp_main.c:76][main] Starting the sample
EAL: Detected CPU lcores: 16
EAL: Detected NUMA nodes: 1
EAL: Detected shared linkage of DPDK
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'VA'
TELEMETRY: No legacy callbacks, legacy socket not created
EAL: Probe PCI driver: mlx5_pci (15b3:a2dc) device: 0000:03:00.0 (socket -1)
mlx5_net: port 0 cannot enable promiscuous mode in flow isolation mode
[05:47:28:922739][2418720800][DOCA][DBG][dpdk_utils.c:181][port_init] Port 0 MAC: 5c 25 73 53 14 70
[05:47:28:922774][2418720800][DOCA][INF][dpdk_utils.c:126][port_init] Skip represent port 1 init in switch mode
[05:47:28:923008][2418720800][DOCA][WRN][engine_model.c:88] adapting queue depth to 128.
[05:47:29:439107][2418720800][DOCA][INF][flow_ct_tcp_sample.c:668][flow_ct_tcp] Wait few seconds for 'SYN' packet to arrive
[05:47:34:439330][2418720800][DOCA][INF][flow_ct_tcp_sample.c:465][process_packets] Sample didn't receive packets to process
[05:47:35:220837][2418720800][DOCA][ERR][flow_ct_tcp_main.c:115][main] flow_ct_tcp() encountered an error: Bad State
Device with port_id=1 already stopped
[05:47:35:324924][2418720800][DOCA][DBG][dpdk_utils.c:697][dpdk_fini] DPDK fini is done
[05:47:35:324972][2418720800][DOCA][WRN][doca_dpdk.cpp:544] DPDK dev already detached: 1
[05:47:35:324979][2418720800][DOCA][WRN][doca_dpdk.cpp:544] DPDK dev already detached: 0
[05:47:35:325409][2418720800][DOCA][INF][flow_ct_tcp_main.c:131][main] Sample finished with errors
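The relevant ERR and WRN lines are easy to miss among the INF and DBG output. Below is a small sketch for filtering them. It assumes only the line layout visible in the logs in this thread (time, thread id, DOCA, level, file:line, optional function name); `parse_doca_line` and `filter_levels` are hypothetical helper names, not part of DOCA.

```python
import re

# Layout inferred from the log lines in this thread, not from DOCA documentation:
# [time][thread][DOCA][LEVEL][file:line][optional function] message
DOCA_LINE = re.compile(
    r"^\[(?P<time>[\d:]+)\]\[(?P<thread>\d+)\]\[DOCA\]\[(?P<level>[A-Z]+)\]"
    r"\[(?P<file>[^\]:]+):(?P<line>\d+)\](?:\[(?P<func>[^\]]+)\])?\s*(?P<msg>.*)$"
)


def parse_doca_line(line):
    """Return a dict of fields for a DOCA log line, or None for other output (EAL, mlx5_net, ...)."""
    match = DOCA_LINE.match(line.strip())
    return match.groupdict() if match else None


def filter_levels(lines, levels=("ERR", "WRN")):
    """Keep only the parsed DOCA records whose level is in `levels`."""
    records = []
    for line in lines:
        record = parse_doca_line(line)
        if record and record["level"] in levels:
            records.append(record)
    return records
```

Feeding the saved output of a run to `filter_levels` leaves only the warning and error records, each with its source file and line number for looking up the failing call in the sample code.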
Sorry, I don’t follow you. Do you mean “which lines of the sample code handle the representor?” or just “What enlightened me to run this sample in this way?”
So you may need to configure your BlueField to expose at least one representor. For me, it is a VF representor named pf0vf0. Then you just need to run this sample with the PCI address and the interface name.
I did not try tcpdump or similar tools. I just ran these flow_ct_tcp/udp samples, read their source code, and wrote scripts to send packets. As long as the code reacts as I expect, I guess it works fine?
If you want to use tcpdump to confirm whether traffic is offloaded correctly, I guess you could (modify one of these samples to) implement a simple echo on your DPU?
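As a starting point for the kind of packet-sending script mentioned above, here is a minimal sketch (not the script actually used in this thread) that builds a bare TCP SYN frame with only the Python standard library. The MAC addresses, IP addresses, ports, and the interface name are placeholders; which values the sample's pipes actually match on must be taken from flow_ct_tcp_sample.c. `send_frame` relies on Linux AF_PACKET sockets and needs root.

```python
import socket
import struct


def checksum(data):
    """Internet checksum (RFC 1071) over a byte string."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_syn_frame(src_mac, dst_mac, src_ip, dst_ip, sport, dport, seq=0):
    """Build Ethernet + IPv4 + TCP headers with only the SYN flag set (54 bytes)."""
    src = socket.inet_aton(src_ip)
    dst = socket.inet_aton(dst_ip)
    # TCP header: data offset 5 words, flags 0x02 (SYN), checksum filled in below.
    tcp = struct.pack("!HHIIBBHHH", sport, dport, seq, 0, 5 << 4, 0x02, 65535, 0, 0)
    pseudo = src + dst + struct.pack("!BBH", 0, socket.IPPROTO_TCP, len(tcp))
    tcp = tcp[:16] + struct.pack("!H", checksum(pseudo + tcp)) + tcp[18:]
    # IPv4 header: version 4, IHL 5, TTL 64, checksum filled in below.
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20 + len(tcp), 0, 0, 64,
                     socket.IPPROTO_TCP, 0, src, dst)
    ip = ip[:10] + struct.pack("!H", checksum(ip)) + ip[12:]
    eth = dst_mac + src_mac + struct.pack("!H", 0x0800)
    return eth + ip + tcp


def send_frame(iface, frame):
    """Send a raw Ethernet frame on `iface` (Linux only, requires root)."""
    with socket.socket(socket.AF_PACKET, socket.SOCK_RAW) as sock:
        sock.bind((iface, 0))
        sock.send(frame)
```

For example, `send_frame("eth0", build_syn_frame(bytes.fromhex("020000000001"), bytes.fromhex("020000000002"), "10.0.0.1", "10.0.0.2", 1234, 80))` would emit one SYN from the peer side while the sample is in its "Wait few seconds for 'SYN' packet to arrive" window; every value in that call is a placeholder.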
I don’t use LinkedIn.
Maybe you can take it as a plan B to learn DOCA by upgrading the deprecated firewall app to use 3.1 before building yours from scratch? I’m not sure if it is possible though.