OpenVswitch + OFED = Verbs device not found?

I am following the "Open vSwitch with DPDK" doc (Open vSwitch 3.0.90 documentation) to get Open vSwitch working with DPDK and a CX5 25GbE adapter. It all builds fine, but adding the adapter fails (log entries below). Lots of Google searching suggests this is somehow related to the OFED drivers, but seemingly nobody has encountered it before, or at least posted about it.

I’ve no clue where to start with this one. The system is a fresh install of Rocky 8.7. OFED is MLNX_OFED_LINUX-5.8- DPDK is dpdk-stable-21.11.2 and OVS is openvswitch-2.17.3. There are no special build options or attempts to optimize anything at this point. It is just a fresh system, set up by following the docs at the Open vSwitch URL above step by step.

I’d be most appreciative if someone can give me a pointer to get back on track!



cat /etc/system-release

Rocky Linux release 8.7 (Green Obsidian)

uname -a

Linux VH04 4.18.0-425.3.1.el8.x86_64 #1 SMP Wed Nov 9 20:13:27 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

OFED was installed with this command:
./mlnxofedinstall -vv --upstream-libs --dpdk --add-kernel-support

The failing OpenVSwitch command was
ovs-vsctl add-port br0 port0 -- set Interface port0 type=dpdk options:dpdk-devargs=0000:81:00.0

This is the Mellanox/NVidia card

lspci | grep Mel

81:00.0 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5]
81:00.1 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5]

OVS thinks DPDK was initialized:
#ovs-vsctl get Open_vSwitch . dpdk_initialized

This is a tail from the ovs log:

tail /usr/local/var/log/openvswitch/ovs-vswitchd.log

2022-11-21T21:20:50.144Z|00071|memory|INFO|peak resident set size grew 805% in last 10.2 seconds, from 76524 kB to 692292 kB
2022-11-21T21:20:50.144Z|00072|memory|INFO|handlers:53 idl-cells:96 ports:1 revalidators:19 rules:5
2022-11-21T21:22:11.368Z|00073|dpdk|INFO|EAL: Probe PCI driver: mlx5_pci (15b3:1017) device: 0000:81:00.0 (socket 1)
2022-11-21T21:22:11.368Z|00074|dpdk|ERR|mlx5_common: Verbs device not found: 0000:81:00.0
2022-11-21T21:22:11.368Z|00075|dpdk|ERR|mlx5_common: Failed to initialize device context.
2022-11-21T21:22:11.368Z|00076|dpdk|ERR|EAL: Driver cannot attach the device (0000:81:00.0)
2022-11-21T21:22:11.368Z|00077|dpdk|ERR|EAL: Failed to attach device on primary process
2022-11-21T21:22:11.368Z|00078|netdev_dpdk|WARN|Error attaching device '0000:81:00.0' to DPDK
2022-11-21T21:22:11.368Z|00079|netdev|WARN|port0: could not set configuration (Invalid argument)
2022-11-21T21:22:11.368Z|00080|dpdk|ERR|Invalid port_id=32

Loaded modules:

lsmod | egrep "(ib|mlx)" | sort

ib_cm 118784 2 rdma_cm,ib_ipoib
ib_core 442368 8 rdma_cm,ib_ipoib,iw_cm,ib_umad,rdma_ucm,ib_uverbs,mlx5_ib,ib_cm
ib_ipoib 155648 0
ib_umad 28672 0
ib_uverbs 155648 2 rdma_ucm,mlx5_ib
libahci 40960 1 ahci
libata 266240 2 libahci,ahci
libcrc32c 16384 5 nf_conntrack,nf_nat,openvswitch,nf_tables,xfs
mlx5_core 2105344 1 mlx5_ib
mlx5_ib 462848 0
mlx_compat 16384 11 rdma_cm,ib_ipoib,mlxdevm,iw_cm,ib_umad,ib_core,rdma_ucm,ib_uverbs,mlx5_ib,ib_cm,mlx5_core
mlxdevm 180224 1 mlx5_core
mlxfw 32768 1 mlx5_core
nf_tables 180224 235 nft_ct,nft_reject_inet,nft_fib_ipv6,nft_fib_ipv4,nft_chain_nat,nf_tables_set,nft_reject,nft_fib,nft_fib_inet
nft_fib 16384 3 nft_fib_ipv6,nft_fib_ipv4,nft_fib_inet
nft_fib_inet 16384 1
nft_fib_ipv4 16384 1 nft_fib_inet
nft_fib_ipv6 16384 1 nft_fib_inet
pci_hyperv_intf 16384 1 mlx5_core
psample 20480 1 mlx5_core
tls 110592 1 mlx5_core

I should have included this info as well:

$DPDK_DIR/usertools/ --status

Network devices using DPDK-compatible driver

0000:81:00.0 'MT27800 Family [ConnectX-5] 1017' drv=vfio-pci unused=mlx5_core
0000:81:00.1 'MT27800 Family [ConnectX-5] 1017' drv=vfio-pci unused=mlx5_core

Well, seems the answer to this was in a little note in this article:

For anyone else having this problem: it seems you must not unbind the card from mlx5_core. In other words, skip this step in the DPDK how-to doc:

$DPDK_DIR/usertools/ --bind=vfio-pci eth1 ← just don’t do it!
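
If the card has already been bound to vfio-pci, it can be handed back to the kernel driver through the standard sysfs bind/unbind files. This is only a rough sketch and not something from the docs: the function name and the SYSFS_ROOT override are made up for illustration, it needs root on a real system, and it assumes no driver_override is left set on the device.

```shell
#!/bin/sh
# Sketch: move a PCI device from its current driver back to mlx5_core.
# SYSFS_ROOT is a hypothetical override (defaults to /sys) so the function
# can be tried against a fake directory tree first.

rebind_to_mlx5_core() {
    # $1 = PCI address, e.g. 0000:81:00.0
    root="${SYSFS_ROOT:-/sys}"
    dev="$root/bus/pci/devices/$1"

    # Detach from whatever driver currently owns the device (e.g. vfio-pci).
    if [ -e "$dev/driver" ]; then
        echo "$1" > "$dev/driver/unbind" || return 1
    fi

    # Ask mlx5_core to claim the device again.
    echo "$1" > "$root/bus/pci/drivers/mlx5_core/bind" || return 1
}
```

On the system above this would be called as `rebind_to_mlx5_core 0000:81:00.0`, and again for 0000:81:00.1.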

The NVIDIA MLX DPDK driver is not based on VFIO/UIO like the Intel NIC drivers. It is based on the kernel InfiniBand verbs layer.
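
Since the driver relies on the kernel verbs layer, a quick sanity check before starting OVS is to look at which kernel driver owns the port and whether an RDMA device exists for it under /sys/class/infiniband. A minimal sketch, assuming the usual sysfs layout; the function names and the SYSFS_ROOT override are hypothetical and only there to make it easy to try out.

```shell
#!/bin/sh
# Sketch: inspect driver binding and verbs-device presence for a PCI address.
# SYSFS_ROOT is a hypothetical override; it defaults to the real /sys.

pci_driver() {
    # Print the kernel driver bound to PCI address $1, or "none".
    root="${SYSFS_ROOT:-/sys}"
    link="$root/bus/pci/devices/$1/driver"
    if [ -L "$link" ]; then
        basename "$(readlink "$link")"
    else
        echo "none"
    fi
}

has_verbs_device() {
    # Succeed if some entry under class/infiniband points at PCI address $1.
    root="${SYSFS_ROOT:-/sys}"
    for ibdev in "$root"/class/infiniband/*; do
        [ -e "$ibdev/device" ] || continue
        if [ "$(basename "$(readlink -f "$ibdev/device")")" = "$1" ]; then
            return 0
        fi
    done
    return 1
}
```

Going by the status output earlier in the thread, `pci_driver 0000:81:00.0` would have printed vfio-pci on the failing system, where mlx5_core is what the driver needs.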

And please read all of the below before you run DPDK on an MLX device