DPDK not working with ConnectX-3 card on OpenStack virtual setup

Hi,

I am trying to get DPDK 17.11.4 to run with a ConnectX-3 card in a virtual environment under OpenStack.

The setup uses the VFIO driver, which initialises correctly (although it runs in no-IOMMU mode). However, starting DPDK via the testpmd binary fails when it tries to add the default flows to the device.

$ /root/testpmd -c 0xf -n 4 -w 0000:00:07.0 -- --rxq=8 --txq=8

EAL: Detected 12 lcore(s)

EAL: No free hugepages reported in hugepages-1048576kB

EAL: Probing VFIO support...

EAL: VFIO support initialized

EAL: WARNING: cpu flags constant_tsc=yes nonstop_tsc=no -> using unreliable clock cycles !

EAL: PCI device 0000:00:07.0 on NUMA socket -1

EAL: Invalid NUMA socket, default to 0

EAL: probe driver: 15b3:1004 net_mlx4

PMD: net_mlx4: PCI information matches, using device "mlx4_1" (VF: true)

PMD: net_mlx4: 1 port(s) detected

PMD: net_mlx4: port 1 MAC address is fa:16:3e:de:c6:4b

USER1: create a new mbuf pool <mbuf_pool_socket_0>: n=171456, size=2176, socket=0

Warning! port-topology=paired and odd forward ports number, the last port will pair with itself.

Configuring Port 0 (socket 0)

PMD: net_mlx4: 0x55a3e1527200: cannot attach flow rules (code 95, "Operation not supported"), flow error type 2, cause 0x17328ec40, message: flow rule rejected by device

Fail to start port 0

Please stop the ports first

Done

I’ve seen this error before on a bare metal setup, where it was caused by the value of the log_num_mgm_entry_size parameter of the mlx4_core kernel module. Setting the parameter to -1 fixed it there, but it has no effect in this virtual setup. The bare metal system is running the same Linux kernel and RDMA versions as this OpenStack environment.

$ more /sys/module/mlx4_core/parameters/log_num_mgm_entry_size

-1
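
The check above can be scripted so that it is easy to repeat on several machines. This is only a sketch: the helper name mlx4_steering_mode and its optional path argument are hypothetical, and the interpretation (a negative value requests device-managed flow steering, DMFS) is an assumption based on the parameter description printed by modinfo mlx4_core, so confirm it against your driver version.

```shell
#!/bin/sh
# Hypothetical helper: report whether mlx4_core was asked to use
# device-managed flow steering (DMFS).
# $1 (optional) overrides the sysfs path, which also makes the function testable.
mlx4_steering_mode() {
    param_file="${1:-/sys/module/mlx4_core/parameters/log_num_mgm_entry_size}"
    if [ ! -r "$param_file" ]; then
        echo "unknown (cannot read $param_file)"
        return 1
    fi
    value="$(cat "$param_file")"
    # Assumption: negative values request DMFS; non-negative values are a
    # log2 MGM entry size and leave DMFS switched off.
    case "$value" in
        -*) echo "DMFS requested (log_num_mgm_entry_size=$value)" ;;
        *)  echo "DMFS not requested (log_num_mgm_entry_size=$value)" ;;
    esac
}
```

Calling mlx4_steering_mode with no argument reads the live sysfs value. The value seen inside a guest belongs to the guest's own mlx4_core module; the hypervisor's PF driver has a separate copy of the parameter.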

Tracing the code shows that a call to ibv_create_flow() returns the error. Why does this report “Operation not supported” when trying to add flows to the device? Is something else required to get DPDK to work correctly here?

Thanks for any help. Can provide more info if required.

Greg O’Rawe

Hi,

Code 95 reports a permissions issue; the issue you referred to is a memory allocation issue and has a different code (code 12).

So the two problems are not related to each other.
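
For reference, the codes in these PMD messages are Linux errno values, and the quoted text in each message is the matching error string. A small lookup sketch follows; the function name errno_name is hypothetical, and the table lists only a handful of values as defined on Linux, not the full set.

```shell
#!/bin/sh
# Hypothetical helper: map a few Linux errno numbers to their symbolic names.
errno_name() {
    case "$1" in
        1)  echo "EPERM (Operation not permitted)" ;;
        12) echo "ENOMEM (Cannot allocate memory)" ;;
        13) echo "EACCES (Permission denied)" ;;
        22) echo "EINVAL (Invalid argument)" ;;
        95) echo "EOPNOTSUPP (Operation not supported)" ;;
        *)  echo "unknown errno $1"; return 1 ;;
    esac
}

# Look up the two codes discussed so far.
errno_name 95
errno_name 12
```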

Could you please check that you installed Mellanox OFED correctly, with --dpdk --upstream-libs?

Which version of MOFED did you install?

Which kernel version are you using?

See this related thread: http://mails.dpdk.org/archives/users/2018-January/002831.html

Marc

Hi,

Thanks for the reply. I have seen that other thread and it appeared to indicate that changing log_num_mgm_entry_size could be the issue.

I am not using OFED; this is installed using RDMA from the Linux kernel, as described here:

http://doc.dpdk.org/guides/nics/mlx4.html

Kernel version is 3.10.0-862.11.6

RDMA version is 15.7

The docs say the kernel version should be 4.14 or later, but this setup does work with a ConnectX-3 on bare metal with the same kernel version, and another system with a later ConnectX-4 card on the same kernel version also works fine.
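
The version comparison described above can be sketched as follows. version_ge is a hypothetical helper that relies on sort -V from GNU coreutils, and the 4.14 threshold is simply the figure quoted from the docs in this post.

```shell
#!/bin/sh
# Hypothetical helper: succeed if version $1 is greater than or equal to $2.
version_ge() {
    # sort -V orders version strings; the smaller one comes out first.
    lowest="$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)"
    [ "$lowest" = "$2" ]
}

running="$(uname -r)"
if version_ge "$running" "4.14"; then
    echo "kernel $running meets the documented 4.14 minimum"
else
    echo "kernel $running is older than the documented 4.14 minimum"
fi
```

Distribution kernels such as the 3.10.0-862 series may carry backported driver changes, so the upstream version number is only a rough guide to what the mlx4 drivers actually support.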

What permissions issue could this refer to? DPDK/testpmd is trying to add a flow through ibv_create_flow().

Thanks

Greg

To add: setting log_num_mgm_entry_size on the PF driver on the hypervisor removes error 95 and testpmd comes up, but other similar errors (error 22, Invalid argument) are logged.

$ /root/testpmd -c 0xf -n 4 -w 0000:00:06.0 -w 0000:00:08.0 -- --rxq=2 --txq=2 -i

EAL: Detected 12 lcore(s)

EAL: No free hugepages reported in hugepages-1048576kB

EAL: Probing VFIO support...

EAL: WARNING: cpu flags constant_tsc=yes nonstop_tsc=no -> using unreliable clock cycles !

EAL: PCI device 0000:00:06.0 on NUMA socket -1

EAL: Invalid NUMA socket, default to 0

EAL: probe driver: 15b3:1004 net_mlx4

PMD: net_mlx4: PCI information matches, using device "mlx4_0" (VF: true)

PMD: net_mlx4: 1 port(s) detected

PMD: net_mlx4: port 1 MAC address is fa:16:3e:c6:5b:df

EAL: PCI device 0000:00:08.0 on NUMA socket -1

EAL: Invalid NUMA socket, default to 0

EAL: probe driver: 15b3:1004 net_mlx4

PMD: net_mlx4: PCI information matches, using device "mlx4_2" (VF: true)

PMD: net_mlx4: 1 port(s) detected

PMD: net_mlx4: port 1 MAC address is fa:16:3e:d9:f9:9d

Interactive-mode selected

USER1: create a new mbuf pool <mbuf_pool_socket_0>: n=171456, size=2176, socket=0

Configuring Port 0 (socket 0)

Port 0: FA:16:3E:C6:5B:DF

Configuring Port 1 (socket 0)

Port 1: FA:16:3E:D9:F9:9D

Checking link statuses...

Done

PMD: net_mlx4: cannot toggle promiscuous mode (code 22, "Invalid argument"), flow error type 1, cause (nil), message: flow rule rejected by device

PMD: net_mlx4: cannot toggle promiscuous mode (code 22, "Invalid argument"), flow error type 1, cause (nil), message: flow rule rejected by device

testpmd>

A non-test application shows similar errors, plus this one:

May 01 10:28:38 - INFO - PMD: net_mlx4: failed to synchronize flow rules after adding MAC address at index 0 (code 22, "Invalid argument"), flow error type 1, cause (nil), message: flow rule rejected by device.

Hi,

Could you try changing the mlx4_core.conf file as follows:

/etc/modprobe.d/mlx4_core.conf

options mlx4_core log_num_mgm_entry_size=-7
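
Options in /etc/modprobe.d are read when the module is loaded, so it can help to compare the configured value with the one the running module reports. A sketch of that comparison follows; configured_mgm_value is a hypothetical helper, and the file path is the one given above.

```shell
#!/bin/sh
# Hypothetical helper: print the log_num_mgm_entry_size value set in a
# modprobe.d file (last matching "options mlx4_core" line wins).
configured_mgm_value() {
    conf="${1:-/etc/modprobe.d/mlx4_core.conf}"
    sed -n 's/^options[[:space:]]\{1,\}mlx4_core[[:space:]].*log_num_mgm_entry_size=\([-0-9]\{1,\}\).*/\1/p' "$conf" | tail -n 1
}

# Compare the configured value with what the loaded module reports.
conf_value="$(configured_mgm_value 2>/dev/null)" || true
live_value="$(cat /sys/module/mlx4_core/parameters/log_num_mgm_entry_size 2>/dev/null)" || true
echo "configured: ${conf_value:-<not set>}, loaded: ${live_value:-<module not loaded>}"
```

If the two values differ, the module has most likely not been reloaded since the file was edited.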

Marc

Hi,

I have set that value and now see a different error:

PMD: net_mlx4: 0x55cfc14c5280: cannot attach flow rules (code 12, "Cannot allocate memory"), flow error type 2, cause 0x13de8acc0, message: flow rule rejected by device

There is also this in /var/log/messages:

May 2 12:43:59 mlx_fe-fe-0 kernel: <mlx4_ib> __mlx4_ib_create_flow: mcg table is full. Fail to register network rule.

May 2 12:43:59 mlx_fe-fe-0 testpmd[2601]: PMD: net_mlx4: 0x55cfc14c1200: cannot attach flow rules (code 12, "Cannot allocate memory"), flow error type 2, cause 0x13de8ee40, message: flow rule rejected by device

What is the mcg table? Is there a configuration to change its size?
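
To line up the PMD failures with the kernel-side reason, the log can be filtered for both kinds of message. This is a sketch that assumes the messages keep the wording shown above; mlx4_flow_errors is a hypothetical helper that reads log text on stdin.

```shell
#!/bin/sh
# Hypothetical helper: keep only mlx4 flow-rule failures from log text on stdin.
# Matches the kernel mlx4_ib message and the net_mlx4 PMD messages.
mlx4_flow_errors() {
    grep -E 'mlx4_ib_create_flow|net_mlx4: .*(cannot attach flow rules|flow rule rejected)'
}

# Example: scan the system log if it is readable on this machine.
if [ -r /var/log/messages ]; then
    mlx4_flow_errors < /var/log/messages || echo "no mlx4 flow errors found"
fi
```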

thanks

Greg

Hi Greg,

We strongly recommend not using ConnectX-3/Pro with the latest versions of DPDK.

The error you are now seeing is a known issue, and there is no plan to fix it.

Can you try an older version of Mellanox OFED, as described in this community article?

HowTo Install DPDK 1.7/1.8 with ConnectX-3 Adapter

https://community.mellanox.com/s/article/howto-install-dpdk-1-7-1-8-with-connectx-3-adapter

Br,

Marc

Hi,

Thanks for the reply.

Unfortunately moving to an earlier version of DPDK is not an option here. Are you saying that earlier versions worked with ConnectX-3 cards but later ones do not? Is there a reason for not maintaining compatibility?

What exactly is the memory issue that prevents flow rules from being attached to the device in this case? Is there a bug/ticket open somewhere with details of the known issue?

The same card works fine with the same version of DPDK in a bare metal setup (HP ProLiant 360 gen 7 machine running Red Hat Linux 7.5) - this error only occurs when running in a virtual environment.

Many thanks

Greg

Make sure that you are running a tested configuration: http://doc.dpdk.org/guides/rel_notes/release_17_11.html

AFAIK, at least Mellanox OFED 4.2 needs to be installed, and I would recommend going with the latest v4.5 together with DPDK 17.11.5 LTS for proper support.