mlx5 driver on Ubuntu 18.04 (kernel 4.15.0-36) drops all RDMA packets

We are not using the Mellanox OFED driver; we use the standard Linux driver instead. To capture RoCEv2 traffic, we use port mirroring on a switch to copy all traffic between the host and the target to a monitoring PC. This PC has a ConnectX-4 RNIC. It was running Ubuntu 17.10 with the 4.13 kernel, and I was able to capture the RoCEv2 traffic with Wireshark.

I recently upgraded that PC to Ubuntu 18.04 with the 4.15 kernel. Since then, I can no longer capture RoCE traffic. I debugged a bit and got these counters:

yao@Host2:~$ ethtool -S enp21s0f0 | grep rdma
rx_vport_rdma_unicast_packets: 3112973874
rx_vport_rdma_unicast_bytes: 1434773360288
tx_vport_rdma_unicast_packets: 362387261
tx_vport_rdma_unicast_bytes: 27309669154

So the packets reach the port and are counted as RDMA, but they never show up in Wireshark. It looks like the mlx5 driver drops all RDMA packets. Why is it doing that? Is there a configuration option to restore the old behavior (as in kernel 4.13)? I don’t see one in Mellanox_OFED_Linux_User_Manual_v4_3.pdf.
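To confirm that the mirrored traffic is still arriving at the port while the capture stays empty, one option is to save two snapshots of the `ethtool -S` output a few seconds apart and compare the rdma counters. The following is only a sketch: `counter_delta` is a hypothetical helper (not part of ethtool or the driver), and the snapshot file names in the comments are assumptions.

```shell
# counter_delta BEFORE AFTER
# Hypothetical helper: for every counter whose name contains "rdma",
# print "name delta" if its value changed between two saved snapshots
# of `ethtool -S <iface>` output.
#
# Assumed usage (file names are illustrative):
#   ethtool -S enp21s0f0 > before.txt
#   sleep 10
#   ethtool -S enp21s0f0 > after.txt
#   counter_delta before.txt after.txt
counter_delta() {
    awk -F': *' '
        /rdma/ {
            name = $1
            gsub(/^[ \t]+/, "", name)            # strip leading indentation
            if (NR == FNR) {                     # first file: remember value
                before[name] = $2
                next
            }
            # second file: report counters that moved
            if ((name in before) && ($2 + 0) != (before[name] + 0))
                printf "%s %.0f\n", name, $2 - before[name]
        }
    ' "$1" "$2"
}
```

If the rx_vport_rdma_* counters keep growing while Wireshark shows nothing, that would suggest the packets reach the NIC but are not delivered to the network interface being captured.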

Here’s the driver info in the old kernel:

dotadmin@DavidLenovo:~$ modinfo mlx5_core
filename: /lib/modules/4.13.0-46-generic/kernel/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.ko
version: 5.0-0
license: Dual BSD/GPL
description: Mellanox Connect-IB, ConnectX-4 core driver
author: Eli Cohen <eli@mellanox.com>
srcversion: 79D72EC6EB494E762310F77
alias: pci:v000015B3d0000A2D3svsdbcsci*
alias: pci:v000015B3d0000A2D2svsdbcsci*
alias: pci:v000015B3d0000101Csvsdbcsci*
alias: pci:v000015B3d0000101Bsvsdbcsci*
alias: pci:v000015B3d0000101Asvsdbcsci*
alias: pci:v000015B3d00001019svsdbcsci*
alias: pci:v000015B3d00001018svsdbcsci*
alias: pci:v000015B3d00001017svsdbcsci*
alias: pci:v000015B3d00001016svsdbcsci*
alias: pci:v000015B3d00001015svsdbcsci*
alias: pci:v000015B3d00001014svsdbcsci*
alias: pci:v000015B3d00001013svsdbcsci*
alias: pci:v000015B3d00001012svsdbcsci*
alias: pci:v000015B3d00001011svsdbcsci*
depends: devlink,ptp,mlxfw
intree: Y
name: mlx5_core
vermagic: 4.13.0-46-generic SMP mod_unload
signat: PKCS#7
signer:
sig_key:
sig_hashalgo: md4
parm: debug_mask:debug mask: 1 = dump cmd data, 2 = dump cmd exec time, 3 = both. Default=0 (uint)
parm: prof_sel:profile selector. Valid range 0 - 2 (uint)

Here’s the driver info in the new kernel:

yao@Host2:~$ modinfo mlx5_core
filename: /lib/modules/4.15.0-36-generic/kernel/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.ko
version: 5.0-0
license: Dual BSD/GPL
description: Mellanox Connect-IB, ConnectX-4 core driver
author: Eli Cohen <eli@mellanox.com>
srcversion: C271CE9036D77E924A8E038
alias: pci:v000015B3d0000A2D3svsdbcsci*
alias: pci:v000015B3d0000A2D2svsdbcsci*
alias: pci:v000015B3d0000101Csvsdbcsci*
alias: pci:v000015B3d0000101Bsvsdbcsci*
alias: pci:v000015B3d0000101Asvsdbcsci*
alias: pci:v000015B3d00001019svsdbcsci*
alias: pci:v000015B3d00001018svsdbcsci*
alias: pci:v000015B3d00001017svsdbcsci*
alias: pci:v000015B3d00001016svsdbcsci*
alias: pci:v000015B3d00001015svsdbcsci*
alias: pci:v000015B3d00001014svsdbcsci*
alias: pci:v000015B3d00001013svsdbcsci*
alias: pci:v000015B3d00001012svsdbcsci*
alias: pci:v000015B3d00001011svsdbcsci*
depends: devlink,ptp,mlxfw
retpoline: Y
intree: Y
name: mlx5_core
vermagic: 4.15.0-36-generic SMP mod_unload
signat: PKCS#7
signer:
sig_key:
sig_hashalgo: md4
parm: debug_mask:debug mask: 1 = dump cmd data, 2 = dump cmd exec time, 3 = both. Default=0 (uint)
parm: prof_sel:profile selector. Valid range 0 - 2 (uint)

So srcversion is different, even though version is the same.
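Since srcversion is a checksum over the module's source files, a different srcversion with the same version string indicates that the mlx5_core source changed between the two kernels. To see exactly which modinfo fields differ without reading both listings by eye, the two outputs can be saved to files and compared. This is only a sketch: `modinfo_diff` is a hypothetical helper, and the file names in the comments are assumptions.

```shell
# modinfo_diff OLD NEW
# Hypothetical helper: print the lines that appear in only one of two
# saved `modinfo` outputs, prefixed with "- " (only in OLD) or
# "+ " (only in NEW). Lines common to both files are not printed.
#
# Assumed usage (file names are illustrative):
#   modinfo mlx5_core > modinfo-4.13.txt    # on the old kernel
#   modinfo mlx5_core > modinfo-4.15.txt    # on the new kernel
#   modinfo_diff modinfo-4.13.txt modinfo-4.15.txt
modinfo_diff() {
    awk '
        NR == FNR {                      # first file: remember OLD lines
            old_line[++n] = $0
            in_old[$0] = 1
            next
        }
        {                                # second file: remember NEW lines
            new_line[++m] = $0
            in_new[$0] = 1
        }
        END {
            for (i = 1; i <= n; i++)
                if (!(old_line[i] in in_new)) print "- " old_line[i]
            for (i = 1; i <= m; i++)
                if (!(new_line[i] in in_old)) print "+ " new_line[i]
        }
    ' "$1" "$2"
}
```

On the two listings above, this should report the filename, srcversion, and vermagic lines as changed and retpoline as new in the 4.15 build.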