RoCEv2 is disabled on ConnectX-5 but the NIC still captures RoCEv2 packets.

Hi. I have a Mellanox ConnectX-5 with OFED 5.4 on Ubuntu 18.04 LTS. I want to parse some RoCEv2 packets with a custom program, so I need the ConnectX-5 not to capture RoCEv2 packets. I followed the article "How to Disable RoCE" to disable RoCE. If I understand correctly, RoCEv2 packets should now be passed to the kernel. But RoCE still seems to be enabled:

  1. ib_send_bw still works. I used GID 3, which corresponds to RoCEv2.
  2. An hping3 UDP test fails.

$ hping3 --udp -p 4791

There is no response. This command sends UDP packets to the given port on the target host. If I use another port, the target returns an ICMP Port Unreachable message, which means the kernel received the UDP packet. But sending UDP to port 4791 (the port RoCEv2 uses) gets no response, so I assume RoCEv2 is still enabled and the NIC drops the “malformed RoCEv2 packet”.
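For anyone who wants to repeat this check without hping3, here is a minimal Python sketch of the same idea. It assumes a Linux sender, where an ICMP Port Unreachable for a connected UDP socket surfaces as ConnectionRefusedError. The function name probe_udp_port and its return labels are my own, not part of any tool.

```python
import socket


def probe_udp_port(host, port, timeout=2.0):
    """Send one UDP datagram and classify what comes back.

    Returns:
      "unreachable" - an ICMP Port Unreachable was reported, so the remote
                      kernel saw the datagram and had no listener for it
      "reply"       - some datagram came back from the target
      "silent"      - nothing arrived before the timeout
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        # Connecting the UDP socket lets the kernel report ICMP errors to us.
        sock.connect((host, port))
        try:
            sock.send(b"probe")
            sock.recv(1024)
            return "reply"
        except ConnectionRefusedError:
            return "unreachable"
        except socket.timeout:
            return "silent"


# Hypothetical usage; replace TARGET with the address of the host under test:
#   probe_udp_port(TARGET, 4791)
#   probe_udp_port(TARGET, 4792)
```

In my situation I would expect "silent" for port 4791 and "unreachable" for a neighbouring unused port. Note that "silent" alone does not prove the NIC consumed the packet, since a firewall or ICMP rate limiting on the target can look the same.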

I also tried removing the kernel module mlx5_ib to shut the functionality down completely, and that worked. But my program is actually a DPDK program that needs the mlx5_ib module, so this is not a solution.

How can I cleanly disable RoCEv2 and let the kernel get the packets?

I’m sorry if I have misunderstood something. Thanks for your advice.

Below is some info:

$ ibv_devinfo
hca_id: mlx5_0
    transport: InfiniBand (0)
    fw_ver: 16.31.1014
    node_guid: ec0d:9a03:00bf:d92c
    sys_image_guid: ec0d:9a03:00bf:d92c
    vendor_id: 0x02c9
    vendor_part_id: 4119
    hw_ver: 0x0
    board_id: MT_0000000012
    phys_port_cnt: 1
        port: 1
            state: PORT_ACTIVE (4)
            max_mtu: 4096 (5)
            active_mtu: 1024 (3)
            sm_lid: 0
            port_lid: 0
            port_lmc: 0x00
            link_layer: Ethernet

hca_id: mlx5_1
    transport: InfiniBand (0)
    fw_ver: 16.31.1014
    node_guid: ec0d:9a03:00bf:d92d
    sys_image_guid: ec0d:9a03:00bf:d92c
    vendor_id: 0x02c9
    vendor_part_id: 4119
    hw_ver: 0x0
    board_id: MT_0000000012
    phys_port_cnt: 1
        port: 1
            state: PORT_ACTIVE (4)
            max_mtu: 4096 (5)
            active_mtu: 1024 (3)
            sm_lid: 0
            port_lid: 0
            port_lmc: 0x00
            link_layer: Ethernet

$ lspci | grep Mellanox
82:00.0 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5]
82:00.1 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5]

$ cat /sys/bus/pci/devices/0000:82:00.*/roce_enable
0
0

$ cat /sys/class/infiniband/mlx5_*/ports/1/gid_attrs/types/3
RoCE v2
RoCE v2
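The two sysfs checks above can also be collected in one pass. This is only a sketch: it assumes the same sysfs paths shown in the commands above (the roce_enable file may not exist with other driver packages), and the function name read_roce_state and the sysfs_root parameter are my own additions so the code can be pointed at a test directory.

```python
from pathlib import Path


def read_roce_state(sysfs_root="/sys"):
    """Collect roce_enable flags and GID types from a sysfs tree."""
    root = Path(sysfs_root)
    state = {"roce_enable": {}, "gid_types": {}}

    # Per-PCI-function flag, keyed by PCI address such as 0000:82:00.0.
    for flag in sorted(root.glob("bus/pci/devices/*/roce_enable")):
        state["roce_enable"][flag.parent.name] = flag.read_text().strip()

    # GID table entry types, keyed by (device, port, GID index).
    pattern = "class/infiniband/*/ports/*/gid_attrs/types/*"
    for entry in sorted(root.glob(pattern)):
        device = entry.parents[4].name
        port = entry.parents[2].name
        try:
            gid_type = entry.read_text().strip()
        except OSError:
            continue  # unpopulated GID slots may not be readable
        state["gid_types"][(device, port, entry.name)] = gid_type
    return state
```

On my host this would report roce_enable as 0 for both PCI functions while GID index 3 still shows "RoCE v2", which is the mismatch I am asking about.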

Hi,

You might try one of these methods:

  1. Use mlxconfig to disable it permanently in the firmware.

Be sure you have the latest MOFED or the latest MFT (Mellanox Firmware Tools) installed.

Run as root:

# mlxconfig -d 82:00.0 s ROCE_CONTROL=1
# mlxconfig -d 82:00.1 s ROCE_CONTROL=1

Then reboot the host.

  2. Use devlink. This change is not permanent; restarting the openibd service will undo it.

# which devlink
/opt/mellanox/iproute2/sbin/devlink

# devlink dev param set pci/0000:82:00.0 name enable_roce value false cmode driverinit
# devlink dev reload pci/0000:82:00.0

To restore:

# devlink dev param set pci/0000:82:00.0 name enable_roce value true cmode driverinit
# devlink dev reload pci/0000:82:00.0

Thank you very much!

The second method fails with the error “kernel answers: Operation not supported”, and I’m not sure what’s wrong. But anyway, the first method works well and helps a lot.

By the way, I’m wondering if it is possible to configure the RoCEv2 UDP destination port to a value other than 4791 so that I can bypass the NIC without disabling RoCEv2?

Hi,

Changing the RoCEv2 port is not supported.

Thank you!