RDMA not working with ConnectX-6

I am trying to get RoCEv2 working. I have two ConnectX-6 NICs (Mellanox Technologies MT2892 Family) connected to each other with an Ethernet cable. The firewall service is disabled.

The output of ibv_devinfo, ifconfig, show_gids, modinfo, lspci, and ibdev2netdev all shows that the NICs are operational and bound to the correct drivers (see the additional information at the end of this post).

However, the following commands do not work: rping, ib_send_bw, ibv_rc_pingpong, and qperf.

When I use rping, it shows the following message:

[root@localhost perftest]# rping -c -a 10.1.2.251 -C 10 -v

cma event RDMA_CM_EVENT_ADDR_ERROR, error -19

waiting for addr/route resolution state 1

When I use ib_send_bw, I get the following on the client:

[root@localhost perftest]# ./ib_send_bw 10.1.2.251


Send BW Test

Dual-port : OFF Device : mlx5_0

Number of qps : 1 Transport type : IB

Connection type : RC Using SRQ : OFF

PCIe relax order: ON

ibv_wr* API : ON

TX depth : 128

CQ Moderation : 1

Mtu : 1024[B]

Link type : Ethernet

GID index : 3

Max inline data : 0[B]

rdma_cm QPs : OFF

Data ex. method : Ethernet


Failed to modify QP 182 to RTR

Unable to Connect the HCA's through the link

…and the following on the server:

[root@localhost perftest]# ./ib_send_bw


* Waiting for client to connect... *


Send BW Test

Dual-port : OFF Device : mlx5_0

Number of qps : 1 Transport type : IB

Connection type : RC Using SRQ : OFF

PCIe relax order: ON

ibv_wr* API : ON

RX depth : 512

CQ Moderation : 1

Mtu : 1024[B]

Link type : Ethernet

GID index : 3

Max inline data : 0[B]

rdma_cm QPs : OFF

Data ex. method : Ethernet


ethernet_read_keys: Couldn't read remote address

Unable to read to socket/rdma_cm

Failed to exchange data between server and clients

If I pass the -R command-line option (to connect using rdma_cm), I get a different error message:

[root@localhost perftest]# ./ib_send_bw -R 10.1.2.251

Received 10 times ADDR_ERROR

Unable to perform rdma_client function

Unable to init the socket connection

ibv_rc_pingpong reports something very different:

[root@localhost perftest]# ibv_rc_pingpong

local address: LID 0x0000, QPN 0x0000b7, PSN 0xd05bde, GID ::

Failed to modify QP to RTR

Couldn't connect to remote QP

[root@localhost perftest]# ibv_rc_pingpong 10.1.2.251

local address: LID 0x0000, QPN 0x0000b7, PSN 0x0cf37d, GID ::

client read/write: No space left on device

Couldn't read/write remote address

qperf also reports an issue with RDMA address lookup:

[root@localhost perftest]# qperf -cm1 10.1.2.251 rc_bw

rc_bw:

unexpected event from RDMA CM: Address error

expecting: Address resolved

My question: are there any additional configuration steps I am missing? I am using a RHEL 8.3 system with MLNX_OFED_LINUX-5.5-1.0.3.2 (OFED-5.5-1.0.3).

Here’s some additional information:

modinfo:

[root@localhost perftest]# modinfo mlx5_core

filename: /lib/modules/4.18.0-348.7.1.el8_5.x86_64/extra/mlnx-ofa_kernel/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.ko

ifconfig:

[root@localhost perftest]# ifconfig

enp1s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500

inet 10.1.2.250 netmask 255.0.0.0 broadcast 10.255.255.255

ether b8:ce:f6:30:10:3a txqueuelen 1000 (Ethernet)

RX packets 21 bytes 2934 (2.8 KiB)

RX errors 0 dropped 0 overruns 0 frame 0

TX packets 23 bytes 1764 (1.7 KiB)

TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

[root@localhost perftest]# ifconfig

enp1s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500

inet 10.1.2.251 netmask 255.0.0.0 broadcast 10.255.255.255

ether b8:ce:f6:30:10:3a txqueuelen 1000 (Ethernet)

RX packets 21 bytes 2934 (2.8 KiB)

RX errors 0 dropped 0 overruns 0 frame 0

TX packets 23 bytes 1764 (1.7 KiB)

TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

ibv_devinfo:

hca_id: mlx5_0

transport: InfiniBand (0)

fw_ver: 22.32.1010

node_guid: b8ce:f603:0030:1044

sys_image_guid: b8ce:f603:0030:1044

vendor_id: 0x02c9

vendor_part_id: 4125

hw_ver: 0x0

board_id: MT_0000000709

phys_port_cnt: 1

port: 1

state: PORT_ACTIVE (4)

max_mtu: 4096 (5)

active_mtu: 1024 (3)

sm_lid: 0

port_lid: 0

port_lmc: 0x00

link_layer: Ethernet

ibdev2netdev:

mlx5_0 port 1 ==> enp1s0 (Up)

show_gids:

[root@localhost perftest]# show_gids

DEV PORT INDEX GID IPv4 VER DEV


mlx5_0 1 0 fe80:0000:0000:0000:bace:f6ff:fe30:1044 v1 enp1s0

mlx5_0 1 1 fe80:0000:0000:0000:bace:f6ff:fe30:1044 v2 enp1s0

mlx5_0 1 2 0000:0000:0000:0000:0000:ffff:0a01:02fb 10.1.2.251 v1 enp1s0

mlx5_0 1 3 0000:0000:0000:0000:0000:ffff:0a01:02fb 10.1.2.251 v2 enp1s0

n_gids_found=4

DEV PORT INDEX GID IPv4 VER DEV


mlx5_0 1 0 fe80:0000:0000:0000:bace:f6ff:fe30:103a v1 enp1s0

mlx5_0 1 1 fe80:0000:0000:0000:bace:f6ff:fe30:103a v2 enp1s0

mlx5_0 1 2 0000:0000:0000:0000:0000:ffff:0a01:02fa 10.1.2.250 v1 enp1s0

mlx5_0 1 3 0000:0000:0000:0000:0000:ffff:0a01:02fa 10.1.2.250 v2 enp1s0

n_gids_found=4

Hello,

Based on the information provided, a number of syntax and configuration requirements were not met when you attempted to run these commands.

For more information on the perftest collection, as well as the individual utilities themselves, please see the following community post:

https://community.mellanox.com/s/article/perftest-package

Additionally, more information on qperf and rping can be found here:

https://community.mellanox.com/s/article/MLNX2-117-1581kn

https://community.mellanox.com/s/article/MLNX2-117-2807kn

https://man7.org/linux/man-pages/man1/rping.1.html

For usage examples, please see the following output for each command as run from our lab systems.

Regarding rping, you will need to start a server instance of rping before running it on the client side.

rping Server Side:

# rping -s -a 10.10.10.10

rping Client Side:

# rping -c -a 10.10.10.10 -C 10 -v

ping data: rdma-ping-0: ABCDEFGHIJKLMNOPQRSTUVWXYZ[]^_`abcdefghijklmnopqr

ping data: rdma-ping-1: BCDEFGHIJKLMNOPQRSTUVWXYZ[]^_`abcdefghijklmnopqrs

ping data: rdma-ping-2: CDEFGHIJKLMNOPQRSTUVWXYZ[]^_`abcdefghijklmnopqrst

ping data: rdma-ping-3: DEFGHIJKLMNOPQRSTUVWXYZ[]^_`abcdefghijklmnopqrstu

ping data: rdma-ping-4: EFGHIJKLMNOPQRSTUVWXYZ[]^_`abcdefghijklmnopqrstuv

ping data: rdma-ping-5: FGHIJKLMNOPQRSTUVWXYZ[]^_`abcdefghijklmnopqrstuvw

ping data: rdma-ping-6: GHIJKLMNOPQRSTUVWXYZ[]^_`abcdefghijklmnopqrstuvwx

ping data: rdma-ping-7: HIJKLMNOPQRSTUVWXYZ[]^_`abcdefghijklmnopqrstuvwxy

ping data: rdma-ping-8: IJKLMNOPQRSTUVWXYZ[]^_`abcdefghijklmnopqrstuvwxyz

ping data: rdma-ping-9: JKLMNOPQRSTUVWXYZ[]^_`abcdefghijklmnopqrstuvwxyzA

client DISCONNECT EVENT…

To run ibv_rc_pingpong, in addition to starting both a server and a client instance, you will need to specify the GID index with the -g flag. For clarity, we include the lab systems' show_gids output in this example and in the others that require it. (A short illustration of what the GID index selects follows the example below.)

ibv_rc_pingpong Server side:

# show_gids

DEV PORT INDEX GID IPv4 VER DEV

--- ---- ----- --- ------------ --- ---

mlx5_0 1 0 fe80:0000:0000:0000:9a03:9bff:fe13:f21c v1 enp129s0f0

mlx5_0 1 1 fe80:0000:0000:0000:9a03:9bff:fe13:f21c v2 enp129s0f0

mlx5_1 1 0 fe80:0000:0000:0000:9a03:9bff:fe13:f21d v1 enp129s0f1

mlx5_1 1 1 fe80:0000:0000:0000:9a03:9bff:fe13:f21d v2 enp129s0f1

mlx5_1 1 2 0000:0000:0000:0000:0000:ffff:0a0a:0a0a 10.10.10.10 v1 enp129s0f1

mlx5_1 1 3 0000:0000:0000:0000:0000:ffff:0a0a:0a0a 10.10.10.10 v2 enp129s0f1

n_gids_found=6

# ibv_rc_pingpong -d mlx5_1 -g 3

local address: LID 0x0000, QPN 0x000191, PSN 0xbba026, GID ::ffff:10.10.10.10

ibv_rc_pingpong Client side:

# show_gids

DEV PORT INDEX GID IPv4 VER DEV

--- ---- ----- --- ------------ --- ---

mlx5_0 1 0 fe80:0000:0000:0000:9a03:9bff:fe13:f414 v1 enp129s0f0

mlx5_0 1 1 fe80:0000:0000:0000:9a03:9bff:fe13:f414 v2 enp129s0f0

mlx5_1 1 0 fe80:0000:0000:0000:9a03:9bff:fe13:f415 v1 enp129s0f1

mlx5_1 1 1 fe80:0000:0000:0000:9a03:9bff:fe13:f415 v2 enp129s0f1

mlx5_1 1 2 0000:0000:0000:0000:0000:ffff:0a0a:0a0c 10.10.10.12 v1 enp129s0f1

mlx5_1 1 3 0000:0000:0000:0000:0000:ffff:0a0a:0a0c 10.10.10.12 v2 enp129s0f1

n_gids_found=6

# ibv_rc_pingpong -d mlx5_1 -g 3 10.10.10.10

local address: LID 0x0000, QPN 0x000191, PSN 0x846386, GID ::ffff:10.10.10.12

remote address: LID 0x0000, QPN 0x000191, PSN 0xbba026, GID ::ffff:10.10.10.10

8192000 bytes in 0.01 seconds = 8031.37 Mbit/sec

1000 iters in 0.01 seconds = 8.16 usec/iter
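As a rough illustration of what the -g (and later -x) GID index refers to, the following minimal libibverbs sketch queries one entry of the GID table, which is the same entry show_gids prints for that index. It is not part of perftest; the choice of the first device, port 1, and index 3 are assumptions matching the show_gids output above.

/* gid_query.c: compile with gcc gid_query.c -o gid_query -libverbs */
#include <stdio.h>
#include <infiniband/verbs.h>

int main(void)
{
    int num = 0;
    struct ibv_device **devs = ibv_get_device_list(&num);   /* list RDMA devices (e.g. mlx5_0) */
    if (!devs || num == 0) {
        fprintf(stderr, "no RDMA devices found\n");
        return 1;
    }

    struct ibv_context *ctx = ibv_open_device(devs[0]);     /* first device, for brevity */
    if (!ctx) {
        fprintf(stderr, "ibv_open_device failed\n");
        return 1;
    }

    union ibv_gid gid;
    int gid_index = 3;   /* assumption: the RoCEv2/IPv4 entry reported by show_gids */
    if (ibv_query_gid(ctx, 1, gid_index, &gid)) {           /* port 1, GID table entry 3 */
        fprintf(stderr, "ibv_query_gid failed for index %d\n", gid_index);
        return 1;
    }

    printf("GID[%d]:", gid_index);
    for (int i = 0; i < 16; i++)
        printf("%s%02x", i ? ":" : " ", gid.raw[i]);
    printf("\n");

    ibv_close_device(ctx);
    ibv_free_device_list(devs);
    return 0;
}

On RoCE, omitting the GID index (or selecting one that does not match the address family and RoCE version you intend to use) typically leads to the "Failed to modify QP to RTR" error seen in your output.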

To run the ib_send_bw test, specify the device with -d [DeviceName] and the GID index of the configured port with the -x flag.

ib_send_bw Server side:

# ib_send_bw -R -d mlx5_1 -x 3

************************************

* Waiting for client to connect… *

************************************

ib_send_bw Client Side:

# ib_send_bw -R -d mlx5_1 10.10.10.12 -x 3

---------------------------------------------------------------------------------------

Send BW Test

Dual-port : OFF Device : mlx5_1

Number of qps : 1 Transport type : IB

Connection type : RC Using SRQ : OFF

PCIe relax order: ON

ibv_wr* API : ON

TX depth : 128

CQ Moderation : 1

Mtu : 1024[B]

Link type : Ethernet

GID index : 3

Max inline data : 0[B]

rdma_cm QPs : ON

Data ex. method : rdma_cm

---------------------------------------------------------------------------------------

local address: LID 0000 QPN 0x018a PSN 0x528956

GID: 00:00:00:00:00:00:00:00:00:00:255:255:10:10:10:10

remote address: LID 0000 QPN 0x018a PSN 0x419cbe

GID: 00:00:00:00:00:00:00:00:00:00:255:255:10:10:10:12

---------------------------------------------------------------------------------------

#bytes #iterations BW peak[MB/sec] BW average[MB/sec] MsgRate[Mpps]

Conflicting CPU frequency values detected: 1500.000000 != 2350.000000. CPU Frequency is not max.

65536 1000 2758.33 2758.32 0.044133

---------------------------------------------------------------------------------------
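With the -R option, ib_send_bw exchanges connection information over rdma_cm instead of a plain TCP socket, and rdma_cm first has to resolve the destination IP to a local RDMA device and GID. As a rough illustration of that address-resolution step (the step that reported ADDR_ERROR / RDMA_CM_EVENT_ADDR_ERROR in your output), here is a minimal librdmacm sketch; the 10.10.10.10 address is simply the lab server from the examples above, and the file name is hypothetical.

/* cm_resolve.c: compile with gcc cm_resolve.c -o cm_resolve -lrdmacm */
#include <stdio.h>
#include <string.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <rdma/rdma_cma.h>

int main(void)
{
    struct rdma_event_channel *ec = rdma_create_event_channel();
    struct rdma_cm_id *id = NULL;
    if (!ec || rdma_create_id(ec, &id, NULL, RDMA_PS_TCP)) {
        perror("rdma_create_id");
        return 1;
    }

    struct sockaddr_in dst;
    memset(&dst, 0, sizeof(dst));
    dst.sin_family = AF_INET;
    inet_pton(AF_INET, "10.10.10.10", &dst.sin_addr);        /* lab server address */

    /* Ask rdma_cm to map the destination IP to a local RDMA device/GID. */
    if (rdma_resolve_addr(id, NULL, (struct sockaddr *)&dst, 2000)) {
        perror("rdma_resolve_addr");
        return 1;
    }

    struct rdma_cm_event *ev = NULL;
    if (rdma_get_cm_event(ec, &ev) == 0) {
        /* Expect RDMA_CM_EVENT_ADDR_RESOLVED; ADDR_ERROR means no usable device/GID was found. */
        printf("event: %s (status %d)\n", rdma_event_str(ev->event), ev->status);
        rdma_ack_cm_event(ev);
    }

    rdma_destroy_id(id);
    rdma_destroy_event_channel(ec);
    return 0;
}

If a small test like this also reports ADDR_ERROR, address resolution itself is failing for that IP, independent of any perftest command-line options.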

For qperf, you will need to run a server instance as well as the client for the requested test:

qperf Server Side:

# qperf

qperf Client Side:

# qperf -cm1 10.10.10.10 rc_bw

rc_bw:

bw = 4.63 GB/sec

If you require additional support beyond the usage guidance provided here, please open a support case. If you do not have a current/valid contract, please reach out to the team at Networking-contracts@nvidia.com for assistance in setting one up.

Thank you,

Nvidia Network Support