I am trying to get RoCEv2 working. I have two ConnectX-6 NICs (Mellanox Technologies MT2892 Family) connected to each other with an Ethernet cable. The firewall service is disabled.
The output of ibv_devinfo, ifconfig, show_gids, modinfo, lspci and ibdev2netdev all indicates that the NICs are operational and bound to the correct drivers.
However, the following commands all fail: rping, ib_send_bw, ibv_rc_pingpong and qperf.
When I use rping, it shows the following message:
[root@localhost perftest]# rping -c -a 10.1.2.251 -C 10 -v
cma event RDMA_CM_EVENT_ADDR_ERROR, error -19
waiting for addr/route resolution state 1
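For reference, the -19 in the rping output looks like a negative errno value. Here is a minimal sketch for decoding it, assuming the rdma_cm event status follows the usual negative-errno convention (describe_cm_status is a hypothetical helper name of my own, not part of any RDMA library):

```python
import errno
import os


def describe_cm_status(status: int) -> str:
    """Map a negative status code to its errno name and message.

    Assumption: the status printed by rping is a negated errno value.
    """
    code = abs(status)
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name}: {os.strerror(code)}"


# -19 is the value printed next to RDMA_CM_EVENT_ADDR_ERROR above.
print(describe_cm_status(-19))
```

On Linux, errno 19 is ENODEV, which suggests that address resolution could not associate the destination with an RDMA device.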
When I use ib_send_bw, I get the following on the client:
[root@localhost perftest]# ./ib_send_bw 10.1.2.251
Send BW Test
Dual-port : OFF Device : mlx5_0
Number of qps : 1 Transport type : IB
Connection type : RC Using SRQ : OFF
PCIe relax order: ON
ibv_wr* API : ON
TX depth : 128
CQ Moderation : 1
Mtu : 1024[B]
Link type : Ethernet
GID index : 3
Max inline data : 0[B]
rdma_cm QPs : OFF
Data ex. method : Ethernet
Failed to modify QP 182 to RTR
Unable to Connect the HCA’s through the link
…and the following on the server:
[root@localhost perftest]# ./ib_send_bw
* Waiting for client to connect... *
Send BW Test
Dual-port : OFF Device : mlx5_0
Number of qps : 1 Transport type : IB
Connection type : RC Using SRQ : OFF
PCIe relax order: ON
ibv_wr* API : ON
RX depth : 512
CQ Moderation : 1
Mtu : 1024[B]
Link type : Ethernet
GID index : 3
Max inline data : 0[B]
rdma_cm QPs : OFF
Data ex. method : Ethernet
ethernet_read_keys: Couldn’t read remote address
Unable to read to socket/rdma_cm
Failed to exchange data between server and clients
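Both sides report GID index 3 on mlx5_0. A minimal sketch for checking what that index actually maps to, reading the same sysfs entries that show_gids reports. It assumes port 1 and the standard /sys/class/infiniband layout; list_gids and read_entry are hypothetical helper names:

```python
from pathlib import Path

ZERO_GID = "0000:0000:0000:0000:0000:0000:0000:0000"


def read_entry(path: Path) -> str:
    """Return the stripped file content, or '' if the entry is unreadable."""
    try:
        return path.read_text().strip()
    except OSError:
        # Unused GID slots can fail to read; treat them as empty.
        return ""


def list_gids(device: str, port: int = 1, root: str = "/sys/class/infiniband"):
    """Return (index, gid, type, netdev) for every populated GID entry."""
    port_dir = Path(root) / device / "ports" / str(port)
    gids_dir = port_dir / "gids"
    if not gids_dir.is_dir():
        return []
    rows = []
    entries = [p for p in gids_dir.iterdir() if p.name.isdigit()]
    for gid_file in sorted(entries, key=lambda p: int(p.name)):
        gid = read_entry(gid_file)
        if not gid or gid == ZERO_GID:
            continue  # skip empty slots
        gid_type = read_entry(port_dir / "gid_attrs" / "types" / gid_file.name)
        netdev = read_entry(port_dir / "gid_attrs" / "ndevs" / gid_file.name)
        rows.append((int(gid_file.name), gid, gid_type, netdev))
    return rows


if __name__ == "__main__":
    # mlx5_0 is the device name shown in the ib_send_bw output above.
    for idx, gid, gid_type, netdev in list_gids("mlx5_0"):
        print(f"{idx:>3}  {gid}  {gid_type or '?':<12}  {netdev or '?'}")
```

For RoCEv2 over IPv4, I would expect the row for index 3 to show type "RoCE v2" and a GID that embeds the interface's IPv4 address.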
If I pass the -R command-line option (to connect the QPs through rdma_cm), I get a different error message:
[root@localhost perftest]# ./ib_send_bw -R 10.1.2.251
Received 10 times ADDR_ERROR
Unable to perform rdma_client function
Unable to init the socket connection
ibv_rc_pingpong reports something very different:
[root@localhost perftest]# ibv_rc_pingpong
local address: LID 0x0000, QPN 0x0000b7, PSN 0xd05bde, GID ::
Failed to modify QP to RTR
Couldn’t connect to remote QP
[root@localhost perftest]# ibv_rc_pingpong 10.1.2.251
local address: LID 0x0000, QPN 0x0000b7, PSN 0x0cf37d, GID ::
client read/write: No space left on device
Couldn’t read/write remote address
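In both runs the local address line shows GID ::, the all-zero address, which is what ibv_rc_pingpong prints when no GID index is selected with its -g option. A minimal sketch for parsing that line and flagging an unset GID, assuming the line format shown above (parse_local_address is a hypothetical helper name):

```python
import ipaddress
import re

# Matches the "local address" line printed by ibv_rc_pingpong.
LINE_RE = re.compile(
    r"local address:\s+LID (0x[0-9a-fA-F]+), QPN (0x[0-9a-fA-F]+), "
    r"PSN (0x[0-9a-fA-F]+), GID (\S+)"
)


def parse_local_address(line: str):
    """Return the parsed fields as a dict, or None if the line does not match."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    lid, qpn, psn, gid = match.groups()
    return {
        "lid": int(lid, 16),
        "qpn": int(qpn, 16),
        "psn": int(psn, 16),
        # "::" is the unspecified IPv6 address, i.e. no GID was chosen.
        "gid_unset": ipaddress.IPv6Address(gid).is_unspecified,
    }


line = "local address: LID 0x0000, QPN 0x0000b7, PSN 0xd05bde, GID ::"
info = parse_local_address(line)
print(info)
```

A LID of 0 is normal on an Ethernet link layer, so the unset GID is the field worth checking here. This is an observation about the output, not a confirmed cause.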
qperf also reports an issue with RDMA address lookup:
[root@localhost perftest]# qperf -cm1 10.1.2.251 rc_bw
rc_bw:
unexpected event from RDMA CM: Address error
expecting: Address resolved
My question: are there any additional configuration steps that I am missing? I am using a RHEL 8.3 system with MLNX_OFED_LINUX-5.5-1.0.3.2 (OFED-5.5-1.0.3).