Question
I have two machines, each with two ConnectX-6 (cx6) NICs.
NICs in the same network segment can communicate over RDMA, but NICs in different network segments cannot.
Environment
machine1
root@:~$ rdma link show
link mlx5_0/1 state ACTIVE physical_state LINK_UP netdev ens105f0np0
link mlx5_1/1 state ACTIVE physical_state LINK_UP netdev ens105f1np1
core@:~$ ip a | grep -e ens105f0np0 -e ens105f1np1
8: ens105f0np0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
inet 10.16.255.10/24 brd 10.16.255.255 scope global ens105f0np0
9: ens105f1np1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
inet 10.22.1.10/24 brd 10.22.1.255 scope global ens105f1np1
machine2
root@:~$ rdma link show
link mlx5_0/1 state ACTIVE physical_state LINK_UP netdev ens105f0np0
link mlx5_1/1 state ACTIVE physical_state LINK_UP netdev ens105f1np1
core@:~$ ip a | grep -e ens105f0np0 -e ens105f1np1
8: ens105f0np0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
inet 10.16.255.11/24 brd 10.16.255.255 scope global ens105f0np0
9: ens105f1np1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
inet 10.22.1.11/24 brd 10.22.1.255 scope global ens105f1np1
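To make the two segments explicit, here is a minimal sketch that checks which of the addresses listed above share an IP network. The dictionary keys and the helper name same_subnet are illustrative assumptions; only the addresses come from the output above.

```python
import ipaddress

# Addresses copied from the `ip a` output above, keyed by (machine, RDMA device).
ADDRS = {
    ("machine1", "mlx5_0"): "10.16.255.10/24",
    ("machine1", "mlx5_1"): "10.22.1.10/24",
    ("machine2", "mlx5_0"): "10.16.255.11/24",
    ("machine2", "mlx5_1"): "10.22.1.11/24",
}


def same_subnet(a: str, b: str) -> bool:
    """Return True when both interface addresses fall in the same IP network."""
    return ipaddress.ip_interface(a).network == ipaddress.ip_interface(b).network


# Compare every machine1 port against every machine2 port.
for (m1, dev1), cidr1 in ADDRS.items():
    if m1 != "machine1":
        continue
    for (m2, dev2), cidr2 in ADDRS.items():
        if m2 != "machine2":
            continue
        verdict = "same subnet" if same_subnet(cidr1, cidr2) else "different subnets"
        print(f"{m1}/{dev1} -> {m2}/{dev2}: {verdict}")
```

Pairs reported as "different subnets" can only reach each other if the hosts and the network between them route between 10.16.255.0/24 and 10.22.1.0/24.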
Scenario
Same CIDR: works
core@:~$ ib_write_bw -d mlx5_0
************************************
* Waiting for client to connect... *
************************************
---------------------------------------------------------------------------------------
RDMA_Write BW Test
Dual-port : OFF Device : mlx5_0
Number of qps : 1 Transport type : IB
Connection type : RC Using SRQ : OFF
PCIe relax order: ON Lock-free : OFF
ibv_wr* API : ON Using DDP : OFF
CQ Moderation : 1
Mtu : 1024[B]
Link type : Ethernet
GID index : 3
Max inline data : 0[B]
rdma_cm QPs : OFF
Data ex. method : Ethernet
---------------------------------------------------------------------------------------
local address: LID 0000 QPN 0x00c7 PSN 0x627216 RKey 0x203d00 VAddr 0x007f6f820ac000
GID: 00:00:00:00:00:00:00:00:00:00:255:255:10:16:255:11
remote address: LID 0000 QPN 0x024c PSN 0x50d5db RKey 0x203d00 VAddr 0x007f9f133c7000
GID: 00:00:00:00:00:00:00:00:00:00:255:255:10:16:255:10
---------------------------------------------------------------------------------------
#bytes #iterations BW peak[MiB/sec] BW average[MiB/sec] MsgRate[Mpps]
65536 5000 2758.12 2758.08 0.044129
---------------------------------------------------------------------------------------
core@:~$ ib_write_bw 10.16.255.11 -d mlx5_0
---------------------------------------------------------------------------------------
RDMA_Write BW Test
Dual-port : OFF Device : mlx5_0
Number of qps : 1 Transport type : IB
Connection type : RC Using SRQ : OFF
PCIe relax order: ON Lock-free : OFF
ibv_wr* API : ON Using DDP : OFF
TX depth : 128
CQ Moderation : 1
Mtu : 1024[B]
Link type : Ethernet
GID index : 3
Max inline data : 0[B]
rdma_cm QPs : OFF
Data ex. method : Ethernet
---------------------------------------------------------------------------------------
local address: LID 0000 QPN 0x024c PSN 0x50d5db RKey 0x203d00 VAddr 0x007f9f133c7000
GID: 00:00:00:00:00:00:00:00:00:00:255:255:10:16:255:10
remote address: LID 0000 QPN 0x00c7 PSN 0x627216 RKey 0x203d00 VAddr 0x007f6f820ac000
GID: 00:00:00:00:00:00:00:00:00:00:255:255:10:16:255:11
---------------------------------------------------------------------------------------
#bytes #iterations BW peak[MiB/sec] BW average[MiB/sec] MsgRate[Mpps]
Conflicting CPU frequency values detected: 800.000000 != 3000.000000. CPU Frequency is not max.
65536 5000 2758.12 2758.08 0.044129
---------------------------------------------------------------------------------------
Different CIDR: fails
core@:~$ ib_write_bw -d mlx5_0
************************************
* Waiting for client to connect... *
************************************
---------------------------------------------------------------------------------------
RDMA_Write BW Test
Dual-port : OFF Device : mlx5_0
Number of qps : 1 Transport type : IB
Connection type : RC Using SRQ : OFF
PCIe relax order: ON Lock-free : OFF
ibv_wr* API : ON Using DDP : OFF
CQ Moderation : 1
Mtu : 1024[B]
Link type : Ethernet
GID index : 3
Max inline data : 0[B]
rdma_cm QPs : OFF
Data ex. method : Ethernet
---------------------------------------------------------------------------------------
local address: LID 0000 QPN 0x00c8 PSN 0xf8ea40 RKey 0x203d00 VAddr 0x007fc295bcc000
GID: 00:00:00:00:00:00:00:00:00:00:255:255:10:16:255:11
remote address: LID 0000 QPN 0x01bf PSN 0xa7999c RKey 0x23fd00 VAddr 0x007f33b8993000
GID: 00:00:00:00:00:00:00:00:00:00:255:255:10:22:01:10
---------------------------------------------------------------------------------------
#bytes #iterations BW peak[MiB/sec] BW average[MiB/sec] MsgRate[Mpps]
ethernet_read_keys: Couldn't read remote address
Unable to read to socket/rdma_cm
Failed to exchange data between server and clients
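The GID lines in the output above look like IPv4-mapped addresses printed as 16 decimal bytes, with the IPv4 address in the last four. A minimal sketch for decoding them, assuming that print format; gid_to_ipv4 is a hypothetical helper, not part of perftest.

```python
import ipaddress


def gid_to_ipv4(gid: str):
    """Decode a GID printed as 16 colon-separated decimal bytes.

    Returns the embedded IPv4 address when the GID is IPv4-mapped
    (ten zero bytes, then 255:255, then the four address bytes),
    otherwise None.
    """
    parts = [int(p) for p in gid.split(":")]
    if len(parts) != 16 or any(not 0 <= p <= 255 for p in parts):
        raise ValueError(f"not a 16-byte GID: {gid!r}")
    if parts[:10] == [0] * 10 and parts[10:12] == [255, 255]:
        return ipaddress.IPv4Address(bytes(parts[12:]))
    return None


# GIDs copied from the failing server output above.
local = gid_to_ipv4("00:00:00:00:00:00:00:00:00:00:255:255:10:16:255:11")
remote = gid_to_ipv4("00:00:00:00:00:00:00:00:00:00:255:255:10:22:01:10")
print(local, remote)
```

Decoded this way, the server's local GID is 10.16.255.11 and the remote GID is 10.22.1.10, so the two queue pairs sit in different /24 networks in this run.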
core@:~$ ib_write_bw 10.16.255.11 -d mlx5_1
---------------------------------------------------------------------------------------
RDMA_Write BW Test
Dual-port : OFF Device : mlx5_1
Number of qps : 1 Transport type : IB
Connection type : RC Using SRQ : OFF
PCIe relax order: ON Lock-free : OFF
ibv_wr* API : ON Using DDP : OFF
TX depth : 128
CQ Moderation : 1
Mtu : 1024[B]
Link type : Ethernet
GID index : 3
Max inline data : 0[B]
rdma_cm QPs : OFF
Data ex. method : Ethernet
---------------------------------------------------------------------------------------
local address: LID 0000 QPN 0x01bf PSN 0xa7999c RKey 0x23fd00 VAddr 0x007f33b8993000
GID: 00:00:00:00:00:00:00:00:00:00:255:255:10:22:01:10
remote address: LID 0000 QPN 0x00c8 PSN 0xf8ea40 RKey 0x203d00 VAddr 0x007fc295bcc000
GID: 00:00:00:00:00:00:00:00:00:00:255:255:10:16:255:11
---------------------------------------------------------------------------------------
#bytes #iterations BW peak[MiB/sec] BW average[MiB/sec] MsgRate[Mpps]
Completion with error at client
Failed status 12: wr_id 0 syndrom 0x81
scnt=128, ccnt=0
Failed to complete run_iter_bw function successfully
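The numeric status in the client error can be mapped to a name. The sketch below mirrors the first values of enum ibv_wc_status from libibverbs in Python and parses the "Failed status" line shown above; the parse_failure helper and its regular expression are assumptions based on that single line, and the vendor syndrome is left uninterpreted.

```python
import re
from enum import IntEnum


class WcStatus(IntEnum):
    """Work completion status codes, mirroring enum ibv_wc_status (values 0..13)."""
    IBV_WC_SUCCESS = 0
    IBV_WC_LOC_LEN_ERR = 1
    IBV_WC_LOC_QP_OP_ERR = 2
    IBV_WC_LOC_EEC_OP_ERR = 3
    IBV_WC_LOC_PROT_ERR = 4
    IBV_WC_WR_FLUSH_ERR = 5
    IBV_WC_MW_BIND_ERR = 6
    IBV_WC_BAD_RESP_ERR = 7
    IBV_WC_LOC_ACCESS_ERR = 8
    IBV_WC_REM_INV_REQ_ERR = 9
    IBV_WC_REM_ACCESS_ERR = 10
    IBV_WC_REM_OP_ERR = 11
    IBV_WC_RETRY_EXC_ERR = 12
    IBV_WC_RNR_RETRY_EXC_ERR = 13


def parse_failure(line: str):
    """Return (status, wr_id, syndrome) from a perftest failure line, or None."""
    m = re.search(r"Failed status (\d+): wr_id (\d+) syndrom (0x[0-9a-fA-F]+)", line)
    if m is None:
        return None
    return WcStatus(int(m.group(1))), int(m.group(2)), int(m.group(3), 16)


status, wr_id, syndrome = parse_failure(" Failed status 12: wr_id 0 syndrom 0x81")
print(status.name, wr_id, hex(syndrome))
```

Status 12 is IBV_WC_RETRY_EXC_ERR, meaning the transport retry counter was exceeded: the client sent its writes (scnt=128) but received no acknowledgement (ccnt=0). That would be consistent with the RoCE packets not reaching the peer on the other subnet, though the output alone does not prove where they were dropped.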
Different CIDR with "-R": works
core@:~$ ib_write_bw -d mlx5_0 -R
************************************
* Waiting for client to connect... *
************************************
---------------------------------------------------------------------------------------
RDMA_Write BW Test
Dual-port : OFF Device : mlx5_0
Number of qps : 1 Transport type : IB
Connection type : RC Using SRQ : OFF
PCIe relax order: ON Lock-free : OFF
ibv_wr* API : ON Using DDP : OFF
CQ Moderation : 1
Mtu : 1024[B]
Link type : Ethernet
GID index : 3
Max inline data : 0[B]
rdma_cm QPs : ON
Data ex. method : rdma_cm
---------------------------------------------------------------------------------------
Waiting for client rdma_cm QP to connect
Please run the same command with the IB/RoCE interface IP
---------------------------------------------------------------------------------------
local address: LID 0000 QPN 0x00ca PSN 0x7daa14
GID: 00:00:00:00:00:00:00:00:00:00:255:255:10:16:255:11
remote address: LID 0000 QPN 0x024e PSN 0x134574
GID: 00:00:00:00:00:00:00:00:00:00:255:255:10:16:255:10
---------------------------------------------------------------------------------------
#bytes #iterations BW peak[MiB/sec] BW average[MiB/sec] MsgRate[Mpps]
65536 5000 2745.92 2745.92 0.043935
---------------------------------------------------------------------------------------
core@:~$ ib_write_bw 10.16.255.11 -d mlx5_1 -R
---------------------------------------------------------------------------------------
RDMA_Write BW Test
Dual-port : OFF Device : mlx5_1
Number of qps : 1 Transport type : IB
Connection type : RC Using SRQ : OFF
PCIe relax order: ON Lock-free : OFF
ibv_wr* API : ON Using DDP : OFF
TX depth : 128
CQ Moderation : 1
Mtu : 1024[B]
Link type : Ethernet
GID index : 3
Max inline data : 0[B]
rdma_cm QPs : ON
Data ex. method : rdma_cm
---------------------------------------------------------------------------------------
local address: LID 0000 QPN 0x024e PSN 0x134574
GID: 00:00:00:00:00:00:00:00:00:00:255:255:10:16:255:10
remote address: LID 0000 QPN 0x00ca PSN 0x7daa14
GID: 00:00:00:00:00:00:00:00:00:00:255:255:10:16:255:11
---------------------------------------------------------------------------------------
#bytes #iterations BW peak[MiB/sec] BW average[MiB/sec] MsgRate[Mpps]
Conflicting CPU frequency values detected: 800.000000 != 2001.000000. CPU Frequency is not max.
65536 5000 2745.92 2745.92 0.043935
---------------------------------------------------------------------------------------
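One detail in the "-R" output is worth checking: the client was started with -d mlx5_1, yet both GIDs it reports end in 10.16.255.x. The sketch below looks up which local device owns the address embedded in the client's local GID (its last four bytes, 10.16.255.10). The device table is copied from the machine1 listing above, on the assumption that machine1 is the client since the server address 10.16.255.11 belongs to machine2; device_for_ip is a hypothetical helper.

```python
import ipaddress

# machine1 addresses from the `ip a` / `rdma link show` output above.
MACHINE1_DEVICES = {
    "mlx5_0": "10.16.255.10/24",
    "mlx5_1": "10.22.1.10/24",
}


def device_for_ip(ip: str, devices: dict):
    """Return the RDMA device whose interface address equals ip, or None."""
    addr = ipaddress.ip_address(ip)
    for dev, cidr in devices.items():
        if ipaddress.ip_interface(cidr).ip == addr:
            return dev
    return None


requested = "mlx5_1"          # value passed to -d on the client
gid_ip = "10.16.255.10"       # last four bytes of the client's local GID
actual = device_for_ip(gid_ip, MACHINE1_DEVICES)
print(f"requested {requested}, GID address belongs to {actual}")
```

If this reading is right, the rdma_cm run may have resolved the route to 10.16.255.11 through mlx5_0 and stayed inside 10.16.255.0/24, in which case it would not actually be exercising the cross-subnet path.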