Raw_ethernet_lat not working: Unable to create QP

RNIC: ConnectX-4

Ping works as expected:

~/cuda/perftest-23.04.0$ ping 20.4.3.220
PING 20.4.3.220 (20.4.3.220) 56(84) bytes of data.
64 bytes from 20.4.3.220: icmp_seq=1 ttl=64 time=0.617 ms

But the following does not work:

$ ./raw_ethernet_lat -d mlx5_0 --source_mac 08:c0:eb:cb:93:52 --dest_mac 08:c0:eb:cb:93:82 --server
 Min msg size for RawEth is 64B - changing msg size to 64 
---------------------------------------------------------------------------------------
                    Send Latency Test
 Dual-port       : OFF		Device         : mlx5_0
 Number of qps   : 1		Transport type : IB
 Connection type : RawEth		Using SRQ      : OFF
 PCIe relax order: ON
 ibv_wr* API     : OFF
 TX depth        : 1
 RX depth        : 512
 Mtu             : 1518[B]
 Link type       : Ethernet
 GID index       : 0
 Max inline data : 236[B]
 rdma_cm QPs	 : OFF
 Data ex. method : Ethernet
---------------------------------------------------------------------------------------
MAC attached  : 08:C0:EB:CB:93:52
Unable to create QP.
Failed to create QP.
 Couldn't create IB resources

Source and destination MAC addresses:

Source:       08:c0:eb:cb:93:52
Destination:  08:c0:eb:cb:93:82

Do you have any idea why it is not working?

Please check that the MACs are right, and if the traffic goes through a switch, check that the MTU matches.

Did you perhaps ping 20.4.3.220 from 20.4.3.220 itself? Make sure ping works between the two ports.

https://enterprise-support.nvidia.com/s/article/raw-ethernet-lat
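
To double-check the MAC and MTU, one command per port is enough. A minimal example (run on each host, substituting the port's netdev name, e.g. the interfaces shown later in this thread):

$ ip link show <interface>     # prints "mtu ..." and "link/ether <MAC>" for that port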

Please check that the MACs are right, and if the traffic goes through a switch, check that the MTU matches.

The MAC is correct on the server side. There is no switch; this is a direct peer-to-peer connection over an optical cable with SFP28 transceivers.
Available MAC addresses for the server:

enp33s0f0np0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 20.4.3.220  netmask 255.255.255.0  broadcast 20.4.3.255
        inet6 fe80::7228:63dd:1b93:ae80  prefixlen 64  scopeid 0x20<link>
        ether 08:c0:eb:cb:93:52  txqueuelen 1000  (Ethernet)
        RX packets 1811  bytes 109550 (109.5 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 85  bytes 8994 (8.9 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

enp33s0f1np1: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        ether 08:c0:eb:cb:93:53  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

Available MAC addresses for the client:

enp6s0f0np0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 20.4.3.219  netmask 255.255.255.0  broadcast 20.4.3.255
        inet6 fe80::83fa:cf69:55a5:5cfa  prefixlen 64  scopeid 0x20<link>
        ether 08:c0:eb:cb:93:82  txqueuelen 1000  (Ethernet)
        RX packets 73  bytes 7942 (7.9 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 1918  bytes 118122 (118.1 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

enp6s0f1np1: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        ether 08:c0:eb:cb:93:83  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

GIDs of the server:

$ show_gids
DEV	PORT	INDEX	GID					IPv4  		VER	DEV
---	----	-----	---					------------  	---	---
mlx5_0	1	0	fe80:0000:0000:0000:0ac0:ebff:fecb:9352			v1	enp33s0f0np0
mlx5_0	1	1	fe80:0000:0000:0000:0ac0:ebff:fecb:9352			v2	enp33s0f0np0
mlx5_0	1	2	fe80:0000:0000:0000:7228:63dd:1b93:ae80			v1	enp33s0f0np0
mlx5_0	1	3	fe80:0000:0000:0000:7228:63dd:1b93:ae80			v2	enp33s0f0np0
mlx5_0	1	4	0000:0000:0000:0000:0000:ffff:1404:03dc	20.4.3.220  	v1	enp33s0f0np0
mlx5_0	1	5	0000:0000:0000:0000:0000:ffff:1404:03dc	20.4.3.220  	v2	enp33s0f0np0
mlx5_1	1	0	fe80:0000:0000:0000:0ac0:ebff:fecb:9353			v1	enp33s0f1np1
mlx5_1	1	1	fe80:0000:0000:0000:0ac0:ebff:fecb:9353			v2	enp33s0f1np1
n_gids_found=8

GIDs of the client:

$ show_gids
DEV	PORT	INDEX	GID					IPv4  		VER	DEV
---	----	-----	---					------------  	---	---
mlx5_0	1	0	fe80:0000:0000:0000:0ac0:ebff:fecb:9382			v1	enp6s0f0np0
mlx5_0	1	1	fe80:0000:0000:0000:0ac0:ebff:fecb:9382			v2	enp6s0f0np0
mlx5_0	1	2	fe80:0000:0000:0000:83fa:cf69:55a5:5cfa			v1	enp6s0f0np0
mlx5_0	1	3	fe80:0000:0000:0000:83fa:cf69:55a5:5cfa			v2	enp6s0f0np0
mlx5_0	1	4	0000:0000:0000:0000:0000:ffff:1404:03db	20.4.3.219  	v1	enp6s0f0np0
mlx5_0	1	5	0000:0000:0000:0000:0000:ffff:1404:03db	20.4.3.219  	v2	enp6s0f0np0
mlx5_1	1	0	fe80:0000:0000:0000:0ac0:ebff:fecb:9383			v1	enp6s0f1np1
mlx5_1	1	1	fe80:0000:0000:0000:0ac0:ebff:fecb:9383			v2	enp6s0f1np1
n_gids_found=8

Ping test from the server:

$ ping -c 1 -I 20.4.3.220 20.4.3.219
PING 20.4.3.219 (20.4.3.219) from 20.4.3.220 : 56(84) bytes of data.
64 bytes from 20.4.3.219: icmp_seq=1 ttl=64 time=0.383 ms

--- 20.4.3.219 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.383/0.383/0.383/0.000 ms

Ping test from the client:

$ ping -c 1 -I 20.4.3.219 20.4.3.220
PING 20.4.3.220 (20.4.3.220) from 20.4.3.219 : 56(84) bytes of data.
64 bytes from 20.4.3.220: icmp_seq=1 ttl=64 time=0.788 ms

--- 20.4.3.220 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.788/0.788/0.788/0.000 ms

mst status from the server:

$ sudo mst status -v
MST modules:
------------
    MST PCI module is not loaded
    MST PCI configuration module loaded
PCI devices:
------------
DEVICE_TYPE             MST                           PCI       RDMA            NET                       NUMA  
ConnectX4LX(rev:0)      /dev/mst/mt4117_pciconf0.1    21:00.1   mlx5_1          net-enp33s0f1np1          -1    

ConnectX4LX(rev:0)      /dev/mst/mt4117_pciconf0      21:00.0   mlx5_0          net-enp33s0f0np0          -1 

mst status from the client:

$ sudo mst status -v
MST modules:
------------
    MST PCI module is not loaded
    MST PCI configuration module loaded
PCI devices:
------------
DEVICE_TYPE             MST                           PCI       RDMA            NET                       NUMA  
ConnectX4LX(rev:0)      /dev/mst/mt4117_pciconf0.1    06:00.1   mlx5_1          net-enp6s0f1np1           -1    

ConnectX4LX(rev:0)      /dev/mst/mt4117_pciconf0      06:00.0   mlx5_0          net-enp6s0f0np0           -1

I checked in my lab and see no issue.

root@mtbc-r740-07:~# raw_ethernet_lat -d mlx5_2 -E 1c:34:da:65:82:bd -B 1c:34:da:65:82:75 --server -F
Min msg size for RawEth is 64B - changing msg size to 64

                Send Latency Test

Dual-port : OFF Device : mlx5_2
Number of qps : 1 Transport type : IB
Connection type : RawEth Using SRQ : OFF
PCIe relax order: ON
ibv_wr* API : OFF
TX depth : 1
RX depth : 512
Mtu : 1518[B]
Link type : Ethernet
GID index : 0
Max inline data : 236[B]
rdma_cm QPs : OFF
Data ex. method : Ethernet

MAC attached : 1C:34:DA:65:82:75
raw ethernet header**************************************


| Dest MAC | Src MAC | Packet Type |
|------------------------------------------------------------|
| 1C:34:DA:65:82:BD| 1C:34:DA:65:82:75|DEFAULT |
|------------------------------------------------------------|


bytes iterations t_min[usec] t_max[usec] t_typical[usec] t_avg[usec] t_stdev[usec] 99% percentile[usec] 99.9% percentile[usec]
64 1000 2.13 2.84 2.18 2.18 0.00 2.23 2.84

root@mtbc-r740-07:~# ibdev2netdev -v
0000:86:00.1 mlx5_2 (MT4121 - MCX556A-EDAT) CX556A - ConnectX-5 QSFP28 fw 16.35.2000 port 1 (ACTIVE) ==> enp134s0f1np1 (Up)

root@mtbc-r740-07:~# ip addr show enp134s0f1np1
6: enp134s0f1np1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether 1c:34:da:65:82:75 brd ff:ff:ff:ff:ff:ff
inet 192.168.1.101/24 brd 192.168.1.255 scope global enp134s0f1np1
valid_lft forever preferred_lft forever

You can check whether ib_write_bw works, and whether an iperf test works.
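
For example, a minimal sketch using the device name and server IP from this thread (adjust to your setup; iperf3 is assumed to be installed):

Server side:
$ ib_write_bw -d mlx5_0          # wait for the client to connect
$ iperf3 -s                      # TCP throughput check over the same link

Client side:
$ ib_write_bw -d mlx5_0 20.4.3.220
$ iperf3 -c 20.4.3.220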

Also make sure both sides have the same OFED driver version and the same perftest version.
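
To compare them, something like the following on both hosts should be enough (assuming an MLNX_OFED install; if perftest was built from source, the directory name such as perftest-23.04.0 already carries the version):

$ ofed_info -s                   # installed MLNX_OFED version
$ dpkg -l | grep perftest        # perftest package version (on RPM-based systems: rpm -qa | grep perftest)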