Hello everyone,
I am having trouble configuring my ConnectX-5 VPI cards correctly.
The issue is the following:
I would like to use RoCE and OVS with hardware offload in parallel on the same card.
According to the instructions, I have to change the devlink eswitch mode to switchdev for this. After that, however, RoCE no longer works.
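To confirm which eswitch mode a port is really in after the change, the output of `devlink dev eswitch show` can be parsed. This is a minimal sketch: the helper name `eswitch_mode` is made up for illustration, and it assumes the output contains a `mode <value>` pair, which may differ between iproute2 versions.

```shell
# eswitch_mode: read `devlink dev eswitch show` output on stdin and
# print the value that follows the standalone word "mode".
# Assumption: output looks like "pci/<addr>: mode <value> ...".
eswitch_mode() {
    awk '{
        for (i = 1; i < NF; i++)
            if ($i == "mode") { print $(i + 1); exit }
    }'
}
```

It would be used as `devlink dev eswitch show pci/0000:01:00.0 | eswitch_mode`, once for each physical function.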
General:
# ethtool -i enp1s0f0np0 | head -5
driver: mlx5_core
version: 5.8-2.0.3
firmware-version: 16.35.2000 (MT_0000000013)
expansion-rom-version:
bus-info: 0000:01:00.0
# ethtool -i enp1s0f1np1 | head -5
driver: mlx5_core
version: 5.8-2.0.3
firmware-version: 16.35.2000 (MT_0000000013)
expansion-rom-version:
bus-info: 0000:01:00.1
Network config:
auto enp1s0f0np0
iface enp1s0f0np0 inet manual
auto enp1s0f0np0.2
iface enp1s0f0np0.2 inet static
address 10.15.15.1/24
up ip route add 10.15.15.3/32 dev enp1s0f0np0.2
down ip route del 10.15.15.3/32
auto enp1s0f1np1
iface enp1s0f1np1 inet manual
auto enp1s0f1np1.2
iface enp1s0f1np1.2 inet static
address 10.15.15.1/24
up ip route add 10.15.15.2/32 dev enp1s0f1np1.2
down ip route del 10.15.15.2/32
systemd service
This service is only enabled for the "After" step of the test below.
# cat /etc/systemd/system/mlxn_ofed.service
[Unit]
Description=Configure nvidia connectx network card
Before=network.target
After=openibd.service
[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=bash -c "mlnx_qos -i enp1s0f0np0 --pfc 0,0,0,0,1,0,0,0"
ExecStart=bash -c "echo 2 > /sys/class/net/enp1s0f0np0/device/sriov_numvfs"
ExecStart=bash -c "echo 0000:01:00.2 > /sys/bus/pci/drivers/mlx5_core/unbind"
ExecStart=bash -c "echo 0000:01:00.3 > /sys/bus/pci/drivers/mlx5_core/unbind"
ExecStart=bash -c "devlink dev eswitch set pci/0000:01:00.0 mode switchdev"
ExecStart=bash -c "echo 0000:01:00.2 > /sys/bus/pci/drivers/mlx5_core/bind"
ExecStart=bash -c "echo 0000:01:00.3 > /sys/bus/pci/drivers/mlx5_core/bind"
ExecStart=bash -c "mlnx_qos -i enp1s0f1np1 --pfc 0,0,0,0,1,0,0,0"
ExecStart=bash -c "echo 2 > /sys/class/net/enp1s0f1np1/device/sriov_numvfs"
ExecStart=bash -c "echo 0000:01:01.2 > /sys/bus/pci/drivers/mlx5_core/unbind"
ExecStart=bash -c "echo 0000:01:01.3 > /sys/bus/pci/drivers/mlx5_core/unbind"
ExecStart=bash -c "devlink dev eswitch set pci/0000:01:00.1 mode switchdev"
ExecStart=bash -c "echo 0000:01:01.2 > /sys/bus/pci/drivers/mlx5_core/bind"
ExecStart=bash -c "echo 0000:01:01.3 > /sys/bus/pci/drivers/mlx5_core/bind"
[Install]
WantedBy=multi-user.target
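The per-port sequence in the unit above (set PFC, create the VFs, unbind them, switch the eswitch to switchdev, rebind them) could also be kept in one small script, which is easier to run by hand while debugging. The following is only a sketch: the names `run`, `setup_pf` and `DRY_RUN` are made up for illustration, while the commands themselves are copied from the unit.

```shell
#!/bin/sh
# Sketch of the same per-port sequence as the unit file.
# DRY_RUN defaults to 1, so commands are only printed; set DRY_RUN=0 to execute.
DRY_RUN=${DRY_RUN:-1}

# run <command string>: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        sh -c "$*"
    fi
}

# setup_pf <pf-netdev> <pf-pci-address> <vf-pci-address>...
setup_pf() {
    pf_if=$1
    pf_pci=$2
    shift 2
    run "mlnx_qos -i $pf_if --pfc 0,0,0,0,1,0,0,0"
    # Number of VFs = number of VF PCI addresses passed (2 in the unit above).
    run "echo $# > /sys/class/net/$pf_if/device/sriov_numvfs"
    for vf in "$@"; do
        run "echo $vf > /sys/bus/pci/drivers/mlx5_core/unbind"
    done
    run "devlink dev eswitch set pci/$pf_pci mode switchdev"
    for vf in "$@"; do
        run "echo $vf > /sys/bus/pci/drivers/mlx5_core/bind"
    done
}

setup_pf enp1s0f0np0 0000:01:00.0 0000:01:00.2 0000:01:00.3
setup_pf enp1s0f1np1 0000:01:00.1 0000:01:01.2 0000:01:01.3
```

With `DRY_RUN` left at its default the script only prints the 14 commands. Running it as root with `DRY_RUN=0` would execute them in the same order as the unit.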
Test
Before:
# show_gids
DEV     PORT  INDEX  GID          IPv4          VER  DEV
---     ----  -----  ---          ------------  ---  ---
mlx5_0  1     0      < removed >                v1   enp1s0f0np0
mlx5_0  1     1      < removed >                v2   enp1s0f0np0
mlx5_0  1     2      < removed >  10.15.15.1    v1   enp1s0f0np0.2
mlx5_0  1     3      < removed >  10.15.15.1    v2   enp1s0f0np0.2
mlx5_0  1     4      < removed >                v1   enp1s0f0np0.2
mlx5_0  1     5      < removed >                v2   enp1s0f0np0.2
mlx5_1  1     0      < removed >                v1   enp1s0f1np1
mlx5_1  1     1      < removed >                v2   enp1s0f1np1
mlx5_1  1     2      < removed >  10.15.15.1    v1   enp1s0f1np1.2
mlx5_1  1     3      < removed >  10.15.15.1    v2   enp1s0f1np1.2
# lspci
01:00.0 Ethernet controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex]
01:00.1 Ethernet controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex]
# ip link
5: enp1s0f0np0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
link/ether < removed > brd ff:ff:ff:ff:ff:ff
6: enp1s0f1np1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
link/ether < removed > brd ff:ff:ff:ff:ff:ff
13: enp1s0f0np0.2@enp1s0f0np0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
link/ether < removed > brd ff:ff:ff:ff:ff:ff
14: enp1s0f1np1.2@enp1s0f1np1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state LOWERLAYERDOWN mode DEFAULT group default qlen 1000
link/ether < removed > brd ff:ff:ff:ff:ff:ff
After:
# show_gids
DEV     PORT  INDEX  GID          IPv4          VER  DEV
---     ----  -----  ---          ------------  ---  ---
mlx5_2  1     0      < removed >                v1   enp1s0f0v0
mlx5_2  1     1      < removed >                v2   enp1s0f0v0
mlx5_3  1     0      < removed >                v1   enp1s0f0v1
mlx5_3  1     1      < removed >                v2   enp1s0f0v1
mlx5_4  1     0      < removed >                v1   enp1s0f1v0
mlx5_4  1     1      < removed >                v2   enp1s0f1v0
mlx5_5  1     0      < removed >                v1   enp1s0f1v1
mlx5_5  1     1      < removed >                v2   enp1s0f1v1
# lspci
01:00.0 Ethernet controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex]
01:00.1 Ethernet controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex]
01:00.2 Ethernet controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex Virtual Function]
01:00.3 Ethernet controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex Virtual Function]
01:01.2 Ethernet controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex Virtual Function]
01:01.3 Ethernet controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex Virtual Function]
# ip link
5: enp1s0f0np0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
link/ether < removed > brd ff:ff:ff:ff:ff:ff
vf 0 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off
vf 1 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off
6: enp1s0f1np1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
link/ether < removed > brd ff:ff:ff:ff:ff:ff
vf 0 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off
vf 1 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off
15: enp1s0f0npf0vf0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether < removed > brd ff:ff:ff:ff:ff:ff
16: enp1s0f0npf0vf1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether < removed > brd ff:ff:ff:ff:ff:ff
17: enp1s0f0v0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether < removed > brd ff:ff:ff:ff:ff:ff permaddr < removed >
18: enp1s0f0v1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether < removed > brd ff:ff:ff:ff:ff:ff permaddr < removed >
21: enp1s0f0np0.2@enp1s0f0np0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
link/ether < removed > brd ff:ff:ff:ff:ff:ff
22: enp1s0f1np1.2@enp1s0f1np1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state LOWERLAYERDOWN mode DEFAULT group default qlen 1000
link/ether < removed > brd ff:ff:ff:ff:ff:ff
23: enp1s0f1npf1vf0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether < removed > brd ff:ff:ff:ff:ff:ff
24: enp1s0f1npf1vf1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether < removed > brd ff:ff:ff:ff:ff:ff
25: enp1s0f1v0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether < removed > brd ff:ff:ff:ff:ff:ff permaddr < removed >
26: enp1s0f1v1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether < removed > brd ff:ff:ff:ff:ff:ff permaddr < removed >
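The difference between the two show_gids listings is easier to see when each is reduced to its RDMA device / netdev pairs: before, mlx5_0 and mlx5_1 carry GIDs for the PF netdevs and their VLAN interfaces; after, only mlx5_2 to mlx5_5 on the VF netdevs are listed. A minimal sketch of such a summary follows. The helper name `gid_netdevs` is made up, and it assumes data rows start with `mlx5_` and end with the netdev name, as in the output above.

```shell
# gid_netdevs: read show_gids output on stdin and print the unique
# "<rdma-device> <netdev>" pairs (first and last column of each data row).
# Header and separator rows are skipped because they do not start with "mlx5_".
gid_netdevs() {
    awk '$1 ~ /^mlx5_/ { print $1, $NF }' | sort -u
}
```

It would be used as `show_gids | gid_netdevs`, once before and once after enabling the service.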
Host info
My host is a Proxmox server running kernel 5.19. Using the repository list at https://linux.mellanox.com/public/repo/mlnx_ofed/5.8-2.0.3.0/debian11.3/mellanox_mlnx_ofed.list I installed the package mlnx-ofed-all.
The end goal is a mesh network of 3 servers, which is why the network config above contains explicit routes.
Links
I am following this documentation:
- https://docs.nvidia.com/networking/display/MLNXOFEDv582030LTS/Features+Overview+and+Configuration
- Ethernet Network / RDMA over Converged Ethernet (RoCE)
- OVS Offload Using ASAP² Direct
If you need any further information, just let me know.
Many thanks in advance for your help.
Kind regards