Hello Nvidia community,
I’m trying to configure hardware offloading with a ConnectX-6 Dx in an OpenStack environment, with the NIC in switchdev mode. The configuration follows the OpenStack configuration guide and the NVIDIA configuration guide.
I’m using the openvswitch and kernel packages from the NVIDIA DOCA repository:
doca-openvswitch-common 3.0.0-0056-25.04-based-3.3.5
uname -a
Linux kaas-node-52648a8f-cbfd-4b9a-b392-1594b1d97cfa 6.8.0-1029-nvidia #32~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Fri May 30 17:22:17 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
devlink dev eswitch show pci/0000:21:00.0
pci/0000:21:00.0: mode switchdev inline-mode none encap-mode basic
Querying Mellanox devices firmware ...
Device #1:
----------
Device Type: ConnectX6DX
Part Number: MCX623432AC-ADA_Ax
Description: ConnectX-6 Dx EN adapter card; 25GbE OCP3.0; With Host management; Dual-port SFP28; PCIe 4.0 x16; Crypto and Secure Boot;
PSID: MT_0000000440
PCI Device Name: 21:00.0
Base GUID: 08c0eb03008bf98a
Base MAC: 08c0eb8bf98a
Versions:  Current      Available
FW         22.43.1014   22.43.1014
PXE        3.7.0500     3.7.0500
UEFI       14.36.0016   14.36.0016
Status: Up to date
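For reference, the basic offload prerequisites on the host can be double-checked with a small helper. This is only a sketch: the function name `tc_offload_on` is made up for illustration, and the device names in the usage comments are taken from the output in this post.

```shell
# Reads `ethtool -k <dev>` output on stdin and succeeds only if the
# hw-tc-offload feature is reported as "on".
# (tc_offload_on is a hypothetical helper name, not part of any tool.)
tc_offload_on() {
    grep -Eq '^[[:space:]]*hw-tc-offload:[[:space:]]+on'
}

# Live usage (needs root, shown as comments so nothing runs by accident):
#   ethtool -k enp33s0f0np0 | tc_offload_on && echo "PF: hw-tc-offload on"
#   ethtool -k eth14        | tc_offload_on && echo "representor: hw-tc-offload on"
#   ovs-vsctl get Open_vSwitch . other_config:hw-offload   # should print "true"
```

If `hw-tc-offload` is off on either the PF or the representor, or `other_config:hw-offload` is not set, OVS is not expected to push flows to the NIC regardless of the OpenFlow rules.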
OpenStack configures the following topology in OVS:
root@kaas-node-52648a8f-cbfd-4b9a-b392-1594b1d97cfa:~# ovs-vsctl show
12e00cdb-6520-4666-908f-57f59e8b9bd4
    Manager "ptcp:6640:127.0.0.1"
        is_connected: true
    Bridge br-ex
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        datapath_type: system
        Port br-ex
            Interface br-ex
                type: internal
        Port pr-floating
            Interface pr-floating
        Port phy-br-ex
            Interface phy-br-ex
                type: patch
                options: {peer=int-br-ex}
    Bridge br-int
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        datapath_type: system
        Port patch-tun
            Interface patch-tun
                type: patch
                options: {peer=patch-int}
        Port br-int
            Interface br-int
                type: internal
        Port int-br-tenant
            Interface int-br-tenant
                type: patch
                options: {peer=phy-br-tenant}
        Port tap58ac2b17-fb
            tag: 1
            Interface tap58ac2b17-fb
        Port int-br-ex
            Interface int-br-ex
                type: patch
                options: {peer=phy-br-ex}
        Port eth14
            tag: 1
            Interface eth14
    Bridge br-tenant
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        datapath_type: system
        Port br-tenant
            Interface br-tenant
                type: internal
        Port phy-br-tenant
            Interface phy-br-tenant
                type: patch
                options: {peer=int-br-tenant}
        Port bond0
            Interface bond0
        Port enp33s0f0np0
            Interface enp33s0f0np0
    Bridge br-tun
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        datapath_type: system
        Port br-tun
            Interface br-tun
                type: internal
        Port patch-int
            Interface patch-int
                type: patch
                options: {peer=patch-tun}
    ovs_version: "3.0.0-0056-25.04-based-3.3.5"
The representor port for the VM (eth14) is plugged into br-int (the integration bridge); the PF enp33s0f0np0 is plugged into the br-tenant bridge.
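Since offload between two ports generally requires both to belong to the same embedded switch, it may be worth confirming that eth14 really is a representor of the PF. A minimal sketch, assuming the kernel exposes `phys_switch_id` in sysfs for both devices; `same_eswitch` is a hypothetical helper name, and its optional third argument exists only so the function can be exercised against a fake directory.

```shell
# Succeeds if both netdevs report the same non-empty phys_switch_id.
# $1, $2: interface names; $3: sysfs root (defaults to /sys/class/net).
# (same_eswitch is an illustrative name, not an existing tool.)
same_eswitch() {
    root=${3:-/sys/class/net}
    a=$(cat "$root/$1/phys_switch_id" 2>/dev/null) || return 1
    b=$(cat "$root/$2/phys_switch_id" 2>/dev/null) || return 1
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Live usage with the names from this post:
#   same_eswitch enp33s0f0np0 eth14 && echo "same eswitch"
#   cat /sys/class/net/eth14/phys_port_name   # a VF representor usually looks like pf0vfN
```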
OpenStack configures the following Open vSwitch rules:
root@kaas-node-52648a8f-cbfd-4b9a-b392-1594b1d97cfa:~# ovs-ofctl -O OpenFlow14 dump-flows br-int
cookie=0x45b91976d5c4f175, duration=443.375s, table=0, n_packets=0, n_bytes=0, priority=65535,dl_vlan=4095 actions=drop
cookie=0x925fef9f789a4751, duration=443.121s, table=0, n_packets=1713, n_bytes=167326, reset_counts priority=200,reg3=0 actions=set_queue:0,load:0x1->NXM_NX_REG3[0],resubmit(,0)
cookie=0x45b91976d5c4f175, duration=441.099s, table=0, n_packets=3838, n_bytes=756232, priority=3,in_port="int-br-tenant",dl_vlan=1443 actions=set_field:4097->vlan_vid,goto_table:58
cookie=0x45b91976d5c4f175, duration=443.367s, table=0, n_packets=72, n_bytes=5040, priority=2,in_port="int-br-ex" actions=drop
cookie=0x45b91976d5c4f175, duration=443.361s, table=0, n_packets=22619, n_bytes=1361096, priority=2,in_port="int-br-tenant" actions=drop
cookie=0x45b91976d5c4f175, duration=443.379s, table=0, n_packets=4991, n_bytes=376729, priority=0 actions=goto_table:58
cookie=0x45b91976d5c4f175, duration=443.380s, table=23, n_packets=0, n_bytes=0, priority=0 actions=drop
cookie=0x45b91976d5c4f175, duration=443.376s, table=24, n_packets=0, n_bytes=0, priority=0 actions=drop
cookie=0x45b91976d5c4f175, duration=443.374s, table=30, n_packets=0, n_bytes=0, priority=0 actions=goto_table:58
cookie=0x45b91976d5c4f175, duration=443.373s, table=31, n_packets=0, n_bytes=0, priority=0 actions=goto_table:58
cookie=0x45b91976d5c4f175, duration=443.378s, table=58, n_packets=8829, n_bytes=1132961, priority=0 actions=goto_table:60
cookie=0x45b91976d5c4f175, duration=441.069s, table=60, n_packets=1839, n_bytes=103507, priority=100,in_port="tap58ac2b17-fb" actions=set_field:0xc->reg5,set_field:0x1->reg6,resubmit(,73)
cookie=0x45b91976d5c4f175, duration=233.443s, table=60, n_packets=314, n_bytes=35574, priority=100,in_port=eth14 actions=set_field:0x11->reg5,set_field:0x1->reg6,resubmit(,73)
cookie=0x45b91976d5c4f175, duration=443.376s, table=60, n_packets=5492, n_bytes=877458, priority=1 actions=NORMAL
cookie=0x45b91976d5c4f175, duration=443.374s, table=62, n_packets=0, n_bytes=0, priority=3 actions=NORMAL
cookie=0x45b91976d5c4f175, duration=443.088s, table=71, n_packets=0, n_bytes=0, reset_counts priority=110,ct_state=+trk actions=ct_clear,resubmit(,71)
cookie=0x45b91976d5c4f175, duration=443.107s, table=71, n_packets=0, n_bytes=0, reset_counts priority=0 actions=drop
cookie=0x45b91976d5c4f175, duration=443.104s, table=72, n_packets=0, n_bytes=0, reset_counts priority=0 actions=drop
cookie=0x45b91976d5c4f175, duration=441.069s, table=73, n_packets=1853, n_bytes=104373, priority=80,reg5=0xc actions=resubmit(,94)
cookie=0x45b91976d5c4f175, duration=233.443s, table=73, n_packets=314, n_bytes=35574, priority=80,reg5=0x11 actions=resubmit(,94)
cookie=0x45b91976d5c4f175, duration=443.099s, table=73, n_packets=0, n_bytes=0, reset_counts priority=0 actions=drop
cookie=0x45b91976d5c4f175, duration=443.094s, table=81, n_packets=0, n_bytes=0, reset_counts priority=0 actions=drop
cookie=0x45b91976d5c4f175, duration=443.090s, table=82, n_packets=0, n_bytes=0, reset_counts priority=0 actions=drop
cookie=0x45b91976d5c4f175, duration=443.082s, table=91, n_packets=0, n_bytes=0, reset_counts priority=1 actions=resubmit(,94)
cookie=0x45b91976d5c4f175, duration=443.079s, table=92, n_packets=0, n_bytes=0, reset_counts priority=0 actions=drop
cookie=0x45b91976d5c4f175, duration=443.074s, table=93, n_packets=0, n_bytes=0, reset_counts priority=0 actions=drop
cookie=0x45b91976d5c4f175, duration=443.085s, table=94, n_packets=372, n_bytes=38010, reset_counts priority=1 actions=NORMAL
root@kaas-node-52648a8f-cbfd-4b9a-b392-1594b1d97cfa:~# ovs-ofctl -O OpenFlow14 dump-flows br-tenant
cookie=0x61ac610f761b0239, duration=457.450s, table=0, n_packets=1793, n_bytes=117429, priority=4,in_port="phy-br-tenant",dl_vlan=1 actions=set_field:5539->vlan_vid,NORMAL
cookie=0x61ac610f761b0239, duration=459.708s, table=0, n_packets=583, n_bytes=30546, priority=2,in_port="phy-br-tenant" actions=drop
cookie=0x61ac610f761b0239, duration=459.711s, table=0, n_packets=45623, n_bytes=3468253, priority=0 actions=NORMAL
Connectivity to and from the VM works, but hardware offloading is not happening: I see all packets when running tcpdump on the eth14 interface, and OVS reports 0 hardware-offloaded flows.
root@kaas-node-52648a8f-cbfd-4b9a-b392-1594b1d97cfa:~# ovs-appctl dpif/show
system@ovs-system: hit:120798 missed:28004
br-ex:
br-ex 65534/1: (internal)
phy-br-ex 2/none: (patch: peer=int-br-ex)
pr-floating 1/2: (system)
offloaded flows: 0
offloaded packets: 1.79% (170/9484)
offloaded bytes: 2.22% (14718/663444)
br-int:
br-int 65534/5: (internal)
eth14 17/7: (system)
int-br-ex 1/none: (patch: peer=phy-br-ex)
int-br-tenant 2/none: (patch: peer=phy-br-tenant)
patch-tun 3/none: (patch: peer=patch-int)
tap58ac2b17-fb 12/3: (system)
offloaded flows: 0
offloaded packets: 0.52% (1126/218315)
offloaded bytes: 0.56% (115914/20788049)
br-tenant:
bond0 11/9: (system)
br-tenant 65534/4: (internal)
enp33s0f0np0 6/8: (system)
phy-br-tenant 2/none: (patch: peer=int-br-tenant)
offloaded flows: 0
offloaded packets: 13.03% (16539/126884)
offloaded bytes: 16.10% (1645967/10221371)
br-tun:
br-tun 65534/6: (internal)
patch-int 1/none: (patch: peer=patch-tun)
offloaded flows: 0
offloaded packets: 2.33% (632/27128)
offloaded bytes: 2.43% (42976/1767470)
root@kaas-node-52648a8f-cbfd-4b9a-b392-1594b1d97cfa:~# ovs-appctl dpctl/dump-flows type=offloaded br-int
recirc_id(0),skb_priority(0),in_port(3),eth(src=fa:16:3e:f2:8c:16,dst=fa:16:3e:47:70:d3),eth_type(0x0806), packets:0, bytes:0, used:never, actions:push_vlan(vid=1443,pcp=0),set(skb_priority(0x10001)),8
recirc_id(0),skb_priority(0),in_port(3),eth(src=fa:16:3e:f2:8c:16,dst=fa:16:3e:a5:80:1e),eth_type(0x0806), packets:0, bytes:0, used:never, actions:push_vlan(vid=1443,pcp=0),set(skb_priority(0x10001)),8
recirc_id(0),skb_priority(0),in_port(7),eth(src=fa:16:3e:36:1b:90,dst=fa:16:3e:39:90:d8),eth_type(0x0806), packets:0, bytes:0, used:never, actions:push_vlan(vid=1443,pcp=0),set(skb_priority(0x10001)),8
recirc_id(0),skb_priority(0),in_port(7),eth(src=fa:16:3e:36:1b:90,dst=fa:16:3e:47:70:d3),eth_type(0x0806), packets:0, bytes:0, used:never, actions:push_vlan(vid=1443,pcp=0),set(skb_priority(0x10001)),8
recirc_id(0),skb_priority(0),in_port(7),eth(src=fa:16:3e:36:1b:90,dst=fa:16:3e:a5:80:1e),eth_type(0x0806), packets:0, bytes:0, used:never, actions:push_vlan(vid=1443,pcp=0),set(skb_priority(0x10001)),8
recirc_id(0),skb_priority(0),in_port(8),eth(src=fa:16:3e:a5:80:1e,dst=fa:16:3e:f2:8c:16),eth_type(0x8100),vlan(vid=1443,pcp=0),encap(eth_type(0x0806)), packets:0, bytes:0, used:never, actions:pop_vlan,set(skb_priority(0x10001)),3
recirc_id(0),in_port(8),eth(src=00:01:e8:8b:09:58,dst=01:80:c2:00:00:00),eth_type(0/0xffff), packets:10735, bytes:644100, used:0.173s, actions:drop
recirc_id(0),skb_priority(0),in_port(8),eth(src=fa:16:3e:47:70:d3,dst=fa:16:3e:f2:8c:16),eth_type(0x8100),vlan(vid=1443,pcp=0),encap(eth_type(0x0806)), packets:0, bytes:0, used:never, actions:pop_vlan,set(skb_priority(0x10001)),3
recirc_id(0),skb_priority(0),in_port(8),eth(src=0c:c4:7a:6c:83:58,dst=01:00:5e:00:00:12),eth_type(0x0800),ipv4(frag=no), packets:21729, bytes:1303740, used:0.111s, actions:4,9
recirc_id(0),skb_priority(0),in_port(8),eth(src=fa:16:3e:a5:80:1e,dst=fa:16:3e:36:1b:90),eth_type(0x8100),vlan(vid=1443,pcp=0),encap(eth_type(0x0806)), packets:0, bytes:0, used:never, actions:pop_vlan,set(skb_priority(0x10001)),7
recirc_id(0),skb_priority(0),in_port(8),eth(src=0c:c4:7a:6c:85:dc,dst=01:00:5e:00:00:12),eth_type(0x0800),ipv4(frag=no), packets:21802, bytes:1308120, used:0.864s, actions:4,9
recirc_id(0),skb_priority(0),in_port(8),eth(src=fa:16:3e:39:90:d8,dst=fa:16:3e:36:1b:90),eth_type(0x8100),vlan(vid=1443,pcp=0),encap(eth_type(0x0806)), packets:0, bytes:0, used:never, actions:pop_vlan,set(skb_priority(0x10001)),7
recirc_id(0),skb_priority(0),in_port(8),eth(src=fa:16:3e:47:70:d3,dst=fa:16:3e:36:1b:90),eth_type(0x8100),vlan(vid=1443,pcp=0),encap(eth_type(0x0806)), packets:0, bytes:0, used:never, actions:pop_vlan,set(skb_priority(0x10001)),7
root@kaas-node-52648a8f-cbfd-4b9a-b392-1594b1d97cfa:~# ovs-appctl dpctl/dump-flows type=offloaded br-tenant
recirc_id(0),skb_priority(0),in_port(3),eth(src=fa:16:3e:f2:8c:16,dst=fa:16:3e:47:70:d3),eth_type(0x0806), packets:0, bytes:0, used:never, actions:push_vlan(vid=1443,pcp=0),set(skb_priority(0x10001)),8
recirc_id(0),skb_priority(0),in_port(3),eth(src=fa:16:3e:f2:8c:16,dst=fa:16:3e:a5:80:1e),eth_type(0x0806), packets:0, bytes:0, used:never, actions:push_vlan(vid=1443,pcp=0),set(skb_priority(0x10001)),8
recirc_id(0),skb_priority(0),in_port(7),eth(src=fa:16:3e:36:1b:90,dst=fa:16:3e:47:70:d3),eth_type(0x0806), packets:0, bytes:0, used:never, actions:push_vlan(vid=1443,pcp=0),set(skb_priority(0x10001)),8
recirc_id(0),skb_priority(0),in_port(7),eth(src=fa:16:3e:36:1b:90,dst=fa:16:3e:a5:80:1e),eth_type(0x0806), packets:0, bytes:0, used:never, actions:push_vlan(vid=1443,pcp=0),set(skb_priority(0x10001)),8
recirc_id(0),skb_priority(0),in_port(8),eth(src=fa:16:3e:a5:80:1e,dst=fa:16:3e:f2:8c:16),eth_type(0x8100),vlan(vid=1443,pcp=0),encap(eth_type(0x0806)), packets:0, bytes:0, used:never, actions:pop_vlan,set(skb_priority(0x10001)),3
recirc_id(0),in_port(8),eth(src=00:01:e8:8b:09:58,dst=01:80:c2:00:00:00),eth_type(0/0xffff), packets:10741, bytes:644460, used:0.188s, actions:drop
recirc_id(0),skb_priority(0),in_port(8),eth(src=fa:16:3e:47:70:d3,dst=fa:16:3e:f2:8c:16),eth_type(0x8100),vlan(vid=1443,pcp=0),encap(eth_type(0x0806)), packets:0, bytes:0, used:never, actions:pop_vlan,set(skb_priority(0x10001)),3
recirc_id(0),skb_priority(0),in_port(8),eth(src=0c:c4:7a:6c:83:58,dst=01:00:5e:00:00:12),eth_type(0x0800),ipv4(frag=no), packets:21741, bytes:1304460, used:0.247s, actions:4,9
recirc_id(0),skb_priority(0),in_port(8),eth(src=fa:16:3e:a5:80:1e,dst=fa:16:3e:36:1b:90),eth_type(0x8100),vlan(vid=1443,pcp=0),encap(eth_type(0x0806)), packets:0, bytes:0, used:never, actions:pop_vlan,set(skb_priority(0x10001)),7
recirc_id(0),skb_priority(0),in_port(8),eth(src=0c:c4:7a:6c:85:dc,dst=01:00:5e:00:00:12),eth_type(0x0800),ipv4(frag=no), packets:21815, bytes:1308900, used:0.120s, actions:4,9
recirc_id(0),skb_priority(0),in_port(8),eth(src=fa:16:3e:47:70:d3,dst=fa:16:3e:36:1b:90),eth_type(0x8100),vlan(vid=1443,pcp=0),encap(eth_type(0x0806)), packets:0, bytes:0, used:never, actions:pop_vlan,set(skb_priority(0x10001)),7
recirc_id(0),in_port(8),eth(src=00:01:e8:8b:09:58,dst=01:80:c2:00:00:02),eth_type(0x8809), packets:0, bytes:0, used:never, actions:drop
recirc_id(0),in_port(8),eth(src=00:01:e8:8b:09:58,dst=01:80:c2:00:00:0e),eth_type(0x88cc), packets:0, bytes:0, used:never, actions:drop
root@kaas-node-52648a8f-cbfd-4b9a-b392-1594b1d97cfa:~# ovs-appctl dpctl/show
system@ovs-system:
lookups: hit:120652 missed:27984 lost:6
flows: 13
masks: hit:208088 total:5 hit/pkt:1.40
cache: hit:116396 hit-rate:78.31%
caches:
masks-cache: size:256
port 0: ovs-system (internal)
port 1: br-ex (internal)
port 2: pr-floating
port 3: tap58ac2b17-fb
port 4: br-tenant (internal)
port 5: br-int (internal)
port 6: br-tun (internal)
port 7: eth14
port 8: enp33s0f0np0
port 9: bond0
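With the kernel datapath (`datapath_type: system`), OVS installs offloaded flows as TC flower filters, and `tc` marks each filter as `in_hw` or `not_in_hw`. The sketch below summarizes those flags from `tc -s filter show` output; `tc_hw_summary` is a made-up helper name, and the device names in the comments are the ones from this post.

```shell
# Reads `tc -s filter show dev <dev> ingress` output on stdin and prints
# how many filters are flagged in_hw versus not_in_hw.
# (tc_hw_summary is a hypothetical helper name.)
tc_hw_summary() {
    awk '{
            for (i = 1; i <= NF; i++) {
                if ($i == "in_hw") hw++
                else if ($i == "not_in_hw") sw++
            }
         }
         END { printf "in_hw=%d not_in_hw=%d\n", hw, sw }'
}

# Live usage (needs root):
#   tc -s filter show dev eth14 ingress        | tc_hw_summary
#   tc -s filter show dev enp33s0f0np0 ingress | tc_hw_summary
```

Filters that appear only as `not_in_hw` would be consistent with the flows listed by `type=offloaded` living in software TC rather than on the NIC.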
It seems that not all OVS actions can be offloaded to hardware. My suspicion falls on this rule:
cookie=0x45b91976d5c4f175, duration=233.443s, table=60, n_packets=314, n_bytes=35574, priority=100,in_port=eth14 actions=set_field:0x11->reg5,set_field:0x1->reg6,resubmit(,73)
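One way to narrow this down is to look at the datapath actions rather than the OpenFlow ones. Registers such as reg5 and reg6 exist only inside the OpenFlow pipeline and never appear in datapath flows, whereas the datapath flows above do carry `set(skb_priority(0x10001))`, which most likely comes from the `set_queue:0` action in table 0 of br-int. Whether that action can be offloaded by this build is an assumption to verify, not something established here. The sketch below (the name `datapath_action_kinds` is made up) lists the distinct action kinds found in `dpctl/dump-flows` output.

```shell
# Reads `ovs-appctl dpctl/dump-flows` output on stdin and prints a count
# of each distinct datapath action kind, ignoring bare output port numbers.
# (datapath_action_kinds is a hypothetical helper name.)
datapath_action_kinds() {
    awk '
        function emit(t) {
            gsub(/^ +| +$/, "", t)           # trim surrounding spaces
            gsub(/\([^()]*\)/, "", t)        # drop the innermost argument list
            if (t != "" && t !~ /^[0-9]+$/)  # skip plain output ports
                kinds[t]++
        }
        /actions:/ {
            sub(/.*actions:/, "")
            depth = 0; tok = ""
            n = length($0)
            for (i = 1; i <= n; i++) {
                c = substr($0, i, 1)
                if (c == "(") depth++
                else if (c == ")") depth--
                # split only on commas outside parentheses
                if (c == "," && depth == 0) { emit(tok); tok = "" }
                else tok = tok c
            }
            emit(tok)
        }
        END { for (k in kinds) print kinds[k], k }
    ' | sort
}

# Live usage:
#   ovs-appctl dpctl/dump-flows | datapath_action_kinds
```

Comparing this summary before and after the change to the bridge layout shows which datapath actions disappear when offloading starts to work.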
When I move eth14 into br-tenant and assign the VLAN via the tag attribute, offloading starts working:
root@kaas-node-52648a8f-cbfd-4b9a-b392-1594b1d97cfa:~# ovs-vsctl del-port eth14
root@kaas-node-52648a8f-cbfd-4b9a-b392-1594b1d97cfa:~# ovs-vsctl add-port br-tenant eth14
root@kaas-node-52648a8f-cbfd-4b9a-b392-1594b1d97cfa:~# ovs-vsctl set port eth14 tag=1443
root@kaas-node-52648a8f-cbfd-4b9a-b392-1594b1d97cfa:~# ovs-appctl dpif/show
system@ovs-system: hit:121910 missed:28177
br-ex:
br-ex 65534/1: (internal)
phy-br-ex 2/none: (patch: peer=int-br-ex)
pr-floating 1/2: (system)
offloaded flows: 2
offloaded packets: 1.79% (170/9488)
offloaded bytes: 2.22% (14718/663756)
br-int:
br-int 65534/5: (internal)
int-br-ex 1/none: (patch: peer=phy-br-ex)
int-br-tenant 2/none: (patch: peer=phy-br-tenant)
patch-tun 3/none: (patch: peer=patch-int)
tap58ac2b17-fb 12/3: (system)
offloaded flows: 2
offloaded packets: 0.51% (1126/221633)
offloaded bytes: 0.55% (115914/21049041)
br-tenant:
bond0 11/9: (system)
br-tenant 65534/4: (internal)
enp33s0f0np0 6/8: (system)
eth14 13/7: (system)
phy-br-tenant 2/none: (patch: peer=int-br-tenant)
offloaded flows: 2
offloaded packets: 12.91% (16541/128168)
offloaded bytes: 15.95% (1646091/10317855)
br-tun:
br-tun 65534/6: (internal)
patch-int 1/none: (patch: peer=patch-tun)
offloaded flows: 2
offloaded packets: 2.33% (632/27135)
offloaded bytes: 2.43% (42976/1768088)
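To compare the two `dpif/show` snapshots without reading them line by line, the per-bridge counter can be extracted with a few lines of awk. This is a sketch that assumes the `offloaded flows:` lines printed by this particular OVS build; `offloaded_per_bridge` is a made-up helper name.

```shell
# Reads `ovs-appctl dpif/show` output on stdin and prints
# "<bridge> <offloaded flows>" for every bridge section.
# (offloaded_per_bridge is a hypothetical helper name.)
offloaded_per_bridge() {
    awk '
        NF == 1 && $1 ~ /:$/ {            # a line such as "br-int:"
            bridge = $1
            sub(/:$/, "", bridge)
        }
        $1 == "offloaded" && $2 == "flows:" { print bridge, $3 }
    '
}

# Live usage:
#   ovs-appctl dpif/show | offloaded_per_bridge
```

Run against the two snapshots in this post, it would report 0 for every bridge before the move and 2 for every bridge after it.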
Can you please help identify why hardware offloading does not work in the original topology? And where can I find the list of OVS actions/rules that are supported for hardware offload?