OpenStack Neutron ML2 w/ SR-IOV (VLAN): Can't ping VM IP

Hello,

I’m trying OpenStack Icehouse ML2 w/ SR-IOV (VLAN) on CentOS 7 with ConnectX-3 cards (40GbE), following these guides:

(1) Mellanox-Neutron-Icehouse-Redhat-Ethernet - OpenStack https://wiki.openstack.org/wiki/Mellanox-Neutron-Icehouse-Redhat-Ethernet

(2) Nova-neutron-sriov - OpenStack https://wiki.openstack.org/wiki/Nova-neutron-sriov

(3) Mellanox OFED Driver Installation and Configuration for SR-IOV https://community.mellanox.com/s/article/mellanox-ofed-driver-installation-and-configuration-for-sr-iov

After setting up SR-IOV on the ConnectX-3 following (3), I verified that VLAN traffic works between the systems.

I installed OpenStack Icehouse using packstack on a single node (all-in-one).

I then modified all the configuration files for SR-IOV following (1), except /etc/neutron/dhcp_agent.ini.

In /etc/neutron/dhcp_agent.ini, I changed interface_driver from BridgeInterfaceDriver to OVSInterfaceDriver.
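For reference, the resulting setting looks like this (only the relevant line is shown; the rest of dhcp_agent.ini is unchanged):

```ini
# /etc/neutron/dhcp_agent.ini
[DEFAULT]
interface_driver = neutron.agent.linux.interface.OVSInterfaceDriver
```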

Even though the VM gets its IP from DHCP, the host and the VM can’t ping each other.

[root@gpu6 ~(keystone_admin)]# cat /etc/modprobe.d/mlx4_core.conf

options mlx4_core port_type_array=2,2 num_vfs=16 probe_vf=0 enable_64b_cqe_eqe=0 log_num_mgm_entry_size=-1
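To double-check that these module options took effect after reloading mlx4_core, the VF count can be read back from sysfs (a sketch; ens4 is the PF interface name in this setup, and the sysfs entry assumes an SR-IOV-capable kernel/device):

```shell
# Read back the number of VFs created for the PF (assumed PF name: ens4).
# The sysfs entry only exists on SR-IOV-capable devices, so guard the read.
pf=ens4
path="/sys/class/net/${pf}/device/sriov_numvfs"
if [ -r "$path" ]; then
    cat "$path"    # should match num_vfs=16 from mlx4_core.conf
else
    echo "no SR-IOV sysfs entry for $pf"
fi
```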

[root@gpu6 ~(keystone_admin)]# neutron net-list
+--------------------------------------+--------+-----------------------------------------------------+
| id                                   | name   | subnets                                             |
+--------------------------------------+--------+-----------------------------------------------------+
| 1c555886-f026-4727-a2e6-99913e383bf2 | net40g | afdeec0e-6b9f-421a-9a5b-421a77c283d8 192.168.2.0/24 |
+--------------------------------------+--------+-----------------------------------------------------+

[root@gpu6 ~(keystone_admin)]# neutron subnet-list
+--------------------------------------+-------------+----------------+--------------------------------------------------+
| id                                   | name        | cidr           | allocation_pools                                 |
+--------------------------------------+-------------+----------------+--------------------------------------------------+
| afdeec0e-6b9f-421a-9a5b-421a77c283d8 | demo-subnet | 192.168.2.0/24 | {"start": "192.168.2.2", "end": "192.168.2.254"} |
+--------------------------------------+-------------+----------------+--------------------------------------------------+

[root@gpu6 ~(keystone_admin)]# neutron port-list
+--------------------------------------+------------+-------------------+------------------------------------------------------------------------------------+
| id                                   | name       | mac_address       | fixed_ips                                                                          |
+--------------------------------------+------------+-------------------+------------------------------------------------------------------------------------+
| 385600c0-fafa-4e15-b0b4-83f780e26daf |            | fa:16:3e:ce:2b:5f | {"subnet_id": "afdeec0e-6b9f-421a-9a5b-421a77c283d8", "ip_address": "192.168.2.2"} |
| 9a291386-c020-4cfd-9e11-bc98fa418566 |            | fa:16:3e:90:d8:bc | {"subnet_id": "afdeec0e-6b9f-421a-9a5b-421a77c283d8", "ip_address": "192.168.2.1"} |
| e0f81bbb-2da3-4ba0-9bba-3f90a79fd9a7 | sriov_port | fa:16:3e:8b:83:76 | {"subnet_id": "afdeec0e-6b9f-421a-9a5b-421a77c283d8", "ip_address": "192.168.2.7"} |
+--------------------------------------+------------+-------------------+------------------------------------------------------------------------------------+

[root@gpu6 ~(keystone_admin)]# ip netns

qdhcp-1c555886-f026-4727-a2e6-99913e383bf2

qrouter-4d297bce-3888-4036-9b63-e61028f9ff8f

[root@gpu6 ~(keystone_admin)]# ip netns exec qdhcp-1c555886-f026-4727-a2e6-99913e383bf2 ping -c1 192.168.2.2
PING 192.168.2.2 (192.168.2.2) 56(84) bytes of data.
64 bytes from 192.168.2.2: icmp_seq=1 ttl=64 time=0.027 ms

--- 192.168.2.2 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.027/0.027/0.027/0.000 ms

[root@gpu6 ~(keystone_admin)]# ip netns exec qdhcp-1c555886-f026-4727-a2e6-99913e383bf2 ping -c1 192.168.2.1
PING 192.168.2.1 (192.168.2.1) 56(84) bytes of data.
64 bytes from 192.168.2.1: icmp_seq=1 ttl=64 time=0.495 ms

--- 192.168.2.1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.495/0.495/0.495/0.000 ms

[root@gpu6 ~(keystone_admin)]# ip netns exec qdhcp-1c555886-f026-4727-a2e6-99913e383bf2 ping -c1 192.168.2.7
PING 192.168.2.7 (192.168.2.7) 56(84) bytes of data.
^C
--- 192.168.2.7 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms

The VM likewise can’t ping 192.168.2.1 or 192.168.2.2; the only address it can reach is its own IP, 192.168.2.7.

The VM’s lspci output is as follows:

00:04.0 Network controller [0280]: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function] [15b3:1004]

00:05.0 Network controller [0280]: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function] [15b3:1004]
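With SR-IOV on a VLAN network, the VF only passes traffic if it was tagged with the network’s segmentation ID on the host (which is what the Mellanox agent/eswitchd is supposed to arrange). One thing worth checking is the per-VF VLAN assignment on the PF (a sketch; ens4 is the PF name in this setup, and the expected VLAN ID depends on the segment Neutron allocated):

```shell
# List the VFs on the PF together with their MAC and VLAN tag.
# A VF serving the SR-IOV port should show the Neutron segmentation ID;
# "vlan 0" (or no vlan) would explain why tagged traffic never reaches it.
ip link show ens4 2>/dev/null | grep -i 'vf ' || echo "no VF lines for ens4"
```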

[root@gpu6 ~(keystone_admin)]# ovs-vsctl show
af9350bf-af96-4fac-adf5-0cd665e1215e
    Bridge br-int
        fail_mode: secure
        Port "qr-9a291386-c0"
            tag: 1
            Interface "qr-9a291386-c0"
                type: internal
        Port int-br-ex
            Interface int-br-ex
        Port "int-br-ens4"
            Interface "int-br-ens4"
        Port "tap385600c0-fa"
            tag: 1
            Interface "tap385600c0-fa"
                type: internal
        Port br-int
            Interface br-int
                type: internal
    Bridge "br-ens4"
        Port "br-ens4"
            Interface "br-ens4"
                type: internal
        Port "ens4"
            Interface "ens4"
        Port "phy-br-ens4"
            Interface "phy-br-ens4"
    ovs_version: "2.1.3"

[root@gpu6 ~(keystone_admin)]# ovs-ofctl show br-int
OFPT_FEATURES_REPLY (xid=0x2): dpid:0000029b05424542
n_tables:254, n_buffers:256
capabilities: FLOW_STATS TABLE_STATS PORT_STATS QUEUE_STATS ARP_MATCH_IP
actions: OUTPUT SET_VLAN_VID SET_VLAN_PCP STRIP_VLAN SET_DL_SRC SET_DL_DST SET_NW_SRC SET_NW_DST SET_NW_TOS SET_TP_SRC SET_TP_DST ENQUEUE
 25(tap385600c0-fa): addr:00:00:00:00:00:00
     config:     PORT_DOWN
     state:      LINK_DOWN
     speed: 0 Mbps now, 0 Mbps max
 26(qr-9a291386-c0): addr:00:00:00:00:00:00
     config:     PORT_DOWN
     state:      LINK_DOWN
     speed: 0 Mbps now, 0 Mbps max
 29(int-br-ex): addr:ee:06:9e:4b:9e:62
     config:     0
     state:      0
     current:    10GB-FD COPPER
     speed: 10000 Mbps now, 0 Mbps max
 30(int-br-ens4): addr:6e:aa:42:99:af:d2
     config:     0
     state:      0
     current:    10GB-FD COPPER
     speed: 10000 Mbps now, 0 Mbps max
 LOCAL(br-int): addr:02:9b:05:42:45:42
     config:     0
     state:      0
     speed: 0 Mbps now, 0 Mbps max
OFPT_GET_CONFIG_REPLY (xid=0x4): frags=normal miss_send_len=0

[root@gpu6 ~(keystone_admin)]# ovs-ofctl show br-ens4
OFPT_FEATURES_REPLY (xid=0x2): dpid:000024be05820470
n_tables:254, n_buffers:256
capabilities: FLOW_STATS TABLE_STATS PORT_STATS QUEUE_STATS ARP_MATCH_IP
actions: OUTPUT SET_VLAN_VID SET_VLAN_PCP STRIP_VLAN SET_DL_SRC SET_DL_DST SET_NW_SRC SET_NW_DST SET_NW_TOS SET_TP_SRC SET_TP_DST ENQUEUE
 1(ens4): addr:24:be:05:82:04:70
     config:     0
     state:      0
     current:    AUTO_NEG
     advertised: AUTO_NEG AUTO_PAUSE
     supported:  FIBER AUTO_NEG AUTO_PAUSE AUTO_PAUSE_ASYM
     speed: 0 Mbps now, 0 Mbps max
 11(phy-br-ens4): addr:b6:07:55:f6:42:7c
     config:     0
     state:      0
     current:    10GB-FD COPPER
     speed: 10000 Mbps now, 0 Mbps max
 LOCAL(br-ens4): addr:24:be:05:82:04:70
     config:     0
     state:      0
     speed: 0 Mbps now, 0 Mbps max
OFPT_GET_CONFIG_REPLY (xid=0x4): frags=normal miss_send_len=0

Could you please let me know what I should check for this problem?

Update: I’ve changed the single-node setup to a multi-node setup with one controller+network node and one compute node. After stopping neutron-openvswitch-agent on the compute node, it works fine with SR-IOV+VLAN.

I’m now comparing (1) a default OpenStack testbed using LibvirtGenericVIFDriver with (2) the SR-IOV+VLAN testbed using MlxEthVIFDriver. (1) works fine, but (2) has no qv* devices (checked with ip link and ovs-ofctl show br-int) and no iptables rules related to the neutron-openvswi-* chains or the VM’s IP/port.
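On the qv* point: Neutron builds those device names from a fixed prefix plus the first 11 characters of the port UUID, which makes it easy to map ports from neutron port-list to devices seen in ip link / ovs-ofctl show br-int (e.g. tap385600c0-fa above comes from port 385600c0-fafa-…). A small sketch using the sriov_port ID from the listing above; the qvb/qvo pair is what the hybrid OVS plug would normally create:

```shell
# Derive the interface names Neutron would use for a given port UUID.
# Prefixes: tap (DHCP side), qvb/qvo (veth pair of the OVS hybrid plug).
port_id="e0f81bbb-2da3-4ba0-9bba-3f90a79fd9a7"   # the sriov_port from port-list
for prefix in tap qvb qvo; do
    printf '%s%.11s\n' "$prefix" "$port_id"      # prefix + first 11 UUID chars
done
```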