Orin VLAN support appears to be broken

Hello,

I have an Orin AGX dev board. It appears that hardware VLAN support on Ethernet is broken. Specifically, egress works but ingress does not.

I set up a simple VLAN using netplan with ID 2:

network:
  version: 2
  renderer: networkd
  ethernets:
    eth0:
      dhcp4: yes
  vlans:
    eth0.2:
      id: 2
      link: eth0
      dhcp4: no
      addresses: [172.27.153.3/24]

This creates an interface with HW acceleration:

[    9.083204] nvethernet 6810000.ethernet: eth0 (HW ver: 31) created with 10 DMA channels
[   11.472843] 8021q: adding VLAN 0 to HW filter on device eth0

I observe tagged packets going out of the new eth0.2 interface, but I don’t see any packets coming back.

However, if you delete the VLAN and recreate it:

sudo ip link del eth0.2
sudo ip link add link eth0 name eth0.2 type vlan id 2

Suddenly things start working.

I notice this message in the kernel buffer:

[  659.856533] nvethernet 6810000.ethernet eth0: failed to kill vid 0081/2

This leads me to believe the new interface isn’t using hardware acceleration.

So, I think HW VLAN tagging support is broken on the RX side in the Orin drivers. Has anyone got it working?
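One way to check this hypothesis is to look at the VLAN offload flags the driver advertises. The helper below is only a sketch: it assumes ethtool is installed on the board and that the interface is named eth0 as in the netplan file above, and the function name vlan_offload_flags is made up for illustration. It reduces the output of ethtool -k to the three C-tag VLAN features.

```shell
#!/bin/bash
# Hypothetical helper: keep only the C-tag VLAN offload lines from
# `ethtool -k <iface>` output supplied on stdin.
vlan_offload_flags() {
    grep -E '^[[:space:]]*(rx-vlan-offload|tx-vlan-offload|rx-vlan-filter):'
}

# Usage on the board (interface name is an assumption):
#   ethtool -k eth0 | vlan_offload_flags
```

If rx-vlan-filter is reported as on and is not marked [fixed], turning it off with sudo ethtool -K eth0 rx-vlan-filter off may be a quick experiment to see whether ingress recovers. Whether nvethernet allows this flag to be toggled is not confirmed here.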

If it helps, this is with 35.1:

uname -arn
Linux orin 5.10.104-tegra #1 SMP PREEMPT Wed Aug 10 20:17:07 PDT 2022 aarch64 aarch64 aarch64 GNU/Linux

Sorry for the late response. I will forward this issue to our internal team for investigation. Thanks.

Hi @kayccc, do you know if there has been any update on this?

We believe we have the same problem on a Jetson AGX Orin with kernel version 5.10.120.
If the VLAN interface is created while the main nvethernet interface is down, we don’t see incoming tagged packets. The network works, however, if we create the VLAN interface after bringing the main interface up.

In short, to reproduce the problem:

  • Connect the Jetson board to an external host
  • On the external host, set up the network like this (you may need to change the interface name):
sudo ip link add dev eth0.2 link eth0 type vlan id 2
sudo ip link set dev eth0 up
sudo ip link set dev eth0.2 up
sudo ip addr add dev eth0.2 192.168.2.1/24
  • On the Jetson board, set up the network this way, right after boot:
sudo ip link add dev eth1.2 link eth1 type vlan id 2
sudo ip link set dev eth1 up
sudo ip link set dev eth1.2 up
sudo ip addr add dev eth1.2 192.168.2.10/24
  • Try pinging the external host from the Jetson board; it doesn’t work:
$ ping -c 4 192.168.2.1
PING 192.168.2.1 (192.168.2.1) 56(84) bytes of data.
From 192.168.2.10 icmp_seq=1 Destination Host Unreachable
From 192.168.2.10 icmp_seq=2 Destination Host Unreachable
From 192.168.2.10 icmp_seq=3 Destination Host Unreachable
From 192.168.2.10 icmp_seq=4 Destination Host Unreachable

--- 192.168.2.1 ping statistics ---
4 packets transmitted, 0 received, +4 errors, 100% packet loss, time 3064ms

If, during network setup on the Jetson board, we instead run sudo ip link set dev eth1 up first and then sudo ip link add dev eth1.2 link eth1 type vlan id 2, everything works as expected.

Hi,

You need to use this sequence instead:

sudo ip link set dev eth1 up
sudo ip link add dev eth1.2 link eth1 type vlan id 2
sudo ip link set dev eth1.2 up
sudo ip addr add dev eth1.2 192.168.2.10/24

I already mentioned in my post that this sequence works.

Both sequences are valid, and the previous one must work as well. If the network is set up manually from an interactive shell, the main interface can be brought up first as a workaround.
But in a production environment with hundreds of boards deployed, systemd-networkd or a similar automatic network configurator will be used (e.g. netplan, as in the original post), since manual setup is infeasible and custom shell scripts for network setup are error-prone. These configurators create the VLAN interface first, which leads to a de facto broken network and confusing behaviour.
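For reference, the manual workaround can be wrapped in a small function so that the ordering (parent up first, VLAN created afterwards) is always the same. This is only a sketch of the workaround described in this thread, not a fix: the function name setup_vlan and the IP_CMD override are made up for illustration, and the interface name and address are taken from the reproduction steps above.

```shell
#!/bin/bash
# Workaround sketch: bring the parent interface up BEFORE creating the VLAN.
# IP_CMD is a hypothetical override; set IP_CMD=echo for a dry run that only
# prints the arguments that would be passed to `ip`.
IP_CMD="${IP_CMD:-sudo ip}"

setup_vlan() {
    parent="$1"   # e.g. eth1
    vid="$2"      # e.g. 2
    addr="$3"     # e.g. 192.168.2.10/24
    vlan="${parent}.${vid}"

    $IP_CMD link set dev "$parent" up
    # Drop a VLAN interface that may have been created while the parent was down.
    $IP_CMD link del dev "$vlan" 2>/dev/null || true
    $IP_CMD link add dev "$vlan" link "$parent" type vlan id "$vid"
    $IP_CMD link set dev "$vlan" up
    $IP_CMD addr add dev "$vlan" "$addr"
}

# Usage (values from the reproduction steps):
#   setup_vlan eth1 2 192.168.2.10/24
```

Note that deleting and recreating the VLAN interface also discards any addresses or routes already configured on it, so this does not combine well with a configurator that manages the same interface.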

Here is a more concrete example of a failing systemd-networkd configuration:

  • On the external host:
sudo ip link add dev eth0.2 link eth0 type vlan id 2
sudo ip link set dev eth0 up
sudo ip link set dev eth0.2 up
sudo ip addr add dev eth0.2 192.168.2.1/24
  • On the Jetson board:
sudo tee /etc/systemd/network/eth1.network <<HERE
[Match]
Name=eth1
[Link]
RequiredForOnline=carrier
[Network]
IPv6AcceptRA=false
LinkLocalAddressing=no
VLAN=vlan2
HERE

sudo tee /etc/systemd/network/vlan2.network <<HERE
[Match]
Name=vlan2
[Link]
RequiredForOnline=routable
[Network]
DNS=8.8.8.8
Address=192.168.2.2/24
[Route]
Gateway=192.168.2.1
HERE

sudo tee /etc/systemd/network/vlan2.netdev <<HERE
[NetDev]
Kind=vlan
Name=vlan2
[VLAN]
Id=2
HERE

sudo systemctl daemon-reload
sudo systemctl restart systemd-networkd
  • The external host is not reachable:
$ ping -c 4 192.168.2.1
PING 192.168.2.1 (192.168.2.1) 56(84) bytes of data.
From 192.168.2.10 icmp_seq=1 Destination Host Unreachable
From 192.168.2.10 icmp_seq=2 Destination Host Unreachable
From 192.168.2.10 icmp_seq=3 Destination Host Unreachable
From 192.168.2.10 icmp_seq=4 Destination Host Unreachable

So, to restate the problem in simpler terms:

  1. Automatic network configuration tools like systemd-networkd or netplan first create the VLAN interface (sudo ip link add dev eth1.2 link eth1 type vlan id 2) and then bring the main interface up (sudo ip link set dev eth1 up), but this order is not supported by nvethernet.
    systemd-networkd is widely used across Linux distributions and works with other Linux network drivers.
  2. We can also create eth1.2 while eth1 is down: it appears in the system, and no errors or warnings show up anywhere. But when we then bring eth1 and eth1.2 up, we don’t see incoming packets on eth1.2; it’s in a broken state. Why would the driver allow us to create it in the first place?

Therefore we believe this behaviour to be a bug, as it’s non-standard (1) and confusing (2).
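Because the broken state produces no error or warning, it has to be detected indirectly. One possible check, sketched below, compares the RX packet counter of the VLAN interface before and after pinging a peer. The function names and the SYSFS_NET override are hypothetical; the counter path /sys/class/net/<iface>/statistics/rx_packets is the standard sysfs location. Unrelated broadcast traffic on the same VLAN can also increase the counter, so a positive result is only a hint.

```shell
#!/bin/bash
# SYSFS_NET is a hypothetical override, useful for testing without hardware.
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

# Print the current RX packet counter of an interface.
rx_packets() {
    cat "$SYSFS_NET/$1/statistics/rx_packets"
}

# Return 0 if the RX counter of <iface> grows while pinging <peer>,
# 1 if it does not, 2 if the counter cannot be read.
vlan_rx_alive() {
    iface="$1"
    peer="$2"
    before=$(rx_packets "$iface") || return 2
    ping -c 2 -W 1 "$peer" >/dev/null 2>&1
    after=$(rx_packets "$iface") || return 2
    [ "$after" -gt "$before" ]
}

# Usage (values from the reproduction steps):
#   vlan_rx_alive eth1.2 192.168.2.1 || echo "no ingress on eth1.2"
```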

We are looking into this issue.

Hi,

Please apply this patch; it should meet your expectations.

diff --git a/drivers/net/ethernet/nvidia/nvethernet/ether_linux.c b/drivers/net/ethernet/nvidia/nvethernet/ether_linux.c
index 1283529..2369d8a 100644
--- a/drivers/net/ethernet/nvidia/nvethernet/ether_linux.c
+++ b/drivers/net/ethernet/nvidia/nvethernet/ether_linux.c
@@ -2632,7 +2632,9 @@
 
 	/* As all registers reset as part of ether_close(), reset private
 	 * structure variable as well */
+#ifdef ETHER_VLAN_VID_SUPPORT
 	pdata->vlan_hash_filtering = OSI_PERFECT_FILTER_MODE;
+#endif /* ETHER_VLAN_VID_SUPPORT */
 	pdata->l2_filtering_mode = OSI_PERFECT_FILTER_MODE;
 
 	/* Initialize PTP */
@@ -4035,6 +4037,7 @@
 	return ret;
 }
 
+#ifdef ETHER_VLAN_VID_SUPPORT
 /**
  * @brief Adds VLAN ID. This function is invoked by upper
  * layer when a new VLAN id is registered. This function updates the HW
@@ -4122,6 +4125,7 @@
 
 	return ret;
 }
+#endif /* ETHER_VLAN_VID_SUPPORT */
 
 #if (KERNEL_VERSION(5, 10, 0) <= LINUX_VERSION_CODE)
 /**
@@ -4174,8 +4178,10 @@
 	.ndo_select_queue = ether_select_queue,
 	.ndo_set_features = ether_set_features,
 	.ndo_set_rx_mode = ether_set_rx_mode,
+#ifdef ETHER_VLAN_VID_SUPPORT
 	.ndo_vlan_rx_add_vid = ether_vlan_rx_add_vid,
 	.ndo_vlan_rx_kill_vid = ether_vlan_rx_kill_vid,
+#endif /* ETHER_VLAN_VID_SUPPORT */
 #if (KERNEL_VERSION(5, 10, 0) <= LINUX_VERSION_CODE)
 	.ndo_setup_tc = ether_setup_tc,
 #endif
@@ -6239,9 +6245,11 @@
 		features |= NETIF_F_HW_VLAN_CTAG_TX;
 	}
 
+#ifdef ETHER_VLAN_VID_SUPPORT
 	/* Rx VLAN tag stripping/filtering enabled by default */
 	features |= NETIF_F_HW_VLAN_CTAG_RX;
 	features |= NETIF_F_HW_VLAN_CTAG_FILTER;
+#endif /* ETHER_VLAN_VID_SUPPORT */
 
 	/* Receive Hashing offload */
 	if (pdata->hw_feat.rss_en) {
