Issue with Wake-on-LAN on Xavier NX

Hello,

I am trying to enable Wake-on-LAN for Xavier NX, and followed the instructions in the post below.

However, there is an issue with Wake-on-LAN on the NX dev kit. In my testing, the NX is put into sleep mode with “sudo systemctl suspend” after Wake-on-LAN is enabled for eth0 with ethtool. The NX wakes up after the Linux host PC sends the magic packet to the NX’s ETH MAC address. The IP address for eth0 on the NX is still kept after the NX wakes up from deep sleep mode, and the routing table looks good. However, the NX can no longer ping the host PC, and the host PC cannot ping the NX either. If the NetworkManager service is enabled on the NX before system suspend, NetworkManager loses both the wired connection (eth0) and the WiFi connection after the system comes back from sleep mode. If we run “ifconfig eth0 down” in a terminal, the terminal hangs. After a while, the NX system crashes and reboots.

I did the same Wake-on-LAN testing on ConnectTech RUDI NX, and saw the same issue above.
The RUDI NX has an SD card slot. If the system on the RUDI NX is woken from sleep mode by inserting an SD card, everything looks fine and the ETH interface has no issue. So it seems the ETH interface issue after system resume is related to Wake-on-LAN.

Can you please check if this is a known issue? Or is there any mistake in the above steps?

Thanks!

Hi hecnl4o,

We cannot reproduce your issue on our devkit.

Could you share more detail about your board and release version?

Also, it would be better if you could share the steps.

Hi, WayneWWW,

The board we use in this WoLAN testing is the NX dev kit from NVIDIA. I ran the test again today after removing the NVMe M.2 card and USB mouse, using the default power mode (i.e., 10 W & 2 CPUs). So the HW and SW configurations are both the defaults from when I received this dev kit from NVIDIA. However, I still hit the same network interface issue and crash after waking the NX from deep sleep mode by magic packet.

Here is the SW info on this dev board:

uname -a

Linux brain-nx1 4.9.140-tegra #1 SMP PREEMPT Wed Apr 8 18:15:20 PDT 2020 aarch64 aarch64 aarch64 GNU/Linux

cat /etc/os-release

NAME="Ubuntu"
VERSION="18.04.4 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04.4 LTS"
VERSION_ID="18.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=bionic
UBUNTU_CODENAME=bionic

Here are the detailed steps on both host PC side and NX dev kit side (ETH IP address is 192.168.3.6):

On host PC side:

ping 192.168.3.6

arp -a 192.168.3.6

? (192.168.3.6) at 48:b0:2d:07:75:8a [ether] on eth0

On NX dev kit:

ethtool -s eth0 wol g

ethtool eth0

Settings for eth0:
Supported ports: [ TP MII ]
Supported link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Full
Supported pause frame use: Symmetric Receive-only
Supports auto-negotiation: Yes
Supported FEC modes: Not reported
Advertised link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Full
Advertised pause frame use: Symmetric Receive-only
Advertised auto-negotiation: Yes
Advertised FEC modes: Not reported
Link partner advertised link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Full
Link partner advertised pause frame use: Symmetric Receive-only
Link partner advertised auto-negotiation: Yes
Link partner advertised FEC modes: Not reported
Speed: 1000Mb/s
Duplex: Full
Port: MII
PHYAD: 1
Transceiver: internal
Auto-negotiation: on
Supports Wake-on: g
Wake-on: g
Link detected: yes

The output of ethtool above shows wake-on-LAN by magic packet is enabled for eth0.
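For anyone who wants to script this check, here is a minimal Python sketch that reads ethtool text and reports whether magic-packet wake ("g") is active. The function names are made up for illustration, and it assumes the output has the same "Supports Wake-on" / "Wake-on" fields shown above.

```python
import re


def wol_modes(ethtool_output: str) -> dict:
    """Extract the 'Supports Wake-on' and 'Wake-on' fields from ethtool text."""
    modes = {}
    for line in ethtool_output.splitlines():
        match = re.match(r"\s*(Supports Wake-on|Wake-on):\s*(\S+)", line)
        if match:
            modes[match.group(1)] = match.group(2)
    return modes


def magic_packet_wol_enabled(ethtool_output: str) -> bool:
    """True when the active Wake-on setting includes 'g' (magic packet)."""
    return "g" in wol_modes(ethtool_output).get("Wake-on", "")


# Sample text copied from the tail of the ethtool output above.
sample = """Supports Wake-on: g
Wake-on: g
Link detected: yes"""
print(magic_packet_wol_enabled(sample))
```

On a real system the text could come from something like subprocess.run(["ethtool", "eth0"], capture_output=True, text=True).stdout, run with enough privileges for ethtool to report the Wake-on fields.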

Then run the following command on NX dev kit:

sudo systemctl suspend

I use a power supply reader which provides power to NX dev kit. After I ran the above command, the power consumption of NX dev kit dropped from 4W to 0.82W, which confirms the NX has entered sleep mode.

Then I ran the following command on host PC side to wake up NX:
etherwake 48:b0:2d:07:75:8a
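For reference, the magic packet itself is simple: 6 bytes of 0xFF followed by the target MAC address repeated 16 times. Below is a minimal Python sketch that builds it and can send it as a UDP broadcast. Note the assumptions: as far as I know etherwake sends a raw Ethernet frame rather than UDP, so this is closer to what wakeonlan does, and the broadcast address 255.255.255.255 and port 9 are just common defaults, not something taken from my setup. The function names are only for illustration.

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Return 6 bytes of 0xFF followed by the MAC address repeated 16 times."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"expected a 6-byte MAC address, got {mac!r}")
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    return b"\xff" * 6 + mac_bytes * 16


def send_magic_packet(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Send the magic packet as a UDP broadcast (address and port are assumed defaults)."""
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))


# Build (but do not send) the packet for the NX MAC address used in this thread.
packet = build_magic_packet("48:b0:2d:07:75:8a")
print(len(packet))
```

Calling send_magic_packet("48:b0:2d:07:75:8a") from the host would then broadcast it on the local network.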

After a couple of seconds, the power consumption of the NX went up and the Ubuntu UI was shown on the display connected to the NX dev kit via HDMI. However, I could not ping the NX’s ETH interface from the host PC. After I logged in to the NX dev kit from its UI, the network manager showed that the network connection was lost (please refer to the picture uploaded here).
Then I opened a terminal window and ran “ifconfig eth0” there. The terminal hung and the whole system rebooted by itself after 30-60 seconds.

I also tried another tool called wakeonlan today. wakeonlan can also wake up NX by magic packet, but NX ran into the same issue here. It seems the issue happened during the resume of NX system.
Please let me know if you need any more information.

Thanks for your help here!

Hi,

I notice we are using different host tools to wake up device here.

Could you try to use this one to wake device?

$ wakeonlan -i DUT-IP-Address DUT-HW-Address

I tried wakeonlan yesterday with the same format you mentioned above. wakeonlan can also wake up NX, but NX ran into the same network interface failure and crash.

Hi,

We don’t hit such error on our side. Could you share your kernel log with us?

Also, since the system will hang if you use “ifconfig eth0”, could you also connect the UART for serial console log?

wake up by rtcwake.txt (15.5 KB) wake up by wlanonlan.txt (13.6 KB)

Hi, WayneWWW,

I tried to suspend and resume the NX system with rtcwake today. The ETH interface works fine after resume by rtcwake. “wake up by rtcwake.txt” attached here is the kernel log of the suspend/resume cycle for the RTC wake test.

“wake up by wlanonlan.txt” attached here is the kernel log for the ETH failure case, where we use wakeonlan on the host side to wake up the NX by magic packet. We can see the following error messages for the ETH interface during system resume. Do you also see those error messages in your testing with WoLAN?

[ 693.631894] eqos 2490000.ether_qos: WoL Failed to reset MAC
[ 693.632191] dpm_run_callback(): eqos_resume_noirq+0x0/0x1d0 returns -19
[ 693.632362] PM: Device 2490000.ether_qos failed to resume noirq: error -19
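To make it easier to compare logs between runs, here is a rough Python sketch that pulls the "failed to resume" lines out of a kernel log and translates the error number into its errno name (on Linux, error -19 is ENODEV, "no such device"). The regular expression and function name are my own and only assume the message format shown in the three lines above.

```python
import errno
import re

# Matches lines such as:
#   PM: Device 2490000.ether_qos failed to resume noirq: error -19
RESUME_FAILURE = re.compile(
    r"PM: Device (?P<device>\S+) failed to resume(?: (?P<phase>\w+))?: "
    r"error (?P<code>-?\d+)"
)


def find_resume_failures(log_text: str) -> list:
    """Return one dict per device that reported a resume failure."""
    failures = []
    for line in log_text.splitlines():
        match = RESUME_FAILURE.search(line)
        if match:
            code = abs(int(match.group("code")))
            failures.append({
                "device": match.group("device"),
                "phase": match.group("phase"),  # e.g. "noirq", or None if absent
                "errno": errno.errorcode.get(code, "unknown"),
            })
    return failures


# The three log lines quoted above.
sample = """[ 693.631894] eqos 2490000.ether_qos: WoL Failed to reset MAC
[ 693.632191] dpm_run_callback(): eqos_resume_noirq+0x0/0x1d0 returns -19
[ 693.632362] PM: Device 2490000.ether_qos failed to resume noirq: error -19"""
print(find_resume_failures(sample))
```

The same function could be pointed at the full text of the attached log files or at saved dmesg output.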

Thanks!

Hi,

Yes, I think that is an issue, but it seems to happen only on your board. I am not sure whether any other users have hit it.

I would like to know…

  1. Are you able to upgrade to rel-32.4.3? The uname -a seems a rel-32.4.2 release. It would be better to check this issue on latest release.

  2. Is it a pure devkit? Are there any other peripherals on it?

  3. Do you have other NX modules that can do the test? Maybe that one you are using has some defect.

  4. Have you tried a different Ethernet environment? For example, a different cable, switch, hub, etc. Is the IP assigned by DHCP?

Hi, WayneWWW,

Thank you for your help and debugging tips here!
My devkit has an SSD card installed. I put an ETH switch between the dev kit and the RUDI NX in my testing today. The ETH interface on the NX is still not working after resuming from sleep mode by magic packet, but the error “eqos 2490000.ether_qos: WoL Failed to reset MAC” is not seen anymore. A static IP is used by eth0 in all the testing so far.
The SW build on my dev kit is nv-jetson-nx-sd-card-image-r32.4.2. I will try the latest R32.4.3 release and update the thread later.

Thanks!

I put an ETH switch between the dev kit and the RUDI NX in my testing today. The ETH interface on the NX is still not working after resuming from sleep mode by magic packet, but the error “eqos 2490000.ether_qos: WoL Failed to reset MAC” is not seen anymore.

What did you do to make this error disappear?

Hi, WayneWWW,

I did not change the SW build when I did the testing with the ETH switch in between. This ETH switch is needed for our final product. I will do the testing again with a direct ETH cable between the NX and the host PC, and check whether “eqos 2490000.ether_qos: WoL Failed to reset MAC” can be reproduced.

Thanks!

And please also reply to this if possible.

  1. Do you have other NX modules that can do the test? Maybe that one you are using has some defect.

I have one NX dev kit and one ConnectTech RUDI NX. I hit the same issue on both devices. In my latest testing, it seems the issue I met is due to rtcwake. I will redo the testing from scratch to confirm this.

Hi hecnl4o,

Have you run the test to clarify the cause?
Are there any results you can share?

Hi, Kayccc,

I ran the testing again with all four combinations shown in the tables below. The results are consistent with what I saw in the separate tests before. Since our real use case has this ETH switch between the NX and the host, the original issue we reported here is no longer a blocking issue for our use case. Thank you and everyone else for the help in debugging this issue!

Test setup #1: the NVIDIA NX dev kit is put into sleep mode, and the ConnectTech RUDI NX acts as the Linux host device.

Test setup #2: the ConnectTech RUDI NX is put into sleep mode, and the NVIDIA NX dev kit acts as the Linux host device.