Cannot SSH to Jetson Nano after running overnight

Hi,

I am running a Jetson Nano overnight, and sometimes I find that I cannot SSH to the device. I rebooted, and the log shows the following. Any help is appreciated.

====================================================
[2:50 PM] Jordan Hurwitz

Oct 21 13:19:43 cupertino kernel: [256891.476208] pcieport 0000:00:01.0: AER: Corrected error received: id=0010
Oct 21 13:19:43 cupertino kernel: [256891.476218] pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=0008(Receiver ID)
Oct 21 13:19:43 cupertino kernel: [256891.486555] pcieport 0000:00:01.0: device [10de:0fae] error status/mask=00000001/00002000
Oct 21 13:19:43 cupertino kernel: [256891.495038] pcieport 0000:00:01.0: [ 0] Receiver Error (First)
Oct 21 13:45:09 cupertino kernel: [258417.162700] pcieport 0000:00:01.0: AER: Corrected error received: id=0010
Oct 21 13:45:09 cupertino kernel: [258417.162713] pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=0008(Receiver ID)
Oct 21 13:45:09 cupertino kernel: [258417.175880] pcieport 0000:00:01.0: device [10de:0fae] error status/mask=00000001/00002000
Oct 21 13:45:09 cupertino kernel: [258417.185352] pcieport 0000:00:01.0: [ 0] Receiver Error (First)
Oct 21 14:03:31 cupertino kernel: [259519.699818] pcieport 0000:00:01.0: AER: Corrected error received: id=0010
Oct 21 14:03:31 cupertino kernel: [259519.699829] pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=0008(Receiver ID)
Oct 21 14:03:31 cupertino kernel: [259519.711249] pcieport 0000:00:01.0: device [10de:0fae] error status/mask=00000001/00002000
Oct 21 14:03:31 cupertino kernel: [259519.720001] pcieport 0000:00:01.0: [ 0] Receiver Error (First)
Oct 21 14:20:46 cupertino kernel: [260554.237133] r8168: eth0: link up
Oct 21 14:34:14 cupertino kernel: [261362.156228] r8168: eth0: link up
Oct 21 14:39:44 cupertino kernel: [261691.942375] uvcvideo: Failed to query (GET_CUR) UVC control 1 on unit 3: -32 (exp. 1024).
Oct 21 14:39:44 cupertino kernel: [261692.000973] uvcvideo: Failed to query (GET_CUR) UVC control 1 on unit 3: -32 (exp. 1024).
Oct 21 14:39:45 cupertino kernel: [261693.060069] uvcvideo: Failed to query (GET_CUR) UVC control 1 on unit 3: -32 (exp. 1024).
Oct 21 14:39:45 cupertino kernel: [261693.118631] uvcvideo: Failed to query (GET_CUR) UVC control 1 on unit 3: -32 (exp. 1024).
Oct 21 14:39:45 cupertino kernel: [261693.177173] uvcvideo: Failed to query (GET_CUR) UVC control 1 on unit 3: -32 (exp. 1024).
Oct 21 14:40:08 cupertino kernel: [261716.249313] uvcvideo: Failed to query (GET_CUR) UVC control 1 on unit 3: -32 (exp. 1024).
Oct 21 14:40:08 cupertino kernel: [261716.307861] uvcvideo: Failed to query (GET_CUR) UVC control 1 on unit 3: -32 (exp. 1024).
Oct 21 14:40:08 cupertino kernel: [261716.366431] uvcvideo: Failed to query (GET_CUR) UVC control 1 on unit 3: -32 (exp. 1024).

====================================================
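
For anyone triaging a log like this, a quick tally of each message type can show which subsystem is noisiest. The following is only a rough sketch: the function name `summarize_log` and the idea of saving the log to a file first (for example from `journalctl -k`) are assumptions, not something from the original post.

```shell
#!/bin/sh
# Hypothetical helper: count the three message types seen in the log above.
# $1 is a saved kernel log file (assumed; e.g. output of `journalctl -k`).
summarize_log() {
    log_file="$1"
    # grep -c prints 0 and exits non-zero when nothing matches, hence `|| true`.
    aer=$(grep -c 'AER: Corrected error received' "$log_file" || true)
    uvc=$(grep -c 'uvcvideo: Failed to query' "$log_file" || true)
    link=$(grep -c 'eth0: link up' "$log_file" || true)
    echo "aer_corrected=$aer uvc_failures=$uvc eth0_link_up=$link"
}
```

Run against the excerpt above, `summarize_log saved.log` would report 3 corrected AER events, 8 uvcvideo query failures, and 2 eth0 link-up messages.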

Which devices were connected and in use when this error happened?

According to the log, there is one PCIe device (M.2?) and a USB camera.

A USB camera and a Wi-Fi module are connected.

SSHD is unresponsive until there is a successful login via a directly connected terminal. SSHD also becomes unresponsive again after a period of inactivity, and a successful login via a directly connected terminal re-awakens it in that case as well. This is true with no monitor connected, and also when a USB keyboard is connected after boot while SSHD is running but unresponsive. I’ve tried a few things to wake networking up internally, so that I can connect via SSH over a shared, unfiltered LAN without having to log in locally first. All attempts so far have failed:

  1. Altering the configuration located at /boot/extlinux/extlinux.conf to the following (the addition, pcie_aspm=off, is on the APPEND line):

TIMEOUT 30
DEFAULT primary

MENU TITLE L4T boot options

LABEL primary
MENU LABEL primary kernel
LINUX /boot/Image
INITRD /boot/initrd
APPEND ${cbootargs} quiet pcie_aspm=off

  2. Adding “@reboot ping -c 1 4.2.2.2” to the root crontab.
  3. Adding “@reboot service sshd restart” to the root crontab.
  4. Generating a failed login attempt on a directly connected terminal.
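
When testing a boot-argument change like the pcie_aspm=off one in the first item, it can help to confirm that the argument actually reached the running kernel. A minimal sketch, assuming the standard /proc/cmdline interface; the function name `has_kernel_arg` is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given argument appears on the kernel
# command line. $2 defaults to /proc/cmdline and exists only to make testing easy.
has_kernel_arg() {
    arg="$1"
    cmdline_file="${2:-/proc/cmdline}"
    # Put each argument on its own line, then look for an exact whole-line match.
    tr ' ' '\n' < "$cmdline_file" | grep -qxF -- "$arg"
}
```

After a reboot, `has_kernel_arg pcie_aspm=off && echo "ASPM override is active"` shows whether the extlinux.conf edit was picked up.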

I am actively working on this problem right now, and I’m open to trying other ideas if anyone has them.

This is the network adapter that is installed:

I’m still looking into whether things are not starting in an order that allows network access via SSH. Maybe it is the power state of the M.2 card? Additionally, these effects occur even with a custom Linux kernel built on the Nano itself using these methods:

https://github.com/JetsonHacksNano/buildKernelAndModules

Following this suggestion from many years ago corrects this behavior for me:

https://askubuntu.com/questions/16376/connect-to-network-before-user-login

“I found out how to do it :) Simply go into Network Manager > Edit Connections. Select your connection, click Edit and check Available to all users.”
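
On a headless Nano the same setting can probably be changed from the command line. As far as I know, the checkbox maps to NetworkManager's `connection.permissions` property, and an empty value means the connection is available to all users. A minimal sketch under that assumption; the function name and the connection name in the usage line are placeholders:

```shell
#!/bin/sh
# Hypothetical helper: clear the per-user restriction on a saved connection so
# NetworkManager can bring it up before anyone logs in.
# $1 is the connection name as shown by `nmcli connection show`.
make_available_to_all_users() {
    conn_name="$1"
    # An empty connection.permissions value means "all users".
    nmcli connection modify "$conn_name" connection.permissions ""
}
```

Usage would look like `sudo sh -c '. ./script.sh; make_available_to_all_users "MyWifi"'`, where "MyWifi" stands in for the real connection name.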

Edit:

Making a new directory inside /etc/update-motd.d/
and moving these files into it (for possible later use):

10-help-text
50-motd-news
60-unminimize
90-updates-available
91-release-upgrade
95-hwe-eol

really speeds up a login via SSH.
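
The move described above could be scripted roughly as follows. This is a sketch, not the exact commands that were run: the subdirectory name `disabled` and the function name are assumptions, and it relies on the MOTD scripts being run only from the top level of the directory, which I believe is the case.

```shell
#!/bin/sh
# Hypothetical helper: park the listed MOTD scripts in a subdirectory so they
# no longer run at login but can be restored later.
# $1 defaults to /etc/update-motd.d; "disabled" is an assumed directory name.
disable_motd_scripts() {
    motd_dir="${1:-/etc/update-motd.d}"
    disabled_dir="$motd_dir/disabled"
    mkdir -p "$disabled_dir"
    for name in 10-help-text 50-motd-news 60-unminimize \
                90-updates-available 91-release-upgrade 95-hwe-eol; do
        # Skip scripts that are not present on this image.
        if [ -f "$motd_dir/$name" ]; then
            mv "$motd_dir/$name" "$disabled_dir/"
        fi
    done
}
```

Moving a file back out of the subdirectory restores that part of the login banner.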

To make the wireless adapter hyper-available, I also did this:

touch /etc/pm/sleep.d/wireless

Also, to remove a lot of lag during login, I edited:

/etc/NetworkManager/conf.d/default-wifi-powersave-on.conf

and changed “wifi.powersave = 3” to “wifi.powersave = 2”
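
For context, NetworkManager's wifi.powersave values are 0 (use the default), 1 (leave the setting alone), 2 (disable power saving) and 3 (enable power saving), so the change above turns Wi-Fi power saving off. The edit could be scripted like this; the function name is made up for illustration, and the sketch assumes the file already contains a wifi.powersave line:

```shell
#!/bin/sh
# Hypothetical helper: rewrite the wifi.powersave value in a NetworkManager
# conf file. $1 is the new value (2 = disable power saving); $2 defaults to
# the file named above.
set_wifi_powersave() {
    value="$1"
    conf_file="${2:-/etc/NetworkManager/conf.d/default-wifi-powersave-on.conf}"
    # Write to a temporary file first so the result is swapped in by one mv.
    sed "s/^wifi\.powersave *= *[0-9][0-9]*/wifi.powersave = $value/" \
        "$conf_file" > "$conf_file.tmp" && mv "$conf_file.tmp" "$conf_file"
}
```

After `sudo` running `set_wifi_powersave 2`, NetworkManager most likely needs a restart (or the board a reboot) before the new value applies.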