Jetson TK1 networking fails overnight

I have a TK1 system attached to my network that uses satellite internet. That satellite internet changes from a limited connection to an unlimited connection between 00:00 and 05:00 every day, and the transition back to a limited connection causes the TK1 to lose connectivity on LAN and WAN completely. I’m looking for options to correct this problem after the fact.

  • Rebooting the board fixes the issue but it's not practical to run as a cron job. It causes a television to turn on in the middle of the night.
  • Running "# ifdown eth0" and checking ifconfig shows eth0 is still active with IP.
  • Unplugging the ethernet cord from the card and checking ifconfig shows eth0 is still active with IP.
  • Running "# /etc/init.d/networking restart" does nothing.
  • systemd's systemctl doesn't even exist in the stock Ubuntu distro.
  • Trying to restart networking via the stock UI accomplishes nothing.
  • I have other ARM boards attached to the network running Debian and Fedora without issue.

So what’s going on? And more importantly, what kind of cron job can I make Ubuntu run every day to force the network down and reload it completely when init.d, systemd, and networkmanager aren’t even there?

    Have you tried this command?
    sudo restart network-manager

    More documentation:
    sudo nmcli help

    What about just renewing your DHCP lease?
    sudo dhclient -v eth0
    
    Cycle your interface state:
    
    First list your connections:
    root@tegra-ubuntu:~# nmcli connection list
    NAME                      UUID                                   TYPE              TIMESTAMP-REAL                    
    Wired connection 1        c658da39-167d-466b-a4d3-131944a1b9ee   802-3-ethernet    Sun 02 Nov 2014 05:51:31 PM CET 
    
    root@tegra-ubuntu:~# nmcli dev  #check state
    DEVICE     TYPE              STATE        
    eth0       802-3-ethernet    connected    
    root@tegra-ubuntu:~# nmcli dev disconnect iface eth0    #disconnect
    root@tegra-ubuntu:~# nmcli dev
    DEVICE     TYPE              STATE        
    eth0       802-3-ethernet    disconnected 
    root@tegra-ubuntu:~# nmcli con up id "Wired connection 1"   #re-connect
    root@tegra-ubuntu:~# nmcli dev
    DEVICE     TYPE              STATE        
    eth0       802-3-ethernet    connected
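
The disconnect and reconnect steps above could be wrapped in a small script for a nightly cron job. This is only a sketch: it reuses the nmcli commands exactly as shown in the listing, and the interface name, connection name, settle delay, and the `NMCLI` override variable are assumptions to adjust for your system. Whether it restores connectivity depends on what is actually causing the failure.

```shell
#!/bin/sh
# Sketch: cycle a NetworkManager connection with the nmcli commands shown above.
# IFACE and CONN_NAME are assumptions taken from the listing; adjust as needed.
IFACE="${IFACE:-eth0}"
CONN_NAME="${CONN_NAME:-Wired connection 1}"
# NMCLI is overridable so the sequence can be dry-run (for example NMCLI=echo).
NMCLI="${NMCLI:-nmcli}"

cycle_interface() {
    # Disconnect the device, wait briefly for the link to settle, then reconnect.
    "$NMCLI" dev disconnect iface "$IFACE" || return 1
    sleep "${SETTLE_SECONDS:-5}"
    "$NMCLI" con up id "$CONN_NAME" || return 1
}

# Only touch the network when explicitly asked to.
if [ "${1:-}" = "run" ]; then
    cycle_interface
fi
```

Saved under a hypothetical path such as /usr/local/bin/cycle-eth0.sh, it could be scheduled from root's crontab shortly after the 05:00 transition, for example `10 5 * * * /usr/local/bin/cycle-eth0.sh run`.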
    

    It would also be useful to know whether the IP is assigned indirectly through a router or directly as a routable internet address. Sometimes it is the router that interferes with a restart.

    Thanks for replying. The board is connected to a router and assigned a static IP through the router’s DHCP server. I’m going to try some of peba’s recommendations first as I wasn’t aware of dhclient or nmcli. I’ll report back in a few days with the results.

    Try pinging the router when things go bad. If that ping fails, the failure is between the router and the Jetson rather than between the router and the satellite link. There are a lot of unreliable and flaky routers.
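
The ping check suggested above can be sketched as a two-step test: ping the router first, then something beyond it. The addresses below are placeholders and not values from this thread (192.168.1.1 is a common router default, 8.8.8.8 is a public DNS server), and the function names are made up for illustration.

```shell
#!/bin/sh
# Sketch: narrow down where connectivity breaks.
# ROUTER_IP and WAN_IP are assumed placeholders; set them for your network.
ROUTER_IP="${ROUTER_IP:-192.168.1.1}"
WAN_IP="${WAN_IP:-8.8.8.8}"

# Turn the two ping exit codes into a short diagnosis.
diagnose() {
    router_rc=$1
    wan_rc=$2
    if [ "$router_rc" -ne 0 ]; then
        echo "router unreachable: fault is between Jetson and router"
    elif [ "$wan_rc" -ne 0 ]; then
        echo "router reachable, WAN unreachable: fault is beyond the router"
    else
        echo "router and WAN reachable"
    fi
}

check_links() {
    # Three probes each, waiting at most two seconds per reply.
    ping -c 3 -W 2 "$ROUTER_IP" >/dev/null 2>&1; r=$?
    ping -c 3 -W 2 "$WAN_IP" >/dev/null 2>&1; w=$?
    diagnose "$r" "$w"
}

# Only send pings when explicitly asked to.
if [ "${1:-}" = "run" ]; then
    check_links
fi
```

Running it with the `run` argument while the board is in the failed state gives a one-line answer that can be logged for later comparison.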

    I’m back with a status update.

  • pinging the router: fails
  • dhclient: fails after several attempts
  • nmcli disconnect / reconnect: times out & fails
  • I have discovered another strange bit of information. When I check the IP of the board, it’s not the one my router is supposed to give it. The router, a so-far very reliable Netgear WNDR3700 v3, is set to give a static IP based on MAC address. I’m not entirely sure what that means. During these tests the graphics driver also crashed more than once.

    My next course of action is to replace Ubuntu and the L4T kernel with Debian.

    It is possible there is more than one DHCP response to a DHCP request, unless the address is a default used when there is no response. What is the unexpected address? Second, does a reboot of the router allow a simple network restart on the Jetson without rebooting the Jetson?

    Good thinking on restarting the router, linuxdev. I will try that tomorrow morning.

    The IP I reserved for the TK1 is 192.168.1.13. The IP it receives overnight is 192.168.1.6. The second IP isn’t assigned to anything; it’s just the next open IP the router assigns to an unassigned machine. It’s almost like the TK1 isn’t giving the proper MAC address, or the router isn’t receiving the right MAC address.
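
One way to check the mismatch described above from the board itself is to print the MAC address the interface reports next to the address it currently holds. A rough sketch, assuming the interface is eth0, that the `ip` tool from iproute2 is installed, and using the reserved address quoted above as the expected value; the function names are made up for illustration.

```shell
#!/bin/sh
# Sketch: compare the current IPv4 address of an interface with the expected one.
# IFACE is an assumption; EXPECTED_IP is the reservation quoted above.
IFACE="${IFACE:-eth0}"
EXPECTED_IP="${EXPECTED_IP:-192.168.1.13}"

# Read `ip -4 -o addr show` output on stdin and print the first IPv4 address.
extract_ipv4() {
    awk '$3 == "inet" { split($4, a, "/"); print a[1]; exit }'
}

report() {
    # The kernel exposes the MAC address it is using under /sys.
    mac=$(cat "/sys/class/net/$IFACE/address")
    ip_now=$(ip -4 -o addr show dev "$IFACE" | extract_ipv4)
    echo "MAC: $mac"
    echo "IP:  ${ip_now:-none} (expected $EXPECTED_IP)"
    # Exit status is non-zero when the address is not the expected one.
    [ "$ip_now" = "$EXPECTED_IP" ]
}

# Only query the interface when explicitly asked to.
if [ "${1:-}" = "run" ]; then
    report
fi
```

If the MAC printed here matches the one in the router's reservation table but the address still differs, that points at the router's lease handling rather than at the board.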

    Some routers have logs available via a web port if connected directly. You might want to see if this is available even if router reset works.

    Resetting the router solved the problem. It solved the problem once anyway. I won’t know if this is a permanent solution until tomorrow, but it is a possibility. The router hadn’t been reset in over a year.

    A large number of routers behave badly, so I’m not even slightly surprised this would be the cause. You might check for firmware updates on this model.

    I’m back. I’m pleased to say that everything was working as intended this morning. There’s a lesson for me to learn in all this about checking the obvious places first. Thank you, linuxdev, I am very grateful for your help.

    Hi, I am having the exact same problem. After a few hours of running, the TK1’s network card no longer responds. I am using R21.2. I’ve tried uninstalling NetworkManager, but that has not helped.

    I have configured a static address in /etc/network/interfaces:

    auto lo
    iface lo inet loopback

    auto eth0

    iface eth0 inet dhcp

    iface eth0 inet static
    address 10.47.11.221
    netmask 255.255.252.0
    network 10.47.8.0
    broadcast 10.47.11.255
    gateway 10.47.11.254
    dns-nameservers 10.47.8.8

    Does anyone have any suggestions on how to fix this issue? It is definitely not router-related, unlike the earlier posts.

    So far as R21.x goes, there seems to be a kernel driver issue where the driver interacts with the scheduler. You can find threads all over on this topic for workarounds, but the only current solid fix is to revert to R19.x or else to build a kernel a few versions in the future which doesn’t have the issue.

    Hi, thanks for your reply. I have another TK1 running R19.2 and it is a lot more stable. Do you know where I can download older JetPack versions? NVIDIA has removed older versions from the site.

    So far as I know there are no older JetPack versions. As I recall, the first JetPack came out when R21.1 or R21.2 was out…probably R21.2. No R19.x JetPack was ever issued. All of the packages are available separately for R19.x and R21.x, but CUDA 6.0 is the only version supported under R19.x and CUDA 6.5 is the only one supported under R21.x.

    Hi

    I am having a similar problem with a Jetson TX1 running Ubuntu 16.04, JetPack 3.3, and kernel 4.4.38. I have two devices (cameras) connected: one directly to PCI bridge: NVIDIA Corporation Device 0fae (rev a1), and the other to Ethernet controller: Intel Corporation 82574L Gigabit Network Connection. Often when the board boots up, it does not show the connected devices and I am not able to ping them; restarting networking or running ifdown/ifup does not help. The only fix is to reboot the board. After rebooting, the devices show up in the ARP table and I am able to ping them.

    The syslog can be found here: https://drive.google.com/file/d/1i2ktuAKzBOYKfTrZo_kQJOrfYOxvflFT/view?usp=sharing

    16.04 is not officially supported on 32-bit platforms, and so if you’ve upgraded there are some ways it might work and other ways in which it is likely to fail. You might mention how you upgraded.

    Below are a couple of threads on things people have done to update:
    https://devtalk.nvidia.com/default/topic/1036457/jetson-tk1/how-to-successfully-update-jetson-tk1-to-ubuntu-16-04/
    https://devtalk.nvidia.com/default/topic/1037496/jetson-tk1/ubuntu-update-on-jetson-tk1-/post/5270975/#5270975

    If the upgrade went well, then this should show all “ok”:

    sha1sum -c /etc/nv_tegra_release
    

    …and the gist of this is that those checksums are for the NVIDIA-specific direct hardware access drivers.

    Do note that if the device tree is altered, or if the kernel using that device tree is altered, then some devices might not work (think of the device tree as a constructor for driver loading…the driver uses a generic way to name various variables, and the device tree fills in those values). I probably can’t help, but if you’ve done the above from those URLs, then someone might be able to help.

    I am using the Jetson TX1, which has a 64-bit quad-core ARM Cortex-A57, so this should definitely not be the case.

    The reason I assumed TK1 is that this is the TK1 forum (sorry, I missed the earlier TX1 note). The 32-bit (TK1) and 64-bit (TX1) code is sufficiently different that it could make quite a bit of difference (for some releases, the TK1 kernel is an entire generation older). This should really be moved to the TX1 forum, but the rest of the people in the thread are using TK1 (their posts should stay here).

    You might want to start a new thread with the same (or edited) topic here:
    TX1: https://devtalk.nvidia.com/default/board/164/jetson-tx1/

    The “sha1sum -c /etc/nv_tegra_release” question is still valid, and the question of which release is used (see “head -n 1 /etc/nv_tegra_release”) is also needed. Can you post this in the TX1 forum instead (you could simply copy and paste much of what is here, plus the sha1sum and release version questions are important)? Many of the people who work with the TX1 won’t be reading the TK1 forum, and many people who do read this forum will also miss that this is a TX1.

    EDIT: I see you’ve already put this in the TX1 forum. For others looking, see this for the TX1:
    https://devtalk.nvidia.com/default/topic/1049491/jetson-tx1/ethernet-communication-problem-on-bootup-with-tx1/