TX2 best way to recover from apt-get upgrade?

I got a new TX2 at GTC yesterday and was very excited. I fired it up and was immediately able to log in via SSH.

Step 1: I enabled universe, installed fio and managed to get Intel Optane Memory working on the machine. It is a big upgrade storage-wise over eMMC. I decided that it was time to try running some of our other software.

Step 2: I did apt-get update && apt-get upgrade -y and then did sudo reboot. I do this on our Cavium ThunderX servers and that tends to fix a few issues since arm64 support is still maturing.

Apparently, that was a very bad idea. SSH no longer responds, but I can see the NIC light come up, turn off, then come on again. It looks almost as though the board is stuck in a boot loop.

I have tried three different monitors with three different HDMI cables and get no video. That was with only a USB keyboard and HDMI plugged in. I have also tried with only the HDMI cable plugged in.

Any tips on this? I was having a ton of fun with the dev kit. But it seems like it may have been a very expensive 30 minutes of fun.

Updating individual packages shouldn’t be an issue, although some people lose video. If you did an upgrade to a new version of Ubuntu itself, this is always fatal.

For those who can still log in, this would tell you if any critical files were damaged:

sha1sum -c /etc/nv_tegra_release
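
On a long checksum list it can help to show only the entries that did not verify. A minimal sketch, assuming GNU coreutils sha1sum and a checksum list in the usual sha1sum format; the helper name list_failed_checksums is made up for illustration:

```shell
# Hypothetical helper: print only the checksum entries that did NOT verify.
# $1 is a checksum list in sha1sum format (defaults to /etc/nv_tegra_release).
list_failed_checksums() {
    # LC_ALL=C keeps the "OK" status text untranslated so the filter matches.
    # Warnings about lines that are not checksum entries go to stderr and are dropped.
    LC_ALL=C sha1sum -c "${1:-/etc/nv_tegra_release}" 2>/dev/null | grep -v ': OK$'
}
```

No output (and a non-zero exit status from grep) means every listed file verified.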

In all cases where you no longer have access and want to see what’s going on you’ll probably have to have a serial console set up. See:
http://www.jetsonhacks.com/2015/12/01/serial-console-nvidia-jetson-tx1/
(serial console is the same on TX1 and TX2, except that TX2 does not support CTS/RTS flow control…TX2 uses software flow control or “no” flow control)

If you did do an upgrade to a new Ubuntu version, then you’ll need to flash and start over from scratch. You’re not going to be able to save it. If you have critical data, though, you can clone the root partition for later reference.

Thanks! Looks like 30 min in I already need to do a full recovery. Luckily not enough time to get data on there.

ethernet@2490000 Waiting for PHY auto negotiation to complete...... done
BOOTP broadcast 1
BOOTP broadcast 2
BOOTP broadcast 3
DHCP client bound to address 192.168.1.44 (1253 ms)
*** Warning: no boot file name; using 'C0A8012C.img'
Using ethernet@2490000 device
TFTP from server 0.0.0.0;

Looks like on the reboot, it lost its bootloader.

This is a case of having not found the boot partition, and then falling back to look for a diskless boot setup. You have a DHCP server (on a router or PC), so it gets an address, but you haven’t set up a TFTP server, so it can’t load a kernel. Basically the flash did not put a valid rootfs on. Some detail of your flash did not properly prepare a rootfs image…the flash could succeed, but if rootfs is not valid, then it won’t matter. flash.sh does not differentiate between flashing an invalid rootfs and a valid rootfs.
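
The fallback file name in the log is not random: when no boot file name is supplied, U-Boot builds one from the leased IPv4 address written as eight hexadecimal digits plus ".img". A small sketch of that conversion (the function name ip_to_bootfile is invented here for illustration):

```shell
# Hypothetical helper: convert a dotted-quad IPv4 address into the
# default boot file name seen in the log (hex octets + ".img").
ip_to_bootfile() {
    # Split the address on dots into the four positional parameters.
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs
    # Print each octet as two uppercase hex digits.
    printf '%02X%02X%02X%02X.img\n' "$1" "$2" "$3" "$4"
}

ip_to_bootfile 192.168.1.44   # prints C0A8012C.img
```

So 'C0A8012C.img' is just 192.168.1.44 in hex, which confirms the board was attempting a network boot rather than reading anything from its own storage.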

In your driver package directory is a subdirectory “bootloader/”. Go there, and list the size of this file:

ls -l system.img.raw

…knowing the exact size of this file may offer a clue.
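
To capture the exact byte count for posting, something like the following may be more convenient than reading it off the ls output. This is only a sketch; it assumes GNU stat (as on an Ubuntu host), and the function name image_size_bytes is hypothetical:

```shell
# Hypothetical helper: print the exact size in bytes of an image file.
# Defaults to system.img.raw in the current directory (run it from bootloader/).
image_size_bytes() {
    stat -c '%s' "${1:-system.img.raw}"
}
```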

Because there was some uncertainty about what the apt commands do in this thread:

apt-get upgrade does not upgrade to a new version of Ubuntu, it just upgrades packages with bug/security fixes.

Not even apt-get dist-upgrade will upgrade to a new version of Ubuntu (it’s just a version of “upgrade” that may also add/remove packages if needed.)

Only do-release-upgrade will change to a new version of Ubuntu! (Don’t do that!)
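
One way to confirm that a package upgrade left the release itself alone is to note the Ubuntu version before and after. A minimal sketch, assuming the standard /etc/os-release file is present; the function name ubuntu_version is made up for this example:

```shell
# Hypothetical helper: print VERSION_ID from an os-release style file.
# $1 defaults to /etc/os-release; pass a path that contains a slash.
ubuntu_version() {
    # Source the file in a subshell so its variables do not leak into the caller.
    ( . "${1:-/etc/os-release}" && printf '%s\n' "$VERSION_ID" )
}

# Example use around an upgrade (commented out so nothing runs by accident):
#   before=$(ubuntu_version)
#   sudo apt-get update && sudo apt-get upgrade
#   [ "$before" = "$(ubuntu_version)" ] && echo "release unchanged: $before"
```

Running apt-get with -s (simulate) first will also list what would be changed without touching the system.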

@snarky exactly what I thought.

No idea how this broke everything.

@linuxdev I just provisioned a NUC as an Ubuntu Desktop (I only had server boxes) then re-flashed everything.

I pulled it off the PXE boot network just to make sure it was not picking up something off of the network.

If apt has only clobbered a package or two, you can manually unpack the same file that apply_binaries.sh unpacks into rootfs…only you’d unpack it as root (sudo) from the Jetson’s “/”. Typically libGL.so is what breaks in that set of files, but only the GUI would fail from this, not networking. These are the files listed and tested by “sha1sum -c /etc/nv_tegra_release”. If you can’t boot, though, and networking is failing, you can’t just unpack a file, so you’re back to flashing. Some people have seen the note about how you can upgrade the Ubuntu version itself (I think it is for going from 14 to 16)…this is always fatal.

In a flash where extlinux.conf is not found, it might be because something about the root partition is not readable (e.g., there seem to be some 64-bit extensions from Ubuntu 16 hosts which might need to be manually disabled), or perhaps the files don’t exist on the partition. You’d have to post the full boot log to know. Had the partition and extlinux.conf been readable, it would never have fallen back to PXE boot. Just be careful to unpack the sample rootfs with sudo, and to run apply_binaries.sh with sudo as well. If you have an Ubuntu 16 host, mention this and we can check whether the partition was unreadable due to some newer 64-bit ext4 options which U-Boot can’t use.
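
Whether a partition or image was created with a given ext4 feature can be read from the "Filesystem features:" line that tune2fs -l or dumpe2fs -h prints. A rough sketch that parses that output from standard input; the helper names are invented for illustration, and treating 64bit as the feature to look for is an assumption based on the description above:

```shell
# Hypothetical helpers: inspect the "Filesystem features:" line of
# `tune2fs -l` / `dumpe2fs -h` output supplied on standard input.

# Print one feature name per line.
list_ext4_features() {
    sed -n 's/^Filesystem features:[[:space:]]*//p' | tr ' ' '\n'
}

# Succeed (exit 0) if the feature named in $1 is present.
has_ext4_feature() {
    list_ext4_features | grep -qx "$1"
}

# Example use (commented out; the image path is only a guess at a likely target):
#   dumpe2fs -h bootloader/system.img.raw 2>/dev/null | has_ext4_feature 64bit \
#       && echo "64bit feature is enabled"
```

Reading from standard input keeps the parsing separate from the privileged call, so saved output can be checked on any machine.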

I lost libglx (I think this is video) after an apt-get upgrade.

How can I solve this?

Duplicate, answered here:
https://devtalk.nvidia.com/default/topic/1027461/jetson-tx2/after-apt-get-upgrade-for-jetson-tx2-with-jetpack-3-2-error-with-libglx/