TX2 doesn't boot after flashing OS

Hi,

I already had the Jetson TX2 working before.
I flashed the OS again, and now the TX2 doesn’t boot.

I checked the serial console, and here is the output (since the HDMI doesn’t display anything):

U-Boot 2016.07-g5971907 (Mar 01 2017 - 21:06:45 -0800)
In:    serial
Out:   serial
Err:   serial
Net:   eth0: ethernet@2490000
Hit any key to stop autoboot:  0 
MMC: no card present
switch to partitions #0, OK
mmc0(part 0) is current device
Scanning mmc 0:1...
starting USB...
No controllers found
USB is stopped. Please issue 'usb start' first.
starting USB...
No controllers found
ethernet@2490000 Waiting for PHY auto negotiation to complete.... done
BOOTP broadcast 1
DHCP client bound to address 192.168.100.233 (1 ms)
Using ethernet@2490000 device
TFTP from server 192.168.36.1; our IP address is 192.168.100.233
Filename 'pxelinux.0'.
Load address: 0x80080000
Loading: T T T T T T T T T T
Retry count exceeded; starting again
Config file not found
starting USB...
No controllers found

Another try:

(...)
Using ethernet@2490000 device
TFTP from server 192.168.36.1; our IP address is 192.168.100.233
Filename 'pxelinux.cfg/default-arm'.
Load address: 0x90100000
Loading: T T T T T T T T T T
Retry count exceeded; starting again
Retrieving file: pxelinux.cfg/default
ethernet@2490000 Waiting for PHY auto negotiation to complete.... done
Using ethernet@2490000 device
TFTP from server 192.168.36.1; our IP address is 192.168.100.233
Filename 'pxelinux.cfg/default'.
Load address: 0x90100000
Loading: T T T T T T T T T T
(...)

What am I doing wrong?

Thank you

This implies the partition with extlinux.conf (presumably eMMC, or mmcblk0p1) cannot be opened…because of this the boot loader tries other fallbacks…the last is network booting.

On the host you used to flash the Jetson, in “/etc/mke2fs.conf”, is there anything about 64-bit in the ext4 section?
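One quick way to check this on the host is to search the config for the two features in question. This is only a minimal sketch: the helper name `ext4_features_check` is made up for illustration, and it assumes the config lives at the usual /etc/mke2fs.conf path.

```shell
# Hypothetical helper: list every "features =" line in an mke2fs.conf that
# enables 64bit or metadata_csum (in any stanza). Defaults to the system file.
ext4_features_check() {
    conf="${1:-/etc/mke2fs.conf}"
    grep -nE 'features *=.*(64bit|metadata_csum)' "$conf"
}

# Usage: prints matching lines with line numbers; exit status 0 means a match.
# ext4_features_check /etc/mke2fs.conf
```

Note that this also reports stanzas other than ext4 (for example ext4dev), so check which stanza each reported line belongs to.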

If you look at your driver package in the “Linux_for_Tegra” directory there will be subdirectory “rootfs”. Does the content of rootfs look like a valid Linux root file system?

I’m using an Ubuntu 14.04 host.

The /etc/mke2fs.conf is:

[fs_types]
        ext3 = {
                features = has_journal
        }
        ext4 = {
                features = has_journal,extent,huge_file,flex_bg,metadata_csum,64bit,dir_nlink,extra_isize
                inode_size = 256
        }
        ext4dev = {
                features = has_journal,extent,huge_file,flex_bg,metadata_csum,inline_data,64bit,dir_nlink,extra_isize
                inode_size = 256
                options = test_fs=1

In the rootfs directory everything seems OK:

ls /home/lsa/64_TX2/Linux_for_Tegra_tx2/rootfs
bin  boot  dev  etc  home  lib  media  mnt  opt  proc  README.txt 
root  run  sbin  snap  srv  sys  tmp  usr  var

The ext4 parameters should be fine. The rootfs also looks ok. The apply_binaries.sh script could have failed, but this would not change being able to find the rootfs, so basics are in place and correct.

Was the flash via command line, or via JetPack? Try flashing again using only command line. Here’s the info:
https://devtalk.nvidia.com/default/topic/1011180/jetson-tx2/initial-bring-up-of-tx2/post/5157683/#5157683
OR:
https://devtalk.nvidia.com/default/topic/1001583/jetson-tx2/jetson-tx2-not-recognized-installation-inside-docker-error-8-error-return-value-8-command-tegrarcm_v2-chip-0x18-rcm-rcm_list_signed-xml-skipuid-reading-board-information-failed-/post/5116716/#5116716

I was flashing via JetPack.

I tried via the command line but got the same problem.

Is there any way to format the Jetson TX2 and flash the OS again?

U-Boot 2016.07-g5971907 (Mar 01 2017 - 21:06:45 -0800)

TEGRA186
Model: NVIDIA P2771-0000-500
DRAM:  7.8 GiB
MC:   Tegra SD/MMC: 0, Tegra SD/MMC: 1
*** Warning - bad CRC, using default environment

In:    serial
Out:   serial
Err:   serial
Net:   eth0: ethernet@2490000
Hit any key to stop autoboot:  0 
MMC: no card present
switch to partitions #0, OK
mmc0(part 0) is current device
Scanning mmc 0:1...
starting USB...
No controllers found
USB is stopped. Please issue 'usb start' first.
starting USB...
No controllers found
ethernet@2490000 Waiting for PHY auto negotiation to complete.... done
BOOTP broadcast 1
DHCP client bound to address 192.168.100.233 (1 ms)
Using ethernet@2490000 device
TFTP from server 192.168.36.1; our IP address is 192.168.100.233
Filename 'pxelinux.0'.
Load address: 0x80080000
Loading: T 
Abort
missing environment variable: pxeuuid
Retrieving file: pxelinux.cfg/01-00-04-4b-8c-3b-73
ethernet@2490000 Waiting for PHY auto negotiation to complete.... done
Using ethernet@2490000 device
TFTP from server 192.168.36.1; our IP address is 192.168.100.233
Filename 'pxelinux.cfg/01-00-04-4b-8c-3b-73'.
Load address: 0x90100000
Loading: T T T T T T T T T T 
Retry count exceeded; starting again
Retrieving file: pxelinux.cfg/C0A864E9
ethernet@2490000 Waiting for PHY auto negotiation to complete.... done
Using ethernet@2490000 device
TFTP from server 192.168.36.1; our IP address is 192.168.100.233
Filename 'pxelinux.cfg/C0A864E9'.
Load address: 0x90100000
Loading: T T T T T T T T T T 
Retry count exceeded; starting again
Retrieving file: pxelinux.cfg/C0A864E
ethernet@2490000 Waiting for PHY auto negotiation to complete.... done
Using ethernet@2490000 device
TFTP from server 192.168.36.1; our IP address is 192.168.100.233
Filename 'pxelinux.cfg/C0A864E'.
Load address: 0x90100000
Loading: T T T T T T T T T T 
Retry count exceeded; starting again
Retrieving file: pxelinux.cfg/C0A864
ethernet@2490000 Waiting for PHY auto negotiation to complete.... done
Using ethernet@2490000 device
TFTP from server 192.168.36.1; our IP address is 192.168.100.233
Filename 'pxelinux.cfg/C0A864'.
Load address: 0x90100000
Loading: T T T T T T T T T T 
Retry count exceeded; starting again
Retrieving file: pxelinux.cfg/C0A86
ethernet@2490000 Waiting for PHY auto negotiation to complete.... done
Using ethernet@2490000 device
TFTP from server 192.168.36.1; our IP address is 192.168.100.233
Filename 'pxelinux.cfg/C0A86'.
Load address: 0x90100000
Loading: T T T T T T T T T T 
Retry count exceeded; starting again
Retrieving file: pxelinux.cfg/C0A8
ethernet@2490000 Waiting for PHY auto negotiation to complete.... done
Using ethernet@2490000 device
TFTP from server 192.168.36.1; our IP address is 192.168.100.233
Filename 'pxelinux.cfg/C0A8'.
Load address: 0x90100000
Loading: T T

What was the exact command line you used for the flash? Flash is in fact the way to format and start over. A bit-for-bit exact image is created during flash, and then this is copied to the Jetson. Whatever is going on now the root partition is not valid…this is why it moves on to try network boot.

The act of flashing should produce file “bootloader/system.img.raw”. Is that there, and what is its exact size (ls -l)? Perhaps flash failed to create a loopback device, in which case the file would be there but not mountable. Assuming the file is there, can you do this:

sudo -s
cd /where/ever/Linux_for_Tegra/bootloader
mount -o loop ./system.img.raw /tmp
cd /tmp
ls
# Looks like rootfs...
cd -
umount /tmp
exit

Sometimes a loopback device fails and the image never gets formatted/populated…the mount/umount on the file should test that.

I ran these commands:

sudo ./apply_binaries.sh
sudo ./flash.sh -S 28GiB jetson-tx2 mmcblk0p1

The output with the image mounted:

root@hydra:/tmp# ls
bin   dev  home  lost+found  mnt  proc        root  sbin  srv  tmp  var
boot  etc  lib   media       opt  README.txt  run   snap  sys  usr
root@hydra:/tmp#  ls -l
total 112
drwxr-xr-x   2 root root  4096 Jun 30  2016 bin
drwxr-xr-x   5 root root  4096 Jun  1 18:56 boot
drwxr-xr-x   2 root root  4096 Mai  3  2016 dev
drwxr-xr-x 133 root root 12288 Jun  1 18:56 etc
drwxr-xr-x   4 root root  4096 Jan  6 03:18 home
drwxr-xr-x  22 root root  4096 Jun  1 17:55 lib
drwx------   2 root root 16384 Jun  1 19:10 lost+found
drwxr-xr-x   2 root root  4096 Ago  8  2016 media
drwxr-xr-x   2 root root  4096 Abr 20  2016 mnt
drwxr-xr-x   2 root root  4096 Abr 20  2016 opt
drwxr-xr-x   2 root root  4096 Abr 12  2016 proc
-rw-r--r--   1 lsa  lsa     62 Mar  2 05:14 README.txt
drwx------   3 root root  4096 Mai  6  2016 root
drwxr-xr-x   9 root root  4096 Dez 12 07:18 run
drwxr-xr-x   2 root root 12288 Dez 12 07:17 sbin
drwxr-xr-x   2 root root  4096 Abr 19  2016 snap
drwxr-xr-x   2 root root  4096 Abr 20  2016 srv
drwxr-xr-x   2 root root  4096 Fev  5  2016 sys
drwxrwxrwt   7 root root  4096 Jan 25 09:31 tmp
drwxr-xr-x  11 root root  4096 Mai  3  2016 usr
drwxr-xr-x  15 root root  4096 Jun  1 17:55 var

So we know the image copied to the Jetson is itself not corrupt…this implies loopback was not an issue. But what is the exact byte size of system.img.raw? Use “ls -l system.img.raw” to see this (it is in the bootloader subdirectory).

In your bootloader subdirectory, are both system.img.raw and system.img still there? If so, try to log a flash with the “reuse” option (this will reuse system.img, which is a compressed system.img.raw). Note the “-r” in this flash command…if you forget it, flash.sh will overwrite your system.img and you’ll have to start over:

# Jetson in recovery mode...
sudo ./flash.sh -S 28GiB -r jetson-tx2 mmcblk0p1 2>&1 | tee flash_reuse_log.txt

Some places in the log contain very long progress bar text. You don’t need that part, but see if you can post the actual log here.

Sorry, I forgot the file sizes. They seem OK:

-rwxr-xr-x 1 root root  3359375428 Jun  1 19:13 system.img
-rw-r--r-- 1 root root 30064771072 Jun  1 19:25 system.img.raw

I attached the output log of the flash here.

flash_reuse_log.txt (24.3 KB)

That file size is correct and matches “-S 28GiB”.
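The check is simple arithmetic, since 1 GiB is 1024^3 bytes. A small sketch (the helper name `gib_to_bytes` is just for illustration):

```shell
# Convert a size in GiB to bytes: 1 GiB = 1024 * 1024 * 1024 bytes.
gib_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

gib_to_bytes 28   # prints 30064771072
```

That is the same number “ls -l” reported for system.img.raw.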

According to that log flash was successful.

So I went over previous information again and found I did miss something…I had thought your ext4 did not have the 64-bit extensions, but it does:

        ext4 = {
                features = has_journal,extent,huge_file,flex_bg,metadata_csum,64bit,dir_nlink,extra_isize
                inode_size = 256
        }

So everything is correct, except your host is defaulting in ext4 creation to 64-bit extensions which U-Boot doesn’t understand. I should have caught that sooner.
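If you want to confirm this on the generated image rather than infer it from the host config, the superblock header lists the enabled features. The following is a sketch only: it assumes e2fsprogs (dumpe2fs) is installed on the host, and the helper name `has_64bit_or_csum` is hypothetical.

```shell
# Hypothetical helper: read dumpe2fs/tune2fs header output on stdin and succeed
# if the "Filesystem features:" line lists 64bit or metadata_csum.
has_64bit_or_csum() {
    grep '^Filesystem features:' | grep -qwE '64bit|metadata_csum'
}

# Usage on the host, from the Linux_for_Tegra directory (adjust the path):
# dumpe2fs -h bootloader/system.img.raw 2>/dev/null | has_64bit_or_csum \
#     && echo "image was formatted with 64bit and/or metadata_csum"
```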

The flash.sh script can be edited to tell it to ignore the host’s defaults, or you can temporarily edit the host’s “/etc/mke2fs.conf”. These options need to be removed:

metadata_csum,64bit,

Editing mke2fs.conf disables the 64-bit extensions for every ext4 file system created on the host until the options are added back in. Editing flash.sh instead makes only flash.sh override the host’s defaults and skip those extensions.
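For the mke2fs.conf route, here is a rough sketch of stripping the two options from the ext4 stanza only. The helper name `strip_64bit_features` is made up, and it assumes each option is followed by a comma as in the config posted above (if an option were last in the list, a trailing comma would be left behind).

```shell
# Hypothetical helper: print an mke2fs.conf with metadata_csum and 64bit
# removed from the ext4 stanza only (from "ext4 = {" to its closing brace).
# The ext4dev stanza and everything else are left untouched.
strip_64bit_features() {
    sed -E '/^[[:space:]]*ext4[[:space:]]*=/,/[}]/ s/(metadata_csum|64bit),?//g' "$1"
}

# Possible usage (keep a backup so the defaults can be restored after flashing):
# sudo cp /etc/mke2fs.conf /etc/mke2fs.conf.orig
# strip_64bit_features /etc/mke2fs.conf.orig | sudo tee /etc/mke2fs.conf >/dev/null
# ... flash the Jetson ...
# sudo mv /etc/mke2fs.conf.orig /etc/mke2fs.conf
```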

This URL explains editing flash.sh:
https://devtalk.nvidia.com/default/topic/1001880/jetson-tx2/tx2-not-booting-after-successful-flashing-/post/5120419/#5120419

Changing that one line in flash.sh to this tells it to ignore the 64bit and metadata_csum options:

mkfs -t $4 -O ^metadata_csum,^64bit "${loop_dev}" > /dev/null 2>&1;

It works, thank you @linuxdev!

But I flashed this Jetson once before and it worked (same host, same Jetson). What changed now? Host updates?

Host updates can change default parameters to support larger file systems…I couldn’t tell you though if that was the case. This seems likely if the flash worked before (well, technically speaking the flash always worked…it was the ext4 formatting which does not work with 64-bit options). Assuming updates changed a previously working system I suspect we’ll start seeing more and more cases of otherwise seemingly successful flashes not finding the partition.