Cloning TK1 without JetPack and micro USB

It wouldn’t hurt to see what the serial console shows. Mostly the goal would be to verify that it rebooted and whether u-boot is trying to execute…knowing whether or not it executes is good debug information. Just be careful not to introduce any extraneous keystrokes during the process.

It would make a big difference if there are mixes of L4T versions. Some things like partition layout and hidden partition content may change…mixing the two may not be compatible. When going from R21.4 to R21.5 you lose the ability to use the “tegra12_defconfig”. R21.4 uses this config, but R21.5 uses a somewhat altered config…an R21.5 system won’t boot with the defconfig, so this would break boot.

That said, you can probably work around R21.4/R21.5 mismatch without too much effort if you need to go to R21.5. If not, just use R21.4 first to flash a standard R21.4 environment, and then flash with “-r” to put your clone in place. If you want to convert to R21.5, then the first step is to make a backup of the clone image (at least until you are satisfied the R21.5 edits work). Essentially you’d be mounting the loopback image on the rootfs directory where you would normally unpack a sample rootfs, and then running the apply_binaries.sh program on it as if it were a sample rootfs…actual flash would edit and change the boot directory, including kernel, extlinux.conf, and firmware, so this would alter your original, and then create a new system.img from this. Everything other than the boot-related and kernel-related files would remain constant.
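A minimal sketch of the conversion steps described above, assuming the clone is a raw ext4 image and the R21.5 driver package is unpacked in a directory named Linux_for_Tegra. The paths, file names, and the dry-run wrapper are illustrative assumptions and not part of the original instructions. It only prints the commands unless DRY_RUN=0 is set:

```shell
#!/bin/bash
# Sketch only: the paths and file names below are assumptions for illustration.
L4T_DIR="${L4T_DIR:-./Linux_for_Tegra}"   # unpacked R21.5 driver package
CLONE="${CLONE:-./clone.img.raw}"         # raw (loopback mountable) clone
DRY_RUN="${DRY_RUN:-1}"                   # 1 = print commands, 0 = run them

run() {
    # Print the command in dry-run mode, otherwise execute it.
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

convert_clone() {
    # 1. Keep a backup until the R21.5 edits are known to work.
    run cp --sparse=always "$CLONE" "$CLONE.bak" || return
    # 2. Mount the clone where the sample rootfs would normally be unpacked.
    run sudo mount -o loop "$CLONE" "$L4T_DIR/rootfs" || return
    # 3. Apply the R21.5 binaries as if this were a sample rootfs.
    run sudo sh -c "cd '$L4T_DIR' && ./apply_binaries.sh"
    local status=$?
    # 4. Unmount even if step 3 failed, so the image is not left in use.
    run sudo umount "$L4T_DIR/rootfs"
    return "$status"
}

convert_clone
```

Review the printed commands first, and only then rerun with DRY_RUN=0. The backup copy needs as much free disk space as the clone itself.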

@linuxdev I finally got the serial console working. Here is the message it gave after trying to boot. It seems that it cannot locate the eMMC?

TEGRA124
Board: NVIDIA Jetson TK1
I2C:   ready
DRAM:  2 GiB
MMC:   Tegra SD/MMC: 0, Tegra SD/MMC: 1
*** Warning - bad CRC, using default environment

tegra-pcie: PCI regions:
tegra-pcie:   I/O: 0x12000000-0x12010000
tegra-pcie:   non-prefetchable memory: 0x13000000-0x20000000
tegra-pcie:   prefetchable memory: 0x20000000-0x40000000
tegra-pcie: 2x1, 1x1 configuration
W�֖W�    �V׭�'$\�KW��ˮK$&KH]�Z��&$[X뒷Y�C�tegra-pcie: link 0 down, retrying
tegra-pcie: link 0 down, retrying
tegra-pcie: link 0 down, retrying
tegra-pcie: link 0 down, ignoring
tegra-pcie: probing port 1, using 1 lanes
In:    serial
Out:   serial
Err:   serial
Net:   RTL8169#0
Warning: RTL8169#0 using MAC address from net device

Hit any key to stop autoboot:  0 
MMC: no card present
switch to partitions #0, OK
mmc0(part 0) is current device
Scanning mmc 0...
Failed to mount ext2 filesystem...
** Unrecognized filesystem type **
Failed to mount ext2 filesystem...
** Unrecognized filesystem type **
Failed to mount ext2 filesystem...
** Unrecognized filesystem type **
Failed to mount ext2 filesystem...
** Unrecognized filesystem type **
Failed to mount ext2 filesystem...
** Unrecognized filesystem type **
Failed to mount ext2 filesystem...
** Unrecognized filesystem type **
(Re)start USB...
USB0:   USB EHCI 1.10
scanning bus 0 for devices... 1 USB Device(s) found
USB1:   USB EHCI 1.10
scanning bus 1 for devices... 1 USB Device(s) found
       scanning usb for storage devices... 0 Storage Device(s) found
       scanning usb for ethernet devices... 0 Ethernet Device(s) found

USB device 0: unknown device
BOOTP broadcast 1
BOOTP broadcast 2
BOOTP broadcast 3
BOOTP broadcast 4
DHCP client bound to address 192.168.0.101 (2009 ms)
*** Warning: no boot file name; using 'C0A80065.img'
Using RTL8169#0 device
TFTP from server 0.0.0.0; our IP address is 192.168.0.101; sending through gateway 192.168.0.1
Filename 'C0A80065.img'.
Load address: 0x80408000
Loading: T T T T T T T T T T 
Retry count exceeded; starting again
BOOTP broadcast 1
DHCP client bound to address 192.168.0.101 (2 ms)
*** Warning: no boot file name; using 'C0A80065.img'
Using RTL8169#0 device
TFTP from server 0.0.0.0; our IP address is 192.168.0.101; sending through gateway 192.168.0.1
Filename 'C0A80065.img'.
Load address: 0x80408000
Loading: T T T T T T T T T T 
Retry count exceeded; starting again

During creation of that log, was there anything connected to the mini-PCIe slot? I ask because probing PCIe is where the scrambled characters are.

Note that “mmc0(part0)” is at least acknowledged as existing. The note about failing to find ext2 file system may really refer to ext4 (but there is some backwards compatibility in mounting ext4 as ext2), as this step should instead note that it found extlinux.conf (which is on a subset of mmc0). Failing to find extlinux.conf is why it moves on to trying other boot configuration sources.
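For what it is worth, the odd file name in the TFTP part of the log is not random. When DHCP supplies no boot file name, U-Boot falls back to the board’s IP address written as eight hex digits plus ".img", so 192.168.0.101 becomes C0A80065.img. A small sketch of that conversion (the function name is made up for illustration):

```shell
#!/bin/bash
# ip_to_hex is a hypothetical helper name, not a U-Boot command.
ip_to_hex() {
    local IFS=.
    # Word-split the dotted quad on "." into four positional parameters.
    set -- $1
    printf '%02X%02X%02X%02X\n' "$1" "$2" "$3" "$4"
}

ip_to_hex 192.168.0.101   # prints C0A80065
```

Seeing that name only means every local boot source was already skipped; it says nothing about the eMMC content itself.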

Note that it still states that you could hit any key to stop autoboot…if you hit a key in the serial console at that point (or slightly earlier) you would get to the u-boot console and could investigate the boot environment. Many serial console applications offer logging; you could log, or mouse copy and paste, the output from this to look at some environment details:

# ...hit a key to stop autoboot...
env print

Of most interest are the variables starting with “boot” in their name. Here is a “boot” example from R21.5:

boot_a_script=load ${devtype} ${devnum}:${bootpart} ${scriptaddr} ${prefix}${script}; source ${scriptaddr}
boot_extlinux=sysboot ${devtype} ${devnum}:${bootpart} any ${scriptaddr} ${prefix}extlinux/extlinux.conf
boot_prefixes=/ /boot/
boot_scripts=boot.scr.uimg boot.scr
boot_targets=mmc1 mmc0 usb0 pxe dhcp 
bootcmd=setenv usb_need_init; for target in ${boot_targets}; do run bootcmd_${target}; done
bootcmd_dhcp=run usb_init; if dhcp ${scriptaddr} boot.scr.uimg; then source ${scriptaddr}; fi
bootcmd_mmc0=setenv devnum 0; run mmc_boot
bootcmd_mmc1=setenv devnum 1; run mmc_boot
bootcmd_pxe=run usb_init; dhcp; if pxe get; then pxe boot; fi
bootcmd_usb0=setenv devnum 0; run usb_boot
bootdelay=2
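The bootcmd line above is just a loop over boot_targets. As a rough illustration in ordinary shell (this is not U-Boot’s own shell, and the function name is made up), the following prints the order in which the targets would be tried; the next target is only reached when the previous one fails to boot:

```shell
#!/bin/bash
# Value copied from the boot_targets variable in the environment above.
boot_targets="mmc1 mmc0 usb0 pxe dhcp"

boot_order() {
    local n=1 target
    for target in $boot_targets; do
        echo "$n: run bootcmd_${target}"
        n=$((n + 1))
    done
}

boot_order
```

This appears to match the earlier boot log: "MMC: no card present" for the first target, then the mmc0 scan, then USB, and finally the network attempts.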

To see if mmc is visible, look for output from “mmcinfo”. To see what partitions were added from flash (including hidden), run “mmc part”. Here’s an example on R21.5 flashed with “-S 14580MiB”:

Part	Start LBA	End LBA		Name
	Attributes
	Type GUID
	Partition GUID
  1	0x00017000	0x01c90fff	"APP"
	attrs:	0x0001000000000001
	type:	ebd0a0a2-b9e5-4433-87c0-68b6b72699c7
	guid:	7369c667-ff51-ec4a-29cd-baabf2fbe346
  2	0x01c91000	0x01c92fff	"DTB"
	attrs:	0x0002000000000001
	type:	ebd0a0a2-b9e5-4433-87c0-68b6b72699c7
	guid:	f854c27c-e81b-8de7-765a-2e63339fc99a
  3	0x01c93000	0x01cb2fff	"EFI"
	attrs:	0x0003000000000001
	type:	ebd0a0a2-b9e5-4433-87c0-68b6b72699c7
	guid:	b70d3266-5831-5aa3-255d-051758e95ed4
  4	0x01cb3000	0x01cb4fff	"USP"
	attrs:	0x0004000000000001
	type:	ebd0a0a2-b9e5-4433-87c0-68b6b72699c7
	guid:	c6cdb2ab-b49b-1154-0e82-7441213ddc87
  5	0x01cb5000	0x01cb6fff	"TP1"
	attrs:	0x0005000000000001
	type:	ebd0a0a2-b9e5-4433-87c0-68b6b72699c7
	guid:	a13ee970-e141-67fc-3e01-7e97eadc6b96
  6	0x01cb7000	0x01cb8fff	"TP2"
	attrs:	0x0006000000000001
	type:	ebd0a0a2-b9e5-4433-87c0-68b6b72699c7
	guid:	2a5c388f-b0ec-fb3b-32af-3c54ec18db5c
  7	0x01cb9000	0x01cbafff	"TP3"
	attrs:	0x0007000000000001
	type:	ebd0a0a2-b9e5-4433-87c0-68b6b72699c7
	guid:	43fe1a02-fafb-3aaa-fb29-d1e6053c7c94
  8	0x01cbb000	0x01cbbfff	"WB0"
	attrs:	0x0008000000000001
	type:	ebd0a0a2-b9e5-4433-87c0-68b6b72699c7
	guid:	61bed875-f989-bb5c-a899-0f95b1ebf1b3
  9	0x01cbc000	0x01d58fff	"UDA"
	attrs:	0x0009000000000001
	type:	ebd0a0a2-b9e5-4433-87c0-68b6b72699c7
	guid:	00f7ef05-a1e9-e53a-ca0b-cbd0484764bd
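As a quick sanity check, the APP entry can be converted back to a size. Assuming 512-byte blocks, the start and end LBA of partition 1 work out to the same 14580 MiB that was passed to "-S":

```shell
#!/bin/bash
# Start and end LBA of the "APP" partition from the listing above.
start=0x00017000
end=0x01c90fff
block=512   # assumption: 512-byte blocks

sectors=$(( $end - $start + 1 ))
bytes=$(( sectors * block ))
mib=$(( bytes / 1024 / 1024 ))

echo "sectors=$sectors bytes=$bytes MiB=$mib"
```

This prints sectors=29859840 bytes=15288238080 MiB=14580, so a raw clone of this partition should be exactly 15288238080 bytes.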

Basically we are looking at the boot loader’s concept of eMMC existence, along with how the boot loader is looking for configuration, and specifically the boot loader’s concept of the partition containing extlinux.conf. If you have the disk space, you could try a clone of the current root partition to see if it really is what you expect (don’t overwrite any saved images you want to keep). If the new clone is loopback mountable and of the correct size containing extlinux.conf in the right place, then the issue falls back to the boot loader and its environment rather than the partition.
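One cheap check before any loopback mount is whether the clone has exactly the byte size the flash was given. A sketch, with the helper name and the clone file name as illustrative assumptions:

```shell
#!/bin/bash
# image_is_mib FILE MIB: succeed only if FILE is exactly MIB mebibytes long.
image_is_mib() {
    local file=$1 mib=$2 actual
    actual=$(stat -c %s "$file") || return 2
    [ "$actual" -eq $(( mib * 1024 * 1024 )) ]
}

# Example: a clone taken from an APP partition flashed with "-S 14580MiB".
if [ -f ./clone.img.raw ]; then
    if image_is_mib ./clone.img.raw 14580; then
        echo "size matches"
    else
        echo "size differs"
    fi
fi
```

A size match does not prove the content is valid, but a mismatch is a quick hint that the image and the flash parameters do not belong together.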

Thank you for the quick response! I was actually confused by the mini-PCIe because there is nothing plugged into the mini-PCIe as I have never had a use for it.

Here is the “env print” output:

arch=arm
baudrate=115200
board=jetson-tk1
board_name=jetson-tk1
boot_a_script=load ${devtype} ${devnum}:${bootpart} ${scriptaddr} ${prefix}${script}; source ${scriptaddr}
boot_extlinux=sysboot ${devtype} ${devnum}:${bootpart} any ${scriptaddr} ${prefix}extlinux/extlinux.conf
boot_prefixes=/ /boot/
boot_scripts=boot.scr.uimg boot.scr
boot_targets=mmc1 mmc0 usb0 pxe dhcp 
bootcmd=setenv usb_need_init; for target in ${boot_targets}; do run bootcmd_${target}; done
bootcmd_dhcp=run usb_init; if dhcp ${scriptaddr} boot.scr.uimg; then source ${scriptaddr}; fi
bootcmd_mmc0=setenv devnum 0; run mmc_boot
bootcmd_mmc1=setenv devnum 1; run mmc_boot
bootcmd_pxe=run usb_init; dhcp; if pxe get; then pxe boot; fi
bootcmd_usb0=setenv devnum 0; run usb_boot
bootdelay=2
bootpart=1

The output seems to match yours exactly from the boot.

“mmcinfo” supplied this:

Device: Tegra SD/MMC
Manufacturer ID: 45
OEM: 100
Name: SEM16 
Tran Speed: 52000000
Rd Block Len: 512
MMC version 4.5
High Capacity: Yes
Capacity: 14.7 GiB
Bus Width: 8-bit

Finally using “mmc part” I got the exact same output as you:

Partition Map for MMC device 0  --   Partition Type: EFI

Part    Start LBA       End LBA         Name
        Attributes
        Type GUID
        Partition GUID
  1     0x00017000      0x01c90fff      "APP"
        attrs:  0x0001000000000001
        type:   ebd0a0a2-b9e5-4433-87c0-68b6b72699c7
        guid:   7369c667-ff51-ec4a-29cd-baabf2fbe346
  2     0x01c91000      0x01c92fff      "DTB"
        attrs:  0x0002000000000001
        type:   ebd0a0a2-b9e5-4433-87c0-68b6b72699c7
        guid:   f854c27c-e81b-8de7-765a-2e63339fc99a
  3     0x01c93000      0x01cb2fff      "EFI"
        attrs:  0x0003000000000001
        type:   ebd0a0a2-b9e5-4433-87c0-68b6b72699c7
        guid:   b70d3266-5831-5aa3-255d-051758e95ed4
  4     0x01cb3000      0x01cb4fff      "USP"
        attrs:  0x0004000000000001
        type:   ebd0a0a2-b9e5-4433-87c0-68b6b72699c7
        guid:   c6cdb2ab-b49b-1154-0e82-7441213ddc87
  5     0x01cb5000      0x01cb6fff      "TP1"
        attrs:  0x0005000000000001
        type:   ebd0a0a2-b9e5-4433-87c0-68b6b72699c7
        guid:   a13ee970-e141-67fc-3e01-7e97eadc6b96
  6     0x01cb7000      0x01cb8fff      "TP2"
        attrs:  0x0006000000000001
        type:   ebd0a0a2-b9e5-4433-87c0-68b6b72699c7
        guid:   2a5c388f-b0ec-fb3b-32af-3c54ec18db5c
  7     0x01cb9000      0x01cbafff      "TP3"
        attrs:  0x0007000000000001
        type:   ebd0a0a2-b9e5-4433-87c0-68b6b72699c7
        guid:   43fe1a02-fafb-3aaa-fb29-d1e6053c7c94
  8     0x01cbb000      0x01cbbfff      "WB0"
        attrs:  0x0008000000000001
        type:   ebd0a0a2-b9e5-4433-87c0-68b6b72699c7
        guid:   61bed875-f989-bb5c-a899-0f95b1ebf1b3
  9     0x01cbc000      0x01d58fff      "UDA"
        attrs:  0x0009000000000001
        type:   ebd0a0a2-b9e5-4433-87c0-68b6b72699c7
        guid:   00f7ef05-a1e9-e53a-ca0b-cbd0484764bd

Could it be failing to recognize the mmc because it is ext4?

edit: When I stop the autoboot these errors come up:

missing environment variable: pxeuuid
missing environment variable: bootfile
Retrieving file: pxelinux.cfg/01-00-04-4b-5b-03-cc
*** ERROR: `serverip' not set
missing environment variable: bootfile
Retrieving file: pxelinux.cfg/C0A80065
*** ERROR: `serverip' not set
missing environment variable: bootfile
Retrieving file: pxelinux.cfg/C0A8006
*** ERROR: `serverip' not set
missing environment variable: bootfile
Retrieving file: pxelinux.cfg/C0A800
*** ERROR: `serverip' not set
missing environment variable: bootfile
Retrieving file: pxelinux.cfg/C0A80
*** ERROR: `serverip' not set
missing environment variable: bootfile
Retrieving file: pxelinux.cfg/C0A8
*** ERROR: `serverip' not set
missing environment variable: bootfile
Retrieving file: pxelinux.cfg/C0A
*** ERROR: `serverip' not set
missing environment variable: bootfile
Retrieving file: pxelinux.cfg/C0
*** ERROR: `serverip' not set
missing environment variable: bootfile
Retrieving file: pxelinux.cfg/C
*** ERROR: `serverip' not set
missing environment variable: bootfile
Retrieving file: pxelinux.cfg/default-arm-tegra124
*** ERROR: `serverip' not set
missing environment variable: bootfile
Retrieving file: pxelinux.cfg/default-arm
*** ERROR: `serverip' not set
missing environment variable: bootfile
Retrieving file: pxelinux.cfg/default
*** ERROR: `serverip' not set
Config file not found

Despite error messages, eMMC should be ext4, and things like partition UUID should not matter. According to all of the boot information I see, your setup should find extlinux.conf. I see these possibilities for not finding extlinux.conf:

  1. The binary boot loader is ignoring that environment (possibly the wrong binary is there, but since it tries to read ext2 this is unlikely...the nVidia version of u-boot is expected to mention ext2), or
  2. The partition where u-boot is intended to look is invalid (which can be tested via cloning), or
  3. There is a hardware error.

The above information pretty much guarantees the boot environment is correct to look for extlinux.conf where it should. It really seems the partition in question is also correct, but you could clone it again just to verify that the most recent flash correctly put this in place (I know this takes lots of disk space and time, it’s hard to avoid if needing complete validation). Since you should still have your clone from the original somewhere safe, you should be able to flash with just the default setup and -S 14580MiB and see if pristine flash also fails to boot (you might even use a completely fresh L4T download to the host)…if it is hardware, then this will be the case. If not hardware, then we could investigate what is failing with the particular clone being put back in.

The URL for info on cloning rootfs with a working micro-B USB connector is here:
http://elinux.org/Jetson/Cloning

@linuxdev I don’t think it is a hardware problem because when I boot with “-r” the system boots up without error and is fully functional. I will try to clone the system again now and see what is happening. I have an extra hard drive I can hook up, so disk space won’t be a concern.

I need to back up and make sure I have the information correct. If the unit is flashed with or without reusing an image, I was thinking serial console shows u-boot did not find the partition with extlinux.conf on it. Is this correct? When you say you boot with “-r” the system boots up without error, do you mean you flashed using the clone image and -r?

The unit is currently flashed with a reused image.

This weekend I did reflash it without reusing the image just to see if the jetson was actually working. It booted up without problem so I flashed it again with the image I took from the old jetson. The boot log I posted is with the reused image.

I think you are correct that u-boot did not find the partition with extlinux.conf on it.

The interesting thing about cloned partitions is that they do preserve UUID. I do not believe u-boot looks at UUID, so no matter which ext4 loopback partition of matching size is used for clone restore via “-r”, a valid extlinux.conf should always be found. However, if you are willing to spend the time, once you clone from a working system, restore with the clone you just created to see if it still works…you’d be replacing a partition with itself, right down to UUID to be sure it works (the “re-use” scenario is validated if an image can be cloned of itself and then re-installed on itself and still work). Then you can loopback mount the clone and use utilities to change the UUID, and once again restore with that clone which differs only in UUID…if it fails then we know the issue (I doubt that is the issue but it is for the sake of thorough testing…this would prove if UUID matters or not…it should not).

Note that if it does turn out to be UUID, then the clone used from the original machine which you are trying to install can have its UUID altered to match. I’m hoping this is not the case, as the boot environment should not depend on this. This is one of the few (low probability) possibilities I can think of as to why a valid partition would be ignored.

Do be careful to remember that after each non-reuse flash there is both a system.img and a system.img.raw which need to be removed if you plan to later restore via clone, and the previous raw image needs to be copied back in as system.img instead of system.img.raw (originally the flasher did not know about raw images and only created the uncompressed system.img, later moving it to system.img.raw prior to creating the sparse/compressed image…so unless the flash app created system.img.raw it assumes system.img is what to use…in cases of re-use system.img.raw is ignored).
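Put as a sketch, staging a clone for a “-r” flash looks something like this. The function name, the bootloader directory location, and the clone file name are assumptions; adjust them to wherever the driver package was unpacked:

```shell
#!/bin/bash
# stage_clone CLONE_RAW BOOTLOADER_DIR
# Replace whatever the last flash left behind with the raw clone, named
# system.img, so that a later flash with "-r" reuses it.
stage_clone() {
    local clone=$1 dir=$2
    [ -f "$clone" ] || { echo "no such clone: $clone" >&2; return 1; }
    [ -d "$dir" ]   || { echo "no such directory: $dir" >&2; return 1; }
    # Remove both leftovers from the previous non-reuse flash.
    rm -f "$dir/system.img" "$dir/system.img.raw"
    # The raw clone goes in as system.img, not as system.img.raw.
    cp --sparse=always "$clone" "$dir/system.img"
}

# Example (hypothetical paths):
# stage_clone ./clone.img.raw ./Linux_for_Tegra/bootloader
```

Files left by a flash run under sudo are usually owned by root, so the function may itself need to be run as root.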

If I understand correctly: I should flash this system to a working state, make a clone of that, change the UUID, and then flash again with that clone?

Yes…but flash once without changing UUID…then change nothing but UUID, and see if it still works. Verify the clone process when nothing changes…then verify the one thing which is guaranteed to change, but otherwise should not have an effect…UUID. I do not believe UUID change should break anything, but among boot software, this is the one thing which could get in the way if overlooked (there may be a dependency I don’t know about). If it does turn out UUID matters, then you could write down the working UUID first and set your clone to that UUID (incidentally, a dd partition and clone both preserve UUID).

I might be confused, but is there a way to change the UUID of a mounted image? I have flashed the jetson without changing anything so now I just need to change the UUID.

You can view UUID via the blkid command. If you just wanted a unique random UUID, you’d use uuidgen (such as in mass production of cloning). For specific UUID, you’ll want to use tune2fs with the “-U UUID” option parameter, e.g.:

sudo tune2fs -U "SOM-THI-NG" /dev/sdb1

Note that both partitions and devices can have a UUID. We’re interested in that partition, which is under mmcblk0p1 when it is in the eMMC, but would have some other designation when loopback mounted, e.g., “/dev/loop0”.

I was able to clone successfully and flash with the same UUID but I am confused as to how I can change the uuid of the loopback mounted image. When I use tune2fs it says I can not change the UUID of a mounted image but when I unmount it I do not have access to that partition anymore.

Behind the scenes during a loopback mount the file is first “covered” with a loopback device which has the ability to pretend it is a disk…it is later that the loopback device is silently mounted instead of the file itself. While covered (via the losetup command), but not mounted, you should be able to change UUID (or for that matter, almost anything you can do to a disk you can do with the loopback device, e.g., formatting).

Note that you have to use root or sudo for losetup, but basically it goes like this:

# Be root...
sudo -s
# Make sure there is an unused loop device...doing as root will create one if not already existing.
# ...remember which device is shown:
losetup -f
# Cover the image (example will assume this uses "/dev/loop0" because of the previous losetup -f):
losetup -f ./system.img.raw
# See current UUID, remember this value:
blkid /dev/loop0
# ...do the tune2fs based on naming the device, e.g., loop0:
# I have a loopback image with UUID "B6CA7FB6-93C9-4809-8814-D2605ADB93DC", using this as an example...
# ...note I've incremented the last digit in hex from C to D for the example...this is arbitrary:
tune2fs -U 'B6CA7FB6-93C9-4809-8814-D2605ADB93DD' /dev/loop0
# You can later do this to put it back to the original UUID if covered by loop0,
# but you won't do this until testing flash by clone with only UUID changed, and maybe
# you won't even care if the UUID stays changed when done:
tune2fs -U 'B6CA7FB6-93C9-4809-8814-D2605ADB93DC' /dev/loop0
# Uncover the loopback file so flash can proceed:
losetup -d /dev/loop0
# End sudo shell:
exit
# ...do flash stuff with sudo where required...see if it works via the same image
# with only UUID changed...

After a lot of tinkering I have still not been able to get this to work. It appears that changing the UUID does not affect the flash, but after trying to flash with my old system and the new system’s default UUID I have not been able to get it running. My next course of action might just have to be starting from scratch. Unless there is some other option?

Let me verify what we know so far. You can reuse the ordinary image cloned from a fresh flash, with or without UUID changing, and it works. Is that correct? Are the two images, both the new image and the previous image you want to restore from, exactly the same size in bytes? Do the current L4T release and the image being restored share the same L4T version (if not, then adjustments would be required)?

If you cover the old image and new image with loopback (probably one will be /dev/loop0 and the other /dev/loop1 if both are covered at the same time…covering one at a time will probably just reuse one loop device), let’s see if there is a difference in the images’ geometry/setup. Is there a difference between the two in the output of something like the following? (After posting you could attach the log files to the post: hover the mouse over the quote icon in the upper right corner of the post and the attach icon should become visible as a paper clip. You should be able to mouse copy and paste this to run the commands.)

sudo -s
# Substitute clone0 or 1 img.raw with your actual raw image names.
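# "--show" prints which loop device was assigned; the commands below assume
# clone0 got /dev/loop0 and clone1 got /dev/loop1, so adjust if yours differ.
losetup -f --show clone0.img.raw
losetup -f --show clone1.img.raw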
# Log file names are arbitrary...leaving out the "-a" creates the file
# or erases old content...use of "-a" appends without erasing to the log.
gdisk -l /dev/loop0 2>&1 | tee imglog0.txt
gdisk -l /dev/loop1 2>&1 | tee imglog1.txt
lsblk -O /dev/loop0 2>&1 | tee -a imglog0.txt
lsblk -O /dev/loop1 2>&1 | tee -a imglog1.txt
dumpe2fs -h /dev/loop0 2>&1 | tee -a imglog0.txt
dumpe2fs -h /dev/loop1 2>&1 | tee -a imglog1.txt
# Use two separate mount points, create directories or use any mount location.
mkdir -p /mnt/clone0
mkdir -p /mnt/clone1
mount -o ro /dev/loop0 /mnt/clone0
mount -o ro /dev/loop1 /mnt/clone1
sha1sum /mnt/clone0/boot/zImage 2>&1 | tee -a imglog0.txt
sha1sum /mnt/clone1/boot/zImage 2>&1 | tee -a imglog1.txt
ls /mnt/clone0/boot/extlinux/extlinux.conf 2>&1 | tee -a imglog0.txt
ls /mnt/clone1/boot/extlinux/extlinux.conf 2>&1 | tee -a imglog1.txt
umount /dev/loop0
umount /dev/loop1
losetup -D
exit

What I’m looking for is if there may be some sort of difference in how the partitions were created that might account for behavior differences where files present seem to be the same. Also, after flashing with the image you are trying to restore, can you check the serial console to see if there is any output at all during boot which might indicate at which point it hangs, e.g., from a fresh power on until hang?
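Once both log files exist, a plain diff is the quickest way to spot differences. Since the device and mount point names will always differ between the two logs, this sketch masks them first (the helper name is made up for illustration):

```shell
#!/bin/bash
# compare_logs LOG0 LOG1: diff two logs after masking the loop device number
# and the cloneN mount point name, which differ even for identical images.
compare_logs() {
    local a b status
    a=$(mktemp) || return 2
    b=$(mktemp) || { rm -f "$a"; return 2; }
    sed -E 's/loop[0-9]+/loopN/g; s/clone[0-9]+/cloneN/g' "$1" > "$a"
    sed -E 's/loop[0-9]+/loopN/g; s/clone[0-9]+/cloneN/g' "$2" > "$b"
    diff "$a" "$b"
    status=$?
    rm -f "$a" "$b"
    return "$status"
}

# Example, using the log names from the commands above:
# compare_logs imglog0.txt imglog1.txt
```

Expect some harmless differences, such as timestamps and mount counts in the dumpe2fs output; differences in block counts, feature lists, or the zImage checksum are the interesting ones.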

@linuxdev I’m not exactly sure what I did but the flash is now working! I tried your last post again except paying closer attention to which loop contained the image. Just to be safe I then changed the permissions using chmod 744.

After that the flash was successful! The UUID is unchanged from the original jetson and everything seems to be exactly the same.

Thank you for all your help!!!

Glad it worked out…as the saying goes, “the devil is in the details”. I’m sure someone else trying to use clone for production or trying to recover a system will find the information useful.