TX1 & TK1 flash stuck in KVM guest

Hi, this is my first post on this forum, so I’m sorry if I’m unclear. I should describe the project first. I’m an engineering student in embedded systems, and for my graduation work I must build a sort of cluster with TX1, TK1 and other embedded boards. The goal is to support a computer science class, where students would be able to connect remotely and work with many architectures without any risk of hardware mishandling. Since different courses would use it, I must use a master server that can reflash, clone, deploy environments and monitor the boards on remote request.

So here we are: I have the master server running Debian 8 with 3 KVM VPS:

  • An Ubuntu 16.04 desktop to flash the TX1;
  • An Ubuntu 14.04 desktop to flash the TK1;
  • A pfSense for firewall purposes.

I am at the point where the TX1 and TK1 are correctly assigned to the guest OS when in recovery mode: they appear in lsusb as high-speed (480 Mb/s) devices, and JetPack can mount the 15 GB system image without any problems and start the flash, but it gets stuck every time at the same point (not exactly the same point for the TX1 and the TK1)…

lsusb on host (TX1):

root@HEH-SC-Server:/home/sangoku# lsusb
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 003: ID 04f2:1013 Chicony Electronics Co., Ltd 
Bus 001 Device 004: ID 0955:7721 NVidia Corp. 
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
root@HEH-SC-Server:/home/sangoku# lsusb -t
/:  Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/6p, 5000M
/:  Bus 01.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/12p, 480M
    |__ Port 2: Dev 4, If 0, Class=Vendor Specific Class, Driver=, 480M
    |__ Port 3: Dev 3, If 0, Class=Human Interface Device, Driver=usbhid, 12M
    |__ Port 3: Dev 3, If 1, Class=Human Interface Device, Driver=usbhid, 12M

lsusb on the guest (TX1):

sangoku@VPS-Ubuntu-16:~$ lsusb
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 002: ID 0955:7721 NVidia Corp. 
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
sangoku@VPS-Ubuntu-16:~$ lsusb -t
/:  Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/4p, 5000M
/:  Bus 01.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/4p, 480M
    |__ Port 2: Dev 2, If 0, Class=Vendor Specific Class, Driver=, 480M

Flash output:

-- Total: -----------------------------------------------------------
2504 CHUNK 15032385536(3670016 blks) ==>  3356878204(819543 blks)

done.
system.img built successfully. 
Existing tbcfile(/home/sangoku/Desktop/NVidia/64_TX1/Linux_for_Tegra_64_tx1/bootloader/nvtboot_cpu.bin) reused.
copying cfgfile(/home/sangoku/Desktop/NVidia/64_TX1/Linux_for_Tegra_64_tx1/bootloader/t210ref/cfg/gnu_linux_tegraboot_emmc_full.xml) to flash.xml... done.
creating gpt(ppt.img)... 

*** GPT Parameters ***
device size -------------- 31276924928
bootpart size ------------ 8388608
userpart size ------------ 31268536320
Erase Block Size --------- 2097152
sector size -------------- 4096
Partition Config file ---- flash.xml
Visible partition flag --- GP1
Primary GPT output ------- PPT->ppt.img
Secondary GPT output ----- GPT->gpt.img
Target device name ------- none

*** PARTITION LAYOUT(23 partitions) ***
[     BCT] BH            0         8191       4.0MiB 
[     NVC] BH         8192        16383       4.0MiB nvtboot.bin
[     PPT] UH            0         4095       2.0MiB 
[     GP1] UH         4096         8191       2.0MiB 
[     APP] UH         8192     29368319   14336.0MiB system.img
[     TBC] UV     29368320     29372415       2.0MiB nvtboot_cpu.bin
[     EBT] UV     29372416     29380607       4.0MiB u-boot-dtb.bin
[     BPF] UV     29380608     29384703       2.0MiB bpmp.bin
[     WB0] UV     29384704     29396991       6.0MiB warmboot.bin
[     RP1] UV     29396992     29405183       4.0MiB tegra210-jetson-tx1-p2597-2180-a01-devkit.dtb
[     TOS] UV     29405184     29417471       6.0MiB tos.img
[     EKS] UV     29417472     29421567       2.0MiB 
[      FX] UV     29421568     29425663       2.0MiB 
[     SOS] UV     29425664     29466623      20.0MiB 
[     EXI] UV     29466624     29597695      64.0MiB 
[     LNX] UV     29597696     29728767      64.0MiB 
[     DTB] UV     29728768     29736959       4.0MiB tegra210-jetson-tx1-p2597-2180-a01-devkit.dtb
[     NXT] UV     29736960     29741055       2.0MiB 
[     MXB] UV     29741056     29753343       6.0MiB 
[     MXP] UV     29753344     29765631       6.0MiB 
[     USP] UV     29765632     29769727       2.0MiB 
[     UDA] UV     29769728     61067263   15282.0MiB 
[     GPT] UH     61067264     61071359       2.0MiB 
copying flasher(/home/sangoku/Desktop/NVidia/64_TX1/Linux_for_Tegra_64_tx1/bootloader/t210ref/cboot.bin)... done.
Existing flashapp(/home/sangoku/Desktop/NVidia/64_TX1/Linux_for_Tegra_64_tx1/bootloader/tegraflash.py) reused.
*** Flashing target device started. ***
./tegraflash.py --bl cboot.bin --bct P2180_A00_LP4_DSC_204Mhz.cfg --odmdata 0x84000 --bldtb tegra210-jetson-tx1-p2597-2180-a01-devkit.dtb --applet nvtboot_recovery.bin --boardconfig board_config_p2597-devkit.xml --cmd "flash;reboot" --cfg flash.xml --chip 0x21 
Welcome to Tegra Flash
version 1.0.0
Type ? or help for help and q or quit to exit
Use ! to execute system commands
 
[   0.0000 ] Generating RCM messages
[   0.0435 ] tegrarcm --listrcm rcm_list.xml --chip 0x21 --download rcm nvtboot_recovery.bin 0 0
[   0.0442 ] RCM 0 is saved as rcm_0.rcm
[   0.0575 ] RCM 1 is saved as rcm_1.rcm
[   0.0575 ] List of rcm files are saved in rcm_list.xml
[   0.0575 ] 
[   0.0575 ] Signing RCM messages
[   0.0678 ] tegrasign --key None --list rcm_list.xml --pubkeyhash pub_key.key
[   0.0687 ] Assuming zero filled SBK key
[   0.1603 ] 
[   0.1603 ] Copying signature to RCM mesages
[   0.1611 ] tegrarcm --chip 0x21 --updatesig rcm_list_signed.xml
[   0.1621 ] 
[   0.1621 ] Parsing partition layout
[   0.2455 ] tegraparser --pt flash.xml
[   0.3045 ] 
[   0.3045 ] Creating list of images to be signed
[   0.3236 ] tegrahost --chip 0x21 --partitionlayout flash.bin --list images_list.xml
[   0.7276 ] 
[   0.7276 ] Generating signatures
[   0.7298 ] tegrasign --key None --list images_list.xml --pubkeyhash pub_key.key
[   0.7336 ] Assuming zero filled SBK key
[   0.8114 ] 
[   0.8465 ] tegrabct --bct P2180_A00_LP4_DSC_204Mhz.cfg --chip 0x21
[   0.8473 ] Copying Sdram info from 0 to 1 set
[   0.8896 ] Copying Sdram info from 1 to 2 set
[   0.8896 ] Copying Sdram info from 2 to 3 set
[   0.8896 ] 
[   0.8897 ] Updating boot device parameters
[   0.8904 ] tegrabct --bct P2180_A00_LP4_DSC_204Mhz.bct --chip 0x21 --updatedevparam flash.bin
[   0.8977 ] Warning: No sdram params
[   0.8977 ] 
[   0.8977 ] Updating bl info
[   0.9018 ] tegrabct --bct P2180_A00_LP4_DSC_204Mhz.bct --chip 0x21 --updateblinfo flash.bin --updatesig images_list_signed.xml
[   0.9055 ] 
[   0.9055 ] Updating secondary storage information into bct
[   0.9071 ] tegraparser --pt flash.bin --chip 0x21 --updatecustinfo P2180_A00_LP4_DSC_204Mhz.bct
[   0.9105 ] 
[   0.9105 ] Updating board information from board config into bct
[   0.9143 ] tegraparser --boardconfig board_config_p2597-devkit.xml --chip 0x21 --updatecustinfo P2180_A00_LP4_DSC_204Mhz.bct
[   0.9180 ] 
[   0.9182 ] Updating Odmdata
[   0.9191 ] tegrabct --bct P2180_A00_LP4_DSC_204Mhz.bct --chip 0x21 --updatefields Odmdata = 0x84000
[   0.9201 ] Warning: No sdram params
[   0.9202 ] 
[   0.9203 ] Get Signed section bct
[   0.9214 ] tegrabct --bct P2180_A00_LP4_DSC_204Mhz.bct --chip 0x21 --listbct bct_list.xml
[   0.9332 ] 
[   0.9333 ] Signing BCT
[   0.9352 ] tegrasign --key None --list bct_list.xml --pubkeyhash pub_key.key
[   0.9361 ] Assuming zero filled SBK key
[   0.9361 ] 
[   0.9361 ] Updating BCT with signature
[   0.9368 ] tegrabct --bct P2180_A00_LP4_DSC_204Mhz.bct --chip 0x21 --updatesig bct_list_signed.xml
[   0.9396 ] 
[   0.9397 ] Copying signatures
[   0.9406 ] tegrahost --chip 0x21 --partitionlayout flash.bin --updatesig images_list_signed.xml
[   0.9414 ] Run tegrabct to update tboot signature in bct
[   0.9542 ] 
[   0.9543 ] Updating BFS information
[   0.9551 ] tegrabct --bct P2180_A00_LP4_DSC_204Mhz.bct --chip 0x21 --updatebfsinfo flash.bin
[   0.9732 ] 
[   0.9733 ] Boot Rom communication
[   0.9763 ] tegrarcm --chip 0x21 --rcm rcm_list_signed.xml
[   0.9771 ] BR_CID: 0x32101001640d06c92000000009fd0240

As you can see, the TX1 freezes at Boot Rom communication, but this is beyond my knowledge and I can’t work out by myself where it could come from. For the TK1 it’s worse, because it freezes just after the ./tegraflash.py message (there is no “[0.000] Generating RCM messages”)… I tried different KVM USB controllers (ich9 ehci and nec-xhci), and since USB works at the start of the flash I can’t see why it freezes. I hope someone here can tell me what I missed in the process, or perhaps suggest a way/command to access an error log showing what could be wrong. The result is the same if I just use the cloning process.

I can flash without any problem from a VM in VirtualBox on my computer, but I need to flash from the headless server that will sit in the rack cabinet (VNC on VPS). It is impossible to reflash manually every time a student messes up an OS, because there will be 20 TK1 and 4 TX1.

Host config (in case it helps with virtualization limits):

  • Xeon E3-1225;
  • MB Gigabyte MW21-SE0;

Thank you to anyone who can help me!

Note that Ubuntu 14.04 is used for JetPack to perform a flash to either a JTK1 or a JTX1 regardless of whether the final Jetson target is to be Ubuntu 14.04 or Ubuntu 16.04 (though I think an Ubuntu 16.04 host might be possible for flash to a JTX1 under the most recent JetPack…I can’t verify what steps would be required for this to be true).

Also note that if you do not use JetPack, but instead use the driver package plus sample rootfs (or any cloned root partition), then you no longer need JetPack…what you’d need is just any x86_64 Linux host.

The USB halting issue is quite common with a VM. I don’t know enough about VM issues to tell you about the fix, but in almost all cases adjustments to the USB settings have resolved the issue. In a few of the cases I’ve heard of the VM could not be adjusted to resolve this and a different host was used instead. Everyone I’ve heard of working on this via a VM has had a different level of VM configuration knowledge, so success or failure in getting USB to work may have come down to nothing more than VM knowledge; the VM itself might always have been able to work, even in the cases that failed.
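One low-effort way to see whether the Jetson drops off the guest's USB bus during the flash is to record its bus and device number before starting and compare afterwards. The sketch below is only an illustration: the function name `nvidia_usb_devices` is made up here, and it assumes the `lsusb` line format shown earlier in this thread, with the NVIDIA vendor ID 0955 that appears there.

```shell
#!/bin/sh
# Hypothetical helper (not part of JetPack or L4T): list NVIDIA USB devices
# from lsusb-style text on stdin, one "bus:device product-id" per line.
# Assumes lines like: "Bus 001 Device 004: ID 0955:7721 NVidia Corp."
nvidia_usb_devices() {
    awk '$1 == "Bus" && $3 == "Device" && $6 ~ /^0955:/ {
        dev = $4
        sub(/:$/, "", dev)          # strip the trailing colon from "004:"
        split($6, id, ":")          # id[1] = vendor, id[2] = product
        print $2 ":" dev " " id[2]
    }'
}
```

For example, `lsusb | nvidia_usb_devices` could be run on both host and guest before the flash and again once it hangs. A changed device number or a missing line would suggest that the board re-enumerated and the passthrough lost track of it; the kernel log (`dmesg`) on host and guest should show the same events.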

Thank you for your answers. It’s very good news that I can flash without bothering with JetPack and the host OS. But it’s also bad that I can’t use the VM for flashing: when something new is added, they would be obliged to create at least one image outside of the cluster instead of doing it on the remote guest.

Anyway, I tried to use the driver package but I am a little lost… I see mentions of nvflash and tegraflash.py; if I understand correctly, tegraflash.py is the more recent one with a “friendly” terminal, and nvflash isn’t in the driver package anymore, so I tried the cloning method that I saw in one of your topics: https://devtalk.nvidia.com/default/topic/898999/tx1-r23-1-new-flash-structure-how-to-clone-/

But the flash gets stuck at cboot.bin:

root@HEH-SC-Server:/home/sangoku/NVidia/Linux_for_Tegra/bootloader# ./tegraflash.py --bl cboot.bin --applet nvtboot_recovery.bin --chip 0x21 --cmd "read APP clone.img"
Welcome to Tegra Flash
version 1.0.0
Type ? or help for help and q or quit to exit
Use ! to execute system commands
 
[   0.0014 ] Generating RCM messages
[   0.0024 ] tegrarcm --listrcm rcm_list.xml --chip 0x21 --download rcm nvtboot_recovery.bin 0 0
[   0.0031 ] RCM 0 is saved as rcm_0.rcm
[   0.0035 ] RCM 1 is saved as rcm_1.rcm
[   0.0035 ] List of rcm files are saved in rcm_list.xml
[   0.0035 ] 
[   0.0035 ] Signing RCM messages
[   0.0041 ] tegrasign --key None --list rcm_list.xml --pubkeyhash pub_key.key
[   0.0048 ] Assuming zero filled SBK key
[   0.0096 ] 
[   0.0096 ] Copying signature to RCM mesages
[   0.0102 ] tegrarcm --chip 0x21 --updatesig rcm_list_signed.xml
[   0.2100 ] 
[   0.2101 ] Boot Rom communication
[   0.2121 ] tegrarcm --chip 0x21 --rcm rcm_list_signed.xml
[   0.2141 ] BR_CID: 0x32101001640d06c92000000009fd0240
[   0.2150 ] RCM version 0X210001
[   0.2256 ] Boot Rom communication completed
[   1.2332 ] 
[   1.2333 ] Retrieving storage infomation
[   1.2353 ] tegrarcm --oem platformdetails storage storage_info.bin
[   1.2372 ] Applet version 00.01.0000
[   1.2509 ] Saved platform info in storage_info.bin
[   1.3191 ] 
[   1.3192 ] Reading BCT from device for further operations
[   1.3192 ] Sending bootloader and pre-requisite binaries
[   1.3212 ] tegrarcm --download ebt cboot.bin 0 0
[   1.3232 ] Applet version 00.01.0000
[   1.3346 ] File cboot.bin open failed
[   1.3346 ] File cboot.bin open failed
[   1.3346 ] 
Error: Return value 19
Command tegrarcm --download ebt cboot.bin 0 0

So if I understand correctly, either I don’t have the cboot.bin file or it’s not the right bootloader for the board that I want to clone. But what do I have to change? Also, it is noted that tegraflash creates an image of only one partition, so what command do I need to read and reload both (boot + fs)? Actually, the purpose is just to take the filesystem of one board and clone it onto the others.
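The “File cboot.bin open failed” message is consistent with tegrarcm simply not finding cboot.bin in the directory it was run from; the JetPack log earlier in this thread shows flash.sh copying it from bootloader/t210ref/cboot.bin before flashing. A minimal pre-flight sketch, assuming that layout (the function name `missing_flash_files` is hypothetical):

```shell
#!/bin/sh
# Hypothetical pre-flight check: print each required file that is missing
# from the given directory. Returns non-zero if anything is missing.
#   $1   = directory to check (e.g. bootloader)
#   $2.. = file names that must exist in it
missing_flash_files() {
    dir=$1
    shift
    rc=0
    for f in "$@"; do
        if [ ! -f "$dir/$f" ]; then
            printf '%s\n' "$f"
            rc=1
        fi
    done
    return "$rc"
}
```

Run from Linux_for_Tegra as `missing_flash_files bootloader cboot.bin nvtboot_recovery.bin`. If cboot.bin is reported, copying bootloader/t210ref/cboot.bin into bootloader/ before calling tegraflash.py may be enough, though that has not been verified here.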

FYI, a VM can be made to work…but typically a VM is a nuisance to get to work correctly and not reliable. If you know a lot about the VM it might be possible to use this…I wouldn’t advise relying on it.

If you just flash, use something like this first without attempting to restore a clone (use the same L4T version the clone is from…I’m assuming a size that uses most of the eMMC…you can vary this if the clone is from a different size):

sudo ./flash.sh -S 14580MiB jetson-tx1 mmcblk0p1

Then, with the clone in place as “bootloader/system.img”, flash again like this to load the clone:

sudo ./flash.sh -r -S 14580MiB jetson-tx1 mmcblk0p1

When cloning a root partition there may be an implicit requirement that the arrangement of hidden partitions agree with that size…I’m uncertain of any of the details, but flashing once normally, with parameters matching the device the clone came from, is meant to fulfill this requirement.
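The sequence above can be sketched as a dry run that only prints the commands, so nothing touches a board until they have been reviewed. The function name, the argument order and the default size are illustrative assumptions; the flash.sh invocations are the ones quoted above.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print (do not execute) the commands for
# restoring a clone, meant to be run from the Linux_for_Tegra directory.
#   $1 = path to the cloned root image
#   $2 = rootfs size passed to -S (assumed default: 14580MiB, as above)
print_restore_steps() {
    clone=$1
    size=${2:-14580MiB}
    # Step 1: normal flash so the hidden partitions match the chosen size.
    printf 'sudo ./flash.sh -S %s jetson-tx1 mmcblk0p1\n' "$size"
    # Step 2: put the clone where flash.sh looks for the root image.
    printf "sudo cp '%s' bootloader/system.img\n" "$clone"
    # Step 3: flash again with -r so the existing system.img is used.
    printf 'sudo ./flash.sh -r -S %s jetson-tx1 mmcblk0p1\n' "$size"
}
```

Calling `print_restore_steps clone.img` lists the three steps in order; the board has to be back in recovery mode before each flash.sh call.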

Cloning can extract any subset of eMMC (including specific partitions other than root partition). Restore can also flash specific subsets of eMMC (typically root partition is the goal). Letting the flash software do all of the rest of the flash should provide a compatible layout of all of the hidden partitions needed prior to handing off to the root partition.

If you have some specific use-case of cloning and restore it might be useful to know more about the exact requirements, including source and destination L4T versions.