I have a frozen image that I took after developing on a TX2 board. I reflashed it onto the Jetson I am working on, and it works out of the box.
We wanted to use this cloned image on the rest of our Jetson TX2 modules, but unfortunately they fail while booting, even after a successful flash. The TX2 is frozen at the following screen and I couldn’t do much with it. I was also not able to put it back into recovery mode to flash a fresh image. I have completely lost control of it. Any help would be appreciated.
You’ll need to provide the exact clone steps and which L4T release the clone is from, and what you did to install the clone, including if the flash software which created the original L4T install is the same as the release used to restore the clone.
I do see lots of ext4 errors. Do you have both a raw and sparse clone?
@linuxdev Sure.
Thanks for the prompt response.
I used the following command to clone the image: ./flash.sh -r -k APP -G backup.img cti/tx2/quasar mmcblk0p1
Once the cloning is done, I have both the raw and sparse image in the folder.
The clone and the flashed image are both from L4T R32.4 with JetPack 4.4.
Yes, the flash software that created the L4T install is from the same release.
What I did to install the clone is:
I added the saved clone image as the system.img inside the bootloader folder
I ran the following command ./flash.sh -r cti/tx2/quasar mmcblk0p1
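Since the clone step leaves both a raw and a sparse file in the folder, it can help to confirm which of the two is being copied to system.img. The following is a minimal sketch, assuming the sparse file uses the Android sparse image format (whose header starts with the magic number 0xED26FF3A); the function name is hypothetical and this is not part of the flash tools.

```python
import struct

# Magic number at the start of an Android-format sparse image header.
# Assumption: the sparse clone written by the flash tools uses this format.
SPARSE_MAGIC = 0xED26FF3A


def is_sparse_image(path):
    """Return True if the file starts with the sparse image magic number."""
    with open(path, "rb") as f:
        header = f.read(4)
    if len(header) < 4:
        # Too short to hold a header, so it cannot be a sparse image.
        return False
    # The magic is stored as a little-endian 32-bit unsigned integer.
    (magic,) = struct.unpack("<I", header)
    return magic == SPARSE_MAGIC
```

Calling `is_sparse_image("backup.img")` on each of the two clone files would show which one is sparse and which one is raw.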
This is probably the issue. The clone is used only after boot has succeeded up to some point. However, if you have a custom carrier board, then the non-rootfs content would have to be updated for that carrier board. I am guessing that the content surrounding the clone is not correct for that carrier board.
However, this would not change the ability to reach recovery mode. The flash log says you did reach recovery mode, otherwise none of that would have been possible for logging. When you say “I am not able to set back the TX2 to recovery mode”, are you just saying flash fails to create a bootable Jetson? Not reaching recovery mode is far different than not being able to boot.
Btw, you can flash from any board, but the customized boot content in non-rootfs partitions will matter as soon as you actually try to boot.
@linuxdev So you mean this custom content is only updated on the new board by a fresh install? I took a new Jetson and flashed the cloned image directly. At that time I was able to put it in recovery mode properly and flash it, and the log says the flash was successful. As you said, maybe the non-rootfs partitions still do not have the customized configuration, and that is what broke it. How can I solve this?
Once this was done, I powered the device off and turned it on again after a few seconds, and the boot got stuck at the above screen. Then I tried to put the device in recovery mode by holding the recovery button while powering on the device, but this doesn’t help at all. The device does not show up in the lsusb output, not even as a non-recovery device.
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 002: ID 046d:c05b Logitech, Inc. M-U0004 810-001317 [B110 Optical USB Mouse]
Bus 001 Device 003: ID 046d:c31c Logitech, Inc. Keyboard K120
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Do you have any suggestions to solve the issue I have?
The custom content from outside of the rootfs is only updated if you update your flash software. The clone will not preserve this content. If you can flash successfully and get boot to succeed with your flash to the custom carrier board device, then flashing again with the clone should do the job (you might need to specify the rootfs size if it is not default).
A flash puts in a lot of non-rootfs content. This is essentially from small images in other parts of the flash software which are picked based on arguments passed to flash.sh. Unless those images/content are updated for your custom carrier board prior to flash, then there will be no valid flash of that content for that custom carrier board. The flash will succeed, but the content will not boot. Mostly the changes are device tree.
If powering up with recovery held down does not show up on the host PC’s lsusb, then there is indeed an error which goes beyond software. One possibility is the micro-B USB cable just isn’t working (charger cables are notoriously bad for data, do you use the supplied micro-B USB of a dev kit?). Most such failures, when they are hardware and not due to cabling/hubs, are a problem of the carrier board, and not the module. Do you have another carrier board you can test with for recovery mode?
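The lsusb check described above can be scripted on the host PC. This is a minimal sketch, assuming NVIDIA devices enumerate with USB vendor ID 0955 (a TX2 in recovery mode typically shows up as 0955:7c18); the function names are hypothetical.

```python
import re
import subprocess

# NVIDIA's USB vendor ID; a Jetson in recovery mode enumerates under it.
NVIDIA_VENDOR_ID = "0955"

# Matches the "ID vvvv:pppp" field that lsusb prints on every line.
ID_PATTERN = re.compile(r"ID ([0-9a-fA-F]{4}):([0-9a-fA-F]{4})")


def nvidia_devices(lsusb_output):
    """Return (vendor, product) ID pairs of NVIDIA devices found in lsusb text."""
    found = []
    for line in lsusb_output.splitlines():
        match = ID_PATTERN.search(line)
        if match and match.group(1).lower() == NVIDIA_VENDOR_ID:
            found.append((match.group(1).lower(), match.group(2).lower()))
    return found


def read_lsusb():
    """Run lsusb on the host and return its text output."""
    return subprocess.run(
        ["lsusb"], capture_output=True, text=True, check=True
    ).stdout
```

An empty result from `nvidia_devices(read_lsusb())` while the recovery button is held during power-on matches the symptom reported here: the host does not see the module at all.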
@linuxdev Thanks for the clear explanation. So, as per what you said, the non-rootfs content should also correspond to the same part as I have flashed the devices with the same contents on both devices.
The strange part is that I used the same micro-B USB cable to put the device in recovery mode, and it was flashed successfully. Later I used the same cable to put the device back into recovery mode, but then, as I explained, it did not appear in lsusb. So, if it is not a cable issue, then it should be an issue with the carrier board as you said, which is worrying. Yes, I do have another one and I will give it a try. I was just worried that I would mess that one up too and leave it without recovery mode. I will keep you posted.
It sounds weird that the board cannot enter recovery mode anymore after you use the cloned image.
Recovery mode is triggered by the hardware. The software installed on the board should have no effect on recovery mode.
Hello @WayneWWW,
Yes, I tried different USB cables and also a different host machine, and we were unable to get the board into recovery mode. I was successful 5 out of 5 times in putting a working board into recovery mode, but not the ones I flashed with the cloned image.
Yes, I flashed two of them, and both ended up in this condition and could not be put into recovery mode. But I tried it on the same device from which I took the clone, and it works flawlessly. This is what is bothering me.
@WayneWWW The rootfs and non-rootfs files are the same, as they are self-contained inside the Docker container, and I have been able to flash, clone, and re-flash the image successfully from the same container from time to time.
@WayneWWW @linuxdev
Update: I think I know the issue. The MD5SUM of the cloned image I have and the MD5SUM of the image we flashed onto the Jetsons are different. We use different machines to flash the other Jetsons, and the scp transfer to them may not have completed successfully. Upon checking the MD5SUM, we found that the image we flashed onto the Jetsons was not complete.
What could be done in this case? Do you think this would result in this behavior? Is there a way to recover from it? Please let me know.
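The checksum comparison described above can be automated so that a truncated copy is caught before flashing. This is a minimal sketch using Python's standard hashlib; the function names and file paths are hypothetical, and it only compares two local files (the original clone and the copy that arrived over scp).

```python
import hashlib
import os


def file_md5(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read fixed-size chunks so multi-gigabyte images are never
        # loaded into memory all at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def images_match(original_path, copied_path):
    """Return True only if both files have the same size and MD5 digest."""
    # A size mismatch is the cheapest sign of a truncated transfer.
    if os.path.getsize(original_path) != os.path.getsize(copied_path):
        return False
    return file_md5(original_path) == file_md5(copied_path)
```

Running `images_match` on the original clone and the transferred copy before placing the copy at bootloader/system.img would have flagged the incomplete image in this case.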
@WayneWWW
I am holding recovery button while powering on the device, this is the strategy I usually follow to place the device in recovery mode. I am using a customized carrier board.
I can share, but the problem arose because I mistakenly flashed an incomplete cloned image, so I believe any incomplete cloned image might result in this behavior. I flashed the complete proper image with correct MD5SUM and the flashing works perfectly and the device is able to boot properly and I am able to set it up back in recovery mode.
So the culprit here is the incomplete cloned image, which is a few MB smaller than it should be. Do you know if we can recover from this somehow?
Actually, I am more focused on why an incomplete cloned image would cause a recovery mode failure. I am not really sure what the culprit might be.
Please note that we have previously erased the whole eMMC and then flashed, and it still worked.
And I don’t quite get your latest comment here. Are there multiple TX2 modules for you to do this test? So you have one TX2 module that is totally not able to work now?
I flashed the complete proper image with correct MD5SUM and the flashing works perfectly and the device is able to boot properly and I am able to set it up back in recovery mode.