Multiple Tegraflash Commands

The background to this question is that I have modified the eMMC layout to add a new small partition, which I want to be “permanent”. By this, I mean that when updating the device via a reflash, this particular partition shouldn’t be overwritten. This doesn’t seem possible, so the way I thought to do this is to first read back that new partition, and then run a complete reflash as usual, thereby re-writing the partition I just read back. I am also using the OE4T Yocto layer to generate an image, and using the scripts provided there to flash the Orin.

What I’m finding is that a write only works when it is the first command passed to tegraflash.py. For example, these work:
read A_kernel kernel.bin; read A_kernel-dtb dtb.bin; reboot
write A_kernel kernel.bin; read A_kernel-dtb dtb.bin; reboot

But this fails:
read A_kernel kernel.bin; write A_kernel-dtb dtb.bin; reboot
with the following error:

tegrarcm_v2 --new_session --chip 0x23 0 --uid --download bct_br br_bct_BR.bct --download mb1 mb1_t234_prod_aligned_sigheader.bin.encrypt --download psc_bl1 psc_bl1_t234_prod_aligned_sigheader.bin.encrypt --download bct_mb1 mb1_bct_MB1_sigheader.bct.encrypt
Error: Return value 8

After this error, the device still appears in recovery mode over USB (lsusb shows the 0955:7023 NVIDIA Corp. APX device), but the fan is generally spinning hard and the board needs resetting. Other calls to tegrarcm_v2, or re-running the flashing scripts, also end in failure. I cannot see anything different in the serial debug output between these scenarios.

One other related thing I have noticed: if I run a read followed by a recovery reset (e.g. read A_kernel kernel.bin; reboot recovery), the device also ends up in this bad recovery state. The last output on the debug UART is this:

I> Reading A_kernel partition.
I> Rebooting : reboot-recovery


E> NV3P_SERVER: Failed to reset recovery device.

Earlier in the logs for this command, prior to the read taking place, there is a recovery reboot which is successful, with a snippet of the output looking like this:

I> Rebooting : reboot-recovery


[0032.272] I> MB1 (version: 0.32.0.0-t234-54845784-57325615)
[0032.278] I> t234-A01-0-Silicon (0x12347) Prod
[0032.282] I> Boot-mode : Coldboot
[0032.285] I> Emulation: 
[0032.288] I> Entry timestamp: 0x00000000

So I wonder if something happens after the first command, which means it’s not in a clean state even though further reads are possible?

Following this, my questions are:

  1. What can be causing the error when I try a read followed by a write? Is it a fundamental limitation or would different commands result in a success?
  2. Where can I find more information about tegraflash.py, tegrarcm_v2, tegrabct_v2 and so on? I can see very little reference to them in the Linux Developer Guide, let alone specific documentation for them.
  3. Is my original assumption wrong, and it is possible to exclude certain partitions when flashing the entire eMMC? Then I could avoid doing the read followed by write.

Thanks.

Someone else needs to answer, but I want to point out something about flashing and recovery mode: Although there are several commands on the host PC during a flash, you cannot perform two or more flash type operations in a row without resetting the Jetson to a fresh recovery mode each time (within one flash it is a progression of state, but some commands only work in the starting state). The 3P server is basically the state I am referring to.

As examples, you could not clone twice without resetting between clones. You cannot flash twice without resetting between flashes.
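As a rough host-side illustration of the point above, here is a minimal Python sketch that checks whether lsusb lists the recovery-mode device before a flash-type operation is started. The USB ID is the one reported earlier in this thread for the Orin; the function names are made up for this example. Note that, as described in the original post, the device can still show up in lsusb while the 3P server is in a bad state, so this check is necessary but not sufficient.

```python
import subprocess

# USB ID reported by lsusb in this thread for the Orin in recovery mode.
RECOVERY_USB_ID = "0955:7023"


def lists_recovery_device(lsusb_output, usb_id=RECOVERY_USB_ID):
    """Return True if any lsusb line contains the given vendor:product ID."""
    return any(f"ID {usb_id}" in line for line in lsusb_output.splitlines())


def recovery_device_present():
    """Run lsusb on the host and look for the recovery-mode device.

    This only proves the device enumerates over USB. It does not prove that
    the device is in a fresh recovery state.
    """
    result = subprocess.run(["lsusb"], capture_output=True, text=True, check=True)
    return lists_recovery_device(result.stdout)
```

A wrapper script could call recovery_device_present() before each clone or flash and stop early if the device is missing, but it would still need a real reset into recovery mode between operations.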

Hi,

Check the comment by @linuxdev

I don’t think these tools are to be used by users directly, so we don’t document them.

No. The flashing tools clear every partition and do not keep any data on the device, because the partition layout may change, for example between r32 and r35.

But with the term updating, I assume these operations are done when your device is still functioning properly, and it’s not like it gets broken and you want to fix it by re-flashing. If that’s the case, you may try image-based OTA, with which you can set what files you want to preserve after OTA is done:
https://docs.nvidia.com/jetson/archives/r35.3.1/DeveloperGuide/text/SD/SoftwarePackagesAndTheUpdateMechanism.html#back-up-and-restore-files-on-the-app-partition

Thanks both. @DaveYYY I am looking to move to an OTA solution, but I still need this solution in the short-medium term.

That’s interesting, though I’m not sure it fully explains the problem I’m having. Does a write count as two operations, since it erases the partition first? Either way, what’s the difference between write-read working and read-write not working? As far as I can tell, it’s the same number of operations with no fresh recovery in the middle. Also, since a read followed by a recovery reboot seems to fail, is there another way I can refresh the recovery mode between operations?

Hi,

I just noticed that this section may suit your use case:
(See “To clone a Jetson device and flash”)
https://docs.nvidia.com/jetson/archives/r35.3.1/DeveloperGuide/text/SD/FlashingSupport.html

Swap the partition name with the one you define, and make sure the file name is specified in the partition layout file.
Then you don’t need to interact with tegraflash.py directly.

The Jetson is in a given state once recovery mode is started. Some commands will write, and will change the state. I’m certain there are many writes possible, but there is preparation to get to the state where a write works. There are also things which might be disabled by the first write. If you try to clone and then flash without a reboot of recovery mode between clone and flash, then it will fail, but each clone or flash is itself a series of smaller operations (they are chains of commands).
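To make the pattern concrete, here is a small Python sketch of a pre-flight check for a tegraflash.py command string. It encodes only the behaviour reported in this thread (a write failed when it was not the first command, and "reboot recovery" failed directly after a read). It is not a documented rule of tegraflash.py, and the function names are made up for this example.

```python
def split_commands(command_string):
    """Split a ';'-separated command string into lists of tokens."""
    return [part.split() for part in command_string.split(";") if part.strip()]


def find_risky_steps(command_string):
    """Return (index, reason) pairs for steps that failed in this thread.

    Assumption: this mirrors the observations above and nothing more.
    """
    steps = split_commands(command_string)
    problems = []
    for index, tokens in enumerate(steps):
        # A write was only seen to work as the first command of a session.
        if tokens[0] == "write" and index > 0:
            problems.append((index, "write is not the first command"))
        # "reboot recovery" right after a read left the device in a bad state.
        follows_read = index > 0 and steps[index - 1][0] == "read"
        if tokens[:2] == ["reboot", "recovery"] and follows_read:
            problems.append((index, "reboot recovery directly after a read"))
    return problems
```

For the failing example from the first post, find_risky_steps("read A_kernel kernel.bin; write A_kernel-dtb dtb.bin; reboot") flags step 1, while the two working examples produce an empty list.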

@DaveYYY I tried flashing the stock 5.1.1 Ubuntu, and then ran sudo ./flash.sh -r -k A_kernel -G kernel.bu jetson-orin-agx-devkit mmcblk0p1 and it read the partition successfully, but it’s still the same case as before, where the device is no longer in a usable recovery state. E.g. if I just try and run the same command again, it will error out with Error: probing the target board failed. (even though it is present according to lsusb), and I have to physically put the dev kit back into a proper recovery mode before doing anything else.

What I need to be able to do is to read and then write without any manual intervention. Based on my experiments and from what @linuxdev is saying, it seems that what I’m missing is a way to put the device back into a good recovery state before running any other operations?

I think you are right here.
The device has to be put back into recovery mode before you can do anything about flashing on it.

@DaveYYY to be clear, this means that the only option is to do a physical reboot into recovery mode, and the recovery mode options in software utilities such as tegrarcm_v2 are not sufficient?

Yes, and tegrarcm_v2 is just a tool that will be used during flashing, but not for entering recovery mode. What I cannot understand is that if you just need to backup and restore a certain partition, why is

such operation needed? I don’t think you are able to do that.

@DaveYYY as mentioned, what I really want to do is to flash all of the partitions except one. The most logical workaround I can see is to first read back this special partition, and then reflash the entire eMMC (in the process writing the special partition’s data that I just read).

Yes, and tegrarcm_v2 is just a tool that will be used during flashing, but not for entering recovery mode

I mentioned tegrarcm_v2 because of its --reboot recovery option that gets used in the flashing scripts, but really I just mean any way of getting into a clean recovery mode through software.

Hi,

I just tried with tegrarcm_v2 and it failed, so it seems to work only in a flashing environment.
I’d suggest sudo reboot --force forced-recovery, or reading out the partition on your device with dd, transferring the image to your host, flashing the device again, putting the image back on your device, and writing the partition back with dd. That way, you can avoid the need to enter recovery mode.
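A minimal Python sketch of the dd route, for illustration only. It builds the dd command lines rather than running them, and adds a checksum helper so the image can be compared before and after the transfer. Assumptions: the running system exposes GPT partition labels under /dev/disk/by-partlabel, and the partition is called "persist" (a made-up name, substitute the one from your partition layout).

```python
import hashlib

# Assumption: udev creates symlinks named after GPT partition labels here.
PARTLABEL_DIR = "/dev/disk/by-partlabel"


def backup_command(label, image_path):
    """dd command that copies the named partition into an image file."""
    return ["dd", f"if={PARTLABEL_DIR}/{label}", f"of={image_path}", "bs=1M", "conv=fsync"]


def restore_command(label, image_path):
    """dd command that writes an image file back to the named partition."""
    return ["dd", f"if={image_path}", f"of={PARTLABEL_DIR}/{label}", "bs=1M", "conv=fsync"]


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large images do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

On the device, the list from backup_command("persist", "/tmp/persist.img") would be run with root privileges before the reflash, and the one from restore_command afterwards. Comparing sha256_of on both copies of the image is a cheap way to confirm the transfer to and from the host did not corrupt it.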
