Flashing a single partition with ./flash.sh causes a boot loop

I was trying to flash only cboot with the following command. The flash command exits normally, but it puts the module into a boot loop. The only way to recover is to flash everything.

# ./flash.sh -r --no-systemimg -k cpu-bootloader jetson-xavier-nx-devkit mmcblk0p1
[0000.024] W> RATCHET: MB1 binary ratchet value 4 is too large than ratchet level 2 from HW fuses.
[0000.033] I> MB1 (prd-version: 1.5.1.3-t194-41334769-d2a21c57)
[0000.038] I> Boot-mode: Coldboot
[0000.041] I> Chip revision : A02 
[0000.044] I> Bootrom patch version : 15 (correctly patched)
[0000.049] I> ATE fuse revision : 0x200
[0000.053] I> Ram repair fuse : 0x0
[0000.056] I> Ram Code : 0x0
[0000.058] I> rst_source : 0xb
[0000.061] I> rst_level : 0x1
[0000.065] I> Boot-device: QSPI
[0000.067] I> Qspi flash params source = brbct
[0000.071] I> Qspi using bpmp-dma
[0000.074] I> Qspi clock source : pllp
[0000.078] I> QSPI Flash Size = 32 MB
[0000.081] I> Qspi initialized successfully
[0000.085] W> No valid slot number is found in scratch register
[0000.091] W> Return default slot: _a
[0000.094] I> Active Boot chain : 0
[0000.097] I> Boot-device: QSPI
[0000.100] I> Qspi flash params source = brbct
[0000.106] W> MB1_PLATFORM_CONFIG: device prod data is empty in MB1 BCT.
[0000.112] I> Temperature = 45000
[0000.115] W> Skipping boost for clk: BPMP_CPU_NIC
[0000.119] W> Skipping boost for clk: BPMP_APB
[0000.123] W> Skipping boost for clk: AXI_CBB
[0000.127] W> Skipping boost for clk: AON_CPU_NIC
[0000.132] W> Skipping boost for clk: CAN1
[0000.135] W> Skipping boost for clk: CAN2
[0000.140] I> Boot-device: QSPI
[0000.142] I> Boot-device: QSPI
[0000.145] I> Qspi flash params source = mb1bct
[0000.149] I> Qspi using bpmp-dma
[0000.152] I> Qspi clock source : pllc_out0
[0000.156] I> Qspi reinitialized
[0000.159] I> Qspi flash params source = mb1bct
[0000.164] I> ECC region[0]: Start:0x0, End:0x0
[0000.169] I> ECC region[1]: Start:0x0, End:0x0
[0000.173] I> ECC region[2]: Start:0x0, End:0x0
[0000.177] I> ECC region[3]: Start:0x0, End:0x0
[0000.181] I> ECC region[4]: Start:0x0, End:0x0
[0000.185] I> Non-ECC region[0]: Start:0x80000000, End:0x100000000
[0000.191] I> Non-ECC region[1]: Start:0x0, End:0x0
[0000.195] I> Non-ECC region[2]: Start:0x0, End:0x0
[0000.200] I> Non-ECC region[3]: Start:0x0, End:0x0
[0000.204] I> Non-ECC region[4]: Start:0x0, End:0x0
[0000.210] E> FAILED: Thermal config
[0000.217] E> FAILED: MEMIO rail config
[0000.227] I> Boot-device: QSPI
[0000.230] I> Qspi flash params source = mb1bct
[0000.239] I> Qspi flash params source = mb1bct
[0000.251] I> Qspi flash params source = mb1bct
[0000.317] I> Qspi flash params source = mb1bct
[0000.326] I> Qspi flash params source = mb1bct
[0000.353] I> Qspi flash params source = mb1bct
[0000.365] I> MB1 done

main enter
SPE VERSION #: R01.00.14 Created: Sep 19 2018 @ 11:03:21
HW Function test
Start Scheduler.
in late init
[0000.373] I> Welcome to MB2(TBoot-BPMP) (version: 00.00.2018.32-mobile-aa987a31)
[0000.374] I> DMA Heap @ [0x526fa000 - 0x52ffa000]
[0000.374] I> Default Heap @ [0xd486400 - 0xd48a400]
[0000.375] E> DEVICE_PROD: Invalid value data = 70020000, size = 0.
[0000.381] W> device prod register failed
[0000.385] I> Boot-device: QSPI
[0000.387] I> Boot_device: QSPI_FLASH instance: 0
[0000.393] I> QSPI Flash Size = 32 MB
[0000.395] I> Qspi initialized successfully
[0000.399] I> qspi flash-0 params source = boot args
[0000.404] E> Failed: Unknown device 6
[0000.411] I> Found 47 partitions in QSPI_FLASH (instance 0)
[0000.414] W> No valid slot number is found in scratch register
[0000.419] W> Return default slot: _a
[0000.422] I> Active Boot chain : 0
[0000.426] I> parsing oem signed section of bpmp-fw header done
[0000.431] I> bpmp-fw binary init read from storage
[0000.436] I> oem authentication of bpmp-fw header done
[0000.450] I> bpmp-fw binary done read from storage
[0000.451] I> bpmp-fw: Authentication init Done
[0000.452] I> parsing oem signed section of cpubl header done
[0000.455] I> cpubl binary init read from storage
[0000.460] I> bpmp-fw: Authentication Finalize Done
[0000.464] E> Stage2Signature validation failed with SHA2!!
[0000.469] I> load/auth: execution failed
[0000.473] E> Top caller module: LOADER, error module: LOADER, reason: 0x18, aux_info: 0x00
[0000.481] I> AB warm reset

Hi,

What happens if you run the command without “--no-systemimg”? We have not used that parameter before.

./flash.sh -r -k cpu-bootloader jetson-xavier-nx-devkit mmcblk0p1

No difference. Still loops.

The first command, with “--no-systemimg”, would have modified things. Can you cleanly reflash everything and then run the version without “--no-systemimg”? Or are you saying that you had a booting system and the final command left it in a boot loop too?

Correct. I can go back and forth. The “I> AB warm reset” happens before cboot is loaded, let alone the kernel.
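To see which stage fails before the warm reset, it can help to filter the serial console log down to its error-level lines. This is only a minimal sketch, assuming the console output has been saved to a file (the name uart.log and the helper name boot_errors are hypothetical):

```shell
#!/bin/sh
# Print only the error-level lines ("E>") from a boot log read on stdin.
# Leading whitespace before the timestamp is tolerated.
boot_errors() {
    grep -E '^[[:space:]]*\[[0-9]+\.[0-9]+\] E> '
}

# Usage (uart.log is a hypothetical capture of the serial console):
#   boot_errors < uart.log
```

Run against the log at the top of this thread, the last two lines it prints would be the Stage2Signature validation failure and the LOADER error, both of which come right after “cpubl binary init read from storage” and just before the warm reset.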

With a previously running device I want to flash a custom cboot…

./flash.sh -r --no-systemimg -k cpu-bootloader jetson-xavier-nx-devkit mmcblk0p1

puts the device into a boot loop before cboot is even loaded.

./flash.sh -r --no-systemimg jetson-xavier-nx-devkit mmcblk0p1

recovers it with my custom cboot.

Neither the system image nor the cboot version makes any difference.

The current situation is not very clear to me.

Neither the system image nor the cboot version makes any difference.

Do you mean that even the default cboot from JetPack has the problem with the command in #2? Or does the issue only appear after using a custom cboot?

Default cboot has the same issue. Just try it…

$ sudo ./flash.sh -r --no-systemimg -k cpu-bootloader jetson-xavier-nx-devkit mmcblk0p1

I’ve also tried flashing cpu-bootloader_b and bootloader-dtb with the default images with the same result.

When you flash everything, the script erases the entire qspi-nor device before writing to it. If you flash only a single partition, I wonder if that partition’s erase blocks aren’t being erased before the partition is written, resulting in corrupted data.
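The hypothesis above rests on how NOR flash behaves: programming can only change bits from 1 to 0, and only an erase sets them back to 1. The sketch below shows that effect on a single byte. It is a simplified model for illustration, not what the flashing tool actually does, and the function names are hypothetical:

```shell
#!/bin/bash
# Simplified model of one NOR flash byte. Programming can only clear bits
# (1 -> 0), so the stored result is the bitwise AND of old and new contents.
nor_program() {
    local old=$1 new=$2
    printf '0x%02X\n' $(( old & new ))
}

# Erasing sets every bit of the byte back to 1.
nor_erase() {
    printf '0x%02X\n' 0xFF
}

old=0x3C    # stale data left in the cell by an earlier image
new=0xA5    # byte we want to store

without_erase=$(nor_program "$old" "$new")
with_erase=$(nor_program "$(nor_erase)" "$new")

echo "without erase: $without_erase"    # 0x24, not the intended byte
echo "with erase:    $with_erase"       # 0xA5, the intended byte
```

Writing over stale data without an erase leaves a mix of old and new bits, which would be consistent with a signature check failing on the freshly written partition.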

Oh by the way… --no-systemimg still writes the system image.

Hi gtj,

We will try.

Oh by the way… --no-systemimg still writes the system image.

Actually, flash.sh is a script with a long history going back to the TK1 era, and some of its options may no longer be verified.
It seems you insist on using “--no-systemimg”. However, this option has probably not been verified for a long time, so I don’t suggest using it.

Please just use the command below to flash. It should not erase your app partition.

./flash.sh -r -k cpu-bootloader jetson-xavier-nx-devkit mmcblk0p1

We will check the cboot issue internally.

OK, thanks!

Some patches are missing from this release. This will be fixed in the next release.

@WayneWWW
Is it possible to share an ETA for this fix?
Every time I update cboot I have to flash the full image. Is there a faster way to update cboot in QSPI memory?

Thanks,
Husain

Hi,

Two patches are needed. Please try applying them to your flash.sh.

Patch 1

diff --git a/scripts/flash.sh b/scripts/flash.sh
index 407cedd..bcba0fd 100755
--- a/scripts/flash.sh
+++ b/scripts/flash.sh
@@ -910,6 +910,8 @@
 	echo "Board ID(${board_id}) version(${board_version}) sku(${board_sku}) revision(${board_revision})" >/dev/stderr;
 }
 
+ext_target_board_canonical=`readlink -e "${ext_target_board}".conf`
+ext_target_board_canonical=`basename "${ext_target_board_canonical}" .conf`
 source ${ext_target_board}.conf
 
 # set up path variables
@@ -1471,7 +1473,7 @@
 	echo -n "copying initrd(${kernelinitrd})... ";
 	cp -f "${kernelinitrd}" initrd;
 	chkerr;
-	# Code below for the initrd boot. Further details: http://nvbugs/2053323
+	# Code below for the initrd boot. Further details: see 2053323
 	if [ "${target_rootdev}" = "cloning_root" ]; then
 		clone_restore_dir="${LDK_DIR}/clone_restore"
 		if [ ! -f ${clone_restore_dir}/nvbackup_copy_bin.func ]; then
@@ -2300,7 +2302,7 @@
 		;;
 	#
 	# Comment out sc7 support. It is found that sc7 sigheader is different and it needs special handling
-	# See bug http://nvbugs/200617500
+	# See 200617500
 	#
 	# sc7 | sc7_b) target_partfile="${wb0bootname}";
 	#	need_sign=1;
@@ -2381,8 +2383,15 @@
 		if [ ${no_flash} -eq 1 ]; then
 			FLASHARGS="--chip ${tegraid} --cmd \"sign ${target_partfile}\" ";
 		else
-			# issue an erase command before write
-			FLASHARGS+="erase ${target_partname}; ";
+			# Only issue erase command for QSPI device.
+			# The sdmmc erase/trim operation may corrupt other partitions.
+			# See 200565454 and 200615787
+			if [[ "${ext_target_board_canonical}" == "p3509-0000+p3668"* ||
+				"${ext_target_board_canonical}" == "p3448-0000-sd"* ||
+				"${ext_target_board_canonical}" == "p3448-0000-max-spi"* ]]; then
+				# issue an erase command before write
+				FLASHARGS+="erase ${target_partname}; ";
+			fi
 
 			if [ ${need_sign} -eq 1 ]; then
 				# special handling for MB1_BCT and T210

Patch 2

diff --git a/scripts/flash.sh b/scripts/flash.sh
index 8a2e90f..084d566 100755
--- a/scripts/flash.sh
+++ b/scripts/flash.sh
@@ -2348,6 +2348,9 @@
 		if [ ${no_flash} -eq 1 ]; then
 			FLASHARGS="--chip ${tegraid} --cmd \"sign ${target_partfile}\" ";
 		else
+			# issue an erase command before write
+			FLASHARGS+="erase ${target_partname}; ";
+
 			if [ ${need_sign} -eq 1 ]; then
 				# special handling for MB1_BCT and T210
 				if [ "${target_partname}" = "MB1_BCT" ] ||
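Patch 1 decides whether to erase by resolving the board config name through its symlink and matching the result against a list of board name prefixes. The sketch below reproduces that check in isolation so it can be tried outside flash.sh. The temporary directory, the file names in it, and the helper names canonical_board and needs_erase are assumptions made for illustration and are not part of flash.sh:

```shell
#!/bin/bash
# Resolve a board .conf symlink to its canonical name, as Patch 1 does
# with readlink -e followed by basename.
canonical_board() {
    local resolved
    resolved=$(readlink -e "$1.conf") || return 1
    basename "$resolved" .conf
}

# Succeed if the canonical board name starts with one of the prefixes
# for which Patch 1 issues an erase before the write.
needs_erase() {
    local board=$1
    [[ "$board" == "p3509-0000+p3668"* ||
       "$board" == "p3448-0000-sd"* ||
       "$board" == "p3448-0000-max-spi"* ]]
}

# Illustrative layout only: a devkit alias pointing at a canonical config.
demo_dir=$(mktemp -d)
touch "$demo_dir/p3509-0000+p3668-0000-qspi-sd.conf"
ln -s p3509-0000+p3668-0000-qspi-sd.conf "$demo_dir/jetson-xavier-nx-devkit.conf"

board=$(canonical_board "$demo_dir/jetson-xavier-nx-devkit")
if needs_erase "$board"; then
    echo "$board: erase before write"
else
    echo "$board: write without erase"
fi
```

Matching on the resolved name rather than the alias means the check still works when the board is given by a symlinked name on the command line.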

Thanks @WayneWWW. It worked like a charm.