No. Shorting the GPIO and doing a hard reset/power cycle always works.
A few days ago I was able to narrow down the cause and find a workaround, both of which point to scratch register 109 (SCR109).
SCR109 bits 4:5 represent the boot chain to use. On a cold boot, BR reads the current boot slot from the BR-BCT into SCR109; on a warm boot, the value already in the register is used as-is. MB1 and MB2 then read the register to determine which boot chain is in use (see Jetpack docs).
When accessing SCR109 manually using devmem, reading the contents shows the expected results:
# boot chain A active
busybox devmem 0x0C3903CC 32 # result: 0x00000000
# boot chain B active
busybox devmem 0x0C3903CC 32 # result: 0x00000010
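To make the readings above easier to interpret, here is a minimal sketch of decoding the slot from the raw register value. It assumes the boot chain field is exactly bits 4:5 (mask 0x30), as described above; the helper name `scr109_slot` is hypothetical and not part of Jetpack.

```shell
# Decode the boot chain field (bits 4:5) from a raw SCR109 value.
# scr109_slot is a hypothetical helper name, used for illustration only.
scr109_slot() {
    # Shell arithmetic accepts 0x-prefixed hex: mask bits 4:5, then shift down.
    echo $(( ($1 & 0x30) >> 4 ))
}

scr109_slot 0x00000000   # prints 0 (boot chain A)
scr109_slot 0x00000010   # prints 1 (boot chain B)
```

On the device this could be fed directly from the register, e.g. `scr109_slot "$(busybox devmem 0x0C3903CC 32)"`.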
I can reliably reproduce the problem from both active boot slots by simply setting the register before rebooting into RCM:
# always reboots to working RCM, regardless of current boot slot
busybox devmem 0x0C3903CC 32 0x00000000 && reboot forced-recovery
# always reboots to broken RCM, regardless of current boot slot
busybox devmem 0x0C3903CC 32 0x00000010 && reboot forced-recovery
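The working variant above overwrites the whole register with zero. A slightly more conservative sketch would clear only bits 4:5 and write the remaining bits back unchanged. I have not established whether the other bits matter, so treat this as an assumption; the function names `scr109_clear_slot` and `reboot_to_rcm` are hypothetical.

```shell
SCR109_ADDR=0x0C3903CC

# Return the given register value with the boot chain field (bits 4:5) cleared.
# 0xFFFFFFCF is the 32-bit inverse of the 0x30 mask.
scr109_clear_slot() {
    printf '0x%08X\n' $(( $1 & 0xFFFFFFCF ))
}

# Hypothetical wrapper: read SCR109, clear the slot bits, write the result
# back, then reboot into RCM. Assumes busybox devmem can access the register.
reboot_to_rcm() {
    cur=$(busybox devmem "$SCR109_ADDR" 32) || return 1
    busybox devmem "$SCR109_ADDR" 32 "$(scr109_clear_slot "$cur")" \
        && reboot forced-recovery
}
```

Nothing runs until `reboot_to_rcm` is called explicitly.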
The failure is exactly the same as when rebooting to RCM from the B slot (i.e., RCM is entered successfully, but loading the MB2 applet crashes).
My best guess is that when RCM is triggered in hardware (REC/GND), BR goes into RCM before setting SCR109 from the BR-BCT, so the register is always zeroed. On a warm boot to RCM the register retains its old value (non-zero in boot chain B). The MB1 downloaded via USB then reads the register (see “Slot: 1” printed in the MB1 log) and tries to load the MB2/MB2 applet from an invalid location (instead of downloading it via USB as intended).
While this provides a workaround, it relies on devmem being available and able to access the memory. Since our custom hardware does not have external access to the recovery GPIO, rebooting to RCM is the only option for RCM-level operations (e.g., fusing).
Not extensively, but the general test I did (as described in the original post) yielded the same results.