Maybe worth mentioning that the image.img I flash was originally cloned from a Xavier with an NVME. But no OTA rootfs update has been performed on it.
Could you first try the default setting, follow @carolyuu’s steps here, and see if you can achieve the same result?
Default setting means no NVME, no cloned system image, no partition layout change.
This is just to confirm that you can reproduce the same base case as on our side.
Interesting behavior. I completely zeroed the NVME (dd if=/dev/zero of=/dev/nvme0n1 …) and put it back in.
On first boot, still exactly the same: the NVME boot fails, and a ‘Fixed storage boot’ is then done successfully.
I then just created an empty ext4 filesystem on it (on the whole drive, no partition table) and it is again the ‘old behavior’, i.e. an NVME boot is used to boot from the eMMC. Previously we also had an ext4 FS directly on it, with no partition table.
Again, the root PARTUUID and the usbcore option are given. Log attached:
boot_0.log (92.5 KB)
Now I deleted the FS on the NVME again and created a partition table with a single ext4-formatted partition. It again does the ‘Fixed storage boot’ from the eMMC.
Logs:
boot_0.log (38.6 KB)
So there are 2 behaviors:
- No NVME, zeroed NVME, or NVME with a partition table: all use the ‘Fixed storage boot’.
- An ext4 FS directly on the NVME, without a partition table: the NVME boot is used to boot the MMC, even if no files are present on the FS.
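To tell which of these layouts a drive currently has, a small helper along the following lines might be useful. It is only a sketch: the function names are made up for illustration, and the mapping from layout to boot path is just what was observed in this thread, not documented CBoot behaviour. It relies on blkid reporting a TYPE tag for a filesystem and a PTTYPE tag for a partition table.

```shell
# Classify a block device layout from two blkid tags.
# $1 = TYPE tag (filesystem type, may be empty)
# $2 = PTTYPE tag (partition table type, may be empty)
# Function names are hypothetical helpers, not part of L4T.
classify_nvme_layout() {
    if [ -n "$2" ]; then
        echo "partition-table"   # observed here: 'Fixed storage boot'
    elif [ -n "$1" ]; then
        echo "bare-filesystem"   # observed here: 'NVME boot' path is taken
    else
        echo "blank"             # zeroed drive, observed: 'Fixed storage boot'
    fi
}

# Probe a real device with low-level blkid probing (usually needs root).
# $1 = whole-disk device node, e.g. /dev/nvme0n1
probe_nvme_layout() (
    fs_type=$(blkid -p -o value -s TYPE "$1" 2>/dev/null)
    pt_type=$(blkid -p -o value -s PTTYPE "$1" 2>/dev/null)
    classify_nvme_layout "$fs_type" "$pt_type"
)
```

Running `sudo sh -c '. ./nvme_layout.sh; probe_nvme_layout /dev/nvme0n1'` (the script file name is an assumption) before booting should report "bare-filesystem" for the configuration that triggers the problem.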
Yes, I can do that. Probably only tomorrow.
I think the logs below might be needed. Better to dump them during your test.
- Flash with flash.sh with rootfs_ab enabled. Boot up the device; it should have a UUID in the extlinux.conf of each slot. Switch between slot 0 and slot 1 and dump both boot logs.
- After doing the image-based OTA, check if you hit the same issue. Again dump the logs from slot 0 and slot 1.
Reflashed from scratch after deleting the Linux_for_Tegra folder and untarring the jetpack, the rootfs and the otatools again.
No modifications made.
Created the image with:
sudo ROOTFS_AB=1 ./flash.sh -C 'usbcore.usbfs_memory_mb=2048' --no-flash jetson-agx-xavier-devkit mmcblk0p1
and flashed it with:
sudo ROOTFS_AB=1 ./flash.sh -r jetson-agx-xavier-devkit mmcblk0p1
Boot logs of the first boot on slot 0 and slot 1 are attached.
boot_0.log (48.6 KB)
boot_1.log (37.3 KB)
Hi,
Please finish the boot process and check the kernel cmdline in the running kernel, not the one printed by cboot.
After configuring both slots (accepting the license and so on), both slots boot fine, with the correct rootfs mounted. Both /boot/extlinux/extlinux.conf files are correct, and both kernels got the correct cmdline. Only in the cboot logs is the cmdline incomplete, but that is not a problem.
So everything is nominal without any NVME.
Now I could either check a rootfs OTA upgrade or check the behavior with an NVME at this point (zeroed, with an ext4 FS on the full NVME, and with an ext4 FS on a partition on the NVME).
On slot 0:
ansible@xavier:~$ cat /boot/extlinux/extlinux.conf
TIMEOUT 30
DEFAULT primary
MENU TITLE L4T boot options
LABEL primary
MENU LABEL primary kernel
LINUX /boot/Image
INITRD /boot/initrd
APPEND ${cbootargs} quiet root=PARTUUID=3165bbb4-ba80-4986-b83e-ed385a48de4c rw rootwait rootfstype=ext4 console=ttyTCU0,115200n8 console=tty0 fbcon=map:0 net.ifnames=0 rootfstype=ext4 usbcore.usbfs_memory_mb=2048
# When testing a custom kernel, it is recommended that you create a backup of
# the original kernel and add a new entry to this file so that the device can
# fallback to the original kernel. To do this:
#
# 1, Make a backup of the original kernel
# sudo cp /boot/Image /boot/Image.backup
#
# 2, Copy your custom kernel into /boot/Image
#
# 3, Uncomment below menu setting lines for the original kernel
#
# 4, Reboot
# LABEL backup
# MENU LABEL backup kernel
# LINUX /boot/Image.backup
# INITRD /boot/initrd
# APPEND ${cbootargs}
ansible@xavier:~$ cat /proc/cmdline
console=ttyTCU0,115200 video=tegrafb earlycon=tegra_comb_uart,mmio32,0x0c168000 gpt rootfs.slot_suffix= usbcore.old_scheme_first=1 tegraid=19.1.2.0.0 maxcpus=8 boot.slot_suffix= boot.ratchetvalues=0.4.2 vpr_resize sdhci_tegra.en_boot_part_access=1 quiet root=PARTUUID=3165bbb4-ba80-4986-b83e-ed385a48de4c rw rootwait rootfstype=ext4 console=ttyTCU0,115200n8 console=tty0 fbcon=map:0 net.ifnames=0 rootfstype=ext4 usbcore.usbfs_memory_mb=2048
On slot 1:
ansible@xavier:~$ cat /boot/extlinux/extlinux.conf
TIMEOUT 30
DEFAULT primary
MENU TITLE L4T boot options
LABEL primary
MENU LABEL primary kernel
LINUX /boot/Image
INITRD /boot/initrd
APPEND ${cbootargs} quiet root=PARTUUID=b2df52e5-35a5-4186-a67e-fd3b154821fc rw rootwait rootfstype=ext4 console=ttyTCU0,115200n8 console=tty0 fbcon=map:0 net.ifnames=0 rootfstype=ext4 usbcore.usbfs_memory_mb=2048
# When testing a custom kernel, it is recommended that you create a backup of
# the original kernel and add a new entry to this file so that the device can
# fallback to the original kernel. To do this:
#
# 1, Make a backup of the original kernel
# sudo cp /boot/Image /boot/Image.backup
#
# 2, Copy your custom kernel into /boot/Image
#
# 3, Uncomment below menu setting lines for the original kernel
#
# 4, Reboot
# LABEL backup
# MENU LABEL backup kernel
# LINUX /boot/Image.backup
# INITRD /boot/initrd
# APPEND ${cbootargs}
ansible@xavier:~$ cat /proc/cmdline
console=ttyTCU0,115200 video=tegrafb earlycon=tegra_comb_uart,mmio32,0x0c168000 gpt rootfs.slot_suffix=_b usbcore.old_scheme_first=1 tegraid=19.1.2.0.0 maxcpus=8 boot.slot_suffix=_b boot.ratchetvalues=0.4.2 vpr_resize sdhci_tegra.en_boot_part_access=1 quiet root=PARTUUID=b2df52e5-35a5-4186-a67e-fd3b154821fc rw rootwait rootfstype=ext4 console=ttyTCU0,115200n8 console=tty0 fbcon=map:0 net.ifnames=0 rootfstype=ext4 usbcore.usbfs_memory_mb=2048
boot_0.log (37.2 KB)
boot_1.log (37.3 KB)
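The manual comparison above (root= in extlinux.conf versus root= in /proc/cmdline) can be repeated on every boot with a short script. The following is a minimal sketch; the function names are hypothetical, and it assumes the first uncommented APPEND line in extlinux.conf belongs to the entry that is actually booted.

```shell
# Print the value of the root= token found in a kernel command line string.
# If several root= tokens are present, the last one is printed.
root_arg() (
    set -f                       # no glob expansion while word-splitting
    val=""
    for tok in $1; do
        case "$tok" in
            root=*) val=${tok#root=} ;;
        esac
    done
    printf '%s\n' "$val"
)

# Compare the root= of the first uncommented APPEND line in an extlinux.conf
# with the root= of the running kernel's command line.
# $1 = path to extlinux.conf, $2 = path to cmdline file (e.g. /proc/cmdline)
check_root_matches() {
    conf_root=$(root_arg "$(sed -n 's/^[[:space:]]*APPEND[[:space:]]*//p' "$1" | head -n 1)")
    live_root=$(root_arg "$(cat "$2")")
    if [ -n "$conf_root" ] && [ "$conf_root" = "$live_root" ]; then
        echo "match: $live_root"
    else
        echo "MISMATCH: extlinux.conf root=$conf_root, kernel root=$live_root"
        return 1
    fi
}
```

On the device this would be called as `check_root_matches /boot/extlinux/extlinux.conf /proc/cmdline`. For the two nominal boots shown above it should print a match with the slot's PARTUUID.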
Continued testing.
Generated an OTA rootfs update from the same image.bin.raw that the initial flash.sh command produced. Not a cloned image.
Successfully updated partition B from partition A, and partition A from partition B.
Everything seems nominal. The kernel cmdline is still correct and the correct partition is mounted as rootfs.
Logs not attached, as there is no difference from before the update.
NVME test 1
I then put back the zeroed NVME (no FS, no partition table, just zeros).
Booting slot 0: everything nominal
Booting slot 1: everything nominal
Updating slot 1 from slot 0: everything nominal
Updating slot 0 from slot 1: everything nominal
(correct boot logs, correct rootfs, and correct kernel cmdline)
Logs not attached, as there is no difference from before.
NVME test 2
A partition table has been created on the NVME, with a single ext4-formatted partition on it.
The same 4 tests have been done again; again everything is nominal.
Logs not attached, as there is no difference from before.
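For reference, the layout used in this test could be created roughly as sketched below. The thread does not say which partition table type or tools were used, so GPT, wipefs, parted and mkfs.ext4 are assumptions, and the function name is hypothetical. Because these commands destroy all data on the drive, the function only prints them instead of running them.

```shell
# Print (do NOT run) the commands that would put a partition table with a
# single ext4 partition on an NVME drive.
# $1 = whole-disk device node, e.g. /dev/nvme0n1
# Assumptions: GPT label, one partition spanning the drive, NVME-style
# partition naming (<device>p1).
plan_partitioned_nvme() {
    dev=$1
    echo "wipefs -a $dev"                                              # clear old signatures
    echo "parted --script $dev mklabel gpt mkpart primary ext4 0% 100%" # table + one partition
    echo "mkfs.ext4 ${dev}p1"                                           # format the partition
}
```

Calling `plan_partitioned_nvme /dev/nvme0n1` prints three lines that can be reviewed and then run manually as root.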
NVME test 3
As in our initial configuration, an ext4 FS has been created directly on the NVME, without using a partition.
Booting slot 0: Boots fine, but again with the ‘NVME’ boot in the boot logs
boot_0.log (92.3 KB)
ansible@xavier:~$ cat /proc/cmdline
console=ttyTCU0,115200 root=/dev/mmcblk0p1 rw rootwait rootfstype=ext4 console=ttyTCU0,115200n8 console=tty0 fbcon=map:0 net.ifnames=0 rootfstype=ext4 video=tegrafb earlycon=tegra_comb_uart,mmio32,0x0c168000 gpt rootfs.slot_suffix= usbcore.old_scheme_first=1 tegraid=19.1.2.0.0 maxcpus=8 boot.slot_suffix= boot.ratchetvalues=0.4.2 vpr_resize sdhci_tegra.en_boot_part_access=1
Booting slot 1: Boots on the wrong rootfs, again with the ‘NVME’ boot in the boot logs
boot_1.log (92.4 KB)
ansible@xavier:~$ cat /proc/cmdline
console=ttyTCU0,115200 root=/dev/mmcblk0p1 rw rootwait rootfstype=ext4 console=ttyTCU0,115200n8 console=tty0 fbcon=map:0 net.ifnames=0 rootfstype=ext4 video=tegrafb earlycon=tegra_comb_uart,mmio32,0x0c168000 gpt rootfs.slot_suffix=_b usbcore.old_scheme_first=1 tegraid=19.1.2.0.0 maxcpus=8 boot.slot_suffix=_b boot.ratchetvalues=0.4.2 vpr_resize sdhci_tegra.en_boot_part_access=1
Before test 3 I didn’t reflash the board from scratch, so both slots are the ones that were OTA rootfs upgraded in test 2.
If reflashed, booting from slot 1 would probably boot the correct partition, but still via the ‘NVME boot’.
Summary for now
If an NVME is present and has an ext4 FS directly on it (no partition table), the boot behaviour changes.
The bootloader then uses its ‘NVME boot’ mechanism to boot the MMC. This can still boot fine on both slots 0 and 1 on a freshly flashed Xavier, but fails to boot slot 1 correctly after an OTA rootfs update, even if the update was done in a configuration where both slots worked fine.
With all this I’ll retry flashing my original custom A/B rootfs and perform an OTA rootfs update on it again. The only thing I’ll change is to create a partition table with an ext4 partition on the NVME instead of putting the ext4 FS directly on the NVME. Let’s see if that works.
There seems to be a ‘boot control’ partition on Xaviers that could be used to configure the boot options, so maybe it would be possible to disable the ‘NVME bootloader’ through it.
I think cboot simply reads the extlinux.conf from the FS on the NVMe, and that extlinux.conf does not contain the UUID.
No, there is no extlinux.conf file on the NVME and there has never been one. Not even a boot folder on it.
It only contains data from our own software.
I even deleted all the files on it… no directory, no file. The problem still persists.
So CBoot’s NVME loader somehow falls back to booting the MMC in some situations instead of failing.
In any case I reflashed our custom A/B image on it and managed an OTA rootfs update through Mender. The only change from the initial situation is that the NVME contains a partition table with an ext4 FS in a partition instead of an ext4 FS directly on the drive.
CBoot really gets confused just by the presence of an ext4-formatted NVME, even without any data on it.
Thanks for pointing out the NVME issue; I would never have suspected it of causing such changes in the boot behaviour.
I read the log again. So it is not that cboot reads the extlinux.conf on the NVMe. It tries to read it, but fails. It then loads the kernel from the kernel partition and never looks at any extlinux.conf again. That kernel comes from the kernel partition, not from APP.
So the kernel cmdline comes from cbootargs only. But rootfs A/B expects the UUID from extlinux.conf, so it is broken.
That seems to be the situation.