MB2 still trying to read CVB EEPROM after changing misc dts

Hello,

I am flashing an Orin NX 8GB on a custom carrier board using JetPack 5.1.2. I ran into the EEPROM failure documented in the 35.4.1 Developer Guide and made the change shown in the documentation:

Linux_for_Tegra/bootloader/t186ref/BCT/tegra234-mb2-bct-misc-p3767-0000.dts

/dts-v1/;

#include "tegra234-mb2-bct-common.dtsi"

/ {
	mb2-misc {
		eeprom {
			cvm_eeprom_i2c_instance = <0>;
			cvm_eeprom_i2c_slave_address = <0xa0>;
			cvm_eeprom_read_size = <0x100>;
			cvb_eeprom_i2c_instance = <0x0>;
			cvb_eeprom_i2c_slave_address = <0xae>;
			cvb_eeprom_read_size = <0x0>;
		};
	};
};

The command I used to flash the nvme drive:

sudo ./tools/kernel_flash/l4t_initrd_flash.sh --external-device nvme0n1p1 -c tools/kernel_flash/flash_l4t_external.xml -p "-c bootloader/t186ref/cfg/flash_t234_qspi_nvme.xml" --showlogs --network usb0 boardname internal

The configuration for my board is the same as a regular devkit, with some light modifications:

source "${LDK_DIR}/p3768-0000+p3767-0000.conf";


EMMC_CFG="flash_t234_qspi_nvme.xml";

if [ "${UPHYLANE}" = "c7x1" ]; then
	UPHYLANE_CONFIG="tegra234-mb1-bct-uphylane-si-c7x1.dtsi";
	EMMC_CFG="flash_t234_qspi_nvme_c7.xml";
elif [ "${UPHYLANE}" = "c7x2" ]; then
	UPHYLANE_CONFIG="tegra234-mb1-bct-uphylane-si-c7x2.dtsi";
	EMMC_CFG="flash_t234_qspi_nvme_c7.xml";
fi


###########################
#  customizations                            
###########################

DTB_FILE="boardname.dtb";
PINMUX_CONFIG="boardname_pinmux.dtsi";
PMC_CONFIG="boardname_padvoltage.dtsi";

After making these changes, I recompiled the kernel and flashed the board. The flash log is attached:
flash_3-4_0_20240821-152431.log (41.3 KB)

Unfortunately, MB2 is still attempting to read this EEPROM, despite my changes.

I> Task: Prepare eeprom data (0x50018ac4)
E> I2C: slave not found in slaves.
E> I2C: Could not write 0 bytes to slave: 0x00ae with repeat start true.
E> I2C_DEV: Failed to send register address 0x00000000.
E> I2C_DEV: Could not read 256 registers of size 1 from slave 0xae at 0x00000000 via instance 0.
E> eeprom: Failed to read I2C slave device
C> Task 0x0 failed (err: 0x1f1e050d)
E> Top caller module: I2C_DEV, error module: I2C, reason: 0x0d, aux_info: 0x05
I> Busy Spin

I have tried other solutions, but nothing so far has worked.
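For anyone who wants to probe the bus directly: the addresses in the dts look like 8-bit (write) addresses, so the 7-bit address that a tool such as i2cdetect would report is the dts value shifted right by one bit. That the dts uses the 8-bit form is an assumption on my part, and `to_7bit` is just an illustrative helper name.

```shell
#!/usr/bin/env bash
# Convert an 8-bit I2C address (assumed dts notation) to its 7-bit form.
# to_7bit is a hypothetical helper, not part of any NVIDIA tool.
to_7bit() {
    printf '0x%02x\n' $(( $1 >> 1 ))
}

to_7bit 0xae   # CVB EEPROM address from the dts, prints 0x57
to_7bit 0xa0   # CVM EEPROM address from the dts, prints 0x50
```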

There is also a copy of tegra234-mb2-bct-misc-p3767-0000.dts in Linux_for_Tegra/bootloader/. Please make sure that one was modified too.

I had changed that one also. Same result unfortunately.
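A quick way to confirm that every copy of the file carries the same value is to print the property from each one. This is only a sketch: the function name `show_cvb_read_size` is made up for illustration, and it assumes GNU find, xargs, and grep on the host.

```shell
#!/usr/bin/env bash
# Print the cvb_eeprom_read_size line from every copy of a BCT misc file
# found under a directory. show_cvb_read_size is an illustrative name.
show_cvb_read_size() {
    local root="$1" name="$2"
    # Locate every copy of the file, then print the property with its file name.
    find "$root" -type f -name "$name" -print0 |
        xargs -0 -r grep -H 'cvb_eeprom_read_size' | sort
}

# Example, run from Linux_for_Tegra:
#   show_cvb_read_size bootloader tegra234-mb2-bct-misc-p3767-0000.dts
```

If the copies disagree, the differing values show up side by side in the output.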

Here is a grep of the source for more instances of cvb_eeprom_read_size:

user@arrakis:~/nvidia/nvidia_sdk/JetPack_5.1.2_Linux_JETSON_ORIN_NX_TARGETS/Linux_for_Tegra$ grep -wrn cvb_eeprom_read_size
Binary file bootloader/tegrabct_v2 matches
bootloader/tegra234-mb2-bct-misc-p3767-0000.dts:13:			cvb_eeprom_read_size = <0x0>;
bootloader/tegra234-mb2-bct-common.dtsi:41:            cvb_eeprom_read_size = <0x0>;
bootloader/t186ref/BCT/tegra194-mb1-bct-misc-l4t.cfg:125:eeprom.cvb_eeprom_read_size = 0;
bootloader/t186ref/BCT/tegra234-mb2-bct-misc-p3767-0000.dts:13:			cvb_eeprom_read_size = <0x0>;
bootloader/t186ref/BCT/tegra194-mb1-bct-misc-l4t-maxn.cfg:120:eeprom.cvb_eeprom_read_size = 0;
bootloader/t186ref/BCT/tegra194-mb1-bct-misc-flash.cfg:122:eeprom.cvb_eeprom_read_size = 0;
bootloader/t186ref/BCT/tegra234-mb2-bct-misc-p3701-0002-p3711-0000.dts:12:            cvb_eeprom_read_size = <0x0>;
bootloader/t186ref/BCT/tegra194-mb1-bct-misc-sd-l4t.cfg:125:eeprom.cvb_eeprom_read_size = 0;
bootloader/t186ref/BCT/tegra234-mb2-bct-misc-p3701-0002-p3740-0002.dts:41:            cvb_eeprom_read_size = <0x100>;
bootloader/t186ref/BCT/tegra194-mb1-bct-misc-l4t-jaxi.cfg:116:eeprom.cvb_eeprom_read_size = 256;
bootloader/t186ref/BCT/tegra194-mb1-bct-misc-flash-jaxi.cfg:116:eeprom.cvb_eeprom_read_size = 256;

I had set these all to 0 when I started seeing issues. The remaining ones are for EEPROMs at different addresses, so I didn’t touch those. What else can I try?

Do you have the full uart log to provide?

Yes, here it is:
uart_log.txt (23.5 KB)

There is one minor mistake in the flash command.

sudo ./tools/kernel_flash/l4t_initrd_flash.sh --external-device nvme0n1p1 -c tools/kernel_flash/flash_l4t_external.xml -p "-c bootloader/t186ref/cfg/flash_t234_qspi.xml" --showlogs --network usb0 jetson-orin-nano-devkit internal

Please use the command above.

Hi Wayne,

The qspi variant fails to flash the target. I originally tried it and ended up switching to the *_nvme variant to make it work.

Attached is the flash log of the command you provided, and also a screenshot of the nvme after the flash process concludes. You can see there are no partitions on it.
qspi_flash_log.txt (302.7 KB)

Would we be able to verify if the boot loaders are stale copies that aren’t being overwritten by the flashing process? That is one reason I can think of why these changes aren’t taking effect. The other is that building from source somehow overrides these settings.

What we have found is that the Orin flashing process fails unless we manually delete the partitions from the NVME drive before flashing. We have been deleting the partitions but not formatting the drive completely. I am attempting a flash after formatting the drive completely and will post the results.

I wonder if this is some known issue in earlier releases. Could you also test rel35.5?

I think my hypothesis was correct, and that the flashing process was not erasing anything. I could erase the partitions and still boot into MB2.

An update: I found a command that flashes an Orin NX + NVMe combination and works completely. It handles the NVMe drive, and I no longer have to manually erase partitions. Here it is:

sudo ./tools/kernel_flash/l4t_initrd_flash.sh --external-device nvme0n1p1 -c tools/kernel_flash/flash_l4t_t234_nvme.xml -p "-c bootloader/t186ref/cfg/flash_t234_qspi_nvme.xml -d ./kernel/dtb/<boardname>.dtb" --showlogs --network usb0 <boardname> internal

However, during flashing the output shows it writing <boardname>.dtb into the extlinux.conf file for the primary rootfs, but the flashed system does not reflect that. It just places the default dtb into /boot/ and extlinux.conf points to it. I can open a different ticket for that.
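To compare what the flash log claims with what actually landed on the rootfs, the FDT entries can be pulled out of extlinux.conf on the target. A minimal sketch, assuming the usual /boot/extlinux/extlinux.conf location; `list_fdt_entries` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Print the dtb path from each "FDT <path>" line in an extlinux.conf.
# list_fdt_entries is an illustrative name; the default path is an assumption.
list_fdt_entries() {
    local conf="${1:-/boot/extlinux/extlinux.conf}"
    # awk splits on whitespace, so indented FDT lines match and
    # commented-out lines (first field "#FDT" or "#") do not.
    awk '$1 == "FDT" { print $2 }' "$conf"
}

# Example, run on the flashed target:
#   list_fdt_entries
```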

It would be very useful if the SDK manager would print the full command that it uses to flash the devkit at the start of the process so that users could use it as a reference.

Hi,

I think this issue requires further discussion.

Just to clarify first.

  1. What are the exact steps to reproduce your issue?
  2. Can your issue be reproduced on an NV devkit? Ignoring the CVB EEPROM is also a possible configuration on an NV devkit.
  3. What happens if you don’t add the -d dtb option to the command?
  4. Which JetPack version are you using in the end?

One point to mention: no other users have reported this issue to us before. That is why we need your info here so that we can fix it on our side.

To clarify, the EEPROM issue seems to have been caused by the flashing process not working. With an Orin NX + nvme, if there are existing partitions, the manual flashing commands described in all of the jetpack 5.1.2 instructions that I have seen will fail. Once the changes were actually written to the nvme with this new flashing command, the MB2 CVB EEPROM check worked fine.

  1. To reproduce:

You need a pre-flashed drive with incorrect CVB EEPROM settings. Then try to use either the manual flash command I first posted, or use the SDK manager to try to flash. The flash process will fail. Then put the drive into a USB enclosure, manually delete the partitions (but do not format). Try the same flash process again and it will pass, but the EEPROM issue will still be present (because I believe nothing is being written to the drive, despite the flash process indicating it succeeded. See the original logs).

You can delete the partitions manually again and try to boot from the drive. For me, the bootloaders came up. After formatting the drive completely and re-doing the original flash process, the EEPROM issue went away.

Next, use the updated flash command on a pre-flashed nvme drive… it handles the drive without the need to format or remove partitions.

  2. I didn’t try to reproduce the EEPROM issue on the devkit, but I have seen flash failures with a pre-flashed NVMe drive, so my assumption is that you could do the same without a custom carrier.
  3. The same thing happens with or without the -d flag.
  4. JetPack 5.1.2 for everything mentioned in this ticket.
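For reference, clearing a drive completely (rather than only deleting its partitions) could look like the sketch below. The choice of wipefs and sgdisk is an assumption, since the exact tools used are not stated above, and `print_wipe_commands` is a made-up helper that only prints the commands so nothing is erased by accident.

```shell
#!/usr/bin/env bash
# Dry-run helper: print the commands that would clear filesystem and
# partition-table signatures from a drive. Nothing is executed.
# print_wipe_commands is an illustrative name; the tool choice is an assumption.
print_wipe_commands() {
    local dev="$1"
    case "$dev" in
        /dev/*) ;;
        *) echo "expected a /dev/... path, got: $dev" >&2; return 1 ;;
    esac
    echo "lsblk $dev"                  # inspect the current layout first
    echo "sudo wipefs --all $dev"      # remove filesystem and RAID signatures
    echo "sudo sgdisk --zap-all $dev"  # destroy GPT and MBR structures
}

# Example, with the drive in a USB enclosure (double-check the device name):
#   print_wipe_commands /dev/sdX
```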

Thank you

Hi,

So for a NV devkit case, I can try to change the EEPROM settings too and see if it reads the CVB eeprom or not, right?

According to what you said, if I don’t format the NVMe drive and leave the pre-flashed content on it, then my change to the MB2 CVB EEPROM read size will never take effect, whether I use the flash command from the quick start guide (the one I posted in my previous comment) or sdkmanager?
Is this the issue you are trying to report here?

Can you still reproduce this issue consistently on your side?

Also, have you tried jp5.1.3?

Hi,

I notice one mistake in your previous log.

In this qspi_flash_log.txt, the flash command you are using is wrong because you used ” (a curly quote) instead of " (a straight quote).

[image: the incorrect command]

The correct one should be:

[image: the corrected command]

Please correct this command first and check whether you can still reproduce the issue.
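Commands copied from a web page often pick up typographic quotes or dashes. A small check like the one below can catch that before running the flash script. This is a sketch only; `has_smart_punctuation` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Return 0 if the text contains curly quotes or an en dash, which the
# shell does not treat as quoting or option syntax.
# has_smart_punctuation is an illustrative name.
has_smart_punctuation() {
    printf '%s' "$1" | grep -qF -e '“' -e '”' -e '‘' -e '’' -e '–'
}

# Example:
#   if has_smart_punctuation "$cmd"; then
#       echo "command contains curly quotes or an en dash" >&2
#   fi
```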

This topic is odd, as many other users run the same commands we shared without any problem. They don’t need to manually erase the NVMe drive to make flashing work (on custom boards).

Thank you, Wayne. Yes, we agree it is odd that the process seems so fragile. Sometimes it works and sometimes it does not, and not just on our carrier board: flashing an all-NVIDIA setup (NX SOM + Nano carrier devkit) with an NVMe drive also works only sometimes.

Regarding the CVB EEPROM boot issue, we have seen the same hardware and boot image work with one SOM and not another.

Is there a way to just flash the bootloaders including UEFI? That might help us experiment more with the flashing process.

Regarding the devkit flashing issue… deleting the JetPack_5.1.2_Linux_JETSON_ORIN_NX_TARGETS folder and forcing a re-download/re-copy of the files seems to fix an issue that we see where the flashing process fails due to a USB timeout error. Are there particular files that would put the process into such a state?

There is a known issue where flashing Orin hits a USB timeout error if the host PC has USB autosuspend enabled.
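On a Linux host, the autosuspend setting can be checked through the usbcore module parameter. The sketch below reads it and reports the state; `autosuspend_state` is an illustrative helper name, and the sysfs path is the standard usbcore one.

```shell
#!/usr/bin/env bash
# Report the usbcore autosuspend setting: a negative value means
# autosuspend is disabled, any other value is the idle delay in seconds.
# autosuspend_state is an illustrative name.
autosuspend_state() {
    local param="${1:-/sys/module/usbcore/parameters/autosuspend}"
    local value
    value=$(cat "$param") || return 1
    if [ "$value" -lt 0 ]; then
        echo "disabled"
    else
        echo "enabled (${value}s)"
    fi
}

# To disable autosuspend until the next reboot (needs root):
#   echo -1 | sudo tee /sys/module/usbcore/parameters/autosuspend
```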