Jetson Orin NX suddenly not booting: module EEPROM error, busy spin error

Hi,

We shipped a pluggable Jetson Orin NX VNX module, which sits inside a chassis at the customer site as part of another system.

This unit had been working fine for months, but we received a complaint that it has suddenly stopped booting, with some I2C EEPROM and busy spin errors.

  1. Please let us know why this issue occurred so suddenly.

Below is the snapshot for reference:

  1. Please note that, before flashing, we had already changed the device tree to set the EEPROM size to 0 as per the link below, because our custom carrier board does not have a CVB EEPROM.

From my investigation, this is a CVM EEPROM error. How can we solve it now?

  1. Does this require a hardware change, a software change in the pinmux, or both?

  2. Meanwhile, we have asked the customer for the serial logs for further debugging.

  3. While searching the forum, we also found a thread

where booting with the Ethernet cable removed reportedly fixed the issue. Any idea why that would have worked?

  1. Will reflashing the OS fix the issue?

Thanks,
Nagesh R

Hi nagesh_accord,

Are you using the devkit or a custom board for Orin NX?
What's the JetPack version in use?

Please share the full log as a file here instead of a screenshot or picture.

The current screenshot is not clear. Do you have any other I2C device connected?

We don't think Ethernet would affect the I2C bus, but you can still ask the customer to run a similar test to check whether they hit the same issue.

Thanks for the update. Please find our responses below:

Custom board.

OS: L4T 36.3.0
JetPack 6.0

We have asked our customer to provide the full log. Once we receive it, we will upload it here.

According to the customer, they have not connected any other I2C device.

We have asked them to run the test with the Ethernet cable removed.

Okay, please provide them once you get the results.

We would also like to know if the issue happens only on a specific module (since the EEPROM is inside the module).
If so, please also let us know the failure rate.

Please find the boot logs attached below.
vnx1_gpu_boot.txt (51.7 KB)

vnx1_gpu-with-eth_boot.txt (51.7 KB)
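
For anyone comparing the two attached logs, a quick way to pull out the relevant lines is a case-insensitive search for I2C and EEPROM messages. The sketch below is only illustrative: the search pattern is an assumption about how the errors are worded, not the exact strings printed by the bootloader, and the helper names are hypothetical.

```shell
#!/bin/sh
# Print, with line numbers, every log line that mentions I2C, EEPROM, or "busy".
# The pattern is an assumption; adjust it to the exact wording in your log.
filter_i2c_errors() {
    grep -n -i -E 'i2c|eeprom|busy' "$1"
}

# Count the matching lines so that two boot logs can be compared quickly.
count_i2c_errors() {
    grep -c -i -E 'i2c|eeprom|busy' "$1"
}

# Usage (file names are the attachments above):
#   filter_i2c_errors vnx1_gpu_boot.txt
#   count_i2c_errors vnx1_gpu-with-eth_boot.txt
```

If both logs show the same matching lines, that would suggest the Ethernet cable makes no difference on this unit.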

As per the latest information, this issue has occurred on only one of the 4 modules delivered to the customer in July 2024.

This module had been working fine for almost a year; then this boot error suddenly occurred, and the unit no longer boots.
They might send the unit back to us for RMA.

Please let us know:

  1. The root cause. Is it a hardware issue or a software issue?
  2. How do we troubleshoot this and make sure it won't occur on other modules in the future?

Thanks,
Nagesh R

It sounds more like a HW issue, as only one module has hit this issue and the EEPROM is inside the module itself.

Does it hit the same I2C error on every boot?
I would also like to know whether reflashing helps to recover from this failed state.

We have asked the customer to ship the module to our site so that we can troubleshoot it. We will update you once we receive the module at our site.
It may take 2-3 weeks. Until then, we kindly request that you keep this thread open.

Yes. It now seems to hit the same issue on every boot.

We will try reflashing once the unit reaches our site.

Thanks for the support.

It seems something is wrong with the EEPROM inside the SoM.

Sure, please let us know the result.

If the topic is closed automatically after 2 weeks, you can simply create a new topic, link to the current one, and @ me for a further check.
Thanks.

Ok. We will try to check once we receive the unit for RMA.

Ok

Ok. Thanks.

Is this still an issue that needs support? Are there any results you can share?

We have still not received the unit for RMA from the customer.
We will update once we receive it.

Is this still an issue that needs support? Are there any results you can share?

We have still not received the unit for RMA, so we can proceed only after we get the hardware here for debugging.

The customer has now reported the same type of EEPROM spin error on three Orin NX modules.

The other two are booting fine.

Will update more details soon…

Please let us know the failure rate in your case.
We would also like to know if those problematic modules work on the devkit.

Sure.
The modules are at the customer site.
It will take some time for them to ship the modules to us for RMA.

Meanwhile, we don't have a devkit for Orin NX to check whether they boot fine on it.

Could you also provide us with the command to flash only the bootloader? The customer is asking for this step to check whether they can fix the issue at their site.

We will try to get the complete serial logs of the modules from them in the coming days.

Thanks

If you don’t use the custom carrier board for Orin NX, please refer to the following command to flash QSPI only.

$ sudo ./flash.sh  -c bootloader/generic/cfg/flash_t234_qspi.xml jetson-orin-nano-devkit internal

Thanks. For customized carrier board, what is the command for flashing only bootloader?

Do you mean the UEFI partition?
If so, please specify the partition name with the -k option to flash a single partition only.

$ sudo ./flash.sh -k A_cpu-bootloader  -c bootloader/generic/cfg/flash_t234_qspi.xml jetson-orin-nano-devkit internal
$ sudo ./flash.sh -k B_cpu-bootloader  -c bootloader/generic/cfg/flash_t234_qspi.xml jetson-orin-nano-devkit internal
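
The two commands above differ only in the slot prefix of the partition name, so they can be wrapped in a small loop. The following is a minimal sketch, not an official procedure: `DRY_RUN` and the function names are hypothetical additions for illustration, and the board name and config path are copied unchanged from the commands above. On a custom carrier board the board configuration name would presumably need to be replaced with your own, which this thread does not confirm.

```shell
#!/bin/sh
# Flash the cpu-bootloader partition of slot A and then slot B.
# BOARD and CFG are taken verbatim from the commands above.
# DRY_RUN is a hypothetical switch for this sketch: 1 (default) only prints
# the commands, 0 actually runs flash.sh.
BOARD="jetson-orin-nano-devkit"
CFG="bootloader/generic/cfg/flash_t234_qspi.xml"
DRY_RUN="${DRY_RUN:-1}"

flash_slot() {
    part="${1}_cpu-bootloader"
    if [ "$DRY_RUN" = "1" ]; then
        echo "sudo ./flash.sh -k $part -c $CFG $BOARD internal"
    else
        sudo ./flash.sh -k "$part" -c "$CFG" "$BOARD" internal
    fi
}

flash_both_slots() {
    for slot in A B; do
        # Stop at the first failure so slot B is not touched if slot A failed.
        flash_slot "$slot" || return 1
    done
}
```

Calling `flash_both_slots` with the default settings prints the two commands for review; setting `DRY_RUN=0` before sourcing the script would run them from the Linux_for_Tegra directory with the module in recovery mode.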

I am asking this just for my own understanding.

Since the QSPI resides within the Orin NX SoM, why should the commands for flashing the bootloader differ between the devkit and a customized carrier board?

UEFI is only a part of the bootloader, right? Or is it vice versa? Why do we have two separate partitions? Do we need to flash both of them, one after the other?

Please clarify. Thanks.