Module not booting correctly on custom carrier board

Hi,
I get a boot error when booting a new AGX Xavier Industrial module, freshly flashed with JetPack 4.6.2, on my custom carrier board.
Here is the full boot log:
fail-boot-JCD.txt (83.3 KB)
I compared it with the successful boot log (on an Auvidea carrier board):
success-boot-auvidea.txt (96.1 KB)

→ The first difference between the two logs is these lines:

E> I2C: Timeout while polling for transfer complete. Last value 0x00000002.
E> I2C: Could not write 0 bytes to slave: 0x0078 with repeat start true.
E> I2C_DEV_BASIC: Failed to send register address 0x53.
E> I2C_DEV_BASIC: Could not read data of size 2 at register address 0x0053 from slave 0x78 via i
[0000.251] E> I2C_DEV_BASIC: Failed to update 2 byte value at register 0x53 of slave 0x78 via instance 4
[0000.260] C> NONE: Failed to update reg address 0x53 of slave 0x78 in i2c block :0 in pad voltage config table.
E> FAILED: Generic i2c config
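
Finding the first point where two boot logs diverge is easier if timestamps are ignored, since they differ on every boot. Below is a minimal sketch of that comparison. The function names are hypothetical, and the timestamp pattern is an assumption based on the `[0000.251]` and `[ 7.196047]` prefixes seen in these logs.

```python
import re
from itertools import zip_longest

# Leading timestamp such as "[0000.251] " or "[    7.196047] " (assumed format).
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def normalize(line):
    """Strip surrounding whitespace and a leading timestamp from one log line."""
    return TIMESTAMP.sub("", line.strip())


def first_difference(good_lines, bad_lines):
    """Return (index, good_line, bad_line) at the first mismatch, or None if equal."""
    good = [normalize(l) for l in good_lines if l.strip()]
    bad = [normalize(l) for l in bad_lines if l.strip()]
    for i, (g, b) in enumerate(zip_longest(good, bad)):
        if g != b:
            return i, g, b
    return None


# Tiny illustrative input; real use would pass the lines of the two log files.
good_log = ["[0000.100] I> Step A", "[0000.200] I> Step B"]
bad_log = ["[0000.105] I> Step A", "[0000.300] E> FAILED: Generic i2c config"]
print(first_difference(good_log, bad_log))
```

A plain line-by-line comparison like this only locates the first divergence; once one log has extra lines, everything after that point will also differ.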

I previously hit this error when trying to flash from the NVIDIA SDK Manager, and I managed to dodge it by flashing from the command line. Obviously that was not a good solution, because this error now seems to prevent the module from booting correctly.

I looked into this error in more detail, and it appears to be a communication problem inside the module, on the PWR_I2C bus to the internal PMIC…

→ So I don’t understand why the carrier board would be the cause?

The final error in the boot log is that mmcblk0p1 is not found:

[ 7.196047] Freeing unused kernel memory: 8640K
[ 7.235627] Root device found: mmcblk0p1
[57209529] : [ LOG ] : dram-ecc init success
[ 17.438692] ERROR: mmcblk0p1 not found

In another post, “Broken flash.sh on JetPack 5.1.1”, I saw that [user56856] had the same issue during his flashing attempt and apparently resolved it by modifying the flashing script, but I’m a beginner in this environment, so I don’t understand everything…

Can you please help me?

Romain

Please move to JetPack 4.6.4 first… there are some known issues in the JetPack 4.6.2 cboot…

Dear WayneWWW,

It’s even worse using 4.6.4 because it does not flash correctly:

Flash Jetson AGX Xavier.txt (2.6 KB)

I have absolutely no idea why.

Here’s the UART log:

[0047.583] W> Profiler not initialized
[0047.587] I> Welcome to MB2(TBoot-BPMP) Applet (version: 00.00.2018.32-mobile-b6863ba4)
[0047.595] W> Profiler not initialized
[0047.598] I> DMA Heap @ [0x40020000 - 0x40065800]
[0047.603] I> Default Heap @ [0xd486400 - 0xd48a400]
[0047.607] W> Profiler not initialized
[0047.611] W> Profiler not initialized
[0047.615] W> Profiler not initialized
[0047.633] I> sdmmc DDR50 mode
[0047.636] I> QSPI Flash Size = 64 MB
[0047.645] I> Qspi initialized successfully
[0047.696] E> Link startup dme_set failed
[0047.700] E> UFS initialization failed
[0047.704] I> UFS is not present
[0047.707] W> Profiler not initialized
[0047.711] W> Cannot find any partition table for 00000003
[0047.719] I> Found 11 partitions in SDMMC_USER (instance 3)
[0047.727] I> Found 51 partitions in QSPI_FLASH (instance 0)
[0047.733] W> Profiler not initialized
[0047.736] W> Profiler not initialized
[0047.740] W> Profiler not initialized
[0047.743] I> Entering 3p server
[0047.747] I> USB configuration success
[0048.615] I> Populate chip info
[0048.627] I> Populate eeprom info
[0048.630] I> Populate eeprom info for module cvm
[0055.017] I> Populate chip info
[0055.028] I> Populate eeprom info
[0055.032] I> Populate eeprom info for module cvb
[0055.037] E> I2C: slave not found in slaves.
[0055.041] E> I2C: Could not write 0 bytes to slave: 0x00ac with repeat start true.
[0055.049] E> I2C_DEV: Failed to send register address 0x00000000.
[0055.055] E> I2C_DEV: Could not read 256 registers of size 1 from slave 0xac at 0x00000000 via instance 0.
[0055.064] E> eeprom: Failed to read I2C slave device
[0055.069] E> NV3P_SERVER: Failed to get eeprom handle for cvb from eeprom manager.
[0055.076] E> NV3P_SERVER: Could not read eeprom for module cvb.
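
A note on reading the addresses in these errors: the slave addresses printed by the bootloader (0x78 in the boot log, 0xac here) look like 8-bit address bytes, that is, the 7-bit I²C address shifted left by one with the R/W bit in the lowest position. That is an assumption based on the values themselves, not something the log states. Under that assumption, a minimal sketch of the conversion to the 7-bit form used by tools such as `i2cdetect` (the helper name is hypothetical):

```python
def to_7bit(addr_8bit):
    """Convert an 8-bit I2C address byte (7-bit address + R/W bit) to the 7-bit address."""
    if not 0 <= addr_8bit <= 0xFF:
        raise ValueError("expected a single address byte")
    return addr_8bit >> 1


# Addresses taken from the error lines in this thread.
for addr in (0x78, 0xAC):
    print(f"8-bit 0x{addr:02x} -> 7-bit 0x{to_7bit(addr):02x}")
# prints:
# 8-bit 0x78 -> 7-bit 0x3c
# 8-bit 0xac -> 7-bit 0x56
```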

Regards,

Romain

Could you always attach the full logs from both the host side and the device side?

Here you are:
fail flash - host.txt (9.7 KB)

fail flash - device.txt (2.4 KB)

Regards,

Romain

Hi,

Does your custom board use the EEPROM I2C lines coming out of the module itself? It looks like that design affects reading the carrier board EEPROM during the flash process.

Please also try a manual flash with BOARDID provided as a parameter in front of the flash.sh command.

SDK Manager cannot flash your board.
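
As a rough illustration of what such a manual flash invocation looks like, here is a sketch that only builds and prints the command line; it does not run it. The board ID `2888` and the target name `jetson-agx-xavier-industrial` are assumptions for illustration and should be checked against the module label and the `.conf` files in your Linux_for_Tegra directory. The helper name is hypothetical.

```python
import shlex


def build_flash_command(board_id, target_config, root_device, extra_env=None):
    """Build a 'sudo VAR=value ./flash.sh <target> <rootdev>' argument list."""
    env = {"BOARDID": board_id}
    env.update(extra_env or {})  # e.g. other board variables, if needed
    assignments = [f"{key}={value}" for key, value in env.items()]
    return ["sudo", *assignments, "./flash.sh", target_config, root_device]


# Assumed values; verify them for your module before flashing.
cmd = build_flash_command("2888", "jetson-agx-xavier-industrial", "mmcblk0p1")
print(shlex.join(cmd))
```

Passing the board ID on the command line is meant to avoid the query that fails in the log above when the carrier board EEPROM cannot be read.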

Nope, it doesn’t.

It only uses I2C2 (J61, K61) and I2C3 (F53, E53).

But I agree with you that the problem always seems to be I²C-related.

Regards,

Romain

Ok I’ll try this and let you know.

Regards,

Romain

Dear @WayneWWW,

I succeeded in flashing the module from the command line using flash.sh.

But the boot freezes, as you can see in the following log:

fail boot 4-6-4 official - JCD board.txt (46.3 KB)

Can you help?

Kind regards,

Romain

Has your carrier board ever successfully run any kind of Xavier module before?

This level of errors in the log indicates a hardware defect. Lots of regulators are missing, and even the PMIC cannot be found. This is not a minor error triggered by software.

If the same module works fine on a devkit, I would suggest going through a detailed review of the hardware.
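
To get a quick feel for how widespread such failures are in a log, one option is to count the lines that mention each subsystem. The sketch below is only a rough triage aid: the keyword list is an assumption, the function name is hypothetical, and the sample lines are the bootloader errors quoted earlier in this thread.

```python
from collections import Counter

# Assumed keywords of interest; extend as needed.
KEYWORDS = ("pmic", "regulator", "i2c")


def count_mentions(lines, keywords=KEYWORDS):
    """Count how many log lines mention each keyword (case-insensitive)."""
    counts = Counter()
    for line in lines:
        lowered = line.lower()
        for keyword in keywords:
            if keyword in lowered:
                counts[keyword] += 1
    return counts


sample = [
    "E> I2C: Timeout while polling for transfer complete. Last value 0x00000002.",
    "E> I2C: Could not write 0 bytes to slave: 0x0078 with repeat start true.",
    "E> I2C_DEV_BASIC: Failed to send register address 0x53.",
    "E> FAILED: Generic i2c config",
]
counts = count_mentions(sample)
print(dict(counts))
```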

Nope, it’s the first version of my carrier board and this is the first try, so no other Jetson module has ever been tested on it.

That said, my carrier board is not very complex, but it does not have the AT24C02 EEPROM that is on the NVIDIA P2822 carrier board.

Also, so far I have not changed anything in the files (dts, dtb, pinmux, config, etc.) → Could that be the cause?

I ran a hardware test of the carrier board before plugging in the module, and all functions such as power-on, discharge and loss detection were functional.

Also, the HV and MV rails are working properly at the module input.

UART link is working.

Not much else remains except HDMI, the RGMII PHY, the PCIe controller (i210), USB for recovery (obviously working) and CAN.

Regarding the I²C, I’m only using the generic I²C 2 (J61 & K61) and the dedicated I²C for the HDMI interface.

Power consumption looks fine, as it is quite low (about 2 W on the 5 VDC MV rail and 7 W on the 12 VDC HV rail).
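
For reference, these power figures can be converted to rail currents with I = P / V. A quick sketch, using the approximate readings quoted above (the function name is hypothetical):

```python
def rail_current_amps(power_w, voltage_v):
    """Current drawn on a rail, from I = P / V."""
    return power_w / voltage_v


# (power in W, voltage in V), approximate values from the measurements above.
rails = {"MV (5 VDC)": (2.0, 5.0), "HV (12 VDC)": (7.0, 12.0)}
for name, (power, voltage) in rails.items():
    print(f"{name}: {power:.0f} W -> {rail_current_amps(power, voltage):.2f} A")
# prints:
# MV (5 VDC): 2 W -> 0.40 A
# HV (12 VDC): 7 W -> 0.58 A
```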

What could cause the internal PMIC not to be found? (It’s on an internal I²C bus that the user has no access to, so I don’t get it.)

Regards,

Romain

I am not a hardware guy, so I cannot comment. If you need, I can ask the hardware folks to check your issue.

I would appreciate it if you could ask for a check.

Regards,

Romain

Please check your schematic design against the checklist sheet attached to the Design Guide. You can also check your design against the P2822 schematic.

Dear @Trumany,

I already did both of these things before producing my PCB.

Anyway, I’ll go through it again and let you know.

But do you have any comment regarding the device tree, pinmux config, etc.?

Regards,

Romain

You can put your module on a devkit first. If things work fine on the devkit (e.g. at least the PMIC gets enabled), then the device tree and pinmux are correct.

If the same setup then crashes when moved to your board, that indicates a hardware difference.

Dear @WayneWWW,

The same module boots fine on the Auvidea devkit, as mentioned in the first post of this topic.

But the device tree and pinmux are the ones dedicated to that devkit, not to my custom carrier board, so I imagine this could cause a problem?

Of course my custom carrier board is different from the NVIDIA P2822 devkit or the Auvidea devkit, but does that mean my carrier board is wrong, or that I need to modify the dts and pinmux configs?

Kind regards,

Romain

Just to clarify from the software side.

The device tree has parts that are common to the devkit and a custom board, and parts that differ.

The common part, for example the PMIC part, does not need to be changed and should just work.
The different part could be some I/O being enabled or disabled.

The problem here is that even the “common part” does not work. And since the PMIC is failing, lots of I/O that depends on its power supply is broken too.

The common part of the device tree and pinmux should never need to be configured.

Thank you for your clarification.

Unfortunately, for the moment I really don’t see what could cause this problem in the common part.

I’ll check again and report back here soon.

Regards,

Romain