We encountered an issue in production: suddenly, all of our Xavier NX 8GB modules fail to flash with massflash.
We swapped every other component (baseboard, SSD), and the SOC remains the component that causes the flash to fail. The issue also persists when flashing on the devkit.
When booting the initrd from massflash, the system prints the following serial output and then stops:
FATAL ERROR [FILE=platform/drivers/emc/drv/dt/emc-dt-parse.c, ERR_UID=829]: EMC table malformed!
???@(nil) sp 0x5001e620 stack: 5001e400 - 5001e800
call stack:
sp 0x5001e620 pc 0x50002fd8
sp 0x5001e668 pc 0x50001d32
sp 0x5001e6a0 pc 0x50002b54
sp 0x5001e6b8 pc 0x500441b6
sp 0x5001e6e0 pc 0x50043f8a
sp 0x5001e740 pc 0x50045e48
sp 0x5001e7b0 pc 0x50044808
sp 0x5001e7c8 pc 0x50032efe
sp 0x5001e800 pc 0x00000000
eht_idx_find: 0x00000000 not a valid code address
no eidx for 0x00000000
??I/TC:
I/TC: Non-secure external DT found
I/TC: OP-TEE version: 3.21 (gcc version 9.3.0 (Buildroot 2020.08)) #2 Tue Aug 1 19:39:56 UTC 2023 aarch64
I/TC: WARNING: This OP-TEE configuration might be insecure!
I/TC: WARNING: Please check Porting guidelines — OP-TEE documentation documentation
I/TC: Primary CPU initializing
I/TC: Primary CPU switching to normal world boot
Please let me know if the boards need to be sent in.
We picked 5 SOCs from packs of 10, so potentially 50 SOCs are affected.
I have added the “patch” for PCN210641, which should be the only one that applies to our Xavier NX 8GB.
The tegra194-a02-bpmp-p3668-a00.dtb is referenced by p3668.conf.common as BPFDTB_FILE, which we do not override in our board config, so that should be fine.
The tegra194-mb1-bct-memcfg-p3668-0001-a00.cfg inside the BCT folder is also referenced by the EMMC_BCT variable inside the p3668.conf.common.
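To double-check which values are actually in effect, something like the following could be used. This is only a minimal sketch: it assumes the conf files contain plain `VAR=value;` lines, it ignores any shell logic that sets variables conditionally, and the board config file name in the usage note is hypothetical.

```python
import re
from pathlib import Path

# Matches simple shell assignments such as:
#   BPFDTB_FILE=tegra194-a02-bpmp-p3668-a00.dtb;
ASSIGN_RE = re.compile(r'^\s*(?:export\s+)?([A-Za-z_][A-Za-z0-9_]*)=(.*)$')


def read_assignments(conf_path):
    """Return {name: value} for plain VAR=value lines; the last one wins."""
    result = {}
    for line in Path(conf_path).read_text().splitlines():
        line = line.split('#', 1)[0]  # naive comment stripping
        match = ASSIGN_RE.match(line)
        if match:
            value = match.group(2).strip().rstrip(';').strip().strip('"\'')
            result[match.group(1)] = value
    return result


def effective_values(common_conf, board_conf,
                     names=("BPFDTB_FILE", "EMMC_BCT")):
    """Report each variable's value and which file it comes from.

    Assumption: the board config overrides the common config whenever it
    assigns the same variable.
    """
    common = read_assignments(common_conf)
    board = read_assignments(board_conf)
    report = {}
    for name in names:
        if name in board:
            report[name] = (board[name], "board config")
        else:
            report[name] = (common.get(name), "common config")
    return report
```

Calling `effective_values("p3668.conf.common", "my-board.conf")` (the second name is a placeholder for our own board config) returns, for each variable, the value and the file it was taken from.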
I have been informed that with the change, the flash process fails for 301 and 302, which seems a bit weird to me.
Am I right that those files are only used when flashing and not during the build process? Our current dtb file for the board did not change after replacing the files, and I can’t find any build step that uses them directly.
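One way to look for such a build step is to scan the build tree for any file that mentions the two file names. The sketch below is a rough check under stated assumptions: it only finds literal occurrences of the names, it skips large files, and the root directory passed in is whatever tree you want to inspect. A name that is assembled from variables at build time would not be found this way.

```python
from pathlib import Path


def find_references(root, needles, max_bytes=2_000_000):
    """Map each needle to the files under root that contain it literally."""
    hits = {}
    for path in Path(root).rglob('*'):
        # Skip directories, broken links, and files too large to be scripts.
        if not path.is_file() or path.stat().st_size > max_bytes:
            continue
        try:
            data = path.read_bytes()
        except OSError:
            continue  # unreadable file, ignore it
        for needle in needles:
            if needle.encode() in data:
                hits.setdefault(needle, []).append(str(path))
    return hits
```

For example, `find_references(".", ["tegra194-a02-bpmp-p3668-a00.dtb", "tegra194-mb1-bct-memcfg-p3668-0001-a00.cfg"])` lists every file in the current tree that names either file.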
We only offer the patch for the version mentioned in the PDF file. It is not guaranteed to work on other versions. If there is no patch for the BSP you want to use, you would need to upgrade to a BSP that supports this PCN update.
We use a production release and have already frozen the version. Upgrading is not an option.
We expect the software which was advertised as production release to work as such.
Upgrading to a later L4T version is not an option due to the lack of proper testing on NVIDIA's side. We cannot expect our product to keep working reliably after an upgrade to a new L4T version.
Maybe you are not aware that products need to be tested internally, with field tests etc. A simple change of the EMMC should not render the whole product nonfunctional.