We investigated further and I found some leads. I was able to reproduce the issue with a UEFI debug build to get more logs. There seems to be a problem with a missing CRC in the VER partition, which prevents the FMP library from initializing correctly:
FwImageLibProtocolCallback: Got FW Image protocol, Name=mb1
FwImageLibProtocolCallback: Got FW Image protocol, Name=psc_bl1
FwImageLibProtocolCallback: Got FW Image protocol, Name=MB1_BCT
FwImageLibProtocolCallback: Got FW Image protocol, Name=MEM_BCT
FwImageLibProtocolCallback: Got FW Image protocol, Name=tsec-fw
FwImageLibProtocolCallback: Got FW Image protocol, Name=nvdec
FwImageLibProtocolCallback: Got FW Image protocol, Name=mb2
FwImageLibProtocolCallback: Got FW Image protocol, Name=xusb-fw
FwImageLibProtocolCallback: Got FW Image protocol, Name=bpmp-fw
FwImageLibProtocolCallback: Got FW Image protocol, Name=bpmp-fw-dtb
FwImageLibProtocolCallback: Got FW Image protocol, Name=psc-fw
FwImageLibProtocolCallback: Got FW Image protocol, Name=mts-mce
FwImageLibProtocolCallback: Got FW Image protocol, Name=sc7
FwImageLibProtocolCallback: Got FW Image protocol, Name=pscrf
FwImageLibProtocolCallback: Got FW Image protocol, Name=mb2rf
FwImageLibProtocolCallback: Got FW Image protocol, Name=cpu-bootloader
FwImageLibProtocolCallback: Got FW Image protocol, Name=secure-os
FwImageLibProtocolCallback: Got FW Image protocol, Name=smm-fw
FwImageLibProtocolCallback: Got FW Image protocol, Name=eks
FwImageLibProtocolCallback: Got FW Image protocol, Name=dce-fw
FwImageLibProtocolCallback: Got FW Image protocol, Name=spe-fw
FwImageLibProtocolCallback: Got FW Image protocol, Name=rce-fw
FwImageLibProtocolCallback: Got FW Image protocol, Name=adsp-fw
FwImageLibProtocolCallback: Got FW Image protocol, Name=BCT-boot-chain_backup
FwImageLibProtocolCallback: Got FW Image protocol, Name=VER
FwImageLibProtocolCallback: No handles: Not Found
GetFuseSettings: fuse=1, offset=18
GetTnSpec: TegraPlatformCompatSpec=3701-500-0004--1--cti-orin-agx-agx201-00-
MmSendCommBuffer: doing communicate
MmSendCommBuffer: communicate returned: Success
VerPartitionGetVersion: Crc mismatch expected=0x0, received=0x2B6575C7
GetVersionInfo: Failed to parse version info: Volume Corrupt
GetVersionInfo: Version=0x0, Str=(<null string>), Status=Volume Corrupt
FmpDeviceFwImageCallback: GetVersionInfo failed, FMP library will not be initialized: Unsupported
FwImageInstallFmp: no installer
MmSendCommBuffer: doing communicate
MmSendCommBuffer: communicate returned: Success
VerPartitionGetVersion: Crc mismatch expected=0x0, received=0x2B6575C7
GetVersionInfo: Failed to parse version info: Volume Corrupt
GetVersionInfo: Version=0x0, Str=(<null string>), Status=Volume Corrupt
FmpDeviceFwImageCallback: GetVersionInfo failed, FMP library will not be initialized: Unsupported
FwImageInstallFmp: installing FMP
FmpDxe(NVIDIA System Firmware): Variable 7C374309-1649-4682-8BEE-04F3A8399414 FmpVersion
FmpDxe(NVIDIA System Firmware): Variable 7C374309-1649-4682-8BEE-04F3A8399414 FmpLsv
FmpDxe(NVIDIA System Firmware): Variable 7C374309-1649-4682-8BEE-04F3A8399414 LastAttemptStatus
FmpDxe(NVIDIA System Firmware): Variable 7C374309-1649-4682-8BEE-04F3A8399414 LastAttemptVersion
FmpDxe(NVIDIA System Firmware): Variable 7C374309-1649-4682-8BEE-04F3A8399414 FmpState
FmpDxe(NVIDIA System Firmware): GetVersionString() unsupported in FmpDeviceLib.
FmpTegraGetLowestSupportedVersion: Read fmp-lowest-supported-version=2294785: Success
InstallProtocolInterface: 86C77A67-0B97-4633-A187-49104D0685C7 8061BC428
InstallProtocolInterface: 1849BDA2-6952-4E86-A1DB-559A3C479DF1 806104E68
FmpDxe(NVIDIA System Firmware): FmpDeviceLib registration returned EFI_SUCCESS. Expect FMP to be installed during the BDS/Device connection phase.
Later, other log lines report that the FMP library is not initialized, and the capsule installation fails:
FSOpen: Open '\EFI\UpdateCapsule\' Success
FSOpen: Open 'TEGRA_BL.Cap' Success
Successfully read capsule file TEGRA_BL.Cap from disk.
GetFileImageInAlphabetFromDir status 0
FSOpen: Open 'TEGRA_BL.Cap' Success
RemoveFileFromDir status 0
BuildGatherList: creating capsule descriptors at 0x31B0A98
ProcessCapsuleImage for FmpCapsule ...
ValidateFmpCapsule ...
ValidateFmpCapsule - Success
ProcessFmpCapsuleImage ...
FmpCapsule: route payload to right FMP instance ...
FMP (0) ImageInfo:
Fmp->SetImage ...
ImageTypeId - BF0D4599-20D4-414E-B2C5-3595B1CDA402, PayloadIndex - 0x0, ImageIndex - 0x1 (UpdateHardwareInstance - 0x0)(ImageCapsuleSupport - 0x1)
FmpDxe(NVIDIA System Firmware): Set variable 7C374309-1649-4682-8BEE-04F3A8399414 FmpState LastAttemptVersion 00000000
FmpDxe(NVIDIA System Firmware): Certificate #2 [8060DCA79..8060DCE6D].
AuthenticateFmpImage - CertType: 4AAFD29D-68DF-49EE-8AA9-347D375665A7
FmpAuthenticatedHandlerPkcs7 - Image: 0x004A3078 - 0x009AE492
FmpAuthenticatedHandlerPkcs7: PASS verification
FmpTegraCheckImage: Capsule update triggered. Image=0x8004A3B7D ImageSize=10148237
FmpTegraCheckImage: FMP library not initialized
FmpDxe(NVIDIA System Firmware): CheckTheImage() - FmpDeviceLib CheckImage failed. Status = Not Ready
FmpDxe(NVIDIA System Firmware): SetTheImage() - Check The Image failed with Not Ready.
FmpDxe(NVIDIA System Firmware): SetTheImage() LastAttemptStatus: 6162.
FmpDxe(NVIDIA System Firmware): Set variable 7C374309-1649-4682-8BEE-04F3A8399414 FmpState LastAttemptStatus 0000181
I was able to read out the VER partition with the flash.sh script, and it seems there really is no CRC. Below is a comparison with the VER partition from a device that was updated successfully. I am attaching the faulty (“corrupted”) partition.
ver_a_read.bin.zip (391 Bytes)
=== GOOD device (OTA works) ===
Line 1 ( 4 bytes): ‘NV4\n’
Line 2 ( 22 bytes): ‘# R36 , REVISION: 4.3\n’
Line 3 ( 35 bytes): ‘BOARDID=3701 BOARDSKU=0004 FAB=500\n’
Line 4 ( 15 bytes): ‘20260226110211\n’
Line 5 ( 9 bytes): ‘0x240403\n’
Line 6 ( 24 bytes): ‘BYTES:85 CRC32:BFB238E3\n’
BYTES: 85
Stored: 0xBFB238E3
Computed: 0xBFB238E3
Result: PASS — UEFI will accept capsule
=== BAD device (OTA broken) ===
Line 1 ( 4 bytes): ‘NV4\n’
Line 2 ( 22 bytes): ‘# R35 , REVISION: 4.1\n’
Line 3 ( 35 bytes): ‘BOARDID=3701 BOARDSKU=0004 FAB=500\n’
Line 4 ( 15 bytes): ‘20260320082928\n’
Line 5 ( 9 bytes): ‘0x230401\n’
Line 6 ( 16 bytes): ‘BYTES:85 CRC32:\n’
CRC32: EMPTY → 0x00000000
Computed: 0x50E3311D
Result: FAIL — UEFI will reject capsule
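For anyone who wants to repeat the comparison above on their own dump, here is a minimal Python sketch of the check. It assumes the footer has the form `BYTES:<n> CRC32:<hex>` and that the checksum is a standard CRC-32 (as in `zlib.crc32`) over the first `<n>` bytes of the partition. Both are assumptions drawn from the output above, not confirmed against the UEFI source. The names `check_ver_blob` and `make_sample` are hypothetical helpers for illustration only.

```python
import re
import zlib

# Footer line as seen in the dumps above: "BYTES:85 CRC32:BFB238E3\n".
# The hex field may be empty, which is the failing case.
FOOTER_RE = re.compile(rb"^BYTES:(\d+) CRC32:([0-9A-Fa-f]*)\n", re.MULTILINE)


def check_ver_blob(blob: bytes) -> dict:
    """Compare the stored CRC32 in a VER dump with one computed locally.

    Assumption: the CRC covers the first BYTES bytes and uses the
    standard CRC-32 polynomial implemented by zlib.
    """
    match = FOOTER_RE.search(blob)
    if match is None:
        raise ValueError("no 'BYTES:<n> CRC32:<hex>' footer found")
    length = int(match.group(1).decode("ascii"))
    stored_hex = match.group(2).decode("ascii")
    stored = int(stored_hex, 16) if stored_hex else None  # None = field is empty
    computed = zlib.crc32(blob[:length]) & 0xFFFFFFFF
    return {
        "bytes": length,
        "stored": stored,
        "computed": computed,
        "ok": stored is not None and stored == computed,
    }


def make_sample(body: bytes, with_crc: bool = True) -> bytes:
    """Build a synthetic VER-style blob for trying out the checker."""
    crc = f"{zlib.crc32(body) & 0xFFFFFFFF:08X}" if with_crc else ""
    return body + f"BYTES:{len(body)} CRC32:{crc}\n".encode("ascii")
```

To run it on a real dump, read the file in binary mode, for example `check_ver_blob(open("ver_a_read.bin", "rb").read())`, and inspect the `stored`, `computed` and `ok` fields. Trailing padding after the footer is ignored because only the first `BYTES` bytes are hashed.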
Three questions:
- How is it possible that the CRC is missing on some of our devices? What could have caused that?
- Can we somehow fix this remotely (e.g. over SSH) in production so OTA can be performed?
- I considered patching UEFI to ignore the VER partition CRC error. Is that a good idea?
Thank you.