Yes, and here are the boot logs from the problematic boots.
Crash in EL3 that occurs during the UEFI DXE phase:
[04:11:16.300] DtPlatformLoadDtb: Defaulting to UEFI DTB
[04:11:16.301] Processing "L4T Configuration Settings" DTB overlay
[04:11:16.303] Deleting fragment fragment@0
[04:11:16.303] Processing "P3767 Overlay Support" DTB overlay
[04:11:16.303] Deleting fragment fragment@0
[04:11:16.303] Deleting fragment fragment@1
[04:11:16.304] Deleting fragment fragment@2
[04:11:16.305] Deleting fragment fragment@3
[04:11:16.368] UpdateRamOopsMemory: RamOopsBase: 0x46EB70000, RamOopsSize: 0x200000
[04:11:16.369] FtpmProtocol Not Found - Not Found
[04:11:16.369] DisplayLocateChildGopHandle: failed to enumerate graphics output device handles: Not Found
[04:11:16.369] add-symbol-file /home/edk2/nvidia-uefi/Build/Jetson/DEBUG_GCC5/AARCH64/Silicon/NVIDIA/Drivers/UsbPadCtlDxe/UsbPadCtlDxe/DEBUG/UsbPadCtlDxe.dll 0x467C68000
[04:11:16.370] Loading driver at 0x00467C67000 EntryPoint=0x00467C6EC74 UsbPadCtlDxe.efi
[04:11:16.370] DeviceDiscoveryNotify: Couldn't get gNVIDIAPinMuxProtocolGuid Handle: Not Found
[04:11:16.417] add-symbol-file /home/edk2/nvidia-uefi/Build/Jetson/DEBUG_GCC5/AARCH64/Silicon/NVIDIA/Drivers/XusbControllerDxe/XusbControllerDxe/DEBUG/XusbControllerDxe.dll 0x467C5E000
[04:11:16.421] Loading driver at 0x00467C5D000 EntryPoint=0x00467C62320 XusbControllerDxe.efi
[04:11:16.421] add-symbol-file /home/edk2/nvidia-uefi/Build/Jetson/DEBUG_GCC5/AARCH64/Silicon/NVIDIA/Drivers/SdMmcControllerDxe/SdMmcControllerDxe/DEBUG/SdMmcControllerDxe.dll 0x467C53000
[04:11:16.422] Loading driver at 0x00467C52000 EntryPoint=0x00467C57F08 SdMmcControllerDxe.efi
[04:11:16.466] add-symbol-file /home/edk2/nvidia-uefi/Build/Jetson/DEBUG_GCC5/AARCH64/Silicon/NVIDIA/Drivers/AndroidBootDxe/AndroidBootDxe/DEBUG/AndroidBootDxe.dll 0x467C47000
[04:11:16.469] Loading driver at 0x00467C46000 EntryPoint=0x00467C4D3A8 AndroidBootDxe.efi
[04:11:16.470] add-symbol-file /home/edk2/nvidia-uefi/Build/Jetson/DEBUG_GCC5/AARCH64/Silicon/NVIDIA/Drivers/PcieControllerDxe/PcieControllerDxe/DEBUG/PcieControllerDxe.dll 0x467C34000
[04:11:16.470] Loading driver at 0x00467C33000 EntryPoint=0x00467C3F240 PcieControllerDxe.efi
[04:11:16.785] PCIe Controller-1 Link is DOWN
[04:11:17.089] PCIe Controller-4 Link is DOWN
[04:11:17.392] add-symbol-file /home/edk2/nvidia-uefi/Build/Jetson/DEBUG_GCC5/AARCH64/SecurityPkg/Hash2DxeCrypto/Hash2DxeCrypto/DEBUG/Hash2DxeCrypto.dll 0x467C2A000
[04:11:17.445] Loading driver at 0x00467C29000 EntryPoint=0x00467C2ED9C Hash2DxeCrypto.efi
[04:11:17.445] add-symbol-file /home/edk2/nvidia-uefi/Build/Jetson/DEBUG_GCC5/AARCH64/Silicon/NVIDIA/Drivers/EqosDeviceDxe/EqosDeviceDxe/DEBUG/EqosDeviceDxe.dll 0x467B90000
[04:11:17.446] Loading driver at 0x00467B8F000 EntryPoint=0x00467B964A8 EqosDeviceDxe.efi
[04:11:17.446] add-symbol-file /home/edk2/nvidia-uefi/Build/Jetson/DEBUG_GCC5/AARCH64/Silicon/NVIDIA/Drivers/NonDiscoverablePciDeviceDxe/NonDiscoverablePciDeviceDxe/DEBUG/NonDiscoverablePciDeviceDxe.dll 0x467B85000
[04:11:17.496] Loading driver at 0x00467B84000 EntryPoint=0x00467B8A8CC NonDiscoverablePciDeviceDxe.efi
[04:11:17.498] add-symbol-file /home/edk2/nvidia-uefi/Build/Jetson/DEBUG_GCC5/AARCH64/Silicon/NVIDIA/Tegra/T194/Drivers/T194GraphicsOutputDxe/T194GraphicsOutputDxe/DEBUG/T194GraphicsOutputDxe.dll 0x467B78000
[04:11:17.499] Loading driver at 0x00467B77000 EntryPoint=0x00467B7EFE4 T194GraphicsOutputDxe.efi
[04:11:17.499] add-symbol-file /home/edk2/nvidia-uefi/Build/Jetson/DEBUG_GCC5/AARCH64/FmpDevicePkg/FmpDxe/7C374309-1649-4682-8BEE-04F3A8399414/DEBUG/FmpDxe.dll 0x467A9B000
[04:11:17.538] Loading driver at 0x00467A9A000 EntryPoint=0x00467AA5648 FmpDxe.efi
[04:11:17.538] GetFuseSettings: Error getting TegraPlatformSpec size: Not Found
[04:11:17.539] GetTnSpec: Error getting TegraPlatformCompatSpec size: Not Found
[04:11:17.539] VerPartitionGetVersion: Crc mismatch expected=0x0, received=0xB13B1EC2
[04:11:17.540] GetVersionInfo: Failed to parse version info: Volume Corrupt
[04:11:17.540] VerPartitionGetVersion: Crc mismatch expected=0x0, received=0xB13B1EC2
[04:11:17.578] GetVersionInfo: Failed to parse version info: Volume Corrupt
[04:11:17.579] add-symbol-file /home/edk2/nvidia-uefi/Build/Jetson/DEBUG_GCC5/AARCH64/MdeModulePkg/Bus/Pci/PciHostBridgeDxe/PciHostBridgeDxe/DEBUG/PciHostBridgeDxe.dll 0x467A88000
[04:11:17.579] Loading driver at 0x00467A87000 EntryPoint=0x00467A9157C PciHostBridgeDxe.efi
[04:11:17.865] [ext4] Needs journal recovery, mounting read-only
[04:11:18.009] Installed Fat filesystem on 4647E8E18
[04:11:23.017] NvmExpressPassThru: Timeout occurs for an NVMe command.
[04:11:23.110] ÿäUnhandled Exception in EL3.
[04:11:23.111] x30 = 0x0000000050000c50
[04:11:23.111] x0 = 0x0000000000000000
[04:11:23.111] x1 = 0x00000000be000011
[04:11:23.111] x2 = 0x0000000000000000
[04:11:23.112] x3 = 0x0000000000000011
[04:11:23.113] x4 = 0x0000000000100000
[04:11:23.113] x5 = 0x000000046e9fdf98
[04:11:23.113] x6 = 0x0000000401000000
[04:11:23.114] x7 = 0x0000000401000000
[04:11:23.114] x8 = 0x0000000000000000
[04:11:23.114] x9 = 0x0000000041223020
[04:11:23.114] x10 = 0x000000000003073d
[04:11:23.114] x11 = 0x0000000000000000
[04:11:23.154] x12 = 0x0000000000000002
[04:11:23.155] x13 = 0x0000000000000002
[04:11:23.156] x14 = 0x0000000000000001
[04:11:23.157] x15 = 0x00000000000000ff
[04:11:23.158] x16 = 0x00000004686c0dbc
[04:11:23.158] x17 = 0x0000000020a983c7
[04:11:23.158] x18 = 0x00000004686cc280
[04:11:23.158] x19 = 0x0000000000000000
[04:11:23.159] x20 = 0x00000004648190a0
[04:11:23.159] x21 = 0x0000000000000001
[04:11:23.159] x22 = 0x0000000467c3f218
[04:11:23.159] x23 = 0x0000000000000001
[04:11:23.160] x24 = 0x0000000000000001
[04:11:23.197] x25 = 0x000000046e9fe136
[04:11:23.200] x26 = 0x0000000000000002
[04:11:23.203] x27 = 0xffff0000f0000003
[04:11:23.204] x28 = 0x0000000000000002
[04:11:23.205] x29 = 0x000000046e9fdfa0
[04:11:23.206] scr_el3 = 0x000000000003073d
[04:11:23.207] sctlr_el3 = 0x0000000030cd183f
[04:11:23.207] cptr_el3 = 0x0000000000000000
[04:11:23.208] tcr_el3 = 0x0000000080823518
[04:11:23.208] daif = 0x00000000000002c0
[04:11:23.208] mair_el3 = 0x00000000004404ff
[04:11:23.208] spsr_el3 = 0x00000000600003c9
[04:11:23.208] elr_el3 = 0x00000004686c6280
[04:11:23.209] ttbr0_el3 = 0x0000000050026341
[04:11:23.241] esr_el3 = 0x00000000be000011
[04:11:23.242] far_el3 = 0x0000000000000000
[04:11:23.243] spsr_el1 = 0x0000000000000000
[04:11:23.244] elr_el1 = 0x0000000000000000
[04:11:23.244] spsr_abt = 0x0000000000000000
[04:11:23.244] spsr_und = 0x0000000000000000
[04:11:23.244] spsr_irq = 0x0000000000000000
[04:11:23.245] spsr_fiq = 0x0000000000000000
[04:11:23.245] sctlr_el1 = 0x0000000030d00800
[04:11:23.245] actlr_el1 = 0x0000000000000000
[04:11:23.245] cpacr_el1 = 0x0000000000300000
[04:11:23.245] csselr_el1 = 0x0000000000000000
[04:11:23.245] sp_el1 = 0x0000000000000000
[04:11:23.284] esr_el1 = 0x0000000000000000
[04:11:23.285] ttbr0_el1 = 0x0000000000000000
[04:11:23.286] ttbr1_el1 = 0x0000000000000000
[04:11:23.286] mair_el1 = 0x0000000000000000
[04:11:23.287] amair_el1 = 0x0000000000000000
[04:11:23.287] tcr_el1 = 0x0000000000000000
[04:11:23.287] tpidr_el1 = 0x0000000000000000
[04:11:23.287] tpidr_el0 = 0x0000000080000000
[04:11:23.288] tpidrro_el0 = 0x0000000000000000
[04:11:23.288] par_el1 = 0x0000000000000800
[04:11:23.288] mpidr_el1 = 0x0000000081000000
[04:11:23.288] afsr0_el1 = 0x0000000000000000
[04:11:23.288] afsr1_el1 = 0x0000000000000000
[04:11:23.288] contextidr_el1 = 0x0000000000000000
[04:11:23.328] vbar_el1 = 0x0000000000000000
[04:11:23.329] cntp_ctl_el0 = 0x0000000000000005
[04:11:23.329] cntp_cval_el0 = 0x000000001e531281
[04:11:23.330] cntv_ctl_el0 = 0x0000000000000000
[04:11:23.331] cntv_cval_el0 = 0x0000000000000000
[04:11:23.331] cntkctl_el1 = 0x0000000000000000
[04:11:23.331] sp_el0 = 0x00000004686cc280
[04:11:23.332] isr_el1 = 0x0000000000000040
[04:11:23.332] cpuectlr_el1 = 0xa000000b40543000
[04:11:23.332] gicd_ispendr regs (Offsets 0x200 - 0x278)
[04:11:23.332] Offset: value
[04:11:23.332] 0000000000000200: 0x0000000000000000
[04:11:23.333] 0000000000000204: 0x0000000000000000
[04:11:23.371] 0000000000000208: 0x0000000000000000
[04:11:23.372] 000000000000020c: 0x0000000000000000
[04:11:23.372] 0000000000000210: 0x0000000000000000
[04:11:23.373] 0000000000000214: 0x0000000000000000
[04:11:23.373] 0000000000000218: 0x0000000000010000
[04:11:23.373] 000000000000021c: 0x0000000000020000
[04:11:23.373] 0000000000000220: 0x0000000000000000
[04:11:23.374] 0000000000000224: 0x0000000000000000
[04:11:23.374] 0000000000000228: 0x0000000000000000
[04:11:23.374] 000000000000022c: 0x0000000000000000
[04:11:23.375] 0000000000000230: 0x0000000000000000
[04:11:23.376] 0000000000000234: 0x0000000000000000
[04:11:23.376] 0000000000000238: 0x0000000000000000
[04:11:23.415] 000000000000023c: 0x0000000000000000
[04:11:23.415] 0000000000000240: 0x0000000000000000
[04:11:23.415] 0000000000000244: 0x0000000000000000
[04:11:23.415] 0000000000000248: 0x0000000000000000
[04:11:23.416] 000000000000024c: 0x0000000000000000
[04:11:23.416] 0000000000000250: 0x0000000000000000
[04:11:23.416] 0000000000000254: 0x0000000000000000
[04:11:23.416] 0000000000000258: 0x0000000000000000
[04:11:23.416] 000000000000025c: 0x0000000000000000
[04:11:23.417] 0000000000000260: 0x0000000000000000
[04:11:23.417] 0000000000000264: 0x0000000000000000
[04:11:23.417] 0000000000000268: 0x0000000000000000
[04:11:23.417] 000000000000026c: 0x0000000000000000
[04:11:23.441] 0000000000000270: 0x0000000000000000
[04:11:23.442] 0000000000000274: 0x0000000000000000
[04:11:23.443] 0000000000000278: 0x0000000000000000
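For reference, the esr_el3 and spsr_el3 values in the dump above can be split into their architectural fields. This is a minimal sketch, assuming the standard Armv8-A ESR_ELx and SPSR_ELx layouts; the function names and the small EC table are ours, for illustration only, and are not taken from any NVIDIA or TF-A source.

```python
# Exception class (EC) values from the Armv8-A ESR_ELx definition.
# Only a handful are listed here; this table is illustrative, not complete.
EC_NAMES = {
    0x15: "SVC (AArch64)",
    0x17: "SMC (AArch64)",
    0x20: "Instruction Abort, lower EL",
    0x24: "Data Abort, lower EL",
    0x25: "Data Abort, same EL",
    0x2F: "SError interrupt",
}


def decode_esr(esr):
    """Split an ESR_ELx value into EC [31:26], IL [25] and ISS [24:0]."""
    return {
        "ec": (esr >> 26) & 0x3F,
        "il": (esr >> 25) & 0x1,
        "iss": esr & 0x1FFFFFF,
    }


def decode_spsr(spsr):
    """Extract the saved mode and DAIF mask bits from an AArch64 SPSR_ELx."""
    mode = spsr & 0xF
    return {
        "aarch32": bool((spsr >> 4) & 0x1),   # M[4]: 1 means AArch32 state
        "el": (mode >> 2) & 0x3,              # M[3:2]: exception level
        "sp_sel": "h" if mode & 0x1 else "t", # M[0]: SP_ELx (h) or SP_EL0 (t)
        "daif": (spsr >> 6) & 0xF,            # bits [9:6]: D, A, I, F masks
    }


esr = decode_esr(0xBE000011)
print(hex(esr["ec"]), EC_NAMES.get(esr["ec"], "unknown"), hex(esr["iss"]))
# 0x2f SError interrupt 0x11

spsr = decode_spsr(0x600003C9)
print(f"EL{spsr['el']}{spsr['sp_sel']}", hex(spsr["daif"]))
# EL2h 0xf
```

If we read the fields correctly, EC 0x2F is an SError interrupt and the saved mode is EL2h, which would point to an asynchronous external error taken while code at EL2 (where we assume UEFI runs on this platform) was executing, rather than a fault inside the EL3 firmware itself. We treat this as a hint, not a conclusion.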
Another very similar crash that occurred several days ago:
[15:13:20.206] add-symbol-file /sources/bootloader/nvidia-uefi/Build/Jetson/DEBUG_GCC5/AARCH64/Silicon/NVIDIA/Drivers/PcieControllerDxe/PcieControllerDxe/DEBUG/PcieControllerDxe.dll 0x467C0A000
[15:13:20.207] Loading driver at 0x00467C09000 EntryPoint=0x00467C150E8 PcieControllerDxe.efi
[15:13:20.528] PCIe Controller-1 Link is DOWN
[15:13:20.832] PCIe Controller-4 Link is DOWN
[15:13:21.182] add-symbol-file /sources/bootloader/nvidia-uefi/Build/Jetson/DEBUG_GCC5/AARCH64/SecurityPkg/Hash2DxeCrypto/Hash2DxeCrypto/DEBUG/Hash2DxeCrypto.dll 0x467C00000
[15:13:21.183] Loading driver at 0x00467BFF000 EntryPoint=0x00467C04D40 Hash2DxeCrypto.efi
[15:13:21.184] add-symbol-file /sources/bootloader/nvidia-uefi/Build/Jetson/DEBUG_GCC5/AARCH64/Silicon/NVIDIA/Drivers/EqosDeviceDxe/EqosDeviceDxe/DEBUG/EqosDeviceDxe.dll 0x467B66000
[15:13:21.186] Loading driver at 0x00467B65000 EntryPoint=0x00467B6C4E8 EqosDeviceDxe.efi
[15:13:21.187] add-symbol-file /sources/bootloader/nvidia-uefi/Build/Jetson/DEBUG_GCC5/AARCH64/Silicon/NVIDIA/Drivers/NonDiscoverablePciDeviceDxe/NonDiscoverablePciDeviceDxe/DEBUG/NonDiscoverablePciDeviceDxe.dll 0x467B5B000
[15:13:21.227] Loading driver at 0x00467B5A000 EntryPoint=0x00467B60890 NonDiscoverablePciDeviceDxe.efi
[15:13:21.228] add-symbol-file /sources/bootloader/nvidia-uefi/Build/Jetson/DEBUG_GCC5/AARCH64/Silicon/NVIDIA/Tegra/T194/Drivers/T194GraphicsOutputDxe/T194GraphicsOutputDxe/DEBUG/T194GraphicsOutputDxe.dll 0x467B4E000
[15:13:21.230] Loading driver at 0x00467B4D000 EntryPoint=0x00467B54F8C T194GraphicsOutputDxe.efi
[15:13:21.278] add-symbol-file /sources/bootloader/nvidia-uefi/Build/Jetson/DEBUG_GCC5/AARCH64/FmpDevicePkg/FmpDxe/7C374309-1649-4682-8BEE-04F3A8399414/DEBUG/FmpDxe.dll 0x467A73000
[15:13:21.279] Loading driver at 0x00467A72000 EntryPoint=0x00467A7C354 FmpDxe.efi
[15:13:21.280] add-symbol-file /sources/bootloader/nvidia-uefi/Build/Jetson/DEBUG_GCC5/AARCH64/MdeModulePkg/Bus/Pci/PciHostBridgeDxe/PciHostBridgeDxe/DEBUG/PciHostBridgeDxe.dll 0x467A61000
[15:13:21.280] Loading driver at 0x00467A60000 EntryPoint=0x00467A6A42C PciHostBridgeDxe.efi
[15:13:21.682] LocatePciExpressCapabilityRegBlock: [01|00|00] failed to access config space at offset 0x100
[15:13:22.404] ??Unhandled Exception in EL3.
[15:13:22.405] x30 = 0x0000000050000c50
[15:13:22.405] x0 = 0x0000000000000000
[15:13:22.406] x1 = 0x00000000be000011
[15:13:22.406] x2 = 0x0000000000000000
[15:13:22.407] x3 = 0x0000000000000011
[15:13:22.407] x4 = 0x0000000000100000
[15:13:22.408] x5 = 0x000000046e9fe568
[15:13:22.408] x6 = 0x0000002001000000
[15:13:22.408] x7 = 0x0000002001000000
[15:13:22.409] x8 = 0x0000000000000000
[15:13:22.409] x9 = 0x0000000041223020
[15:13:22.409] x10 = 0x000000000003073d
[15:13:22.410] x11 = 0x000c010200000000
[15:13:22.447] x12 = 0x000000000a0341d0
[15:13:22.447] x13 = 0xff7f000000060101
[15:13:22.447] x14 = 0x00001417ffff0000
[15:13:22.447] x15 = 0x41d0000c01020000
[15:13:22.448] x16 = 0x000000046869fdb0
[15:13:22.448] x17 = 0x000000009bd90d09
[15:13:22.448] x18 = 0x00000004686ab2f0
[15:13:22.448] x19 = 0x0000000000000000
[15:13:22.448] x20 = 0x00000004647d41a0
[15:13:22.448] x21 = 0x0000000000000002
[15:13:22.448] x22 = 0x0000000467c150c0
[15:13:22.448] x23 = 0x0000000000000001
[15:13:22.448] x24 = 0x0000000000000001
[15:13:22.490] x25 = 0x0000000467a6e439
[15:13:22.491] x26 = 0x0000000000000000
[15:13:22.491] x27 = 0x000000046e9fe6e0
[15:13:22.492] x28 = 0x0000000000000004
[15:13:22.492] x29 = 0x000000046e9fe570
[15:13:22.492] scr_el3 = 0x000000000003073d
[15:13:22.493] sctlr_el3 = 0x0000000030cd183f
[15:13:22.493] cptr_el3 = 0x0000000000000000
[15:13:22.493] tcr_el3 = 0x0000000080823518
[15:13:22.494] daif = 0x00000000000002c0
[15:13:22.494] mair_el3 = 0x00000000004404ff
[15:13:22.494] spsr_el3 = 0x00000000600002c9
[15:13:22.495] elr_el3 = 0x0000000467a63c18
[15:13:22.495] ttbr0_el3 = 0x0000000050026341
[15:13:22.534] esr_el3 = 0x00000000be000011
[15:13:22.534] far_el3 = 0x0000000000000000
[15:13:22.535] spsr_el1 = 0x0000000000000000
[15:13:22.535] elr_el1 = 0x0000000000000000
[15:13:22.536] spsr_abt = 0x0000000000000000
[15:13:22.536] spsr_und = 0x0000000000000000
[15:13:22.536] spsr_irq = 0x0000000000000000
[15:13:22.537] spsr_fiq = 0x0000000000000000
[15:13:22.537] sctlr_el1 = 0x0000000030d00800
[15:13:22.537] actlr_el1 = 0x0000000000000000
[15:13:22.538] cpacr_el1 = 0x0000000000300000
[15:13:22.538] csselr_el1 = 0x0000000000000000
[15:13:22.538] sp_el1 = 0x0000000000000000
[15:13:22.577] esr_el1 = 0x0000000000000000
[15:13:22.578] ttbr0_el1 = 0x0000000000000000
[15:13:22.578] ttbr1_el1 = 0x0000000000000000
[15:13:22.579] mair_el1 = 0x0000000000000000
[15:13:22.579] amair_el1 = 0x0000000000000000
[15:13:22.581] tcr_el1 = 0x0000000000000000
[15:13:22.582] tpidr_el1 = 0x0000000000000000
[15:13:22.582] tpidr_el0 = 0x0000000080000000
[15:13:22.582] tpidrro_el0 = 0x0000000000000000
[15:13:22.582] par_el1 = 0x0000000000000800
[15:13:22.582] mpidr_el1 = 0x0000000081000000
[15:13:22.582] afsr0_el1 = 0x0000000000000000
[15:13:22.582] afsr1_el1 = 0x0000000000000000
[15:13:22.582] contextidr_el1 = 0x0000000000000000
[15:13:22.621] vbar_el1 = 0x0000000000000000
[15:13:22.621] cntp_ctl_el0 = 0x0000000000000005
[15:13:22.622] cntp_cval_el0 = 0x0000000016309e46
[15:13:22.622] cntv_ctl_el0 = 0x0000000000000000
[15:13:22.622] cntv_cval_el0 = 0x0000000000000000
[15:13:22.623] cntkctl_el1 = 0x0000000000000000
[15:13:22.623] sp_el0 = 0x00000004686ab2f0
[15:13:22.624] isr_el1 = 0x0000000000000040
[15:13:22.624] cpuectlr_el1 = 0xa000000b40543000
[15:13:22.624] gicd_ispendr regs (Offsets 0x200 - 0x278)
[15:13:22.625] Offset: value
[15:13:22.625] 0000000000000200: 0x0000000000000000
[15:13:22.625] 0000000000000204: 0x0000000000000000
[15:13:22.664] 0000000000000208: 0x0000000000000000
[15:13:22.664] 000000000000020c: 0x0000000000000000
[15:13:22.665] 0000000000000210: 0x0000000000000000
[15:13:22.665] 0000000000000214: 0x0000000000000000
[15:13:22.666] 0000000000000218: 0x0000000000010000
[15:13:22.666] 000000000000021c: 0x0000000000020000
[15:13:22.666] 0000000000000220: 0x0000000000000000
[15:13:22.667] 0000000000000224: 0x0000000000000000
[15:13:22.667] 0000000000000228: 0x0000000000000000
[15:13:22.668] 000000000000022c: 0x0000000000000000
[15:13:22.668] 0000000000000230: 0x0000000000000000
[15:13:22.668] 0000000000000234: 0x0000000000000000
[15:13:22.669] 0000000000000238: 0x0000000000000000
[15:13:22.708] 000000000000023c: 0x0000000000000000
[15:13:22.708] 0000000000000240: 0x0000000000000000
[15:13:22.708] 0000000000000244: 0x0000000000000000
[15:13:22.709] 0000000000000248: 0x0000000000000000
[15:13:22.709] 000000000000024c: 0x0000000000000000
[15:13:22.710] 0000000000000250: 0x0000000000000000
[15:13:22.710] 0000000000000254: 0x0000000000000000
[15:13:22.710] 0000000000000258: 0x0000000000000000
[15:13:22.711] 000000000000025c: 0x0000000000000000
[15:13:22.711] 0000000000000260: 0x0000000000000000
[15:13:22.712] 0000000000000264: 0x0000000000000000
[15:13:22.712] 0000000000000268: 0x0000000000000000
[15:13:22.712] 000000000000026c: 0x0000000000000000
[15:13:22.734] 0000000000000270: 0x0000000000000000
[15:13:22.734] 0000000000000274: 0x0000000000000000
[15:13:22.735] 0000000000000278: 0x0000000000000000
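Both dumps show the same two non-zero gicd_ispendr words (offsets 0x218 and 0x21c). The sketch below converts them to interrupt IDs, assuming the usual GIC distributor layout in which GICD_ISPENDRn sits at offset 0x200 + 4*n and covers INTIDs 32*n to 32*n + 31. The function name is ours, for illustration only.

```python
def pending_intids(ispendr):
    """Map {register offset: value} from a GICD_ISPENDR dump to pending INTIDs."""
    intids = []
    for offset, value in sorted(ispendr.items()):
        n = (offset - 0x200) // 4          # index of GICD_ISPENDRn
        for bit in range(32):
            if value & (1 << bit):
                intids.append(32 * n + bit)
    return intids


# Non-zero words copied from the two dumps above (identical in both crashes).
dump = {0x218: 0x00010000, 0x21C: 0x00020000}

intids = pending_intids(dump)
print(intids)                      # [208, 241]
print([i - 32 for i in intids])    # SPI numbers: [176, 209]
```

We have not yet matched INTIDs 208 and 241 (SPI 176 and 209) to peripherals in the device tree, so we cannot say whether they are related to the crash.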
Here are our observations:
- In both cases X30/LR is 0x0000000050000c50, so the routine that crashes is called from the same location.
- The last log line before the crash differs: in one case it is `NvmExpressPassThru: Timeout occurs for an NVMe command.` and in the other it is `LocatePciExpressCapabilityRegBlock: [01|00|00] failed to access config space at offset 0x100`.
- When the crash occurs, `PCIe Controller-4 Link is DOWN` is displayed. This is not the case during a normal/working boot.
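Besides X30, the elr_el3 value can be compared against the "Loading driver at" lines to guess which image was executing. The following is a rough sketch of that lookup; the helper names are ours, and because the log does not print image sizes, the result is only a candidate. As a sanity check, it reports whether the address lies below the image's entry point, which at least guarantees it is inside that image.

```python
import re

LOAD_RE = re.compile(
    r"Loading driver at (0x[0-9A-Fa-f]+) EntryPoint=(0x[0-9A-Fa-f]+) (\S+)"
)


def parse_loads(log):
    """Return (base, entry_point, name) for every 'Loading driver at' line."""
    return [
        (int(m.group(1), 16), int(m.group(2), 16), m.group(3))
        for m in LOAD_RE.finditer(log)
    ]


def locate(loads, addr):
    """Pick the image with the highest load base that is <= addr."""
    below = [image for image in loads if image[0] <= addr]
    if not below:
        return None
    base, entry, name = max(below)
    return {"name": name, "offset": addr - base, "before_entry": addr < entry}


# Two lines copied from the second crash log above.
LOG = """
[15:13:21.279] Loading driver at 0x00467A72000 EntryPoint=0x00467A7C354 FmpDxe.efi
[15:13:21.280] Loading driver at 0x00467A60000 EntryPoint=0x00467A6A42C PciHostBridgeDxe.efi
"""

hit = locate(parse_loads(LOG), 0x467A63C18)   # elr_el3 from the second crash
print(hit["name"], hex(hit["offset"]), hit["before_entry"])
# PciHostBridgeDxe.efi 0x3c18 True
```

For the second crash this suggests elr_el3 is at offset 0x3c18 inside PciHostBridgeDxe.efi. For the first crash, elr_el3 (0x4686c6280) lies well above every image listed in the excerpt, so it presumably belongs to something loaded before the excerpt starts.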
To perform the reboot cycles without reaching the kernel or user space, we modified UEFI to trigger a reboot by calling the RuntimeServiceResetSystem function in edk2/MdeModulePkg/Universal/ResetSystemRuntimeDxe/ResetSystem.c.
This in turn calls the ResetCold function in ./ArmPkg/Library/ArmSmcPsciResetSystemLib/ArmSmcPsciResetSystemLib.c, which invokes tegra_soc_prepare_system_reset via an SMC instruction.
In tegra_soc_prepare_system_reset we added a modification that clears the scratch register RSV109_lo; this forces the device to always use slot 0, no matter what. Then we let it run until we see a crash. :-)
I can send our patches to replicate the boot loop.
The `PCIe Controller-4 Link is DOWN` message appears 20-30 times in ~5000 boot cycles, and it is always present when there is a crash. We are going to dig deeper into this, but any help would be greatly appreciated.
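To put firmer numbers on that correlation, the captured serial output can be tallied per boot. This is a minimal sketch, assuming the capture has already been split into one string per boot cycle (how to split it depends on the capture setup and is not shown); the function name and the sample boots are made up for illustration and are not real data.

```python
from collections import Counter

LINK_DOWN = "PCIe Controller-4 Link is DOWN"
CRASH = "Unhandled Exception in EL3"


def tally(boots):
    """Count boots that show the link-down message, the EL3 crash, or both."""
    counts = Counter(boots=0, link_down=0, crash=0, link_down_and_crash=0)
    for text in boots:
        down = LINK_DOWN in text
        crash = CRASH in text
        counts["boots"] += 1
        counts["link_down"] += int(down)
        counts["crash"] += int(crash)
        counts["link_down_and_crash"] += int(down and crash)
    return counts


# Synthetic sample: three hypothetical boots, not taken from our test run.
sample = [
    "PCIe Controller-1 Link is DOWN\n",
    "PCIe Controller-1 Link is DOWN\nPCIe Controller-4 Link is DOWN\n",
    "PCIe Controller-4 Link is DOWN\nUnhandled Exception in EL3.\n",
]

result = tally(sample)
print(dict(result))
# {'boots': 3, 'link_down': 2, 'crash': 1, 'link_down_and_crash': 1}
```

Comparing `crash` with `link_down_and_crash` shows whether every crash had the link-down message, and comparing `link_down` with `link_down_and_crash` shows how often the link was down without a crash.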