System Reboot After Multiple Suspends on Custom Carrier Board

Hello NVIDIA,

HW: Orin NX 16G
BSP Version: R36.4
Jetpack Version: 6.2
Suspend behaves differently on our custom carrier board than on the official devkit. On the devkit, suspend can be used repeatedly without issues. On our custom board, however, the second suspend attempt fails with a kernel exception.

Below is the log from the first successful suspend:

nvidia@miivii-tegra:~$ sudo systemctl suspend
nvidia@miivii-tegra:~$ DCE RM Suspend rmStatus:0x0
[  151.072827] tegra-ivc-bus bc00000.rtcpu:ivc-bus:echo@0: ivc channel driver missing
[  151.072836] tegra-ivc-bus bc00000.rtcpu:ivc-bus:dbg@1: ivc channel driver missing
[  151.072840] tegra-ivc-bus bc00000.rtcpu:ivc-bus:dbg@2: ivc channel driver missing
[  151.072845] tegra-ivc-bus bc00000.rtcpu:ivc-bus:diag@5: ivc channel driver missing
clk_mach_suspend_early
clk_mach_suspend_early done
suspended vdd_core @ 867625uV
clk_mach_suspend
clk_mach_suspend done
fmon_suspend done
adc_suspend done
WAKE_MASK[31:0]  = 0x21000000
WAKE_MASK[63:32] = 0x0
WAKE_MASK[95:64] = 0x17f200
TIER0[31:0]      = 0x0
TIER0[63:32]     = 0x0
TIER0[95:64]     = 0x0
TIER1[31:0]      = 0x0
TIER1[63:32]     = 0x0
TIER1[95:64]     = 0x0
TIER2[31:0]      = 0x21000000
TIER2[63:32]     = 0x0
TIER2[95:64]     = 0x7f200
[0171.014] I> MB1 (version: 1.4.0.4-t234-54845784-82c7462e)
[0171.020] I> t234-A01-1-Silicon (0x12347) Prod
[0171.026] I> Boot-mode : SC7 Exit
[0171.030] I> Entry timestamp: 0x00000000
[0171.035] I> last_boot_error: 0x0
[0171.039] I> BR-BCT: preprod_dev_sign: 0
[0171.043] I> rst_source: 0xc, rst_level: 0x3
[0171.049] I> Task: SE error check
[0171.053] I> Task: Enable SLCG
[0171.056] I> Task: CRC check
[0171.060] I> Task: Crypto init
[0171.064] I> Task: NVRNG health check
[0171.068] I> NVRNG: Health check success
[0171.073] I> Task: MSS Bandwidth limiter settings for iGPU clients
[0171.080] I> Task: Verify SDRAM params
[0171.086] I> Task: MSS Encrypt status check
[0171.091] I> Task: NV_SC7 integrity check
[0171.100] I> Task: Initialize SOC Therm
[0171.105] I> Task: Program NV master stream id
[0171.110] I> Task: Verify boot mode
[0171.117] I> Task: Alias fuses
[0171.121] W> FUSE_ALIAS: Fuse alias on production fused part is not supported.
[0171.130] I> Task: Print SKU type
[0171.134] I> FUSE_OPT_CCPLEX_CLUSTER_DISABLE = 0x000001c0
[0171.140] I> FUSE_OPT_GPC_DISABLE = 0x00000002
[0171.145] I> FUSE_OPT_TPC_DISABLE = 0x000000f0
[0171.151] I> FUSE_OPT_DLA_DISABLE = 0x00000000
[0171.156] I> FUSE_OPT_PVA_DISABLE = 0x00000000
[0171.161] I> FUSE_OPT_NVENC_DISABLE = 0x00000000
[0171.166] I> FUSE_OPT_NVDEC_DISABLE = 0x00000000
[0171.172] I> FUSE_OPT_FSI_DISABLE = 0x00000001
[0171.177] I> FUSE_OPT_EMC_DISABLE = 0x00000000
[0171.182] I> FUSE_BOOTROM_PATCH_VERSION = 0x7
[0171.187] I> FUSE_PSCROM_PATCH_VERSION = 0x7
[0171.192] I> FUSE_OPT_ADC_CAL_FUSE_REV = 0x2
[0171.197] I> FUSE_SKU_INFO_0 = 0xd3
[0171.202] I> FUSE_OPT_SAMPLE_TYPE_0 = 0x3 PS
[0171.207] I> FUSE_PACKAGE_INFO_0 = 0x2
[0171.211] I> SKU: Prod
[0171.214] I> Task: Boost clocks
[0171.218] I> Initializing NAFLL for BPMP_CPU_NIC.
[0171.225] I> BPMP NAFLL: fll_lock = 1, dvco_min_reached = 0
[0171.231] I> BPMP NAFLL lock success.
[0171.236] I> BPMP_CPU_NIC : src = 42, divisor = 0
[0171.241] I> Initializing PLLC2 for AXI_CBB.
[0171.247] I> AXI_CBB : src = 35, divisor = 0
[0171.252] I> Task: Voltage monitor
[0171.256] I> VMON: Vmon re-calibration and fine tuning done
[0171.263] I> Task: TSC init
[0171.266] I> Task: Thermal check
[0171.270] I> BCT max_tmon_limit = 105
[0171.275] I> BCT min_tmon_limit = -28
[0171.279] I> BCT max_tmon_limit = 105
[0171.284] I> BCT min_tmon_limit = -28
[0171.288] I> SKU specific max_chip_limit = 105
[0171.293] I> SKU specific min_chip_limit = -28
[0171.299] I> BCT max_chip_limit = 105
[0171.303] I> BCT min_chip_limit = -28
[0171.308] I> enable_soctherm_polling = 0
[0171.312] I> max temp read = 44
[0171.316] I> min temp read = 42
[0171.320] I> Enabling thermtrip
[0171.324] I> Task: Update FSI SCR with thermal fuse data
[0171.330] I> Task: Enable WDT 5th expiry
[0171.335] I> Task: I2C register
[0171.339] I> Task: Set I2C bus freq
[0171.343] I> Task: Reset FSI
[0171.347] I> Task: Pinmux init
[0171.354] I> skipped mmio_addr = 0xc2f1080
[0171.360] I> skipped mmio_addr = 0xc2f1460
[0171.366] I> skipped mmio_addr = 0xc2f1400
[0171.371] I> skipped mmio_addr = 0xc2f1420
[0171.377] I> skipped mmio_addr = 0xc2f142c
[0171.383] I> skipped mmio_addr = 0xc2f1430
[0171.388] I> skipped mmio_addr = 0xc2f1440
[0171.394] I> skipped mmio_addr = 0xc2f144c
[0171.400] I> skipped mmio_addr = 0xc2f1450
[0171.406] I> skipped mmio_addr = 0xc2f1880
[0171.411] I> skipped mmio_addr = 0xc2f188c
[0171.417] I> skipped mmio_addr = 0xc2f1890
[0171.423] I> skipped mmio_addr = 0xc2f18a0
[0171.428] I> skipped mmio_addr = 0xc2f18ac
[0171.434] I> skipped mmio_addr = 0xc2f18b0
[0171.440] I> skipped mmio_addr = 0xc2f1a60
[0171.446] I> skipped mmio_addr = 0xc2f1a6c
[0171.451] I> skipped mmio_addr = 0xc2f1a70
[0171.459] I> skipped mmio_addr = 0x9240008
[0171.464] I> skipped mmio_addr = 0x9240000
[0171.470] I> skipped mmio_addr = 0x9240010
[0171.476] I> skipped mmio_addr = 0x9240018
[0171.481] I> skipped mmio_addr = 0x9240020
[0171.487] I> skipped mmio_addr = 0x9240030
[0171.493] I> skipped mmio_addr = 0x9240028
[0171.499] I> skipped mmio_addr = 0x9240038
[0171.504] I> skipped mmio_addr = 0x9240040
[0171.510] I> skipped mmio_addr = 0x9240048
[0171.516] I> skipped mmio_addr = 0x9241000
[0171.521] I> skipped mmio_addr = 0x9241008
[0171.527] I> skipped mmio_addr = 0x9241010
[0171.533] I> skipped mmio_addr = 0x9241018
[0171.539] I> skipped mmio_addr = 0x9241020
[0171.544] I> skipped mmio_addr = 0x9241028
[0171.550] I> skipped mmio_addr = 0x9241030
[0171.556] I> skipped mmio_addr = 0x9241038
[0171.562] I> skipped mmio_addr = 0x9241040
[0171.567] I> skipped mmio_addr = 0x9242000
[0171.573] I> skipped mmio_addr = 0x9242008
[0171.579] I> Task: Prod config init
[0171.583] I> Task: Pad voltage init
[0171.588] I> Task: Prod init
[0171.592] I> Task: Common rail init
[0171.596] I> DONE: Thermal config
[0171.602] W> DEVICE_PROD: module = 13, instance = 4 not found in device prod.
[0171.612] I> DONE: SOC rail config
[0171.617] W> PMIC_CONFIG: Rail: MEMIO rail config not found in MB1 BCT.
[0171.625] I> DONE: MEMIO rail config
[0171.630] W> PMIC_CONFIG: Rail: GPU rail info not found in MB1 BCT.
[0171.638] I> DONE: GPU rail info
[0171.643] W> PMIC_CONFIG: Rail: CV rail info not found in MB1 BCT.
[0171.650] I> DONE: CV rail info
[0171.654] I> Task: Misc. board config
[0171.660] I> PMIC_CONFIG: Platform config not found in MB1 BCT.
[0171.666] I> Task: Enable clock-mon
[0171.672] I> FMON: Fmon re-programming done
[0171.677] I> Task: CCPLEX IST init
[0171.681] I> Task: CPU WP0
[0171.685] I> Loading MCE
[0171.696] I> Sending WP0 mailbox command to PSC
[0171.706] I> Task: MB1 fixed firewalls
[0171.733] I> Task: BPMP fw ast config
[0171.737] I> Task: Load nvdec-fw
[0171.755] I> Task: Program TSEC carveout
[0171.759] I> TSEC-FW load not enabled
[0171.764] I> Task: GPIO interrupt map
[0171.768] I> Task: Unpowergate AOPG CAN
[0171.773] I> Task: Misc NV security settings
[0171.778] I> NVDEC sticky bits programming done
[0171.784] I> Successfully powergated NVDEC
[0171.789] I> Task: Disable/Reload WDT
[0171.793] I> Task: Disable SCPM/POD reset
[0171.798] I> Task: Program BPMP-IPC carveouts
[0171.803] I> Program IPC carveouts
[0171.809] I> SLCG Global override status := 0x0
[0171.814] I> MB1: MSS reconfig completed
[0171.823] I> Program carveout CARVEOUT_TEMP_MB2RF before image load
I> MB2 (version: 0.0.0.0-t234-54845784-db255de9)
I> t234-A01-1-Silicon (0x12347)
I> Boot-mode : SC7 Exit
I> Emulation:
I> Entry timestamp: 0x0a3e5dfc
I> Regular heap: [base:0x40040000, size:0x2000]
I> Task: SE error check
I> Task: Crypto init
I> Task: Enable CCPLEX WDT 5th expiry
I> Task: ARI update carveout TZDRAM
I> Task: Configure OEM set LA/PTSA values
I> Task: Check MC errors
I> Task: SMMU external bypass disable
I> Task: Enable hot-plug capability
I> Task: PSC mailbox init
I> Task: Enable clock for external modules
I> Task: OEM SC7 context integrity check
I> Task: Restore FSI padctl config
I> Task: Program CBB PCIE AMAP regions
I> Task: Load and authenticate registered FWs
I> Task: Load AUXP FWs
I> Skipping SCE FW load
I> Successfully register RCE FW load task with MB2 loader
I> Successfully register DCE FW load task with MB2 loader
I> Unpowergating APE
I> Unpowergate done
I> Successfully register APE FW load task with MB2 loader
I> Skipping FSI FW load
I> Successfully register XUSB FW load task with MB2 loader
I> Successfully register PVA FW load task with MB2 loader
I> Binary rce scrubbed successfully
W> FW header for binary dce not saved in context!
W> FW data for binary dce not saved in context!
W> Skip post read scrub for binary dce
I> rce: Authentication Finalize Done
I> Binary rce loaded successfully at 0x472a00000
I> Binary ape scrubbed successfully
I> dce: Authentication Finalize Done
I> Binary dce loaded successfully at 0x47a000000
I> Binary xusb scrubbed successfully
I> ape: Authentication Finalize Done
I> Binary ape loaded successfully at 0x47d800000
I> Binary pva-fw scrubbed successfully
I> xusb: Authentication Finalize Done
I> Binary xusb loaded successfully at 0x472f00000
I> pva-fw: Authentication Finalize Done
I> Binary pva-fw loaded successfully at 0x473180000
I> Task: Check MC errors
I> Task: Carveout setup
I> Scrub remaining OEM carveouts
W> Skip scrubbing unallocated co:41
I> Program remaining OEM carveouts
I> Task: Enable FSITHERM
I> Task: Enable FSI VMON
I> Task: Validate FSI Therm readings
I> Task: Enable FSI SE clock
I> Task: Initialize SBSA UART CAR
I> Task: Unpowergate APE
W> mb2_unpowergate_ape: skip! APE is in unpowergated state
I> Task: Memctrl reconfig pending clients
I> Task: OEM firewalls
I> OEM firewalls configured
I> Task: Powergate APE
I> Powergating APE
I> Powergate done
I> Task: OEM firewall restore saved settings
I> Task: Deinit UART
adc_resume done
clk_mach_resume
clk_mach_resume done
hwwdt_init: WDT boot cfg 0x710010 sts 0x10
I> Task: Trigger load FSI keyblob
I> Task: Complete load FSI keyblob
clk_mach_resume_late
I> Task: MB2-PSC_FW Key Manager Init
I> Sending opcode OP_PSC_KEY_MANAGER to psc-fw
I> Sending opcode 0x4b45594d to psc
I> Received ACK from psc
I> Task: Unhalt FSI
I> FSI unhalt skipped
I> Task: Unhalt AUXPs
I> SCE unhalt skipped
I> Unhalting RCE
I> RCE unhalt successful
I> Unhalting DCE
I> DCE unhalt successful
I> APE unhalt skipped
I> Task: Load HV/CPUBL
I> Task: Load TOS
clk_mach_resume_late done
[   171.973673] Camera-FW on t234-rce-safe started
TCU early console enabled.
DCE Started
DCE_R5_Init
I> Task: Trigger load TSEC leyblob
MPU enabled
DCE_SW_Init
I> Sending opcode 0x53535452 to psc
I> Sent opcode to psc
I> Task: Load and authenticate registered FWs
I> Task: Disable MSS perf stats
I> Task: Program display sticky bits
I> Task: SMMU init
STATUS_R[31:0]  = 0x0
STATUS_R[63:32] = 0x0
STATUS_R[95:64] = 0x120000
W> smmu ctx restore: addr is NULL!!
I> Task: Program GICv3 registers
I> Task: Audit firewall settings
I> Task: Lock fusing
I> MB2 finished

NOTICE:  tegra_soc_pwr_domain_on_finish: exited SC7 successfully. Entering normal world.
[   172.150896] Camera-FW on t234-rce-safe ready SHA1=e2238c99 (crt 2.752 ms, total boot 180.183 ms)
Admin Task Init
Admin Task Init complete
Print Task Init
RM Task Init
SHA Task Init
Admin Task Started
DCE SC7 SHA Enabled
RM Task Started
RM Task Running
Print Task Started
Print Task Running
SHA Task Started
DCE: FW Boot Complete
Admin Task Running
SHA Task Running
TCU debug prints will be routed to traces.
[  152.467733] tegra-ivc-bus bc00000.rtcpu:ivc-bus:echo@0: ivc channel driver missing
[  152.467738] tegra-ivc-bus bc00000.rtcpu:ivc-bus:dbg@1: ivc channel driver missing
[  152.467740] tegra-ivc-bus bc00000.rtcpu:ivc-bus:dbg@2: ivc channel driver missing
[  152.467743] tegra-ivc-bus bc00000.rtcpu:ivc-bus:diag@5: ivc channel driver missing
Starting RmBootstrap
Registered event_type:[0] for dce_core_ipc_type:[1]
Registered event_type:[1] for dce_core_ipc_type:[3]
dce_ipc State Initialized
RmBootstrap completed successfully
Resume RM
DCE RM resume rmStatus:0x0 ret:true

And here is the log from the second suspend attempt, which fails:

DCE RM Suspend rmStatus:0x0
[  111.096831] tegra-ivc-bus bc00000.rtcpu:ivc-bus:echo@0: ivc channel driver missing
[  111.096846] tegra-ivc-bus bc00000.rtcpu:ivc-bus:dbg@1: ivc channel driver missing
[  111.096852] tegra-ivc-bus bc00000.rtcpu:ivc-bus:dbg@2: ivc channel driver missing
[  111.096859] tegra-ivc-bus bc00000.rtcpu:ivc-bus:diag@5: ivc channel driver missing
[  111.730803] IRQ298: set affinity failed(-22).
[  111.733789] IRQ298: set affinity failed(-22).
[  111.736581] IRQ298: set affinity failed(-22).
[  111.739230] IRQ298: set affinity failed(-22).
[  111.744413] IRQ298: set affinity failed(-22).
[  111.795557] IRQ298: set affinity failed(-22).
[  111.888461] IRQ298: set affinity failed(-22).
[   144.275715] Camera-FW on t234-rce-safe started
TCU early console enabled.
[   144.315494] Camera-FW on t234-rce-safe ready SHA1=e2238c99 (crt 0.574 ms, total boot 40.395 ms)
TCU debug prints will be routed to traces.
[  115.209326] tegra-ivc-bus bc00000.rtcpu:ivc-bus:echo@0: ivc channel driver missing
[  115.209334] tegra-ivc-bus bc00000.rtcpu:ivc-bus:dbg@1: ivc channel driver missing
[  115.209339] tegra-ivc-bus bc00000.rtcpu:ivc-bus:dbg@2: ivc channel driver missing
[  115.209344] tegra-ivc-bus bc00000.rtcpu:ivc-bus:diag@5: ivc channel driver missing
[  115.209363] tegra-hsp b950000.tegra-hsp: Try increasing MBOX_TX_QUEUE_LEN
[  115.209373] tegra-hsp b950000.tegra-hsp: Try increasing MBOX_TX_QUEUE_LEN
[  115.209400] tegra-hsp b950000.tegra-hsp: Try increasing MBOX_TX_QUEUE_LEN
[  115.209414] tegra-hsp b950000.tegra-hsp: Try increasing MBOX_TX_QUEUE_LEN
[  146.400238] PM: dpm_run_callback(): nv_pmops_resume+0x0/0x50 [nvidia] returns -5
[  146.400492] nv_platform 13800000.display: PM: failed to resume: error -5
[  146.482899] r8168 0008:01:00.0 eth0: Device reseting!
[  146.585595] NVRM nvAssertFailedNoLog: Assertion failed: pRpc != NULL @ resource.c:308
[  146.585606] CPU: 6 PID: 4542 Comm: Xorg Tainted: G           OE     5.15.148-tegra #1
[  146.585611] Hardware name: NVIDIA NVIDIA Jetson Orin NX Engineering Reference Developer Kit Super/Jetson, BIOS beta-v4.0.0-9c0ffaf0d 05/08/2025
[  146.585613] Call trace:
[  146.585613]  dump_backtrace+0x0/0x1c0
[  146.585624]  show_stack+0x34/0x50
[  146.585628]  dump_stack_lvl+0x68/0x84
[  146.585634]  dump_stack+0x18/0x34
[  146.585637]  os_dump_stack+0x1c/0x28 [nvidia]
[  146.585755]  nvAssertFailedBacktrace.part.0+0x80/0x90 [nvidia]
[  146.585852]  rmresControl_Prologue_IMPL+0x114/0x1d0 [nvidia]
[  146.585945]  resControl_IMPL+0xf0/0x1e0 [nvidia]
[  146.586034]  serverControl+0x1b4/0x2c0 [nvidia]
[  146.586123]  _rmapiRmControl+0x2d8/0x480 [nvidia]
[  146.586213]  rmapiControlWithSecInfo+0xa8/0x150 [nvidia]
[  146.586302]  rmapiControlWithSecInfoTls+0x74/0xe0 [nvidia]
[  146.586391]  _nv04ControlWithSecInfo.constprop.0+0x38/0x50 [nvidia]
[  146.586480]  Nv04ControlKernel+0x50/0x60 [nvidia]
[  146.586568]  nvkms_call_rm+0x60/0x9c [nvidia_modeset]
[  146.586646]  nvRmApiControl+0x50/0x70 [nvidia_modeset]
[  146.586709]  nvidia_frontend_unlocked_ioctl+0x60/0x80 [nvidia]
[  146.586804]  __arm64_sys_ioctl+0xb4/0x100
[  146.586809]  invoke_syscall+0x5c/0x130
[  146.586813]  el0_svc_common.constprop.0+0x64/0x110
[  146.586815]  do_el0_svc+0x74/0xa0
[  146.586818]  el0_svc+0x28/0x80
[  146.586820]  el0t_64_sync_handler+0xa4/0x130
[  146.586822]  el0t_64_sync+0x1a4/0x1a8
[  146.586927] Unable to handle kernel NULL pointer dereference at virtual address 00000000000008c0
[  146.586934] Mem abort info:
[  146.586935]   ESR = 0x0000000096000004
[  146.586937]   EC = 0x25: DABT (current EL), IL = 32 bits
[  146.586939]   SET = 0, FnV = 0
[  146.586941]   EA = 0, S1PTW = 0
[  146.586942]   FSC = 0x04: level 0 translation fault
[  146.586943] Data abort info:
[  146.586944]   ISV = 0, ISS = 0x00000004
[  146.586945]   CM = 0, WnR = 0
[  146.586947] user pgtable: 4k pages, 48-bit VAs, pgdp=00000001704af000
[  146.586949] [00000000000008c0] pgd=0000000000000000, p4d=0000000000000000
[  146.586955] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
[  146.586960] Modules linked in: miivii_eeprom(E) xt_conntrack(E) xt_MASQUERADE(E) nfnetlink(E) ip6table_nat(E) nvidia_drm(OE) ip6table_filter(E) ip6_tables(E) nvidia_modeset(OE) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) libcrc32c(E) xt_addrtype(E) iptable_filter(E) lzo_rle(E) lzo_compress(E) zram(E) zsmalloc(E) nvme_fabrics(E) ramoops(E) reed_solomon(E) bridge(E) stp(E) llc(E) usb_f_ncm(E) usb_f_mass_storage(E) usb_f_acm(E) u_serial(E) usb_f_rndis(E) u_ether(E) libcomposite(E) snd_soc_tegra186_asrc(OE) snd_soc_tegra210_admaif(OE) snd_soc_tegra210_ope(OE) snd_soc_tegra186_arad(OE) snd_soc_tegra210_afc(OE) snd_soc_tegra_pcm(E) snd_soc_tegra210_mixer(OE) snd_soc_tegra210_mvc(OE) snd_soc_tegra186_dspk(OE) snd_soc_tegra210_amx(OE) snd_soc_tegra210_adx(OE) snd_soc_tegra210_sfc(OE) snd_soc_tegra210_i2s(OE) snd_soc_tegra210_dmic(OE) r8168(OE) snd_soc_tegra210_ahub(OE) tegra210_adma(E) spidev(E) nvvrs_pseq_rtc(OE) snd_soc_tegra_machine_driver(OE)
[  146.587023]  snd_soc_tegra_utils(OE) tegra234_oc_event(OE) snd_soc_simple_card_utils(E) joydev(E) crct10dif_ce(E) nvpmodel_clk_cap(OE) thermal_trip_event(OE) tegra_cactmon_mc_all(OE) tegra234_aon(OE) tegra_aconnect(E) pwm_tegra_tachometer(OE) snd_hda_codec_hdmi(E) snd_hda_tegra(E) spi_tegra114(E) snd_hda_codec(E) snd_hda_core(E) igc(E) at24(E) mv_max96724(OE) tegra_pcie_dma_test(OE) tegra_pcie_edma(OE) mc_hwpm(OE) nvidia(OE) host1x_fence(OE) tegra_dce(OE) nvidia_vrs_pseq(OE) tsecriscv(OE) nvhost_isp5(OE) nvhost_vi5(OE) nvhost_nvcsi_t194(OE) tegra_camera(OE) v4l2_dv_timings(E) nvhost_nvcsi(OE) tegra_camera_platform(OE) capture_ivc(OE) tegra_camera_rtcpu(OE) governor_userspace(E) ivc_bus(OE) hsp_mailbox_client(OE) ivc_ext(OE) tegra_drm(OE) v4l2_fwnode(E) v4l2_async(E) videobuf2_dma_contig(E) videobuf2_memops(E) nvhost_pva(OE) videobuf2_v4l2(E) tegra_wmark(OE) nvhost_nvdla(OE) videobuf2_common(E) videodev(E) cec(E) nvhwpm(OE) drm_kms_helper(E) nvhost_capture(OE) mc(E) host1x_nvhost(OE)
[  146.587069]  rtw_8852be(E) rtw_8852b(E) rtw89pci(E) rtw89core(E) mac80211(E) cfg80211(E) nvidia_p2p(OE) ina3221(E) nvgpu(OE) governor_pod_scaling(OE) host1x(OE) mc_utils(OE) nvmap(OE) nvsciipc(OE) mcp251xfd(E) mttcan(OE) nvpps(OE) can_dev(E) can_raw(E) can(E) quota_v2(E) quota_tree(E) ch9344(E) drm(E) fuse(E) ip_tables(E) x_tables(E) ipv6(E) pwm_fan pwm_tegra tegra_bpmp_thermal tegra_xudc ucsi_ccg typec_ucsi typec nvme nvme_core phy_tegra194_p2u pcie_tegra194
[  146.587102] CPU: 5 PID: 4909 Comm: gnome-shell Tainted: G           OE     5.15.148-tegra #1
[  146.587105] Hardware name: NVIDIA NVIDIA Jetson Orin NX Engineering Reference Developer Kit Super/Jetson, BIOS beta-v4.0.0-9c0ffaf0d 05/08/2025
[  146.587107] pstate: 00400009 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[  146.587110] pc : nvInitFlipEvoHwState+0x38/0x140 [nvidia_modeset]
[  146.587184] lr : nvFlipEvo+0x1e4/0x9f0 [nvidia_modeset]
[  146.587245] sp : ffff800017fc3b90
[  146.587246] x29: ffff800017fc3d30 x28: 0000000000000000 x27: ffff800013d1d0d0
[  146.587249] x26: 0000000000000000 x25: ffff000096b00008 x24: 0000000000000000
[  146.587251] x23: ffff000096b00008 x22: 0000000000000000 x21: ffff800013d1d0d0
[  146.587252] x20: 0000000000000000 x19: ffff80000d961008 x18: 0000000000000000
[  146.587254] x17: 0000000000000000 x16: ffffcc0d35ab7240 x15: 0000ffffdfe67400
[  146.587256] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
[  146.587258] x11: 0000000000000001 x10: ffff000088f88850 x9 : 0000000000000000
[  146.587260] x8 : ffff800013d46508 x7 : 0000000000000000 x6 : 000000000000003f
[  146.587262] x5 : 0000000000000040 x4 : fffffffffffffff0 x3 : 0000000000000000
[  146.587263] x2 : 00000000000009e0 x1 : 0000000000000000 x0 : ffff800013d1d0d0
[  146.587266] Call trace:
[  146.587267]  nvInitFlipEvoHwState+0x38/0x140 [nvidia_modeset]
[  146.587326]  nvidia_frontend_unlocked_ioctl+0x60/0x80 [nvidia]
[  146.587434]  __arm64_sys_ioctl+0xb4/0x100
[  146.587439]  invoke_syscall+0x5c/0x130
[  146.587443]  el0_svc_common.constprop.0+0x64/0x110
[  146.587446]  do_el0_svc+0x74/0xa0
[  146.587449]  el0_svc+0x28/0x80
[  146.587452]  el0t_64_sync_handler+0xa4/0x130
[  146.587454]  el0t_64_sync+0x1a4/0x1a8
[  146.587457] Code: aa0303e0 52800001 f94236e3 8b160063 (f9446073)
[  146.587461] ---[ end trace 7ff1fd8bca1d1b17 ]---
[  146.600773] PM: suspend exit

Could you please help us analyze the reason for the failure during the second suspend? From our side, we suspect it might be related to nvgpu.
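
One quick way to see which driver modules the call traces in the failing log run through is to tally the module tag at the end of each stack frame. The following is a minimal Python sketch, not a diagnosis tool: the function name and the "core kernel" label are illustrative choices, and the sample frames are copied from the failing log above. It only shows where the traces pass, not the root cause.

```python
import re
from collections import Counter

# Matches a kernel stack frame line such as:
#   [  146.586568]  nvkms_call_rm+0x60/0x9c [nvidia_modeset]
# Frames without a trailing [module] tag belong to the core kernel image.
FRAME_RE = re.compile(
    r"^\[\s*\d+\.\d+\]\s+(\S+)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(\w+)\])?\s*$"
)


def tally_trace_modules(log_text):
    """Count stack frames per kernel module in a dmesg-style log."""
    counts = Counter()
    for line in log_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            counts[match.group(2) or "core kernel"] += 1
    return counts


# Four frames copied from the failing log in this post.
SAMPLE = """\
[  146.586568]  nvkms_call_rm+0x60/0x9c [nvidia_modeset]
[  146.586646]  nvRmApiControl+0x50/0x70 [nvidia_modeset]
[  146.586709]  nvidia_frontend_unlocked_ioctl+0x60/0x80 [nvidia]
[  146.586804]  __arm64_sys_ioctl+0xb4/0x100
"""

for module, frames in tally_trace_modules(SAMPLE).most_common():
    print(f"{module}: {frames}")
```

Feeding the full log through `tally_trace_modules` gives a per-module frame count that can be compared against the modules named in the resume error lines.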

Thank you very much.

Hi HuangZeng,

Are you using JetPack 6.1 (r36.4.0) or JetPack 6.2 (r36.4.3)?
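
If it helps to confirm the exact release on the target, the version can be read from `/etc/nv_tegra_release`. The sketch below is a hedged Python example: it assumes the first line of that file has the form `# R36 (release), REVISION: 4.3, ...`, and the lookup table contains only the two releases named in this question. The helper names are illustrative.

```python
import re

RELEASE_FILE = "/etc/nv_tegra_release"

# Only the two releases mentioned in this thread; not a complete table.
JETPACK_BY_L4T = {"R36.4.0": "JetPack 6.1", "R36.4.3": "JetPack 6.2"}


def parse_l4t_release(first_line):
    """Return 'R<major>.<revision>' from the release line, or None if it does not match.

    Assumed line format: '# R36 (release), REVISION: 4.3, GCID: ..., BOARD: ...'
    """
    match = re.search(r"R(\d+)\s+\(release\),\s+REVISION:\s+([\d.]+)", first_line)
    return f"R{match.group(1)}.{match.group(2)}" if match else None


def describe(first_line):
    """Format the parsed release together with its JetPack name, if known."""
    release = parse_l4t_release(first_line)
    if release is None:
        return "unrecognized release string"
    return f"{release} ({JETPACK_BY_L4T.get(release, 'JetPack version not in table')})"


if __name__ == "__main__":
    try:
        with open(RELEASE_FILE) as handle:
            print(describe(handle.readline()))
    except FileNotFoundError:
        print(f"{RELEASE_FILE} not found; this does not look like a Jetson Linux rootfs")
```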

What is the failure rate in your case?
Do you have any kernel customizations? What is connected to your board?

Could you try disabling 13800000.display in the device tree to check whether that helps?

Hi,

  1. To clarify, we are using BSP version R36.4.3 (JetPack 6.2).
  2. The failure rate is 100%: following the steps below, the issue always occurs. The sequence in detail:
  • Power on the board and connect an external HDMI display.
  • Enter suspend mode.
  • Wake up the system using a keyboard.
  • Run reboot, or continue without rebooting.
  • Enter suspend mode again. This fails, and the system automatically reboots after a short time.
  • Rebooting does not recover from the issue; only a full power cycle restores suspend functionality.
  3. We use an HDMI display as our external monitor. We have configured the related pinmux settings and enabled the corresponding hot-plug device tree configuration.
  4. With 13800000.display disabled, suspend works correctly.
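
For repeated testing, the suspend/resume sequence above could be automated along the following lines. This is a rough Python sketch under stated assumptions: it assumes the board's RTC can wake the system (we woke it with a keyboard, so `rtcwake` is untested here), it treats the three strings in `FAILURE_MARKERS`, taken from the failing log, as failure signatures, and all function names are hypothetical. If the board reboots during a cycle, as described in the steps, the script dies with it, so the last printed cycle number is the useful output.

```python
import subprocess

# Failure signatures taken from the failing log in this thread.
FAILURE_MARKERS = (
    "PM: failed to resume",
    "Unable to handle kernel NULL pointer dereference",
    "Internal error: Oops",
)


def find_failures(kernel_log):
    """Return the kernel log lines that contain a known failure signature."""
    return [
        line for line in kernel_log.splitlines()
        if any(marker in line for marker in FAILURE_MARKERS)
    ]


def suspend_via_rtcwake(seconds=20):
    # Assumption: the RTC alarm is a working wake source on this board.
    subprocess.run(["sudo", "rtcwake", "-m", "mem", "-s", str(seconds)], check=True)


def read_kernel_log():
    result = subprocess.run(["sudo", "dmesg"], capture_output=True, text=True, check=True)
    return result.stdout


def run_cycles(count, suspend=suspend_via_rtcwake, read_log=read_kernel_log):
    """Suspend up to `count` times; return (cycle, new_failure_lines) or None."""
    already_seen = set(find_failures(read_log()))  # ignore failures from before the test
    for cycle in range(1, count + 1):
        print(f"suspend cycle {cycle}", flush=True)
        suspend()
        new_failures = [l for l in find_failures(read_log()) if l not in already_seen]
        if new_failures:
            return cycle, new_failures
    return None

# Example (this really suspends the machine, so it is left commented out):
# print(run_cycles(5))
```

The `suspend` and `read_log` parameters are injectable so the loop can be exercised with stand-ins, or swapped for a manual keyboard wake if the RTC is not a usable wake source.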

Let me know if there’s anything else I should check or try. Thank you for your support!

Best regards,
HuangZeng

BTW: suspend also works normally when HDMI is not connected.

Hi,

We also observed a similar situation (system crash on the 2nd suspend) on our custom board with HDMI connected.

Here are our logs:

1st:

test@tegra-ubuntu:~$ [   46.366121] PM: suspend entry (deep)
[   46.376621] Filesystems sync: 0.010 seconds
[   46.378781] Freezing user space processes ... (elapsed 0.002 seconds) done.
[   46.381788] OOM killer disabled.
[   46.381789] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
[   46.382995] printk: Suspending console(s) (use no_console_suspend to debug)
DCE RM Suspend rmStatus:0x0
[   48.418233] dce: dce_handle_irq_status:240  DCE can be safely powered-off now
[   48.419213] tegra-ivc-bus bc00000.rtcpu:ivc-bus:echo@0: ivc channel driver missing
[   48.419222] tegra-ivc-bus bc00000.rtcpu:ivc-bus:dbg@1: ivc channel driver missing
[   48.419226] tegra-ivc-bus bc00000.rtcpu:ivc-bus:dbg@2: ivc channel driver missing
[   48.419231] tegra-ivc-bus bc00000.rtcpu:ivc-bus:diag@5: ivc channel driver missing
[   48.594531] rtk_btusb: btusb_suspend: suspending...
[   48.695378] fusb301 1-0025: fusb301_set_mode: mode (4)(4)
[   48.734978] tegra194-pcie 140a0000.pcie: Link didn't transition to L2 state
[   48.855063] tegra194-pcie 140a0000.pcie: Link didn't go to detect state
[   48.942953] tegra194-pcie 14100000.pcie: Link didn't transition to L2 state
[   49.070882] tegra194-pcie 14100000.pcie: Link didn't go to detect state
[   49.073852] Disabling non-boot CPUs ...
[   49.074280] IRQ240: set affinity failed(-22).
[   49.074287] IRQ241: set affinity failed(-22).
[   49.074290] IRQ244: set affinity failed(-22).
[   49.074450] psci: CPU1 killed (polled 0 ms)
[   49.076728] IRQ240: set affinity failed(-22).
[   49.076735] IRQ241: set affinity failed(-22).
[   49.076739] IRQ244: set affinity failed(-22).
[   49.076932] psci: CPU2 killed (polled 0 ms)
[   49.078749] IRQ240: set affinity failed(-22).
[   49.078755] ?嘌RQ241: set affinity failed(-22).
[   49.078759] IRQ244: set affinity failed(-22).
[   49.078943] psci: CPU3 killed (polled 0 ms)
[   49.080886] IRQ240: set affinity failed(-22).
[   49.080988] psci: CPU4 killed (polled 0 ms)
clk_mach_suspend_early
clk_mach_suspend_early done
suspended vdd_core @ 867625uV
spe entering sc7
clk_mach_suspend
clk_mach_suspend done
fmon_suspend done
adc_suspend done
WAKE_MASK[31:0]  = 0x21000002
WAKE_MASK[63:32] = 0x0
WAKE_MASK[95:64] = 0x17f200
TIER0[31:0]      = 0x0
TIER0[63:32]     = 0x0
TIER0[95:64]     = 0x0
TIER1[31:0]      = 0x0
TIER1[63:32]     = 0x0
TIER1[95:64]     = 0x0
TIER2[31:0]      = 0x21000002
TIER2[63:32]     = 0x0
TIER2[95:64]     = 0x7f200
[0082.359] I> MB1 (version: 1.4.0.4-t234-54845784-82c7462e)
[0082.365] I> t234-A01-1-Silicon (0x12347) Prod
[0082.371] I> Boot-mode : SC7 Exit
[0082.375] I> Entry timestamp: 0x00000000
[0082.379] I> last_boot_error: 0x0
[0082.383] I> BR-BCT: preprod_dev_sign: 0
[0082.388] I> rst_source: 0xc, rst_level: 0x3
[0082.393] I> Task: SE error check
[0082.397] I> Task: Enable SLCG
[0082.401] I> Task: CRC check
[0082.404] I> Task: Crypto init
[0082.408] I> Task: NVRNG health check
[0082.412] I> NVRNG: Health check success
[0082.417] I> Task: MSS Bandwidth limiter settings for iGPU clients
[0082.424] I> Task: Verify SDRAM params
[0082.430] I> Task: MSS Encrypt status check
[0082.435] I> Task: NV_SC7 integrity check
[0082.444] I> Task: Initialize SOC Therm
[0082.449] I> Task: Program NV master stream id
[0082.454] I> Task: Verify boot mode
[0082.461] I> Task: Alias fuses
[0082.466] W> FUSE_ALIAS: Fuse alias on production fused part is not supported.
[0082.474] I> Task: Print SKU type
[0082.478] I> FUSE_OPT_CCPLEX_CLUSTER_DISABLE = 0x000001c8
[0082.484] I> FUSE_OPT_GPC_DISABLE = 0x00000002
[0082.489] I> FUSE_OPT_TPC_DISABLE = 0x000000fc
[0082.494] I> FUSE_OPT_DLA_DISABLE = 0x00000003
[0082.500] I> FUSE_OPT_PVA_DISABLE = 0x00000001
[0082.505] I> FUSE_OPT_NVENC_DISABLE = 0x00000001
[0082.510] I> FUSE_OPT_NVDEC_DISABLE = 0x00000000
[0082.515] I> FUSE_OPT_FSI_DISABLE = 0x00000001
[0082.521] I> FUSE_OPT_EMC_DISABLE = 0x0000000c
[0082.526] I> FUSE_BOOTROM_PATCH_VERSION = 0x7
[0082.531] I> FUSE_PSCROM_PATCH_VERSION = 0x7
[0082.536] I> FUSE_OPT_ADC_CAL_FUSE_REV = 0x2
[0082.541] I> FUSE_SKU_INFO_0 = 0xd6
[0082.545] I> FUSE_OPT_SAMPLE_TYPE_0 = 0x3 PS
[0082.550] I> FUSE_PACKAGE_INFO_0 = 0x2
[0082.555] I> SKU: Prod
[0082.558] I> Task: Boost clocks
[0082.561] I> Initializing NAFLL for BPMP_CPU_NIC.
[0082.568] I> BPMP NAFLL: fll_lock = 1, dvco_min_reached = 0
[0082.574] I> BPMP NAFLL lock success.
[0082.579] I> BPMP_CPU_NIC : src = 42, divisor = 0
[0082.584] I> Initializing PLLC2 for AXI_CBB.
[0082.589] I> AXI_CBB : src = 35, divisor = 0
[0082.594] I> Task: Voltage monitor
[0082.599] I> VMON: Vmon re-calibration and fine tuning done
[0082.605] I> Task: TSC init
[0082.609] I> Task: Thermal check
[0082.613] I> BCT max_tmon_limit = 105
[0082.617] I> BCT min_tmon_limit = -28
[0082.621] I> BCT max_tmon_limit = 105
[0082.626] I> BCT min_tmon_limit = -28
[0082.630] I> SKU specific max_chip_limit = 105
[0082.636] I> SKU specific min_chip_limit = -28
[0082.641] I> BCT max_chip_limit = 105
[0082.645] I> BCT min_chip_limit = -28
[0082.650] I> enable_soctherm_polling = 0
[0082.654] I> max temp read = 40
[0082.658] I> min temp read = 38
[0082.662] I> Enabling thermtrip
[0082.666] I> Task: Update FSI SCR with thermal fuse data
[0082.672] I> Task: Enable WDT 5th expiry
[0082.676] I> Task: I2C register
[0082.680] I> Task: Set I2C bus freq
[0082.684] I> Task: Reset FSI
[0082.688] I> Task: Pinmux init
[0082.695] I> skipped mmio_addr = 0xc2f1040
[0082.701] I> skipped mmio_addr = 0xc2f1080
[0082.707] I> skipped mmio_addr = 0xc2f1460
[0082.712] I> skipped mmio_addr = 0xc2f1400
[0082.718] I> skipped mmio_addr = 0xc2f140c
[0082.724] I> skipped mmio_addr = 0xc2f1410
[0082.729] I> skipped mmio_addr = 0xc2f1440
[0082.735] I> skipped mmio_addr = 0xc2f144c
[0082.741] I> skipped mmio_addr = 0xc2f1450
[0082.746] I> skipped mmio_addr = 0xc2f1880
[0082.752] I> skipped mmio_addr = 0xc2f188c
[0082.758] I> skipped mmio_addr = 0xc2f1890
[0082.763] I> skipped mmio_addr = 0xc2f1420
[0082.769] I> skipped mmio_addr = 0xc2f142c
[0082.775] I> skipped mmio_addr = 0xc2f1430
[0082.780] I> skipped mmio_addr = 0xc2f18a0
[0082.786] I> skipped mmio_addr = 0xc2f18ac
[0082.792] I> skipped mmio_addr = 0xc2f18b0
[0082.797] I> skipped mmio_addr = 0xc2f1a60
[0082.803] I> skipped mmio_addr = 0xc2f1a6c
[0082.809] I> skipped mmio_addr = 0xc2f1a70
[0082.816] I> skipped mmio_addr = 0x9240008
[0082.821] I> skipped mmio_addr = 0x9240000
[0082.827] I> skipped mmio_addr = 0x9240010
[0082.833] I> skipped mmio_addr = 0x9240018
[0082.838] I> skipped mmio_addr = 0x9240020
[0082.844] I> skipped mmio_addr = 0x9240030
[0082.850] I> skipped mmio_addr = 0x9240028
[0082.855] I> skipped mmio_addr = 0x9240038
[0082.861] I> skipped mmio_addr = 0x9240040
[0082.867] I> skipped mmio_addr = 0x9240048
[0082.872] I> skipped mmio_addr = 0x9241000
[0082.878] I> skipped mmio_addr = 0x9241008
[0082.884] I> skipped mmio_addr = 0x9241010
[0082.889] I> skipped mmio_addr = 0x9241018
[0082.895] I> skipped mmio_addr = 0x9241020
[0082.901] I> skipped mmio_addr = 0x9241028
[0082.906] I> skipped mmio_addr = 0x9241030
[0082.912] I> skipped mmio_addr = 0x9241038
[0082.918] I> skipped mmio_addr = 0x9241040
[0082.923] I> skipped mmio_addr = 0x9242000
[0082.929] I> skipped mmio_addr = 0x9242008
[0082.935] I> Task: Prod config init
[0082.939] I> Task: Pad voltage init
[0082.943] I> Task: Prod init
[0082.947] I> Task: Common rail init
[0082.952] I> DONE: Thermal config
[0082.958] W> DEVICE_PROD: module = 13, instance = 4 not found in device prod.
[0082.967] I> DONE: SOC rail config
[0082.973] W> PMIC_CONFIG: Rail: MEMIO rail config not found in MB1 BCT.
[0082.980] I> DONE: MEMIO rail config
[0082.986] W> PMIC_CONFIG: Rail: GPU rail info not found in MB1 BCT.
[0082.993] I> DONE: GPU rail info
[0082.998] W> PMIC_CONFIG: Rail: CV rail info not found in MB1 BCT.
[0083.005] I> DONE: CV rail info
[0083.009] I> Task: Misc. board config
[0083.014] I> PMIC_CONFIG: Platform config not found in MB1 BCT.
[0083.021] I> Task: Enable clock-mon
[0083.026] I> FMON: Fmon re-programming done
[0083.031] I> Task: CCPLEX IST init
[0083.035] I> Task: CPU WP0
[0083.039] I> Loading MCE
[0083.050] I> Sending WP0 mailbox command to PSC
[0083.060] I> Task: MB1 fixed firewalls
[0083.086] I> Task: BPMP fw ast config
[0083.091] I> Task: Load nvdec-fw
[0083.108] I> Task: Program TSEC carveout
[0083.113] I> TSEC-FW load not enabled
[0083.117] I> Task: GPIO interrupt map
[0083.122] I> Task: Unpowergate AOPG CAN
[0083.127] I> Task: Misc NV security settings
[0083.132] I> NVDEC sticky bits programming done
[0083.137] I> Successfully powergated NVDEC
[0083.142] I> Task: Disable/Reload WDT
[0083.146] I> Task: Disable SCPM/POD reset
[0083.151] I> Task: Program BPMP-IPC carveouts
[0083.156] I> Program IPC carveouts
[0083.162] I> SLCG Global override status := 0x0
[0083.167] I> MB1: MSS reconfig completed
[0083.176] I> Program carveout CARVEOUT_TEMP_MB2RF before image load
I> MB2 (version: 0.0.0.0-t234-54845784-db255de9)
I> t234-A01-1-Silicon (0x12347)
I> Boot-mode : SC7 Exit
I> Emulation:
I> Entry timestamp: 0x04f5c0d9
I> Regular heap: [base:0x40040000, size:0x2000]
I> Task: SE error check
I> Task: Crypto init
I> Task: Enable CCPLEX WDT 5th expiry
I> Task: ARI update carveout TZDRAM
I> Task: Configure OEM set LA/PTSA values
I> Task: Check MC errors
I> Task: SMMU external bypass disable
I> Task: Enable hot-plug capability
I> Task: PSC mailbox init
I> Task: Enable clock for external modules
I> Task: OEM SC7 context integrity check
I> Task: Restore FSI padctl config
I> Task: Program CBB PCIE AMAP regions
I> Task: Load and authenticate registered FWs
I> Task: Load AUXP FWs
I> Skipping SCE FW load
I> Successfully register RCE FW load task with MB2 loader
I> Successfully register DCE FW load task with MB2 loader
I> Unpowergating APE
I> Unpowergate done
I> Successfully register APE FW load task with MB2 loader
I> Skipping FSI FW load
I> Successfully register XUSB FW load task with MB2 loader
I> Successfully register PVA FW load task with MB2 loader
I> Binary rce scrubbed successfully
W> FW header for binary dce not saved in context!
W> FW data for binary dce not saved in context!
W> Skip post read scrub for binary dce
I> rce: Authentication Finalize Done
I> Binary rce loaded successfully at 0x172a00000
I> Binary ape scrubbed successfully
I> dce: Authentication Finalize Done
I> Binary dce loaded successfully at 0x17a000000
I> Binary xusb scrubbed successfully
I> ape: Authentication Finalize Done
I> Binary ape loaded successfully at 0x17d800000
I> Binary pva-fw scrubbed successfully
I> xusb: Authentication Finalize Done
I> Binary xusb loaded successfully at 0x172f00000
I> pva-fw: Authentication Finalize Done
I> Binary pva-fw loaded successfully at 0x173180000
I> Task: Check MC errors
I> Task: Carveout setup
I> Scrub remaining OEM carveouts
W> Skip scrubbing unallocated co:41
I> Program remaining OEM carveouts
I> Task: Enable FSITHERM
I> Task: Enable FSI VMON
I> Task: Validate FSI Therm readings
I> Task: Enable FSI SE clock
I> Task: Initialize SBSA UART CAR
I> Task: Unpowergate APE
W> mb2_unpowergate_ape: skip! APE is in unpowergated state
I> Task: Memctrl reconfig pending clients
I> Task: OEM firewalls
I> OEM firewalls configured
I> Task: Powergate APE
I> Powergating APE
I> Powergate done
I> Task: OEM firewall restore saved settings
I> Task: Deinit UART
adc_resume done
clk_mach_resume
clk_mach_resume done
hwwdt_init: WDT boot cfg 0x710010 sts 0x10
spe exiting sc7
歊lk_mach_resume_late
?
  I> Task: Trigger load FSI keyblob
I> Task: Complete load FSI keyblob
I> Task: MB2-PSC_FW Key Manager Init
I> Sending opcode OP_PSC_KEY_MANAGER to psc-fw
I> Sending opcode 0x4b45594d to psc
I> Received ACK from psc
I> Task: Unhalt FSI
I> FSI unhalt skipped
I> Task: Unhalt AUXPs
I> SCE unhalt skipped
I> Unhalting RCE
I> RCE unhalt successful
I> Unhalting DCE
I> DCE unhalt succ嬂    83.337041] Camera-FW on t234-rce-safe started
TCU early conso歊lk_mach_resume_late done
嶚e enabled.
?
  DCE Started
墈ssful
I> APE unhalt skipped
I> Task: Load HV/CPUBL
I> Task: Load TOS
踜CE_R5_Init
MPU enabl嘌> Task: Trigger load TSEC leyblob
鋀d
DCE_SW_Init
嘌> Sending opcode 0x53535452 to psc
I> Sent opcode to psc
I> Task: Load and 榹TATUS_R[31:0]  = 0x20000000
STATUS_R[63:32] = 0x0
STATUS_R[95:64] = 0x100000
墑uthenticate registered FWs
I> Task: Disable MSS perf stats
I> Task: Program display sticky bits
I> Task: SMMU init
W> smmu ctx restore: addr is NULL!!
I> Task: Program GICv3 registers
I> Task: Audit firewall settings
I> Task: Lock fusing
I> MB2 finished

NOTICE:  tegra_soc_pwr_domain_on_finish: exited SC7 successfully. Entering normal world.
[   49.083916] psci: CPU5 killed (polled 0 ms)
[   49.086398] Enabling non-boot CPUs ...
[   49.086757] Detected PIPT I-cache on CPU1
[   49.086782] GICv3: CPU1: found redistributor 100 region 0:0x000000000f460000
[   49.086816] CPU1: Booted secondary processor 0x0000000100 [0x410fd421]
[   49.087401] CPU1 is up
[   49.087635] Detected PIPT I-cache on CPU2
[   49.087645] GICv3: CPU2: found redistributor 200 region 0:0x000000000f480000
[   49.087661] CPU2: Booted secondary processor 0x0000000200 [0x410fd421]
[   49.087901] CPU2 is up
[   49.088134] Detected PIPT I-cache on CPU3
[   49.088143] GICv3: CPU3: found redistributor 300 region 0:0x000000000f4a0000
[   49.088159] CPU3: Booted secondary processor 0x0000000300 [0x410fd421]
[   49.088408] CPU3 is up
[   49.090605] Detected PIPT I-cache on CPU4
[   49.090630] GICv3: CPU4: found redistributor 10200 region 0:0x000000000f500000
[   49.090663] CPU4: Booted secondary processor 0x0000010200 [0x410fd421]
[   49.091127] cpufreq: cpufreq_online: CPU4: Running at unlisted initial frequency: 1559000 KHz, changing to: 1510400 KHz
[   49.091469] CPU嬂    83.506064] Camera-FW on t234-rce-safe ready SHA1=e2238c99 (crt 2.760 ms, total boot 171.989 ms)
Admin Task Init
Admin Task I? is up
[   49.091732] Detected PIPT I-cache on CPU5
[   49.091745] GICv3: CPU5: found redistributor 10300 region 0:0x000000000f520000
[   49.091765] CPU5: Booted secondary processor 0x0000010300 [0x410fd421]
[   49.092136] CPU5 is up
[   49.204831] tegra194-pcie 14100000.pcie: Link up
鋝it complete
Print Task Init
RM Task Init
SHA Task Init
Admin Task Started
DCE SC7 SHA Enabled
RM Task Started
RM Task Running
Print Task Started
Print Task Running
SHA Task Started
DCE: FW Boot Complete
Admin Task Running
SHA Task Running
[   49.312829] tegra194-pcie 14160000.pcie: Link up
[   49.420824] tegra194-pcie 141e0000.pcie: Link up
[   49.636825] tegra194-pcie 140a0000.pcie: Link up
[   49.788909] fusb301 1-0025: fusb301_set_mode: mode (32)(32)
[   49.799864] tegra-xusb 3610000.usb: Firmware timestamp: 2023-02-10 03:48:10 UTC
[   50.030361] rtk_btusb: chip type value: 0x73
TCU debug prints will be routed to traces.
[   50.258269] tegra-ivc-bus bc00000.rtcpu:ivc-bus:echo@0: ivc channel driver missing
[   50.258274] tegra-ivc-bus bc00000.rtcpu:ivc-bus:dbg@1: ivc channel driver missing
[  遫tarting RmBootstrap
Registered event_type:[0] for dce_core_ipc_type:[1]
Registered even?50.258277] tegra-ivc-bus bc00000.rtcpu:ivc-bus:dbg@2: ivc channel driver missing
[   50.258281] tegra-鋈_type:[1] for dce_co壾vc-bus bc00000.rtcpu:ivc-bus:diag@5: ivc channel driver missing
[   50.25868鋨e_ipc_type:[3]
dce_ipc State Initialized
RmBootstrap completed successfully
Resume RM
?] nvme nvme0: 6/0/0 default/read/poll queues
[   50.260013] dce: dce_mailbox_set_full_interrupt:157  Intr bit set multiple times for MB : [0x5]
[   50.260020] dce: dce_mailbox_set_full_interrupt:157  Intr bit set multiple times for MB : [0x5]
[   50.260277] dce: dce_admin_send_cmd_ver:456  version : [0x3] err : [0x0]
[   50.260470] dce: dce_mailbox_set_full_interrupt:157  Intr bit set multiple times for MB : [0x1]
[   50.260475] dce: dce_mailbox_set_full_interrupt:157  Intr bit set multiple times for MB : [0x1]
[   50.260481] dce: dce_admin_setup_clients_ipc:585  Channel Reset Complete for Type [1] ...
[   50.260483] dce: dce_admin_setup_clients_ipc:561  Get queue info failed for [2]
[   50.260679] dce: dce_mailbox_set_full_interrupt:157  Intr bit set multiple times for MB : [0x2]
[   50.260685] dce: dce_admin_setup_clients_ipc:585  Channel Reset Complete for Type [3] ...
[   50.323928] r8168 0001:01:00.0 elan2: Device reseting!
[   50.349510] r8168 0008:01:00.0 elan1: Device reseting!
[   50.421506] dce: dce_start_boot_flow:166  DCE_BOOT_DONE
DCE RM resume rmStatus:0x0 ret:true
[   52.071060] rtk_btcoex: Close BTCOEX
[   52.083185] OOM killer enabled.
[   52.083191] Restarting tasks ...
[   52.085908] rtk_btusb: chip type value: 0x73
[   52.086080] done.
[   52.102681] rtk_btusb: config filename rtl8822cu_config
[   52.102707] rtk_btusb: Origin cfg len 6
[   52.102712] rtk_btusb: New cfg len 6
[   52.108033] PM: suspend exit
[   52.150976] elan1: 0xffff800008bb1000, 3c:6d:66:03:ab:76, IRQ 241
[   52.205536] elan2: 0xffff800008ba1000, 00:90:fb:83:72:66, IRQ 240
[   52.583330] rtk_btusb: btusb_open set HCI UP RUNNING
[   52.583370] rtk_btcoex: Open BTCOEX

Below is the log from the second suspend, which fails on resume:

test@tegra-ubuntu:~$ [   88.859052] PM: suspend entry (deep)
[   88.890539] Filesystems sync: 0.031 seconds
[   88.892007] Freezing user space processes ... (elapsed 0.002 seconds) done.
[   88.894813] OOM killer disabled.
[   88.894815] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
[   88.896024] printk: Suspending console(s) (use no_console_suspend to debug)
DCE RM Suspend rmStatus:0x0
[  101.686221] dce: dce_handle_irq_status:240  DCE can be safely powered-off now
[  101.688113] tegra-ivc-bus bc00000.rtcpu:ivc-bus:echo@0: ivc channel driver missing
[  101.688120] tegra-ivc-bus bc00000.rtcpu:ivc-bus:dbg@1: ivc channel driver missing
[  101.688123] tegra-ivc-bus bc00000.rtcpu:ivc-bus:dbg@2: ivc channel driver missing
[  101.688128] tegra-ivc-bus bc00000.rtcpu:ivc-bus:diag@5: ivc channel driver missing
[  101.867799] rtk_btusb: btusb_suspend: suspending...
[  101.969897] fusb301 1-0025: fusb301_set_mode: mode (4)(4)
[  102.010906] tegra194-pcie 140a0000.pcie: Link didn't transition to L2 state
[  102.130965] tegra194-pcie 140a0000.pcie: Link didn't go to detect state
[  102.218901] tegra194-pcie 14100000.pcie: Link didn't transition to L2 state
[  102.338958] tegra194-pcie 14100000.pcie: Link didn't go to detect state
[  102.341481] Disabling non-boot CPUs ...
[  102.342606] migrate_one_irq: 5 callbacks suppressed
[  102.342611] IRQ240: set affinity failed(-22).
[  102.342614] IRQ241: set affinity failed(-22).
[  102.342616] IRQ244: set affinity failed(-22).
[  102.342683] psci: CPU1 killed (polled 0 ms)
[  102.346148] IRQ240: set affinity failed(-22).
[  102.346153] IRQ241: set affinity failed(-22).
[  102.346156] IRQ244: set affinity failed(-22).
[  102.346751] psci: CPU2 killed (polled 0 ms)
[  102.407477] IRQ240: set affinity failed(-22).
[  102.407483] IRQ241: set affinity failed(-22).
[  102.407486] IRQ244: set affinity failed(-22).
[  102.408303] psci: CPU3 killed (polled 0 ms)
[  102.454531] IRQ240: set affinity failed(-22).
[  102.454888] psci: CPU4 killed (polled 0 ms)
[  102.502428] psci: CPU5 killed (polled 0 ms)
[  102.551561] Enabling non-boot CPUs ...
[  102.565260] Detected PIPT I-cache on CPU1
[  102.565285] GICv3: CPU1: found redistributor 100 region 0:0x000000000f460000
[  102.565319] CPU1: Booted secondary processor 0x0000000100 [0x410fd421]
[  102.566571] CPU1 is up
[  102.572158] Detected PIPT I-cache on CPU2
[  102.572173] GICv3: CPU2: found redistributor 200 region 0:0x000000000f480000
[  102.572195] CPU2: Booted secondary processor 0x0000000200 [0x410fd421]
[  102.573940] CPU2 is up
[  102.579020] Detected PIPT I-cache on CPU3
[  102.579032] GICv3: CPU3: found redistributor 300 region 0:0x000000000f4a0000
[  102.579050] CPU3: Booted secondary processor 0x0000000300 [0x410fd421]
[  102.579698] CPU3 is up
[  102.588231] Detected PIPT I-cache on CPU4
[  102.588378] GICv3: CPU4: found redistributor 10200 region 0:0x000000000f500000
[  102.588554] CPU4: Booted secondary processor 0x0000010200 [0x410fd421]
[  102.593488] CPU4 is up
[  102.599279] Detected PIPT I-cache on CPU5
[  102.599421] GICv3: CPU5: found redistributor 10300 region 0:0x000000000f520000
[  102.599584] CPU5: Booted secondary processor 0x0000010300 [0x410fd421]
[  102.603297] CPU5 is up
[  102.744945] tegra194-pcie 14100000.pcie: Link up
[  102.856910] tegra194-pcie 14160000.pcie: Link up
[  102.968899] tegra194-pcie 141e0000.pcie: Link up
[  103.184886] tegra194-pcie 140a0000.pcie: Link up
[  103.371346] fusb301 1-0025: fusb301_set_mode: mode (32)(32)
[  103.383249] tegra-xusb 3610000.usb: Firmware timestamp: 2023-02-10 03:48:10 UTC
[  103.622926] rtk_btusb: chip type value: 0x73
[  103.853579] nvme nvme0: 6/0/0 default/read/poll queues
[   138.181259] Camera-FW on t234-rce-safe started
TCU early console enabled.
[  103.919385] r8168 0001:01:00.0 elan2: Device reseting!
[  103.945236] r8168 0008:01:00.0 elan1: Device reseting!
[   138.270320] Camera-FW on t234-rce-safe ready SHA1=e2238c99 (crt 1.201 ms, total boot 90.362 ms)
TCU debug prints will be routed to traces.
[  103.955000] tegra-hsp b950000.tegra-hsp: Try increasing MBOX_TX_QUEUE_LEN
[  103.955939] tegra-ivc-bus bc00000.rtcpu:ivc-bus:echo@0: ivc channel driver missing
[  103.955983] tegra-ivc-bus bc00000.rtcpu:ivc-bus:dbg@1: ivc channel driver missing
[  103.956007] tegra-ivc-bus bc00000.rtcpu:ivc-bus:dbg@2: ivc channel driver missing
[  103.956033] tegra-ivc-bus bc00000.rtcpu:ivc-bus:diag@5: ivc channel driver missing
[  105.670846] rtk_btcoex: Close BTCOEX
[  134.109420] dce: tegra_dce_register_ipc_client:163  dce boot wait failed (-110)
[  134.109420]
[  134.109514] NVRM dceclientInitRpcInfra_IMPL: Register dce ipc client failed for DCE_CLIENT_RM_IPC_TYPE_SYNC error 0xffffff92
[  134.109567] NVRM dceclientStateLoad_IMPL: dceclientInitRpcInfra failed
[  134.109622] NVRM nvAssertOkFailedNoLog: Assertion failed: Unknown error code! (0xFFFFFF92) returned from gpuStateLoad(pGpu, flags | GPU_STATE_FLAGS_PRESERVING | GPU_STATE_FLAGS_PM_TRANSITION) @ power-management-tegra.c:56
[  134.109672] CPU: 5 PID: 3002 Comm: systemd-sleep Tainted: G           O      5.15.148-tegra #1
[  134.109710] Hardware name: NVIDIA Portwell PJAI-1100F-ON4G/Jetson, BIOS r36.4.3-85becaf8-dirty 01/23/2025
[  134.109729] Call trace:
[  134.109736]  dump_backtrace+0x0/0x1e0
[  134.109820]  show_stack+0x34/0x50
[  134.109873]  dump_stack_lvl+0x68/0x84
[  134.109909]  dump_stack+0x18/0x34
[  134.109934]  os_dump_stack+0x1c/0x28 [nvidia]
[  134.111708]  nvAssertFailedBacktrace+0xb0/0xb8 [nvidia]
[  134.113277]  gpuPowerManagementResumeTegra.isra.0+0x6c/0x80 [nvidia]
[  134.114870]  gpuResumeFromStandby_T234D+0x48/0x68 [nvidia]
[  134.116415]  rm_power_management+0x70/0x128 [nvidia]
[  134.117942]  nv_power_management+0x128/0x130 [nvidia]
[  134.119522]  nvidia_resume+0x80/0xc8 [nvidia]
[  134.121081]  nv_pmops_resume+0x2c/0x48 [nvidia]
[  134.122634]  dpm_run_callback+0x40/0x190
[  134.122668]  device_resume+0xa4/0x200
[  134.122696]  dpm_resume+0xe4/0x2f0
[  134.122726]  dpm_resume_end+0x28/0x40
[  134.122754]  suspend_devices_and_enter+0x24c/0x7c0
[  134.122805]  pm_suspend+0x2b0/0x330
[  134.122842]  state_store+0x98/0x120
[  134.122880]  kobj_attr_store+0x18/0x30
[  134.122907]  sysfs_kf_write+0x64/0x80
[  134.122941]  kernfs_fop_write_iter+0x134/0x1d0
[  134.122970]  new_sync_write+0x100/0x1b0
[  134.123005]  vfs_write+0x278/0x390
[  134.123033]  ksys_write+0x80/0x110
[  134.123066]  __arm64_sys_write+0x2c/0x40
[  134.123100]  invoke_syscall+0x5c/0x130
[  134.123149]  el0_svc_common.constprop.0+0x64/0x110
[  134.123196]  do_el0_svc+0x3c/0xa0
[  134.123238]  el0_svc+0x20/0x60
[  134.123279]  el0t_64_sync_handler+0xb0/0xc0
[  134.123319]  el0t_64_sync+0x1a4/0x1a8
[  134.123678] PM: dpm_run_callback(): nv_pmops_resume+0x0/0x48 [nvidia] returns -5
[  134.125351] nv_platform 13800000.display: PM: failed to resume: error -5
[  134.138671] rtk_btusb: btusb_probe intf->cur_altsetting->desc.bInterfaceNumber 0
[  134.138675] rtk_btusb: btusb_probe can_wakeup 1, may wakeup 0
[  134.138677] rtk_btusb: patch_add
[  134.138679] rtk_btusb: auto suspend is disabled
[  134.138680] rtk_btusb: pid = 0x3549
[  134.138683] rtk_btusb: patch_add: Reset gEVersion to 0xff
[  134.138691] rtk_btusb: set_bit(HCI_QUIRK_RESET_ON_CLOSE, &hdev->quirks);
[  134.138830] rtk_btusb: btusb_probe: done
[  134.138832] rtk_btusb: btusb_open start
[  134.138836] rtk_btusb: btusb_open hdev->promisc ==0
[  134.138836] rtk_btusb: download_patch start
[  134.138838] rtk_btusb: chip type value: 0x73
[  134.140501] OOM killer enabled.
[  134.140505] Restarting tasks ...
[  134.143752] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000070
[  134.143764] Mem abort info:
[  134.143765]   ESR = 0x0000000096000004
[  134.143767]   EC = 0x25: DABT (current EL), IL = 32 bits
[  134.143770]   SET = 0, FnV = 0
[  134.143771]   EA = 0, S1PTW = 0
[  134.143772]   FSC = 0x04: level 0 translation fault
[  134.143773] Data abort info:
[  134.143774]   ISV = 0, ISS = 0x00000004
[  134.143775]   CM = 0, WnR = 0
[  134.143776] user pgtable: 4k pages, 48-bit VAs, pgdp=00000001231fd000
[  134.143779] [0000000000000070] pgd=0000000000000000, p4d=0000000000000000
[  134.143785] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
[  134.143791] Modules linked in: nvidia_drm(O) nvidia_modeset(O) lzo_rle lzo_compress zram zsmalloc nvme_fabrics ramoops reed_solomon bridge stp llc usb_f_ncm usb_f_mass_storage algif_hash usb_f_acm algif_skcipher u_serial af_alg usb_f_rndis u_ether libcomposite snd_soc_tegra186_asrc(O) snd_soc_tegra186_arad(O) snd_soc_tegra210_mixer(O) snd_soc_tegra210_afc(O) snd_soc_tegra210_admaif(O) snd_soc_tegra210_mvc(O) snd_soc_tegra_utils(O) snd_soc_tegra_pcm snd_soc_tegra210_ope(O) snd_soc_tegra186_dspk(O) snd_soc_simple_card_utils snd_soc_tegra210_dmic(O) snd_soc_tegra210_amx(O) snd_soc_tegra210_adx(O) snd_soc_tegra210_sfc(O) snd_soc_tegra210_i2s(O) tegra210_adma snd_soc_tegra210_ahub(O) tpm_tis_spi tpm_tis_core nvvrs_pseq_rtc(O) rtk_btusb(O) btusb btrtl option btintel qmi_wwan usb_wwan btbcm cdc_wdm usbnet usbserial bluetooth ecdh_generic xr_usb_serial_common(O) ecc tegra234_oc_event(O) crct10dif_ce nvpmodel_clk_cap(O) mttcan(O) snd_hda_codec_hdmi thermal_trip_event(O) nvpps(O) tegra234_aon(O)
[  134.143873]  can_dev tegra_cactmon_mc_all(O) fusb301(O) tegra_aconnect rtl8822ce(O) snd_hda_tegra cfg80211 snd_hda_codec nvidia(O) spi_tegra114 rfkill at24 pwm_tegra_tachometer(O) snd_hda_core mc_hwpm(O) tegra_pcie_dma_test(O) r8168(O) tegra_pcie_edma(O) tegra_dce(O) host1x_fence(O) nvidia_vrs_pseq(O) tsecriscv(O) nvhost_vi5(O) nvhost_isp5(O) nvhost_nvcsi_t194(O) tegra_camera(O) v4l2_dv_timings nvhost_nvcsi(O) tegra_camera_platform(O) capture_ivc(O) tegra_camera_rtcpu(O) ivc_bus(O) hsp_mailbox_client(O) governor_userspace ivc_ext(O) v4l2_fwnode v4l2_async videobuf2_dma_contig tegra_drm(O) videobuf2_memops nvhost_capture(O) videobuf2_v4l2 videobuf2_common tegra_wmark(O) videodev host1x_nvhost(O) mc nvhwpm(O) cec drm_kms_helper f81435_mode(O) nvidia_p2p(O) ina3221 nvgpu(O) governor_pod_scaling(O) host1x(O) mc_utils(O) nvmap(O) nvsciipc(O) fuse drm ip_tables x_tables ipv6 pwm_fan pwm_tegra tegra_bpmp_thermal tegra_xudc ucsi_ccg typec_ucsi typec nvme nvme_core phy_tegra194_p2u pcie_tegra194
[  134.143960]
[  134.143963] CPU: 5 PID: 2107 Comm: Xorg Tainted: G           O      5.15.148-tegra #1
[  134.143967] Hardware name: NVIDIA Portwell PJAI-1100F-ON4G/Jetson, BIOS r36.4.3-85becaf8-dirty 01/23/2025
[  134.143970] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[  134.143974] pc : nvCtxDmaBind+0x28/0x230 [nvidia_modeset]
[  134.144099] lr : nvRmAllocAndBindSurfaceDescriptor+0x74/0x128 [nvidia_modeset]
[  134.144193] sp : ffff80000df13ad0
[  134.144194] x29: ffff80000df13d00 x28: 0000000000000008 x27: ffff0000dcf7aa08
[  134.144197] x26: ffff000091162028 x25: ffff000091162008 x24: 0000000000000009
[  134.144200] x23: ffff0000dcf7aa14 x22: ffff0000dcf7aa08 x21: ffff0000a3ed6008
[  134.144203] x20: 0000000000000000 x19: ffff0000a3ed6008 x18: ffffdf1e3c11e830
[  134.144206] x17: ffffdf1e3c11e840 x16: ffffdf1e45533b40 x15: ffff0000a3e53368
[  134.144209] x14: ffff0000a3e53348 x13: ffffdf1e3c11e880 x12: ffffdf1e3c11e890
[  134.144212] x11: ffff0000f1b0b8a8 x10: ffff80000df13820 x9 : 00000000ffffffe8
[  134.144215] x8 : ffff80000df13800 x7 : 0000000000000000 x6 : ffffdf1e3c21b000
[  134.144217] x5 : 0000000000000001 x4 : 0000000000000000 x3 : ffff80000df13b58
[  134.144220] x2 : 0000000000020102 x1 : 0000000000000000 x0 : 00000000c1d00000
[  134.144224] Call trace:
[  134.144226]  nvCtxDmaBind+0x28/0x230 [nvidia_modeset]
[  134.144314]  nvkms_ioctl+0xe0/0x110 [nvidia_modeset]
[  134.144401]  nvidia_frontend_unlocked_ioctl+0x60/0x80 [nvidia]
[  134.144564]  __arm64_sys_ioctl+0xb4/0x100
[  134.144579]  invoke_syscall+0x5c/0x130
[  134.144591]  el0_svc_common.constprop.0+0x64/0x110
[  134.144595]  do_el0_svc+0x3c/0xa0
[  134.144599]  el0_svc+0x20/0x60
[  134.144606]  el0t_64_sync_handler+0xb0/0xc0
[  134.144610]  el0t_64_sync+0x1a4/0x1a8
[  134.144616] Code: f9002bfe 52802042 910223e3 72a00042 (b9407025)
[  134.144623] ---[ end trace 086a91561824b152 ]---
[  134.148191] rtk_btusb: chip_type->status = 0x0, chip_type->chip = 0xbeef
[  134.148748] rtk_btusb: HCI reset.
[  134.165811] rtk_btusb: read_ver_rsp->lmp_subver = 0x8822
[  134.166090] rtk_btusb: read_ver_rsp->hci_rev = 0xc
[  134.166145] rtk_btusb: patch_entry->lmp_sub = 0x8822
[  134.166200] rtk_btusb: load_firmware start
[  134.166313] rtk_btusb: lmp_version = 0x8822
[  134.166479] rtk_btusb: config filename rtl8822cu_config
[  134.167974] rtk_btusb: no bdaddr file /opt/bdaddr
[  134.168015] done.
[  134.168193] rtk_btusb: Origin cfg len 6
[  134.168302] rtk_btusb: 55 ab 23 87 00 00
[  134.168470] rtk_btusb: New cfg len 6
[  134.168524] rtk_btusb: 55 ab 23 87 00 00
[  134.168633] rtk_btusb: fw name is  rtl8822cu_fw
[  134.173006] rtk_btusb: This is not 8723a, use new patch style!
[  134.173175] rtk_btusb: rtk_get_eversion: gEVersion 255
[  134.176624] rtk_btusb: eversion->status = 0x0, eversion->version = 0x3
[  134.176630] rtk_btusb: load_firmware: New gEVersion 3
[  134.176633] rtk_btusb: rtk_get_fw_project_id: opcode 0, len 1, data 13
[  134.176636] rtk_btusb: lmp_version is 8822, project_id is 8822, match!
[  134.176638] rtk_btusb: fw_version = 0x9a8cbc9
[  134.176640] rtk_btusb: number_of_total_patch = 3
[  134.176641] rtk_btusb: chipID 4
[  134.176643] rtk_btusb: patch_length 0x8590
[  134.176644] rtk_btusb: start_offset 0x00005d00
[  134.176646] rtk_btusb: Svn version: 1940234490
[  134.176647] rtk_btusb: Coexistence: BTCOEX_20210106-2020
[  134.176649] rtk_btusb: buf_len = 0x8596
[  134.176662] rtk_btusb: fw: exists, config file: exists
[  134.176664] rtk_btusb: load_firmware done
[  134.176670] rtk_btusb: download_data start
[  134.619241] rtk_btusb: download_data done
[  134.619248] rtk_btusb: HCI reset.
[  134.634852] rtk_btusb: read_ver_rsp->lmp_subver = 0xcbc9
[  134.635129] rtk_btusb: read_ver_rsp->hci_rev = 0x9a8
[  134.635297] rtk_btusb: patch_entry->lmp_sub = 0x8822
[  134.635740] rtk_btusb: Rtk patch end 0
[  134.635907] rtk_btusb: btusb_open set HCI UP RUNNING
[  134.638731] rtk_btcoex: Open BTCOEX
[  134.638953] rtk_btcoex: rtk_vendor_cmd_to_fw: opcode 0xfc1b
[  134.639123] rtk_btusb: btusb_open end
[  134.644109] rtk_btcoex: unknown hci command
[  134.647384] rtk_btcoex: BTCOEX hci_rev 0x09a8
[  134.647439] rtk_btcoex: BTCOEX lmp_subver 0xcbc9
[  134.675713] rtk_btusb: btusb_notify: hci0 evt 3
[  134.754892] Kernel panic - not syncing:
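
The failing resume above has a few recognizable markers: the DCE boot wait timeout (-110), the display device failing to resume (-5), the NULL pointer dereference in nvidia_modeset, and the final kernel panic. As a rough aid for comparing captures between boards, the sketch below scans a saved console log for those strings. The function name `find_failures` and the signature list are illustrative choices, not part of any NVIDIA tool, and the list only covers what appears in this particular log.

```python
import re

# Substrings copied from the failing resume log in this thread (not exhaustive).
SIGNATURES = [
    "dce boot wait failed",
    "PM: failed to resume",
    "Unable to handle kernel NULL pointer dereference",
    "Kernel panic",
]

# Matches a kernel timestamp such as "[  134.109420]" anywhere in the line.
TS_RE = re.compile(r"\[\s*(\d+\.\d+)\]")


def find_failures(log_text):
    """Return a list of (timestamp or None, signature) for matching lines, in log order."""
    hits = []
    for line in log_text.splitlines():
        for sig in SIGNATURES:
            if sig in line:
                m = TS_RE.search(line)
                hits.append((float(m.group(1)) if m else None, sig))
                break  # report at most one signature per line
    return hits
```

Feeding it the text of a captured serial log (for example, `find_failures(open("console.log", errors="replace").read())`, where `console.log` is a hypothetical file name) lists the markers in the order they occurred, which makes it easier to see whether the DCE timeout always precedes the display resume error.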

Hi KevinFFF,
Kindly let me know if there are any updates.

Is the issue specific to the second suspend?
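
One way to answer this from a capture, assuming the full console output of a session was saved to a file, is to pair each `PM: suspend entry` line with the following `PM: suspend exit` line and report the first cycle that never exits. This is only a sketch; `first_incomplete_cycle` is a hypothetical helper name, and it relies on those two kernel messages appearing as they do in the logs above.

```python
def first_incomplete_cycle(log_text):
    """Return the 1-based index of the first suspend entry with no matching exit, or None."""
    cycle = 0
    open_cycle = False
    for line in log_text.splitlines():
        if "PM: suspend entry" in line:
            if open_cycle:
                # A new entry appeared before the previous cycle logged an exit.
                return cycle
            cycle += 1
            open_cycle = True
        elif "PM: suspend exit" in line:
            open_cycle = False
    # If the log ends while a cycle is still open, that cycle never completed.
    return cycle if open_cycle else None
```

Running it over several captures shows whether the failure always lands on cycle 2 or only sometimes.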

If the issue is specific to your custom carrier board, please provide details on how your design differs from the devkit, along with your device tree configurations.

Did you also verify this on your custom board, or are you using the devkit?

Hi KevinFFF,

I verified it on our custom board. The devkit uses DP instead of HDMI.

NOTE: This issue does not happen on Jetpack 5.1.5.

Hi, KevinFFF
Perhaps you could also try the Xavier NX carrier board with an Orin NX module; this combination also exhibits the same suspend issue after flashing.
This makes me suspect that the root cause might be related to HDMI, as suspend works perfectly fine on the official devkit setup.

Best regards

Do you mean that you hit the issue with JP6.2 but it works fine with JP5.1.5?

Yes, the Orin NX module and the Xavier NX carrier board (p3509) are not fully pin-compatible, which may cause unexpected behavior.
Please refer to the table in Jetson FAQ | NVIDIA Developer for details.

Yes, it works fine with JP5.1.5 on our custom board with HDMI connected.

It seems the issue is specific to the display driver for HDMI in JP6.
Have you tried porting the display driver from JP5 to JP6 to check if it could help for your case?

Hi, KevinFFF
I will test it.
Thanks!