After a restart (by disconnecting power) of a customized AGX Orin board, roughly every third boot the GPU does not start correctly, meaning I cannot run the cuda-samples. On the other boots the GPU works as expected. The GPU is under load when the power is disconnected.
See the following output:
root@custom-board~# dmesg | grep gpu | head -n 20
[ 1841.758834] nvgpu: 17000000.ga10b gp10b_priv_ring_isr:217 [ERR] ringmaster intr status0: 0x00000100, status1: 0x00000000
[ 1841.759228] nvgpu: 17000000.ga10b ga10b_priv_ring_handle_sys_write_errors:560 [ERR] SYS write error: ADR 0x00508914 WRDAT 0x00000104 master 0x00000021
[ 1841.769532] nvgpu: 17000000.ga10b ga10b_priv_ring_handle_sys_write_errors:563 [ERR] INFO 0x18408321: (subid 0x00000018 priv_level 0 local_ordering 1)
[ 1841.782759] nvgpu: 17000000.ga10b ga10b_priv_ring_handle_sys_write_errors:568 [ERR] CODE 0xbadf2021
[ 1841.791859] nvgpu: 17000000.ga10b nvgpu_cic_mon_report_err_safety_services:55 [ERR] Error reporting is not supported in this platform
[ 1841.803871] nvgpu: 17000000.ga10b ga10b_priv_ring_decode_error_code:542 [ERR] [Error Type]: orphan(gpc/fbp)
[ 1841.813465] nvgpu: 17000000.ga10b decode_fecs_pri_orphan_error:363 [ERR] [Extra Info]: target_ringstation(0x21)
[ 1890.746087] nvgpu: 17000000.ga10b gp10b_priv_ring_isr:217 [ERR] ringmaster intr status0: 0x00000100, status1: 0x00000000
[ 1890.746487] nvgpu: 17000000.ga10b ga10b_priv_ring_handle_sys_write_errors:560 [ERR] SYS write error: ADR 0x0050b0c0 WRDAT 0x00001000 master 0x00000021
[ 1890.746876] nvgpu: 17000000.ga10b ga10b_priv_ring_handle_sys_write_errors:563 [ERR] INFO 0x19408321: (subid 0x00000019 priv_level 0 local_ordering 1)
[ 1890.747252] nvgpu: 17000000.ga10b ga10b_priv_ring_handle_sys_write_errors:568 [ERR] CODE 0xbadf2021
[ 1890.747529] nvgpu: 17000000.ga10b nvgpu_cic_mon_report_err_safety_services:55 [ERR] Error reporting is not supported in this platform
[ 1890.747878] nvgpu: 17000000.ga10b ga10b_priv_ring_decode_error_code:542 [ERR] [Error Type]: orphan(gpc/fbp)
[ 1890.748141] nvgpu: 17000000.ga10b decode_fecs_pri_orphan_error:363 [ERR] [Extra Info]: target_ringstation(0x21)
[ 1893.746765] nvgpu: 17000000.ga10b nvgpu_timeout_expired_msg_cpu:94 [ERR] Timeout detected @ ga10b_gr_init_wait_idle+0x9c/0x160 [nvgpu]
[ 1893.747161] nvgpu: 17000000.ga10b ga10b_gr_init_wait_idle:364 [ERR] timeout gr busy : 1
[ 1893.747440] nvgpu: 17000000.ga10b nvgpu_gr_obj_ctx_alloc_golden_ctx_image:776 [ERR] fail
[ 1893.747697] nvgpu: 17000000.ga10b nvgpu_gr_obj_ctx_alloc:879 [ERR] fail to init golden ctx image
[ 1893.747988] nvgpu: 17000000.ga10b nvgpu_gr_obj_ctx_alloc:929 [ERR] fail
[ 1893.748221] nvgpu: 17000000.ga10b nvgpu_gr_setup_alloc_obj_ctx:216 [ERR] failed to allocate gr ctx buffer
When I try to run /usr/bin/cuda-samples/UnifiedMemoryStreams, I get the following errors in dmesg:
[ 2947.069025] nvgpu: 17000000.ga10b gp10b_priv_ring_isr:217 [ERR] ringmaster intr status0: 0x00000100, status1: 0x00000000
[ 2947.069411] nvgpu: 17000000.ga10b ga10b_priv_ring_handle_sys_write_errors:560 [ERR] SYS write error: ADR 0x0050b0c0 WRDAT 0x00001000 master 0x00000021
[ 2947.069814] nvgpu: 17000000.ga10b ga10b_priv_ring_handle_sys_write_errors:563 [ERR] INFO 0x19408321: (subid 0x00000019 priv_level 0 local_ordering 1)
[ 2947.070222] nvgpu: 17000000.ga10b ga10b_priv_ring_handle_sys_write_errors:568 [ERR] CODE 0xbadf2021
[ 2947.070502] nvgpu: 17000000.ga10b nvgpu_cic_mon_report_err_safety_services:55 [ERR] Error reporting is not supported in this platform
[ 2947.070830] nvgpu: 17000000.ga10b ga10b_priv_ring_decode_error_code:542 [ERR] [Error Type]: orphan(gpc/fbp)
[ 2947.071092] nvgpu: 17000000.ga10b decode_fecs_pri_orphan_error:363 [ERR] [Extra Info]: target_ringstation(0x21)
[ 2947.254376] audit: type=1334 audit(1715690939.092:222): prog-id=88 op=UNLOAD
[ 2947.254618] audit: type=1334 audit(1715690939.092:223): prog-id=87 op=UNLOAD
[ 2950.068943] nvgpu: 17000000.ga10b nvgpu_timeout_expired_msg_cpu:94 [ERR] Timeout detected @ ga10b_gr_init_wait_idle+0x9c/0x160 [nvgpu]
[ 2950.069360] nvgpu: 17000000.ga10b ga10b_gr_init_wait_idle:364 [ERR] timeout gr busy : 1
[ 2950.069630] nvgpu: 17000000.ga10b nvgpu_gr_obj_ctx_alloc_golden_ctx_image:776 [ERR] fail
[ 2950.069871] nvgpu: 17000000.ga10b nvgpu_gr_obj_ctx_alloc:879 [ERR] fail to init golden ctx image
[ 2950.070221] nvgpu: 17000000.ga10b nvgpu_gr_obj_ctx_alloc:929 [ERR] fail
[ 2950.070458] nvgpu: 17000000.ga10b nvgpu_gr_setup_alloc_obj_ctx:216 [ERR] failed to allocate gr ctx buffer
[ 2950.070835] nvgpu: 17000000.ga10b nvgpu_gr_setup_alloc_obj_ctx:273 [ERR] fail
[ 2950.071258] ------------[ cut here ]------------
[ 2950.071581] WARNING: CPU: 2 PID: 23102 at nvidia/nvgpu/drivers/gpu/nvgpu/common/gr/gr_setup.c:253 nvgpu_gr_setup_alloc_obj_ctx+0x21c/0x3f0 [nvgpu]
[ 2950.072674] Modules linked in: veth xt_nat xt_tcpudp xt_conntrack xt_MASQUERADE nf_conntrack_netlink xt_addrtype iptable_filter iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_tables x_tables br_netfilter overlay cfg80211 aes_ce_blk crypto_simd snd_soc_tegra186_asrc cryptd snd_soc_tegra186_dspk snd_soc_tegra210_ope snd_soc_tegra186_arad snd_soc_tegra210_iqc snd_soc_tegra210_mvc snd_soc_tegra210_admaif aes_ce_cipher snd_soc_tegra210_afc snd_soc_tegra210_dmic snd_soc_tegra210_mixer snd_soc_tegra210_amx snd_soc_tegra210_adx snd_soc_tegra210_i2s snd_soc_tegra_pcm snd_soc_tegra210_sfc ghash_ce snd_soc_tegra210_adsp snd_soc_tegra_machine_driver sha2_ce sha256_arm64 snd_soc_tegra_utils sha1_ce snd_soc_simple_card_utils cdc_acm snd_soc_spdif_tx leds_gpio snd_hda_codec_hdmi nvadsp userspace_alert snd_soc_tegra210_ahub tegra210_adma nct1008 snd_hda_tegra tegra_bpmp_thermal snd_hda_codec snd_soc_rt5640 snd_hda_core snd_soc_rl6231 spi_tegra114 pwm_fan nvidia_drm(O) nvidia_modeset(O)
[ 2950.072804] nvidia(O) nvgpu nvmap ina3221 fuse
[ 2950.153486] CPU: 2 PID: 23102 Comm: UnifiedMemorySt Tainted: G W O 5.10.104-l4t-r35.3.ga+g26cfd067b911 #1
[ 2950.164247] Hardware name: Unknown Jetson AGX Orin/Jetson AGX Orin, BIOS v35.3.1 01/24/2023
[ 2950.172647] pstate: 60400009 (nZCv daif +PAN -UAO -TCO BTYPE=--)
[ 2950.178755] pc : nvgpu_gr_setup_alloc_obj_ctx+0x21c/0x3f0 [nvgpu]
[ 2950.185053] lr : nvgpu_gr_setup_alloc_obj_ctx+0x20c/0x3f0 [nvgpu]
[ 2950.191284] sp : ffff80001266bc60
[ 2950.194696] x29: ffff80001266bc90 x28: ffff5661cc817000
[ 2950.200207] x27: ffff800016f67000 x26: ffff8000149e1020
[ 2950.205719] x25: 0000000000000000 x24: ffff5661c72d7d00
[ 2950.211232] x23: ffffce0dfa5724c8 x22: ffff5661db8f5780
[ 2950.216656] x21: ffff8000149e1000 x20: ffff5661c74d0000
[ 2950.221996] x19: ffff800016f66d20 x18: 0000000000000000
[ 2950.227506] x17: 0000000000000000 x16: ffffce0e3f59d7ac
[ 2950.233019] x15: 0000fffff97b12b8 x14: 0000000000000001
[ 2950.238532] x13: 0000000000000038 x12: ffff56622e1f8880
[ 2950.244046] x11: 0000000000000000 x10: 00000000175a2a2f
[ 2950.249558] x9 : 0000000000000000 x8 : ffff5661db8f5580
[ 2950.254895] x7 : 0000000000000000 x6 : 0000000000000000
[ 2950.260232] x5 : ffff5661ccb9ab80 x4 : ffffce0dfa5c20d8
[ 2950.265658] x3 : 00000000000000f8 x2 : 0000000000000000
[ 2950.270994] x1 : 0000000000000000 x0 : 0000000000000000
[ 2950.276333] Call trace:
[ 2950.278848] nvgpu_gr_setup_alloc_obj_ctx+0x21c/0x3f0 [nvgpu]
[ 2950.284450] gk20a_channel_ioctl+0xce0/0xff0 [nvgpu]
[ 2950.289203] __arm64_sys_ioctl+0xac/0xf0
[ 2950.292965] el0_svc_common.constprop.0+0x80/0x1c4
[ 2950.297945] do_el0_svc+0x74/0x8c
[ 2950.301186] el0_svc+0x1c/0x2c
[ 2950.304157] el0_sync_handler+0x9c/0x120
[ 2950.308096] el0_sync+0x16c/0x180
[ 2950.311332] ---[ end trace 91a2047c276cbe5d ]---
[ 2950.317000] ------------[ cut here ]------------
[ 2950.320512] WARNING: CPU: 2 PID: 23102 at nvidia/nvgpu/drivers/gpu/nvgpu/common/gr/gr_setup.c:253 nvgpu_gr_setup_alloc_obj_ctx+0x21c/0x3f0 [nvgpu]
[ 2950.333474] Modules linked in: veth xt_nat xt_tcpudp xt_conntrack xt_MASQUERADE nf_conntrack_netlink xt_addrtype iptable_filter iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_tables x_tables br_netfilter overlay cfg80211 aes_ce_blk crypto_simd snd_soc_tegra186_asrc cryptd snd_soc_tegra186_dspk snd_soc_tegra210_ope snd_soc_tegra186_arad snd_soc_tegra210_iqc snd_soc_tegra210_mvc snd_soc_tegra210_admaif aes_ce_cipher snd_soc_tegra210_afc snd_soc_tegra210_dmic snd_soc_tegra210_mixer snd_soc_tegra210_amx snd_soc_tegra210_adx snd_soc_tegra210_i2s snd_soc_tegra_pcm snd_soc_tegra210_sfc ghash_ce snd_soc_tegra210_adsp snd_soc_tegra_machine_driver sha2_ce sha256_arm64 snd_soc_tegra_utils sha1_ce snd_soc_simple_card_utils cdc_acm snd_soc_spdif_tx leds_gpio snd_hda_codec_hdmi nvadsp userspace_alert snd_soc_tegra210_ahub tegra210_adma nct1008 snd_hda_tegra tegra_bpmp_thermal snd_hda_codec snd_soc_rt5640 snd_hda_core snd_soc_rl6231 spi_tegra114 pwm_fan nvidia_drm(O) nvidia_modeset(O)
[ 2950.333531] nvidia(O) nvgpu nvmap ina3221 fuse
[ 2950.425085] CPU: 2 PID: 23102 Comm: UnifiedMemorySt Tainted: G W O 5.10.104-l4t-r35.3.ga+g26cfd067b911 #1
[ 2950.435846] Hardware name: Unknown Jetson AGX Orin/Jetson AGX Orin, BIOS v35.3.1 01/24/2023
[ 2950.444246] pstate: 60400009 (nZCv daif +PAN -UAO -TCO BTYPE=--)
[ 2950.450355] pc : nvgpu_gr_setup_alloc_obj_ctx+0x21c/0x3f0 [nvgpu]
[ 2950.456647] lr : nvgpu_gr_setup_alloc_obj_ctx+0x20c/0x3f0 [nvgpu]
[ 2950.462882] sp : ffff80001266bc60
[ 2950.466294] x29: ffff80001266bc90 x28: ffff5661cc817000
[ 2950.471807] x27: ffff800016f66690 x26: ffff8000149e1020
[ 2950.477319] x25: 0000000000000000 x24: ffff5661c72d7d00
[ 2950.482834] x23: ffffce0dfa5724c8 x22: ffff5662358ab080
[ 2950.488258] x21: ffff8000149e1000 x20: ffff5661c74d0000
[ 2950.493596] x19: ffff800016f663b0 x18: 0000000000000000
[ 2950.499107] x17: 0000000000000000 x16: ffffce0e3f59d7ac
[ 2950.504622] x15: 0000fffff97b12b8 x14: 0000000000000001
[ 2950.510132] x13: 0000000000000038 x12: ffff56622e1f8880
[ 2950.515646] x11: 0000000000000000 x10: 00000000106fed2f
[ 2950.521157] x9 : 0000000000000000 x8 : ffff5662358abc00
[ 2950.526494] x7 : 0000000000000000 x6 : 0000000000000000
[ 2950.531831] x5 : ffff5661ccb9ab80 x4 : ffffce0dfa5c20d8
[ 2950.537258] x3 : 00000000000000f8 x2 : 0000000000000000
[ 2950.542594] x1 : 0000000000000000 x0 : 0000000000000000
[ 2950.547933] Call trace:
[ 2950.550446] nvgpu_gr_setup_alloc_obj_ctx+0x21c/0x3f0 [nvgpu]
[ 2950.556050] gk20a_channel_ioctl+0xce0/0xff0 [nvgpu]
[ 2950.560800] __arm64_sys_ioctl+0xac/0xf0
[ 2950.564561] el0_svc_common.constprop.0+0x80/0x1c4
[ 2950.569546] do_el0_svc+0x74/0x8c
[ 2950.572785] el0_svc+0x1c/0x2c
[ 2950.575757] el0_sync_handler+0x9c/0x120
[ 2950.579695] el0_sync+0x16c/0x180
[ 2950.582932] ---[ end trace 91a2047c276cbe5e ]---
[ 2950.587877] ------------[ cut here ]------------
[ 2950.592108] WARNING: CPU: 2 PID: 23102 at nvidia/nvgpu/drivers/gpu/nvgpu/common/gr/gr_setup.c:253 nvgpu_gr_setup_alloc_obj_ctx+0x21c/0x3f0 [nvgpu]
[ 2950.605071] Modules linked in: veth xt_nat xt_tcpudp xt_conntrack xt_MASQUERADE nf_conntrack_netlink xt_addrtype iptable_filter iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_tables x_tables br_netfilter overlay cfg80211 aes_ce_blk crypto_simd snd_soc_tegra186_asrc cryptd snd_soc_tegra186_dspk snd_soc_tegra210_ope snd_soc_tegra186_arad snd_soc_tegra210_iqc snd_soc_tegra210_mvc snd_soc_tegra210_admaif aes_ce_cipher snd_soc_tegra210_afc snd_soc_tegra210_dmic snd_soc_tegra210_mixer snd_soc_tegra210_amx snd_soc_tegra210_adx snd_soc_tegra210_i2s snd_soc_tegra_pcm snd_soc_tegra210_sfc ghash_ce snd_soc_tegra210_adsp snd_soc_tegra_machine_driver sha2_ce sha256_arm64 snd_soc_tegra_utils sha1_ce snd_soc_simple_card_utils cdc_acm snd_soc_spdif_tx leds_gpio snd_hda_codec_hdmi nvadsp userspace_alert snd_soc_tegra210_ahub tegra210_adma nct1008 snd_hda_tegra tegra_bpmp_thermal snd_hda_codec snd_soc_rt5640 snd_hda_core snd_soc_rl6231 spi_tegra114 pwm_fan nvidia_drm(O) nvidia_modeset(O)
[ 2950.605123] nvidia(O) nvgpu nvmap ina3221 fuse
[ 2950.696686] CPU: 2 PID: 23102 Comm: UnifiedMemorySt Tainted: G W O 5.10.104-l4t-r35.3.ga+g26cfd067b911 #1
[ 2950.707447] Hardware name: Unknown Jetson AGX Orin/Jetson AGX Orin, BIOS v35.3.1 01/24/2023
[ 2950.715847] pstate: 60400009 (nZCv daif +PAN -UAO -TCO BTYPE=--)
[ 2950.721948] pc : nvgpu_gr_setup_alloc_obj_ctx+0x21c/0x3f0 [nvgpu]
[ 2950.728245] lr : nvgpu_gr_setup_alloc_obj_ctx+0x20c/0x3f0 [nvgpu]
[ 2950.734484] sp : ffff80001266bc60
[ 2950.737894] x29: ffff80001266bc90 x28: ffff5661cc817000
[ 2950.743407] x27: ffff800016f67e28 x26: ffff8000149e1020
[ 2950.748919] x25: 0000000000000000 x24: ffff5661c72d7d00
[ 2950.754432] x23: ffffce0dfa5724c8 x22: ffff5661eb826600
[ 2950.759858] x21: ffff8000149e1000 x20: ffff5661c74d0000
[ 2950.765194] x19: ffff800016f67b48 x18: 0000000000000000
[ 2950.770706] x17: 0000000000000000 x16: ffffce0e3f59d7ac
[ 2950.776219] x15: 0000fffff97b12b8 x14: 0000000000000001
[ 2950.781733] x13: 0000000000000038 x12: ffff56622e1f8880
[ 2950.787246] x11: 0000000000000000 x10: 000000001087c82f
[ 2950.792758] x9 : 0000000000000000 x8 : ffff5661c2750c80
[ 2950.798095] x7 : 0000000000000000 x6 : 0000000000000000
[ 2950.803433] x5 : ffff5661ccb9ab80 x4 : ffffce0dfa5c20d8
[ 2950.808857] x3 : 00000000000000f8 x2 : 0000000000000000
[ 2950.814195] x1 : 0000000000000000 x0 : 0000000000000000
[ 2950.819533] Call trace:
[ 2950.822044] nvgpu_gr_setup_alloc_obj_ctx+0x21c/0x3f0 [nvgpu]
[ 2950.827647] gk20a_channel_ioctl+0xce0/0xff0 [nvgpu]
[ 2950.832400] __arm64_sys_ioctl+0xac/0xf0
[ 2950.836164] el0_svc_common.constprop.0+0x80/0x1c4
[ 2950.841146] do_el0_svc+0x74/0x8c
[ 2950.844385] el0_svc+0x1c/0x2c
[ 2950.847358] el0_sync_handler+0x9c/0x120
[ 2950.851296] el0_sync+0x16c/0x180
[ 2950.854533] ---[ end trace 91a2047c276cbe5f ]---
[ 2950.859640] ------------[ cut here ]------------
[ 2950.863702] WARNING: CPU: 2 PID: 23102 at nvidia/nvgpu/drivers/gpu/nvgpu/common/gr/gr_setup.c:253 nvgpu_gr_setup_alloc_obj_ctx+0x21c/0x3f0 [nvgpu]
[ 2950.876673] Modules linked in: veth xt_nat xt_tcpudp xt_conntrack xt_MASQUERADE nf_conntrack_netlink xt_addrtype iptable_filter iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_tables x_tables br_netfilter overlay cfg80211 aes_ce_blk crypto_simd snd_soc_tegra186_asrc cryptd snd_soc_tegra186_dspk snd_soc_tegra210_ope snd_soc_tegra186_arad snd_soc_tegra210_iqc snd_soc_tegra210_mvc snd_soc_tegra210_admaif aes_ce_cipher snd_soc_tegra210_afc snd_soc_tegra210_dmic snd_soc_tegra210_mixer snd_soc_tegra210_amx snd_soc_tegra210_adx snd_soc_tegra210_i2s snd_soc_tegra_pcm snd_soc_tegra210_sfc ghash_ce snd_soc_tegra210_adsp snd_soc_tegra_machine_driver sha2_ce sha256_arm64 snd_soc_tegra_utils sha1_ce snd_soc_simple_card_utils cdc_acm snd_soc_spdif_tx leds_gpio snd_hda_codec_hdmi nvadsp userspace_alert snd_soc_tegra210_ahub tegra210_adma nct1008 snd_hda_tegra tegra_bpmp_thermal snd_hda_codec snd_soc_rt5640 snd_hda_core snd_soc_rl6231 spi_tegra114 pwm_fan nvidia_drm(O) nvidia_modeset(O)
[ 2950.876723] nvidia(O) nvgpu nvmap ina3221 fuse
[ 2950.968287] CPU: 2 PID: 23102 Comm: UnifiedMemorySt Tainted: G W O 5.10.104-l4t-r35.3.ga+g26cfd067b911 #1
[ 2950.979046] Hardware name: Unknown Jetson AGX Orin/Jetson AGX Orin, BIOS v35.3.1 01/24/2023
[ 2950.987446] pstate: 60400009 (nZCv daif +PAN -UAO -TCO BTYPE=--)
[ 2950.993547] pc : nvgpu_gr_setup_alloc_obj_ctx+0x21c/0x3f0 [nvgpu]
[ 2950.999845] lr : nvgpu_gr_setup_alloc_obj_ctx+0x20c/0x3f0 [nvgpu]
[ 2951.006082] sp : ffff80001266bc60
[ 2951.009494] x29: ffff80001266bc90 x28: ffff5661cc817000
[ 2951.015007] x27: ffff800016f67970 x26: ffff8000149e1020
[ 2951.020519] x25: 0000000000000000 x24: ffff5661c72d7d00
[ 2951.026032] x23: ffffce0dfa5724c8 x22: ffff56622e142c80
[ 2951.031457] x21: ffff8000149e1000 x20: ffff5661c74d0000
[ 2951.036796] x19: ffff800016f67690 x18: 0000000000000000
[ 2951.042307] x17: 0000000000000000 x16: ffffce0e3f59d7ac
[ 2951.047820] x15: 0000fffff97b12b8 x14: 0000000000000001
[ 2951.053333] x13: 0000000000000038 x12: ffff56622e1f8880
[ 2951.058844] x11: 0000000000000000 x10: 0000000010cbb52f
[ 2951.064359] x9 : 0000000000000000 x8 : ffff56622e142200
[ 2951.069694] x7 : 0000000000000000 x6 : 0000000000000000
[ 2951.075032] x5 : ffff5661ccb9ab80 x4 : ffffce0dfa5c20d8
[ 2951.080458] x3 : 00000000000000f8 x2 : 0000000000000000
[ 2951.085794] x1 : 0000000000000000 x0 : 0000000000000000
[ 2951.091133] Call trace:
[ 2951.093643] nvgpu_gr_setup_alloc_obj_ctx+0x21c/0x3f0 [nvgpu]
[ 2951.099247] gk20a_channel_ioctl+0xce0/0xff0 [nvgpu]
[ 2951.103998] __arm64_sys_ioctl+0xac/0xf0
[ 2951.107761] el0_svc_common.constprop.0+0x80/0x1c4
[ 2951.112745] do_el0_svc+0x74/0x8c
[ 2951.115983] el0_svc+0x1c/0x2c
[ 2951.118959] el0_sync_handler+0x9c/0x120
[ 2951.122896] el0_sync+0x16c/0x180
[ 2951.126133] ---[ end trace 91a2047c276cbe60 ]---
[ 2951.131022] ------------[ cut here ]------------
[ 2951.135301] WARNING: CPU: 2 PID: 23102 at nvidia/nvgpu/drivers/gpu/nvgpu/common/gr/gr_setup.c:253 nvgpu_gr_setup_alloc_obj_ctx+0x21c/0x3f0 [nvgpu]
[ 2951.148272] Modules linked in: veth xt_nat xt_tcpudp xt_conntrack xt_MASQUERADE nf_conntrack_netlink xt_addrtype iptable_filter iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_tables x_tables br_netfilter overlay cfg80211 aes_ce_blk crypto_simd snd_soc_tegra186_asrc cryptd snd_soc_tegra186_dspk snd_soc_tegra210_ope snd_soc_tegra186_arad snd_soc_tegra210_iqc snd_soc_tegra210_mvc snd_soc_tegra210_admaif aes_ce_cipher snd_soc_tegra210_afc snd_soc_tegra210_dmic snd_soc_tegra210_mixer snd_soc_tegra210_amx snd_soc_tegra210_adx snd_soc_tegra210_i2s snd_soc_tegra_pcm snd_soc_tegra210_sfc ghash_ce snd_soc_tegra210_adsp snd_soc_tegra_machine_driver sha2_ce sha256_arm64 snd_soc_tegra_utils sha1_ce snd_soc_simple_card_utils cdc_acm snd_soc_spdif_tx leds_gpio snd_hda_codec_hdmi nvadsp userspace_alert snd_soc_tegra210_ahub tegra210_adma nct1008 snd_hda_tegra tegra_bpmp_thermal snd_hda_codec snd_soc_rt5640 snd_hda_core snd_soc_rl6231 spi_tegra114 pwm_fan nvidia_drm(O) nvidia_modeset(O)
[ 2951.148324] nvidia(O) nvgpu nvmap ina3221 fuse
[ 2951.239884] CPU: 2 PID: 23102 Comm: UnifiedMemorySt Tainted: G W O 5.10.104-l4t-r35.3.ga+g26cfd067b911 #1
[ 2951.250645] Hardware name: Unknown Jetson AGX Orin/Jetson AGX Orin, BIOS v35.3.1 01/24/2023
[ 2951.259048] pstate: 60400009 (nZCv daif +PAN -UAO -TCO BTYPE=--)
[ 2951.265145] pc : nvgpu_gr_setup_alloc_obj_ctx+0x21c/0x3f0 [nvgpu]
[ 2951.271443] lr : nvgpu_gr_setup_alloc_obj_ctx+0x20c/0x3f0 [nvgpu]
[ 2951.277684] sp : ffff80001266bc60
[ 2951.281095] x29: ffff80001266bc90 x28: ffff5661cc817000
[ 2951.286608] x27: ffff800016f674b8 x26: ffff8000149e1020
[ 2951.292119] x25: 0000000000000000 x24: ffff5661c72d7d00
[ 2951.297633] x23: ffffce0dfa5724c8 x22: ffff5661cab8de00
[ 2951.303058] x21: ffff8000149e1000 x20: ffff5661c74d0000
[ 2951.308394] x19: ffff800016f671d8 x18: 0000000000000000
[ 2951.313907] x17: 0000000000000000 x16: ffffce0e3f59d7ac
[ 2951.319419] x15: 0000fffff97b12b8 x14: 0000000000000001
[ 2951.324932] x13: 0000000000000038 x12: ffff56622e1f8880
[ 2951.330444] x11: 0000000000000000 x10: 000000001748d02f
[ 2951.335958] x9 : 0000000000000000 x8 : ffff5661cab8df80
[ 2951.341295] x7 : 0000000000000000 x6 : 0000000000000000
[ 2951.346632] x5 : ffff5661ccb9ab80 x4 : ffffce0dfa5c20d8
[ 2951.352056] x3 : 00000000000000f8 x2 : 0000000000000000
[ 2951.357395] x1 : 0000000000000000 x0 : 0000000000000000
[ 2951.362733] Call trace:
[ 2951.365242] nvgpu_gr_setup_alloc_obj_ctx+0x21c/0x3f0 [nvgpu]
[ 2951.370846] gk20a_channel_ioctl+0xce0/0xff0 [nvgpu]
[ 2951.375599] __arm64_sys_ioctl+0xac/0xf0
[ 2951.379360] el0_svc_common.constprop.0+0x80/0x1c4
[ 2951.384347] do_el0_svc+0x74/0x8c
[ 2951.387583] el0_svc+0x1c/0x2c
[ 2951.390558] el0_sync_handler+0x9c/0x120
[ 2951.394494] el0_sync+0x16c/0x180
[ 2951.397733] ---[ end trace 91a2047c276cbe61 ]---
[ 2951.402759] ------------[ cut here ]------------
[ 2951.406901] WARNING: CPU: 2 PID: 23102 at nvidia/nvgpu/drivers/gpu/nvgpu/common/gr/gr_setup.c:253 nvgpu_gr_setup_alloc_obj_ctx+0x21c/0x3f0 [nvgpu]
[ 2951.419871] Modules linked in: veth xt_nat xt_tcpudp xt_conntrack xt_MASQUERADE nf_conntrack_netlink xt_addrtype iptable_filter iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_tables x_tables br_netfilter overlay cfg80211 aes_ce_blk crypto_simd snd_soc_tegra186_asrc cryptd snd_soc_tegra186_dspk snd_soc_tegra210_ope snd_soc_tegra186_arad snd_soc_tegra210_iqc snd_soc_tegra210_mvc snd_soc_tegra210_admaif aes_ce_cipher snd_soc_tegra210_afc snd_soc_tegra210_dmic snd_soc_tegra210_mixer snd_soc_tegra210_amx snd_soc_tegra210_adx snd_soc_tegra210_i2s snd_soc_tegra_pcm snd_soc_tegra210_sfc ghash_ce snd_soc_tegra210_adsp snd_soc_tegra_machine_driver sha2_ce sha256_arm64 snd_soc_tegra_utils sha1_ce snd_soc_simple_card_utils cdc_acm snd_soc_spdif_tx leds_gpio snd_hda_codec_hdmi nvadsp userspace_alert snd_soc_tegra210_ahub tegra210_adma nct1008 snd_hda_tegra tegra_bpmp_thermal snd_hda_codec snd_soc_rt5640 snd_hda_core snd_soc_rl6231 spi_tegra114 pwm_fan nvidia_drm(O) nvidia_modeset(O)
[ 2951.419924] nvidia(O) nvgpu nvmap ina3221 fuse
[ 2951.511485] CPU: 2 PID: 23102 Comm: UnifiedMemorySt Tainted: G W O 5.10.104-l4t-r35.3.ga+g26cfd067b911 #1
[ 2951.522246] Hardware name: Unknown Jetson AGX Orin/Jetson AGX Orin, BIOS v35.3.1 01/24/2023
[ 2951.530648] pstate: 60400009 (nZCv daif +PAN -UAO -TCO BTYPE=--)
[ 2951.536744] pc : nvgpu_gr_setup_alloc_obj_ctx+0x21c/0x3f0 [nvgpu]
[ 2951.543044] lr : nvgpu_gr_setup_alloc_obj_ctx+0x20c/0x3f0 [nvgpu]
[ 2951.549284] sp : ffff80001266bc60
[ 2951.552695] x29: ffff80001266bc90 x28: ffff5661cc817000
[ 2951.558208] x27: ffff800016f66b48 x26: ffff8000149e1020
[ 2951.563720] x25: 0000000000000000 x24: ffff5661c72d7d00
[ 2951.569231] x23: ffffce0dfa5724c8 x22: ffff5661cfc39a00
[ 2951.574656] x21: ffff8000149e1000 x20: ffff5661c74d0000
[ 2951.579995] x19: ffff800016f66868 x18: 0000000000000000
[ 2951.585506] x17: 0000000000000000 x16: ffffce0e3f59d7ac
[ 2951.591019] x15: 0000fffff97b12b8 x14: 0000000000000001
[ 2951.596533] x13: 0000000000000038 x12: ffff56622e1f8880
[ 2951.602044] x11: 0000000000000000 x10: 0000000010fc422f
[ 2951.607558] x9 : 0000000000000000 x8 : ffff5661cfc39080
[ 2951.612895] x7 : 0000000000000000 x6 : 0000000000000000
[ 2951.618232] x5 : ffff5661ccb9ab80 x4 : ffffce0dfa5c20d8
[ 2951.623657] x3 : 00000000000000f8 x2 : 0000000000000000
[ 2951.628995] x1 : 0000000000000000 x0 : 0000000000000000
[ 2951.634334] Call trace:
[ 2951.636841] nvgpu_gr_setup_alloc_obj_ctx+0x21c/0x3f0 [nvgpu]
[ 2951.642445] gk20a_channel_ioctl+0xce0/0xff0 [nvgpu]
[ 2951.647200] __arm64_sys_ioctl+0xac/0xf0
[ 2951.650961] el0_svc_common.constprop.0+0x80/0x1c4
[ 2951.655947] do_el0_svc+0x74/0x8c
[ 2951.659183] el0_svc+0x1c/0x2c
[ 2951.662158] el0_sync_handler+0x9c/0x120
[ 2951.666095] el0_sync+0x16c/0x180
[ 2951.669333] ---[ end trace 91a2047c276cbe62 ]---
[ 2951.674191] ------------[ cut here ]------------
[ 2951.678498] WARNING: CPU: 2 PID: 23102 at nvidia/nvgpu/drivers/gpu/nvgpu/common/gr/gr_setup.c:253 nvgpu_gr_setup_alloc_obj_ctx+0x21c/0x3f0 [nvgpu]
[ 2951.691473] Modules linked in: veth xt_nat xt_tcpudp xt_conntrack xt_MASQUERADE nf_conntrack_netlink xt_addrtype iptable_filter iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_tables x_tables br_netfilter overlay cfg80211 aes_ce_blk crypto_simd snd_soc_tegra186_asrc cryptd snd_soc_tegra186_dspk snd_soc_tegra210_ope snd_soc_tegra186_arad snd_soc_tegra210_iqc snd_soc_tegra210_mvc snd_soc_tegra210_admaif aes_ce_cipher snd_soc_tegra210_afc snd_soc_tegra210_dmic snd_soc_tegra210_mixer snd_soc_tegra210_amx snd_soc_tegra210_adx snd_soc_tegra210_i2s snd_soc_tegra_pcm snd_soc_tegra210_sfc ghash_ce snd_soc_tegra210_adsp snd_soc_tegra_machine_driver sha2_ce sha256_arm64 snd_soc_tegra_utils sha1_ce snd_soc_simple_card_utils cdc_acm snd_soc_spdif_tx leds_gpio snd_hda_codec_hdmi nvadsp userspace_alert snd_soc_tegra210_ahub tegra210_adma nct1008 snd_hda_tegra tegra_bpmp_thermal snd_hda_codec snd_soc_rt5640 snd_hda_core snd_soc_rl6231 spi_tegra114 pwm_fan nvidia_drm(O) nvidia_modeset(O)
[ 2951.691523] nvidia(O) nvgpu nvmap ina3221 fuse
[ 2951.783083] CPU: 2 PID: 23102 Comm: UnifiedMemorySt Tainted: G W O 5.10.104-l4t-r35.3.ga+g26cfd067b911 #1
[ 2951.793848] Hardware name: Unknown Jetson AGX Orin/Jetson AGX Orin, BIOS v35.3.1 01/24/2023
[ 2951.802247] pstate: 60400009 (nZCv daif +PAN -UAO -TCO BTYPE=--)
[ 2951.808345] pc : nvgpu_gr_setup_alloc_obj_ctx+0x21c/0x3f0 [nvgpu]
[ 2951.814642] lr : nvgpu_gr_setup_alloc_obj_ctx+0x20c/0x3f0 [nvgpu]
[ 2951.820884] sp : ffff80001266bc60
[ 2951.824294] x29: ffff80001266bc90 x28: ffff5661cc817000
[ 2951.829807] x27: ffff800016f661d8 x26: ffff8000149e1020
[ 2951.835320] x25: 0000000000000000 x24: ffff5661c72d7d00
[ 2951.840833] x23: ffffce0dfa5724c8 x22: ffff566234900380
[ 2951.846258] x21: ffff8000149e1000 x20: ffff5661c74d0000
[ 2951.851595] x19: ffff800016f65ef8 x18: 0000000000000000
[ 2951.857107] x17: 0000000000000000 x16: ffffce0e3f59d7ac
[ 2951.862620] x15: 0000fffff97b12b8 x14: 0000000000000001
[ 2951.868133] x13: 0000000000000038 x12: ffff56622e1f8880
[ 2951.873645] x11: 0000000000000000 x10: 00000000174b502f
[ 2951.879157] x9 : 0000000000000000 x8 : ffff5661c758c000
[ 2951.884495] x7 : 0000000000000000 x6 : 0000000000000000
[ 2951.889833] x5 : ffff5661ccb9ab80 x4 : ffffce0dfa5c20d8
[ 2951.895258] x3 : 00000000000000f8 x2 : 0000000000000000
[ 2951.900595] x1 : 0000000000000000 x0 : 0000000000000000
[ 2951.905932] Call trace:
[ 2951.908443] nvgpu_gr_setup_alloc_obj_ctx+0x21c/0x3f0 [nvgpu]
[ 2951.914045] gk20a_channel_ioctl+0xce0/0xff0 [nvgpu]
[ 2951.918798] __arm64_sys_ioctl+0xac/0xf0
[ 2951.922562] el0_svc_common.constprop.0+0x80/0x1c4
[ 2951.927546] do_el0_svc+0x74/0x8c
[ 2951.930782] el0_svc+0x1c/0x2c
[ 2951.933757] el0_sync_handler+0x9c/0x120
[ 2951.937697] el0_sync+0x16c/0x180
[ 2951.940932] ---[ end trace 91a2047c276cbe63 ]---
[ 2951.945962] ------------[ cut here ]------------
[ 2951.950100] WARNING: CPU: 2 PID: 23102 at nvidia/nvgpu/drivers/gpu/nvgpu/common/gr/gr_setup.c:253 nvgpu_gr_setup_alloc_obj_ctx+0x21c/0x3f0 [nvgpu]
[ 2951.963071] Modules linked in: veth xt_nat xt_tcpudp xt_conntrack xt_MASQUERADE nf_conntrack_netlink xt_addrtype iptable_filter iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_tables x_tables br_netfilter overlay cfg80211 aes_ce_blk crypto_simd snd_soc_tegra186_asrc cryptd snd_soc_tegra186_dspk snd_soc_tegra210_ope snd_soc_tegra186_arad snd_soc_tegra210_iqc snd_soc_tegra210_mvc snd_soc_tegra210_admaif aes_ce_cipher snd_soc_tegra210_afc snd_soc_tegra210_dmic snd_soc_tegra210_mixer snd_soc_tegra210_amx snd_soc_tegra210_adx snd_soc_tegra210_i2s snd_soc_tegra_pcm snd_soc_tegra210_sfc ghash_ce snd_soc_tegra210_adsp snd_soc_tegra_machine_driver sha2_ce sha256_arm64 snd_soc_tegra_utils sha1_ce snd_soc_simple_card_utils cdc_acm snd_soc_spdif_tx leds_gpio snd_hda_codec_hdmi nvadsp userspace_alert snd_soc_tegra210_ahub tegra210_adma nct1008 snd_hda_tegra tegra_bpmp_thermal snd_hda_codec snd_soc_rt5640 snd_hda_core snd_soc_rl6231 spi_tegra114 pwm_fan nvidia_drm(O) nvidia_modeset(O)
[ 2951.963123] nvidia(O) nvgpu nvmap ina3221 fuse
[ 2952.054684] CPU: 2 PID: 23102 Comm: UnifiedMemorySt Tainted: G W O 5.10.104-l4t-r35.3.ga+g26cfd067b911 #1
[ 2952.065446] Hardware name: Unknown Jetson AGX Orin/Jetson AGX Orin, BIOS v35.3.1 01/24/2023
[ 2952.073846] pstate: 60400009 (nZCv daif +PAN -UAO -TCO BTYPE=--)
[ 2952.079944] pc : nvgpu_gr_setup_alloc_obj_ctx+0x21c/0x3f0 [nvgpu]
[ 2952.086242] lr : nvgpu_gr_setup_alloc_obj_ctx+0x20c/0x3f0 [nvgpu]
[ 2952.092482] sp : ffff80001266bc60
[ 2952.095895] x29: ffff80001266bc90 x28: ffff5661cc817000
[ 2952.101407] x27: ffff800016f65d20 x26: ffff8000149e1020
[ 2952.106920] x25: 0000000000000000 x24: ffff5661c72d7d00
[ 2952.112431] x23: ffffce0dfa5724c8 x22: ffff5661cfe54300
[ 2952.117856] x21: ffff8000149e1000 x20: ffff5661c74d0000
[ 2952.123196] x19: ffff800016f65a40 x18: 0000000000000000
[ 2952.128707] x17: 0000000000000000 x16: ffffce0e3f59d7ac
[ 2952.134219] x15: 0000fffff97b12b8 x14: 0000000000000001
[ 2952.139731] x13: 0000000000000038 x12: ffff56622e1f8880
[ 2952.145244] x11: 0000000000000000 x10: 000000001057be2f
[ 2952.150757] x9 : 0000000000000000 x8 : ffff5661cfe54b80
[ 2952.156095] x7 : 0000000000000000 x6 : 0000000000000000
[ 2952.161434] x5 : ffff5661ccb9ab80 x4 : ffffce0dfa5c20d8
[ 2952.166858] x3 : 00000000000000f8 x2 : 0000000000000000
[ 2952.172195] x1 : 0000000000000000 x0 : 0000000000000000
[ 2952.177532] Call trace:
[ 2952.180044] nvgpu_gr_setup_alloc_obj_ctx+0x21c/0x3f0 [nvgpu]
[ 2952.185646] gk20a_channel_ioctl+0xce0/0xff0 [nvgpu]
[ 2952.190396] __arm64_sys_ioctl+0xac/0xf0
[ 2952.194160] el0_svc_common.constprop.0+0x80/0x1c4
[ 2952.199145] do_el0_svc+0x74/0x8c
[ 2952.202382] el0_svc+0x1c/0x2c
[ 2952.205357] el0_sync_handler+0x9c/0x120
[ 2952.209296] el0_sync+0x16c/0x180
[ 2952.212531] ---[ end trace 91a2047c276cbe64 ]---
[ 2952.218970] nvgpu: 17000000.ga10b gp10b_priv_ring_isr:217 [ERR] ringmaster intr status0: 0x00000100, status1: 0x00000000
[ 2952.229107] nvgpu: 17000000.ga10b ga10b_priv_ring_handle_sys_write_errors:560 [ERR] SYS write error: ADR 0x0050b0c0 WRDAT 0x00001000 master 0x00000021
[ 2952.242479] nvgpu: 17000000.ga10b ga10b_priv_ring_handle_sys_write_errors:563 [ERR] INFO 0x19408321: (subid 0x00000019 priv_level 0 local_ordering 1)
[ 2952.255976] nvgpu: 17000000.ga10b ga10b_priv_ring_handle_sys_write_errors:568 [ERR] CODE 0xbadf2021
[ 2952.265242] nvgpu: 17000000.ga10b nvgpu_cic_mon_report_err_safety_services:55 [ERR] Error reporting is not supported in this platform
[ 2952.277402] nvgpu: 17000000.ga10b ga10b_priv_ring_decode_error_code:542 [ERR] [Error Type]: orphan(gpc/fbp)
[ 2952.287204] nvgpu: 17000000.ga10b decode_fecs_pri_orphan_error:363 [ERR] [Extra Info]: target_ringstation(0x21)
[ 2955.219754] nvgpu: 17000000.ga10b nvgpu_timeout_expired_msg_cpu:94 [ERR] Timeout detected @ ga10b_gr_init_wait_idle+0x9c/0x160 [nvgpu]
[ 2955.220153] nvgpu: 17000000.ga10b ga10b_gr_init_wait_idle:364 [ERR] timeout gr busy : 1
[ 2955.220433] nvgpu: 17000000.ga10b nvgpu_gr_obj_ctx_alloc_golden_ctx_image:776 [ERR] fail
[ 2955.220680] nvgpu: 17000000.ga10b nvgpu_gr_obj_ctx_alloc:879 [ERR] fail to init golden ctx image
[ 2955.220987] nvgpu: 17000000.ga10b nvgpu_gr_obj_ctx_alloc:929 [ERR] fail
[ 2955.221217] nvgpu: 17000000.ga10b nvgpu_gr_setup_alloc_obj_ctx:216 [ERR] failed to allocate gr ctx buffer
[ 2955.221572
I am running a customized L4T-R35.3.
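In case it helps with triage, the distinct priv ring write errors in the log above can be pulled out with a short script. This is only a sketch: the function name `summarize_sys_write_errors` is made up for illustration, and the regex assumes the exact `SYS write error: ADR ... WRDAT ... master ...` wording seen in these nvgpu messages.

```python
import re
from collections import Counter

# Assumption: matches the "SYS write error" lines exactly as they appear
# in the nvgpu dmesg output quoted in this post.
SYS_WRITE_RE = re.compile(
    r"SYS write error: ADR (0x[0-9a-fA-F]+) "
    r"WRDAT (0x[0-9a-fA-F]+) master (0x[0-9a-fA-F]+)"
)


def summarize_sys_write_errors(dmesg_text):
    """Count each distinct (ADR, WRDAT, master) triple found in dmesg text.

    Hypothetical helper for comparing failing boots; it only groups the
    values the driver already printed and does not decode them.
    """
    counts = Counter()
    for line in dmesg_text.splitlines():
        match = SYS_WRITE_RE.search(line)
        if match:
            counts[match.groups()] += 1
    return counts
```

Feeding it the text of `dmesg` from a failing boot gives one entry per distinct address/data pair, which makes it easier to see whether the same register writes fail every time.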