Hello,
Yesterday my AGX rebooted by itself. Since then it no longer boots into the GUI/desktop, and applications (e.g. Python venvs) do not work properly.
kern.log shows the following entries after the reboot:
Nov 25 14:30:47 dominik-desktop kernel: [ 455.769692] ------------[ cut here ]------------
Nov 25 14:30:47 dominik-desktop kernel: [ 455.771149] WARNING: CPU: 6 PID: 7762 at /dvs/git/dirty/git-master_linux/kernel/nvgpu/drivers/gpu/nvgpu/common/mm/nvgpu_mem.c:258 nvgpu_mem_wr_n+0xd0/0xe0 [nvgpu]
Nov 25 14:30:47 dominik-desktop kernel: [ 455.772499] Modules linked in: zram overlay spidev binfmt_misc nvgpu bluedroid_pm ip_tables x_tables
Nov 25 14:30:47 dominik-desktop kernel: [ 455.772523]
Nov 25 14:30:47 dominik-desktop kernel: [ 455.772529] CPU: 6 PID: 7762 Comm: vulkaninfo Tainted: G W 4.9.140-tegra #1
Nov 25 14:30:47 dominik-desktop kernel: [ 455.772532] Hardware name: Jetson-AGX (DT)
Nov 25 14:30:47 dominik-desktop kernel: [ 455.772536] task: ffffffc7a8c12a00 task.stack: ffffffc7d83b0000
Nov 25 14:30:47 dominik-desktop kernel: [ 455.772765] PC is at nvgpu_mem_wr_n+0xd0/0xe0 [nvgpu]
Nov 25 14:30:47 dominik-desktop kernel: [ 455.772970] LR is at gr_gk20a_load_golden_ctx_image+0x8c/0x2a0 [nvgpu]
Nov 25 14:30:47 dominik-desktop kernel: [ 455.772974] pc : [<ffffff8000fcf2e8>] lr : [<ffffff8000ffdc4c>] pstate: 00400045
Nov 25 14:30:47 dominik-desktop kernel: [ 455.772976] sp : ffffffc7d83b3be0
Nov 25 14:30:47 dominik-desktop kernel: [ 455.772979] x29: ffffffc7d83b3be0 x28: ffffff8012343018
Nov 25 14:30:47 dominik-desktop kernel: [ 455.772986] x27: ffffff8001090c90 x26: ffffff8012343000
Nov 25 14:30:47 dominik-desktop kernel: [ 455.772992] x25: ffffff8001090c28 x24: ffffffc7c4748000
Nov 25 14:30:47 dominik-desktop kernel: [ 455.772998] x23: ffffffc7c4740000 x22: ffffff800bc5a000
Nov 25 14:30:47 dominik-desktop kernel: [ 455.773005] x21: 0000000000000001 x20: ffffff8012343018
Nov 25 14:30:47 dominik-desktop kernel: [ 455.773011] x19: 0000000000000000 x18: 0000000000000000
Nov 25 14:30:47 dominik-desktop kernel: [ 455.773017] x17: 0000007f9244c530 x16: ffffff8008272980
Nov 25 14:30:47 dominik-desktop kernel: [ 455.773023] x15: 0000000000000000 x14: 0000000001c378bd
Nov 25 14:30:47 dominik-desktop kernel: [ 455.773030] x13: 000000000000004c x12: 071c71c71c71c71c
Nov 25 14:30:47 dominik-desktop kernel: [ 455.773036] x11: 000000000000000b x10: 0101010101010101
Nov 25 14:30:47 dominik-desktop kernel: [ 455.773043] x9 : fffffffffffffffa x8 : 7f7f7f7f7f7f7f7f
Nov 25 14:30:47 dominik-desktop kernel: [ 455.773049] x7 : fefefeff646c606d x6 : 0000000002209001
Nov 25 14:30:47 dominik-desktop kernel: [ 455.773055] x5 : 0000000000100c80 x4 : 0000000000000001
Nov 25 14:30:47 dominik-desktop kernel: [ 455.773062] x3 : ffffff800bc5a000 x2 : 0000000000000000
Nov 25 14:30:47 dominik-desktop kernel: [ 455.773071] x1 : ffffff8012343018 x0 : ffffff8000ffdc4c
Nov 25 14:30:47 dominik-desktop kernel: [ 455.773076]
Nov 25 14:30:47 dominik-desktop kernel: [ 455.773079] ---[ end trace 5cf0a372f4d1d0d7 ]---
Nov 25 14:30:47 dominik-desktop kernel: [ 455.774242] Call trace:
Nov 25 14:30:47 dominik-desktop kernel: [ 455.774484] [<ffffff8000fcf2e8>] nvgpu_mem_wr_n+0xd0/0xe0 [nvgpu]
Nov 25 14:30:47 dominik-desktop kernel: [ 455.774677] [<ffffff8000ffdc4c>] gr_gk20a_load_golden_ctx_image+0x8c/0x2a0 [nvgpu]
Nov 25 14:30:47 dominik-desktop kernel: [ 455.774868] [<ffffff8000ffff3c>] gk20a_alloc_obj_ctx+0x6b4/0xac0 [nvgpu]
Nov 25 14:30:47 dominik-desktop kernel: [ 455.775083] [<ffffff8000fa1178>] gk20a_channel_ioctl+0xaf8/0x1320 [nvgpu]
Nov 25 14:30:47 dominik-desktop kernel: [ 455.775091] [<ffffff8008272158>] do_vfs_ioctl+0xb0/0x8d8
Nov 25 14:30:47 dominik-desktop kernel: [ 455.775095] [<ffffff8008272a0c>] SyS_ioctl+0x8c/0xa8
Nov 25 14:30:47 dominik-desktop kernel: [ 455.775101] [<ffffff8008083900>] el0_svc_naked+0x34/0x38
Nov 25 14:30:47 dominik-desktop kernel: [ 455.776455] nvgpu: 17000000.gv11b gk20a_gr_handle_fecs_error:5294 [ERR] ctxsw intr0 set by ucode, error_code: 0x00000015
Nov 25 14:30:47 dominik-desktop kernel: [ 455.777846] ---- mlocks ----
Nov 25 14:30:47 dominik-desktop kernel: [ 455.777894]
Nov 25 14:30:47 dominik-desktop kernel: [ 455.777897] ---- syncpts ----
Nov 25 14:30:47 dominik-desktop kernel: [ 455.777907] id 2 (disp_a) min 1 max 1 refs 1 (previous client : )
Nov 25 14:30:47 dominik-desktop kernel: [ 455.777911] id 3 (disp_b) min 1 max 1 refs 1 (previous client : )
Nov 25 14:30:47 dominik-desktop kernel: [ 455.777920] id 8 (vblank0) min 27225 max -2 refs 1 (previous client : )
Nov 25 14:30:47 dominik-desktop kernel: [ 455.777934] id 20 (gv11b_511) min 5 max 6 refs 1 (previous client : gv11b_511)
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778524]
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778527] ---- channels ----
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778547]
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778547] channel 2 - 15820000.se
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778547]
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778550] NvHost basic channel registers:
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778555] CMDFIFO_STAT_0: 00002040
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778559] CMDFIFO_RDATA_0: 8e4408b8
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778565] CMDP_OFFSET_0: 00000000
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778569] CMDP_CLASS_0: 00000000
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778572] CHANNELSTAT_0: 00000000
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778575] The CDMA sync queue is empty.
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778577]
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778581]
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778581] channel 3 - 15830000.se
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778581]
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778584] NvHost basic channel registers:
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778587] CMDFIFO_STAT_0: 00002040
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778591] CMDFIFO_RDATA_0: 0000a400
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778595] CMDP_OFFSET_0: 00000000
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778598] CMDP_CLASS_0: 00000000
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778601] CHANNELSTAT_0: 00000000
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778604] The CDMA sync queue is empty.
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778606]
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778610]
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778610] channel 4 - 15840000.se
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778610]
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778613] NvHost basic channel registers:
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778616] CMDFIFO_STAT_0: 00002040
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778619] CMDFIFO_RDATA_0: 04040028
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778623] CMDP_OFFSET_0: 00000000
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778626] CMDP_CLASS_0: 00000000
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778629] CHANNELSTAT_0: 00000000
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778632] The CDMA sync queue is empty.
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778634]
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778639]
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778639] ---- host general irq ----
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778639]
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778643] sync_intc0mask = 0x00000001
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778646] sync_intmask = 0x50000003
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778648]
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778648] ---- host syncpt irq mask ----
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778648]
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778651]
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778651] ---- host syncpt irq status ----
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778651]
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778655] syncpt_thresh_cpu0_int_status(0) = 0x00000000
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778659] syncpt_thresh_cpu0_int_status(1) = 0x00000000
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778663] syncpt_thresh_cpu0_int_status(2) = 0x00000000
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778666] syncpt_thresh_cpu0_int_status(3) = 0x00000000
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778670] syncpt_thresh_cpu0_int_status(4) = 0x00000000
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778674] syncpt_thresh_cpu0_int_status(5) = 0x00000000
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778677] syncpt_thresh_cpu0_int_status(6) = 0x00000000
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778681] syncpt_thresh_cpu0_int_status(7) = 0x00000000
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778685] syncpt_thresh_cpu0_int_status(8) = 0x00000000
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778688] syncpt_thresh_cpu0_int_status(9) = 0x00000000
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778691] syncpt_thresh_cpu0_int_status(10) = 0x00000000
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778695] syncpt_thresh_cpu0_int_status(11) = 0x00000000
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778698] syncpt_thresh_cpu0_int_status(12) = 0x00000000
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778701] syncpt_thresh_cpu0_int_status(13) = 0x00000000
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778705] syncpt_thresh_cpu0_int_status(14) = 0x00000000
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778708] syncpt_thresh_cpu0_int_status(15) = 0x00000000
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778711] syncpt_thresh_cpu0_int_status(16) = 0x00000000
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778715] syncpt_thresh_cpu0_int_status(17) = 0x00000000
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778718] syncpt_thresh_cpu0_int_status(18) = 0x00000000
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778721] syncpt_thresh_cpu0_int_status(19) = 0x00000000
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778725] syncpt_thresh_cpu0_int_status(20) = 0x00000000
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778728] syncpt_thresh_cpu0_int_status(21) = 0x00000000
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778734] gv11b pbdma 0:
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778738] id: 0 (tsg), next_id: 0 (tsg) chan status: valid
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778754] PBDMA_PUT: 0000001efc020934 PBDMA_GET: 0000001efc020560 GP_PUT: 00000002 GP_GET: 00000001 FETCH: 00000002 HEADER: 800015d0
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778754] HDR: 80000574 SHADOW0: fc020000 SHADOW1: 0009341e
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778758] gv11b pbdma 1:
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778761] id: 144 (tsg), next_id: 1 (tsg) chan status: invalid
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778774] PBDMA_PUT: 0000000821880220 PBDMA_GET: 00000065002a4d80 GP_PUT: 00000000 GP_GET: 10080080 FETCH: 00000000 HEADER: 00002010
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778774] HDR: 49104900 SHADOW0: 41881820 SHADOW1: 40001148
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778778] gv11b pbdma 2:
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778781] id: 0 (tsg), next_id: 8 (tsg) chan status: invalid
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778794] PBDMA_PUT: 0000001202000950 PBDMA_GET: 0000000422204000 GP_PUT: 00000000 GP_GET: d0c02a40 FETCH: 00000000 HEADER: a0510044
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778794] HDR: 01808500 SHADOW0: 3001a804 SHADOW1: 00c0cb10
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778796]
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778804] gv11b eng 0:
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778807] id: 0 (tsg), next_id: 0 (tsg), ctx status: valid
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778809]
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778812] gv11b eng 1:
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778815] id: 417 (tsg), next_id: 130 (tsg), ctx status: invalid
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778817]
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778820] gv11b eng 2:
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778823] id: 8 (tsg), next_id: 4 (tsg), ctx status: invalid
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778825]
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778828] gv11b eng 3:
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778831] id: 16 (tsg), next_id: 1 (tsg), ctx status: invalid
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778833]
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778835]
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778883] 511-gv11b, pid 7762, refs: 5:
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778887] channel status: in use on_pbdma busy
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778894] RAMFC : TOP: 0000000000000000 PUT: 0000000000000000 GET: 0000000000000000 FETCH: 0000000000000000
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778894] HEADER: 60400000 COUNT: 00000000
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778894] SEMAPHORE: addr hi: 00000000 addr lo: 00000000
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778894] payload 00000000 execute 00000000
Nov 25 14:30:47 dominik-desktop kernel: [ 455.778896]
Nov 25 14:30:54 dominik-desktop kernel: [ 462.675204] tegradc 15200000.nvdisplay: read_edid_into_buffer: extension_blocks = 1, max_ext_blocks = 3
Nov 25 14:30:54 dominik-desktop kernel: [ 462.690726] tegradc 15200000.nvdisplay: hdmi_recheck_edid: read_edid_into_buffer() returned 256
Nov 25 14:30:54 dominik-desktop kernel: [ 462.690737] tegradc 15200000.nvdisplay: old edid len = 256
Nov 25 14:30:54 dominik-desktop kernel: [ 462.690759] tegradc 15200000.nvdisplay: hdmi: No EDID change after HPD bounce, taking no action
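In case it helps with triage, here is a minimal Python sketch for pulling only the warning and error lines out of a dump like the one above. The function name `extract_key_lines` and the two regular expressions are my own assumptions, chosen to match this excerpt (the `WARNING: CPU` header and nvgpu `[ERR]` messages); they are not part of any NVIDIA tooling and may need adjusting for other log formats.

```python
import re

# Assumption: the interesting lines are kernel WARNING headers and
# driver messages tagged "[ERR]", as seen in the excerpt above.
KEY_PATTERN = re.compile(r"WARNING: CPU|\[ERR\]")

# Matches the syslog prefix, e.g.
# "Nov 25 14:30:47 dominik-desktop kernel: [ 455.771149] "
# and captures the kernel uptime in seconds.
PREFIX = re.compile(
    r"^\w{3}\s+\d+\s+[\d:]+\s+\S+\s+kernel:\s+\[\s*(\d+\.\d+)\]\s*"
)


def extract_key_lines(lines):
    """Return (uptime_seconds, message) for each warning/error line."""
    hits = []
    for line in lines:
        if not KEY_PATTERN.search(line):
            continue
        match = PREFIX.match(line)
        if match:
            message = line[match.end():].rstrip()
            hits.append((float(match.group(1)), message))
    return hits
```

It can be fed an open file object, for example `extract_key_lines(open("/var/log/kern.log", errors="replace"))`, where the path is the usual Ubuntu location and an assumption here. On the excerpt above it should keep only the `nvgpu_mem_wr_n` warning and the `gk20a_gr_handle_fecs_error` line.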
The monitor also shows some error messages:
Any idea what the issue might be and where to look next?