What causes gk20a_channel_timeout_handler?

Hi,

I got a gk20a_channel_timeout_handler error, and my GUI app was killed by the oom-killer.

1. What causes gk20a_channel_timeout_handler?
2. Are there any strange symptoms in my dmesg?

The dmesg output is:

[38602.202261] extcon-gpio-states external-connection:extcon@1: Cable state:1, cable id:1
[38602.222261] extcon-gpio-states external-connection:extcon@1: Cable state:1, cable id:1
[38602.238248] extcon-gpio-states external-connection:extcon@1: Cable state:1, cable id:1
[38602.274197] extcon-gpio-states external-connection:extcon@1: Cable state:1, cable id:1
[38602.322205] extcon-gpio-states external-connection:extcon@1: Cable state:1, cable id:1
[38602.358205] extcon-gpio-states external-connection:extcon@1: Cable state:1, cable id:1
[38602.418293] extcon-gpio-states external-connection:extcon@1: Cable state:1, cable id:1
[38602.590216] extcon-gpio-states external-connection:extcon@1: Cable state:1, cable id:1
[38602.622205] extcon-gpio-states external-connection:extcon@1: Cable state:1, cable id:1
[38617.150339] nvgpu: 17000000.gp10b gk20a_channel_timeout_handler:1570 [ERR] Job on channel 505 timed out
[38617.162697] NV_PGRAPH_STATUS: 0x0
[38617.162701] NV_PGRAPH_STATUS1: 0x0
[38617.162705] NV_PGRAPH_STATUS2: 0x0
[38617.162708] NV_PGRAPH_ENGINE_STATUS: 0x0
[38617.162712] NV_PGRAPH_GRFIFO_STATUS : 0x1
[38617.162716] NV_PGRAPH_GRFIFO_CONTROL : 0x10001
[38617.162719] NV_PGRAPH_PRI_FECS_HOST_INT_STATUS : 0x0
[38617.162722] NV_PGRAPH_EXCEPTION : 0x0
[38617.162726] NV_PGRAPH_FECS_INTR : 0x0
[38617.162729] NV_PFIFO_ENGINE_STATUS(GR) : 0x10061006
[38617.162732] NV_PGRAPH_ACTIVITY0: 0x0
[38617.162735] NV_PGRAPH_ACTIVITY1: 0x0
[38617.162738] NV_PGRAPH_ACTIVITY2: 0x0
[38617.162742] NV_PGRAPH_ACTIVITY4: 0x0
[38617.162745] NV_PGRAPH_PRI_SKED_ACTIVITY: 0x0
[38617.162749] NV_PGRAPH_PRI_GPC0_GPCCS_GPC_ACTIVITY0: 0x0
[38617.162753] NV_PGRAPH_PRI_GPC0_GPCCS_GPC_ACTIVITY1: 0x0
[38617.162757] NV_PGRAPH_PRI_GPC0_GPCCS_GPC_ACTIVITY2: 0x0
[38617.162761] NV_PGRAPH_PRI_GPC0_GPCCS_GPC_ACTIVITY3: 0x0
[38617.162765] NV_PGRAPH_PRI_GPC0_TPC0_TPCCS_TPC_ACTIVITY0: 0x0
[38617.162769] NV_PGRAPH_PRI_GPC0_TPC1_TPCCS_TPC_ACTIVITY0: 0x0
[38617.162773] NV_PGRAPH_PRI_GPC0_TPCS_TPCCS_TPC_ACTIVITY0: 0x0
[38617.162777] NV_PGRAPH_PRI_GPCS_GPCCS_GPC_ACTIVITY0: 0x0
[38617.162781] NV_PGRAPH_PRI_GPCS_GPCCS_GPC_ACTIVITY1: 0x0
[38617.162785] NV_PGRAPH_PRI_GPCS_GPCCS_GPC_ACTIVITY2: 0x0
[38617.162788] NV_PGRAPH_PRI_GPCS_GPCCS_GPC_ACTIVITY3: 0x0
[38617.162792] NV_PGRAPH_PRI_GPCS_TPC0_TPCCS_TPC_ACTIVITY0: 0x0
[38617.162796] NV_PGRAPH_PRI_GPCS_TPC1_TPCCS_TPC_ACTIVITY0: 0x0
[38617.162800] NV_PGRAPH_PRI_GPCS_TPCS_TPCCS_TPC_ACTIVITY0: 0x0
[38617.162806] NV_PGRAPH_PRI_BE0_BECS_BE_ACTIVITY0: 0x0
[38617.162809] NV_PGRAPH_PRI_BE1_BECS_BE_ACTIVITY0: 0x0
[38617.162813] NV_PGRAPH_PRI_BES_BECS_BE_ACTIVITY0: 0x0
[38617.162818] NV_PGRAPH_PRI_DS_MPIPE_STATUS: 0x0
[38617.162821] NV_PGRAPH_PRI_FE_GO_IDLE_TIMEOUT : 0x7fffffff
[38617.162825] NV_PGRAPH_PRI_FE_GO_IDLE_INFO : 0x33000700
[38617.162829] NV_PGRAPH_PRI_GPC0_TPC0_TEX_M_TEX_SUBUNITS_STATUS: 0x0
[38617.162834] NV_PGRAPH_PRI_CWD_FS: 0x0
[38617.162837] NV_PGRAPH_PRI_FE_TPC_FS: 0x0
[38617.162841] NV_PGRAPH_PRI_CWD_GPC_TPC_ID(0): 0x0
[38617.162844] NV_PGRAPH_PRI_CWD_SM_ID(0): 0x0
[38617.162847] NV_PGRAPH_PRI_FECS_CTXSW_STATUS_FE_0: 0x0
[38617.162850] NV_PGRAPH_PRI_FECS_CTXSW_STATUS_1: 0x100
[38617.162855] NV_PGRAPH_PRI_GPC0_GPCCS_CTXSW_STATUS_GPC_0: 0x0
[38617.162858] NV_PGRAPH_PRI_GPC0_GPCCS_CTXSW_STATUS_1: 0x380
[38617.162863] NV_PGRAPH_PRI_FECS_CTXSW_IDLESTATE : 0xf
[38617.162867] NV_PGRAPH_PRI_GPC0_GPCCS_CTXSW_IDLESTATE : 0xf
[38617.162871] NV_PGRAPH_PRI_FECS_CURRENT_CTX : 0x1fff9e9
[38617.162874] NV_PGRAPH_PRI_FECS_NEW_CTX : 0x1fff9e9
[38617.162879] NV_PGRAPH_PRI_BE0_CROP_STATUS1 : 0x700000
[38617.162883] NV_PGRAPH_PRI_BES_CROP_STATUS1 : 0x700000
[38617.162887] NV_PGRAPH_PRI_BE0_ZROP_STATUS : 0x0
[38617.162891] NV_PGRAPH_PRI_BE0_ZROP_STATUS2 : 0x0
[38617.162895] NV_PGRAPH_PRI_BES_ZROP_STATUS : 0x0
[38617.162898] NV_PGRAPH_PRI_BES_ZROP_STATUS2 : 0x0
[38617.162902] NV_PGRAPH_PRI_BE0_BECS_BE_EXCEPTION: 0x0
[38617.162906] NV_PGRAPH_PRI_BE0_BECS_BE_EXCEPTION_EN: 0x0
[38617.162911] NV_PGRAPH_PRI_GPC0_GPCCS_GPC_EXCEPTION: 0x0
[38617.162915] NV_PGRAPH_PRI_GPC0_GPCCS_GPC_EXCEPTION_EN: 0x30000
[38617.162919] NV_PGRAPH_PRI_GPC0_TPC0_TPCCS_TPC_EXCEPTION: 0x0
[38617.162923] NV_PGRAPH_PRI_GPC0_TPC0_TPCCS_TPC_EXCEPTION_EN: 0x3
[38617.162951] nvgpu: 17000000.gp10b nvgpu_set_error_notifier_locked:137 [ERR] error notifier set to 8 for ch 505
[38617.176201] ---- mlocks ----

[38617.176228] ---- syncpts ----
[38617.176239] id 5 (disp_d) min 1792686 max 1792686 refs 1 (previous client : )
[38617.176242] id 6 (disp_e) min 1 max 1 refs 1 (previous client : )
[38617.176245] id 7 (disp_f) min 1 max 1 refs 1 (previous client : )
[38617.176249] id 8 (vblank1) min 2334734 max -2 refs 1 (previous client : )
[38617.176263] id 19 (gp10b_507) min 7170758 max 7170758 refs 1 (previous client : )
[38617.176266] id 20 (tegra-vi4) min 385975 max 385975 refs 1 (previous client : )
[38617.176269] id 21 (tegra-vi4) min 385975 max 385975 refs 1 (previous client : )
[38617.176273] id 22 (tegra-vi4) min 385975 max 385975 refs 1 (previous client : )
[38617.176276] id 23 (tegra-vi4) min 385975 max 385975 refs 1 (previous client : )
[38617.176279] id 24 (tegra-vi4) min 385973 max 385973 refs 1 (previous client : )
[38617.176282] id 25 (tegra-vi4) min 385973 max 385973 refs 1 (previous client : )
[38617.176287] id 28 (tegra-vi4) min 385969 max 385969 refs 1 (previous client : )
[38617.176291] id 29 (tegra-vi4) min 385969 max 385969 refs 1 (previous client : )
[38617.176294] id 30 (tegra-vi4) min 385962 max 385962 refs 1 (previous client : )
[38617.176297] id 31 (tegra-vi4) min 385962 max 385962 refs 1 (previous client : )
[38617.176300] id 32 (tegra-vi4) min 385961 max 385961 refs 1 (previous client : )
[38617.176304] id 33 (tegra-vi4) min 385961 max 385961 refs 1 (previous client : )
[38617.176307] id 34 (gp10b_506) min 14 max 14 refs 1 (previous client : )
[38617.176310] id 35 (gp10b_505) min 7174746 max 7174746 refs 1 (previous client : )

[38617.176836] ---- channels ----
[38617.176847]
channel 2 - 15820000.se

[38617.176850] NvHost basic channel registers:
[38617.176853] CMDFIFO_STAT_0: 00002040
[38617.176855] CMDFIFO_RDATA_0: 00008d0c
[38617.176859] CMDP_OFFSET_0: 00000000
[38617.176861] CMDP_CLASS_0: 00000000
[38617.176864] CHANNELSTAT_0: 00000000
[38617.176865] The CDMA sync queue is empty.

[38617.176870]
channel 3 - 15830000.se

[38617.176871] NvHost basic channel registers:
[38617.176873] CMDFIFO_STAT_0: 00002040
[38617.176876] CMDFIFO_RDATA_0: 80450c21
[38617.176879] CMDP_OFFSET_0: 00000000
[38617.176881] CMDP_CLASS_0: 00000000
[38617.176884] CHANNELSTAT_0: 00000000
[38617.176885] The CDMA sync queue is empty.

[38617.176889]
channel 4 - 15840000.se

[38617.176891] NvHost basic channel registers:
[38617.176893] CMDFIFO_STAT_0: 00002040
[38617.176895] CMDFIFO_RDATA_0: 80d5140f
[38617.176899] CMDP_OFFSET_0: 00000000
[38617.176901] CMDP_CLASS_0: 00000000
[38617.176903] CHANNELSTAT_0: 00000000
[38617.176905] The CDMA sync queue is empty.

[38617.176913]
---- host general irq ----

[38617.176916] sync_intc0mask = 0x00000001
[38617.176918] sync_intmask = 0x50000003
[38617.176919]
---- host syncpt irq mask ----

[38617.176921]
---- host syncpt irq status ----

[38617.176924] syncpt_thresh_cpu0_int_status(0) = 0xa2a00020
[38617.176926] syncpt_thresh_cpu0_int_status(1) = 0x0000000a
[38617.176929] syncpt_thresh_cpu0_int_status(2) = 0x00000000
[38617.176931] syncpt_thresh_cpu0_int_status(3) = 0x00000000
[38617.176934] syncpt_thresh_cpu0_int_status(4) = 0x00000000
[38617.176936] syncpt_thresh_cpu0_int_status(5) = 0x00000000
[38617.176939] syncpt_thresh_cpu0_int_status(6) = 0x00000000
[38617.176941] syncpt_thresh_cpu0_int_status(7) = 0x00000000
[38617.176944] syncpt_thresh_cpu0_int_status(8) = 0x00000000
[38617.176946] syncpt_thresh_cpu0_int_status(9) = 0x00000000
[38617.176949] syncpt_thresh_cpu0_int_status(10) = 0x00000000
[38617.176951] syncpt_thresh_cpu0_int_status(11) = 0x00000000
[38617.176954] syncpt_thresh_cpu0_int_status(12) = 0x00000000
[38617.176956] syncpt_thresh_cpu0_int_status(13) = 0x00000000
[38617.176959] syncpt_thresh_cpu0_int_status(14) = 0x00000000
[38617.176961] syncpt_thresh_cpu0_int_status(15) = 0x00000000
[38617.176963] syncpt_thresh_cpu0_int_status(16) = 0x00000000
[38617.176966] syncpt_thresh_cpu0_int_status(17) = 0x00000000
[38617.176972] gp10b pbdma 0:
[38617.176975] id: 6 (tsg), next_id: 6 (tsg) chan status: invalid
[38617.177015] PBDMA_PUT: 0000001f001825a0 PBDMA_GET: 0000001f001825a0 GP_PUT: 000000d1 GP_GET: 000000d1 FETCH: 000000d1 HEADER: 60400000
HDR: 00000000 SHADOW0: 00182580 SHADOW1: 0000201f

[38617.177021] gp10b eng 0:
[38617.177024] id: 6 (tsg), next_id: 6 (tsg), ctx status: invalid

[38617.177029] gp10b eng 1:
[38617.177032] id: 6 (tsg), next_id: 6 (tsg), ctx status: invalid

[38617.177150] 505-gp10b, pid 4398, refs 5:
[38617.177153] channel status: not in use idle not busy
[38617.177158] RAMFC : TOP: 8000001f001825a0 PUT: 0000001f001825a0 GET: 0000001f001825a0 FETCH: 0000001f001825a0
HEADER: 60400000 COUNT: 80000000
SYNCPOINT 00000000 00002301 SEMAPHORE 0000001e 00010aa0 0001c599 00001004

[38617.177166] 506-gp10b, pid 4371, refs 2:
[38617.177168] channel status: in use idle not busy
[38617.177172] RAMFC : TOP: 8000001f000080e0 PUT: 0000001f000080e0 GET: 0000001f000080e0 FETCH: 0000001f000080e0
HEADER: 60400000 COUNT: 80000000
SYNCPOINT 00000000 00002201 SEMAPHORE 0000001e 000a0aa0 00000000 00000002

[38617.177179] 507-gp10b, pid 4152, refs 2:
[38617.177181] channel status: in use idle not busy
[38617.177185] RAMFC : TOP: 8000001f0000e880 PUT: 0000001f0000e880 GET: 0000001f0000e880 FETCH: 0000001f0000e880
HEADER: 60400000 COUNT: 80000000
SYNCPOINT 00000000 00001301 SEMAPHORE 0000001e 00060aa0 00000000 00000002

[38617.177192] 508-gp10b, pid 4152, refs 2:
[38617.177194] channel status: in use idle not busy
[38617.177198] RAMFC : TOP: 0000000000000000 PUT: 0000000000000000 GET: 0000000000000000 FETCH: 0000000000000000
HEADER: 60400000 COUNT: 00000000
SYNCPOINT 00000000 00000000 SEMAPHORE 00000000 00000000 00000000 00000000

[38617.177205] 509-gp10b, pid 4152, refs 2:
[38617.177207] channel status: in use idle not busy
[38617.177210] RAMFC : TOP: 0000000000000000 PUT: 0000000000000000 GET: 0000000000000000 FETCH: 0000000000000000
HEADER: 60400000 COUNT: 00000000
SYNCPOINT 00000000 00000000 SEMAPHORE 00000000 00000000 00000000 00000000

[38617.177217] 510-gp10b, pid 4152, refs 2:
[38617.177219] channel status: in use idle not busy
[38617.177222] RAMFC : TOP: 0000000000000000 PUT: 0000000000000000 GET: 0000000000000000 FETCH: 0000000000000000
HEADER: 60400000 COUNT: 00000000
SYNCPOINT 00000000 00000000 SEMAPHORE 00000000 00000000 00000000 00000000

[38617.177229] 511-gp10b, pid 4152, refs 2:
[38617.177231] channel status: in use idle not busy
[38617.177234] RAMFC : TOP: 0000000000000000 PUT: 0000000000000000 GET: 0000000000000000 FETCH: 0000000000000000
HEADER: 60400000 COUNT: 00000000
SYNCPOINT 00000000 00000000 SEMAPHORE 00000000 00000000 00000000 00000000

[38621.633958] bpmp: mrq 27 took 1160000 us
[38621.717694] usb 1-3.1.4: reset high-speed USB device number 11 using tegra-xusb
[38628.662885] bpmp: mrq 27 took 1440000 us
[38642.050807] extcon-gpio-states external-connection:extcon@1: Cable state:1, cable id:1
[38642.098803] extcon-gpio-states external-connection:extcon@1: Cable state:1, cable id:1
[38642.162812] extcon-gpio-states external-connection:extcon@1: Cable state:1, cable id:1
[38642.194788] extcon-gpio-states external-connection:extcon@1: Cable state:1, cable id:1
[38642.214783] extcon-gpio-states external-connection:extcon@1: Cable state:1, cable id:1
[38642.230753] extcon-gpio-states external-connection:extcon@1: Cable state:1, cable id:1
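When reading a dump like the one above, a useful first step is to match the channel named in the timeout line against the per-channel entries further down, which carry the owning pid. The following is only a rough sketch, assuming the log lines keep the exact format seen in this dump (`Job on channel N timed out` and `N-gp10b, pid P, refs R:`); the function name `find_timed_out_channels` and the regular expressions are hypothetical helpers, not part of any NVIDIA tool.

```python
import re

# Patterns are assumptions based only on the line formats in this dump.
TIMEOUT_RE = re.compile(
    r"gk20a_channel_timeout_handler:\d+\s+\[ERR\]\s+Job on channel (\d+) timed out"
)
OWNER_RE = re.compile(r"\b(\d+)-gp10b, pid (\d+), refs \d+")


def find_timed_out_channels(dmesg_text):
    """Map each timed-out channel id to its owning pid (None if not listed)."""
    # Collect channel -> pid from the per-channel dump entries.
    owners = {
        int(m.group(1)): int(m.group(2)) for m in OWNER_RE.finditer(dmesg_text)
    }
    # Look up the owner of every channel reported by the timeout handler.
    return {
        int(m.group(1)): owners.get(int(m.group(1)))
        for m in TIMEOUT_RE.finditer(dmesg_text)
    }


# Three lines copied from the dump above, used as sample input.
SAMPLE = """\
[38617.150339] nvgpu: 17000000.gp10b gk20a_channel_timeout_handler:1570 [ERR] Job on channel 505 timed out
[38617.177150] 505-gp10b, pid 4398, refs 5:
[38617.177166] 506-gp10b, pid 4371, refs 2:
"""

if __name__ == "__main__":
    for channel, pid in find_timed_out_channels(SAMPLE).items():
        print(f"channel {channel} timed out, owner pid: {pid}")
```

On a live system the text could come from `dmesg` output saved to a file. For this dump, the sketch points at channel 505, owned by pid 4398, which can then be checked against the process list while that process is still running.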

What kind of application are you running? Could you share your sample code?

We use a Qt (Wayland) based application.

We are unlikely to find the cause or resolve the issue just by knowing that you are running Qt.

If possible, share sample code that reproduces the issue on a TX2 devkit so that we can investigate.

Okay, I’ll ask our engineer to prepare the source code for sharing.
But the issue doesn’t reproduce often, so could you provide some general information about the gk20a_channel_timeout_handler error?

There is no general information about this kind of issue. Each case requires its own analysis.
The last timeout issue we saw was caused by the GPU being overloaded, but that was on a Jetson Nano.
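If GPU overload is a suspect, one low-effort check is to log GPU load while the application runs and see whether it stays pinned near 100% before the timeout. The snippet below is a minimal sketch only. It assumes the L4T release exposes the load at `/sys/devices/gpu.0/load` as a per-mille value (0 to 1000); that path and scale are assumptions to verify on your own board, and the helper names are hypothetical.

```python
import time
from pathlib import Path

# Assumption: GPU load node on this L4T release, value in per-mille (0..1000).
GPU_LOAD_PATH = Path("/sys/devices/gpu.0/load")


def parse_load(raw):
    """Convert the raw per-mille string from sysfs into a percentage."""
    return int(raw.strip()) / 10.0


def sample_gpu_load(path=GPU_LOAD_PATH, samples=5, interval_s=1.0):
    """Read the load node `samples` times, sleeping `interval_s` between reads."""
    readings = []
    for _ in range(samples):
        readings.append(parse_load(path.read_text()))
        time.sleep(interval_s)
    return readings


if __name__ == "__main__":
    if GPU_LOAD_PATH.exists():
        for load in sample_gpu_load():
            print(f"GPU load: {load:.1f}%")
    else:
        print(f"{GPU_LOAD_PATH} not found; the node may differ on this release")
```

Logging these readings with timestamps next to dmesg would show whether the timeout lines up with sustained high load. It would not by itself prove that overload is the cause in this case.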

Thanks.
I’ll investigate how to reproduce it.
