Desktop crash issue

Hi, we have hit another desktop crash.
We use OpenCV to display camera data and, at the same time, run nvidia-smi about once per second.
The driver appears to deadlock.

We are on R38.2.1.

This issue is different from the issue below. We have already set the GPU to performance mode so that devfreq does not run.

dmesg.txt (131.6 KB)

[ 3144.146537] INFO: task nvidia-modeset/:1815 blocked for more than 120 seconds.
[ 3144.146565] Tainted: G W O 6.8.12-tegra #1
[ 3144.146579] “echo 0 > /proc/sys/kernel/hung_task_timeout_secs” disables this message.
[ 3144.153846] task:nvidia-modeset/ state:D stack:0 pid:1815 tgid:1815 ppid:2 flags:0x00000208
[ 3144.153857] Call trace:
[ 3144.153859] __switch_to+0xe0/0x108
[ 3144.153878] __schedule+0x368/0xbe4
[ 3144.153885] schedule+0x34/0xc8
[ 3144.153891] schedule_timeout+0x1a4/0x1b4
[ 3144.153897] __down_common+0x104/0x218
[ 3144.153905] __down+0x18/0x24
[ 3144.153912] down+0x50/0x6c
[ 3144.153915] nvkms_kthread_q_callback+0x90/0x17c [nvidia_modeset]
[ 3144.154014] _main_loop+0x90/0x14c [nvidia_modeset]
[ 3144.154086] kthread+0x110/0x114
[ 3144.154092] ret_from_fork+0x10/0x20
[ 3144.154225] INFO: task nvidia-smi:11295 blocked for more than 120 seconds.
[ 3144.161176] Tainted: G W O 6.8.12-tegra #1
[ 3144.166707] “echo 0 > /proc/sys/kernel/hung_task_timeout_secs” disables this message.
[ 3144.176082] task:nvidia-smi state:D stack:0 pid:11295 tgid:11278 ppid:11275 flags:0x00000204
[ 3144.176093] Call trace:
[ 3144.176095] __switch_to+0xe0/0x108
[ 3144.176105] __schedule+0x368/0xbe4
[ 3144.176110] schedule+0x34/0xc8
[ 3144.176116] schedule_preempt_disabled+0x24/0x40
[ 3144.176122] rwsem_down_read_slowpath+0x214/0x51c
[ 3144.176125] down_read+0xa0/0xa8
[ 3144.176128] os_acquire_rwlock_read+0x38/0x64 [nvidia]
[ 3144.176671] portSyncRwLockAcquireRead+0x10/0x30 [nvidia]
[ 3144.177216] rmapiLockAcquire+0x29c/0x320 [nvidia]
[ 3144.178004] rmapiPrologue+0x124/0x180 [nvidia]
[ 3144.178388] _rmapiRmControl+0x408/0x6a0 [nvidia]
[ 3144.178844] rmapiControlWithSecInfo+0xa8/0x150 [nvidia]
[ 3144.179203] rmapiControlWithSecInfoTls+0x74/0xe0 [nvidia]
[ 3144.179553] _nv04ControlWithSecInfo.constprop.0+0x80/0xa0 [nvidia]
[ 3144.179997] Nv04ControlWithSecInfo+0x34/0x40 [nvidia]
[ 3144.180330] RmIoctl+0x884/0xbf0 [nvidia]
[ 3144.180687] rm_ioctl+0x64/0x430 [nvidia]
[ 3144.181031] nvidia_unlocked_ioctl+0x664/0x76c [nvidia]
[ 3144.181458] __arm64_sys_ioctl+0xac/0xf0
[ 3144.181467] invoke_syscall+0x48/0x114
[ 3144.181474] el0_svc_common.constprop.0+0xc0/0xe0
[ 3144.181479] do_el0_svc+0x1c/0x28
[ 3144.181484] el0_svc+0x30/0xa8
[ 3144.181488] el0t_64_sync_handler+0x120/0x12c
[ 3144.181492] el0t_64_sync+0x194/0x198
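When comparing several captures like the one above, it can help to reduce each hung-task report to the task name, PID and longest blocked time. The following is only a minimal sketch, assuming the dmesg output has been saved as text; the function name `summarize_hung_tasks` is made up for illustration and is not part of any NVIDIA tool.

```python
import re

# Matches kernel hung-task lines such as:
# [ 3144.146537] INFO: task nvidia-modeset/:1815 blocked for more than 120 seconds.
HUNG_RE = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)


def summarize_hung_tasks(dmesg_text):
    """Return {(comm, pid): longest_blocked_seconds} for every hung-task report."""
    worst = {}
    for line in dmesg_text.splitlines():
        m = HUNG_RE.search(line)
        if m is None:
            continue
        key = (m.group("comm"), int(m.group("pid")))
        # The kernel repeats the report with a growing time; keep the largest.
        worst[key] = max(worst.get(key, 0), int(m.group("secs")))
    return worst
```

Calling `summarize_hung_tasks(open("dmesg.txt").read())` on a saved log would list each blocked task once, which makes it easier to see whether the same PID stays stuck across reports.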

We use cv::imshow() to display the camera feed and run nvidia-smi continuously.
The crash log is below; the desktop is completely unresponsive.

Below is another case that hits the crash. Here we do not run nvidia-smi; we only use OpenCV to display camera data.
[ 485.842404] INFO: task nvidia-modeset/:1825 blocked for more than 120 seconds.
[ 485.842432] Tainted: G W O 6.8.12-tegra #1
[ 485.842444] “echo 0 > /proc/sys/kernel/hung_task_timeout_secs” disables this message.
[ 485.849426] task:nvidia-modeset/ state:D stack:0 pid:1825 tgid:1825 ppid:2 flags:0x00000208
[ 485.849435] Call trace:
[ 485.849437] __switch_to+0xe0/0x108
[ 485.849452] __schedule+0x368/0xbe4
[ 485.849459] schedule+0x34/0xc8
[ 485.849465] schedule_timeout+0x1a4/0x1b4
[ 485.849469] __down_common+0x104/0x218
[ 485.849476] __down+0x18/0x24
[ 485.849482] down+0x50/0x6c
[ 485.849485] nvkms_kthread_q_callback+0x90/0x17c [nvidia_modeset]
[ 485.849583] _main_loop+0x90/0x14c [nvidia_modeset]
[ 485.849658] kthread+0x110/0x114
[ 485.849666] ret_from_fork+0x10/0x20
[ 606.674949] INFO: task nvidia-modeset/:1825 blocked for more than 241 seconds.
[ 606.674977] Tainted: G W O 6.8.12-tegra #1
[ 606.674990] “echo 0 > /proc/sys/kernel/hung_task_timeout_secs” disables this message.
[ 606.681979] task:nvidia-modeset/ state:D stack:0 pid:1825 tgid:1825 ppid:2 flags:0x00000208
[ 606.681988] Call trace:
[ 606.681990] __switch_to+0xe0/0x108
[ 606.682009] __schedule+0x368/0xbe4
[ 606.682016] schedule+0x34/0xc8
[ 606.682023] schedule_timeout+0x1a4/0x1b4
[ 606.682029] __down_common+0x104/0x218
[ 606.682037] __down+0x18/0x24
[ 606.682044] down+0x50/0x6c
[ 606.682047] nvkms_kthread_q_callback+0x90/0x17c [nvidia_modeset]
[ 606.682147] _main_loop+0x90/0x14c [nvidia_modeset]
[ 606.682224] kthread+0x110/0x114
[ 606.682232] ret_from_fork+0x10/0x20

When the desktop crash happens, Xorg and gnome-shell take 100% of a CPU core.

20251124-101133

When the desktop crash happens, this thread enters D state.
[ 693.291882] task:irq/238-host1x_ state:D stack:0 pid:347 tgid:347 ppid:2 flags:0x00000008
[ 693.291884] Call trace:
[ 693.291885] __switch_to+0xe0/0x110
[ 693.291886] __schedule+0x3dc/0xbd8
[ 693.291888] schedule+0x3c/0x108
[ 693.291889] schedule_timeout+0xa8/0x1d0
[ 693.291891] __wait_for_common+0xe4/0x218
[ 693.291893] wait_for_completion_timeout+0x28/0x40
[ 693.291895] tegra_bpmp_transfer+0x1d0/0x3f8
[ 693.291900] tegra264_mc_icc_set+0xf4/0x1a8
[ 693.291904] apply_constraints+0x74/0xc0
[ 693.291907] icc_set_bw+0xbc/0x2d0
[ 693.291909] vic_devfreq_target+0x70/0x110 [tegra_drm]
[ 693.291924] devfreq_set_target+0x98/0x230
[ 693.291926] devfreq_update_target+0xc8/0xe8
[ 693.291927] update_devfreq+0x1c/0x30
[ 693.291929] vic_actmon_event+0x58/0x88 [tegra_drm]
[ 693.291937] host1x_actmon_handle_interrupt+0xec/0x148 [host1x]
[ 693.291951] host1x_general_isr+0x60/0x68 [host1x]
[ 693.291959] irq_thread_fn+0x34/0xb8
[ 693.291962] irq_thread+0x1a0/0x2a0
[ 693.291963] kthread+0x124/0x130

Setting /sys/class/devfreq/8188050000.vic/governor to performance does not help; the issue still reproduces.
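To rule out the setting silently not taking effect, the governor can be read back after writing it. This is a rough sketch using the sysfs path quoted in this post; the helper names `read_governor` and `set_governor` are hypothetical, writing requires root, and other boards may expose VIC under a different device name.

```python
from pathlib import Path

# Path taken from this post; it is an assumption that it applies to other boards.
VIC_GOVERNOR = Path("/sys/class/devfreq/8188050000.vic/governor")


def read_governor(path=VIC_GOVERNOR):
    """Return the governor name currently stored in the sysfs file."""
    return Path(path).read_text().strip()


def set_governor(name, path=VIC_GOVERNOR):
    """Write a governor name (needs root on real sysfs) and return the read-back value."""
    Path(path).write_text(name + "\n")
    return read_governor(path)
```

If `set_governor("performance")` returns `performance` and the hang still occurs, the governor setting itself can be excluded as the cause of the mismatch.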


I added some logging to nvkms_kthread_q_callback() and nvkms_ioctl_common().

Xorg takes the lock and does not release it. kthread_q_callback then requests the same lock, and the hang occurs.

[ 74.782799] NVKMS_DEBUG: IOCTL trying to get lock (cmd=0xf, pid=2906, comm=gnome-shell)
[ 74.782809] NVKMS_DEBUG: IOCTL got the lock!
[ 74.782942] NVKMS_DEBUG: IOCTL released the lock.
[ 74.798872] NVKMS_DEBUG: IOCTL trying to get lock (cmd=0xf, pid=2906, comm=gnome-shell)
[ 74.798882] NVKMS_DEBUG: IOCTL got the lock!
[ 74.799004] NVKMS_DEBUG: IOCTL released the lock.
[ 74.815646] NVKMS_DEBUG: IOCTL trying to get lock (cmd=0xf, pid=2906, comm=gnome-shell)
[ 74.815653] NVKMS_DEBUG: IOCTL got the lock!
[ 74.815775] NVKMS_DEBUG: IOCTL released the lock.
[ 74.817778] NVKMS_DEBUG: IOCTL trying to get lock (cmd=0xe, pid=2488, comm=Xorg)
[ 74.817786] NVKMS_DEBUG: IOCTL got the lock!
[ 74.820660] tegra-camrtc-capture-vi tegra-capture-vi: corr_err: discarding frame 0, flags: 0, err_data 512
[ 76.244142] NVKMS_DEBUG: kthread_q_callback called (pid=1739, comm=nvidia-modeset/)
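The same conclusion can be checked mechanically on longer logs. Below is a small sketch that walks the custom NVKMS_DEBUG lines shown above and reports who still holds the lock at the end of the log. It assumes that each "got the lock" line belongs to the most recent "trying to get lock" line, which is a simplification when several callers race; `find_unreleased_holder` is an illustrative name only.

```python
import re

# Format of the debug line added to nvkms_ioctl_common() in this thread.
TRY_RE = re.compile(
    r"NVKMS_DEBUG: IOCTL trying to get lock "
    r"\(cmd=(0x[0-9a-fA-F]+), pid=(\d+), comm=([^)]+)\)"
)


def find_unreleased_holder(log_text):
    """Return (cmd, pid, comm) of the caller still holding the lock, or None."""
    pending = None  # most recent caller seen requesting the lock
    holder = None   # caller currently assumed to hold the lock
    for line in log_text.splitlines():
        m = TRY_RE.search(line)
        if m:
            pending = (m.group(1), int(m.group(2)), m.group(3))
        elif "NVKMS_DEBUG: IOCTL got the lock!" in line:
            # Assumption: the grant goes to the latest requester.
            holder = pending
        elif "NVKMS_DEBUG: IOCTL released the lock." in line:
            holder = None
    return holder
```

On the excerpt above this points at the Xorg ioctl with cmd=0xe, matching what was read off by hand.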

After adding more logging, I found that the program is stuck in IsChannelMethodPending(). We don't have the source code for it; please help check it.

[  277.312703] NVKMS_DEBUG: IOCTL trying to get lock (cmd=0xf, pid=2752, comm=gnome-shell)
[  277.312713] NVKMS_DEBUG: IOCTL got the lock!
[  277.312834] NVKMS_DEBUG: IOCTL released the lock.
[  277.329335] NVKMS_DEBUG: IOCTL trying to get lock (cmd=0xf, pid=2752, comm=gnome-shell)
[  277.329345] NVKMS_DEBUG: IOCTL got the lock!
[  277.329472] NVKMS_DEBUG: IOCTL released the lock.
[  277.337710] NVKMS_DEBUG: IOCTL trying to get lock (cmd=0x32, pid=2378, comm=Xorg)
[  277.337734] NVKMS_DEBUG: IOCTL got the lock!
[  277.337770] NVKMS_DEBUG: IOCTL released the lock.
[  277.340419] NVKMS_DEBUG: IOCTL trying to get lock (cmd=0xe, pid=2378, comm=Xorg)
[  277.340427] NVKMS_DEBUG: IOCTL got the lock!
[  277.340443] nvidia-modeset: IdleBaseChannel enter, pOpenDev=000000001cd3df01

[  277.340450] nvidia-modeset: IdleBaseChannelAll enter type:0

[  277.340456] nvidia-modeset: IdleBaseChannelAll loop iteration1
[  277.340465] nvidia-modeset: nvIdleMainLayerChannelCheckIdleOneApiHead enter

[  278.993780] NVKMS_DEBUG: kthread_q_callback called (pid=1776, comm=nvidia-modeset/)
[  448.978815] INFO: task nvidia-modeset/:1776 blocked for more than 147 seconds.
[  448.978843]       Tainted: G        W  O       6.8.12-tegra #2
[  448.978856] “echo 0 > /proc/sys/kernel/hung_task_timeout_secs” disables this message.
[  448.985930] task:nvidia-modeset/ state:D stack:0     pid:1776  tgid:1776  ppid:2      flags:0x00000208
[  448.985939] Call trace:
[  448.985942]  __switch_to+0xe0/0x110
[  448.985956]  __schedule+0x3dc/0xbd8
[  448.985962]  schedule+0x3c/0x108
[  448.985966]  schedule_timeout+0x1b4/0x1d0
[  448.985977]  __down_common+0x138/0x260
[  448.985983]  __down+0x20/0x38
[  448.985988]  down+0x6c/0x90
[  448.985993]  nvkms_kthread_q_callback+0xf8/0x1f0 [nvidia_modeset]
[  448.986099]  _main_loop+0xa4/0x168 [nvidia_modeset]
[  448.986179]  kthread+0x124/0x130
[  448.986188]  ret_from_fork+0x10/0x20
NvBool nvIdleMainLayerChannelCheckIdleOneApiHead(NVDispEvoPtr pDispEvo,
                                                 NvU32 apiHead)
{
    NVDevEvoPtr pDevEvo = pDispEvo->pDevEvo;
    const NVDispApiHeadStateEvoRec *pApiHeadState =
        &pDispEvo->apiHeadState[apiHead];
    NvU32 head;

    nvEvoLog(EVO_LOG_INFO, "nvIdleMainLayerChannelCheckIdleOneApiHead enter\n");

    FOR_EACH_EVO_HW_HEAD_IN_MASK(pApiHeadState->hwHeadsMask, head) {
        NVEvoChannelPtr pMainLayerChannel =
            pDevEvo->head[head].layer[NVKMS_MAIN_LAYER];
        NvBool isMethodPending = FALSE;
        NvBool ret;

        ret = pDevEvo->hal->IsChannelMethodPending(pDevEvo, pMainLayerChannel,
            pDispEvo->displayOwner, &isMethodPending);

        if (ret && isMethodPending) {
            nvEvoLog(EVO_LOG_INFO, "nvIdleMainLayerChannelCheckIdleOneApiHead exit1\n");
            return FALSE;
        }
    }

    nvEvoLog(EVO_LOG_INFO, "nvIdleMainLayerChannelCheckIdleOneApiHead exit2\n");

    return TRUE;
}

After adding more logging, I found that it is stuck in the function below.

src/nvidia-modeset/src/nvkms-evo3.c:7029:NvBool nvEvoIsChannelMethodPendingC3

[  248.336164] NVKMS_DEBUG: IOCTL trying to get lock (cmd=0xb, pid=2665, comm=InputThread)
[  248.336168] NVKMS_DEBUG: IOCTL got the lock!
[  248.336174] NVKMS_DEBUG: IOCTL released the lock.
[  248.344496] NVKMS_DEBUG: IOCTL trying to get lock (cmd=0xb, pid=2665, comm=InputThread)
[  248.344502] NVKMS_DEBUG: IOCTL got the lock!
[  248.344510] NVKMS_DEBUG: IOCTL released the lock.
[  248.345082] NVKMS_DEBUG: IOCTL trying to get lock (cmd=0xe, pid=2558, comm=Xorg)
[  248.345091] NVKMS_DEBUG: IOCTL got the lock!
[  248.345098] nvidia-modeset: IdleBaseChannel enter, pOpenDev=0000000060803cbe

[  248.345102] nvidia-modeset: IdleBaseChannelAll enter type:0

[  248.345104] nvidia-modeset: IdleBaseChannelAll loop iteration1
[  248.345108] nvidia-modeset: nvIdleMainLayerChannelCheckIdleOneApiHead enter

[  248.345111] nvidia-modeset: nvEvoIsChannelMethodPendingC3 enter

[  248.352074] NVKMS_DEBUG: IOCTL trying to get lock (cmd=0xb, pid=2665, comm=InputThread)
[  250.328043] NVKMS_DEBUG: kthread_q_callback called (pid=1824, comm=nvidia-modeset/)
[  485.849461] INFO: task nvidia-modeset/:1824 blocked for more than 120 seconds.
[  485.849487]       Tainted: G        W  O       6.8.12-tegra #1
[  485.849500] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  485.856481] task:nvidia-modeset/ state:D stack:0     pid:1824  tgid:1824  ppid:2      flags:0x00000208
[  485.856490] Call trace:
[  485.856493]  __switch_to+0xe0/0x108
[  485.856512]  __schedule+0x368/0xbe4
[  485.856520]  schedule+0x34/0xc8
[  485.856526]  schedule_timeout+0x1a4/0x1b4
[  485.856531]  __down_common+0x104/0x218
[  485.856539]  __down+0x18/0x24
[  485.856546]  down+0x50/0x6c
[  485.856549]  nvkms_kthread_q_callback+0xa8/0x198 [nvidia_modeset]
[  485.856654]  _main_loop+0x90/0x14c [nvidia_modeset]
[  485.856730]  kthread+0x110/0x114
[  485.856738]  ret_from_fork+0x10/0x20
[  485.856801] INFO: task nvidia-smi:4971 blocked for more than 120 seconds.
[  485.863518]       Tainted: G        W  O       6.8.12-tegra #1
[  485.869050] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  485.869053] task:nvidia-smi      state:D stack:0     pid:4971  tgid:4948  ppid:4944   flags:0x00000204
[  485.869059] Call trace:
[  485.869061]  __switch_to+0xe0/0x108
[  485.877132]  __schedule+0x368/0xbe4
[  485.877135]  schedule+0x34/0xc8
[  485.877138]  schedule_preempt_disabled+0x24/0x40
[  485.877141]  rwsem_down_read_slowpath+0x214/0x51c
[  485.877143]  down_read+0xa0/0xa8
[  485.877144]  os_acquire_rwlock_read+0x38/0x64 [nvidia]
[  485.877380]  portSyncRwLockAcquireRead+0x10/0x30 [nvidia]
[  485.877605]  rmapiLockAcquire+0x29c/0x320 [nvidia]
[  485.877795]  rmapiPrologue+0x124/0x180 [nvidia]
[  485.877976]  _rmapiRmControl+0x408/0x6a0 [nvidia]
[  485.878153]  rmapiControlWithSecInfo+0xa8/0x150 [nvidia]
[  485.878324]  rmapiControlWithSecInfoTls+0x74/0xe0 [nvidia]
[  485.878500]  _nv04ControlWithSecInfo.constprop.0+0x80/0xa0 [nvidia]
[  485.878714]  Nv04ControlWithSecInfo+0x34/0x40 [nvidia]
[  485.878883]  RmIoctl+0x884/0xbf0 [nvidia]
[  485.879063]  rm_ioctl+0x64/0x430 [nvidia]
[  485.879239]  nvidia_unlocked_ioctl+0x664/0x76c [nvidia]
[  485.879458]  __arm64_sys_ioctl+0xac/0xf0
[  485.879468]  invoke_syscall+0x48/0x114
[  485.879476]  el0_svc_common.constprop.0+0xc0/0xe0
[  485.879483]  do_el0_svc+0x1c/0x28
[  485.879489]  el0_svc+0x30/0xa8
[  485.879496]  el0t_64_sync_handler+0x120/0x12c
[  485.879502]  el0t_64_sync+0x194/0x198

Below is the /proc/sysrq-trigger log.
An IRQ thread, irq/238-host1x_, is active. VIC may be adjusting its frequency, and at that time nvEvoIsChannelMethodPendingC3 gets no response.
[ 376.380721] r8152 2-1.4:1.0 enx00e04c1f3f10: renamed from eth0
[ 485.849371] INFO: task nvidia-modeset/:1850 blocked for more than 120 seconds.
[ 485.849399] Tainted: G W O 6.8.12-tegra #1
[ 485.849412] “echo 0 > /proc/sys/kernel/hung_task_timeout_secs” disables this message.
[ 485.856417] task:nvidia-modeset/ state:D stack:0 pid:1850 tgid:1850 ppid:2 flags:0x00000208
[ 485.856425] Call trace:
[ 485.856428] __switch_to+0xe0/0x108
[ 485.856443] __schedule+0x368/0xbe4
[ 485.856450] schedule+0x34/0xc8
[ 485.856457] schedule_timeout+0x1a4/0x1b4
[ 485.856462] __down_common+0x104/0x218
[ 485.856470] __down+0x18/0x24
[ 485.856477] down+0x50/0x6c
[ 485.856480] nvkms_kthread_q_callback+0x90/0x17c [nvidia_modeset]
[ 485.856579] _main_loop+0x90/0x14c [nvidia_modeset]
[ 485.856656] kthread+0x110/0x114
[ 485.856662] ret_from_fork+0x10/0x20
[ 485.856760] INFO: task nvidia-smi:10936 blocked for more than 120 seconds.
[ 485.863077] Tainted: G W O 6.8.12-tegra #1
[ 485.868978] “echo 0 > /proc/sys/kernel/hung_task_timeout_secs” disables this message.
[ 485.876720] task:nvidia-smi state:D stack:0 pid:10936 tgid:10911 ppid:10905 flags:0x00800204
[ 485.876728] Call trace:
[ 485.876730] __switch_to+0xe0/0x108
[ 485.876740] __schedule+0x368/0xbe4
[ 485.876746] schedule+0x34/0xc8
[ 485.876753] schedule_preempt_disabled+0x24/0x40
[ 485.876759] rwsem_down_read_slowpath+0x214/0x51c
[ 485.876762] down_read+0xa0/0xa8
[ 485.876765] os_acquire_rwlock_read+0x38/0x64 [nvidia]
[ 485.877328] portSyncRwLockAcquireRead+0x10/0x30 [nvidia]
[ 485.877897] rmapiLockAcquire+0x29c/0x320 [nvidia]
[ 485.878436] rmapiPrologue+0x124/0x180 [nvidia]
[ 485.878620] _rmapiRmControl+0x408/0x6a0 [nvidia]
[ 485.878801] rmapiControlWithSecInfo+0xa8/0x150 [nvidia]
[ 485.878980] rmapiControlWithSecInfoTls+0x74/0xe0 [nvidia]
[ 485.879145] _nv04ControlWithSecInfo.constprop.0+0x80/0xa0 [nvidia]
[ 485.879310] Nv04ControlWithSecInfo+0x34/0x40 [nvidia]
[ 485.879472] RmIoctl+0x884/0xbf0 [nvidia]
[ 485.879659] rm_ioctl+0x64/0x430 [nvidia]
[ 485.879833] nvidia_unlocked_ioctl+0x664/0x76c [nvidia]
[ 485.880016] __arm64_sys_ioctl+0xac/0xf0
[ 485.880020] invoke_syscall+0x48/0x114
[ 485.880024] el0_svc_common.constprop.0+0xc0/0xe0
[ 485.880027] do_el0_svc+0x1c/0x28
[ 485.880036] el0_svc+0x30/0xa8
[ 485.880039] el0t_64_sync_handler+0x120/0x12c
[ 485.880041] el0t_64_sync+0x194/0x198
[ 606.682058] INFO: task nvidia-modeset/:1850 blocked for more than 241 seconds.
[ 606.682080] Tainted: G W O 6.8.12-tegra #1
[ 606.682091] “echo 0 > /proc/sys/kernel/hung_task_timeout_secs” disables this message.
[ 606.689095] task:nvidia-modeset/ state:D stack:0 pid:1850 tgid:1850 ppid:2 flags:0x00000208
[ 606.689104] Call trace:
[ 606.689106] __switch_to+0xe0/0x108
[ 606.689124] __schedule+0x368/0xbe4
[ 606.689132] schedule+0x34/0xc8
[ 606.689139] schedule_timeout+0x1a4/0x1b4
[ 606.689146] __down_common+0x104/0x218
[ 606.689157] __down+0x18/0x24
[ 606.689165] down+0x50/0x6c
[ 606.689168] nvkms_kthread_q_callback+0x90/0x17c [nvidia_modeset]
[ 606.689268] _main_loop+0x90/0x14c [nvidia_modeset]
[ 606.689344] kthread+0x110/0x114
[ 606.689352] ret_from_fork+0x10/0x20
[ 606.689480] INFO: task nvidia-smi:10936 blocked for more than 241 seconds.
[ 606.695918] Tainted: G W O 6.8.12-tegra #1
[ 606.701674] “echo 0 > /proc/sys/kernel/hung_task_timeout_secs” disables this message.
[ 606.709359] task:nvidia-smi state:D stack:0 pid:10936 tgid:10911 ppid:10905 flags:0x00800204
[ 606.709364] Call trace:
[ 606.709366] __switch_to+0xe0/0x108
[ 606.709375] __schedule+0x368/0xbe4
[ 606.709379] schedule+0x34/0xc8
[ 606.709382] schedule_preempt_disabled+0x24/0x40
[ 606.709386] rwsem_down_read_slowpath+0x214/0x51c
[ 606.709389] down_read+0xa0/0xa8
[ 606.709390] os_acquire_rwlock_read+0x38/0x64 [nvidia]
[ 606.709787] portSyncRwLockAcquireRead+0x10/0x30 [nvidia]
[ 606.710132] rmapiLockAcquire+0x29c/0x320 [nvidia]
[ 606.710479] rmapiPrologue+0x124/0x180 [nvidia]
[ 606.710796] _rmapiRmControl+0x408/0x6a0 [nvidia]
[ 606.711091] rmapiControlWithSecInfo+0xa8/0x150 [nvidia]
[ 606.711362] rmapiControlWithSecInfoTls+0x74/0xe0 [nvidia]
[ 606.711712] _nv04ControlWithSecInfo.constprop.0+0x80/0xa0 [nvidia]
[ 606.712057] Nv04ControlWithSecInfo+0x34/0x40 [nvidia]
[ 606.712401] RmIoctl+0x884/0xbf0 [nvidia]
[ 606.712773] rm_ioctl+0x64/0x430 [nvidia]
[ 606.713123] nvidia_unlocked_ioctl+0x664/0x76c [nvidia]
[ 606.713487] __arm64_sys_ioctl+0xac/0xf0
[ 606.713495] invoke_syscall+0x48/0x114
[ 606.713501] el0_svc_common.constprop.0+0xc0/0xe0
[ 606.713506] do_el0_svc+0x1c/0x28
[ 606.713511] el0_svc+0x30/0xa8
[ 606.713516] el0t_64_sync_handler+0x120/0x12c
[ 606.713520] el0t_64_sync+0x194/0x198
[ 727.514753] INFO: task nvidia-modeset/:1850 blocked for more than 362 seconds.
[ 727.514778] Tainted: G W O 6.8.12-tegra #1
[ 727.514790] “echo 0 > /proc/sys/kernel/hung_task_timeout_secs” disables this message.
[ 727.521781] task:nvidia-modeset/ state:D stack:0 pid:1850 tgid:1850 ppid:2 flags:0x00000208
[ 727.521789] Call trace:
[ 727.521791] __switch_to+0xe0/0x108
[ 727.521806] __schedule+0x368/0xbe4
[ 727.521812] schedule+0x34/0xc8
[ 727.521817] schedule_timeout+0x1a4/0x1b4
[ 727.521821] __down_common+0x104/0x218
[ 727.521827] __down+0x18/0x24
[ 727.521834] down+0x50/0x6c
[ 727.521837] nvkms_kthread_q_callback+0x90/0x17c [nvidia_modeset]
[ 727.521929] _main_loop+0x90/0x14c [nvidia_modeset]
[ 727.521994] kthread+0x110/0x114
[ 727.522001] ret_from_fork+0x10/0x20
[ 727.522136] INFO: task nvidia-smi:10936 blocked for more than 362 seconds.
[ 727.528422] Tainted: G W O 6.8.12-tegra #1
[ 727.534375] “echo 0 > /proc/sys/kernel/hung_task_timeout_secs” disables this message.
[ 727.542115] task:nvidia-smi state:D stack:0 pid:10936 tgid:10911 ppid:10905 flags:0x00800204
[ 727.542122] Call trace:
[ 727.542124] __switch_to+0xe0/0x108
[ 727.542133] __schedule+0x368/0xbe4
[ 727.542138] schedule+0x34/0xc8
[ 727.542143] schedule_preempt_disabled+0x24/0x40
[ 727.542149] rwsem_down_read_slowpath+0x214/0x51c
[ 727.542152] down_read+0xa0/0xa8
[ 727.542155] os_acquire_rwlock_read+0x38/0x64 [nvidia]
[ 727.542652] portSyncRwLockAcquireRead+0x10/0x30 [nvidia]
[ 727.543126] rmapiLockAcquire+0x29c/0x320 [nvidia]
[ 727.543533] rmapiPrologue+0x124/0x180 [nvidia]
[ 727.543918] _rmapiRmControl+0x408/0x6a0 [nvidia]
[ 727.544295] rmapiControlWithSecInfo+0xa8/0x150 [nvidia]
[ 727.544665] rmapiControlWithSecInfoTls+0x74/0xe0 [nvidia]
[ 727.545038] _nv04ControlWithSecInfo.constprop.0+0x80/0xa0 [nvidia]
[ 727.545409] Nv04ControlWithSecInfo+0x34/0x40 [nvidia]
[ 727.545777] RmIoctl+0x884/0xbf0 [nvidia]
[ 727.546171] rm_ioctl+0x64/0x430 [nvidia]
[ 727.546557] nvidia_unlocked_ioctl+0x664/0x76c [nvidia]
[ 727.546979] __arm64_sys_ioctl+0xac/0xf0
[ 727.546987] invoke_syscall+0x48/0x114
[ 727.546993] el0_svc_common.constprop.0+0xc0/0xe0
[ 727.546999] do_el0_svc+0x1c/0x28
[ 727.547005] el0_svc+0x30/0xa8
[ 727.547009] el0t_64_sync_handler+0x120/0x12c
[ 727.547013] el0t_64_sync+0x194/0x198
[ 764.597836] sysrq: Show Blocked State
[ 764.597884] task:irq/238-host1x state:D stack:0 pid:338 tgid:338 ppid:2 flags:0x00000008
[ 764.597893] Call trace:
[ 764.597895] __switch_to+0xe0/0x108
[ 764.597912] __schedule+0x368/0xbe4
[ 764.597918] schedule+0x34/0xc8
[ 764.597924] schedule_timeout+0xa0/0x1b4
[ 764.597929] wait_for_completion_timeout+0x78/0x14c
[ 764.597935] tegra_bpmp_transfer+0x1c8/0x3ec
[ 764.597948] tegra264_mc_icc_set+0xec/0x198
[ 764.597971] apply_constraints+0x6c/0xac
[ 764.597978] icc_set_bw+0xb4/0x29c
[ 764.597983] vic_devfreq_target+0x68/0xfc [tegra_drm]
[ 764.598007] devfreq_set_target+0x90/0x224
[ 764.598012] devfreq_update_target+0xc0/0xd8
[ 764.598015] update_devfreq+0x14/0x20
[ 764.598018] vic_actmon_event+0x50/0x74 [tegra_drm]
[ 764.598031] host1x_actmon_handle_interrupt+0x100/0x138 [host1x]
[ 764.598048] host1x_general_isr+0x58/0x5c [host1x]
[ 764.598060] irq_thread_fn+0x2c/0xa8
[ 764.598069] irq_thread+0x164/0x24c
[ 764.598075] kthread+0x110/0x114
[ 764.598079] ret_from_fork+0x10/0x20
[ 764.598111] task:nvidia-modeset/ state:D stack:0 pid:1850 tgid:1850 ppid:2 flags:0x00000208
[ 764.598116] Call trace:
[ 764.598118] __switch_to+0xe0/0x108
[ 764.598124] __schedule+0x368/0xbe4
[ 764.598129] schedule+0x34/0xc8
[ 764.598134] schedule_timeout+0x1a4/0x1b4
[ 764.598137] __down_common+0x104/0x218
[ 764.598143] __down+0x18/0x24
[ 764.598149] down+0x50/0x6c
[ 764.598151] nvkms_kthread_q_callback+0x90/0x17c [nvidia_modeset]
[ 764.598237] _main_loop+0x90/0x14c [nvidia_modeset]
[ 764.598301] kthread+0x110/0x114
[ 764.598304] ret_from_fork+0x10/0x20
[ 764.598412] task:vi-output, ox05 state:D stack:0 pid:4282 tgid:4282 ppid:2 flags:0x00000208
[ 764.598416] Call trace:
[ 764.598418] __switch_to+0xe0/0x108
[ 764.598424] __schedule+0x368/0xbe4
[ 764.598429] schedule+0x34/0xc8
[ 764.598434] schedule_timeout+0xa0/0x1b4
[ 764.598437] wait_for_completion_timeout+0x78/0x14c
[ 764.598443] vi_capture_status+0x6c/0x128 [tegra_camera]
[ 764.598474] vi5_capture_dequeue+0x114/0x438 [tegra_camera]
[ 764.598492] tegra_channel_kthread_capture_dequeue+0xb4/0x248 [tegra_camera]
[ 764.598509] kthread+0x110/0x114
[ 764.598512] ret_from_fork+0x10/0x20
[ 764.598517] task:vi-output, ox05 state:D stack:0 pid:4297 tgid:4297 ppid:2 flags:0x00000208
[ 764.598520] Call trace:
[ 764.598522] __switch_to+0xe0/0x108
[ 764.598528] __schedule+0x368/0xbe4
[ 764.598533] schedule+0x34/0xc8
[ 764.598538] schedule_timeout+0xa0/0x1b4
[ 764.598541] wait_for_completion_timeout+0x78/0x14c
[ 764.598547] vi_capture_status+0x6c/0x128 [tegra_camera]
[ 764.598563] vi5_capture_dequeue+0x114/0x438 [tegra_camera]
[ 764.598579] tegra_channel_kthread_capture_dequeue+0xb4/0x248 [tegra_camera]
[ 764.598594] kthread+0x110/0x114
[ 764.598597] ret_from_fork+0x10/0x20
[ 764.598602] task:vi-output, ox05 state:D stack:0 pid:4300 tgid:4300 ppid:2 flags:0x00000208
[ 764.598605] Call trace:
[ 764.598607] __switch_to+0xe0/0x108
[ 764.598613] __schedule+0x368/0xbe4
[ 764.598618] schedule+0x34/0xc8
[ 764.598623] schedule_timeout+0xa0/0x1b4
[ 764.598626] wait_for_completion_timeout+0x78/0x14c
[ 764.598631] vi_capture_status+0x6c/0x128 [tegra_camera]
[ 764.598647] vi5_capture_dequeue+0x114/0x438 [tegra_camera]
[ 764.598663] tegra_channel_kthread_capture_dequeue+0xb4/0x248 [tegra_camera]
[ 764.598678] kthread+0x110/0x114
[ 764.598681] ret_from_fork+0x10/0x20
[ 764.598686] task:vi-output, ox05 state:D stack:0 pid:4302 tgid:4302 ppid:2 flags:0x00000208
[ 764.598689] Call trace:
[ 764.598690] __switch_to+0xe0/0x108
[ 764.598696] __schedule+0x368/0xbe4
[ 764.598700] schedule+0x34/0xc8
[ 764.598706] schedule_timeout+0xa0/0x1b4
[ 764.598709] wait_for_completion_timeout+0x78/0x14c
[ 764.598714] vi_capture_status+0x6c/0x128 [tegra_camera]
[ 764.598730] vi5_capture_dequeue+0x114/0x438 [tegra_camera]
[ 764.598745] tegra_channel_kthread_capture_dequeue+0xb4/0x248 [tegra_camera]
[ 764.598761] kthread+0x110/0x114
[ 764.598764] ret_from_fork+0x10/0x20
[ 764.598775] task:vi-output, ox05 state:D stack:0 pid:9447 tgid:9447 ppid:2 flags:0x00000208
[ 764.598778] Call trace:
[ 764.598780] __switch_to+0xe0/0x108
[ 764.598786] __schedule+0x368/0xbe4
[ 764.598791] schedule+0x34/0xc8
[ 764.598796] schedule_timeout+0xa0/0x1b4
[ 764.598799] wait_for_completion_timeout+0x78/0x14c
[ 764.598804] vi_capture_status+0x6c/0x128 [tegra_camera]
[ 764.598820] vi5_capture_dequeue+0x114/0x438 [tegra_camera]
[ 764.598835] tegra_channel_kthread_capture_dequeue+0xb4/0x248 [tegra_camera]
[ 764.598851] kthread+0x110/0x114
[ 764.598853] ret_from_fork+0x10/0x20
[ 764.598860] task:nvidia-smi state:D stack:0 pid:10936 tgid:10911 ppid:10905 flags:0x00800204
[ 764.598866] Call trace:
[ 764.598867] __switch_to+0xe0/0x108
[ 764.598872] __schedule+0x368/0xbe4
[ 764.598877] schedule+0x34/0xc8
[ 764.598883] schedule_preempt_disabled+0x24/0x40
[ 764.598888] rwsem_down_read_slowpath+0x214/0x51c
[ 764.598891] down_read+0xa0/0xa8
[ 764.598893] os_acquire_rwlock_read+0x38/0x64 [nvidia]
[ 764.599384] portSyncRwLockAcquireRead+0x10/0x30 [nvidia]
[ 764.599869] rmapiLockAcquire+0x29c/0x320 [nvidia]
[ 764.600315] rmapiPrologue+0x124/0x180 [nvidia]
[ 764.600726] _rmapiRmControl+0x408/0x6a0 [nvidia]
[ 764.601121] rmapiControlWithSecInfo+0xa8/0x150 [nvidia]
[ 764.601503] rmapiControlWithSecInfoTls+0x74/0xe0 [nvidia]
[ 764.601870] _nv04ControlWithSecInfo.constprop.0+0x80/0xa0 [nvidia]
[ 764.602240] Nv04ControlWithSecInfo+0x34/0x40 [nvidia]
[ 764.602605] RmIoctl+0x884/0xbf0 [nvidia]
[ 764.603033] rm_ioctl+0x64/0x430 [nvidia]
[ 764.603427] nvidia_unlocked_ioctl+0x664/0x76c [nvidia]
[ 764.603847] __arm64_sys_ioctl+0xac/0xf0
[ 764.603854] invoke_syscall+0x48/0x114
[ 764.603861] el0_svc_common.constprop.0+0xc0/0xe0
[ 764.603867] do_el0_svc+0x1c/0x28
[ 764.603873] el0_svc+0x30/0xa8
[ 764.603878] el0t_64_sync_handler+0x120/0x12c
[ 764.603882] el0t_64_sync+0x194/0x198
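With this many blocked tasks, a quick way to see which driver each one is waiting in is to pick, for every D-state task, the first call-trace frame that carries a module tag. The sketch below does that for text in the format of the sysrq dump above. It is only an illustration: `first_module_frames` is a made-up name, and a task header that is split across two lines will not be recognised.

```python
import re

# Task header, e.g. "task:nvidia-smi state:D stack:0 pid:10936 tgid:10911 ..."
TASK_RE = re.compile(
    r"task:(?P<comm>.+?)\s+state:(?P<state>\S)\s.*?\bpid:(?P<pid>\d+)"
)
# Call-trace frame, e.g. "] vic_devfreq_target+0x68/0xfc [tegra_drm]"
FRAME_RE = re.compile(
    r"\]\s+(?P<sym>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(?P<mod>\w+)\])?\s*$"
)


def first_module_frames(log_text):
    """Map (comm, pid) of each D-state task to its first (symbol, module) frame."""
    result = {}
    current = None  # task whose call trace is currently being read
    for line in log_text.splitlines():
        t = TASK_RE.search(line)
        if t:
            key = (t.group("comm"), int(t.group("pid")))
            current = key if t.group("state") == "D" else None
            continue
        f = FRAME_RE.search(line)
        # Only frames tagged with a module name are recorded, once per task.
        if f and current and f.group("mod") and current not in result:
            result[current] = (f.group("sym"), f.group("mod"))
    return result
```

For the dump above this kind of summary separates the tasks waiting in tegra_drm, nvidia_modeset, tegra_camera and nvidia, which is the grouping the discussion in this thread relies on.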

Below is the sysrq log; please help check it.

hung_dmesg.txt (131.6 KB)

The UART log and dmesg log are attached.
The pipelines below can reproduce this issue:
camera CSI (4 surround cameras, 1 rear camera) → OpenCV display (the necessary condition is closing and reopening the OpenCV window repeatedly; the issue reproduces very easily when the window is closed)
camera CSI (4 surround cameras, 1 rear camera) → DeepStream calling VIC for color conversion (this task runs in the background)

1218_camera_demo_stuck.gz (310 KB)

The UART log contains the following; please help check it.
Exception Type: 0
DFAR: 0xd9684094 DFSR: 0x00001008 ADFSR: 0x00500000
IFAR: 0x00000000 IFSR: 0x00000000 AIFSR: 0x00000000
PC: 0x00026dee LR: 0x00026ded SP: 0x0007bed0 PSR: 0x2000003f
R0: 0x00000001 R1: 0x4008208c R2: 0x000000c6 R3: 0xd9680000
R4: 0x00041050 R5: 0x00000005 R6: 0x00004094 R7: 0x0007bed0
R8: 0x00000004 R9: 0x00000003 R10: 0x00000000 R11: 0x41c956c4
R12: 0x00000000

We have already updated to JetPack 7.1 and still have this problem.
Please raise the priority of this issue.

Please try with the official JP7.1 GA release.

BTW, are you also getting support via another channel?
If yes, this should be handled there rather than on the forum.

Thanks
