How is NVHOST_VI_GET_CAPTURE_STATUS triggered?

Hi,
Setup: Xavier NX with JP5.1.3, FPD-Link III, and 8 IMX219 sensors.
I found an issue where, under certain conditions, Argus won’t start back up again. Here are the kernel logs:

[27671.646943] task:ViCsiHw frameSt state:D stack:    0 pid:230904 ppid:     1 flags:0x00000001
[27671.646989] Call trace:
[27671.647010]  __switch_to+0xc8/0x120
[27671.647021]  __schedule+0x318/0x980
[27671.647028]  schedule+0x78/0x110
[27671.647056]  schedule_timeout+0x2dc/0x340
[27671.647076]  wait_for_completion+0x8c/0x120
[27671.647088]  vi_capture_status+0xac/0x130
[27671.647095]  vi_channel_ioctl+0x2c4/0x8f0
[27671.647104]  __arm64_sys_ioctl+0xac/0xf0
[27671.647113]  el0_svc_common.constprop.0+0x80/0x1d0
[27671.647120]  do_el0_svc+0x38/0xc0
[27671.647127]  el0_svc+0x1c/0x30
[27671.647133]  el0_sync_handler+0xa8/0xb0
[27671.647140]  el0_sync+0x16c/0x180

The kernel reports this thread as a ‘Hung task’ that has been blocked for some amount of time. This leaves the system in the following state:

user@vision:~$ ps aux | grep argus
root      230695  0.1  0.0      0     0 ?        Zsl  13:15   0:05 [nvargus-daemon] <defunct>
root      253690  0.0  0.1  55684 10856 ?        Ss   14:10   0:00 /usr/sbin/nvargus-daemon
user   253866  0.0  0.0  10300   664 pts/0    S+   14:11   0:00 grep --color=auto argus

Here the defunct ‘nvargus-daemon’ will not die, and any attempt to restart ‘/usr/sbin/nvargus-daemon’ fails. The only workaround is to reboot the board, which we are trying to avoid at all costs.
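To detect this condition from a watchdog instead of reading `ps` by hand, the state letter can be read from procfs. Below is a minimal sketch, assuming a Linux procfs where `/proc/<pid>/stat` has the layout `pid (comm) state ...`. The helper names `proc_stat_state` and `read_proc_state` are made up for this example and are not part of any NVIDIA tool.

```c
#include <stdio.h>
#include <string.h>

/*
 * Return the state character (e.g. 'R', 'S', 'D', 'Z') from the contents
 * of a /proc/<pid>/stat line, or '\0' if the line cannot be parsed.
 * The comm field may itself contain spaces or parentheses (for example
 * "ViCsiHw frameSt"), so the search starts from the LAST ')'.
 */
static char proc_stat_state(const char *stat_line)
{
    const char *close = strrchr(stat_line, ')');

    if (close == NULL || close[1] != ' ' || close[2] == '\0')
        return '\0';
    return close[2];
}

/*
 * Hypothetical convenience wrapper: read /proc/<pid>/stat and return the
 * state character, or '\0' if the file cannot be opened or read.
 */
static char read_proc_state(int pid)
{
    char path[64];
    char buf[512]; /* the state field sits near the start of the line */
    FILE *f;

    snprintf(path, sizeof(path), "/proc/%d/stat", pid);
    f = fopen(path, "r");
    if (f == NULL)
        return '\0';
    if (fgets(buf, sizeof(buf), f) == NULL) {
        fclose(f);
        return '\0';
    }
    fclose(f);
    return proc_stat_state(buf);
}
```

A result of 'Z' for the daemon's PID would match the `<defunct>` entry above. The blocked thread from the call trace would show 'D', but its state has to be read per thread from `/proc/<pid>/task/<tid>/stat`, which uses the same layout.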

This is the change that I think could solve it:

--- a/kernel/nvidia/drivers/media/platform/tegra/camera/fusa-capture/capture-vi.c
+++ b/kernel/nvidia/drivers/media/platform/tegra/camera/fusa-capture/capture-vi.c
@@ -1498,21 +1498,20 @@ int vi_capture_status(
                return -ENODEV;
        }
 
-       dev_dbg(chan->dev, "%s: waiting for status, timeout:%d ms\n",
+       dev_err(chan->dev, "%s: waiting for status, timeout:%d ms\n",
                __func__, timeout_ms);
 
        /* negative timeout means wait forever */
        if (timeout_ms < 0) {
-               wait_for_completion(&capture->capture_resp);
-       } else {
-               ret = wait_for_completion_timeout(
-                               &capture->capture_resp,
-                               msecs_to_jiffies(timeout_ms));
-               if (ret == 0) {
-                       dev_dbg(chan->dev,
-                               "capture status timed out\n");
-                       return -ETIMEDOUT;
-               }
+               timeout_ms = 100;
+       }
+       ret = wait_for_completion_timeout(
+                       &capture->capture_resp,
+                       msecs_to_jiffies(timeout_ms));
+       if (ret == 0) {
+               dev_err(chan->dev,
+                       "capture status timed out\n");
+               return -ETIMEDOUT;
        }
 
        if (ret < 0) {

Basically, the fix is to always use a timeout and never wait forever (I added 100 ms as a placeholder; I wanted to check via the print what the normal values are). I don’t really care if Argus crashes because of that ‘ETIMEDOUT’, but I want to be able to restart it without leaving zombie processes behind. However, I cannot properly test this, since I’m not sure what issues the ‘NVHOST_VI_GET_CAPTURE_STATUS’ ioctl, which in theory calls that function. So far I have not been able to get it to enter there during normal capture.
Does anybody know how I can trigger that function?

Regards,
Andres
Embedded SW Engineer at RidgeRun
Contact us: support@ridgerun.com
Developers wiki: https://developer.ridgerun.com
Website: www.ridgerun.com

The NVHOST_VI_GET_CAPTURE_STATUS should trigger for each SOF/EOF.

Thank you. To close the topic, and in case it’s helpful to someone else: another related ioctl is VI_CAPTURE_STATUS in capture-vi-channel.c. I was able to trigger it reliably using v4l2, like this:

user@vision:~$ v4l2-ctl --stream-mmap --stream-count=1 --stream-to=file.raw
<
user@vision:~$ sudo dmesg -c
[   80.375683] imx219 9-0060: imx219_power_on: power on
[   80.636428] vi_capture_status: waiting for status, timeout:2500 ms
[   80.809970] imx219 9-0060: imx219_power_off: power off
