Kernel crashes with v4l and Raspberry Pi camera

hi,

I have connected an IMX219 Raspberry Pi camera to my Jetson Nano and am trying to access it from my code using V4L2. I can access it via libargus, but would ideally like to avoid running an extra daemon: I have found that Argus often gets itself into a bad state that requires a daemon restart, and I have read elsewhere that it tends to consume a lot of memory.

However, with a V4L2 test command like:

$ v4l2-ctl --device /dev/video0 --set-fmt-video=width=2464,height=2066,pixelformat=RG12 --stream-mmap --stream-to=output1.raw --stream-count=1 --stream-skip=1

or when running the v4l2cuda example, I almost always encounter select timeouts or kernel crashes like this one:

$ dmesg
[ 3428.380811] tegra-vii2c 546c0000.i2c: no acknowledge from address 0x10
[ 3428.388792] regmap_util_write_table_8:regmap_util_write_table:-121
[ 3428.446277] imx219 6-0010: Error turning off streaming
[ 3428.455015] ------------[ cut here ]------------
[ 3428.459659] WARNING: CPU: 0 PID: 9228 at /dvs/git/dirty/git-master_linux/kernel/kernel-4.9/drivers/media/v4l2-core/videobuf2-core.c:1667 __vb2_queue_cancel+0x11c/0x188
[ 3428.474598] Modules linked in: zram nvgpu bluedroid_pm ip_tables x_tables

[ 3428.474724] CPU: 0 PID: 9228 Comm: cuda-EvtHandlr Tainted: G W 4.9.140-tegra #1
[ 3428.474740] Hardware name: NVIDIA Jetson Nano Developer Kit (DT)
[ 3428.474758] task: ffffffc00f5e8000 task.stack: ffffffc0c159c000
[ 3428.474781] PC is at __vb2_queue_cancel+0x11c/0x188
[ 3428.474802] LR is at __vb2_queue_cancel+0x34/0x188
[ 3428.474824] pc : [] lr : [] pstate: 60400045
[ 3428.474837] sp : ffffffc0c159fae0
[ 3428.474851] x29: ffffffc0c159fae0 x28: 0000000000000008
[ 3428.474889] x27: ffffff8008f62000 x26: ffffffc0c159fde8
[ 3428.474925] x25: ffffffc0e9548ae8 x24: ffffffc0f5016518
[ 3428.474958] x23: 0000000000000001 x22: ffffffc0f9678718
[ 3428.474992] x21: ffffffc0f9678030 x20: ffffffc0f9678b58
[ 3428.475025] x19: ffffffc0f9678b58 x18: 0000000000000001
[ 3428.475058] x17: 0000000000000001 x16: 0000000000000000
[ 3428.475090] x15: ffffffffffffffff x14: ffffffc0c159fa00
[ 3428.475123] x13: ffffffc0c159f905 x12: 0000000000000000
[ 3428.475156] x11: ffffffc0c159f8c0 x10: ffffffc0c159f8c0
[ 3428.475190] x9 : ffffffc0c159f9c0 x8 : fffffffffffffffe
[ 3428.475222] x7 : ffffffc051e2c800 x6 : ffffffc051e2c980
[ 3428.475255] x5 : ffffffc00f5e8000 x4 : ffffffc0fefc40e0
[ 3428.475287] x3 : 000000001830b480 x2 : ffffffc051e2c700
[ 3428.475320] x1 : ffffffc0f932a8c0 x0 : 0000000000000004

[ 3428.475366] ---[ end trace b99ae9da2ac9f40f ]---
[ 3428.480015] Call trace:
[ 3428.480045] [] __vb2_queue_cancel+0x11c/0x188
[ 3428.480074] [] vb2_core_queue_release+0x2c/0x58
[ 3428.480100] [] _vb2_fop_release+0x84/0xa0
[ 3428.480132] [] tegra_channel_close+0x58/0x130
[ 3428.480161] [] v4l2_release+0x48/0xa0
[ 3428.480197] [] __fput+0x90/0x1d0
[ 3428.480222] [] ____fput+0x20/0x30
[ 3428.480258] [] task_work_run+0xbc/0xd8
[ 3428.480290] [] do_exit+0x2c4/0xa08
[ 3428.480316] [] do_group_exit+0x40/0xa8
[ 3428.480346] [] get_signal+0x26c/0x578
[ 3428.480377] [] do_signal+0x130/0x500
[ 3428.480404] [] do_notify_resume+0x90/0xb0
[ 3428.480429] [] work_pending+0x8/0x10
[ 3456.629400] tegradc tegradc.0: blank - powerdown
[ 3456.683271] extcon-disp-state extcon:disp-state: cable 47 state 0
[ 3456.683297] Extcon AUX1(HDMI) disable

Can anybody tell me what is going wrong here? I have switched to a 4 A barrel-plug power supply, but it does not help.

Thank you in advance,

Jacob

hello jacobnfaku,

May I know whether 2464x2066 is among the available sensor modes of your IMX219?
You can run the command below to list all available modes,
for example:

$ v4l2-ctl -d /dev/video0 --list-formats-ext

By the way, there is a T210 hardware limitation: the stride must be 64-aligned,
so please adjust the sensor width accordingly.
Thanks
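
As a rough illustration of the 64-alignment rule, the sketch below rounds a value up to the next multiple of 64. Whether the alignment applies to the width in pixels or to the line stride in bytes is an assumption here, so both are shown. `align_up` and `stride_bytes` are hypothetical helper names, and the figure of 2 bytes per pixel assumes RG12 samples stored in 16-bit words.

```shell
#!/bin/bash
# Round a value up to the next multiple of an alignment (hypothetical helper).
align_up() {
    local value=$1 align=$2
    echo $(( (value + align - 1) / align * align ))
}

# Line stride in bytes for a given width, assuming 2 bytes per pixel
# (an assumption for RG12) and a 64-byte stride alignment.
stride_bytes() {
    local width=$1 bytes_per_pixel=${2:-2}
    align_up $(( width * bytes_per_pixel )) 64
}

align_up 2464 64     # prints 2496
stride_bytes 3264    # prints 6528
```

A width of 3264 is already a multiple of 64, while 2464 is not, which is one way to see why the second value needs adjusting.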

Hi jacobnfaku,

The v4l2cuda sample is not designed for raw capture.
We suggest using nvarguscamerasrc and argus_camera with the IMX219.
nvarguscamerasrc pipeline:

$ gst-launch-1.0 nvarguscamerasrc sensor-id=0 ! 'video/x-raw(memory:NVMM),width=1920, height=1080, framerate=30/1, format=NV12' ! nvoverlaysink -ev

Argus samples:
Please follow the steps in /usr/src/jetson_multimedia_api/argus/README.TXT to install the argus_camera app.

Sorry, the command I pasted was a bad example; I seem to have tried everything.

Here is the result with one of the resolutions the camera supports:

v4l2-ctl --device /dev/video0 --set-fmt-video=width=3264,height=2464,pixelformat=RG12 --stream-mmap --stream-to=output1.raw --stream-count=1 --stream-skip=1
^C
[nano]
~$ sudo dmesg -c
[75336.410361] tegra-vii2c 546c0000.i2c: no acknowledge from address 0x10
[75336.417092] regmap_util_write_table_8:regmap_util_write_table:-121
[75336.474703] imx219 6-0010: Error turning off streaming
[75336.483538] ------------[ cut here ]------------
[75336.488219] WARNING: CPU: 0 PID: 10952 at /dvs/git/dirty/git-master_linux/kernel/kernel-4.9/drivers/media/v4l2-core/videobuf2-core.c:1667 __vb2_queue_cancel+0x11c/0x188
[75336.503272] Modules linked in: zram nvgpu bluedroid_pm ip_tables x_tables

[75336.503513] CPU: 0 PID: 10952 Comm: v4l2-ctl Tainted: G W 4.9.140-tegra #1
[75336.503544] Hardware name: NVIDIA Jetson Nano Developer Kit (DT)
[75336.503582] task: ffffffc0caa56200 task.stack: ffffffc083654000
[75336.503631] PC is at __vb2_queue_cancel+0x11c/0x188
[75336.503672] LR is at __vb2_queue_cancel+0x34/0x188
[75336.503716] pc : [] lr : [] pstate: 60400045
[75336.503742] sp : ffffffc083657ae0
[75336.503773] x29: ffffffc083657ae0 x28: 0000000000000008
[75336.503856] x27: ffffff8008f62000 x26: ffffffc083657de8
[75336.503931] x25: ffffffc0a99a98e8 x24: ffffffc0f508f238
[75336.504001] x23: 0000000000000001 x22: ffffffc0f5080718
[75336.504071] x21: ffffffc0f5080030 x20: ffffffc0f5080b58
[75336.504140] x19: ffffffc0f5080b58 x18: 0000000000000001
[75336.504208] x17: 0000000000000001 x16: 0000000000000000
[75336.504276] x15: ffffffffffffffff x14: ffffffc083657a00
[75336.504344] x13: ffffffc083657905 x12: 0000000000000000
[75336.504411] x11: ffffffc0836578c0 x10: ffffffc0836578c0
[75336.504480] x9 : ffffffc0836579c0 x8 : 0000000000000000
[75336.504549] x7 : ffffffc0ede46c00 x6 : ffffffc0e37d4701
[75336.504615] x5 : ffffff80085303dc x4 : ffffffbf038df510
[75336.504683] x3 : 000000018040003e x2 : ffffffc0e37d4700
[75336.504749] x1 : ffffffc0f93238c0 x0 : 0000000000000004

[75336.504842] ---[ end trace a175502341a31d3e ]---
[75336.509541] Call trace:
[75336.509598] [] __vb2_queue_cancel+0x11c/0x188
[75336.509655] [] vb2_core_queue_release+0x2c/0x58
[75336.509706] [] _vb2_fop_release+0x84/0xa0
[75336.509769] [] tegra_channel_close+0x58/0x130
[75336.509825] [] v4l2_release+0x48/0xa0
[75336.509887] [] __fput+0x90/0x1d0
[75336.509939] [] ____fput+0x20/0x30
[75336.510002] [] task_work_run+0xbc/0xd8
[75336.510060] [] do_exit+0x2c4/0xa08
[75336.510111] [] do_group_exit+0x40/0xa8
[75336.510168] [] get_signal+0x26c/0x578
[75336.510223] [] do_signal+0x130/0x500
[75336.510277] [] do_notify_resume+0x90/0xb0
[75336.510322] [] work_pending+0x8/0x10

I am trying argus, but the only example I found so far that does zero-copy to CUDA is the Histogram one that is luminance only. Is there source code available, so I can see how Argus uses V4L2?

Best regards,
Jacob

Hi jacobnfaku,

What image version are you using on Jetson-Nano?

$ head -1 /etc/nv_tegra_release

You may need to disable bypass from v4l2-ctl in order to avoid a conflict with the nvarguscamera daemon:

v4l2-ctl --device /dev/video0 --set-ctrl bypass_mode=0 --set-fmt-video=width=3264,height=2464,pixelformat=RG12 --stream-mmap --stream-to=output1.raw --stream-count=1

Hi Carolyuu,

My Jetson SD card image did not have that file, but it was from around September, so I downloaded jetson-nano-sd-card-image-r32.2.zip, which appears to be the most recent version, and flashed that. Unfortunately the new one does not have the /etc/nv_tegra_release file either.

I tried replacing the Raspberry Pi IMX219 with a Waveshare one that was bought specifically for the Nano. Things go a little better with the new one: I no longer get the kernel errors, and I can capture frames. Unfortunately, the buffers I get back contain only zeros.

I tried the 12_camera_v4l2_cuda sample and there I also just get all-green images back (probably what the zero buffers look like after YUV->RGB conversion.)

The command suggested by Honey_Patouceul works, but the output .raw file only contains zeros. I’ve tried qv4l2 and I get a black window.
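
A quick way to confirm that a captured file really contains nothing but zero bytes, without opening it in a viewer, could look like the sketch below. `is_all_zero` is a hypothetical helper name; the approach simply deletes every NUL byte and checks whether anything is left.

```shell
#!/bin/bash
# Succeeds (exit status 0) when the file contains only 0x00 bytes.
# Note: an empty file also counts as all zeros here.
is_all_zero() {
    local nonzero
    # Delete every NUL byte, then count whether at least one byte remains.
    nonzero=$(tr -d '\000' < "$1" | head -c 1 | wc -c)
    [ "$nonzero" -eq 0 ]
}

# Example: is_all_zero output1.raw && echo "frame is empty"
```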

I can still run the camera via gstreamer and argus, but it is unclear if I would be able to achieve zero-copy reads to CUDA buffers with that approach.

Is it safe to conclude that v4l2 is just plain broken on the Jetson Nano?

You may first try to stop the Argus daemon:

sudo systemctl stop nvargus-daemon.service

then try increasing the gain and/or exposure from qv4l2 (be sure to disable bypass as well). If that doesn’t work, you may also try disabling the Argus daemon, rebooting, and trying again, but this is only a suggestion; I am not sure it would really help.
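
For reference, the same steps could be attempted from the command line instead of qv4l2. The sketch below only prints the commands rather than running them. The control names `gain` and `exposure` are assumptions and may differ on a given driver, so check `v4l2-ctl --device /dev/video0 --list-ctrls` for the real names and valid ranges. `print_v4l2_steps` is a hypothetical helper name, and the values passed to it are placeholders.

```shell
#!/bin/bash
# Print (not run) the V4L2 capture steps for a given gain and exposure.
# "gain" and "exposure" are assumed control names; verify them with
# `v4l2-ctl --device /dev/video0 --list-ctrls` before use.
print_v4l2_steps() {
    local gain=$1 exposure=$2 dev=${3:-/dev/video0}
    echo "sudo systemctl stop nvargus-daemon.service"
    echo "v4l2-ctl --device $dev --set-ctrl bypass_mode=0,gain=$gain,exposure=$exposure" \
         "--set-fmt-video=width=3264,height=2464,pixelformat=RG12" \
         "--stream-mmap --stream-to=output1.raw --stream-count=1"
}

# Placeholder values; valid ranges depend on the sensor driver.
print_v4l2_steps 100 10000
```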

Hi,
The following command is verified on r32.2.1, r32.2.3, r32.3.1, Jetson Nano, Raspberry Pi camera v2:

$ gst-launch-1.0 nvarguscamerasrc sensor-id=0 ! 'video/x-raw(memory:NVMM),width=1920, height=1080, framerate=30/1, format=NV12' ! nvoverlaysink

Please check if you can run it first. If not, please re-install through SDKManager and try again.

hi and thanks for the help and suggestions;

The camera runs fine via argus, but with V4L2 there is no signal, just 0-filled frames. I have tried the suggestions to adjust gain & exposure and to reboot with argus disabled. Nothing helps.

Hi,
Without Argus, you cannot leverage the ISP engine on Jetson Nano. This is not optimal and not recommended.

That is a shame, as Argus can get very unstable and leak a lot of memory at times.

Right now for instance, it has produced 3.2GiB worth of syslog containing mostly these errors:

Feb 20 12:26:16 jacob-desktop nvargus-daemon[26684]: (Argus) Error InvalidState: (propagating from src/api/ScfCaptureThread.cpp, function run(), line 109)
Feb 20 12:26:16 jacob-desktop nvargus-daemon[26684]: SCF: Error InvalidState: Session has suffered a critical failure (in src/api/Session.cpp, function capture(), line 667)
Feb 20 12:26:16 jacob-desktop nvargus-daemon[26684]: (Argus) Error InvalidState: (propagating from src/api/ScfCaptureThread.cpp, function run(), line 109)

In this wonderful Ubuntu world that seems to be the only supported way to use a Jetson, this in turn triggers systemd’s log mechanism to (apparently) go and read all of these log statements into memory, so that apart from the disk running out of space, the rest of the system gets hosed too. I’d be hesitant to use any of this on a production system.

Hi,
We have the Raspberry Pi camera v2 and use it to verify camera functions in each release, so it should be reliably stable. I am not sure, but your power supply may not provide sufficient voltage/current. Please check whether it meets the requirement:
https://devtalk.nvidia.com/default/topic/1048640/jetson-nano/power-supply-considerations-for-jetson-nano-developer-kit/

For reference, please share your release version ($ head -1 /etc/nv_tegra_release) and the steps to reproduce the error. We will try to reproduce it by following those steps.

Hi DaneLLL, please see thread above for context wrt power supply and the (lack of) /etc/nv_tegra_release file, even after reflashing latest jetpack.

Wrt stability, my guess is that Argus runs fine if everything is done flawlessly in the client, but when experimenting or if unexpected events cause initialization to fail halfway through, Argus does very quickly get itself into a very bad state. I have screenshots of nvargus-daemon using 9.1GiB VIRT memory, and I have massive log files consisting of forever repeated log statements from Argus like this one:

Feb 20 17:53:04 jacob-desktop nvargus-daemon[10231]: SCF: Error InvalidState: Session has suffered a critical failure (in src/api/Session.cpp, function capture(), line 667)

which also indicate that Argus is not as stable as one would hope before taking it into production.

Finally, there is the issue of black screens with V4L2. From kernel traces it looks like Argus configures some magic TEGRA CIDs that the V4L2 examples do not show how to use. Perhaps NVIDIA would consider making Argus open source; that would really help.

Regards,
Jacob

Hi jacobnfaku,

We ran the script below overnight on r32.3.1 + Nano + Raspberry Pi camera v2, with no issues.
Test script:

#!/bin/bash

i=1
while [ "$i" != "1000" ]
do
    echo "loop" $i
    gst-launch-1.0 nvarguscamerasrc num-buffers=300 sensor-id=0 ! 'video/x-raw(memory:NVMM),width=1920, height=1080, framerate=30/1, format=NV12' ! nvoverlaysink
    sleep 5
    i=$(($i+1))
done

Yes, I believe that can work, because it does a clean shutdown after every iteration. Right now, after a crash elsewhere in my code that caused my process to exit without calling Argus shutdown, I get this result:

$ gst-launch-1.0 nvarguscamerasrc num-buffers=300 sensor-id=0 ! 'video/x-raw(memory:NVMM),width=1920, height=1080, framerate=30/1, format=NV12' ! nvoverlaysink
Setting pipeline to PAUSED ...
Pipeline is live and does not need PREROLL ...
Setting pipeline to PLAYING ...
New clock: GstSystemClock
Error generated. /dvs/git/dirty/git-master_linux/multimedia/nvgstreamer/gst-nvarguscamera/gstnvarguscamerasrc.cpp, execute:521 No cameras available
Got EOS from element "pipeline0".
Execution ended after 0:00:00.032277158
Setting pipeline to PAUSED ...
Setting pipeline to READY ...
Setting pipeline to NULL ...
Freeing pipeline ...
(Argus) Error EndOfFile: Unexpected error in reading socket (in src/rpc/socket/client/ClientSocketManager.cpp, function recvThreadCore(), line 266)
(Argus) Error EndOfFile: Receive worker failure, notifying 1 waiting threads (in src/rpc/socket/client/ClientSocketManager.cpp, function recvThreadCore(), line 340)
(Argus) Error InvalidState: Argus client is exiting with 1 outstanding client threads (in src/rpc/socket/client/ClientSocketManager.cpp, function recvThreadCore(), line 357)
(Argus) Error EndOfFile: Receiving thread terminated with error (in src/rpc/socket/client/ClientSocketManager.cpp, function recvThreadWrapper(), line 368)
(Argus) Error EndOfFile: Client thread received an error from socket (in src/rpc/socket/client/ClientSocketManager.cpp, function send(), line 145)
(Argus) Error EndOfFile:  (propagating from src/rpc/socket/client/SocketClientDispatch.cpp, function dispatch(), line 87)

And the only way to repair argus is to completely reboot. Even restarting the systemd service is not enough to restore argus to a working state. I think your script needs to test various stages of unclean client shutdowns.
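
One way such an unclean-shutdown test could be sketched is to start the pipeline, let it run briefly, and then kill it with SIGKILL so that it gets no chance to release the camera. `run_and_kill` is a hypothetical helper name, and the delay and loop count in the commented example are arbitrary; this is only a sketch of the idea, not a script anyone in this thread has run.

```shell
#!/bin/bash
# Start a command, let it run for a while, then kill it with SIGKILL so it
# cannot shut down cleanly. Returns the command's exit status
# (137 = 128 + 9 when it was terminated by SIGKILL).
run_and_kill() {
    local delay=$1; shift
    "$@" &
    local pid=$!
    sleep "$delay"
    kill -KILL "$pid" 2>/dev/null
    wait "$pid" 2>/dev/null
}

# Example (hypothetical loop around the pipeline used in this thread):
# for i in $(seq 1 100); do
#     run_and_kill 3 gst-launch-1.0 nvarguscamerasrc sensor-id=0 ! \
#         'video/x-raw(memory:NVMM),width=1920, height=1080, framerate=30/1, format=NV12' ! \
#         nvoverlaysink
#     sleep 5
# done
```

After each kill, a normal run of the pipeline would show whether the daemon still reports a camera.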

Hi,
It would be great if you can catch the crash in your code. If not, you may restart nvargus-daemon after abnormal shutdown.

sudo service nvargus-daemon stop
sudo nvargus-daemon
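
If the crash cannot be caught inside the client, a small wrapper can at least run the restart automatically whenever the client exits abnormally. The sketch below is generic: `run_with_recovery` and `my_capture_app` are hypothetical names, and the recovery command is passed in by the caller. It does not help in the case reported in this thread where only a reboot restores Argus.

```shell
#!/bin/bash
# Run a command; if it exits with a non-zero status, run a recovery command
# (for example, restarting nvargus-daemon). Returns the original status.
run_with_recovery() {
    local recovery=$1; shift
    local status=0
    "$@" || status=$?
    if [ "$status" -ne 0 ]; then
        echo "command exited with status $status, running recovery" >&2
        bash -c "$recovery"
    fi
    return "$status"
}

# Example: my_capture_app is a hypothetical client binary.
# run_with_recovery "sudo systemctl restart nvargus-daemon.service" ./my_capture_app
```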

Yes that is what I tried, but restarting Argus is not always enough. I have to reboot.

Clean shutdowns are not always a possibility, so Argus should be able to detect a client disconnecting uncleanly and be able to survive it (“fault containment”), just as would be the case when using a kernel device.