When the camera stream stops abnormally, v4l2-ctl cannot be killed

Hi,

I ran into a problem while using v4l2-ctl to test a camera.

sdk version: JetPack 4.2.1 - L4T R32.1

test cmd: v4l2-ctl --set-fmt-video=width=1280,height=768,pixelformat=RG10 --set-ctrl bypass_mode=0 --stream-mmap --stream-count=1000 -d /dev/video0

Streaming was normal at first, but then the camera suddenly stopped outputting images (perhaps the camera died or a cable broke). The dmesg output is shown below:

[ 534.490310] tegra194-vi5 15c10000.vi: no reply from camera processor
[ 534.490458] tegra194-vi5 15c10000.vi: uncorr_err: request timed out after 2500 ms
[ 534.490621] tegra194-vi5 15c10000.vi: err_rec: attempting to reset the capture channel
[ 534.493051] tegra194-vi5 15c10000.vi: err_rec: successfully reset the capture channel
[ 537.046313] tegra194-vi5 15c10000.vi: no reply from camera processor
[ 537.046454] tegra194-vi5 15c10000.vi: uncorr_err: request timed out after 2500 ms
[ 537.046585] tegra194-vi5 15c10000.vi: err_rec: attempting to reset the capture channel
[ 537.046721] tegra194-vi5 15c10000.vi: unexpected response from camera processor
[ 537.046841] video4linux video0: vi capture release failed
[ 537.046932] tegra194-vi5 15c10000.vi: fatal: error recovery failed
[ 537.542788] Unable to handle kernel NULL pointer dereference at virtual address 00000000
[ 537.542940] Mem abort info:
[ 537.542992] ESR = 0x96000005
[ 537.543046] Exception class = DABT (current EL), IL = 32 bits
[ 537.543139] SET = 0, FnV = 0
[ 537.543194] EA = 0, S1PTW = 0
[ 537.543245] Data abort info:
[ 537.543292] ISV = 0, ISS = 0x00000005
[ 537.543360] CM = 0, WnR = 0
[ 537.543419] user pgtable: 4k pages, 39-bit VAs, pgd = ffffffc177404000
[ 537.543520] [0000000000000000] *pgd=0000000000000000, *pud=0000000000000000
[ 537.543655] Internal error: Oops: 96000005 [#1] PREEMPT SMP
[ 537.543754] Modules linked in: can_raw can fuse i2c_camera_driver(O) overlay mvsw61xx 8021q garp mrp nvs_bmi160 nvs nvgpu binfmt_misc nfsd nfs_acl ip_tables x_tables
[ 537.544264] CPU: 5 PID: 12993 Comm: v4l2-ctl Tainted: G W O 4.9.140+ #1
[ 537.544816] Hardware name: jetson-xavier (DT)
[ 537.545181] task: ffffffc1719e0e00 task.stack: ffffffc17404c000
[ 537.545630] PC is at exit_creds+0x2c/0x80
[ 537.545934] LR is at __put_task_struct+0x4c/0x148
[ 537.549599] pc : [] lr : [] pstate: 60400045
[ 537.557205] sp : ffffffc17404fa10
[ 537.560621] x29: ffffffc17404fa10 x28: ffffffc1719e0e00
[ 537.566214] x27: 0000000000000009 x26: ffffffc32ddb1dc0
[ 537.571557] x25: ffffffc182125810 x24: 0000000000000001
[ 537.576895] x23: ffffffc3e91e4048 x22: ffffffc3ecfa2b58
[ 537.582233] x21: ffffffc32620e230 x20: 0000000000000000
[ 537.587400] x19: ffffffc32620e200 x18: 0000007f922eda70
[ 537.593091] x17: 0000007f92263f90 x16: ffffffbf0bf27de0
[ 537.599116] x15: 000000000000000a x14: ffffffbf00000000
[ 537.604549] x13: 00000000000001a2 x12: 0000000000000fa0
[ 537.610060] x11: 000000000000000b x10: 0000000000000a20
[ 537.615857] x9 : ffffffc17404f850 x8 : ffffffc1719e1880
[ 537.621867] x7 : fefefeff646c606d x6 : 000000000b44fc4a
[ 537.627121] x5 : 0000000000000400 x4 : 0000000000000000
[ 537.632714] x3 : 00000000000000be x2 : 0000000000000000
[ 537.637798] x1 : 0000000000000000 x0 : 00000000ffffffff

[ 537.644814] Process v4l2-ctl (pid: 12993, stack limit = 0xffffffc17404c000)
[ 537.651441] Call trace:
[ 537.653812] [] exit_creds+0x2c/0x80
[ 537.658368] [] __put_task_struct+0x4c/0x148
[ 537.663700] [] kthread_stop+0x1e0/0x1e8
[ 537.669036] [] vi5_channel_stop_kthreads+0x40/0x58
[ 537.675156] [] vi5_channel_stop_streaming+0x48/0xa8
[ 537.681303] [] tegra_channel_stop_streaming+0x34/0x48
[ 537.687239] [] __vb2_queue_cancel+0x3c/0x170
[ 537.693095] [] vb2_core_queue_release+0x2c/0x58
[ 537.698957] [] _vb2_fop_release+0x88/0xa8
[ 537.704292] [] tegra_channel_close+0x58/0x128
[ 537.709987] [] v4l2_release+0x4c/0x98
[ 537.714890] [] __fput+0x94/0x1d0
[ 537.719954] [] ____fput+0x20/0x30
[ 537.724771] [] task_work_run+0xc0/0xe0
[ 537.730024] [] do_exit+0x2b8/0x9d8
[ 537.735177] [] do_group_exit+0x3c/0xa0
[ 537.740517] [] get_signal+0x29c/0x590
[ 537.745723] [] do_signal+0x168/0x4e0
[ 537.750694] [] do_notify_resume+0x90/0xb0
[ 537.756270] [] work_pending+0x8/0x10
[ 537.761611] ---[ end trace 867157b046591f9a ]---
[ 537.774769] Fixing recursive fault but reboot is needed!

At this point, the v4l2-ctl program cannot be killed with either “ctrl+c” or “kill -9 pid”.

root@xavier[1]:~$ v4l2-ctl --set-fmt-video=width=1280,height=768,pixelformat=RG10 --set-ctrl bypass_mode=0 --stream-mmap --stream-count=1000 -d /dev/video0
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< 40.19 fps
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< 40.09 fps
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< 40.06 fps
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< 40.04 fps
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< 40.03 fps
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< 40.03 fps
<^C^C^C^C^C

root@xavier[1]:~$ ps -e | grep v4l2
12993 pts/4 00:00:00 v4l2-ctl
root@xavier[1]:~$
root@xavier[1]:~$ kill -9 12993
root@xavier[1]:~$
root@xavier[1]:~$ ps -e | grep v4l2
12993 pts/4 00:00:00 v4l2-ctl
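
A process that survives “kill -9” is usually stuck inside the kernel in uninterruptible sleep (state D), where signals are not acted on. The “Fixing recursive fault but reboot is needed!” line above is consistent with that, since the exiting task is parked by the kernel after the oops. As a minimal sketch for checking this, the following reads the state letter from /proc/<pid>/stat. The helper names are hypothetical, and the stat line in the usage comment is an illustrative fragment, not captured from the board.

```python
from pathlib import Path


def parse_proc_stat_state(stat_line):
    """Return (comm, state) from the contents of /proc/<pid>/stat.

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so locate the last ')' instead of splitting
    the whole line on whitespace.
    """
    start = stat_line.index("(")
    end = stat_line.rindex(")")
    comm = stat_line[start + 1:end]
    state = stat_line[end + 1:].split()[0]
    return comm, state


def process_state(pid):
    """Read the one-letter state of a live process (Linux only)."""
    text = Path(f"/proc/{pid}/stat").read_text()
    return parse_proc_stat_state(text)[1]


# Illustrative usage with a made-up stat fragment:
#   parse_proc_stat_state("12993 (v4l2-ctl) D 1 12993 12993 0")
# On the target, process_state(12993) would read the real value.
```

If the state is D, the only practical way to get rid of the process is the reboot that the kernel message asks for.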

I think that whatever happens to the camera, the kernel should not oops and the program should not become unkillable.

hello lisubin328719,

The failures below indicate that the VI5 engine did not receive the sensor stream and reported a failure.

[ 534.490310] tegra194-vi5 15c10000.vi: no reply from camera processor
[ 534.490458] tegra194-vi5 15c10000.vi: uncorr_err: request timed out after 2500 ms
[ 534.490621] tegra194-vi5 15c10000.vi: err_rec: attempting to reset the capture channel
[ 534.493051] tegra194-vi5 15c10000.vi: err_rec: successfully reset the capture channel
[ 537.046313] tegra194-vi5 15c10000.vi: no reply from camera processor
[ 537.046454] tegra194-vi5 15c10000.vi: uncorr_err: request timed out after 2500 ms
[ 537.046585] tegra194-vi5 15c10000.vi: err_rec: attempting to reset the capture channel
[ 537.046721] tegra194-vi5 15c10000.vi: unexpected response from camera processor
[ 537.046841] video4linux video0: vi capture release failed
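
To make the sequence in a log like this easier to follow, here is a rough sketch that counts each kind of VI message and records when it first appears. The marker strings are copied from the messages quoted in this thread; the function and dictionary names are hypothetical, and this is not an NVIDIA tool.

```python
import re

# Matches lines such as "[ 534.490310] tegra194-vi5 15c10000.vi: message"
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")

# Substrings taken from the dmesg output quoted in this thread.
MARKERS = {
    "timeout": "uncorr_err: request timed out",
    "reset_attempt": "err_rec: attempting to reset the capture channel",
    "reset_ok": "err_rec: successfully reset the capture channel",
    "release_failed": "vi capture release failed",
    "recovery_failed": "fatal: error recovery failed",
}


def summarize_vi_errors(dmesg_text):
    """Count each marker and record the timestamp of its first occurrence."""
    summary = {name: {"count": 0, "first": None} for name in MARKERS}
    for line in dmesg_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip lines without a "[ seconds ]" prefix
        timestamp, message = float(match.group(1)), match.group(2)
        for name, needle in MARKERS.items():
            if needle in message:
                entry = summary[name]
                entry["count"] += 1
                if entry["first"] is None:
                    entry["first"] = timestamp
    return summary
```

Two reset attempts with only one successful reset, as in the log above, means the second recovery did not complete.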

Since you could launch the camera for streaming at the beginning, this looks like a kernel driver stability issue.

  1. Could you confirm that your sensor frame-rate configuration is 40 fps?
  2. Could you please try increasing the clock frequency with jetson_clock for verification?
  3. Since JetPack-4.2.2 has already been released, I suggest moving to the latest release for confirmation.
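
For the frame-rate question, the rate actually delivered can be cross-checked from the “fps” lines that v4l2-ctl prints while streaming. A small sketch, assuming the output format seen earlier in this thread (a run of “<” characters followed by a number and “fps”); the function names are hypothetical.

```python
import re

# Matches the trailing "40.19 fps" part of a v4l2-ctl streaming line.
FPS_RE = re.compile(r"(\d+(?:\.\d+)?)\s*fps")


def fps_readings(output):
    """Return every fps value found in the captured v4l2-ctl output."""
    return [float(value) for value in FPS_RE.findall(output)]


def mean_fps(output):
    """Average the fps readings, or return None if there are none."""
    readings = fps_readings(output)
    if not readings:
        return None
    return sum(readings) / len(readings)
```

This only measures what the capture path delivered; it does not replace checking the sensor mode settings themselves.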

Hi JerryChang, thank you for your reply.

I can understand these errors, but I can’t understand the oops or why v4l2-ctl can’t be killed.
Sensor streaming may stop for various reasons, but the driver should not oops.

1. The sensor frame-rate configuration is 40 fps.
2. jetson_clock has already been run; the board is in “NV Power Mode: MAXN”.
3. I have already tried JetPack-4.2.1, and the result is the same. Migrating our changes to a new version is a lot of work. If JetPack-4.2.2 fixes this problem, can you tell me which commit does it?

hello lisubin328719,

Please try applying the two patches in the attachment, devtalk1062176_Aug30.tar.gz.
You may also refer to the Kernel Customization chapter to compile the VI drivers.
After that, please refer to Flashing a Specific Partition to update your kernel partition.
Thanks
devtalk1062176_Aug30.tar.gz (2.04 KB)

Hi JerryChang, thank you for your reply. I will try the attachment.

hello lisubin328719,

The tarball was only just uploaded; please refresh the forum page and check my comments.
Thanks

Hi JerryChang,

Can you send me your vi5_fops.c? I don’t know which version the patch is based on.
I find that it differs from the file in both R32.1 and R32.2.

hello lisubin328719,

You may base your work on l4t-r32.2 and apply the patches for verification.
Thanks