Hello,
While testing some abnormal conditions, I found that the camera software causes a kernel panic, or causes the system to hang and then reboot.
The steps to reproduce are as follows:
Number of connected cameras: 8
(the more cameras, the easier it is to hang the system)
1. Connect enough cameras and open the corresponding video devices.
2. Unplug the cameras. The more cameras you unplug, the more likely the system is to hang.
[ 135.908070] CPU: 2 PID: 3167 Comm: vi-output, isx0 Tainted: G O 5.15.136-voyager-1.0-2024-12-23-02-58-06-tegra #1
[ 135.908073] Hardware name: NVIDIA NVIDIA Jetson AGX Orin Developer Kit/Jetson, BIOS 36.3.0-gcid-36191598 05/06/2024
[ 135.908074] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 135.908077] pc : tegra_capture_ivc_notify_chan_id+0x1a0/0x1c0 [capture_ivc]
[ 135.908080] lr : tegra_capture_ivc_notify_chan_id+0x1a0/0x1c0 [capture_ivc]
[ 135.908083] sp : ffff80000da3bb10
[ 135.908084] x29: ffff80000da3bb10 x28: 0000000000000000 x27: 0000000000000000
[ 135.908088] x26: 0000000000000000 x25: 0000000000001043 x24: ffff0000a50c6088
[ 135.908092] x23: 000000000000008c x22: ffff0000a50c7200 x21: ffff0000a50c6080
[ 135.908096] x20: 0000000000000000 x19: 000000000000008c x18: ffffffffffffffff
[ 135.908099] x17: 0000000000000000 x16: ffffdc628be9ed58 x15: ffffdc628dc2f915
[ 135.908103] x14: ffffffffffffffff x13: ffffdc628dc2f90b x12: 6e6f63206c656e6e
[ 135.908107] x11: 656820747563205b x10: ffffdc628d8060a8 x9 : ffff80000da3bb10
[ 135.908110] x8 : ffff80000da3bb10 x7 : 7973756220736920 x6 : 7973756220736920
[ 135.908114] x5 : ffff0007a8be7a10 x4 : 00000000fffff4c2 x3 : ffffdc628d84b028
[ 135.908117] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000090d20000
[ 135.908120] Call trace:
[ 135.908122] tegra_capture_ivc_notify_chan_id+0x1a0/0x1c0 [capture_ivc]
[ 135.908125] vi_capture_setup+0x36c/0x7e0 [tegra_camera]
[ 135.908143] tegra_channel_capture_setup+0x10c/0x2c0 [tegra_camera]
[ 135.908157] vi5_channel_error_recover+0x128/0x1f0 [tegra_camera]
[ 135.908169] tegra_channel_error_recover+0x80/0xa0 [tegra_camera]
[ 135.908182] tegra_channel_kthread_capture_dequeue+0x134/0x1e0 [tegra_camera]
[ 135.908194] kthread+0x184/0x190
[ 135.908200] ret_from_fork+0x10/0x20
[ 135.908205] ---[ end trace b66752dd077c7039 ]---
[ 136.009532] tegra194-vi5 13e00000.host1x:vi0@15c00000: failed to update control callback
[ 142.349451] BUG: Bad page state in process https pfn:12d49d
[ 142.349460] page:00000000a6274c7b refcount:0 mapcount:0 mapping:00000000234add85 index:0x0 pfn:0x12d49d
[ 142.349469] failed to read mapping contents, not a valid kernel address?
[ 142.349470] flags: 0xbfffc0000000000(node=0|zone=2|lastcpupid=0xffff)
[ 142.349476] raw: 0bfffc0000000000 dead000000000100 dead000000000122 dead000000000400
[ 142.349478] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
[ 142.349478] page dumped because: non-NULL mapping
[ 142.349479] Modules linked in: bridge stp llc nvidia_modeset(O) lzo_rle lzo_compress zram zsmalloc brcm_xgbe tegra_pcie_dma_test(O) tegra_pcie_edma(O) ramoops reed_solomon rfkill libcomposite snd_soc_tegra210_admaif(O) snd_soc_tegra186_asrc(O) snd_soc_tegra_pcm snd_soc_tegra210_afc(O) snd_soc_tegra210_mvc(O) snd_soc_tegra210_mixer(O) snd_soc_tegra210_ope(O) snd_soc_tegra186_dspk(O) snd_soc_tegra186_arad(O) snd_soc_tegra210_adx(O) snd_soc_tegra210_dmic(O) snd_soc_tegra210_amx(O) snd_soc_tegra210_sfc(O) snd_soc_tegra210_i2s(O) nvidia_p2p(O) nvgpu(O) snd_soc_tegra210_ahub(O) tegra210_adma spidev ucsi_ccg iray_horus640(O) zh_ISX031(O) nvethernet(O) typec_ucsi sunny_ov8m(O) max96717f(O) tegra_cactmon_mc_all(O) typec snd_hda_codec_hdmi mttcan(O) max20087(O) max9295(O) ina3221 nvpps(O) can_dev snd_hda_tegra tegra234_aon(O) pwm_tegra_tachometer(O) snd_hda_codec tegra_aconnect tegra_xudc phy_tegra194_p2u max96712(O) snd_hda_core snd_soc_rt5640 at24 snd_soc_rl6231 pcie_tegra194 spi_tegra114
[ 142.349538] host1x_fence(O) pwm_tegra mc_hwpm(O) nvhost_isp5(O) nvhost_nvcsi_t194(O) nvhost_vi5(O) tegra_camera(O) governor_userspace tegra_drm(O) v4l2_dv_timings nvhost_nvcsi(O) capture_ivc(O) v4l2_fwnode v4l2_async videobuf2_dma_contig videobuf2_memops videobuf2_v4l2 nvidia(O) videobuf2_common videodev snd_soc_tegra_machine_driver(O) tegra_wmark(O) nvhost_pva(O) snd_soc_tegra_utils(O) cec nvvrs_pseq_rtc(O) tegra_camera_rtcpu(O) nvhost_nvdla(O) snd_soc_simple_card_utils drm_kms_helper mc crct10dif_ce ivc_bus(O) tegra23x_perf_uncore(O) tegra_dce(O) tegra_camera_platform(O) tegra234_oc_event(O) nvpmodel_clk_cap(O) hsp_mailbox_client(O) pwm_fan nvhwpm(O) tegra_se(O) tegra_mce(O) ivc_ext(O) nvidia_vrs_pseq(O) tegra_bpmp_thermal lm90 i2c_nvvrs11(O) thermal_trip_event(O) nvhost_capture(O) crypto_engine host1x_nvhost(O) tsecriscv(O) governor_pod_scaling(O) host1x(O) drm mc_utils(O) nvmap(O) nvsciipc(O) fuse ip_tables x_tables ipv6
[ 475.284566] nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c67e:2:0:0x00000057
[ 475.385147] NVRM gpumgrGetSomeGpu: Failed to retrieve pGpu - Too early call!.
[ 475.385154] NVRM nvAssertFailedNoLog: Assertion failed: NV_FALSE @ gpu_mgr.c:303
[ 475.385166] nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c67e:2:0:0x00000057
[ 475.452528] NVRM gpumgrGetSomeGpu: Failed to retrieve pGpu - Too early call!.
[ 475.452544] NVRM nvAssertFailedNoLog: Assertion failed: NV_FALSE @ gpu_mgr.c:303
[ 476.273960] NVRM gpumgrGetSomeGpu: Failed to retrieve pGpu - Too early call!.
[ 476.273968] NVRM nvAssertFailedNoLog: Assertion failed: NV_FALSE @ gpu_mgr.c:303
[ 476.273983] nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c67e:2:0:0x00000057
[ 476.292554] NVRM gpumgrGetSomeGpu: Failed to retrieve pGpu - Too early call!.
[ 476.292559] NVRM nvAssertFailedNoLog: Assertion failed: NV_FALSE @ gpu_mgr.c:303
[ 476.292566] nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c67e:2:0:0x00000057
[ 476.375846] NVRM gpumgrGetSomeGpu: Failed to retrieve pGpu - Too early call!.
[ 476.375849] NVRM nvAssertFailedNoLog: Assertion failed: NV_FALSE @ gpu_mgr.c:303
[ 476.375858] nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c67e:2:0:0x00000057
[ 476.476519] NVRM gpumgrGetSomeGpu: Failed to retrieve pGpu - Too early call!.
[ 476.476533] NVRM nvAssertFailedNoLog: Assertion failed: NV_FALSE @ gpu_mgr.c:303
Full log:
dmesg.txt (16.2 MB)
Although this is an abnormal test, in actual use, we cannot guarantee that the camera itself or the camera connection cable will always be free of problems.
Hi,
For basic camera functionality, first check the device and driver configuration.
You can refer to the programming guide below for detailed information on the device tree and driver implementation.
https://docs.nvidia.com/jetson/archives/r36.3/DeveloperGuide/SD/CameraDevelopment/SensorSoftwareDriverProgramming.html?highlight=programing#sensor-software-driver-programming
Please refer to "Applications Using V4L2 IOCTL Directly" and use V4L2 IOCTLs to verify basic camera functionality.
https://docs.nvidia.com/jetson/archives/r36.3/DeveloperGuide/SD/CameraDevelopment/SensorSoftwareDriverProgramming.html?highlight=programing#to-run-a-v4l2-ctl-test
Once you have confirmed the configuration, if it still fails, the link below explains how to collect logs and gives some debugging tips.
https://elinux.org/Jetson/l4t/Camera_BringUp#Steps_to_enable_more_debug_messages
Thanks!
Software version: JetPack 6.0
hello MushroomHunter,
may I also know what your test pipeline is? does it always report a kernel panic while VI is attempting error recovery?
besides, is it possible to move to the latest JP-6 public release for verification?
The test method is to unplug the camera while it is running normally.
A kernel panic reliably occurs.
As for moving to a newer software version, the migration cost is relatively high for us, and testing may take some time.
just an FYI,
we did resolve a kernel panic that occurred when error recovery failed on the JP-5 public release.
please also see Topic 305007 for reference.
hello MushroomHunter,
we tested this before and could not reproduce the kernel panic on r36.3 + Orin-Nano + IMX477,
although we were using software-simulated commands to force-stop the camera stream.
could you please double-check the L4T release tag?
$ cat /etc/nv_tegra_release
As I mentioned in my post, you need to unplug multiple cameras at the same time to reproduce this problem. With a single camera the probability of reproducing it is relatively low, but it can still be reproduced after multiple attempts.
I have attached my log in the post, you can take a look.
Does this patch also work for JP6?
hello MushroomHunter,
we’re using a software simulation approach since hardware plug/unplug might cause hardware damage.
may I know what the single-camera (or dual-camera) failure rate is? we’ll set up an environment to try reproducing the issue again.
we did not apply that patch for testing; however, you may give it a try since AGX Orin is still using the VI-5 drivers.
may I also know what your software capture pipeline is? for instance, are you using v4l2 IOCTL?
You are right, I use V4L2 ioctls. If I unplug 4 cameras at the same time, the problem can be reproduced within 1 minute.
Moreover, I found that most of the fatal panics look like this:
[ 354.714399] mc_utils(O) nvmap(O) nvsciipc(O) drm fuse ip_tables x_tables ipv6 pwm_fan(E) pwm_tegra(E) tegra_bpmp_thermal(E)
[ 354.714409] CPU: 3 PID: 3493 Comm: vi-output, isx0 Tainted: G W OE 5.15.136-tegra #1
[ 354.714412] Hardware name: NVIDIA NVIDIA Jetson AGX Orin Developer Kit/Jetson, BIOS 36.3.0-gcid-36191598 05/06/2024
[ 354.714414] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 354.714414] tegra-camrtc-capture-vi tegra-capture-vi: uncorr_err: request timed out after 2500 ms
[ 354.714416] pc : __vb2_buf_mem_finish+0x54/0x90 [videobuf2_common]
[ 354.714425] tegra-camrtc-capture-vi tegra-capture-vi: err_rec: attempting to reset the capture channel
[ 354.714426] lr : __vb2_buf_mem_finish+0x20/0x90 [videobuf2_common]
[ 354.714430] sp : ffff8000192ebcd0
[ 354.714430] x29: ffff8000192ebcd0 x28: ffff000088e0f080 x27: 0000000000000000
[ 354.714433] x26: 0000000000000000 x25: ffff000088e0f880 x24: ffff00011b088000
[ 354.714434] x23: ffff000088e0f880
[ 354.714434] tegra-camrtc-capture-vi tegra-capture-vi: uncorr_err: request timed out after 2500 ms
[ 354.714435] x22: 0000000000000180 x21: 0000000000000030
[ 354.714437] x20: 0000000000000001 x19: ffff0000ab2e2800 x18: ffffffffffffffff
[ 354.714439] x17: 0000000000000000 x16: 0000000000000000 x15: ffffbf7fa934f087
[ 354.714440] x14: ffffffffffffffff x13: ffffbf7fa934f07e x12: 756f2064656d6974
[ 354.714442] x11: 2074736575716572 x10: 203a7272655f7272 x9 : 00000000000009c4
[ 354.714444] x8 : 000000000a736d20
[ 354.714444] tegra-camrtc-capture-vi tegra-capture-vi: err_rec: attempting to reset the capture channel
[ 354.714445] x7 : 75716572203a7272 x6 : 000000000a736d20
[ 354.714446] x5 : 0000000000000000 x4 : ffff0007a8c039f0 x3 : 0000000000000001
[ 354.714448] x2 : ffffbf7fa6b81e58 x1 : ffff0000ab2e2800 x0 : 0000000000000000
[ 354.714450] Call trace:
[ 354.714451] __vb2_buf_mem_finish+0x54/0x90 [videobuf2_common]
[ 354.714455] vb2_buffer_done+0xb8/0x280 [videobuf2_common]
[ 354.714459] vi5_capture_dequeue+0x1a4/0x3a0 [tegra_camera]
[ 354.714476] tegra_channel_kthread_capture_dequeue+0xb0/0x1e0 [tegra_camera]
[ 354.714486] kthread+0x184/0x190
[ 354.714493] ret_from_fork+0x10/0x20
A smaller number are null pointer dereferences like this:
[ 1455.096913] Modules linked in: zh_ISX031(O) sunny_ov8m(O) max9295(O) ntb_transport_voyager nvidia_modeset(O) ntb_hw_voyager ntb ntb_dma_voyager pcie_switch_voyager lzo_rle lzo_compress zram zsmalloc brcm_xgbe tegra_pcie_dma_test(O) tegra_pcie_edma(O) snd_soc_tegra186_asrc(O) snd_soc_tegra210_mixer(O) snd_soc_tegra210_admaif(O) snd_soc_tegra210_afc(O) snd_soc_tegra186_arad(O) snd_soc_tegra_pcm snd_soc_tegra210_ope(O) snd_soc_tegra210_mvc(O) snd_soc_tegra186_dspk(O) snd_soc_tegra210_dmic(O) snd_soc_tegra210_adx(O) snd_soc_tegra210_amx(O) snd_soc_tegra210_i2s(O) snd_soc_tegra210_sfc(O) bridge stp llc usb_f_ncm usb_f_mass_storage ramoops reed_solomon snd_soc_tegra210_ahub(O) tegra210_adma spidev usb_f_acm u_serial usb_f_rndis nvvrs_pseq_rtc(O) u_ether ax88179_178a joydev usbnet snd_hda_codec_hdmi mttcan(O) tegra_cactmon_mc_all(O) can_dev tegra_aconnect tegra234_aon(O) snd_hda_tegra snd_hda_codec pwm_tegra_tachometer(O) at24 tegra_xudc phy_tegra194_p2u iray_horus640(O) snd_hda_core rfkill
[ 1455.096967] max96717f(O) snd_soc_rt5640 nvethernet(O) snd_soc_rl6231 max20087(O) snd_soc_tegra_machine_driver(O) spi_tegra114 nvpps(O) snd_soc_tegra_utils(O) mc_hwpm(O) host1x_fence(O) pcie_tegra194 max96712(O) snd_soc_simple_card_utils crct10dif_ce tegra23x_perf_uncore(O) gpio_didi_hardisk tegra234_oc_event(O) nvidia(O) nvpmodel_clk_cap(O) tegra_mce(O) thermal_trip_event(O) i2c_nvvrs11(O) nvhost_isp5(O) nvhost_vi5(O) tegra_dce(O) nvhost_nvcsi_t194(O) lm90 nvidia_vrs_pseq(O) libcomposite tegra_camera(OE) v4l2_dv_timings nvhost_nvcsi(O) tegra_camera_platform(O) capture_ivc(O) governor_userspace tegra_camera_rtcpu(O) ivc_bus(O) hsp_mailbox_client(O) ivc_ext(O) v4l2_fwnode tegra_drm(O) v4l2_async nvhost_pva(O) videobuf2_dma_contig nvhost_nvdla(O) tegra_wmark(O) videobuf2_memops videobuf2_v4l2 tsecriscv(O) videobuf2_common nvhwpm(O) nvhost_capture(O) videodev cec tegra_se(O) mc drm_kms_helper host1x_nvhost(O) crypto_engine nvidia_p2p(O) ina3221 nvgpu(O) governor_pod_scaling(O) host1x(O)
[ 1455.097016] mc_utils(O) nvmap(O) nvsciipc(O) drm fuse ip_tables x_tables ipv6 pwm_fan(E) pwm_tegra(E) tegra_bpmp_thermal(E) [last unloaded: zh_ISX031]
[ 1455.097027] CPU: 0 PID: 3782 Comm: vi-output, isx0 Tainted: G B W OE 5.15.136-tegra #1
[ 1455.097030] Hardware name: NVIDIA NVIDIA Jetson AGX Orin Developer Kit/Jetson, BIOS 36.3.0-gcid-36191598 05/06/2024
[ 1455.097032] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 1455.097035] pc : vi_channel_close_ex+0x28/0x90 [tegra_camera]
[ 1455.097065] lr : vi_channel_close_ex+0x28/0x90 [tegra_camera]
[ 1455.097076] sp : ffff8000192dbd10
[ 1455.097077] x29: ffff8000192dbd10 x28: ffff0000897ae880 x27: 0000000000000000
[ 1455.097080] x26: ffff0000897ae908 x25: ffff0000897ae570 x24: ffff0000897aeac0
[ 1455.097082] x23: ffff00008905c0a0 x22: 0000000000000000 x21: 0000000000000000
[ 1455.097083] x20: 0000000000000000 x19: 0000000000000000 x18: 0000000000000000
[ 1455.097085] x17: 0000000000000000 x16: ffffdf1e45431e20 x15: 0000ffffb8bad3d8
[ 1455.097087] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
[ 1455.097089] x11: 0000000000000000 x10: d08a5513eab75b35 x9 : e2d39499c57d9207
[ 1455.097091] x8 : ffff0007a8bc1d10 x7 : fffffc00061e5c08 x6 : ffff20e9619d0000
[ 1455.097093] x5 : 718582f1eadf490a x4 : fffffc00061e71a0 x3 : 00000000ffffffff
[ 1455.097095] x2 : ffffdf1e45431e58 x1 : 0000000000000000 x0 : ffffdf1e45431e58
[ 1455.097097] Call trace:
[ 1455.097099] vi_channel_close_ex+0x28/0x90 [tegra_camera]
[ 1455.097110] vi5_channel_error_recover+0x68/0x1f0 [tegra_camera]
[ 1455.097121] tegra_channel_error_recover+0x80/0xa0 [tegra_camera]
[ 1455.097132] tegra_channel_kthread_capture_dequeue+0x288/0x330 [tegra_camera]
[ 1455.097143] kthread+0x184/0x190
[ 1455.097149] ret_from_fork+0x10/0x20
[ 1455.097155] Code: d50320ff 2a0003f4 aa1e03e0 940013fe (f9400e76)
[ 1455.097159] ---[ end trace 47f9d2630b37e683 ]---
It looks like the JP5 patch is already included in JP6.
hello MushroomHunter,
I see, there are several error recovery attempts before the kernel panic, right?
is this failure only reproduced when the camera is kept disconnected?
please give it another try by setting the timeout through the v4l2 control.
for instance, you may configure it as a negative value (-1) to test with an infinite timeout.
$ v4l2-ctl -d /dev/video0 --all
Camera Controls
override_capture_timeout_ms 0x009a206f (int) : min=-1 max=2147483647 step=1 default=2500 value=2500
Hello, we have tried setting it to -1, but there is a problem: when the camera has no stream, the application cannot exit normally.
I have also analyzed the reason. The wait is issued as
err = vi_capture_status(chan->tegra_vi_channel[vi_port], CAPTURE_TIMEOUT_MS);
and since CAPTURE_TIMEOUT_MS is -1, kthread_stop(chan->kthread_capture_dequeue) cannot return.
The close path is:
close(fd) (application) -> v4l2_release -> tegra_channel_close -> tegra_channel_stop_streaming -> vi5_channel_stop_streaming -> vi5_channel_stop_kthreads -> kthread_stop(chan->kthread_capture_dequeue)
The kthread_capture_dequeue thread stays blocked in vi5_capture_dequeue, so it never exits normally:
while (1) {
	while (!(kthread_should_stop() || list_empty(&chan->dequeue)
		|| (chan->capture_state == CAPTURE_ERROR))) {
		buf = dequeue_dequeue_buffer(chan);
		if (!buf)
			break;
		vi5_capture_dequeue(chan, buf);
	}
}
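The blocking behaviour described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the driver code: time is simulated, wait_for_frame and dequeue_loop_exit_time are hypothetical names, and the camera is assumed to deliver no frames at all, so every wait ends only when its timeout expires.

```c
#include <assert.h>

/*
 * Models the blocking status wait on a camera with no stream: no frame ever
 * arrives, so the call returns only when the timeout expires. A negative
 * timeout means "wait forever", modelled as blocking until the simulation
 * ends at end_ms. Returns the simulated time at which the wait returns.
 */
static long wait_for_frame(long now_ms, long timeout_ms, long end_ms)
{
    if (timeout_ms < 0 || now_ms + timeout_ms > end_ms)
        return end_ms;
    return now_ms + timeout_ms;
}

/*
 * Returns the simulated time at which the loop notices the stop request
 * issued at stop_at_ms, or -1 if it has not noticed by end_ms.
 */
long dequeue_loop_exit_time(long timeout_ms, long stop_at_ms, long end_ms)
{
    long now_ms = 0;

    assert(timeout_ms != 0);  /* a zero timeout would never advance the clock */

    while (now_ms < end_ms) {
        if (now_ms >= stop_at_ms)   /* stands in for kthread_should_stop() */
            return now_ms;
        now_ms = wait_for_frame(now_ms, timeout_ms, end_ms);
    }
    return -1;
}
```

With a 2500 ms timeout and a stop request at 1000 ms, the model notices the request at 2500 ms, when the pending wait expires. With a timeout of -1 it returns -1, because the loop never gets back to the stop check.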
The patch in this thread fixes this problem.
https://forums.developer.nvidia.com/t/orin-vi5-channel-stop-kernel-panic-on-jetpack5-1-1/258971/7
However, after running for 6 hours, the following error is printed:
[21111.532847] tegra_channel_kthread_capture_enqueue:[vi-output, isx031 7-0030]->[21111008285070] - [21108455357441] = 2552927629 ns
[21111.532909] tegra194-vi5 13e00000.host1x:vi0@15c00000: vi_channel_close_ex,240
[21111.532914] tegra194-vi5 13e00000.host1x:vi0@15c00000: vi_channel_close_ex,242
[21111.532915] tegra194-vi5 13e00000.host1x:vi0@15c00000: vi_channel_close_ex,245
[21111.532917] tegra194-vi5 13e00000.host1x:vi0@15c00000: vi_channel_close_ex,248
[21111.532918] tegra194-vi5 13e00000.host1x:vi0@15c00000: vi_channel_close_ex,251
[21111.532919] tegra194-vi5 13e00000.host1x:vi0@15c00000: vi_channel_close_ex,254
[21111.532920] tegra194-vi5 13e00000.host1x:vi0@15c00000: vi_channel_close_ex,257
[21111.538410] tegra-camrtc-capture-vi tegra-capture-vi: dma_alloc_coherent failed
[21111.538415] tegra-camrtc-capture-vi tegra-capture-vi: fatal: error recovery failed
I looked at the code and found that vi5_channel_error_recover never releases chan->request[vi_port], which causes a memory leak.
vi5_channel_error_recover calls tegra_channel_capture_setup, which allocates the buffer again, but it has no matching release. The corresponding release does exist in vi5_channel_stop_streaming:
dma_free_coherent(chan->tegra_vi_channel[vi_port]->rtcpu_dev,
	chan->capture_queue_depth * sizeof(struct capture_descriptor),
	chan->request[vi_port], chan->request_iova[vi_port]);
So is this a code logic bug?
Also, will the patches in this post be reflected in new versions later?
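The leak described above can be modelled in user space by counting buffers instead of calling the DMA API. This is a sketch under assumptions, not the driver code: channel_model, model_setup, model_release and live_after_recoveries are hypothetical names, and each recovery is assumed to run the setup path exactly once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* User-space model: counts live buffers instead of calling the DMA API. */
struct channel_model {
    bool has_request;   /* stands in for chan->request[vi_port] != NULL */
    size_t live;        /* buffers allocated and not yet freed */
};

/* Stands in for tegra_channel_capture_setup(): always allocates a new
 * buffer and overwrites whatever pointer was stored before. */
static void model_setup(struct channel_model *c)
{
    c->live++;
    c->has_request = true;
}

/* Stands in for the dma_free_coherent() call in the stop-streaming path. */
static void model_release(struct channel_model *c)
{
    if (c->has_request) {
        assert(c->live > 0);
        c->live--;
        c->has_request = false;
    }
}

/* Runs `cycles` error recoveries and returns how many buffers are live. */
size_t live_after_recoveries(int cycles, bool release_before_setup)
{
    struct channel_model c = { false, 0 };

    model_setup(&c);                 /* initial start of streaming */
    for (int i = 0; i < cycles; i++) {
        if (release_before_setup)
            model_release(&c);
        model_setup(&c);             /* recovery sets the channel up again */
    }
    return c.live;
}
```

In this model, 100 recoveries without a release leave 101 buffers live, while releasing before each setup keeps the count at 1. That matches the pattern of dma_alloc_coherent eventually failing after many recoveries.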
hello mmcly,
would you please update the kernel image with the patch below for capture-ivc.c?
for example,
diff --git a/drivers/platform/tegra/rtcpu/capture-ivc-priv.h b/drivers/platform/tegra/rtcpu/capture-ivc-priv.h
index 04c95aef3..f6cb92497 100644
--- a/drivers/platform/tegra/rtcpu/capture-ivc-priv.h
+++ b/drivers/platform/tegra/rtcpu/capture-ivc-priv.h
@@ -37,6 +37,7 @@ struct tegra_capture_ivc_cb_ctx {
tegra_capture_ivc_cb_func cb_func;
/** Private context of a VI/ISP capt…
For the memory leak problem, I made a patch:
diff --git a/Linux_for_Tegra/source/nvidia-oot/drivers/media/platform/tegra/camera/vi/vi5_fops.c b/Linux_for_Tegra/source/nvidia-oot/drivers/media/platform/tegra/camera/vi/vi5_fops.c
index 1fadda429..12136bf1f 100644
--- a/Linux_for_Tegra/source/nvidia-oot/drivers/media/platform/tegra/camera/vi/vi5_fops.c
+++ b/Linux_for_Tegra/source/nvidia-oot/drivers/media/platform/tegra/camera/vi/vi5_fops.c
@@ -320,7 +320,7 @@ static int tegra_channel_capture_setup(struct tegra_channel *chan, unsigned int
setup.queue_depth * setup.request_size,
&setup.iova, GFP_KERNEL);
chan->request_iova[vi_port] = setup.iova;
-
+ dev_err(chan->vi->dev, "dma_addr:[%pad] dma_length:[%d]\n", &setup.iova, setup.queue_depth * setup.request_size);
if (chan->request[vi_port] == NULL) {
dev_err(chan->vi->dev, "dma_alloc_coherent failed\n");
return -ENOMEM;
@@ -575,7 +575,8 @@ uncorr_err:
rel_buf:
vi5_release_buffer(chan, buf);
}
-
+static void vi5_unit_get_device_handle(struct platform_device *pdev,
+ uint32_t csi_stream_id, struct device **dev);
static int vi5_channel_error_recover(struct tegra_channel *chan,
bool queue_error)
{
@@ -593,6 +594,23 @@ static int vi5_channel_error_recover(struct tegra_channel *chan,
dev_err(&chan->video->dev, "vi capture release failed\n");
goto done;
}
+ /* Release capture requests */
+ if (chan->request[vi_port] != NULL) {
+ dma_free_coherent(chan->tegra_vi_channel[vi_port]->rtcpu_dev,
+ chan->capture_queue_depth * sizeof(struct capture_descriptor),
+ chan->request[vi_port], chan->request_iova[vi_port]);
+ }
+ chan->request[vi_port] = NULL;
+
+ /* Release emd data buffers */
+ if (chan->emb_buf_size > 0) {
+ struct device *vi_unit_dev;
+ vi5_unit_get_device_handle(chan->vi->ndev, chan->port[0],\
+ &vi_unit_dev);
+ dma_free_coherent(vi_unit_dev, chan->emb_buf_size,
+ chan->emb_buf_addr, chan->emb_buf);
+ chan->emb_buf_size = 0;
+ }
vi_channel_close_ex(chan->vi_channel_id[vi_port],
chan->tegra_vi_channel[vi_port]);
chan->tegra_vi_channel[vi_port] = NULL;
After a short test, no memory leak has appeared so far, but I still need to run a long-term test.
hello MushroomHunter,
please move allow_signal from the enqueue thread into the dequeue thread, so that the capture thread can be interrupted.
for instance,
diff --git a/drivers/media/platform/tegra/camera/vi/vi5_fops.c b/drivers/media/platform/tegra/camera/vi/vi5_fops.c
@@ -688,7 +688,6 @@ static int tegra_channel_kthread_capture_enqueue(void *data)
struct tegra_channel_buffer *buf;
unsigned long flags;
set_freezable();
- allow_signal(SIGINT);
while (1) {
try_to_freeze();
@@ -732,6 +731,7 @@ static int tegra_channel_kthread_capture_dequeue(void *data)
struct tegra_channel_buffer *buf;
set_freezable();
+ allow_signal(SIGINT);
unfortunately, the semaphore patch (i.e. Topic 258971, comment #18) was not checked in to the rel-36 code-line.
regarding the memory leak, that’s exactly the patch we have in the upcoming JP-6 public release.
it releases capture requests and embedded data buffers before closing and exiting the stream.
Regarding the capture-ivc.c patch you posted earlier: according to our current tests, without this patch the channel resources become busy when multiple cameras fail. How should this be solved? In other words, is there any risk in applying this patch, and do you recommend doing so?
I do recommend applying that race condition fix to resolve the "channel context is busy" failure.
OK, then I will port this patch into my system code.