System crashes when opening four cameras via GMSL

Hello everyone,
We built a custom carrier board that connects four ov_5640 cameras via GMSL.
We found that the system crashes under L4T R35.5 when test_camera.sh is executed for the second time.
test_camera.sh is a script that uses gstreamer to open the four cameras.
The system operates normally under L4T R35.4.1. Attached are the kernel log and test_camera.sh.

How can we solve this problem?
Any advice would be greatly appreciated!

kernel.log (131.8 KB)
test_camera.sh (3.5 KB)

Please try updating the RCE firmware.

camera-rtcpu-t234-rce.img.r35.5.asynchronous (519.3 KB)

Refer to the following to update the firmware.

Hi @ShaneCCC

Thanks for your reply. I followed the steps in the link below to update the firmware, because we are on the Orin NX.

After the update, I checked the camera firmware checksum in the kernel messages:

$ sudo dmesg | grep "firmware version"

The results are as follows:

The old one:
tegra186-cam-rtcpu bc00000.rtcpu: firmware version cpu=rce cmd=6 sha1=55ecd57df677fd722e795b49ccb283f065b222a1

The new one:
tegra186-cam-rtcpu bc00000.rtcpu: firmware version cpu=rce cmd=6 sha1=761d9d48563b4131a5af3b34f2c9f19a713a222d

So I think the update was successful, but the problem is still there…

Please give me some advice. Thanks!
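A minimal sketch of how the hash comparison above could be scripted. The helper names `rce_fw_sha1` and `rce_fw_matches` are hypothetical, and the pattern assumes the kernel prints the line in exactly the format quoted above.

```shell
# Print the sha1 field of every RCE "firmware version" line read from stdin.
rce_fw_sha1() {
    sed -n 's/.*firmware version cpu=rce.* sha1=\([0-9a-f]\{40\}\).*/\1/p'
}

# Succeed only if the last reported hash on stdin equals the expected one ($1).
rce_fw_matches() {
    [ "$(rce_fw_sha1 | tail -n 1)" = "$1" ]
}

# Example on the target (needs root to read the kernel log):
#   sudo dmesg | rce_fw_matches 761d9d48563b4131a5af3b34f2c9f19a713a222d \
#       && echo "new RCE firmware is running"
```

A matching hash only shows that the new image was loaded; it says nothing about whether the capture problem itself is fixed.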

Please also update these patches.

0001-capture-ivc-fix-multi-cam-race-condition.patch (2.4 KB)
0001-vi5-continue-captures-even-after-corr-errors.patch (2.3 KB)

Hi @ShaneCCC

I found that this patch (0001-vi5-continue-captures-even-after-corr-errors.patch) was already included in R35.5, so I didn’t apply it this time.

I only applied this patch (0001-capture-ivc-fix-multi-cam-race-condition.patch).

But it is still not working…

Could you check whether the error message below still appears after applying the RCE firmware?

General error queue is out of sync with frame queue.

Thanks
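One way to run this check on saved logs, as a sketch. The function name `count_out_of_sync` is hypothetical, and it assumes the kernel log has been saved to a text file such as the attached kernel.log.

```shell
# For each log file given, print "<file>: <number of lines containing the message>".
count_out_of_sync() {
    for log in "$@"; do
        printf '%s: %s\n' "$log" \
            "$(grep -c -F 'General error queue is out of sync with frame queue' "$log")"
    done
}

# Example: count_out_of_sync kernel.log
```

A count of 0 means the message did not appear in that log.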

Hi ShaneCCC,
We no longer see that error message after applying the patch and firmware, but the system still crashes. Please see the attached kernel log.
putty.log (120.6 KB)

It looks like the system is unable to capture from the GMSL cameras, tries to recover many times, and finally crashes.

Hi ShaneCCC,
Here are some test results for your reference.

FW version        | R35.4.1 vi5_fops.c | R35.5 vi5_fops.c
R35.4.1 Camera FW | OK                 | NG
R35.5 Camera FW   | NG                 | NG
Async Camera FW   | NG                 | NG

Please get the trace log so we can check.
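For readers unfamiliar with the trace log, the sketch below shows one commonly used way to enable the camera trace events on Jetson. The debugfs paths are written from memory of NVIDIA's camera debugging notes and are an assumption here; verify them against the documentation for your L4T release. The function name `enable_camera_trace` and its optional root argument are hypothetical; the argument exists only so the function can be tried against a scratch directory.

```shell
# Enable RTCPU and camera trace events under the given debugfs root
# (defaults to /sys/kernel/debug). Must be run as root on the target.
enable_camera_trace() {
    root="${1:-/sys/kernel/debug}"
    echo 1     > "$root/tracing/tracing_on"
    echo 30720 > "$root/tracing/buffer_size_kb"
    echo 1     > "$root/tracing/events/tegra_rtcpu/enable"
    echo 1     > "$root/tracing/events/freertos/enable"
    echo 2     > "$root/camrtc/log-level"
    echo 1     > "$root/tracing/events/camera_common/enable"
    echo       > "$root/tracing/trace"   # clear the buffer before the test
}

# Typical use on the target, as root:
#   enable_camera_trace
#   ./test_camera.sh
#   cat /sys/kernel/debug/tracing/trace > trace.log
```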

Hi ShaneCCC,

Here are the logs.
r35-4-fw_r35-4.log (129.6 KB)
r35-4-fw_r35-5.log (131.1 KB)
r35-5-fw_r35-4.log (278.9 KB)
r35-5-fw_r35-5.log (247.4 KB)
Async-fw_r35-4.log (186.5 KB)
Async-fw_r35-5.log (770.3 KB)

I found that the system doesn’t crash if I use vi5_fops.c from R35.4.1. However, the camera image is broken and some of the camera windows close automatically when using the Async or R35.5 camera FW.

The system crashes if I use vi5_fops.c from R35.5.

Whether or not the kernel crashes, both cases show capture failures.
Have you tried a single camera instead of multiple cameras?

Thanks

Hi ShaneCCC,
The kernel does not crash if we open only one camera.

Does the log show capture failures and timeouts when only a single camera is opened?

Thanks

Please confirm on R35.4.1 whether a single camera produces failure and timeout messages.

If yes, I would suspect a camera timing or signal problem.

Hi ShaneCCC,
I just tested on L4T R35.4.1 and can open one camera without any problem.

Hi ShaneCCC,
I did some further tests. Please see the following results.

  1. R35.4.1 rootfs + R35.4.1 Camera FW + R35.5.0 kernel + R35.4.1 vi5_fops.c
    OK. Tested 20 times.
    1.log (327.6 KB)

  2. R35.4.1 rootfs + R35.4.1 Camera FW + R35.5.0 kernel + R35.5.0 vi5_fops.c
    OK. Tested 20 times.
    2.log (327.0 KB)

  3. R35.4.1 rootfs + R35.5.0 Camera FW + R35.5.0 kernel + R35.4.1 vi5_fops.c
    NG. No crash, but sometimes there is no video from one of the four cameras.
    3.log (208.2 KB)

  4. R35.4.1 rootfs + R35.5.0 Camera FW + R35.5.0 kernel + R35.5.0 vi5_fops.c
    NG. System crashed. The image is distorted or green.
    4.log (455.5 KB)

  5. R35.5.0 rootfs + R35.5.0 Camera FW + R35.5.0 kernel + R35.5.0 vi5_fops.c
    NG. Kernel crashed when closing a camera window.
    5.log (143.2 KB)

  6. R35.5.0 rootfs + R35.5.0 Camera FW + R35.5.0 kernel + R35.4.1 vi5_fops.c
    NG. No crash, but it shows VI channel not found for stream…
    6.log (335.9 KB)

  7. R35.5.0 rootfs + R35.4.0 Camera FW + R35.5.0 kernel + R35.4.1 vi5_fops.c
    OK. Tested 20 times. But a single gmsl camera cannot be opened on its own; all four cameras must be opened.
    7.log (481.7 KB)

  8. R35.5.0 rootfs + R35.4.0 Camera FW + R35.5.0 kernel + R35.5.0 vi5_fops.c
    NG. System crashed when I closed the camera windows.
    8.log (160.3 KB)
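The 20-run checks in the list above could be automated with a small loop like the sketch below. The function name `repeat_test` is hypothetical, and it assumes the test command returns a non-zero exit status on failure. A kernel crash would take the whole system down rather than return a status, so in that case the last run number has to be read from a serial console log instead.

```shell
# Run a command up to N times ($1), stopping at the first failing run.
# Prints a one-line summary and returns non-zero if any run failed.
repeat_test() {
    runs="$1"
    shift
    i=0
    while [ "$i" -lt "$runs" ]; do
        if ! "$@"; then
            echo "failed on run $((i + 1)) of $runs"
            return 1
        fi
        i=$((i + 1))
    done
    echo "passed $runs runs"
}

# Example: repeat_test 20 ./test_camera.sh
```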

Conclusions:

  1. I can open a single gmsl camera when the system is flashed with the R35.4.1 rootfs. With the R35.5.0 rootfs, doing so triggers the vi channel reset operation.
  2. Sometimes vi cannot get images from the gmsl cameras if the system is flashed with the R35.5 camera FW.
  3. When vi cannot get images from a gmsl camera, it resets the vi channel. If the kernel is compiled with the R35.5.0 vi5_fops.c, it sometimes crashes after vi has performed the reset operation several times.
  4. On the R35.5.0 rootfs, the system only allows all four cameras to be open together, so closing the camera windows one by one triggers the vi channel reset. If the kernel is compiled with the R35.5.0 vi5_fops.c, the kernel sometimes crashes.
  5. So I think the problem may be caused by the following sequence:
    VI cannot get images from the gmsl camera (R35.5.0 Camera FW) → VI performs the reset operation → kernel crash (R35.5.0 vi5_fops.c).

Is there any difference between the R35.4.1 and R35.5 camera FW that could cause VI to sometimes fail to get the camera images?

The R35.5 RCE firmware has a deskew calibration update.

Hi ShaneCCC,
Could you please give me an R35.5 RCE firmware with the debug flag enabled?
Then I could see whether the RCE firmware outputs any useful debug messages.

Check this binary.

camera-rtcpu-t234-rce.img.r35.5.dbg (525.4 KB)