Jetson Corrupted Frame Camera Driver

OK, great, thanks. I tried it and applied your suggested change. However, I then got the following error:

no events in queue

So I also changed the handling of this error so that the application simply continues and tries again to get a new frame from the queue. However, it seems that no event is triggered anymore after a corrupted frame.

In the two images below you can see the code and output of the terminal.

Best.


hello david.mueller1,

as you can see, Argus::IEventQueue::getSize returns the number of events in the queue.

we may double-check the MIPI packets.
for instance,
the camera hardware should signal SoF (start-of-frame) and EoF (end-of-frame) so that the software side sees a good frame.

OK, yes, exactly. After receiving an error (before triggering iMetadata==Null), no more events are generated. I also tried to capture a frame directly with ICaptureSession’s capture method when no event is received. The application does not abort after the error; I simply do not get any new frames.

Where can I check the SoF and EoF using libargus? The only way I have found is through IEventProvider, which is not an option because I am not getting any new events.

Is there anything else I can do or will you get back to me once you’ve checked the MIPI packets?

hello david.mueller1,

you should probe the MIPI signaling with an oscilloscope.
or, please try the steps below to enable VI tracing logs.

echo 1 > /sys/kernel/debug/tracing/tracing_on
echo 30720 > /sys/kernel/debug/tracing/buffer_size_kb
echo 1 > /sys/kernel/debug/tracing/events/tegra_rtcpu/enable
echo 1 > /sys/kernel/debug/tracing/events/freertos/enable
echo 2 > /sys/kernel/debug/camrtc/log-level
echo > /sys/kernel/debug/tracing/trace
cat /sys/kernel/debug/tracing/trace

Here is the output of the log. The application failed at frame 10. I can see from the logs that frame 12 was triggered. However, I ran the application for a long time after that and no more logs were written.
logs.txt (163.9 KB)

hello david.mueller1,

thanks for sharing the logs,
a pair of CHANSEL_PXL_SOF/CHANSEL_PXL_EOF events indicates a good frame.
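A quick way to scan a saved trace for frames that lack this pair is sketched below. It assumes each relevant trace line carries `tag:` and `frame:` tokens in the same key:value style as the rtcpu_nvcsi_intr lines quoted in this thread. The sample lines are made up for illustration, and the exact field layout may differ between releases.

```python
import re
from collections import defaultdict

# Matches key:value tokens such as "tag:CHANSEL_PXL_SOF" or "frame:10".
TOKEN = re.compile(r"(\w+):(\S+)")

def frames_missing_pair(trace_lines):
    """Return frame numbers that did not see both CHANSEL_PXL_SOF and CHANSEL_PXL_EOF."""
    seen = defaultdict(set)
    for line in trace_lines:
        fields = dict(TOKEN.findall(line))
        tag = fields.get("tag")
        if tag in ("CHANSEL_PXL_SOF", "CHANSEL_PXL_EOF") and "frame" in fields:
            seen[int(fields["frame"])].add(tag)
    return sorted(f for f, tags in seen.items() if len(tags) < 2)

# Hypothetical sample lines (assumed layout, not copied from a real log).
sample = [
    "rtcpu_vinotify_event: tstamp:100 tag:CHANSEL_PXL_SOF frame:10",
    "rtcpu_vinotify_event: tstamp:200 tag:CHANSEL_PXL_EOF frame:10",
    "rtcpu_vinotify_event: tstamp:300 tag:CHANSEL_PXL_SOF frame:11",
]
print(frames_missing_pair(sample))
```

The sketch ignores frame-counter wraparound and does not separate channels, so on a long multi-camera capture it is only a first filter.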

however, the log shows PHY interrupts after frame 11.
to recap:
rtcpu_nvcsi_intr: tstamp:2906406844105 class:GLOBAL type:PHY_INTR0 phy:1 cil:0 st:0 vc:0 status:0x00000080
rtcpu_nvcsi_intr: tstamp:2906406844105 class:CORRECTABLE_ERR type:PHY_INTR phy:1 cil:0 st:0 vc:0 status:0x00000080

the error code 0x80 means a control error: an LP sequence error was detected on a data lane.
normally, the LP sequence should follow LP11->LP01->LP00->LP11.
since the MIPI signal is intermittent, the SW side could be out of sync with the high-speed signaling.
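To pull such interrupts out of a long trace, a small parser along these lines may help. It is only a sketch: it relies on the key:value layout of the rtcpu_nvcsi_intr lines quoted above, and the helper name `parse_nvcsi_intr` is made up for this example. It does not decode status bits; the only meaning used here is the 0x80 explanation given in this post.

```python
import re

# Matches key:value tokens such as "phy:1" or "status:0x00000080".
FIELD = re.compile(r"(\w+):(\S+)")

def parse_nvcsi_intr(line):
    """Split one rtcpu_nvcsi_intr trace line into a dict; status becomes an int."""
    if "rtcpu_nvcsi_intr:" not in line:
        return None
    payload = line.split("rtcpu_nvcsi_intr:", 1)[1]
    fields = dict(FIELD.findall(payload))
    fields["status"] = int(fields.get("status", "0"), 16)
    return fields

# Line taken from the log excerpt in this thread.
line = ("rtcpu_nvcsi_intr: tstamp:2906406844105 class:CORRECTABLE_ERR "
        "type:PHY_INTR phy:1 cil:0 st:0 vc:0 status:0x00000080")
event = parse_nvcsi_intr(line)
print(event["class"], event["type"], hex(event["status"]))
```

Filtering a whole trace file for lines where the returned status is non-zero gives a compact list of the PHY errors and their timestamps.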

please try configuring the DT property cil_settletime.
it sets the THS settle time of the MIPI lane.

Hello,
I am now encountering a similar issue.
I would like to know what the normal result of running userAutoExposure is.
After running it multiple times, I always get the result as shown in the following image.

userAutoExposure is a sample app that demonstrates manual exposure time and analog gain controls using a basic auto-exposure algorithm.
its frame count is set to 60 frames by default; you may add the --frames=COUNT option to process more frames.
so, you’re seeing the normal result of running userAutoExposure.

Hello,
Today I conducted camera testing on Jetson Orin NX 16GB with JetPack 5.1.2, camera: IMX327
Two of the cameras were tested using the command

gst-launch-1.0 nvarguscamerasrc ee-mode=0 tnr-mode=0 aeantibanding=0 silent=false ! fakesink

while the other one was tested using the video-viewer command

video-viewer csi://2 --input-rate=25

However, none of the cameras were able to capture images.
The error message from GStreamer indicated UNAVAILABLE and TIMEOUT.


And I need to restart nvargus,

systemctl restart nvargus-daemon.service

otherwise the new commands cannot access the camera.

This could be due to intermittent MIPI signals. I would like to know how to test the stability of the camera input signal to verify my hypothesis.

Thank you!

Hello,
I would like to know how you found out that the device was receiving a corrupted frame.

Hi JerryChang

Thanks a lot for your answer. I was not able to find the device tree for the IMX219 sensor. Where is it located? Do I need to reflash the Jetson to change the DT?

So in your case the timeout occurred immediately after you started GStreamer, so Argus had probably crashed earlier and you tried to use it without restarting its service with `systemctl restart nvargus-daemon.service`.

However, this is essentially no longer necessary if you use the binaries from here.

However, the application will still fail once you get a corrupted frame, except that you don’t need to restart libargus.

To find out that my problem was due to the intermittent MIPI signalling, I tried to reproduce it on a desktop setup. The longer the CSI cable, the more it happened. Finally, doing some electrical stress tests by rubbing the cables together always resulted in a timeout problem with the same logs as I otherwise got when using the camera with the robot.

Hello,
I’m sorry that my messages may have caused confusion. I need to clarify something.


Adding >> ./video2.txt to the end of a command will append the standard output of the command execution to a text file named video2.txt. Any command’s standard output will be redirected and appended to this file instead of being displayed in the terminal. Standard error output (stderr) will not be redirected and will still be displayed in the terminal.
So the error was actually displayed after a long time.

And I understand how you tested the intermittent MIPI signalling.
Thank you for your response!

Regarding the questions I asked here, I have opened a new post and provided more information. You can reply there: The camera cannot run continuously due to nvargus.

hello david.mueller1,

$public_sources/r35.5.0/Linux_for_Tegra/source/public/hardware/nvidia/platform/t23x/p3768/kernel-dts/cvb/tegra234-camera-rbpcv2-imx219.dtsi

you can perform a partition update.

please see also the developer guide, Building the Kernel.
you may update only the device tree by following To flash a specific partition.
or, UEFI is able to load the device tree via an FDT entry; please see also DTB Support.

OK, great, I was able to adjust cil_settletime in the DT. I verified the change using:
cat /sys/firmware/devicetree/base/cam_i2cmux/i2c@0/rbpcv2_imx219_a@10/mode0/cil_settletime

I set the settletime to 48.0, as I assumed a ui of 12.5 ns.
lp_clock_period = 1 / (204 × 10^6 Hz) ≈ 4.902 ns
ui = 12.5 ns

Calculate the constants:
85 ns+75 ns=160 ns
145 ns+125 ns=270 ns

Thus the lower bound is:
160 / 4.902 < cil_settletime + 6 => 26.64 < cil_settletime
The upper bound is:
cil_settletime + 6 < 270 / 4.902 => cil_settletime < 49.08
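The same bounds can be reproduced with a few lines of Python. This is a sketch that follows the inequality used in this post, 85 ns + 6 × ui < (cil_settletime + 6) × lp_clock_period < 145 ns + 10 × ui, with the unrounded LP clock period of 1/204 MHz; the function name is made up for illustration, and the "+ 6" offset should be checked against the documentation for your platform.

```python
def settle_time_bounds(ui_ns, lp_clock_hz=204e6):
    """Return the open interval (lower, upper) for cil_settletime.

    Solves 85 + 6*ui < (cil_settletime + 6) * lp_period < 145 + 10*ui,
    with all times in nanoseconds.
    """
    lp_period_ns = 1e9 / lp_clock_hz
    lower = (85 + 6 * ui_ns) / lp_period_ns - 6
    upper = (145 + 10 * ui_ns) / lp_period_ns - 6
    return lower, upper

lower, upper = settle_time_bounds(12.5)
print(f"{lower:.2f} < cil_settletime < {upper:.2f}")
```

With ui = 12.5 ns the chosen value of 48 sits close to the upper end of the allowed range, so a change in the actual MIPI clock (and therefore ui) can move it out of bounds.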

I still get the same problem. Please see my logs here (up to the point where we get the same interrupt as before setting cil_settletime):
logs.txt (453.1 KB)

hello david.mueller1,

something has changed. to recap the error message,
the failure is now: General error queue is out of sync with frame queue.
ERROR: camera-ip/vi5/vi5.c:745 [vi5_handle_eof] General error queue is out of sync with frame queue. ts=156360035936 sof_ts=156361076032 gerror_code=2 gerror_data=800062 notify_bits=40000

we have recently made some changes to address this failure.
would you like to stick with r35.3.1 for deployment? or is it possible to move to the latest JP-5 release version (r35.5.0) and apply the pre-built update for verification?
here’s rce-fw update, Topic290301_May17_rce-fw.zip (255.6 KB) which is based-on r35.5.0 release version.
you may see-also Topic 260583 for steps to replace the camera firmware on Orin NX.

Hello JerryChang.

No, we can switch to r35.5.0 or even r36. I will test this as soon as possible. However, I will need to reflash a Jetson, so it might be on Monday. Thanks for your help.

When applying these new fixes, should I still set the increased cil_settletime?

Best,

David

hello david.mueller1,

yes, please also try increasing cil_settletime for your specific scenario.

Hello JerryChang

Thanks for your help. I followed the instructions. In step 3 of Request debug RTCPU image for JP5.1.1 - #5 by JerryChang I got the following error:

user@host:~/Projects/jetson/jetson_35.0/Linux_for_Tegra$  sudo ./flash.sh --no-flash -r -k A_rce-fw jetson-agx-orin-devkit mmcblk0p1
###############################################################################
# L4T BSP Information:
# R35 , REVISION: 5.0
# User release: 0.0
###############################################################################
ECID is 
Board ID() version() sku() revision()
Chip SKU(00:00:00:D0) ramcode() fuselevel(fuselevel_production) board_FAB()
Error: Unrecognized module SKU 

Therefore I tried the same board configuration that I used for the full flash:

sudo ./flash.sh --no-flash -r -k A_rce-fw p3509-a02+p3767-0000 internal

This command was successful. I then reflashed the whole Jetson and tried to capture video using nvgstcapture-1.0, which produced an error. The output of the log is:

[   19.087204] fuse: init (API version 7.32)
[  103.126634] logitech-hidpp-device 0003:046D:4088.0006: HID++ 4.5 device connected.
[  106.377346] falcon 154c0000.nvenc: Direct firmware load for nvhost_nvenc080.fw failed with error -2
[  106.386835] falcon 154c0000.nvenc: Falling back to sysfs fallback for: nvhost_nvenc080.fw
[  106.396158] falcon 154c0000.nvenc: looking for firmware in subdirectory

So applying the firmware was not successful. Which command should I use for step 3 of the instructions?

Best

David