How to make Argus in Jetson 35.2.1 recover after a corrupted frame?

As I wrote at the beginning, the easiest way to simulate MIPI noise corruption is to briefly short some MIPI wires, such as D+/D-. The simplest test is to briefly touch exposed MIPI wires on a connector with a screwdriver. Depending on where in the MIPI frame the corruption happens, you may see garbage on the screen (without capture interruption), you may see some MIPI rows lost (with the bottom portion of the frame shifting up), or you may see a complete loss of the frame. And then Argus fails (but v4l survives this test).
The next test is to attach a transistor to those wires and generate a short pulse synchronized with the camera, to corrupt only a specific part of the frame for a specified duration, or to corrupt only the n-th frame or several frames in a row.
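
To make the synchronized-pulse test concrete, here is a rough sketch of the timing arithmetic. It assumes the pulse generator is triggered at the start of frame 0, that rows are evenly spaced in time, and that `total_rows` includes vertical blanking. The function name and all parameters are hypothetical, and real sensor timing should be taken from the sensor datasheet.

```python
def pulse_window_us(fps, total_rows, first_row, row_count, frame_index=0):
    """Return (start_us, duration_us) of a pulse covering the given rows.

    Assumptions (not from any datasheet): time zero is the start of
    frame 0, rows are evenly spaced, total_rows includes blanking rows.
    """
    frame_period_us = 1_000_000 / fps
    row_time_us = frame_period_us / total_rows
    # Skip whole frames first, then skip rows inside the target frame.
    start_us = frame_index * frame_period_us + first_row * row_time_us
    duration_us = row_count * row_time_us
    return start_us, duration_us


# Example: corrupt 5 rows starting at row 100 of the third frame (index 2).
start, duration = pulse_window_us(fps=50, total_rows=1000,
                                  first_row=100, row_count=5, frame_index=2)
print(f"pulse starts at {start} us and lasts {duration} us")
```

Setting `row_count` to `total_rows` would cover a whole frame, and repeating the call with consecutive `frame_index` values gives the "several frames in a row" case.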

Let me clarify:
Argus and the camera driver expect the sensor to stream frames continuously, without failures.

The error handling mechanism is there to keep the system alive.
Since the signaling is intermittent, the camera pipeline reports timeout failures. Argus reports this via EVENT_TYPE_ERROR, and the application has to shut down.

Argus and the camera driver expect the sensor to stream frames continuously, without failures.

This is the wrong expectation. It can only work with some hypothetical idealized camera. In real-world cameras, occasional corruption is unavoidable. That’s why the MIPI standard provides a CRC checksum for every long (pixel data) packet, so that the receiver can simply drop a corrupted frame and continue receiving good frames.
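
For illustration, here is a minimal software sketch of that check. It assumes the CSI-2 long-packet checksum is a CRC-16 with polynomial x^16 + x^12 + x^5 + 1, seed 0xFFFF, bits processed LSB first and no final XOR (the variant commonly catalogued as CRC-16/MCRF4XX); please verify these parameters against the CSI-2 specification. The function names are hypothetical, and on real hardware this check would be done by the CSI receiver, not in software.

```python
def csi2_crc16(payload: bytes) -> int:
    """CRC-16, reflected polynomial 0x8408, seed 0xFFFF, no final XOR.

    These parameters are an assumption about the CSI-2 packet footer.
    """
    crc = 0xFFFF
    for byte in payload:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0x8408
            else:
                crc >>= 1
    return crc


def packet_is_intact(payload: bytes, footer_crc: int) -> bool:
    """Compare the CRC computed over the payload with the received footer."""
    return csi2_crc16(payload) == footer_crc


# Example: a single flipped bit in the payload is detected.
good = bytes(range(32))
footer = csi2_crc16(good)
bad = bytes([good[0] ^ 0x01]) + good[1:]
print(packet_is_intact(good, footer), packet_is_intact(bad, footer))
```

A receiver that sees a mismatch on any row can mark the whole frame as corrupted and move on to the next frame start.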

The error handling mechanism is there to keep the system alive.

But Argus does not keep the system alive. It does everything wrong: it does NOT check the CRC at all, thus allowing a partially corrupted frame to be displayed, and it closes camera streaming in the case of other corrupted frames, thus killing the system.

As I wrote from the beginning, the correct behavior is:

  1. Check the CRC and drop a partially corrupted frame (or return it to the app with an indication of corruption)
  2. Continue receiving good frames

The decision to stop streaming after corrupted frames must belong to the application, not Argus, because every application has its own requirements for the acceptable frame loss ratio.
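
As a sketch of what such an application-side policy could look like: the code below assumes the capture stack delivers every frame together with a `crc_ok` flag, which is exactly the behavior being requested here and not something Argus provides today. The class, the function, and the thresholds are all hypothetical.

```python
from collections import deque


class FrameLossPolicy:
    """Track the last `window` frames and decide when losses are too high."""

    def __init__(self, window: int, max_loss_ratio: float):
        self.window = window
        self.max_loss_ratio = max_loss_ratio
        self.results = deque(maxlen=window)  # True = good frame

    def record(self, frame_ok: bool) -> None:
        self.results.append(frame_ok)

    def should_stop(self) -> bool:
        # Divide by the full window size so a single bad frame right
        # after startup does not look like a 100% loss.
        bad = self.results.count(False)
        return bad / self.window > self.max_loss_ratio


def deliver_good_frames(frames, policy):
    """Drop corrupted frames, keep going, stop only when the policy says so.

    `frames` is an iterable of (frame_id, crc_ok) pairs (hypothetical).
    """
    delivered = []
    for frame_id, crc_ok in frames:
        policy.record(crc_ok)
        if policy.should_stop():
            break
        if crc_ok:
            delivered.append(frame_id)
    return delivered


# Example: two corrupted frames out of 120 are dropped; streaming continues.
frames = [(i, i not in (10, 11)) for i in range(120)]
policy = FrameLossPolicy(window=60, max_loss_ratio=0.1)
print(len(deliver_good_frames(frames, policy)))
```

The point of keeping the threshold in the application is that a display pipeline and a safety-critical vision pipeline can pick very different values for `window` and `max_loss_ratio` without changes to the capture stack.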

The firmware side already checks the packet data CRC; see the TRM for PH_WC and PF_CRC.

No, it does not. If corruption affects only pixels and not the frame structure, the firmware ignores it and the display is corrupted. Sometimes row headers are corrupted, and then the rows below are shifted and discolored, but capture does not stop. Only if the entire frame is broken does capture stop. I attached a few images; it took only a few minutes to reproduce and capture them.

There’s an enhancement to error handling in the Argus stack in the next JetPack release. Once it is released, you may upgrade and give it a try. General error cases should be covered. However, shorting D+ and D- is more of a corner case, and it may not be possible to detect and handle it.

I have worked on several camera projects in different branches of robotics and saw that even the best shielded cable still occasionally picks up a burst of noise, which corrupts a frame or two, and the ability of the system to detect this, recover, and continue operation is crucial. In most cases it is OK to drop a few frames out of 60 frames per second, but it is not OK to just stop streaming, and it is not OK to pass corrupted frames to the application.
By the way, I saw that in an older version of the Jetson kernel sources, kernel/nvidia/drivers/media/platform/tegra/camera/csi/csi4_fops.c has code to enable the CRC check and read its status, but the newer code in csi5_fops.c has lost all CRC support. Why is that?

Because that functionality has moved to the camera firmware side.
It’s in the A_rce-fw partition; the binary file is… camera-rtcpu-t234-rce.img, and the sources are not publicly available.

Are you saying that the CRC checking functionality was lost when you moved the code from Linux to the firmware?
Is there a way to run old csi4_fops.c code on Orin?
Or, at least, reintroduce that CRC register access? Is that CRC status register still available in hardware?
Can it still be accessible from a kernel driver?

hello jhnlmn,

Sorry, it cannot.

According to the Orin TRM, it looks like we don’t have CRC check support for DPHY.
Let me double-check this internally.
