How to properly initialize MIPI capture when source is streaming continuously?

Hi,
Normal MIPI cameras are told to start and stop streaming via I2C,
but I am testing a MIPI capture device that streams continuously and does not have I2C.
The first capture after Orin boot succeeds, but the second and all subsequent captures fail.
I am using:
v4l2-ctl -d /dev/video4 --set-ctrl sensor_mode=0 --set-fmt-video=width=2880,height=2880 --set-ctrl bypass_mode=0 --stream-mmap --stream-count=1 --stream-to test3.raw --verbose
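
To reproduce the "first capture works, later ones fail" pattern without retyping the command, a small wrapper along these lines could run several single-frame captures back to back. This is only a sketch: the helper names `capture_command` and `run_captures`, the per-attempt file name, and the 30-second timeout are made up for illustration, and it is an assumption that the exit code of v4l2-ctl reflects a failed capture, so the kernel log and RCE trace should still be checked.

```python
import subprocess

def capture_command(attempt, device="/dev/video4"):
    """Build the v4l2-ctl argv from the post, with one output file per attempt."""
    return [
        "v4l2-ctl", "-d", device,
        "--set-ctrl", "sensor_mode=0",
        "--set-fmt-video=width=2880,height=2880",
        "--set-ctrl", "bypass_mode=0",
        "--stream-mmap", "--stream-count=1",
        "--stream-to", f"test{attempt}.raw",
        "--verbose",
    ]

def run_captures(count, run=subprocess.run):
    """Run `count` captures in sequence; return exit codes (None on timeout)."""
    codes = []
    for attempt in range(1, count + 1):
        try:
            result = run(capture_command(attempt), timeout=30)
            codes.append(result.returncode)
        except subprocess.TimeoutExpired:
            # The capture never finished within the (arbitrary) timeout.
            codes.append(None)
    return codes

# On the Orin, for example: print(run_captures(3))
```

The `run` parameter exists only so the loop can be exercised without hardware by passing a stand-in for `subprocess.run`.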

I tried using debug FW Topic268833_JP-512_rce-fw_Debug.7z with 35.4.1

The first successful capture produces a trace that starts with lines like:

tegra_nvcsi_stream_set_config 
===== NVCSI Stream Configuration =====

but the second, failed capture has no stream configuration; instead it has non-stop errors:

rtcpu_string: tstamp:27210535315 id:0x04010000 str:"ISR PHY 3 CIL_B 0x44
rtcpu_nvcsi_intr: tstamp:27210534718 class:GLOBAL type:PHY_INTR0 phy:3 cil:1 st:0 vc:0 status:0x00000044
rtcpu_string: tstamp:27210535875 id:0x04010000 str:"ISR PHY 3 CIL_A 0x4
rtcpu_nvcsi_intr: tstamp:27210535653 class:GLOBAL type:PHY_INTR0 phy:3 cil:0 st:0 vc:0 status:0x00000004
rtcpu_string: tstamp:27210536250 id:0x04010000 str:"ISR PHY 3 CIL_B 0x44
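
Since the failing trace is a flood of near-identical lines, it can help to count the interrupts per PHY, CIL and status value instead of reading them one by one. The sketch below assumes the `rtcpu_nvcsi_intr` field layout seen in the lines above; the function name `tally_nvcsi_interrupts` is made up for illustration, and other trace events (such as `rtcpu_string`) are simply skipped.

```python
import re
from collections import Counter

# Matches the key:value fields of an rtcpu_nvcsi_intr trace event.
INTR_RE = re.compile(
    r"rtcpu_nvcsi_intr: tstamp:(?P<tstamp>\d+) class:(?P<cls>\w+) "
    r"type:(?P<type>\w+) phy:(?P<phy>\d+) cil:(?P<cil>\d+) "
    r"st:(?P<st>\d+) vc:(?P<vc>\d+) status:(?P<status>0x[0-9a-fA-F]+)"
)

def tally_nvcsi_interrupts(lines):
    """Count interrupt events per (class, type, phy, cil, status)."""
    counts = Counter()
    for line in lines:
        m = INTR_RE.search(line)
        if m is None:
            continue  # e.g. rtcpu_string lines or unrelated trace events
        key = (m["cls"], m["type"], int(m["phy"]), int(m["cil"]),
               int(m["status"], 16))
        counts[key] += 1
    return counts

sample = [
    'rtcpu_string: tstamp:27210535315 id:0x04010000 str:"ISR PHY 3 CIL_B 0x44',
    "rtcpu_nvcsi_intr: tstamp:27210534718 class:GLOBAL type:PHY_INTR0 "
    "phy:3 cil:1 st:0 vc:0 status:0x00000044",
    "rtcpu_nvcsi_intr: tstamp:27210535653 class:GLOBAL type:PHY_INTR0 "
    "phy:3 cil:0 st:0 vc:0 status:0x00000004",
]
for (cls, typ, phy, cil, status), n in sorted(tally_nvcsi_interrupts(sample).items()):
    print(f"{cls} {typ} phy={phy} cil={cil} status={status:#x} count={n}")
```

Feeding the full trace file to the same function (for example `tally_nvcsi_interrupts(open("bad.txt"))`) would show whether the errors are confined to one PHY or CIL.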

bad.txt (2.1 KB)
good.txt (16.9 KB)
So the question is: how do I properly reinitialize the capture session, perform clock deskew, etc.?

Thank you

hello jhnlmn,

There are PHY interrupts; the error code 0x44 indicates that more than one bit error has been detected on the data lane.
You should implement a reset on the sensor driver side and perform that reset when s_stream is called.
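
When comparing status values such as 0x44 and 0x4 from the trace, it may help to list which bit positions are set before looking them up in the register documentation. The sketch below assumes the status word is a bit mask, as interrupt status registers usually are; the helper name `set_bits` is made up for illustration, and it deliberately does not name the bits, since their meaning has to come from NVIDIA's documentation.

```python
def set_bits(value):
    """Return the positions of the bits set in value, lowest first."""
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]

for status in (0x44, 0x04):
    print(f"{status:#04x} -> bits {set_bits(status)}")
# 0x44 -> bits [2, 6]
# 0x04 -> bits [2]
```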

As I explained in the question, this is not a sensor, but a capture device and it does not have reset.
After I reboot Orin the first capture does not have errors, which means that MIPI source is OK.
Therefore, I assume that I need a way to reset the Orin capture FW (without rebooting it).
In particular, I think I need a way to restart the deskew mechanism. I suspect that deskew is not executed after I restart v4l2-ctl.
Also, I need to somehow invoke the code that prints “tegra_nvcsi_stream_set_config”. Is there some IOCTL call that invokes tegra_nvcsi_stream_set_config?

hello jhnlmn,

Could you please try applying the change below, which should allow captures to continue even if the channel encounters errors?

---
 .../media/platform/tegra/camera/vi/vi5_fops.c  | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/drivers/media/platform/tegra/camera/vi/vi5_fops.c b/drivers/media/platform/tegra/camera/vi/vi5_fops.c
index 80b7d2bd9..2071b3687 100644
--- a/drivers/media/platform/tegra/camera/vi/vi5_fops.c
+++ b/drivers/media/platform/tegra/camera/vi/vi5_fops.c
@@ -485,6 +485,7 @@ static void vi5_capture_dequeue(struct tegra_channel *chan,
 	struct tegra_channel_buffer *buf)
 {
 	int err = 0;
+	bool frame_err = false;
 	int vi_port = 0;
 	int gang_prev_frame_id = 0;
 	unsigned long flags;
@@ -528,12 +529,7 @@ static void vi5_capture_dequeue(struct tegra_channel *chan,
 					"err_data %d\n",
 					descr->status.frame_id, descr->status.flags,
 					descr->status.err_data);
-#if LINUX_VERSION_CODE < KERNEL_VERSION(5, 4, 0)
-					buf->vb2_state = VB2_BUF_STATE_REQUEUEING;
-#else
-					buf->vb2_state = VB2_BUF_STATE_ERROR;
-#endif
-			goto done;
+				frame_err = true;
 			}
 		} else if (!vi_port) {
 			gang_prev_frame_id = descr->status.frame_id;
@@ -565,7 +561,14 @@ static void vi5_capture_dequeue(struct tegra_channel *chan,
 #endif
 	vb->vb2_buf.timestamp = descr->status.sof_timestamp;
 
-	buf->vb2_state = VB2_BUF_STATE_DONE;
+	if (frame_err)
+#if LINUX_VERSION_CODE < KERNEL_VERSION(5, 4, 0)
+		buf->vb2_state = VB2_BUF_STATE_REQUEUEING;
+#else
+		buf->vb2_state = VB2_BUF_STATE_ERROR;
+#endif
+	else
+		buf->vb2_state = VB2_BUF_STATE_DONE;
 	/* Read EOF from capture descriptor */
 #if LINUX_VERSION_CODE < KERNEL_VERSION(5, 4, 0)
 	ts = ns_to_timespec((s64)descr->status.eof_timestamp);
@@ -578,7 +581,6 @@ static void vi5_capture_dequeue(struct tegra_channel *chan,
 	trace_tegra_channel_capture_frame("eof", &ts);
 #endif
 
-done:
 	goto rel_buf;
 
 uncorr_err:

No, still does not work.
Usually I get a flood of “[RCE] ISR PHY 2 CIL_A 0x4” and “[RCE] ISR PHY 2 CIL_B 0x40” messages
and eventually “uncorr_err: request timed out after 2500 ms”.
But your patch is only triggered by the “corr_err: discarding frame” message, which usually does not arrive at all. I saw that message only once, and it still failed to recover (here I added two more printk calls to vi5_capture_dequeue):
tegra-camrtc-capture-vi tegra-capture-vi: corr_err: discarding frame 1, flags: 0, err_data 131072
vi5_capture_dequeue:556 skip VB2_BUF_STATE_REQUEUEING/VB2_BUF_STATE_ERROR
vi5_capture_dequeue:603 VB2_BUF_STATE_ERROR

uncorr_err: request timed out after 2500 m
I wonder whether there is a way to perform periodic skew calibration, as described in this post:

?

We implemented MIPI streaming start/stop using GPIO to more closely mimic a real camera sensor.
But Orin still fails to capture correctly.
The second capture prints a lot of errors like:
kworker/8:3-867 [008] … 540.698156: rtcpu_nvcsi_intr: tstamp:17653395931 class:GLOBAL type:STREAM_VC phy:0 cil:0 st:4 vc:0 status:0x00000004
kworker/8:3-867 [008] … 540.698158: rtcpu_nvcsi_intr: tstamp:17653395931 class:CORRECTABLE_ERR type:STREAM_VC phy:0 cil:0 st:4 vc:0 status:0x00000004
kworker/8:3-867 [008] … 540.698158: rtcpu_nvcsi_intr: tstamp:17653396527 class:GLOBAL type:STREAM_VC phy:0 cil:0 st:4 vc:0 status:0x00000006
kworker/8:3-867 [008] … 540.698159: rtcpu_nvcsi_intr: tstamp:17653396527 class:CORRECTABLE_ERR type:STREAM_VC phy:0 cil:0 st:4 vc:0 status:0x00000006
kworker/8:3-867 [008] … 540.698159: rtcpu_nvcsi_intr: tstamp:17653397099 class:GLOBAL type:STREAM_VC phy:0 cil:0 st:4 vc:0 status:0x00000006

Even if it manages to capture a frame, the image ends up corrupted.
But if I reboot Orin, the first capture is still OK.
So I have to repeat my question: how do I reset the MIPI FW without rebooting Orin?

hello jhnlmn,

FYI,
we only look for the initial deskew calibration; however, it should also be ok to configure the deserializer to send periodic skew calibration bursts. The effect would be that the initial skew calibration can be retried every time a skew calibration burst is sent.

As mentioned above, this is probably because we only look for the initial deskew calibration.
You may try restarting the Argus daemon,
for example:
$ sudo pkill nvargus-daemon
$ sudo systemctl start nvargus-daemon

Restarting nvargus-daemon does not make any difference, which is not surprising since I am using v4l2-ctl, not Argus.

“it should also be ok to configure the deserializer to send periodic skew calibration bursts.”

What do you mean? The chip vendor told us that they are sending a deskew calibration sequence with every frame. What else should we ask them to do?