Orin Nano: MIPI acquisition reports CHANSEL_NOMATCH

Hello,

I’m using an IMX297 sensor on an Orin Nano, on the CSI3 port. This is a single-lane MIPI sensor. The mclk is generated on the camera PCB side. The peak bandwidth is 1.2 Gbps.

I already ran it successfully on the Xavier. I suspect a device tree configuration problem.

I’m running with a custom driver, and I can confirm through probing that when I start an acquisition:

  • the camera gets configured over I2C as expected
  • the camera is streaming

When I run: v4l2-ctl --stream-mmap --stream-count=1 --stream-to=frame.raw, the tracing gives me:

     kworker/1:3-123     [001] ....  1717.850224: rtcpu_vinotify_event: tstamp:54287068871 cch:0 vi:1 tag:FS channel:0x00 frame:187 vi_tstamp:1737176028128 data:0x000000bb00000013
     kworker/1:3-123     [001] ....  1717.850225: rtcpu_vinotify_event: tstamp:54287069119 cch:0 vi:1 tag:CHANSEL_NOMATCH channel:0x08 frame:187 vi_tstamp:1737176042688 data:0x0000000000000249
     kworker/1:3-123     [001] ....  1717.850225: rtcpu_vinotify_event: tstamp:54287069395 cch:0 vi:1 tag:FE channel:0x00 frame:187 vi_tstamp:1737183997216 data:0x000000bb00000023
     kworker/1:3-123     [001] ....  1717.850225: rtcpu_vinotify_error: tstamp:54287258958 cch:0 vi:1 tag:CHANSEL_NOMATCH channel:0x08 frame:188 vi_tstamp:1737192153632 data:0x0000000000000249
     kworker/1:3-123     [001] ....  1717.906226: rtcpu_vinotify_event: tstamp:54287611510 cch:0 vi:1 tag:FS channel:0x00 frame:188 vi_tstamp:1737192139104 data:0x000000bc00000013
     kworker/1:3-123     [001] ....  1717.906226: rtcpu_vinotify_event: tstamp:54287611795 cch:0 vi:1 tag:CHANSEL_NOMATCH channel:0x08 frame:188 vi_tstamp:1737192153632 data:0x0000000000000249
     kworker/1:3-123     [001] ....  1717.906227: rtcpu_vinotify_event: tstamp:54287612039 cch:0 vi:1 tag:FE channel:0x00 frame:188 vi_tstamp:1737200108160 data:0x000000bc00000023

which I read as:

  • no_match: 1
  • CTYPE: 0x4 = LS
  • DTYPE: 0x12, matching the metadata lines ID - 2 metadata lines are expected
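
The reading above can be reproduced with a small decoder. This is only a sketch: the bit positions are an assumption inferred from the three values quoted here for 0x249, not taken from NVIDIA documentation, and the function name is hypothetical.

```python
def decode_chansel_nomatch(data: int) -> dict:
    """Split a CHANSEL_NOMATCH data word into the three fields read above.

    ASSUMED layout (inferred from this thread, verify against the TRM):
    bit 0 = no_match, bits 3:1 = CTYPE, bits 10:5 = DTYPE.
    """
    return {
        "no_match": data & 0x1,
        "ctype": (data >> 1) & 0x7,   # 0x4 is read as LS in this post
        "dtype": (data >> 5) & 0x3F,  # 0x12 is the embedded data type
    }


fields = decode_chansel_nomatch(0x249)
print({name: hex(value) for name, value in fields.items()})
```

With this layout, 0x249 splits into no_match 1, CTYPE 0x4 and DTYPE 0x12, the same values as listed above.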

And the kernel reports:

[   61.708122] tegra-camrtc-capture-vi tegra-capture-vi: corr_err: discarding frame 1, flags: 0, err_data 256
[   61.724169] tegra-camrtc-capture-vi tegra-capture-vi: corr_err: discarding frame 2, flags: 0, err_data 256
[   61.740025] tegra-camrtc-capture-vi tegra-capture-vi: corr_err: discarding frame 3, flags: 0, err_data 256
[   61.756110] tegra-camrtc-capture-vi tegra-capture-vi: corr_err: discarding frame 4, flags: 0, err_data 256

Could you please give me some pointers on how to investigate further?

Here is some extra debug info from media-ctl -p -d /dev/media0:

Media controller API version 5.10.104

Media device information
------------------------
driver          tegra-camrtc-ca
model           NVIDIA Tegra Video Input Device
serial          
bus info        
hw revision     0x3
driver version  5.10.104

Device topology
- entity 1: 13e40000.host1x:nvcsi@15a00000- (2 pads, 2 links)
            type V4L2 subdev subtype Unknown flags 0
            device node name /dev/v4l-subdev0
	pad0: Sink
		<- "aev2_cam 10-001a":0 [ENABLED]
	pad1: Source
		-> "vi-output, aev2_cam 10-001a":0 [ENABLED]

- entity 4: aev2_cam 10-001a (1 pad, 1 link)
            type V4L2 subdev subtype Sensor flags 0
            device node name /dev/v4l-subdev1
	pad0: Source
		[fmt:SRGGB10_1X10/720x540 field:none colorspace:srgb]
		-> "13e40000.host1x:nvcsi@15a00000-":0 [ENABLED]

- entity 6: vi-output, aev2_cam 10-001a (1 pad, 1 link)
            type Node subtype V4L flags 0
            device node name /dev/video0
	pad0: Sink
		<- "13e40000.host1x:nvcsi@15a00000-":1 [ENABLED]

hello maxe777,

the “discarding frame” logs are sometimes only warning messages: when a capture does not complete successfully, the driver drops the frame and requeues the buffer.
the error code means the channel has encountered an uncorrectable error and must be reset.

may I know which Jetpack release version you’re working with?
could you please try moving to the latest release, i.e. Jetpack-5.1.2/l4t-r35.4.1, since we’ve fixed some camera bugs and stability issues there?

Hello,

I’m on L4T 35.3.1. I will look into updating to 35.4.1 as suggested, but this will take a bit more effort.

In the meantime, I was able to get the image stream by applying the same patch as: https://github.com/VC-MIPI-modules/vc_mipi_nvidia/blob/master/patch/kernel_Xavier_35.3.1%2B/0001-Disable-VB2_BUF_STATE_REQUEUEING-in-vi5_fops.c.patch

I’m not 100% sure that this patch is fully safe, but at least it shows that there is no major issue with the setup and its configuration. Also, the source of the patched kernel file did not change in L4T 35.4.1.

Would you be able to shed some light on why the NVIDIA stack discards frames which are actually valid? Is there a better approach to avoid this than the patch I linked?

Thanks

we’ve recently addressed some camera bugs for JP-5.1.2/l4t-r35.4.1.
please also see the threads below and apply the bug fixes for verification.
for example,
Topic 268833, a camera firmware update for the deskew algorithm, plus stability fixes,
Topic 268519, a fix for memory corruption within libnvargus

Hi @JerryChang,

okay, I will keep you posted about the results on L4T R35.4.1. Thanks for sharing the links.

Regards

Hi,
I need a bit of extra time to carry out these tests. I will post an update later.

Hi,
The tests are still pending. We will follow up with an update soon.

Hi @JerryChang,

I could now pick this task up again and perform the update to the latest L4T (35.4.1). Sorry for the delay.
I fully flashed the board to have everything up to date and repeated the tests.

The observations are still the same. Before applying the patch I linked, the capture still fails. So this ticket is not resolved.

The kernel log shows:

[ 2114.304277] tegra-camrtc-capture-vi tegra-capture-vi: corr_err: discarding frame 1, flags: 0, err_data 512
[ 2114.312132] tegra-camrtc-capture-vi tegra-capture-vi: corr_err: discarding frame 2, flags: 0, err_data 512
[ 2114.320392] tegra-camrtc-capture-vi tegra-capture-vi: corr_err: discarding frame 3, flags: 0, err_data 512
[ 2114.328657] tegra-camrtc-capture-vi tegra-capture-vi: corr_err: discarding frame 4, flags: 0, err_data 512

The mipi log can be found here: mipi_log_l4t-35.4.1.txt (93.4 KB)
It shows again a lot of messages like:

tag:CHANSEL_NOMATCH channel:0x08 frame:110 vi_tstamp:2129437996928 data:0x0000000000000249
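
To see how often each tag occurs in a trace dump like this one, a small script can tally them. This is a minimal sketch, assuming every line carries the tag, channel, frame, vi_tstamp and data fields in the order shown above; the function name is made up for illustration and the sample lines are copied from the first trace in this thread.

```python
import re
from collections import Counter

# Matches the trailing fields of an rtcpu_vinotify trace line as quoted above.
LINE_RE = re.compile(
    r"tag:(?P<tag>\w+) channel:(?P<channel>0x[0-9a-fA-F]+) "
    r"frame:(?P<frame>\d+) vi_tstamp:(?P<vi_tstamp>\d+) "
    r"data:(?P<data>0x[0-9a-fA-F]+)"
)


def tally_tags(trace_text: str) -> Counter:
    """Count how many times each tag appears in a trace dump."""
    counts = Counter()
    for line in trace_text.splitlines():
        match = LINE_RE.search(line)
        if match:
            counts[match["tag"]] += 1
    return counts


sample = (
    "tag:FS channel:0x00 frame:187 vi_tstamp:1737176028128 data:0x000000bb00000013\n"
    "tag:CHANSEL_NOMATCH channel:0x08 frame:187 vi_tstamp:1737176042688 data:0x0000000000000249\n"
    "tag:FE channel:0x00 frame:187 vi_tstamp:1737183997216 data:0x000000bb00000023\n"
)
print(tally_tags(sample))
```

Running it over the full log file instead of `sample` would show at a glance whether tags such as CHANSEL_PXL_EOF ever appear.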

For reference, I link again the patch which makes the difference: https://github.com/VC-MIPI-modules/vc_mipi_nvidia/blob/master/patch/kernel_Xavier_35.3.1%2B/0001-Disable-VB2_BUF_STATE_REQUEUEING-in-vi5_fops.c.patch
With this patch the image gets received and looks correct.

However, it feels more like a workaround than a solution, though it is useful for debugging.

Could NVIDIA explain why the L4T release fails to capture the frames and reports CHANSEL_NOMATCH? Is there anything that we overlooked in the configuration, or that we can double-check?

Looking forward to your support

hello maxe777,

are you using a SerDes chip? what are the actual format types being sent to the CSI channel?

I see something incorrect,
for example,
(1) there are some CHANSEL_FAULT failures at the beginning.
according to the errors, it’s reporting failures at end-of-frame.
also, the detected line number is 539. is this your active_h configuration?
please note that the capture engine does not support odd-numbered resolutions.

(2) your VI tracing logs do not report CHANSEL_PXL_EOF. this may be due to incorrect line number settings, as mentioned above.

Hi @JerryChang ,

Thanks a lot for your quick feedback.

  • No, I’m not using a SerDes. The MIPI lanes of the camera are directly connected to the Jetson Orin Nano.

  • About the number of lines:
    – Under the device tree, I have active_h = 540.
    – Since I’m using the plain V4L2 interface, I assume the configuration which actually matters is the struct on the kernel side (also set to 540):

static const struct camera_common_frmfmt aev2_cam_frmfmt[] = {
  {{736, 540}, aev2_cam_framerates, 1, false /* hdr_en*/, AEV2_CAM_MODE_720x540_60FPS},
};

    – The sensor is a Sony IMX297. I configured the registers to operate with 540 lines (ROI definition + FE code output timing).
    – When I receive a full image, the 540 lines look like valid content.
    – I do not have equipment to decode the actual MIPI transfer, but from a scope trace, I can see 554 MIPI packets per frame. 2 packets are short. The other 552 are long. This matches the expected breakdown from the datasheet:

  • 2 embedded data lines
  • 6 NULL lines
  • 4 Optical blanking lines
  • 540 lines with image content in RAW10
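
As a quick cross-check of the packet count against this breakdown, here is a minimal sketch. It assumes the two short packets are the frame start and frame end codes; the variable names are only illustrative.

```python
# Lines per frame that are sent as long packets, as listed above.
LONG_PACKET_LINES = {
    "embedded data": 2,
    "NULL": 6,
    "optical blanking": 4,
    "RAW10 image": 540,
}
# ASSUMPTION: the two short packets seen on the scope are FS and FE.
SHORT_PACKETS = 2

long_packets = sum(LONG_PACKET_LINES.values())
total_packets = long_packets + SHORT_PACKETS
print(f"long packets: {long_packets}, total packets: {total_packets}")
```

The sums come out to 552 long packets and 554 packets in total, which is what the scope trace shows.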

The packet header format should be as follows, per the datasheet:

Header [5:0]   Name               Description
00h            Frame Start Code   FS
01h            Frame End Code     FE
10h            NULL               invalid data
12h            Embedded Data      Embedded Data
2Bh            RAW10              0A0Ah
35h            OB Data            Vertical OB line data

Attached are some graphical descriptions of the expected transfers (image: imx297_timing_2).

According to my measurements, the camera seems to be configured as expected. Please let me know if you need more details.

Looking forward to your suggestions

hello maxe777,

FYI, the detected line number is counted starting from zero.
so this configuration seems to be set up correctly, and it matches your active_h setting of 540 lines.

since the discarding frame error is due to the channel having encountered an error,
could you please try issuing a reset signal before s_stream?

Hi @JerryChang,

Thank you for double-checking the logs and confirming the correct configuration.

Could you please shed some light on how to “issue a reset signal before s_stream” as you suggested?
My attempts, based on getting a pointer to struct tegra_channel using a call to v4l2_get_subdev_hostdata and then invoking tegra_channel_error_recover(chan, false);, were not conclusive since I always ended up with a NULL channel.

I’m not sure which location and API make the most sense for performing the test you requested.

Looking forward to your guidance.

hello maxe777,

the reset is done on the sensor kernel driver side, before the v4l2 ioctl s_stream is called.
that’s the usual approach for restarting the stream from the sensor driver side so that it aligns with the VI capture engine.

Hi @JerryChang ,

I took a look again at various drivers, but I cannot find the pattern you described.

For instance in a driver like kernel/nvidia/drivers/media/i2c/nv_imx477.c:

  • Since it goes over the tegracam framework, the callback s_stream is not directly handled at this level
  • The start streaming (imx477_start_streaming) function would be invoked as part of the s_stream V4L2 callback through v4l2sd_stream (in kernel/nvidia/drivers/media/platform/tegra/camera/tegracam_v4l2.c) if I’m right
  • But the start streaming function is only dealing with setting cam registers over I2C. I do not see any reset code.
  • And about reset, I only see the power management of the camera at the driver level.

Would you have a link to share the common pattern you refer to? I have the impression that I’m not looking at the code you expect.

I mostly looked at the nv_imx drivers present in the NVIDIA repos since my sensor is part of this family. Please let me know if some other driver is a better template for your proposal, or if I overlooked something.

Many thanks.

hello maxe777,

you did not overlook anything; the approach of restarting the stream from the sensor driver side is not present in the reference camera drivers.

here’s an example,
please check the IMX219 register settings, i.e. imx219_mode_tbls.h
it programs register 0x0103 for a software reset.
for instance,

static imx219_reg imx219_mode_common[] = { 
        /* software reset */
        {0x0103, 0x01},
        {IMX219_TABLE_WAIT_MS, 10},

as you know, register 0x0103 is the software reset register.
you may add such register settings to the mode settings that are applied before s_stream.
for example,

 static imx219_reg imx219_start_stream[] = {
+       {0x0103, 0x01},
+       {IMX219_TABLE_WAIT_MS, 10},
        {0x0100, 0x01},
        {IMX219_TABLE_WAIT_MS, 3},

Hi @JerryChang,

Thanks a lot for the clarification. Now I understand what you mean. I initially thought you meant a reset of the VI channel.

Unfortunately, the IMX297 does not offer a soft reset. But please note that the issue is 100% reproducible and already present at the first capture after probe.
During the probe, the camera undergoes a hard reset and is configured from scratch.

What else could I check to address this issue?

Hi @JerryChang ,

I have a theory about what the problem is, but I would appreciate NVIDIA’s confirmation.

It looks to me like the CSI engine does not capture the image properly if the line length is not a multiple of 32 (at least for the RAW10 format; it may be a 64B alignment requirement). The capture runs with line lengths which are not a multiple of 32px (e.g. 720px = 32 * 22.5), but the received image looks corrupted, as you can see here:

Rendering the received bytes as an image with a width of 720px makes some sense, but the last 16px seem to alternately be missing or be set to 0.

When I use a line length of 704px (= 32 * 22), these artifacts are gone, as you can see here:

The IMX297 can output a line length up to 728px. This is unfortunately not a multiple of 32. The next multiple of 32 is 736px.
This is what I used in my first tests to acquire the full sensor output.
However, this setting does not match the sensor line length (728px actual vs 736px specified).
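
The widths discussed so far can be checked against the suspected 32px multiple with a few lines. This is a sketch only: the 32px figure is the theory from this post, not a documented value, and the helper name is hypothetical.

```python
ALIGN_PX = 32  # suspected alignment for RAW10 from this post, not a confirmed figure


def next_aligned_width(width_px: int, align_px: int = ALIGN_PX) -> int:
    """Round a line length up to the next multiple of align_px."""
    return -(-width_px // align_px) * align_px


for width in (704, 720, 728, 736):
    aligned = next_aligned_width(width)
    status = "aligned" if aligned == width else f"next aligned width is {aligned}px"
    print(f"{width}px: {status}")
```

Only 704px and 736px are multiples of 32; both 720px and 728px round up to 736px.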

This would match the CHANSEL_FAULT 0x200 and CHANSEL_FAULT 0x21b0202 entries from the logs I shared previously, since they mean SHORT_LINE for lines 0 and 539. 8 pixels are indeed missing. I’m not sure why the error is reported only for the first and last line though.
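
The two CHANSEL_FAULT words can be split into a line number and fault flags as follows. This is a sketch under assumptions: the field positions are inferred from the two values quoted in this thread (0x21b is 539), the SHORT_LINE name for 0x200 is taken from the reading above, and the function name is hypothetical. The TRM remains the reference for the real layout.

```python
SHORT_LINE = 0x200  # flag name as read in this post; other flag bits are left unnamed


def decode_chansel_fault(data: int) -> tuple:
    """Return (line_number, fault_flags) for a CHANSEL_FAULT data word.

    ASSUMED layout: line number in bits 31:16, fault flags in bits 15:0.
    """
    return (data >> 16) & 0xFFFF, data & 0xFFFF


for word in (0x200, 0x21B0202):
    line, flags = decode_chansel_fault(word)
    print(f"line {line}: flags {flags:#x}, short line: {bool(flags & SHORT_LINE)}")
```

Under this layout the two words decode to line 0 and line 539, both with the 0x200 bit set.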

The stream runs fine if I set the sensor to use 728px and use:

v4l2-ctl -c preferred_stride=1472
Then the MIPI logs are clean, the kernel no longer reports dropped frames and the images look valid.
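
One way to arrive at the value 1472 is sketched below. It assumes that each RAW10 pixel occupies 2 bytes in memory and that the line stride must be a multiple of 64 bytes; both are assumptions at this point in the thread, and the helper name is hypothetical.

```python
def padded_stride(width_px: int, bytes_per_px: int = 2, align_bytes: int = 64) -> int:
    """Bytes per line in memory, rounded up to a multiple of align_bytes.

    ASSUMPTIONS: 2 bytes per RAW10 pixel and 64-byte stride alignment.
    """
    line_bytes = width_px * bytes_per_px
    return -(-line_bytes // align_bytes) * align_bytes


print(padded_stride(728))
```

For 728px this gives 728 * 2 = 1456 bytes, rounded up to 1472, which is also the unpadded line size of a 736px line.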

So that solves the reported issue.

@JerryChang: could you please confirm the size requirement on the line length? Would you have links to the relevant documentation?

Thanks

hello maxe777,

yes… there’s a limitation on VI alignment.
we usually set the camera configuration so that the width is aligned to 64.

you may refer to the [Video Input (VI)] chapter of the Orin TRM (page-1894).
please check the [2.5.2 ATOMP] section (page-1930) for details.

Thanks @JerryChang for the confirmation.

I came across this section in the TRM before, but I still find it hard to infer exactly which limitation applies.

You may consider, as a future improvement, introducing a kernel-side check that warns when the requested VI configuration is not compatible with the HW capabilities. It would make debugging a lot easier and would likely save support cases.

Thank you again for the support on this thread. The case is solved from our perspective.