MIPI CSI interface reporting errors (ECC, CRC) on some TX2s

Hello,

I am experiencing problems with the MIPI CSI camera interface. There appears to be data corruption on the MIPI link.
The camera uses a 1-lane configuration at a 1188 Mbps data rate. MCLK is 37.125 MHz.

Here are examples of the dmesg error logs that I am getting. They show ECC and/or CRC errors:

[93473.741333] nvcsi 150c0000.nvcsi: csi4_stream_check_status (0) ERROR_STATUS2VI_VC2 = 0x00000008
[93473.750043] nvcsi 150c0000.nvcsi: csi4_stream_check_status (0) INTR_STATUS 0x00008002
[93473.757903] nvcsi 150c0000.nvcsi: csi4_stream_check_status (0) ERR_INTR_STATUS 0x00008002

[ 133.505258] nvcsi 150c0000.nvcsi: csi4_stream_check_status (0) ERROR_STATUS2VI_VC2 = 0x00000008
[ 133.513954] nvcsi 150c0000.nvcsi: csi4_stream_check_status (0) INTR_STATUS 0x00000800
[ 133.521782] nvcsi 150c0000.nvcsi: csi4_stream_check_status (0) ERR_INTR_STATUS 0x00000800

[ 4474.130707] nvcsi 150c0000.nvcsi: csi4_stream_check_status (0) INTR_STATUS 0x00010000
[ 4474.138540] nvcsi 150c0000.nvcsi: csi4_stream_check_status (0) ERR_INTR_STATUS 0x00010000

[ 7091.448492] tegra-vi4 15700000.vi: Status: 2 channel:00 frame:0001
[ 7091.454766] tegra-vi4 15700000.vi: timestamp sof 7101742702144 eof 7101758835360 data 0x00200060
[ 7091.464335] tegra-vi4 15700000.vi: capture_id 16064 stream 0 vchan 0

When the corruption happens before streaming has started, the frames cannot be acquired at all.
If it happens during streaming, the error is reported after streaming is stopped.
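
To compare failing and non-failing SoMs, the nvcsi lines above can be tallied per register and value. The following is only a minimal sketch: the regular expression is written against the log lines quoted in this post, the function names are made up for illustration, and the bit positions are printed without names because their meaning has to come from the TX2 register documentation.

```python
import re
from collections import Counter

# Matches the nvcsi status lines quoted above, with or without the "=" sign.
LINE_RE = re.compile(
    r"nvcsi \S+: csi4_stream_check_status \((\d+)\) "
    r"(\w+)\s*=?\s*(0x[0-9a-fA-F]+)"
)


def set_bits(value):
    """Return the positions of the bits that are set in value."""
    return [i for i in range(value.bit_length()) if (value >> i) & 1]


def tally(lines):
    """Count occurrences of each (stream, register, value) triple."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            stream, reg, val = match.groups()
            counts[(int(stream), reg, int(val, 16))] += 1
    return counts


# Two of the lines from this post, used as sample input.
sample = [
    "[93473.750043] nvcsi 150c0000.nvcsi: csi4_stream_check_status (0) INTR_STATUS 0x00008002",
    "[93473.741333] nvcsi 150c0000.nvcsi: csi4_stream_check_status (0) ERROR_STATUS2VI_VC2 = 0x00000008",
]
for (stream, reg, val), n in sorted(tally(sample).items()):
    print(f"stream {stream} {reg} 0x{val:08x} set bits {set_bits(val)} count {n}")
```

Running the same tally over the full dmesg output of each SoM gives a per-board error signature that can be compared between the two groups.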

My test set of SoMs consists of about 20 TX2s and I could divide the SoMs into 2 groups:

  • CSI errors never occur (~10 SoMs)
  • CSI errors occur very often (almost every other camera streaming operation) (~10 SoMs)

I performed various tests, using different carrier boards and camera modules. The issue definitely stays with the SoM and is easy to reproduce.
I tried an old L4T release with kernel 4.4 and the latest L4T with kernel 4.9 - the same errors occur on both.
I inspected the TX2 connector - no damage is visible.
I tried using an external oscillator as the clock source - still the same issue.

I have the following questions:

  1. Do you have any idea what could cause this kind of failure? Could it be that the TX2s are damaged?
  2. Is there any way to debug data quality of the MIPI signals on the TX2?
  3. Is there any way to fine-tune the CSI peripheral? I am using cil_settletime = "0", so the kernel calculates the value itself (cil_settletime = 23 in my case).
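
As a sanity check for question 3, the settle time the receiver ends up using can be compared with the THS-SETTLE window for this lane rate. The sketch below assumes the D-PHY limits as I remember them (85 ns + 6 UI minimum, 145 ns + 10 UI maximum), which should be verified against the MIPI D-PHY specification version the sensor follows. It also assumes 1188 Mbps is the per-lane rate, which holds for a 1-lane configuration. How a settle-time register count maps to nanoseconds depends on the CSI clock and is not covered here; the function names are hypothetical.

```python
def ui_ns(lane_rate_mbps):
    """Unit interval (one bit time) in nanoseconds for a per-lane rate in Mbps."""
    return 1000.0 / lane_rate_mbps


def ths_settle_window_ns(lane_rate_mbps):
    """Return (min, max) THS-SETTLE in nanoseconds.

    Assumed limits: 85 ns + 6 UI and 145 ns + 10 UI (check the D-PHY spec).
    """
    ui = ui_ns(lane_rate_mbps)
    return 85.0 + 6.0 * ui, 145.0 + 10.0 * ui


low, high = ths_settle_window_ns(1188.0)
print(f"UI = {ui_ns(1188.0):.3f} ns, THS-SETTLE window = {low:.1f} .. {high:.1f} ns")
```

For 1188 Mbps this gives a window of roughly 90 ns to 153 ns under those assumptions.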

I would appreciate your help.

Thank you.

ECC and CRC errors suggest that the sensor is sending packets whose header ECC or CRC check is missing or invalid.
It would be best to consult the sensor vendor.

I am using exactly the same camera module for all tests, and one group of TX2s fails while the other does not. The camera sensor should be OK, as otherwise it would generate errors on all of the TX2s, right?

Yes, TX2 and Xavier have ECC/CRC checking, but Nano and TX1 don’t.

TX2s are the only modules that I use for testing. 10 TX2 modules show errors, the other 10 TX2 modules do not show errors.

Camera modules and carrier boards were the same for both groups of TX2 modules.

Wow, that sounds like a signal quality problem. What happens if you switch the sensor between discontinuous and continuous clock mode?

I configured the sensor in discontinuous clock mode (discontinuous_clk was previously set to "no").

Errors are now much less frequent. Do you have any idea what the root cause of this problem could be? Can frames be lost in discontinuous clock mode? Why are fewer errors detected in this configuration?

Thanks

This kind of symptom usually points to a sensor register setting or a hardware problem.