Issue: Camera Frame Corruption Error on Orin Nano SOM
Description: We are encountering an issue with one of our NVIDIA Orin Nano SOMs, where we receive the following error messages after approximately one hour of continuous operation:
ubuntu kernel: [12348.697617] tegra-camrtc-capture-vi tegra-capture-vi: corr_err: discarding frame 7, flags: 0, err_data 131072
ubuntu kernel: [12348.806101] tegra-camrtc-capture-vi tegra-capture-vi: corr_err: discarding frame 9, flags: 0, err_data 262144
ubuntu kernel: [12348.956488] tegra-camrtc-capture-vi tegra-capture-vi: corr_err: discarding frame 10, flags: 0, err_data 64
Steps Taken:
- The error happens consistently on one particular Orin Nano SOM.
- We have tested with both JetPack 5.1.2 and JetPack 6.1, but the error persists with both versions.
- The error frequency and the message remain the same in each case.
Hardware Setup:
- The same camera hardware and connections have been tested with other Orin Nano SOMs, which operate without issues.
- This particular SOM also produces the error when plugged into another known-good camera hardware and connection setup.
v4l2 Compliance Test: We have also run the v4l2 compliance test on the system, and the results (attached) indicate no significant issues, though this might be worth reviewing to see if there’s any underlying misconfiguration or incompatibility.
Questions:
- What could be the potential causes of the corr_err: discarding frame error, specifically in relation to one Orin Nano SOM, when the same setup works fine on other SOMs?
- Could this be a hardware-related issue with the particular SOM, or could it be related to a specific software/driver configuration issue?
- Should we be concerned about any specific kernel parameters or driver versions that might be incompatible with the camera setup on this specific SOM?
Any insights or recommendations for further troubleshooting would be greatly appreciated.
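For anyone triaging similar logs, here is a minimal Python sketch that pulls the fields out of the corr_err lines quoted above and lists which bits are set in err_data. It assumes the line format is exactly as shown in this post. The helper name parse_corr_err is made up for illustration, and the sketch does not attempt to say what each bit means; that has to be looked up in the capture driver source for the L4T release in use.

```python
import re

# Matches the kernel lines quoted in this thread, e.g.
# "... [12348.697617] ... corr_err: discarding frame 7, flags: 0, err_data 131072"
CORR_ERR = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\].*corr_err: discarding frame (?P<frame>\d+), "
    r"flags: (?P<flags>\d+), err_data (?P<err>\d+)"
)

def parse_corr_err(line):
    """Return a dict of fields for a corr_err line, or None if it does not match."""
    m = CORR_ERR.search(line)
    if m is None:
        return None
    err = int(m.group("err"))
    return {
        "uptime_s": float(m.group("ts")),   # kernel timestamp: seconds since boot
        "frame": int(m.group("frame")),
        "flags": int(m.group("flags")),
        "err_data": err,
        # positions of the bits that are set in err_data (bit 0 = LSB)
        "set_bits": [b for b in range(err.bit_length()) if (err >> b) & 1],
    }

LOG = """\
ubuntu kernel: [12348.697617] tegra-camrtc-capture-vi tegra-capture-vi: corr_err: discarding frame 7, flags: 0, err_data 131072
ubuntu kernel: [12348.806101] tegra-camrtc-capture-vi tegra-capture-vi: corr_err: discarding frame 9, flags: 0, err_data 262144
ubuntu kernel: [12348.956488] tegra-camrtc-capture-vi tegra-capture-vi: corr_err: discarding frame 10, flags: 0, err_data 64
"""

records = [r for r in map(parse_corr_err, LOG.splitlines()) if r is not None]
for r in records:
    print(f"t={r['uptime_s'] / 3600:.2f} h  frame={r['frame']}  "
          f"err_data={r['err_data']:#x}  bits={r['set_bits']}")
```

Each of the three err_data values in the log is a single set bit (bits 17, 18 and 6), so the three discarded frames report three different error flags. The timestamps are seconds since boot, not since the start of capture.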
hello ajeesh56fh5,
please check the board SKUs and compare them. you may use $ sudo i2cdump -f -y 0 0x50 to read the EEPROM.
BTW, please also see Topic 310858 for some known changes when porting DTS from JP-5 to JP-6.
besides, please refer to Topic 318537 to debug the corr_err discarding frame messages.
Hi Jerry
Please find below the comparison between the two SOMs, as per your suggestion.
I2C dump on the working SOM:
ubuntu@ubuntu:~$ sudo i2cdump -f -y 0 0x50
[sudo] password for ubuntu:
No size specified (using byte-data access)
     0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f    0123456789abcdef
00: 02 00 fe 00 00 00 00 00 00 00 00 ff 00 00 00 00    ?.?.............
10: 00 01 00 01 36 39 39 2d 31 33 37 36 37 2d 30 30    .?.?699-13767-00
20: 30 34 2d 33 30 30 20 4e 2e 32 00 00 00 00 00 00    04-300 N.2......
30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
40: b0 48 00 00 26 19 eb 2d b0 48 31 34 32 32 30 32    ?H..&??-?H142202
50: 33 30 30 31 37 36 37 00 00 00 00 00 00 00 00 00    3001767.........
60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
90: 00 00 00 00 00 00 4e 56 43 42 00 ff 4d 31 00 00    ......NVCB..M1..
a0: 00 00 00 00 00 00 00 00 00 00 00 00 26 19 eb 2d    ............&??-
b0: b0 48 01 00 00 00 00 00 00 00 00 00 00 00 00 00    ?H?.............
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f0    ...............?
ubuntu@ubuntu:~$
I2C dump on the non-working SOM:
ubuntu@ubuntu:~$ sudo i2cdump -f -y 0 0x50
[sudo] password for ubuntu:
No size specified (using byte-data access)
     0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f    0123456789abcdef
00: 02 00 fe 00 00 00 00 00 00 00 00 ff 00 00 00 00    ?.?.............
10: 00 01 00 01 36 39 39 2d 31 33 37 36 37 2d 30 30    .?.?699-13767-00
20: 30 34 2d 33 30 30 20 4e 2e 32 00 00 00 00 00 00    04-300 N.2......
30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
40: b0 48 00 00 15 19 eb 2d b0 48 31 34 32 32 30 32    ?H..???-?H142202
50: 33 30 30 31 39 34 35 00 00 00 00 00 00 00 00 00    3001945.........
60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
90: 00 00 00 00 00 00 4e 56 43 42 00 ff 4d 31 00 00    ......NVCB..M1..
a0: 00 00 00 00 00 00 00 00 00 00 00 00 15 19 eb 2d    ............???-
b0: b0 48 01 00 00 00 00 00 00 00 00 00 00 00 00 00    ?H?.............
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 d5    ...............?
ubuntu@ubuntu:~$
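Comparing two dumps by eye is error-prone, so here is a small Python sketch that parses i2cdump byte-mode output, reads the ASCII part-number string, and lists the offsets where two dumps differ. The function names (parse_i2cdump, diff_offsets, ascii_field) are hypothetical helpers. The 0x14 to 0x29 range for the part number is simply read off the dumps above, not taken from an EEPROM layout specification. Only three rows of each dump are embedded to keep the example short.

```python
import re

# One data row: "40: b0 48 00 ... 32" followed by an optional ASCII column.
ROW = re.compile(r"^([0-9a-f]{2}):((?:\s+[0-9a-f]{2}){16})", re.IGNORECASE)

def parse_i2cdump(text):
    """Return {offset: byte_value} parsed from `i2cdump` byte-mode output."""
    data = {}
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if not m:
            continue  # skips shell prompts, the column header and blank lines
        base = int(m.group(1), 16)
        for i, tok in enumerate(m.group(2).split()):
            data[base + i] = int(tok, 16)
    return data

def diff_offsets(a, b):
    """Offsets present in both dumps whose byte values differ."""
    return sorted(off for off in a.keys() & b.keys() if a[off] != b[off])

def ascii_field(data, start, end):
    """Decode bytes in [start, end) as ASCII text."""
    return bytes(data[o] for o in range(start, end)).decode("ascii", errors="replace")

# Rows 0x10, 0x20 and 0x40 copied from the two dumps in this post.
GOOD = """\
10: 00 01 00 01 36 39 39 2d 31 33 37 36 37 2d 30 30    .?.?699-13767-00
20: 30 34 2d 33 30 30 20 4e 2e 32 00 00 00 00 00 00    04-300 N.2......
40: b0 48 00 00 26 19 eb 2d b0 48 31 34 32 32 30 32    ?H..&??-?H142202
"""
BAD = """\
10: 00 01 00 01 36 39 39 2d 31 33 37 36 37 2d 30 30    .?.?699-13767-00
20: 30 34 2d 33 30 30 20 4e 2e 32 00 00 00 00 00 00    04-300 N.2......
40: b0 48 00 00 15 19 eb 2d b0 48 31 34 32 32 30 32    ?H..???-?H142202
"""

good, bad = parse_i2cdump(GOOD), parse_i2cdump(BAD)
print("part number:", ascii_field(good, 0x14, 0x2A))
print("differing offsets:", [hex(o) for o in diff_offsets(good, bad)])
```

Comparing the two full dumps above line by line, the only differing bytes are at offsets 0x44, 0x54 to 0x56, 0xac and 0xff, while the part-number string is identical in both.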
hello ajeesh_mtx,
these two are the same SKU, i.e. 699-13767-0004-300 N.2
please examine your sensor pixel clock settings; they must be set correctly to avoid potential issues.
Dear Jerry Chang,
Thank you for your prompt response. Following your suggestion to review the pixel clock settings, I would like to provide more details regarding the issue and request further assistance to help resolve the problem.
MIPI Speed Settings & Results:
- Normally, with all other SOMs, we run at 1188MHz. We have tried two different MIPI speed settings (1188MHz and 891MHz) on the system with this particular SOM. At both settings, frame errors occur only on this one SOM, while all other SOMs function without issue under identical settings. This suggests the problem is isolated to that particular SOM.
Problem with SOM at Normal Temperature:
- The SOM exhibiting frame errors fails even at room temperature. The frame errors begin after approximately 2-3 hours of operation and lead to a processing failure, and they recur consistently within that window.
Good SOMs Performance Across Full Temperature Range:
- On the other hand, all other SOMs (with the same hardware and configuration) perform correctly across the full temperature range. These units have been running for days to months without experiencing any frame errors or instability, which further suggests that the issue is specific to the faulty SOM.
Same Hardware, Different SOM Performance:
- When the SOM is swapped for a good one and all other hardware is kept identical, the system operates normally and without any frame errors. This leads us to believe the issue is related to a defect or performance variance in the SOM itself.
- The behavior is clear: with a good SOM, the system performs without any issues, while with a faulty SOM, frame errors occur after only a short period of operation.
Request for Assistance:
- Could you help identify the underlying cause for the discrepancy in performance between the good and bad SOMs?
- Additionally, any insights into potential SOM-specific variations in how they handle MIPI speed or pixel clock settings would be greatly appreciated.
Looking forward to your expert guidance on this matter.
hello ajeesh_mtx,
is it a long-run issue? for instance, do you only reproduce the frame error after 2-3 hours, even though it only happens on a specific SOM?
as previously checked, the good and bad modules are the same SKU, so hardware damage is likely.
anyway, let's run one more test:
please try increasing the sensor pixel clock by 15% and test again on the specific (or… faulty) module.
We are currently working with over 20 SOM units, and while the majority of the units have been operating as expected without issue, we have encountered a recurring fault on three of the units. The issue occurs intermittently, with the fault manifesting after a period of 1 to 4 hours of operation. Notably, this error does not appear on any of the other SOMs, which have been running for extended periods (several months) without exhibiting the same behavior.
Details of the Issue:
SOM Units Affected: 3 specific units out of a batch of more than 20
Issue Frequency: The fault occurs inconsistently, sometimes within 1 hour and other times after 3-4 hours of operation.
Operating Conditions: All SOM units are running the same firmware/software configuration and are under development.
Sensor MIPI Speed Options: The sensor in use supports only two MIPI speed options: 1188MHz and 891MHz. There is no option to modify or verify additional settings beyond this at the moment with the current setup.
At both MIPI speeds, the normal SOM units operate without any issues, while the faulty SOM units experience problems at both speeds.
hello ajeesh_mtx,
I meant increasing the sensor pixel clock in the device tree settings, pix_clk_hz.
for example, please test with… pix_clk_hz = "1188000000"; → pix_clk_hz = "1366200000";
on the specific modules.
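The arithmetic behind that example is just a 15% increase. A minimal Python sketch follows; the helper name scale_pix_clk is hypothetical, and the result is printed as a quoted string because that is how the value appears in the device tree.

```python
def scale_pix_clk(pix_clk_hz: int, percent: int = 15) -> int:
    """Return pix_clk_hz increased by `percent`, using integer arithmetic only."""
    return pix_clk_hz * (100 + percent) // 100

original = 1_188_000_000          # value from the example above
raised = scale_pix_clk(original)  # 1188000000 * 115 // 100

# Device tree properties of this kind are written as quoted strings.
print(f'pix_clk_hz = "{raised}";')
```

Integer arithmetic avoids any floating-point rounding in the last digits. The same helper applies to whatever pix_clk_hz value the sensor mode currently uses.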
hello JerryChang,
Thank you for your suggestion. We will check this option by updating the pix_clk_hz value from “1188000000” to “1366200000” on the specific modules as per your request.
We will test the configuration and provide you with feedback shortly.
Hello jerryChang,
The DTB file was modified as follows, per your suggestion.
mode4 {
    mclk_khz = "37125";
    num_lanes = "2";
    tegra_sinterface = "serial_b";
    vc_id = "0";
    phy_mode = "DPHY";
    discontinuous_clk = "no";
    dpcm_enable = "false";
    cil_settletime = "0";
    lane_polarity = "6";
    active_w = "3840";
    active_h = "2160";
    dynamic_pixel_bit_depth = "10";
    csi_pixel_bit_depth = "10";
    mode_type = "bayer";
    pixel_phase = "rggb";
    readout_orientation = "0";
    line_length = "4512";
    inherent_gain = "1";
    pix_clk_hz = "273240000";
    serdes_pix_clk_hz = "12000000000";
    gain_factor = "10";
    min_gain_val = "0";
    max_gain_val = "480";
    step_gain_val = "1";
    default_gain = "0";
    min_hdr_ratio = "1";
    max_hdr_ratio = "1";
    framerate_factor = "1000000";
    min_framerate = "1500000";
    max_framerate = "24000000";
    step_framerate = "1";
    default_framerate = "24000000";
    exposure_factor = "1000000";
    min_exp_time = "18";
    step_exp_time = "1";
    max_exp_time = "660000";
    default_exp_time = "10000";
    embedded_metadata_height = "1";
};
The capture was not successful, as no frames were captured. Could you please suggest the next step?
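As a cross-check on the pix_clk_hz value in the mode4 node above, here is a rough Python sketch using the commonly quoted estimate pixel clock ≈ per-lane data rate × number of lanes / bits per pixel. This rests on two assumptions: that the "1188MHz" MIPI speed mentioned earlier in the thread is the per-lane data rate, and that this estimate applies to this sensor. num_lanes and csi_pixel_bit_depth are taken from the node itself.

```python
def pix_clk_from_lane_rate(lane_rate_bps: int, num_lanes: int, bits_per_pixel: int) -> int:
    """Estimate the sensor pixel clock (Hz) from the CSI link parameters."""
    return lane_rate_bps * num_lanes // bits_per_pixel

# Assumption: "1188MHz" in this thread is the per-lane data rate (1188 Mbps).
LANE_RATE_BPS = 1_188_000_000
NUM_LANES = 2             # num_lanes in mode4
BITS_PER_PIXEL = 10       # csi_pixel_bit_depth in mode4
DT_PIX_CLK_HZ = 273_240_000   # pix_clk_hz in the modified mode4

baseline = pix_clk_from_lane_rate(LANE_RATE_BPS, NUM_LANES, BITS_PER_PIXEL)
percent_of_baseline = DT_PIX_CLK_HZ * 100 // baseline

print("estimated baseline pix_clk_hz:", baseline)
print("device tree value as % of baseline:", percent_of_baseline)
```

Under those assumptions the estimate is 237600000 Hz, and the 273240000 in the modified node is 115% of it, which is consistent with the requested 15% increase having been applied to the mode's own baseline.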
hello ajeesh_mtx,
please contact the NVIDIA Customer Care team for the RMA process.
Hello jerryChang,
We will proceed as suggested. Thanks for your support.