Camera fails with VIFALC_TDSTATE in the trace log on JetPack 5.0.2

Hi, I am connecting a camera that outputs greyscale (GREY) images to a Jetson system. It works fine on Nano, Xavier, and other platforms, but it does not work correctly on the Orin platform.
Here is the log:

     kworker/3:3-520     [003] ....   922.826162: rtcpu_vinotify_event: tstamp:29561718191 cch:0 vi:1 tag:VIFALC_TDSTATE channel:0x23 frame:0 vi_tstamp:945965440160 data:0x379e300010000000
     kworker/3:3-520     [003] ....   922.826163: rtcpu_vinotify_event: tstamp:29561718466 cch:0 vi:1 tag:VIFALC_TDSTATE channel:0x23 frame:0 vi_tstamp:945965446624 data:0x0000000031000001
     kworker/3:3-520     [003] ....   922.826164: rtcpu_vinotify_event: tstamp:29561718802 cch:0 vi:1 tag:VIFALC_TDSTATE channel:0x23 frame:0 vi_tstamp:945965522144 data:0x379e2d0010000000
     kworker/3:3-520     [003] ....   922.826164: rtcpu_vinotify_event: tstamp:29561719069 cch:0 vi:1 tag:VIFALC_TDSTATE channel:0x23 frame:0 vi_tstamp:945965528672 data:0x0000000031000002
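
When comparing many such trace lines, it can help to split them into fields first. The following is a minimal sketch, assuming every event keeps the `key:value` layout seen above; the function name `parse_vinotify` is made up for illustration and is not part of any NVIDIA tool.

```python
import re

def parse_vinotify(line):
    """Return the key:value fields of one rtcpu_vinotify_event line, or None."""
    marker = "rtcpu_vinotify_event:"
    if marker not in line:
        return None
    fields = {}
    for key, value in re.findall(r"(\w+):(\S+)", line.split(marker, 1)[1]):
        try:
            fields[key] = int(value, 0)   # decimal or 0x-prefixed hex
        except ValueError:
            fields[key] = value           # symbolic values such as the tag name
    return fields

# First trace line from the log above.
sample = ("kworker/3:3-520     [003] ....   922.826162: rtcpu_vinotify_event: "
          "tstamp:29561718191 cch:0 vi:1 tag:VIFALC_TDSTATE channel:0x23 "
          "frame:0 vi_tstamp:945965440160 data:0x379e300010000000")
event = parse_vinotify(sample)
print(event["tag"], hex(event["channel"]), hex(event["data"]))
```

The sketch only separates the fields; it does not interpret the `data` word, whose bit layout is not described in this thread.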

I have searched the forum and tried various fixes, but it still fails.

To avoid the skew calibration that is required at a 1.5 Gbps data rate, I lowered the camera's data rate to 1.2 Gbps.
Using the formula below, I get a pix_clk_hz of 300M.
Output data rate = (sensor or deserializer pixel clock in hertz) * (bits per pixel) / (number of CSI lanes)
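
The arithmetic can be checked by solving the formula for the pixel clock. This is a small sketch, assuming the 1.2 Gbps figure is the per-lane output data rate and using the 2 lanes and 8 bits per pixel from the dts; the helper name is hypothetical.

```python
def pix_clk_from_lane_rate(lane_rate_bps, num_lanes, bits_per_pixel):
    """Invert: output data rate = pix_clk * bits_per_pixel / num_lanes."""
    return lane_rate_bps * num_lanes // bits_per_pixel

# 1.2 Gbps per lane, 2 CSI lanes, RAW8 (8 bits per pixel)
clk = pix_clk_from_lane_rate(1_200_000_000, 2, 8)
print(clk)
```

With these inputs the result matches the 300M value used for pix_clk_hz.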

Here is a snippet of my dts.

mode0 {
    mclk_khz = "24000";
    num_lanes = "2";
    tegra_sinterface = "serial_a";
    phy_mode = "DPHY";
    discontinuous_clk = "yes";
    dpcm_enable = "false";
    cil_settletime = "0";

    active_w = "3088";
    active_h = "2064";
    mode_type = "raw";
    csi_pixel_bit_depth = "8";
    pixel_phase = "y8";
    readout_orientation = "0";
    line_length = "3088";
    inherent_gain = "1";
    mclk_multiplier = "2";
    pix_clk_hz = "300000000";
    serdes_pix_clk_hz = "300000000";

Also, I tried setting pix_clk_hz and serdes_pix_clk_hz to 100M, but the same error occurred.

I saw in some posts that you provided camera-rtcpu-t234-rce.img to get more logs, but I can’t flash it on JetPack 5.0.2.

$ sudo ./flash.sh -r -k A_rce-fw mmcblk0p1 jetson-agx-orin-devkit
Error: Invalid target board - mmcblk0p1.

Please help me. Thanks in advance.

hello xumm_td,

the partition flash command is incorrect: the target board name must come before the root device.
please try with:
$ sudo ./flash.sh -r -k A_rce-fw jetson-agx-orin-devkit mmcblk0p1

furthermore,
where did you get the debug firmware?
please use this debug-camera-fw.7z (253.4 KB) for JetPack-5.0.2 / l4t-r35.1 release.

  • Xavier,
    replace the binary file $OUT/Linux_for_Tegra/bootloader/camera-rtcpu-t194-rce.img
    then partition-flash to update the camera firmware: $ sudo ./flash.sh -r -k rce-fw jetson-agx-xavier-devkit mmcblk0p1

  • Orin,
    replace the binary file $OUT/Linux_for_Tegra/bootloader/camera-rtcpu-t234-rce.img
    then partition-flash to update the camera firmware: $ sudo ./flash.sh -r -k A_rce-fw jetson-agx-orin-devkit mmcblk0p1

after the system boots, run your camera use-case and check the logs;
the RCE debug logs are written to the kernel log, i.e. $ dmesg
thanks

Hi,
Thank you for your quick reply.

I was using the debug firmware found in other posts on the forum, and it is exactly the same as the one you gave me (same md5sum).

After following your instructions and flashing the debug firmware, image capture works!
So I also tested the JetPack 5.0.1 DP version.

  1. debug firmware: short test OK; no long-duration test yet.
  2. JetPack 5.0.1 DP: OK.
  3. JetPack 5.0.2: ERR.

hello xumm_td,

could you please give more details about this failure?
for example, did more messages appear when the debug firmware was applied?

Hi @JerryChang ,
With the debug firmware, images are sometimes captured correctly for a while, but capture does not keep working over a long run.
Please take a look at the logs.
dmesg_log_using_debug_rtcpu_firmware.txt (27.1 KB)

hello xumm_td,

there’s a CHANSEL_NOMATCH failure; the error data shows 0x549.
this means a non-embedded data packet did not match any configured channel.
the CSI data type is detected as RAW8. (this looks correct according to your DT settings)

however,
it looks like you have an unusual sensor resolution, i.e. 3088x2064.
could you please try revising active_h? for example, reduce it by a few lines to active_h = "2048" for testing,
thanks

Hi @JerryChang ,
Thank you for your analysis.
I am actually using the camera in a 1280x1024 raw8 @ 108 fps mode.
The previous dts setting (3088x2064) didn’t match it, because I did some tinkering in the sensor driver. (My camera is registered directly with the v4l2 framework, not through tegracam_device_register, because I need flexible resolution and image format.)
This driver and dts have been working fine on Xavier.

Anyway, I modified the dts to match the camera configuration exactly (1280x1024 raw8 @ 108 fps).

mclk_khz = "24000";
num_lanes = "2";
tegra_sinterface = "serial_a";
phy_mode = "DPHY";
discontinuous_clk = "yes";
dpcm_enable = "false";
cil_settletime = "0";

active_w = "1280";
active_h = "1024";
mode_type = "raw";
csi_pixel_bit_depth = "8";
pixel_phase = "y8";
readout_orientation = "0";
line_length = "1280";
inherent_gain = "1";
mclk_multiplier = "2";
pix_clk_hz = "300000000";
serdes_pix_clk_hz = "300000000";

I use v4l2-ctl to capture frames directly; it works OK for several minutes, then gets stuck. I logged both cases.
Here is the log:
newlog_dts_1280x1024.txt (7.2 KB)

hello xumm_td,

please also see the Property-Value Pairs section for reference.
these two DT settings look incorrect: the mode_type setting should be yuv for your use-case,
and for pixel_phase, please use the correct configuration for your sensor's pixel phase, for example, uyvy.

furthermore,
are you using a SerDes chip, or do you have a direct physical connection to the CSI camera connector?
the serdes_pix_clk_hz property is required only for camera modules that use SerDes.
thanks

Hi @JerryChang ,
Sorry for the late reply; I was delayed over the past few days.
Regarding the data type, I think there is no problem. I have extended the original code, which can be found in the attachment.
l4t_35.1.patch (8.9 KB)
The same patch and camera driver work fine on Xavier with L4T 35.1.
So I don’t think some kind of “obvious misconfiguration” is causing this.

Another thing that needs to be mentioned is that different rtcpu firmware can cause huge differences while the linux and dtb parts remain the same.

  1. debug firmware you provided: short test OK, but it does not last long.
    For example, this time I used v4l2-ctl to acquire images at a camera frame rate of 108 fps. About 500 frames were acquired successfully, and then v4l2 got stuck. The error is reported as follows.
    new error log with debug firmware.txt (5.8 KB)

  2. JetPack 5.0.1 DP’s camera-rtcpu-t234-rce.img: OK. It works fine.

  3. JetPack 5.0.2’s camera-rtcpu-t234-rce.img: does not work even for a short time.

It looks like the camera-rtcpu-t234-rce.img in Jetpack version 5.0.2 is much more demanding and unstable.

Regarding serdes_pix_clk_hz, I removed this configuration and the problem remained completely unchanged.

Thank you for your patience.

hello xumm_td,

actually, the debug firmware is the same as JetPack 5.0.2’s camera-rtcpu-t234-rce.img, only with debug flags enabled.
I suspect it’s a timing issue. have you evaluated the initial programming time? please try adding set_mode_delay_ms to the mode settings. it configures the waiting time for the first frame after capture starts; the unit is milliseconds.

hello xumm_td,

in addition, may I know how many camera devices you have? did you enable all of them together?
according to the logs,
it looks like you have at least two cameras; the timestamps are valid, and one of them is outputting a frame-id to the RCE engine.

[  185.157568] [RCE] VI5: vi5_process_error_fifo: tag=0x0b ch=32 frame=65 ts=208282361152 data=0x00000549 ext_data=0x00000000
[  185.213446] [RCE] VI5: vi5_process_error_fifo: tag=0x02 ch=0 frame=65 ts=208337466432 data=0x00000225 ext_data=0x00000046
[  185.213453] [RCE] VI5: vi5_process_error_fifo: tag=0x03 ch=0 frame=0 ts=208337918336 data=0x00100000 ext_data=0x00000000
[  185.213459] [RCE] VI5: vi5_process_error_fifo: tag=0x0b ch=32 frame=75 ts=208374791360 data=0x00000549 ext_data=0x00000000
...
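
To see which errors dominate a longer dmesg capture, the lines can be grouped by tag, channel, and data word. This is a minimal sketch, assuming all entries follow the `vi5_process_error_fifo` layout shown above; it only counts raw values and does not map tag numbers to names, since that mapping is not given in this thread.

```python
import re
from collections import Counter

ERROR_FIFO = re.compile(
    r"vi5_process_error_fifo: tag=(0x[0-9a-fA-F]+) ch=(\d+) frame=(\d+) "
    r"ts=(\d+) data=(0x[0-9a-fA-F]+) ext_data=(0x[0-9a-fA-F]+)"
)

def parse_error_fifo(text):
    """Extract every error-FIFO entry from dmesg text as a dict of integers."""
    events = []
    for tag, ch, frame, ts, data, ext in ERROR_FIFO.findall(text):
        events.append({"tag": int(tag, 16), "ch": int(ch), "frame": int(frame),
                       "ts": int(ts), "data": int(data, 16),
                       "ext_data": int(ext, 16)})
    return events

# The four lines quoted above.
log = """
[  185.157568] [RCE] VI5: vi5_process_error_fifo: tag=0x0b ch=32 frame=65 ts=208282361152 data=0x00000549 ext_data=0x00000000
[  185.213446] [RCE] VI5: vi5_process_error_fifo: tag=0x02 ch=0 frame=65 ts=208337466432 data=0x00000225 ext_data=0x00000046
[  185.213453] [RCE] VI5: vi5_process_error_fifo: tag=0x03 ch=0 frame=0 ts=208337918336 data=0x00100000 ext_data=0x00000000
[  185.213459] [RCE] VI5: vi5_process_error_fifo: tag=0x0b ch=32 frame=75 ts=208374791360 data=0x00000549 ext_data=0x00000000
"""
events = parse_error_fifo(log)
counts = Counter((e["tag"], e["ch"], e["data"]) for e in events)
for (tag, ch, data), n in counts.most_common():
    print(f"tag={tag:#04x} ch={ch} data={data:#x} count={n}")
```

In this excerpt the entry with data 0x549 on ch=32 appears twice, while the other two entries are on ch=0.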

Hi @JerryChang ,
I set set_mode_delay_ms to 100 ms.
This time it lasts longer: about 15000 frames before getting stuck.
dmesg shows the same messages.

may I know how many camera devices you have? did you enable all of them together?

I have 6 cameras in the dts but am using only one camera now.

it looks like you have at least two cameras; the timestamps are valid, and one of them is outputting a frame-id to the RCE engine.

I don’t understand the meaning, could you please explain it to me?

Hi @JerryChang ,
I have uploaded a log covering the transition from receiving images properly to the point where something goes wrong. You can see there are three stages. I hope this helps your analysis.
Thanks!
log_from_good_to_err.txt (14.1 KB)

hello xumm_td,

may I also know your test commands?
furthermore,
please try running the commands below to boost all the VI/CSI/ISP clocks before testing your stream.

sudo su
echo 1 > /sys/kernel/debug/bpmp/debug/clk/vi/mrq_rate_locked
echo 1 > /sys/kernel/debug/bpmp/debug/clk/isp/mrq_rate_locked
echo 1 > /sys/kernel/debug/bpmp/debug/clk/nvcsi/mrq_rate_locked

cat /sys/kernel/debug/bpmp/debug/clk/vi/max_rate | tee /sys/kernel/debug/bpmp/debug/clk/vi/rate
cat /sys/kernel/debug/bpmp/debug/clk/isp/max_rate | tee /sys/kernel/debug/bpmp/debug/clk/isp/rate
cat /sys/kernel/debug/bpmp/debug/clk/nvcsi/max_rate | tee /sys/kernel/debug/bpmp/debug/clk/nvcsi/rate
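
If the boost has to be repeated before every test run, the same steps can be wrapped in a small script. This is a sketch only, assuming the debugfs files behave as in the commands above and that the script runs as root; the function name `boost_clocks` and the `root` parameter are hypothetical. It locks and sets each clock in turn rather than locking all three first, which should be equivalent.

```python
from pathlib import Path

# debugfs directory used by the shell commands above
CLK_ROOT = Path("/sys/kernel/debug/bpmp/debug/clk")

def boost_clocks(root=CLK_ROOT, clocks=("vi", "isp", "nvcsi")):
    """Lock each clock and set its rate to max_rate; return {clock: rate}."""
    applied = {}
    for name in clocks:
        clk = Path(root) / name
        (clk / "mrq_rate_locked").write_text("1")          # echo 1 > mrq_rate_locked
        max_rate = (clk / "max_rate").read_text().strip()  # cat max_rate
        (clk / "rate").write_text(max_rate)                # | tee rate
        applied[name] = int(max_rate)
    return applied

# On the target, as root:  print(boost_clocks())
```

The returned dictionary makes it easy to record in a test log which rates were actually applied.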

Hi @JerryChang ,
My script is :

export WIDTH=1280
export HEIGHT=1024
v4l2-ctl --set-ctrl frame_rate=108
v4l2-ctl --set-fmt-video=width=$WIDTH,height=$HEIGHT,pixelformat=GREY --stream-mmap --stream-count=-1 --stream-to=/dev/null
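
As a rough cross-check of this mode against the dts values, the active-pixel payload can be computed from the numbers in the script. This is a back-of-the-envelope sketch, assuming 8 bits per pixel and 2 lanes as in the dts; it ignores blanking and CSI packet overhead, so the real link rate is higher than the figure it prints.

```python
def active_pixel_rate(width, height, fps):
    """Pixels per second carried by the active image area only."""
    return width * height * fps

WIDTH, HEIGHT, FPS = 1280, 1024, 108
BITS_PER_PIXEL, NUM_LANES = 8, 2      # csi_pixel_bit_depth and num_lanes in the dts

rate = active_pixel_rate(WIDTH, HEIGHT, FPS)
per_lane_bps = rate * BITS_PER_PIXEL // NUM_LANES
print(rate, per_lane_bps)
```

The active area alone accounts for about 141.6 Mpixel/s, or roughly 566 Mbps per lane, which sits below the 300M pix_clk_hz and 1.2 Gbps figures quoted earlier in the thread.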

I have now boosted the clocks. With the debug version of the firmware, I tested for 15 minutes and it is still running fairly stably. So far so good.

In addition, stable image acquisition can also be achieved with the RELEASE version of the firmware.

Hi @JerryChang ,
Any progress on this issue please? Looking forward to your reply.

hello xumm_td,

this should be due to a sensor timing issue, since you were able to improve stability by using set_mode_delay_ms and by increasing the system clock frequencies.

Hi @JerryChang ,
I understand what you mean. From the available logs, is it possible to see exactly what is wrong with the MIPI signal? Is there any information that hints at the cause?
Right now this camera is running on both Nano and Xavier with no problems.

hello xumm_td,

in the logs you attached with the debug firmware, you can compare the working and non-working cases.
we need to analyze the logs to understand the problem; they do not always show the problem clearly, since the input is the signal itself.