Tegra-capture-vi: corr_err, status 14 indicates issue with the Falcon

Hi,

I am posting again in relation to an older post of mine here. I was not able to work on this issue for a while: my original blocking issue was solved by the multi-cam race condition patches, and this one did not seem as important at the time. Now, however, it is making our userspace library very hard to bring up, because I see i2c_read failures to the cameras during periods of error spam such as

[ 54.572018] tegra-camrtc-capture-vi tegra-capture-vi: corr_err: discarding frame 0, flags: 0, err_data 131072, status 14

Similar to my original post, the status specifies CAPTURE_STATUS_FALCON_ERROR, and I have confirmed with our suppliers that the embedded_data_height is set correctly in our device tree.
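
For reference, the err_data value in that log line is easier to read as a bitmask. The sketch below is a small hypothetical helper (the name decode_err_data is not part of any NVIDIA tool) that prints the value in hex and lists the bit positions that are set. Which notification each bit stands for depends on the VI firmware definitions and is not derived here.

```shell
# Hypothetical helper: show an err_data value in hex and list its set bits.
decode_err_data() {
    value=$1
    printf 'err_data=%d (0x%x), bits set:' "$value" "$value"
    bit=0
    while [ "$value" -gt 0 ]; do
        # Test the lowest bit, then shift right to examine the next one.
        if [ $((value & 1)) -eq 1 ]; then
            printf ' %d' "$bit"
        fi
        value=$((value >> 1))
        bit=$((bit + 1))
    done
    printf '\n'
}

decode_err_data 131072
```

For the value in the log this reports a single set bit, bit 17 (0x20000).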

It is potentially unrelated, but this caught my eye in dmesg and I thought I should also leave it here:

[   59.177949] falcon 154c0000.nvenc: Direct firmware load for nvhost_nvenc080.fw failed with error -2
[   59.178301] falcon 154c0000.nvenc: Falling back to sysfs fallback for: nvhost_nvenc080.fw
[   59.181116] falcon 154c0000.nvenc: looking for firmware in subdirectory

The bring-up issue does not always happen, and when it does, I can view the streams, but there is visible tearing that does not go away.

Also, setting the clocks to max_rate with

echo 1 > /sys/kernel/debug/bpmp/debug/clk/vi/mrq_rate_locked
echo 1 > /sys/kernel/debug/bpmp/debug/clk/isp/mrq_rate_locked
echo 1 > /sys/kernel/debug/bpmp/debug/clk/nvcsi/mrq_rate_locked
echo 1 > /sys/kernel/debug/bpmp/debug/clk/emc/mrq_rate_locked
cat /sys/kernel/debug/bpmp/debug/clk/vi/max_rate | tee /sys/kernel/debug/bpmp/debug/clk/vi/rate
cat /sys/kernel/debug/bpmp/debug/clk/isp/max_rate | tee /sys/kernel/debug/bpmp/debug/clk/isp/rate
cat /sys/kernel/debug/bpmp/debug/clk/nvcsi/max_rate | tee /sys/kernel/debug/bpmp/debug/clk/nvcsi/rate
cat /sys/kernel/debug/bpmp/debug/clk/emc/max_rate | tee /sys/kernel/debug/bpmp/debug/clk/emc/rate

does not remove the issue.
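
The clock commands above can be folded into one loop that also reads each rate back, which makes it easy to confirm that the lock actually took effect. This is only a sketch: lock_clocks_max is a hypothetical function name, the base directory is passed in as an argument, and on the target it has to run as root.

```shell
# Hypothetical helper: lock each named clock and set it to its max_rate,
# then print the rate that the debugfs node reports afterwards.
lock_clocks_max() {
    base=$1
    shift
    for clk in "$@"; do
        dir="$base/$clk"
        echo 1 > "$dir/mrq_rate_locked"
        cat "$dir/max_rate" > "$dir/rate"
        printf '%s: rate=%s max_rate=%s\n' \
            "$clk" "$(cat "$dir/rate")" "$(cat "$dir/max_rate")"
    done
}

# Usage on the target (as root):
# lock_clocks_max /sys/kernel/debug/bpmp/debug/clk vi isp nvcsi emc
```

If a reported rate differs from max_rate, the write did not stick and the clock results above would not be conclusive.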

This tearing is always accompanied by a spam of

[  358.753083] tegra-camrtc-capture-vi tegra-capture-vi: corr_err: discarding frame 0, flags: 0, err_data 131072, status 14
[  358.756056] tegra-camrtc-capture-vi tegra-capture-vi: corr_err: discarding frame 0, flags: 0, err_data 131072, status 14
[  358.756142] tegra-camrtc-capture-vi tegra-capture-vi: corr_err: discarding frame 0, flags: 0, err_data 131072, status 14
[  358.819751] tegra-camrtc-capture-vi tegra-capture-vi: corr_err: discarding frame 0, flags: 0, err_data 131072, status 14
[  358.819758] tegra-camrtc-capture-vi tegra-capture-vi: corr_err: discarding frame 0, flags: 0, err_data 131072, status 14
[  358.822738] tegra-camrtc-capture-vi tegra-capture-vi: corr_err: discarding frame 0, flags: 0, err_data 131072, status 14
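
To see whether every discarded frame carries the same status and err_data, the spam can be collapsed into counts. The sketch below assumes the log lines keep the exact "err_data N, status M" layout shown above. summarize_corr_err is a hypothetical helper name.

```shell
# Hypothetical helper: read kernel log text on stdin and count corr_err
# lines per (status, err_data) pair.
summarize_corr_err() {
    grep 'corr_err: discarding frame' |
        sed -n 's/.*err_data \([0-9]*\), status \([0-9]*\).*/status=\2 err_data=\1/p' |
        sort | uniq -c
}

# Usage on the target:
# dmesg | summarize_corr_err
```

A single output row means all the errors are of one kind. Several rows would suggest more than one failure mode.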

Here is the output of the trace

echo 1 > /sys/kernel/debug/tracing/tracing_on
echo 30720 > /sys/kernel/debug/tracing/buffer_size_kb
echo 1 > /sys/kernel/debug/tracing/events/tegra_rtcpu/enable
echo 1 > /sys/kernel/debug/tracing/events/freertos/enable
echo 3 > /sys/kernel/debug/camrtc/log-level
echo 1 > /sys/kernel/debug/tracing/events/camera_common/enable
echo > /sys/kernel/debug/tracing/trace

v4l2-ctl --stream-mmap -c bypass_mode=0

cat /sys/kernel/debug/tracing/trace > trace.txt

trace.txt (5.2 MB)

Here is a recording of the streams

I have seen other posts mention updated Falcon firmware, with the fix packaged into a .img for the flashing process. For that reason I should mention that the images I flash are generated using the sample rootfs and flashed through the OTA update process. All images are regenerated, so I do not flash any of the pre-generated images shipped in the BSP sources.

Thank you for any help; it is always greatly appreciated.

The CHANSEL_SHORT_FRAME in the trace indicates that the output frame size is smaller than expected.
The sensor driver may be writing an incorrect register value, which would cause the problem.
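
One way to check how often this event shows up relative to other channel-selector events is to count the CHANSEL_* names in the captured trace. The sketch assumes the event names appear in trace.txt as literal CHANSEL_ tokens, as the name quoted above suggests. count_chansel_events is a hypothetical helper name.

```shell
# Hypothetical helper: list each distinct CHANSEL_* token found in a trace
# file with its number of occurrences, most frequent first.
count_chansel_events() {
    grep -o 'CHANSEL_[A-Z_]*' "$1" | sort | uniq -c | sort -rn
}

# Usage:
# count_chansel_events trace.txt
```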

Thanks for the input @ShaneCCC, I’ll go through the driver and let you know.

Are there any specific registers, common to most cameras, that could cause this? I just want an idea of what I should be looking for.

Each sensor has different registers.