tegra-vi4 15700000.vi: master error with 12 Megapixels, 23 fps, RAW12 sensor

Hello,

I have connected a MIPI CSI-2 12 Megapixels RAW12 camera sensor to my Jetson TX2, and it works perfectly with the following GStreamer pipeline, which does not use the ISP:

gst-launch-1.0 v4l2src device=/dev/video0 ! video/x-raw,format=GRAY8,height=3008,framerate=234/10 ! nvvidconv ! nvjpegenc ! multifilesink location=%05d.jpg max-files=5

Note: the sensor sends RAW12 pixels, but I have instructed the VI to truncate the pixels to 8 bits when asked to produce GRAY8.

However, if I try to use the ISP, with the following command

gst-launch-1.0 nvcamerasrc ! video/x-raw\(memory:NVMM\),width=4112,height=3008,framerate=1/1 ! fakesink

it stops immediately with the following kernel message :

[  101.330342] tegra-vi4 15700000.vi: master error

In the Parker Manual, Master Error is documented as follows :

MASTER_ERR_STATUS: This status bit is set whenever there is any kind of Error situation [ISPBUFA_ERR,
NOTIFY_FIFO_OVERFLOW, ATOMP_PACKER_FIFO_OVFL, CSIMUX_FIFO_OVFL, HOST_PKTINJECT_STALL_ERR]
that occurs in VI & the corresponding MASK bit is set in INTERRUPT_MASK register.

I tried ‘framerate=1/1’ above after my pipeline crashed at 23/1 fps, but changing the requested framerate does not change the speed at which the sensor sends each individual frame.

I have added the following debugging messages after the ‘master error’ message

diff --git a/drivers/video/tegra/host/vi/vi4.c b/drivers/video/tegra/host/vi/vi4.c
index dcaa529..3d16a3a 100644
--- a/drivers/video/tegra/host/vi/vi4.c
+++ b/drivers/video/tegra/host/vi/vi4.c
@@ -90,6 +90,11 @@ static irqreturn_t nvhost_vi4_error_isr(int irq, void *dev_id)
        if (r) {
                host1x_writel(pdev, VI_CFG_INTERRUPT_STATUS_0, 1);
                dev_err(&pdev->dev, "master error\n");
+               dev_err(&pdev->dev, "VI_CFG_INTERRUPT_STATUS_0 = %x\n", r);
+               if (r & 1) {
+                       dev_err(&pdev->dev, "VI_ISPBUFA_ERROR_0 = %x\n", host1x_readl(pdev, VI_ISPBUFA_ERROR_0));
+                       dev_err(&pdev->dev, "VI_NOTIFY_ERROR_0 = %x\n", host1x_readl(pdev, VI_NOTIFY_ERROR_0));
+               }
                atomic_inc(&vi->overflow);
        }

and I get the following info :

[  113.639740] tegra-vi4 15700000.vi: master error
[  113.644320] tegra-vi4 15700000.vi: VI_CFG_INTERRUPT_STATUS_0 = 3f000001
[  113.650995] tegra-vi4 15700000.vi: VI_ISPBUFA_ERROR_0 = 1
[  113.656462] tegra-vi4 15700000.vi: VI_NOTIFY_ERROR_0 = 0

In the Parker TRM, bit 0 of the VI_ISPBUFA_ERROR_0 register is documented as follows

FIFO_OVERFLOW: Set by Hardware when the ISPBUF's internal FIFO has overflowed. (Generally due
to clock speed mismatch b/w ISP and VI interfaces) Write 1 to clear. Also causes VI Master error.

Is there anything that can be done to solve that problem or is the TX2 ISP simply not able to accept a 12 Mpixels RAW12 image coming in at 4752 Mbps ?

@phdm
The user-space camera stack (nvcamerasrc/Argus) does not support the GRAY8 format.
You may need to fake it as a Bayer RGB format to try.

Sorry if it is not clear in my post. My sensor really sends Bayer RAW12. That’s the reason I need the ISP to handle my images.

GRAY8 is only a way to let v4l2src accept it, because v4l2src does not accept 12-bit Bayer or 12-bit gray images, and to prove that my driver works.

The problem that I have is thus really that the 23 fps 12 Megapixels 12-bit Bayer output of my sensor is not accepted by the ISP. I must also add that if I instruct my sensor to send on 4 channels instead of 8 channels, the framerate and the speed of individual images are lower, and I get 15 fps that the ISP does handle correctly.

The problem is “only” if I set my sensor to send on 8 channels instead of 4 channels, thus sending pixels at twice the speed.

Does the v4l2-ctl command work?

v4l2-ctl -d /dev/video0 --set-fmt-video=width=4112,height=3008 --set-ctrl bypass_mode=0 --stream-mmap --stream-count=1

Yes,

v4l2-ctl -d /dev/video0 --set-fmt-video=width=4112,height=3008 --set-ctrl bypass_mode=0 --stream-mmap --stream-count=1

It answers :

<

and if I let it run a little bit longer :

nvidia@cam5-phdm:/tmp$ time v4l2-ctl -d /dev/video0 --set-fmt-video=width=4112,height=3008 --set-ctrl bypass_mode=0 --stream-mmap --stream-count=690
<<<<<<<<<<<<<<<<<<<<<<<<< 23.07 fps
<<<<<<<<<<<<<<<<<<<<<<< 23.03 fps
<<<<<<<<<<<<<<<<<<<<<<< 23.02 fps
<<<<<<<<<<<<<<<<<<<<<<< 23.01 fps
<<<<<<<<<<<<<<<<<<<<<<< 23.01 fps
<<<<<<<<<<<<<<<<<<<<<<< 23.01 fps
<<<<<<<<<<<<<<<<<<<<<<< 23.01 fps
<<<<<<<<<<<<<<<<<<<<<<< 23.00 fps
<<<<<<<<<<<<<<<<<<<<<<< 23.00 fps
<<<<<<<<<<<<<<<<<<<<<<< 23.00 fps
<<<<<<<<<<<<<<<<<<<<<< 23.00 fps
<<<<<<<<<<<<<<<<<<<<<<<< 23.00 fps
<<<<<<<<<<<<<<<<<<<<<<< 23.00 fps
<<<<<<<<<<<<<<<<<<<<<<< 23.00 fps
<<<<<<<<<<<<<<<<<<<<<<< 23.00 fps
<<<<<<<<<<<<<<<<<<<<<<< 23.00 fps
<<<<<<<<<<<<<<<<<<<<<<< 23.00 fps
<<<<<<<<<<<<<<<<<<<<<<< 23.00 fps
<<<<<<<<<<<<<<<<<<<<<<< 23.00 fps
<<<<<<<<<<<<<<<<<<<<<<< 23.00 fps
<<<<<<<<<<<<<<<<<<<<<<< 23.00 fps
<<<<<<<<<<<<<<<<<<<<<<< 23.00 fps
<<<<<<<<<<<<<<<<<<<<<<< 23.00 fps
<<<<<<<<<<<<<<<<<<<<<<< 23.00 fps
<<<<<<<<<<<<<<<<<<<<<<< 23.00 fps
<<<<<<<<<<<<<<<<<<<<<<< 23.00 fps
<<<<<<<<<<<<<<<<<<<<<<< 23.00 fps
<<<<<<<<<<<<<<<<<<<<<<< 23.00 fps
<<<<<<<<<<<<<<<<<<<<<<< 23.00 fps
<<<<<<<<<<<<<<<<<<<<<

real    0m31.231s
user    0m0.024s
sys     0m0.384s
nvidia@cam5-phdm:/tmp$

Try boosting the VI/CSI clocks and running jetson_clocks.

How can I boost the vi/csi clock ?

“sudo ./jetson_clocks.sh” alone does not help

Have a look at the link below:

https://elinux.org/Jetson_TX2_Camera_BringUp

Which value should I try for max_rate ?

The command below should print the maximum value. Write that value to the rate file.
cat /sys/kernel/debug/bpmp/debug/clk/vi/max_rate

Thank you; that works. I have done

nvidia@cam5-phdm:~$ sudo su
root@cam5-phdm:/home/nvidia# echo 1 > /sys/kernel/debug/bpmp/debug/clk/vi/mrq_rate_locked
root@cam5-phdm:/home/nvidia# echo 1 > /sys/kernel/debug/bpmp/debug/clk/isp/mrq_rate_locked
root@cam5-phdm:/home/nvidia# echo 1 > /sys/kernel/debug/bpmp/debug/clk/nvcsi/mrq_rate_locked
root@cam5-phdm:/home/nvidia# cat /sys/kernel/debug/bpmp/debug/clk/vi/max_rate
1036800000
root@cam5-phdm:/home/nvidia# cat /sys/kernel/debug/bpmp/debug/clk/isp/max_rate
1126400000
root@cam5-phdm:/home/nvidia# cat /sys/kernel/debug/bpmp/debug/clk/nvcsi/max_rate
225000000
root@cam5-phdm:/home/nvidia# echo 1036800000 > /sys/kernel/debug/bpmp/debug/clk/vi/rate
root@cam5-phdm:/home/nvidia# echo 1036800000 > /sys/kernel/debug/bpmp/debug/clk/isp/rate
root@cam5-phdm:/home/nvidia# echo 1036800000 > /sys/kernel/debug/bpmp/debug/clk/nvcsi/rate
nvidia@cam5-phdm:~$
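
The same steps can be wrapped in a small script. This is only a sketch based on the debugfs files used in the session above; unlike that session, it writes each clock's own max_rate to its rate file instead of the VI value for all three. CLK_ROOT is a hypothetical override added here so the functions can be tried on a scratch directory first.

```shell
#!/bin/sh
# Sketch: lock the vi, isp and nvcsi clocks and set each to its own max_rate.
# CLK_ROOT is a hypothetical override (not part of L4T) for dry runs.
CLK_ROOT="${CLK_ROOT:-/sys/kernel/debug/bpmp/debug/clk}"

boost_clk() {
    clk_dir="$CLK_ROOT/$1"
    # Lock the clock rate, as done by hand in the session above
    echo 1 > "$clk_dir/mrq_rate_locked" || return 1
    # Read the maximum rate reported for this clock and apply it
    max=$(cat "$clk_dir/max_rate") || return 1
    echo "$max" > "$clk_dir/rate"
}

boost_all() {
    for clk in vi isp nvcsi; do
        boost_clk "$clk" || return 1
    done
}
```

The block only defines the functions; calling `boost_all` as root applies the settings. They do not survive a reboot, so the call would have to be repeated at every boot.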

Is it safe to work that way? This is for industrial cameras that must run continuously, 24/7.

Is it possible to set that up in the device-tree ?

Try increasing pix_clk_hz in the device tree, without boosting the VI/CSI clocks.
Or, if you are working on r32.x, add “serdes_pix_clk_hz=value higher than sensor pixel clocks”

What’s the actual meaning/usage of ‘pix_clk_hz’ by the drivers or nvcamera-daemon ? I do not see it used even indirectly by the csi or vi drivers, but maybe I missed it.

Currently I have it set at 74250000.

I am working with 28.2.1 currently.

Increasing pix_clk_hz in the device tree will change the VI or CSI clock, which should help in your case.

I have replaced

pix_clk_hz = "74250000";

by

pix_clk_hz = "160000000";

So pix_clk_hz is more than doubled, but that does not make my pipeline work. It fails with the kernel saying as before

[   82.967383] tegra-vi4 15700000.vi: master error
[   82.971949] tegra-vi4 15700000.vi: VI_CFG_INTERRUPT_STATUS_0 = 3f000001
[   82.978583] tegra-vi4 15700000.vi: VI_ISPBUFA_ERROR_0 = 1
[   82.983993] tegra-vi4 15700000.vi: VI_NOTIFY_ERROR_0 = 0

Which value would you suggest me to try for pix_clk_hz ?

First edit :
Sorry. I probably made a mistake while testing.

Testing a second time, it appears to work. Thank you for your fast and useful answers.

Second edit :

On the third, fourth and all later tests, it never worked again.

Which value should I put in “pix_clk_hz” if my sensor sends its images on four lanes at 1188 Mbps on each lane ?

Third edit :

I have now set

pix_clk_hz = "300000000";

and I can now use my camera reliably, but I’d still like to know what pix_clk_hz represents. Is “pix_clk_hz” a bit, byte or pixel rate, and is it measured on one lane or on the total of all lanes ?
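
For what it is worth, the lane figures quoted above can be turned into a pixel rate under one possible reading of pix_clk_hz: pixels per second, summed over all lanes. That reading is an assumption and is not confirmed anywhere in this thread; the variable names are only illustrative.

```shell
#!/bin/sh
# Assumption: pix_clk_hz counts pixels per second over all lanes together.
lane_mbps=1188      # per-lane bit rate quoted in the thread
lanes=4             # number of CSI-2 lanes
bits_per_pixel=12   # RAW12

# Total bit rate of the link, then the number of 12-bit pixels that fit in it
link_bps=$((lane_mbps * 1000000 * lanes))
link_pixels_per_s=$((link_bps / bits_per_pixel))

echo "link rate:  $link_bps bit/s"
echo "pixel rate: $link_pixels_per_s pixel/s"
```

Under that assumption the link carries at most 396000000 pixels per second while data is being sent, which is above the 300000000 that works here.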

Just a side note. There is a typo in that page : {max_rate} should be {max_rate} (3 times)

Good catch. I am going to correct it.

Hi ShaneCCC,

as I am not sure you have seen my question at the end of the previous comment https://devtalk.nvidia.com/default/topic/1062760/jetson-tx2/tegra-vi4-15700000-vi-master-error-with-12-megapixels-23-fps-raw12-sensor/post/5382064/#5382064, let me rephrase it here : what are the units of “pix_clk_hz” : is it bits/s, bytes/s, or pixels/s, and is it measured on one mipi csi-2 lane or on the total of all lanes of the sensor?

You have to consult the sensor vendor for this detail.
In general, the pixel clock is roughly width * height * framerate * byteperpixel

If I use your formula for my sensor, I get :

4112 * 3008 * 23.4 * 3/2 = 434148249 = 434 MegaByte/s

If I surmise you meant bit instead of byte, I get

4112 * 3008 * 23.4 * 12 = 3473185996 = 3.473 Gigabit/s

If I divide those values by the number of lanes (4), I get

109 MegaByte/s/lane

and

868 MegaBit/s/lane

I remember that setting “pix_clk_hz” to 200000000 = 200 MHz did not work, but 300000000 = 300 MHz did work. None of the numbers computed above lies between 200 and 300 million, so I still do not understand which value I should put there for other sensors.

To me, it seems that “pix_clk_hz” is actually roughly (a little bit more than) the number of pixels sent by the sensor in one second, i.e.

pix_clk_hz = width * height * framerate

With my data, that gives

4112 * 3008 * 23.4 = 289432166.4 = 289.4 MegaPixel/s which is a number between 200 MHz and 300 MHz.

I applied the same rough calculation for another sensor where I got 360 MHz and that made that sensor work too.
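
The rough calculation above can be scripted so it is easy to repeat for other sensors. This is only a sketch of the width * height * framerate estimate; the function name pixel_rate is made up for illustration. The value it prints is a lower bound to round up from, not a vendor-confirmed pixel clock.

```shell
#!/bin/sh
# pixel_rate WIDTH HEIGHT FPS
# Prints width * height * fps, rounded up to the next integer (pixels per second).
pixel_rate() {
    awk -v w="$1" -v h="$2" -v f="$3" 'BEGIN {
        r = w * h * f
        i = int(r)
        if (i < r) i++          # round up: the setting must not be below the real rate
        printf "%d\n", i
    }'
}

pixel_rate 4112 3008 23.4      # prints 289432167
```

Choosing a round value somewhat above the printed number, such as the 300000000 used for this sensor, leaves a little margin.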