Can't stream with V4L2 without overriding nvcsi clock

I have developed a driver for the VD16GZ sensor and corresponding device tree. In order to stream images, I have to run these commands first:

# echo 1 > /sys/kernel/debug/bpmp/debug/clk/nvcsi/mrq_rate_locked
# cat /sys/kernel/debug/bpmp/debug/clk/nvcsi/max_rate | tee /sys/kernel/debug/bpmp/debug/clk/nvcsi/rate

What should I change so this workaround is no longer necessary? Here is my device tree (renamed with .txt suffixes for the forum):
tegra234-camera-vd16gz-devkit-PCB1706.dtsi (3.8 KB)
vd16gz-node.dtsi (6.3 KB)

The camera is sending 1124x1364 RAW10 frames over 2 D-PHY lanes at 1.022 Gbps per lane, with no additional data (no embedded lines or motion vectors). Line length is 1236 and pixel clock is 160.8 MHz. It’s an RGBIR sensor, so the V4L2 ecosystem doesn’t have a defined Bayer pattern for it, but I don’t think that’s relevant. Let me know if you need any more information about the data from the sensor.
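For reference, the figures in the paragraph above can be turned into a pixel rate on the CSI link. This is only arithmetic on the numbers quoted in the post; whether the link rate or the sensor's internal pixel clock is the figure the receiver clock has to keep up with is an assumption that the thread does not settle, and the variable names are illustrative.

```python
# Figures quoted in the post above.
LANES = 2
LANE_RATE_BPS = 1.022e9       # per-lane D-PHY rate
BITS_PER_PIXEL = 10           # RAW10
SENSOR_PIX_CLK_HZ = 160.8e6   # pixel clock reported by the sensor

# Pixels per second carried by the link while a line is being transmitted.
link_pixel_rate_hz = LANES * LANE_RATE_BPS / BITS_PER_PIXEL

# How much faster the link bursts pixels than the sensor's pixel clock.
ratio = link_pixel_rate_hz / SENSOR_PIX_CLK_HZ

print(f"link pixel rate: {link_pixel_rate_hz / 1e6:.1f} Mpix/s")
print(f"link rate / sensor pixel clock: {ratio:.3f}")
```

With these inputs the link carries about 204.4 Mpix/s during a line, roughly 27% above the 160.8 MHz pixel clock.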

After I run those commands, streaming works. If I don’t, then I get these logs and no frames:

[   22.847267] vd16gz 9-0010: device_model_id: 0x5603, device_revision: 32,16, device_number: 0x0000000000000000e9021226459304c5
[   22.858927] vd16gz 9-0010: rom_revision: 3.7.0, ui_revision: 3.14, optical_revision: 0x06, fwpatch_revision: 2.28, vtpatch_id: 0x80110011
[   22.883027] bwmgr API not supported
[   25.568997] tegra-camrtc-capture-vi tegra-capture-vi: uncorr_err: request timed out after 2500 ms
[   25.581752] tegra-camrtc-capture-vi tegra-capture-vi: err_rec: attempting to reset the capture channel
[   25.592295] (NULL device *): vi_capture_control_message: NULL VI channel received
[   25.600020] t194-nvcsi 13e40000.host1x:nvcsi@15a00000: csi5_stream_close: Error in closing stream_id=1, csi_port=1
[   25.610676] (NULL device *): vi_capture_control_message: NULL VI channel received
[   25.618394] t194-nvcsi 13e40000.host1x:nvcsi@15a00000: csi5_stream_open: VI channel not found for stream- 1 vc- 0
[   25.629225] tegra-camrtc-capture-vi tegra-capture-vi: err_rec: successfully reset the capture channel

The first two lines are from my driver and are as expected. Judging from the code, I think the third line is expected on T234; it also shows up with the workaround applied, when streaming works.

I’m currently using an Orin Nano Developer Kit, a camera carrier board from ST, and a custom board to connect the two. I’m using JetPack 5.1.1. I have this problem with only one camera attached.

hello brian100,

it may be due to incorrect device tree settings.
please refer to the Sensor Pixel Clock section of the developer guide to examine your pixel_clk_hz property settings.

Yes, I have read that guide extensively. I believe I have it set correctly. The sensor has a readout for the current pixel clock, which reports 160800000 Hz. My device tree has pix_clk_hz = "160800000"; (not pixel_clk_hz, which doesn’t seem to work at all). Can you confirm that this is correct?

Any other parts of the device tree I should look at?

hello brian100,

yeah, that’s a typo, you should use pix_clk_hz in DT to assign sensor pixel clock.
since it works when you boost the nvcsi clock rate, could you please try configuring slightly higher pixel clock values for testing?

Yes, slightly higher values do work around it for testing. 160800000 (my actual pixel clock) results in a clock rate of 25211764, which never seems to work. 180800000 results in a clock rate of 28573333, which works sometimes. 200000000 results in a clock rate of 31360975, which seems to always work.
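The three measurements above can be compared as ratios of resulting nvcsi clock rate to the configured pix_clk_hz. The sketch below uses only the numbers reported in this post; it does not assume anything about how the driver derives the rate, and the small spread between ratios is simply what the measurements show.

```python
# (pix_clk_hz from the device tree) -> (resulting nvcsi clock rate), as reported above.
measurements = {
    160_800_000: 25_211_764,   # never works
    180_800_000: 28_573_333,   # works sometimes
    200_000_000: 31_360_975,   # seems to always work
}

# Ratio of the resulting nvcsi rate to the configured pixel clock.
ratios = [rate / pix_clk for pix_clk, rate in measurements.items()]

for (pix_clk, rate), r in zip(measurements.items(), ratios):
    print(f"pix_clk_hz={pix_clk:>11d}  nvcsi rate={rate:>10d}  ratio={r:.5f}")
```

All three ratios fall between 0.156 and 0.159, so in these measurements the nvcsi rate tracks pix_clk_hz almost proportionally.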

is it attenuation due to trace length?
please also review Table 10-4 for the signal routing requirements in the Jetson Orin NX Series and Orin Nano Series Design Guide.

How would changing the nvcsi clock affect the tolerance of signal integrity issues?

hello brian100,

from the software side, the camera software uses the sensor pixel clock to calculate the exposure and frame rate of the sensor.
so, it must be set correctly to avoid potential issues.

have you probed the CSI lines to measure the hardware signaling?

I’m still confused why you suspect signal integrity here. If I can make it work or not just by changing the nvcsi clock rate without touching the pixel clock value, that means all the camera settings stay identical, because the software that uses the pixel clock can’t tell the difference.

hello brian100,

please refer to the code snippet as below.

        err = read_property_u64(node, "pix_clk_hz", &val64);
        if (err) {
                dev_err(dev, "%s:pix_clk_hz property missing\n", __func__);
                return err;
        }
        signal->pixel_clock.val = val64;

        err = read_property_u64(node, "serdes_pix_clk_hz", &val64);
        if (err)
                signal->serdes_pixel_clock.val = 0;
        else
                signal->serdes_pixel_clock.val = val64;

        if (signal->serdes_pixel_clock.val != 0ULL) {
                if (signal->serdes_pixel_clock.val < signal->pixel_clock.val) {
                        dev_err(dev,
                                "%s: serdes_pix_clk_hz is lower than pix_clk_hz!\n",
                                __func__);
                        return -EINVAL;
                }
                rate = signal->serdes_pixel_clock.val;
        } else {
                rate = signal->pixel_clock.val;
        }

this is the clock configuration derived from your sensor pixel clock.
if you issue the commands to boost the NVCSI clock, this part is bypassed and the max clock is used directly.
since you have already worked around this by setting slightly higher values, that means there’s a divergence between the clock rates.
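The selection logic in the quoted snippet can be restated as a small model. This is a sketch of only the lines shown above, written in Python for illustration; `select_rate` is a hypothetical name, and what the driver does with `rate` afterwards is not visible in the snippet and is not modeled here.

```python
def select_rate(pix_clk_hz: int, serdes_pix_clk_hz: int = 0) -> int:
    """Mirror the quoted snippet: a nonzero serdes_pix_clk_hz takes precedence.

    A missing serdes_pix_clk_hz property is represented as 0, as in the snippet.
    """
    if serdes_pix_clk_hz != 0:
        if serdes_pix_clk_hz < pix_clk_hz:
            # The snippet returns -EINVAL in this case.
            raise ValueError("serdes_pix_clk_hz is lower than pix_clk_hz")
        return serdes_pix_clk_hz
    return pix_clk_hz


# Only pix_clk_hz set, as in the device tree described in this thread.
print(select_rate(160_800_000))
# Hypothetical: a higher serdes_pix_clk_hz alongside the unchanged pix_clk_hz.
print(select_rate(160_800_000, 200_000_000))
```

In the snippet as quoted, setting serdes_pix_clk_hz changes `rate` while leaving `signal->pixel_clock.val` untouched. Whether that property is appropriate for a sensor without a serializer/deserializer is not confirmed anywhere in this thread.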

Yes, I’ve been looking at that code. It seems like it doesn’t calculate the correct value in the end. Any advice on how to fix it? Or can you tell me if there are side effects to always forcing it to the max value?

hello brian100,

it depends on your use case.
it’s okay for evaluation, but always keeping the clock at maximum is unacceptable for a product.

you may configure a slightly higher value to resolve this.

If I increase the pix_clk_hz value, then it results in incorrect exposure settings. Is there some other place I can configure a higher value which won’t affect the exposure calculations?

may I know what the failure is?
have you also reviewed the exposure property settings, such as min_exp_time and max_exp_time, in the device tree?

imx185.c has this code:

        coarse_time_shs1 = mode->signal_properties.pixel_clock.val *
                val / mode->image_properties.line_length /

My driver does the same math to determine the number of line times the exposure should be. If I change pix_clk_hz in the device tree, won’t this calculate an incorrect number of line times, leading to exposures that don’t match what is written to the V4L2 control?
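The size of that mismatch can be estimated from the numbers already given in this thread. The sketch below assumes the exposure is expressed in seconds and uses the simplified relation lines = pixel clock × exposure / line length; it does not reproduce the truncated part of the imx185.c expression quoted above, and `exposure_lines` is a hypothetical helper name.

```python
LINE_LENGTH = 1236  # pixel clocks per line, from the first post


def exposure_lines(pix_clk_hz: float, exposure_s: float,
                   line_length: int = LINE_LENGTH) -> float:
    """Number of line times covering exposure_s at the given pixel clock."""
    return pix_clk_hz * exposure_s / line_length


requested_exposure_s = 0.010  # 10 ms, an arbitrary example value

# Line count computed with the real pixel clock.
true_lines = exposure_lines(160.8e6, requested_exposure_s)
# Line count computed if pix_clk_hz is raised to the value that "always works".
inflated_lines = exposure_lines(200e6, requested_exposure_s)

print(f"lines at 160.8 MHz: {true_lines:.1f}")
print(f"lines at 200.0 MHz: {inflated_lines:.1f}")
print(f"exposure error factor: {inflated_lines / true_lines:.3f}")
```

Under these assumptions, declaring 200 MHz while the sensor actually runs at 160.8 MHz programs about 24% more lines than intended, so the real exposure would be about 24% longer than the value written to the V4L2 control.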

that’s correct. as mentioned in comment #9, the sensor pixel clock must be set correctly to avoid potential issues.

The sensor pixel clock is set correctly, which results in the nvcsi clock rate being too low. You suggested increasing it here, but I don’t see how to do that without causing more problems.

then it means sensor pixel clock isn’t correct.