CSI video input problem (PXL_SOF, trace, no rtcpu information)

We are trying to port a working CSI video capture driver to the L4T 32.1/Linux 4.9 platform.
The driver works for L4T 28.1. The module and carrier board work with the older BSP
and driver. Everything seems to try to start (csi4_phy configured, csi4_stream_init…)
but all I get are PXL_SOF syncpt timeout (50) ! err = -11.

I followed the instructions for enabling tracing from here:


But all I see in the trace are queue peek failures and queue send failures.
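Roughly, the commands from those instructions (debugfs paths as commonly given for rtcpu tracing; they may vary by L4T release) are:

```shell
echo 1 > /sys/kernel/debug/tracing/tracing_on
echo 30720 > /sys/kernel/debug/tracing/buffer_size_kb
echo 1 > /sys/kernel/debug/tracing/events/tegra_rtcpu/enable
echo 1 > /sys/kernel/debug/tracing/events/freertos/enable
echo 2 > /sys/kernel/debug/camrtc/log-level
echo 1 > /sys/kernel/debug/tracing/events/camera_common/enable
echo > /sys/kernel/debug/tracing/trace
cat /sys/kernel/debug/tracing/trace
```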

kworker/3:0-30    [003] ....   594.213318: rtos_queue_send_failed: tstamp:18949709597 queue:0x0b4a7258
kworker/3:0-30    [003] ....   594.213325: rtos_queue_send_failed: tstamp:18949715375 queue:0x0b4a7258
kworker/3:0-30    [003] ....   594.377204: rtos_queue_peek_from_isr_failed: tstamp:18954529987 queue:0x0b4b4500
kworker/3:0-30    [003] ....   594.497238: rtos_queue_peek_from_isr_failed: tstamp:18959529984 queue:0x0b4b4500
kworker/3:0-30    [003] ....   594.665233: rtos_queue_peek_from_isr_failed: tstamp:18964530178 queue:0x0b4b4500

I think there is something not started or running on the TX2. Shouldn’t there be
messages like this:

kworker/5:0-5800  [005] ...1   502.381391: rtcpu_start: tstamp:16014585622
kworker/5:0-5800  [005] ...1   503.270835: rtcpu_vinotify_handle_msg: tstamp:16041133487 tag:ATOMP_FS channel:0x00 frame:0 vi_tstamp:3156230914 data:0x00000000
kworker/5:0-5800  [005] ...1   503.270837: rtcpu_vinotify_handle_msg: tstamp:16041133645 tag:CHANSEL_FAULT channel:0x00 frame:0 vi_tstamp:3156231234 data:0x00010800
kworker/5:0-5800  [005] ...1   503.270839: rtcpu_vinotify_handle_msg: tstamp:16041139085 tag:CHANSEL_FAULT channel:0x00 frame:0 vi_tstamp:3156236769 data:0x00000801
kworker/5:0-5800  [005] ...1   503.270841: rtcpu_vinotify_handle_msg: tstamp:16042130809 tag:CHANSEL_SHORT_FRAME channel:0x01 frame:0 vi_tstamp:3157228305 data:0x00000001
kworker/5:0-5800  [005] ...1   503.270842: rtcpu_vinotify_handle_msg: tstamp:16042130975 tag:ATOMP_FE channel:0x00 frame:0 vi_tstamp:3157228308 data:0x00000000

I’m not seeing these at all.


Have a check of the link below.
Possibly this is related to embedded data, or a short frame where the sensor output height is not what is expected.

Thank you, this is very helpful.

In looking at the vi4 capture register configuration, it looks like the capture line
length differs between the two releases (L4T 28_1 and L4T 32_1): 1536 and 1440 respectively.
This is for the register named ATOM__SURFACE_STRIDE0. The input data is 720p composite,
so 1440 is the number of pixels × 2 bytes; 1536 appears to be that value rounded up to a
multiple of 256.

This does not seem to come from line_length in the mode in the device tree, but is rather
calculated from the w and h and pixel format in channel.c/tegra_channel_fmt_align.

The underlying reason that different values are chosen is that the value of
TEGRA_STRIDE_ALIGNMENT differs between the two releases.

source/kernel/nvidia/include/media/tegra_camera_core.h:#define TEGRA_STRIDE_ALIGNMENT 1
source/kernel/kernel-4.4/drivers/media/platform/tegra/camera/vi/core.h:#define TEGRA_STRIDE_ALIGNMENT 256

I have not yet been able to try the 32_1 version with the modified value, so
I really don’t know if this is the problem.

Any insight would be appreciated. I will provide more information when available.


One of the biggest differences with releases starting with R31 vs. R28 is the handling of some of the clocks. In R28 many of the clocks are set to the max. In the newer releases the clocks are scaled to only what is needed to reduce power consumption.

Your trace shows that the system isn’t getting any frames. This is probably due to the clock settings in the device tree for your camera driver. You could max out the clocks to see if this is the problem. Make sure that your pix_clk_hz value is set correctly in your device tree source. If you are using SerDes hardware between the image sensor and the Jetson use serdes_pix_clk. This is a new setting that was added to correct a situation where the image sensor pixel clock differs from the pixel clock coming off of a deserializer.
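For example, a hypothetical device tree mode fragment (the clock value is a placeholder; use your sensor's real rate):

```dts
mode0 {
        /* ... other mode properties ... */
        pix_clk_hz = "74250000";  /* placeholder pixel clock in Hz */
};
```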

The following script should max out the relevant clocks. Try running this before starting a stream. If you get frames take a close look at the clock rate settings in your device tree source.


[ "$UID" != "0" ] && echo 'Root access is required.' && exit 1

    local BASE="/sys/kernel/debug/bpmp/debug/clk"
    local name="$1"
    local clock_base="${BASE}/${name}"

    echo $(cat "${clock_base}/max_rate") > "${clock_base}/rate"
    echo 1 > "${clock_base}/mrq_rate_locked"
    echo -n "${name} clock rate is:"
    cat "${clock_base}/rate"

for clock in vi isp emc nvcsi; do
    max-clock "$clock"

Thank you for the clocks suggestion. We will probably investigate this further when we get
to the performance and video quality testing of the new system.

I am able to capture composite video now, there were two major problems.

Note that with 32_1 I was able to implement the camera driver as a loadable
kernel module, something that wasn’t possible with L4T 28_1. This saved
some time as the 4.9 kernel build, even with a few changes, is more time consuming
than the 4.4 kernel was.

  1. I needed to get all the pixel format entries correct for the YUYV 16-bit
    video format. This required patching several kernel files; I believe these
    are the three places where changes were necessary.

a) in nvidia/drivers/media/platform/tegra/camera/camera_common.c, add the following
entry to struct camera_common_colorfmt camera_common_color_fmts[]
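Assuming the stock three-field layout of struct camera_common_colorfmt in this release (media-bus code, colorspace, V4L2 pixel format), the entry would look something like:

```c
{
	MEDIA_BUS_FMT_YUYV8_1X16,
	V4L2_COLORSPACE_SRGB,
	V4L2_PIX_FMT_YUYV,
},
```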


b) in nvidia/drivers/media/platform/tegra/camera/sensor_common.c, add a check for
the ‘yuyv’ pixel format in static int extract_pixel_format()

else if (strncmp(pixel_t, "yuyv", size) == 0)
		*format = V4L2_PIX_FMT_YUYV;

c) in nvidia/drivers/media/platform/tegra/camera/vi/vi4_formats.h, add an entry
to struct tegra_video_format vi4_video_formats for this format

TEGRA_VIDEO_FORMAT(YUV422, 16, YUYV8_1X16, 2, 1, T_Y8_U8__Y8_V8,
				YUV422_8, YUYV, "YUV 4:2:2 YUYV"),

The process by which the pixel information is assembled wasn’t 100% clear, and
not all lookups are checked, so the usual failure mode was an access violation
with a backtrace showing calls into the csi4/vi4/v4l2 format lookup operations.
Unfortunately I did not save a backtrace.

  2. I had the incorrect CSI-PHY type. I believe there was some confusion on the
    part of the hardware developer between what a CSI-PHY and a C-PHY was, so I had
    the device tree configured for CPHY when the actual device was a DPHY. Once this
    change was made, the rtcpu activity showed up in the trace buffer and video frames
    were available to the v4l2src gstreamer element.

I believe the underlying problem was the incorrect PHY config in the device tree, so
I will mark this as the solution.

Thanks for the help.


Just as a follow-up the final problem in getting all resolutions to work
was that the settle time was incorrect for some resolutions. This could
have been worked around by setting it in the device tree for each mode,
but the default calculated settle time worked for most modes. After
much digging the underlying problem was that the order of the modes in
the table in the driver did NOT match the mode numbers in the device tree.
One of the routines would search for the mode in one place, find the
incorrect mode number, go to the device tree, get the wrong resolution,
and calculate an incorrect settle time.

In summary, incorrect settle time can cause PXL_SOF timeouts with NO
vinotify entries in the trace messages.
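A hypothetical sketch of the lookup bug (made-up tables, not the actual driver code):

```python
# The driver's mode table and the device tree list the same modes in
# different orders, so an index found in one is wrong in the other.
driver_mode_table = ["1280x720", "720x480"]  # order in the driver source
dt_modes = ["720x480", "1280x720"]           # mode0, mode1 in the device tree

def resolution_used_for_timing(resolution):
    # The driver finds the mode's index in its own table...
    idx = driver_mode_table.index(resolution)
    # ...then reads the resolution (and thus the timing inputs) from the
    # device tree entry with that number, which is a different mode here.
    return dt_modes[idx]

# Asking for 1280x720 timing actually reads the 720x480 device tree mode,
# so the calculated settle time is wrong for the stream being captured.
print(resolution_used_for_timing("1280x720"))  # → 720x480
```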


Thanks for taking the time to send the update. This may be very helpful to me and others in the future.