V4L2 latency on NVIDIA Jetson TX2

Hello,

We have a client using an NVIDIA Jetson TX2 with a camera connected via CSI. They are reporting more than 200ms of glass-to-glass latency, and we need help finding its source. The complete setup is as follows:

  • The camera is a Sony 4k@30FPS (EV7520A)
  • LT6211UX datasheet
  • The Jetson TX2 has BSP 32.7.2
  • They use a simple GStreamer pipeline to capture and show the camera image:
gst-launch-1.0 -v \
    nvv4l2camerasrc \
        device="/dev/video0" \
    ! capsfilter \
        caps="video/x-raw(memory:NVMM),format=UYVY,width=3840,height=2160" \
    ! nvvidconv \
    ! xvimagesink \
        sync=false \
    ;

We measured the latency introduced by the GStreamer pipeline, and it’s around 23ms.

Later, we also tested with a simple C application that only uses the V4L2 API, and the measured latency is almost the same as with the GStreamer pipeline. This leads us to think that the latency is mainly located in the V4L2 layer or below.
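For reference, one way such a V4L2-only test can estimate how old a frame is when it reaches user space is to compare the buffer timestamp with the current time. The sketch below is not the application we used; it assumes the driver stamps buffers with CLOCK_MONOTONIC (V4L2_BUF_FLAG_TIMESTAMP_MONOTONIC set in the buffer flags), and the helper names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/time.h>
#include <time.h>

/* Convert a V4L2 buffer timestamp (struct timeval) to microseconds. */
static int64_t timeval_to_us(struct timeval tv)
{
    assert(tv.tv_usec >= 0 && tv.tv_usec < 1000000);
    return (int64_t)tv.tv_sec * 1000000LL + (int64_t)tv.tv_usec;
}

/* Convert a clock_gettime() result to microseconds. */
static int64_t timespec_to_us(struct timespec ts)
{
    return (int64_t)ts.tv_sec * 1000000LL + (int64_t)ts.tv_nsec / 1000;
}

/*
 * Age of a frame in microseconds at the moment user space dequeues it.
 * Assumption: buf_ts and now come from the same clock (CLOCK_MONOTONIC).
 * A negative result suggests the clocks do not match.
 */
static int64_t frame_age_us(struct timeval buf_ts, struct timespec now)
{
    return timespec_to_us(now) - timeval_to_us(buf_ts);
}
```

In a capture loop this would be called right after VIDIOC_DQBUF returns, passing the timestamp field of the dequeued struct v4l2_buffer and the result of clock_gettime(CLOCK_MONOTONIC, &now). Which point of the frame the timestamp refers to (start or end of frame) depends on the driver, so the value is only a lower-level estimate and not a glass-to-glass figure.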

Furthermore, the VI kernel module always reports a “capture init latency” around 80ms:

[root]$ echo "file /home/pel/jetson/l4t-gcc/Linux_for_Tegra/source/public/kernel/nvidia/drivers/media/platform/tegra/camera/vi/channel.c +p" > /sys/kernel/debug/dynamic_debug/control
[root]$ echo "8 4 1 7" > /proc/sys/kernel/printk
$ gst-launch-1.0 nvv4l2camerasrc ! ...
$ dmesg | grep latency
[ 2531.240497] video4linux video0: free_ring_buffers: capture init latency is 79 ms

Here is the topology diagram reported by media-ctl:

Media controller API version 0.1.0

Media device information
------------------------
driver      	tegra-vi4
model       	NVIDIA Tegra Video Input Device
serial
bus info
hw revision 	0x3
driver version  0.0.0

Device topology
- entity 1: 150c0000.nvcsi--1 (2 pads, 2 links)
        	type V4L2 subdev subtype Unknown flags 0
        	device node name /dev/v4l-subdev0
    	pad0: Sink
            	<- "lt6211ux 2-002b":0 [ENABLED]
    	pad1: Source
            	-> "vi-output, lt6211ux 2-002b":0 [ENABLED]

- entity 4: lt6211ux 2-002b (1 pad, 1 link)
        	type V4L2 subdev subtype Unknown flags 0
        	device node name /dev/v4l-subdev1
    	pad0: Source
            	[fmt:UYVY8_1X16/3840x2160 field:none colorspace:srgb ycbcr:601 quantization:lim-range]
            	[dv.caps:BT.656/1120 min:160x120@25000000 max:3840x2160@297000000 stds:CEA-861,DMT,CVT caps:progressive,reduced-blanking,custom]
            	[dv.detect:BT.656/1120 3840x2160p30 (4400x2250) stds: flags:]
            	[dv.current:BT.656/1120 3840x2160p30 (4400x2250) stds:CEA-861 flags:can-reduce-fps,CE-video]
            	-> "150c0000.nvcsi--1":0 [ENABLED]

- entity 6: vi-output, lt6211ux 2-002b (1 pad, 1 link)
        	type Node subtype V4L flags 0
        	device node name /dev/video0
    	pad0: Sink
            	<- "150c0000.nvcsi--1":1 [ENABLED]

It’s also worth mentioning that, with v4l2-ctl -c low_latency_mode=1, the latency improves by between 30ms and 40ms.

We’re struggling to localize the source of this latency. Is there any way to determine how much latency is introduced by the kernel?

Regards,
Carlos.

May I know what your goal is? These look like the expected results for capture latency.

Yes, sure, sorry. The goal is to have less than 100ms latency with 4k@30FPS.

Please also see similar topics, for instance, Topic 161122.

Thank you, JerryChang. This thread and the ones referenced there contain handy information.

I’m not sure I fully understood where these 2-3 frames of “default latency” are stored. Are they stored along the Camera -> CSI -> VI -> User Space path? Is there any way to reduce this latency, or is it intrinsic to the TX2 behavior?

hello cfalgueras1,

there’s a queue that stores the images; please also see the VI driver,
for example, $public_source/kernel_src/kernel/nvidia/drivers/media/platform/tegra/camera/vi/channel.c

        /* release buffer N at N+2 frame start event */
        if (chan->num_buffers >= (chan->capture_queue_depth - 1))
                free_ring_buffers(chan, 1);
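
To put the comment above in numbers, here is a rough back-of-the-envelope sketch. It reads "release buffer N at N+2 frame start event" as meaning a buffer is held for about two frame periods before it is handed back; that reading is an assumption, not a measurement, and the function names are only illustrative.

```c
#include <assert.h>

/* Frame period in milliseconds for a given frame rate. */
static double frame_period_ms(double fps)
{
    assert(fps > 0.0);
    return 1000.0 / fps;
}

/*
 * Approximate time a buffer stays in the driver queue if it is released
 * `frames_held` frame-start events after its own frame start.
 * Assumption: frame-start events arrive exactly one frame period apart.
 */
static double hold_time_ms(unsigned frames_held, double fps)
{
    return (double)frames_held * frame_period_ms(fps);
}

/* What is left of a latency target once the queueing delay is subtracted. */
static double remaining_budget_ms(double target_ms, unsigned frames_held,
                                  double fps)
{
    return target_ms - hold_time_ms(frames_held, fps);
}
```

With the numbers from this thread, hold_time_ms(2, 30.0) is about 66.7 ms, so under this reading the queue alone would consume roughly two thirds of a 100 ms target and leave about 33 ms for the sensor, bridge, conversion, and display.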