We’ve been using JetPack 4.6.4 on the Jetson Nano SoM for the past few months in an effort to reduce latency, without much success. Our application is in the healthcare industry, where low latency is critical.
Currently, we’re dealing with a camera setup that produces frames at 400x400 resolution, 30 frames per second, with a glass-to-glass latency of approximately 130ms.
Here’s a breakdown of our latency issues:
Camera Latency: Around 33ms is attributed to the camera itself as the camera driver handles most ISP operations.
Source-level Latency: We’ve observed 30ms to 60ms latency at the source level through experiments.
Pipeline Latency: The rest of the latency is incurred within our pipeline, involving transformations like scaling, rotation, and overlay.
We’ve also compared our setup with an FPGA-based solution, which demonstrated lower latency.
In attempts to mitigate the issue, we’ve already experimented with the following:
Adjusting QUEUE_BUFFERS to 2.
Setting low_latency_mode to 1 via v4l2-ctl --set-ctrl (a userspace sketch of the same control write is shown below).
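For reference, here is a minimal userspace sketch of that control write. Since we are not assuming the numeric ID of the Tegra low_latency_mode control, it is located by name through the standard VIDIOC_QUERYCTRL enumeration; the name string reported by the driver may differ from the v4l2-ctl alias and might need adjusting.

```
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/videodev2.h>

/* Walk every control the driver exposes, match on its name string, and set it.
 * The name used here ("low_latency_mode") is the v4l2-ctl alias; the driver
 * may report it with spaces/capitals instead. */
static int set_ctrl_by_name(int fd, const char *name, int value)
{
    struct v4l2_queryctrl qc;
    memset(&qc, 0, sizeof(qc));
    qc.id = V4L2_CTRL_FLAG_NEXT_CTRL;

    while (ioctl(fd, VIDIOC_QUERYCTRL, &qc) == 0) {
        if (strcmp((const char *)qc.name, name) == 0) {
            struct v4l2_control ctrl = { .id = qc.id, .value = value };
            return ioctl(fd, VIDIOC_S_CTRL, &ctrl);
        }
        qc.id |= V4L2_CTRL_FLAG_NEXT_CTRL;
    }
    return -1; /* control not found */
}

int main(void)
{
    int fd = open("/dev/video0", O_RDWR);   /* adjust the node for your sensor */
    if (fd < 0) { perror("open"); return 1; }
    if (set_ctrl_by_name(fd, "low_latency_mode", 1) != 0)
        perror("low_latency_mode");
    close(fd);
    return 0;
}
```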
Upon analyzing logs and other similar cases, we’ve discovered that the Jetson releases frames at the start of frame N+2, retaining the first two frames in its internal queue to apply exposure. Since we manage exposure at the hardware level and don’t require software exposure control, it would greatly benefit us if frames could be released at the end of frame N.
Upon reviewing the timestamps in vi2_fops.c, we have encountered some unexpected results that we would like to discuss further. Our camera setup is configured to deliver a resolution of 400x400 UYVY at 30fps.
To accurately measure latency, we have adopted the following methodology:
Current Methodology:
We are utilizing sample 12 of the Jetson Multimedia API.
We have adjusted the V4L2_BUFFERS_NUM parameter to 1 (a minimal single-buffer capture sketch is shown after this list).
Latency measurements are obtained using a smartphone to assess glass-to-glass latency.
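For context, the sketch below shows roughly the capture path we are timing, assuming a 400x400 UYVY sensor on /dev/video0 and a single MMAP buffer (the effect of setting V4L2_BUFFERS_NUM to 1 in sample 12). Error handling is trimmed to keep the flow visible, and some drivers may bump req.count up to their minimum buffer count.

```
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <linux/videodev2.h>

int main(void)
{
    int fd = open("/dev/video0", O_RDWR);
    if (fd < 0) { perror("open"); return 1; }

    /* Negotiate the same mode the camera is configured for. */
    struct v4l2_format fmt = {0};
    fmt.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
    fmt.fmt.pix.width = 400;
    fmt.fmt.pix.height = 400;
    fmt.fmt.pix.pixelformat = V4L2_PIX_FMT_UYVY;
    fmt.fmt.pix.field = V4L2_FIELD_NONE;
    ioctl(fd, VIDIOC_S_FMT, &fmt);

    /* Single queued buffer, i.e. V4L2_BUFFERS_NUM = 1. */
    struct v4l2_requestbuffers req = {0};
    req.count = 1;
    req.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
    req.memory = V4L2_MEMORY_MMAP;
    ioctl(fd, VIDIOC_REQBUFS, &req);

    struct v4l2_buffer buf = {0};
    buf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
    buf.memory = V4L2_MEMORY_MMAP;
    buf.index = 0;
    ioctl(fd, VIDIOC_QUERYBUF, &buf);
    void *mem = mmap(NULL, buf.length, PROT_READ | PROT_WRITE,
                     MAP_SHARED, fd, buf.m.offset);

    ioctl(fd, VIDIOC_QBUF, &buf);
    enum v4l2_buf_type type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
    ioctl(fd, VIDIOC_STREAMON, &type);

    for (int i = 0; i < 300; i++) {
        ioctl(fd, VIDIOC_DQBUF, &buf);   /* blocks until the VI releases the frame */
        /* frame payload is available in `mem` here; re-queue immediately */
        ioctl(fd, VIDIOC_QBUF, &buf);
    }

    ioctl(fd, VIDIOC_STREAMOFF, &type);
    munmap(mem, buf.length);
    close(fd);
    return 0;
}
```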
Observations and Findings:
1. Without Patch (Queued Buffers = 2 & Frame Release at Start of N+2 Frame):
Each frame is enqueued after 33ms (as calculated from the difference in Start Capture times).
Frame release occurs at the start of frame N+2, yielding release timings of x+0 and x+33, where ‘x’ denotes the time when frame N is enqueued into the V4L2 buffer.
2. With Patch (Queued Buffers = 2 & Frame Release at EOF of Nth Frame):
With the patch, the interval between consecutive frame enqueues is 66ms, twice the expected value. This is unexpected considering our camera operates at 30fps.
Concerns:
We are perplexed by the 66ms interval between frame enqueuing, which contradicts the expected 33ms interval given our camera’s 30fps configuration.
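To rule out a measurement artifact, a small helper such as the one below (hypothetical, not part of sample 12) can be called on every buffer returned by VIDIOC_DQBUF, e.g. inside the capture loop above. It prints the delta between consecutive driver timestamps together with the buffer sequence number.

```
#include <stdio.h>
#include <linux/videodev2.h>

/* Print the interval between consecutive dequeued buffers, using the
 * timestamps stamped by the VI driver, plus the driver's sequence counter. */
void report_frame_interval(const struct v4l2_buffer *buf)
{
    static long long prev_us = -1;
    long long now_us = (long long)buf->timestamp.tv_sec * 1000000LL +
                       buf->timestamp.tv_usec;

    if (prev_us >= 0)
        printf("frame %u: interval %.2f ms\n",
               buf->sequence, (now_us - prev_us) / 1000.0);
    prev_us = now_us;
}
```

If the 66ms interval is accompanied by a jump in buf->sequence, the patched path is likely dropping every other frame rather than genuinely releasing frames more slowly.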
Experiment 1
→ Glass-to-glass latency measured with a photodiode.
Experiment 2
→ Capturing the frame immediately after the ioctl call and naming the file with the current timestamp, while pointing the camera at a console that prints the timestamp in nanoseconds. The difference between the time in the file name and the timestamp visible in the captured frame gives the latency (printing the timestamp itself may add 2-3 ms), but the difference we see is significant. A sketch of this logging step is shown below.
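A sketch of that logging step, assuming the frame payload sits in mem with bytesused valid right after VIDIOC_DQBUF, and that the console printer uses the same CLOCK_MONOTONIC clock (both names are placeholders for whatever the capture loop provides):

```
#include <stdio.h>
#include <stdint.h>
#include <time.h>

/* Dump the raw UYVY frame to a file whose name is the nanosecond timestamp
 * taken right after VIDIOC_DQBUF. Comparing this name against the timestamp
 * visible inside the image gives the capture-to-dequeue latency. */
void dump_frame_with_timestamp(const void *mem, size_t bytesused)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);   /* same clock as the console printer */
    uint64_t ns = (uint64_t)ts.tv_sec * 1000000000ULL + ts.tv_nsec;

    char name[64];
    snprintf(name, sizeof(name), "%llu.uyvy", (unsigned long long)ns);

    FILE *f = fopen(name, "wb");
    if (f) {
        fwrite(mem, 1, bytesused, f);
        fclose(f);
    }
}
```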
In vi2_fops.c, tegra_channel_capture_frame_multi_thread() programs TEGRA_VI_CSI_SINGLE_SHOT to start receiving data from the sensor and then calls nvhost_syncpt_wait_timeout_ext() to wait for the SOF syncpoint. Maybe you can break down this function to check whether any latency is being added in the NVCSI/VI path.
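As a starting point, here is a rough sketch of that breakdown (an assumption about where to instrument, not a tested patch): bracket the single-shot trigger and the SOF syncpoint wait inside tegra_channel_capture_frame_multi_thread() with ktime deltas so the per-frame cost of the NVCSI/VI path shows up in the ftrace buffer. The existing calls are left as placeholder comments; only the timing lines are new.

```
/* Inside tegra_channel_capture_frame_multi_thread() in vi2_fops.c
 * (linux/ktime.h is assumed to already be available in this file): */

	ktime_t t_shot, t_sof;

	t_shot = ktime_get();
	/* ... existing write of TEGRA_VI_CSI_SINGLE_SHOT that arms the capture ... */

	/* ... existing nvhost_syncpt_wait_timeout_ext(...) call waiting for SOF ... */
	t_sof = ktime_get();

	trace_printk("vi2: arm -> SOF wait = %lld us\n",
		     ktime_us_delta(t_sof, t_shot));
```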