We are currently using a Jetson Xavier NX running JetPack SDK 5.0.2 and a USB camera, in a project that is extremely sensitive to latency. We would like to have the smallest possible latency between the moment the photons hit the camera and the moment OpenCV can start to work on the video frame in our application. Ideally, that latency would be the rolling shutter time plus a tiny overhead.
Our best latency so far has come from GStreamer with Video4Linux, with the sensor set to 120 FPS in MJPEG (the fastest sensor mode the camera reports), but we are open to switching to other means of fetching video frames.
Measuring the time between photons hitting the camera and data becoming available in software is hard to do with millisecond accuracy. We therefore measure the glass-to-glass latency, in a manner similar to the one described in the RidgeRun Developer Connection article "NVIDIA Jetson TX1 TX2 Glass to Glass latency" (Jetson TX1 TX2 Capture).
Here is how we take our measurements: We point the USB camera at a millisecond-accurate clock display. At the same time, a 120 FPS phone camera records both the ground-truth clock and the USB camera video feed shown on the monitor. By comparing the difference, in phone-camera frames, between the two, we obtain the latency estimate. Unfortunately, this also includes the time needed to display the video frame on a monitor, while we are only interested in the time needed to get the frame into memory. (We do not know how long it realistically takes to display a frame on our 60Hz monitor once it is available in the userland software.) We use the following command:
gst-launch-1.0 -v v4l2src device=/dev/video0 io-mode=4 ! image/jpeg, width=640, height=480, framerate=61612/513 ! jpegdec ! xvimagesink
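Since what we ultimately care about is the frame reaching OpenCV rather than the display, here is a minimal sketch of how the same pipeline could end in an appsink instead of xvimagesink. It assumes an OpenCV build with GStreamer support. The helper names `build_pipeline` and `measure_frame_intervals` are hypothetical, and the `videoconvert`, BGR caps, and appsink settings (`drop`, `max-buffers`, `sync`) are our additions, not part of the command above. Note that it only records the interval between frame arrivals in userland, not the photon-to-memory latency itself.

```python
import time


def build_pipeline(device="/dev/video0", width=640, height=480,
                   framerate="61612/513"):
    """Build a GStreamer pipeline string ending in appsink (hypothetical helper)."""
    return (
        f"v4l2src device={device} io-mode=4 ! "
        f"image/jpeg, width={width}, height={height}, framerate={framerate} ! "
        "jpegdec ! videoconvert ! video/x-raw, format=BGR ! "
        # Assumption: keep only the newest frame and do not wait on the clock.
        "appsink drop=true max-buffers=1 sync=false"
    )


def measure_frame_intervals(pipeline, n_frames=240):
    """Return the seconds elapsed between consecutive frames reaching Python."""
    import cv2  # imported here so build_pipeline() works without OpenCV

    cap = cv2.VideoCapture(pipeline, cv2.CAP_GSTREAMER)
    if not cap.isOpened():
        raise RuntimeError("could not open pipeline; is OpenCV built with GStreamer?")
    stamps = []
    try:
        for _ in range(n_frames):
            ok, _frame = cap.read()
            if not ok:
                break
            stamps.append(time.monotonic())  # frame is now in userland memory
    finally:
        cap.release()
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]


# Usage on the Jetson (requires the camera to be attached):
#   intervals = measure_frame_intervals(build_pipeline())
#   print(f"mean interval: {1000 * sum(intervals) / len(intervals):.1f} ms")
```

The figures that follow come from the gst-launch-1.0 command, not from this sketch.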
We measure on average 58ms (7 frames on the 120FPS phone camera) of glass-to-glass latency. Since the camera sensor is set to 120 FPS, the rolling shutter time should only be 8.3ms. This leaves 50ms for the frame to transfer via USB to userland memory, and then to transfer to the monitor. Even with a GPU+monitor latency of 20ms, it still leaves 30ms for the transfer between the camera and the software memory. This is the time we would like to shorten as much as possible.
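To make the arithmetic explicit, here is a small sketch of the latency budget above. The 20ms GPU+monitor figure is our assumption, not a measurement, and the variable names are only for illustration. Keep in mind that counting frames on a 120 FPS phone camera can only resolve latency in steps of about 8.3ms.

```python
def frames_to_ms(frames, fps):
    """Convert a frame count at a given frame rate to milliseconds."""
    return frames * 1000.0 / fps


PHONE_FPS = 120    # phone camera used as the measuring instrument
SENSOR_FPS = 120   # USB camera sensor mode

glass_to_glass_ms = frames_to_ms(7, PHONE_FPS)   # 7 phone frames of delay
readout_ms = frames_to_ms(1, SENSOR_FPS)         # one sensor frame period
display_ms = 20.0                                # ASSUMED GPU + monitor latency

after_readout_ms = glass_to_glass_ms - readout_ms
transfer_ms = after_readout_ms - display_ms      # camera -> userland memory

print(f"glass-to-glass:        {glass_to_glass_ms:.1f} ms")
print(f"sensor readout:        {readout_ms:.1f} ms")
print(f"left after readout:    {after_readout_ms:.1f} ms")
print(f"left for USB transfer: {transfer_ms:.1f} ms")
```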
We have read a couple of threads dealing with similar issues, but they are old, refer to MIPI cameras, and do not reflect the latest developments of L4T for the Xavier NX (apologies, the links are missing because, as a new user, I can’t place more than one link in the post):
- CSI latency is over 80 milliseconds…?
- One frame latency/delay in TX1 V4L stack
We have not applied any modified library as suggested in the post above, since we are using r35.1, and we are pessimistic about ABI compatibility of these modifications on a Xavier NX.
Still, with all the above, we feel like we’re at the bottom of the rabbit hole, yet the latency is way too high for our application. Any help would be really appreciated.