Latency of CSI Camera Capture on Jetson TX2

There are many posts on devtalk discussing the latency of CSI cameras, but I am still puzzled by a few questions.

  1. What is the hardware limit on capture latency?

I found a guide to the JetPack camera API. On page 55, it illustrates the capture pipeline:

Does it mean there must be at least 3-4 frames of latency in the capture due to the CSI/VI/ISP pipeline structure?

  2. What does the buffer actually mean in the nvcamerasrc plugin?

I saw that RidgeRun requested changing the default value from 10 to 4, and it seems the buffer size impacts the latency. Where is the buffer located, and what does the buffer size mean?

  3. What’s the difference between libargus and nvcamerasrc?

In some posts, I noticed that NVIDIA staff suggest libargus might have lower latency than the nvcamerasrc plugin in GStreamer. Why do they differ?

  1. Yes, expect 3 to 4 frames of latency with Argus/nvcamerasrc.
  2. It’s the memory buffer that the frame data is written into. The size is the buffer count.
  3. Please check the L4T documentation; the Camera Architecture Stack section shows the software stack.
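To illustrate why a buffer count could affect latency at all, here is a minimal Python sketch of a generic FIFO of frame buffers. This is an assumed toy model, not a description of how nvcamerasrc actually manages its buffers; the function name `steady_state_age_frames` is made up for illustration.

```python
from collections import deque


def steady_state_age_frames(queue_depth: int, n_frames: int = 100) -> int:
    """Age, in frame periods, of the frame a consumer receives once a FIFO of
    `queue_depth` buffers has filled and the consumer keeps pace with the camera.

    Assumed model: the camera fills one buffer per frame period and the
    consumer always takes the oldest filled buffer.
    """
    queue = deque()
    ages = []
    for t in range(n_frames):
        queue.append(t)  # camera fills one buffer at frame time t
        if len(queue) == queue_depth:
            oldest = queue.popleft()  # consumer takes the oldest buffer
            ages.append(t - oldest)
    return ages[-1]


if __name__ == "__main__":
    for depth in (10, 4):
        age = steady_state_age_frames(depth)
        print(f"{depth} buffers: consumer sees frames {age} periods old")
```

In this model a full queue of N buffers delivers frames that are N - 1 periods old, which would be consistent with a smaller buffer count reducing latency. Whether the real plugin lets its queue fill up like this is not something this thread confirms.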

What’s the minimum achievable latency just to get the raw image data?

In our application we cannot afford to have data waiting to be processed in any queues. We have ~50 ms to do all our computation from the time of capture (end of exposure to be precise). At 20 Hz that would be one frame of latency (including all our GPU computations, not just grabbing/converting images).
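To put numbers on that budget, here is a minimal Python sketch of the arithmetic. The function names are made up for illustration, and the 3 to 4 frame figure is simply the pipeline depth quoted earlier in this thread, not a measurement.

```python
def frame_period_ms(fps: float) -> float:
    """Time between consecutive frames, in milliseconds."""
    return 1000.0 / fps


def pipeline_latency_ms(frames: int, fps: float) -> float:
    """Latency of a pipeline that holds `frames` whole frames (assumed model)."""
    return frames * frame_period_ms(fps)


if __name__ == "__main__":
    fps = 20.0
    budget_ms = 50.0  # end-to-end budget, counted from end of exposure
    print(f"frame period: {frame_period_ms(fps):.1f} ms")
    for frames in (1, 3, 4):
        latency = pipeline_latency_ms(frames, fps)
        verdict = "within" if latency <= budget_ms else "over"
        print(f"{frames} frame(s): {latency:.1f} ms ({verdict} budget)")
```

Under this simple model, a 3 to 4 frame pipeline at 20 Hz already costs 150 to 200 ms before any GPU work starts.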

On the GPU we can enqueue multiple kernels and they will execute as fast as possible. I understand the ISP architecture is different and that’s where the latency comes from. Hence the question: what if we don’t use the ISP at all?

hello hugomaxwell,

  1. May I know which JetPack release you’re working with?
  2. We already have some implementation in VI mode to improve capture init latency (i.e. l4t-r28.2):
$ v4l2-ctl -d /dev/video0 --set-fmt-video=width=2592,height=1944,pixelformat=RG10 --set-ctrl bypass_mode=0 --stream-mmap --stream-count=1
  3. There’s a debug print to evaluate the initial capture latency; please follow the steps below to enable the dynamic debug flag.
sudo -i
cd /sys/kernel/debug/dynamic_debug/
echo file channel.c +p > control
  4. We also fixed some known issues in l4t-r28.2; you may refer to Topic 1038067 and Topic 1020202 to update your kernel driver.

I was just trying to get a picture of what I can expect, before ordering any cameras.

The best info I could find so far is this glass-to-glass latency test on TX1:

Using the V4L2 interface, they achieved 2-3 frames of latency via the ISP pipeline (since they requested I420 output). I would assume that bypassing the ISP is at least 1 frame faster, but that’s still not good enough.

We have to be well below one frame of latency; at 20 Hz I would like to get the image into GPU memory in about 25 ms (from start of transmission to end of copy into memory). If that is not possible, we have to use USB3 cameras, where I know that latency is equal to the image size divided by the USB3 bandwidth (plus ADC read-out latency, but that’s camera-specific).
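The USB3 estimate above (image size divided by bandwidth) can be sketched in Python as follows. The numbers are assumptions for illustration only: the 2592x1944 resolution is borrowed from the v4l2-ctl command earlier in the thread, 2 bytes per pixel assumes 10-bit raw data padded to 16 bits, and 400 MB/s is a placeholder for usable USB3 throughput, not a measured value.

```python
def transfer_time_ms(width: int, height: int,
                     bytes_per_pixel: float,
                     bandwidth_bytes_per_s: float) -> float:
    """Time to move one frame over a link: image size / bandwidth, in ms.

    Ignores sensor read-out, protocol overhead and driver latency.
    """
    image_bytes = width * height * bytes_per_pixel
    return image_bytes / bandwidth_bytes_per_s * 1000.0


if __name__ == "__main__":
    # All values below are illustrative assumptions, not measurements.
    t = transfer_time_ms(2592, 1944, 2, 400e6)
    print(f"estimated transfer time: {t:.2f} ms")
```

With a smaller image, fewer bytes per pixel, or a faster link, the estimate scales down linearly, so the camera's actual output format matters a lot for the 25 ms target.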

hello hugomaxwell,

“We have to be well below one frame of latency,”
Unfortunately, you cannot reach latency below one frame in either VI mode or VI-bypass mode.
Each capture request coming from user space needs to wait for at least one frame to be processed, and then for the buffer write to finish and the memory pointer to be returned to user space.
Please see the developer guide and check the Camera Software Development Solution chapter for reference.

FYI, you could also refer to the Accelerated GStreamer User Guide, specifically the [CUDA VIDEO POST-PROCESSING WITH GSTREAMER1.0] section, for GStreamer plugins which process CSI signaling directly without an extra memory copy. Thanks.