Minimum latency for real-time app, stream encode method

• Hardware Platform (Jetson / GPU) Jetson Xavier
• DeepStream Version DS 6.0
• JetPack Version (valid for Jetson only) 4.6 (L4T 32.6.1)
• TensorRT Version 8.0.1
• Issue Type question

Hello everyone!

I have a DeepStream pipeline running in real time on a Jetson whose specs I provided above. The pipeline closely resembles the DeepStream sample app 3, including nvstreammux, nvinfer, and nvmultistreamtiler.

The pipeline has a uridecodebin and an nvv4l2decoder as its first two elements.
I wanted to ask which encoding to use for a minimum-latency stream. I tried it outside DeepStream, and MJPEG seems faster than H264. But nvv4l2decoder is optimized for H264, so maybe that is faster. Also, MJPEG decoding is not GPU-accelerated, which means the frames end up in system memory rather than NVMM.
Can you guide me on this?
Thank you very much

Could you elaborate on this? How did you test? Could you share the media pipeline?

You can use nvjpegdec or "jpegparse ! nvv4l2decoder" to decode JPEG with acceleration.
You can use nvv4l2decoder and nvv4l2h264enc for HW decoding and encoding. Here is a sample:

gst-launch-1.0 -v filesrc location=/home/2.jpg ! jpegparse ! nvv4l2decoder ! nvvideoconvert ! 'video/x-raw(memory:NVMM),format=I420' ! nvv4l2h264enc bitrate=1000000 ! filesink location=test.264
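
The nvjpegdec path mentioned above would be similar; a minimal sketch (the file names are placeholders, not tested on this exact JetPack):

# same flow as the sample above, but with nvjpegdec doing the JPEG decode
gst-launch-1.0 -v filesrc location=/home/2.jpg ! jpegparse ! nvjpegdec ! nvvideoconvert ! 'video/x-raw(memory:NVMM),format=I420' ! nvv4l2h264enc bitrate=1000000 ! filesink location=test_nvjpeg.264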

Thank you for your immediate response, @fanzh.

The sample pipeline I am using is as follows, with the source URI replaced by an RTSP stream:

gst-launch-1.0 \
  uridecodebin uri=rtsp://<RTSP_STREAM_URL> ! \
  videorate max-rate=12 ! \
  nvstreammux width=1920 height=1080 batch-size=1 batched-push-timeout=40000 name=mux ! \
  queue ! \
  nvinfer config-file-path=<DEEPSTREAM_CONFIG_PATH> ! \
  nvvideoconvert ! \
  capsfilter caps="video/x-raw(memory:NVMM), format=RGBA" ! \
  queue ! \
  nvmultistreamtiler ! \
  queue ! \
  nvvideoconvert ! \
  queue ! \
  fakesink

Objective: I aim to minimize latency to achieve the most real-time performance possible, as I have implemented a PTZ controller immediately after this pipeline (and a custom tracker) to follow a tracked object.

Benchmarking MJPEG vs. H264:

I have been researching the trade-offs between MJPEG and H264 encoding/decoding:

  • MJPEG:
    • Pros: Lower latency, which is crucial for real-time applications.
    • Cons: Higher bandwidth usage due to less efficient compression.
  • H264:
    • Pros: More bandwidth-efficient compression.
    • Cons: Potentially higher latency because of the decode process, although this can be mitigated with hardware acceleration.

On my desktop setup, using OpenCV’s cv2, MJPEG streams exhibit noticeably lower latency compared to H264 streams. However, I am curious about the performance expectations on an NVIDIA Xavier platform, where decoding can be hardware-accelerated.
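
For a like-for-like check on the Xavier, I would push each format through the hardware decoder and compare; a rough sketch of what I have in mind (the stream URLs are placeholders, and I am assuming nvv4l2decoder's mjpeg property is available on this JetPack):

# H264 RTSP stream through the HW decoder
gst-launch-1.0 -v rtspsrc location=rtsp://<H264_STREAM_URL> latency=0 ! rtph264depay ! h264parse ! nvv4l2decoder ! fakesink sync=false

# MJPEG RTSP stream through the same HW decoder
gst-launch-1.0 -v rtspsrc location=rtsp://<MJPEG_STREAM_URL> latency=0 ! rtpjpegdepay ! jpegparse ! nvv4l2decoder mjpeg=1 ! fakesink sync=false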

Questions:

  1. Performance Expectations on NVIDIA Xavier:
  • Given that the decode process on Xavier is hardware-accelerated, what performance improvements can I expect when using H264 compared to MJPEG? Is H264 latency still significantly higher, or does the hardware acceleration mitigate this effectively?
  2. Benchmarking Methodology:
  • How can I effectively test and benchmark the latency between MJPEG and H264 encoding/decoding within my pipeline? Are there specific tools or techniques you recommend for accurate latency measurements in GStreamer-based applications?
  3. Reducing Latency Further:
  • Besides choosing the appropriate encoding protocol, what other strategies can I employ within the pipeline to further reduce latency? Are there specific GStreamer elements or configurations known to help achieve lower latency?
  4. Issues with nvjpegdec:
  • I attempted to use nvjpegdec in another application for decoding JPEG images from files, but it resulted in altered pixel values. Has anyone else experienced this issue, and are there known solutions or workarounds to ensure accurate decoding with nvjpegdec?

Additional Information:

  • My application requires real-time processing because of the PTZ controller integration, so ensuring that the entire pipeline operates with the lowest possible delay is essential for accurate and responsive object tracking.

  1. There is only decoding in your pipeline. If there is only one RTSP source, decoding should not be the bottleneck. You can use "sudo tegrastats" to monitor the decoding/encoding utilization (see the sketch after this list).
  2. You can add a timestamp to the video or point the camera at a clock. Please refer to this FAQ for how to “Enable Latency measurement for deepstream sample apps”.
  3. Please refer to the latency property of rtspsrc.
  4. Please use nvv4l2decoder or nvjpegdec to decode JPEG.
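
A hedged sketch of the commands behind items 1-3 above (the stream URL is a placeholder, and the environment variables follow the linked FAQ):

# 1. monitor HW decoder (NVDEC) / encoder (NVENC) utilization while the pipeline runs
sudo tegrastats --interval 1000

# 2. log per-element latency with the stock GStreamer latency tracer
GST_DEBUG="GST_TRACER:7" GST_TRACERS="latency" gst-launch-1.0 <your pipeline>

# 2. for deepstream-app, the FAQ's latency measurement is enabled via environment variables
export NVDS_ENABLE_LATENCY_MEASUREMENT=1
export NVDS_ENABLE_COMPONENT_LATENCY_MEASUREMENT=1

# 3. use rtspsrc directly so its jitter-buffer latency property can be lowered
gst-launch-1.0 rtspsrc location=rtsp://<RTSP_STREAM_URL> latency=0 ! rtph264depay ! h264parse ! nvv4l2decoder ! fakesink sync=false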

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.