We need ultra low latency on encoding and decoding for 4k@60fps real time streaming.
Below is our use case.
Orin NX encoding → RTSP → Orin NX decoding → Rendering
We use jetson_multimedia_api to develop our application.
To achieve ultra-low latency, we enable --max-perf, set poc-type=2, and set the HW preset type to 1 for encoding. For decoding, we enable --disable-dpb.
However, the encoding latency is around 26 ms. The decoding latency is around 21~25 ms.
We hope both encoding and decoding can reach lower latency. How can we reduce it? Is zero latency possible for encoding and decoding?
Thank you for your help.
You have enabled all the factors for low latency. The profiling result looks fine and expected. For further enhancement, you may execute sudo jetson_clocks to fix the CPU cores at maximum clock, and run the VIC engine at its maximum clock:
Nvvideoconvert issue, nvvideoconvert in DS4 is better than Ds5? - #3 by DaneLLL
Since the hardware encoder and decoder each need a certain amount of time to process a frame, it is not possible to achieve zero latency. Please note this and expect some latency from both the encoder and decoder.
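As a minimal sketch of the clock steps above (the jetson_clocks tool is standard on Jetson; the exact sysfs paths for the VIC clock vary by L4T release, so follow the linked post for those):

```shell
# Lock CPU (and GPU/EMC) clocks at their maximum; persists until reboot.
sudo jetson_clocks
# Verify the resulting clock configuration.
sudo jetson_clocks --show
# For the VIC engine, the sysfs nodes differ between releases;
# see the linked post for the paths matching your L4T version.
```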
Thanks for your response.
I tried to fix CPU cores at maximum clock (2.0GHz) by jetson_clocks, and enable VIC at maximum clock (729MHz).
The encoding latency is still around 26 ms, and decoding is around 13~18 ms. Is this the expected result? It seems to work on only decoding.
In the 00_video_decode sample there is a block-linear to pitch-linear conversion performed on VIC, so running VIC at maximum clock improves this step.
For encoding, one more suggestion is to enable slice level encode. Please refer to the posts:
How does slice mode video encoding work for Jetson? - #14 by DaneLLL
Encoder output metadata report - #3 by DaneLLL
This should reduce some latency in the encoder. There should be no change in the decoder, since it has to receive a complete frame before decoding.
You had mentioned that the encoder keeps one reference frame when encoding an I P P P … sequence, so the encoder should have one frame of delay.
If my understanding is correct, encoder behavior would look like the following steps.
encode I → encode P0 → output I → encode P1 → output P0 …
To reduce the encoding latency, is there any chance the encoder can output the encoded reference frame before storing it? The behavior would then look like the following steps.
encode I → output I → encode P0 → output P0 → encode P1 → output P1 …
Thanks for your help.
The current software stack is optimal; there is no room for further improvement. Since the resolution is 4K, it is a heavy load for the encoder. You may try 1080p for comparison; some of the latency comes from the resolution.
If every frame is encoded as an I frame, there is no reference frame. In that case, is it possible for the encoder to output each encoded frame before the next frame is fed in? The behavior would look like the following steps.
encode I0 → output I0 → encode I1 → output I1 → encode I2 → output I2 …
Thanks for your help.
Yes, the encoder works this way. You can run 01_video_encode to check and confirm the behavior: you will see the dq callback is called as soon as each frame is encoded.