High decoding latency for stream produced by nvv4l2h264enc compared to omxh264enc

While an H.264 stream created by omxh264enc can be decoded with low latency, the stream produced by nvv4l2h264enc cannot.

By analyzing the SPS NALUs of both encoders, I found that the cause is the pic_order_cnt_type field.

With default settings, omxh264enc uses pic_order_cnt_type=2, which disables reordering of pictures and allows the decoder to operate with low latency.

In contrast, nvv4l2h264enc uses pic_order_cnt_type=0, which forces the decoder to hold onto decoded frames unnecessarily.
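
For anyone who wants to repeat this check, here is a minimal sketch of how pic_order_cnt_type can be read from the SPS. It assumes the stream is in Annex B byte-stream format (NALUs separated by 00 00 01 start codes) and that the SPS is complete; the function and class names are my own and not part of any GStreamer or NVIDIA API.

```python
"""Report pic_order_cnt_type for every SPS in an H.264 Annex B byte stream."""

# profile_idc values whose SPS carries the extra chroma/bit-depth/scaling fields
HIGH_PROFILES = {100, 110, 122, 244, 44, 83, 86, 118, 128, 138, 139, 134, 135}


class BitReader:
    """Reads bits MSB-first from a bytes object."""

    def __init__(self, data):
        self.data = data
        self.pos = 0  # position in bits

    def u(self, n):
        """Unsigned integer of n bits."""
        value = 0
        for _ in range(n):
            byte = self.data[self.pos >> 3]
            value = (value << 1) | ((byte >> (7 - (self.pos & 7))) & 1)
            self.pos += 1
        return value

    def ue(self):
        """Unsigned Exp-Golomb code."""
        zeros = 0
        while self.u(1) == 0:
            zeros += 1
        return (1 << zeros) - 1 + self.u(zeros)

    def se(self):
        """Signed Exp-Golomb code."""
        k = self.ue()
        return (k + 1) // 2 if k % 2 else -(k // 2)


def strip_emulation_prevention(nal):
    """Drop the 0x03 byte that follows two zero bytes inside a NAL unit."""
    out = bytearray()
    zeros = 0
    for b in nal:
        if zeros >= 2 and b == 3:
            zeros = 0
            continue
        out.append(b)
        zeros = zeros + 1 if b == 0 else 0
    return bytes(out)


def skip_scaling_list(r, size):
    """Consume one scaling list without keeping its values."""
    last, nxt = 8, 8
    for _ in range(size):
        if nxt != 0:
            nxt = (last + r.se() + 256) % 256
        last = last if nxt == 0 else nxt


def sps_pic_order_cnt_type(nal):
    """Parse an SPS NAL unit (including its 1-byte header) up to pic_order_cnt_type."""
    r = BitReader(strip_emulation_prevention(nal[1:]))
    profile_idc = r.u(8)
    r.u(16)  # constraint flags and level_idc
    r.ue()   # seq_parameter_set_id
    if profile_idc in HIGH_PROFILES:
        chroma_format_idc = r.ue()
        if chroma_format_idc == 3:
            r.u(1)  # separate_colour_plane_flag
        r.ue()      # bit_depth_luma_minus8
        r.ue()      # bit_depth_chroma_minus8
        r.u(1)      # qpprime_y_zero_transform_bypass_flag
        if r.u(1):  # seq_scaling_matrix_present_flag
            for i in range(8 if chroma_format_idc != 3 else 12):
                if r.u(1):  # seq_scaling_list_present_flag[i]
                    skip_scaling_list(r, 16 if i < 6 else 64)
    r.ue()  # log2_max_frame_num_minus4
    return r.ue()  # pic_order_cnt_type


def pic_order_cnt_types(stream):
    """Return pic_order_cnt_type for each SPS (nal_unit_type 7) in the stream."""
    result = []
    for chunk in stream.split(b"\x00\x00\x01"):
        if chunk and (chunk[0] & 0x1F) == 7:
            result.append(sps_pic_order_cnt_type(chunk))
    return result
```

With a recorded elementary stream (the file name here is only a placeholder), it could be used as `print(pic_order_cnt_types(open("capture.h264", "rb").read()))`.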

Is there an option to use pic_order_cnt_type=2 with nvv4l2h264enc?

Since the omxh264enc is deprecated, it would be great to have the same low-latency functionality from the nvv4l2h264enc.

You may try setting the vbv-size parameter of nvv4l2h264enc to a lower value (start with a value of about 40-50).
For more details, see:

gst-inspect-1.0 nvv4l2h264enc

vbv-size does not affect pic_order_cnt_type. To clarify, the issue is that nvv4l2h264enc produces an H.264 stream that cannot be decoded without buffering frames (which introduces latency), whereas the stream produced by omxh264enc can.

The NVIDIA Tegra documentation mentions a parameter called poc-type (pic order count type).
This is exactly the parameter I was looking for! However, on the Jetson Nano, running gst-inspect shows that this parameter is not implemented.
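
In case it helps others check their own board, below is a rough sketch of how the gst-inspect check could be scripted. The helper names are hypothetical, and the match simply assumes that gst-inspect-1.0 lists each property as `name : description` at the start of a line.

```python
import re
import subprocess


def has_property(inspect_text, name):
    """True if a line of the gst-inspect output starts with '<name> :'."""
    pattern = r"^\s*" + re.escape(name) + r"\s*:"
    return re.search(pattern, inspect_text, flags=re.MULTILINE) is not None


def inspect_element(element):
    """Return the text printed by gst-inspect-1.0 for the given element."""
    completed = subprocess.run(
        ["gst-inspect-1.0", element],
        capture_output=True,
        text=True,
        check=False,
    )
    return completed.stdout
```

On the device this could be called as `has_property(inspect_element("nvv4l2h264enc"), "poc-type")`.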

Can you please add this parameter on the Jetson Nano?
Since omxh264enc uses pic order count type=2, the hardware should definitely be capable of it.

Does it help if you set this property:

  maxperf-enable      : Enable or Disable Max Performance mode
                        flags: readable, writable, changeable only in NULL or READY state
                        Boolean. Default: false