Parameter poc-type missing on jetson though mentioned in the documentation

Hi,
The implementation is in gst-v4l2 which is open source. You may download gst-v4l2 of r32.5, compile it on r32.4.4, and replace

/usr/lib/aarch64-linux-gnu/gstreamer-1.0/libgstnvvideo4linux2.so

After replacement, please clean the cache:

$ rm ~/.cache/gstreamer-1.0/registry.aarch64.bin

In case the newly built one does not work, I suggest backing up the original libgstnvvideo4linux2.so first.
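
The backup, replace, and cache-clean steps above could be scripted roughly as follows. This is only a sketch: the function name `swap_plugin` and the `.orig` backup suffix are made up for illustration, and the location of the rebuilt library depends on where you compiled gst-v4l2.

```shell
#!/bin/sh
# Sketch only: swap_plugin and the ".orig" suffix are illustrative names.
# Usage: swap_plugin <rebuilt .so> <installed .so> <registry cache file>
swap_plugin() {
    new_so=$1; installed_so=$2; registry=$3
    # Keep a copy of the stock library, but never overwrite an existing backup.
    if [ ! -e "$installed_so.orig" ]; then
        cp -p "$installed_so" "$installed_so.orig" || return 1
    fi
    # Put the rebuilt library in place of the installed one.
    cp "$new_so" "$installed_so" || return 1
    # Drop the cached registry so GStreamer rescans the plugin on next launch.
    rm -f "$registry"
}
```

On the device the second argument would be the /usr/lib/aarch64-linux-gnu/gstreamer-1.0/libgstnvvideo4linux2.so path given above and the third the registry.aarch64.bin file of the user who runs the pipelines. Writing under /usr/lib needs root, so the call has to happen in a root shell. Restoring is just copying the `.orig` file back.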

I tried this just now, but it does not seem to work; I still get large latency.
PS:
the “latency” I care about is “High decoding latency for stream produced by nvv4l2h264enc compared to omxh264enc”.

I tested and confirmed it working on nvv4l2h264enc. Note that you have to set the parameter manually.

Of course I set the param. Can you share the test pipeline?
The hardware decoder ‘nvv4l2decoder’ must be used on the receiver side; poc-type does not seem to affect software decoders like ‘avdec_h264’ in my test.

sender pipeline

gst-launch-1.0 -v nvv4l2camerasrc device=/dev/video1 ! nvvidconv ! 'video/x-raw(memory:NVMM), width=(int)1920, height=(int)1080, format=(string)I420' !  tee name=t \
            t. ! queue ! nvv4l2h264enc idrinterval=30 disable-cabac=true insert-aud=true insert-sps-pps=true poc-type=2 bitrate=8000000 ! rtph264pay pt=96 ! udpsink host=239.255.5.1 port=5000

receiver pipeline SW

gst-launch-1.0 udpsrc uri=udp://239.255.5.1:5000 ! 'application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)H264, payload=(int)96' \
   ! rtpstorage size-time=220000000 ! rtpssrcdemux ! rtpjitterbuffer do-lost=1 latency=10 !  rtph264depay ! avdec_h264 ! autovideoconvert ! fpsdisplaysink  sync=false

receiver pipeline HW

gst-launch-1.0 udpsrc uri=udp://239.255.5.1:5000 ! 'application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)H264, packetization-mode=(string)1, sprop-parameter-sets=(string)"Z0JAKJZUA8ARPyo\=\,aM48gA\=\=", payload=(int)96, seqnum-offset=(uint)16448, timestamp-offset=(uint)717084470, ssrc=(uint)3054027786, a-framerate=(string)30' \
   ! rtpstorage size-time=220000000 ! rtpssrcdemux ! rtpjitterbuffer do-lost=1 latency=10 ! rtph264depay ! nvv4l2decoder ! autovideoconvert ! fpsdisplaysink sync=false async=false

that is my test pipeline

                                             /       receiver pipeline SW      \
sender pipeline  with nvv4l2h264enc ======>                                     ====== >  HW got about 200ms more delay
                                             \       receiver pipeline HW     / 

                                             /       receiver pipeline SW      \
sender pipeline  with omxh264_enc ======>                                     ====== >  HW/SW same delay
                                             \       receiver pipeline HW     / 

pps: my jetson release version is r23.4.3

So far I have only seen poc_type negatively affect qualcomm snapdragon decoders.
It won’t magically reduce anything else except bring HW decoding time down to ~10ms from ~80ms on these devices.
70-80 ms end-to-end is doable with a Jetson as producer and a modern smartphone as consumer.

Edit: Get rid of all the unneeded stuff in your gstreamer pipelines. I got the results using the default pipeline from the docs (poc_type added) and my rtp h264/5 decoder android app.

My sender is on Xavier, while the receiver uses DeepStream 5.0 on x86 with the GTX2080ti’s HW decoder. I also tried JetPack 4.5 on Xavier and everything works fine.

@geierconstantinabc to clarify, what do you mean by “working”?

  • works on r23.5.1 with poc-type=2
  • works on r23.4.4 with gst-v4l2 (from r23.5.x) rebuilt and poc-type=2
  • works on r23.4.3 with gst-v4l2 (from r23.5.x) rebuilt and poc-type=2

On the first release for Jetson Nano with poc_type available as an option (not sure which version number anymore), I set poc_type=2 and then checked 1) if the resulting bitstream had poc_type=2 (yes) and 2) if the stream was decodable with low latency (10ms max) on most phones (yes).
Using nvv4l2h264enc
See LiveVideo10ms/ListOfCommandsUsedForTesting.txt at master · Consti10/LiveVideo10ms · GitHub
for the command.

Okay, then it’s jetpack4.5 (r32.5) on your nano; my tx2 with jetpack4.5 also works fine.
However, I have to make this work on jetpack 4.4 (r32.4.3) because my xavier hardware provider only supports jetpack4.4.

@DaneLLL It seems updating the gst plugin is not enough. Is there anything more I can try?
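
One thing worth ruling out at this point is whether the rebuilt plugin is really the one GStreamer loads. A minimal check, assuming gst-inspect-1.0 prints properties in its usual "name : description" layout, could look like this. `has_property` is an illustrative helper, not a GStreamer tool.

```shell
#!/bin/sh
# Sketch only: has_property is an illustrative helper, not part of GStreamer.
# Reads gst-inspect-1.0 output on stdin and succeeds if the named property
# appears at the start of a property line ("  name   : description").
has_property() {
    grep -Eq "^[[:space:]]*$1[[:space:]]*:"
}

# Only query the encoder when GStreamer is actually installed.
if command -v gst-inspect-1.0 >/dev/null 2>&1; then
    if gst-inspect-1.0 nvv4l2h264enc 2>/dev/null | has_property poc-type; then
        echo "poc-type is exposed by the loaded nvv4l2h264enc"
    else
        echo "poc-type not found: the old plugin may still be loaded"
    fi
fi
```

This only shows that the property is exposed by the plugin. It says nothing about whether the encoder underneath honours it on a given release.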

PS: all the ‘r23.x.x’ above should be ‘r32.x.x’

Hi,
Is it possible to upgrade to r32.5?

It’s possible in the future, but it depends on the xavier HW provider; right now they have no plan to support jetpack4.5.

I also checked the kernel driver for ‘nvhost-msenc’ (device name from the gst-plugin source), but found nothing. Is this part open-sourced?

@DaneLLL does such an “ordering” parameter also exist to support low latency for h265 videos, or would the same be achieved with different parameter settings (e.g., num-B-Frames = 0 on the Xavier)?

Hi @philipp12
The property poc-type is specific to h264 encoding. For h265, you may set the properties to encode a frame into multiple slices:

        -slt <type>           Slice length type (1 = Number of MBs, 2 = Bytes) [Default = 1]
        -slen <length>        Slice length [Default = 0]
        --sle                 Slice level encode output [Default = disabled]

So that we can reduce latency by getting encoded slices instead of complete frames.

Thanks @DaneLLL. I did not find these parameter options for nvv4l2h265enc or omxh265enc (r32.5.1). Did I miss them somewhere?

Two other related questions regarding low-latency h265 streaming:

  • For low-latency, is a constant bitrate generally preferable? Should vbv-size be low, even lower than the bitrate/fps recommendation?
  • Can iframes still be inserted on demand, akin to the example at h265 decode failed? I did not find the SliceIntraRefreshEnable parameter for h265 but would also like to insert iframes programmatically.
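
For reference, the bitrate/fps rule of thumb from the first question works out as below for the 8 Mbit/s sender pipeline earlier in the thread, assuming 30 fps. This is plain arithmetic with illustrative variable names; whether the encoder interprets vbv-size in exactly these units is not confirmed here.

```shell
#!/bin/sh
# Illustrative values: 8 Mbit/s as in the sender pipeline, 30 fps assumed.
bitrate=8000000
fps=30
# One frame's share of the bitrate, in bits (integer division).
vbv_size=$((bitrate / fps))
echo "bitrate/fps = $vbv_size bits per frame"
```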

Thanks!

Hi @DaneLLL , just wanted to check in about my question above. Thanks.

Hi,
For using nvv4l2h265enc plugin, please try to set the properties:

  bit-packetization   : Whether or not Packet size is based upon Number Of bits
                        flags: readable, writable, changeable only in NULL or READY state
                        Boolean. Default: false
  slice-header-spacing: Slice Header Spacing number of macroblocks/bits in one packet
                        flags: readable, writable, changeable only in NULL or READY state
                        Unsigned Integer64. Range: 0 - 18446744073709551615 Default: 0
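
A pipeline setting these two properties might look like the sketch below. Only the property names come from the listing above; the spacing value, resolution, test source and output file are arbitrary choices for illustration, and the pipeline has not been verified on a particular release.

```shell
#!/bin/sh
# Sketch only: the slice-header-spacing value, resolution and file sink are
# arbitrary illustration choices; only the two property names come from the
# gst-inspect listing.
SPACING=${SPACING:-1000}

PIPELINE="videotestsrc num-buffers=300 ! nvvidconv \
! video/x-raw(memory:NVMM),width=1280,height=720,format=I420 \
! nvv4l2h265enc bit-packetization=false slice-header-spacing=$SPACING \
! h265parse ! filesink location=sliced.h265"

# Show what would be run.
echo "gst-launch-1.0 -e $PIPELINE"

# Run only when explicitly requested, on a Jetson with the NVIDIA plugins.
if [ "${RUN:-0}" = 1 ]; then
    # Unquoted on purpose so the string splits into gst-launch arguments.
    gst-launch-1.0 -e $PIPELINE
fi
```

With bit-packetization=false the spacing is counted in macroblocks, per the property description; setting it to true would make the same number a bit count. Both properties are changeable only in NULL or READY state, so they have to be set before the pipeline starts.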