H.265 vs H.264 performance

All,

On the Tegra X1 I am encoding many different streams concurrently. One or two video sources are taken in and then, using the 'tee' element, I encode each of them any number of times.

I am using gstreamer + the omxh26*enc plugins.

I have seen that with the H.264 encoder I am able to encode a total of 8 streams at 1080p30 concurrently without any frame drops. The most basic example is the following gst-launch pipeline, which works perfectly and holds 30 fps on all the streams:

gst-launch-1.0 v4l2src device=/dev/video0 do-timestamp=true io-mode=rw ! "video/x-raw, width=1920, height=1080, format=(string)UYVY, framerate=(fraction)30/1" !  queue ! nvvidconv output-buffers=25 ! 'video/x-raw(memory:NVMM), width=1920, height=1080,format=I420, framerate=30/1' ! queue ! tee name='vtee0' !  \
omxh264enc bitrate=5000000 control-rate=2 ! 'video/x-h264, stream-format=(string)byte-stream' ! h264parse ! mpegtsmux alignment=7 ! udpsink host=224.0.0.2 port=5098  sync=false vtee0. ! \
omxh264enc bitrate=5000000 control-rate=2 ! 'video/x-h264, stream-format=(string)byte-stream' ! h264parse ! mpegtsmux alignment=7 ! udpsink host=224.0.0.2 port=6098  sync=false vtee0. ! \
omxh264enc bitrate=5000000 control-rate=2 ! 'video/x-h264, stream-format=(string)byte-stream' ! h264parse ! mpegtsmux alignment=7 ! udpsink host=224.0.0.2 port=7098  sync=false vtee0. ! \
omxh264enc bitrate=5000000 control-rate=2 ! 'video/x-h264, stream-format=(string)byte-stream' ! h264parse ! mpegtsmux alignment=7 ! udpsink host=224.0.0.2 port=8098  sync=false vtee0. ! \
omxh264enc bitrate=5000000 control-rate=2 ! 'video/x-h264, stream-format=(string)byte-stream' ! h264parse ! mpegtsmux alignment=7 ! udpsink host=224.0.0.2 port=9098  sync=false vtee0. ! \
omxh264enc bitrate=5000000 control-rate=2 ! 'video/x-h264, stream-format=(string)byte-stream' ! h264parse ! mpegtsmux alignment=7 ! udpsink host=224.0.0.2 port=10098  sync=false vtee0. ! \
omxh264enc bitrate=5000000 control-rate=2 ! 'video/x-h264, stream-format=(string)byte-stream' ! h264parse ! mpegtsmux alignment=7 ! udpsink host=224.0.0.2 port=11098  sync=false vtee0. ! \
omxh264enc bitrate=5000000 control-rate=2 ! 'video/x-h264, stream-format=(string)byte-stream' ! h264parse ! mpegtsmux alignment=7 ! udpsink host=224.0.0.2 port=12098  sync=false

This alone confuses me, since the TX1 is claimed to handle 4K@30, which in theory means only 4x 1080p30 should be possible. Yet I am running 8 without issue.
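
To make the arithmetic behind that expectation explicit, here is a small shell sketch comparing raw pixel rates. It assumes "4K" means 3840x2160 and that encoder load scales with pixel rate alone, which is a simplification; pixel_rate is a helper name made up for this example.

```shell
#!/bin/sh
# Pixel throughput in pixels per second: width * height * frames per second.
# Assumption: encoder load is proportional to pixel rate only.
pixel_rate() {
    echo $(( $1 * $2 * $3 ))
}

uhd30=$(pixel_rate 3840 2160 30)   # one 2160p30 stream
fhd30=$(pixel_rate 1920 1080 30)   # one 1080p30 stream

echo "2160p30: $uhd30 pixels/s"
echo "1080p30: $fhd30 pixels/s"
echo "1080p30 streams that fit in one 2160p30 budget: $(( uhd30 / fhd30 ))"
echo "8 x 1080p30 relative to 2160p30: $(( 8 * fhd30 / uhd30 ))x"
```

On that simplified model, 8x 1080p30 is twice the pixel rate of a single 2160p30 stream.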

But when I switch to H.265, the frame rate starts to drop at 6 or more streams. For example, the following runs at 27 fps instead of 30:

gst-launch-1.0 v4l2src device=/dev/video0 do-timestamp=true io-mode=rw ! "video/x-raw, width=1920, height=1080, format=(string)UYVY, framerate=(fraction)30/1" !  queue ! nvvidconv output-buffers=25 ! 'video/x-raw(memory:NVMM), width=1920, height=1080,format=I420, framerate=30/1' ! queue ! tee name='vtee0' !  \
omxh265enc bitrate=5000000 control-rate=2 ! 'video/x-h265, stream-format=(string)byte-stream' ! h265parse ! mpegtsmux alignment=7 ! udpsink host=224.0.0.2 port=5098  sync=false vtee0. ! \
omxh265enc bitrate=5000000 control-rate=2 ! 'video/x-h265, stream-format=(string)byte-stream' ! h265parse ! mpegtsmux alignment=7 ! udpsink host=224.0.0.2 port=6098  sync=false vtee0. ! \
omxh265enc bitrate=5000000 control-rate=2 ! 'video/x-h265, stream-format=(string)byte-stream' ! h265parse ! mpegtsmux alignment=7 ! udpsink host=224.0.0.2 port=7098  sync=false vtee0. ! \
omxh265enc bitrate=5000000 control-rate=2 ! 'video/x-h265, stream-format=(string)byte-stream' ! h265parse ! mpegtsmux alignment=7 ! udpsink host=224.0.0.2 port=8098  sync=false vtee0. ! \
omxh265enc bitrate=5000000 control-rate=2 ! 'video/x-h265, stream-format=(string)byte-stream' ! h265parse ! mpegtsmux alignment=7 ! udpsink host=224.0.0.2 port=9098  sync=false vtee0. ! \
omxh265enc bitrate=5000000 control-rate=2 ! 'video/x-h265, stream-format=(string)byte-stream' ! h265parse ! mpegtsmux alignment=7 ! udpsink host=224.0.0.2 port=10098  sync=false

Furthermore, it gets progressively worse as you add more streams: it runs at 23 fps for 7 streams and 20 fps for 8 streams.
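
As a rough sanity check on these numbers, the sketch below multiplies the stream count by the measured per-stream frame rate to get the total number of 1080p frames encoded per second. It takes the 20 fps figure as the 8-stream case, which is an assumption, and aggregate_fps is a helper name made up for this example.

```shell
#!/bin/sh
# Aggregate encoder throughput in 1080p frames per second:
# number of streams * measured frames per second per stream.
aggregate_fps() {
    echo $(( $1 * $2 ))
}

# Measured H.265 figures from this post as "streams fps" pairs.
# Assumption: the 20 fps measurement belongs to the 8-stream run.
for pair in "6 27" "7 23" "8 20"; do
    set -- $pair
    echo "$1 streams x $2 fps = $(aggregate_fps "$1" "$2") frames/s"
done

# For comparison: one 2160p30 stream carries the pixels of 4 x 1080p30.
echo "2160p30 equivalent = $(aggregate_fps 4 30) frames/s"
```

If that reading is right, the total stays close to 160 frames/s in all three cases, which would be consistent with a fixed throughput ceiling being shared between the streams rather than a per-stream problem.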

Using tegrastats I have seen that the CPUs are running fine:

RAM 564/3854MB (lfb 741x4MB) SWAP 0/0MB (cached 0MB) cpu [0%,0%,0%,0%]@1734 EMC 19%@1600 AVP 3%@80 VDE 0 GR3D 0%@76 EDP limit 1734
RAM 568/3854MB (lfb 741x4MB) SWAP 0/0MB (cached 0MB) cpu [38%,19%,18%,13%]@403 EMC 38%@800 AVP 4%@80 VDE 0 GR3D 0%@76 EDP limit 1734
RAM 569/3854MB (lfb 741x4MB) SWAP 0/0MB (cached 0MB) cpu [41%,16%,9%,26%]@1326 EMC 28%@1065 AVP 4%@80 VDE 0 GR3D 0%@76 EDP limit 1734
RAM 570/3854MB (lfb 741x4MB) SWAP 0/0MB (cached 0MB) cpu [49%,13%,24%,24%]@1224 EMC 28%@1065 AVP 3%@80 VDE 0 GR3D 0%@76 EDP limit 1734
RAM 570/3854MB (lfb 741x4MB) SWAP 0/0MB (cached 0MB) cpu [44%,18%,17%,19%]@1326 EMC 28%@1065 AVP 3%@80 VDE 0 GR3D 0%@76 EDP limit 1734
RAM 571/3854MB (lfb 741x4MB) SWAP 0/0MB (cached 0MB) cpu [51%,16%,18%,14%]@1734 EMC 18%@1600 AVP 4%@80 VDE 0 GR3D 0%@76 EDP limit 1734
RAM 571/3854MB (lfb 741x4MB) SWAP 0/0MB (cached 0MB) cpu [48%,16%,15%,19%]@1132 EMC 37%@800 AVP 4%@80 VDE 0 GR3D 0%@76 EDP limit 1734
RAM 571/3854MB (lfb 741x4MB) SWAP 0/0MB (cached 0MB) cpu [40%,5%,27%,26%]@1734 EMC 18%@1600 AVP 3%@80 VDE 0 GR3D 0%@76 EDP limit 1734
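
To summarize a log like this instead of reading it by eye, a filter along the following lines could average the per-core load on each line. It is only a sketch that assumes the "cpu [a%,b%,c%,d%]@freq" field layout shown above; cpu_mean is a made-up helper name, and the sample line is copied from the output above.

```shell
#!/bin/sh
# Print the mean per-core CPU load for each tegrastats line read from stdin.
# Assumption: lines contain a field shaped like "cpu [38%,19%,18%,13%]@403".
cpu_mean() {
    sed -n 's/.*cpu \[\([0-9%,]*\)]@\([0-9]*\).*/\1 \2/p' |
    tr -d '%' |
    awk '{
        n = split($1, load, ",")      # per-core load values
        sum = 0
        for (i = 1; i <= n; i++) sum += load[i]
        printf "mean load %.1f%% at %s MHz\n", sum / n, $2
    }'
}

# Demo on one line taken from the log in this post.
echo 'RAM 568/3854MB (lfb 741x4MB) SWAP 0/0MB (cached 0MB) cpu [38%,19%,18%,13%]@403 EMC 38%@800 AVP 4%@80 VDE 0 GR3D 0%@76 EDP limit 1734' | cpu_mean
```

On the device the same function could be fed live with "tegrastats | cpu_mean". Note that this only covers the CPU field; nothing in the output shown here appears to report the load of the hardware encoder itself.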

Is the NVENC engine known to be less efficient at encoding H.265 than H.264 streams? What is the theoretical max?
I am trying to figure out whether any optimizations can be done on my end to improve the performance of the H.265 version of this pipeline. I have already tried adding queues and the like, to no avail.
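
For anyone who wants to repeat the test with a different number of streams, the following sketch prints the same pipeline for a given codec and branch count instead of copying the branch lines by hand. build_pipeline is a made-up helper name; the elements, properties and ports are the ones used in the commands above, except that every branch is attached with "vtee0. !" rather than linking the first one directly to the tee.

```shell
#!/bin/sh
# Print a gst-launch-1.0 command with N encoder branches on one tee.
# usage: build_pipeline <h264|h265> <branch count>
build_pipeline() {
    codec=$1    # h264 or h265
    count=$2    # number of encoder branches
    printf '%s' "gst-launch-1.0 v4l2src device=/dev/video0 do-timestamp=true io-mode=rw"
    printf '%s' " ! 'video/x-raw, width=1920, height=1080, format=(string)UYVY, framerate=(fraction)30/1'"
    printf '%s' " ! queue ! nvvidconv output-buffers=25"
    printf '%s' " ! 'video/x-raw(memory:NVMM), width=1920, height=1080, format=I420, framerate=30/1'"
    printf '%s' " ! queue ! tee name=vtee0"
    i=0
    while [ "$i" -lt "$count" ]; do
        port=$(( 5098 + 1000 * i ))   # 5098, 6098, ... as in the post
        printf '%s' " vtee0. ! omx${codec}enc bitrate=5000000 control-rate=2"
        printf '%s' " ! 'video/x-${codec}, stream-format=(string)byte-stream' ! ${codec}parse"
        printf '%s' " ! mpegtsmux alignment=7 ! udpsink host=224.0.0.2 port=${port} sync=false"
        i=$(( i + 1 ))
    done
    printf '\n'
}

# Only prints the command; nothing is launched here.
build_pipeline h265 6
```

On the board, the printed command can be run with eval "$(build_pipeline h265 6)", and the count can then be varied to find where the frame rate starts to drop.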

Hello,
Please refer to the TX1 datasheet.
For H.265 it supports 2160p at 30 fps; operation at a higher resolution or frame rate is not guaranteed.

H.264 is less complex than H.265, so it may have more margin. But stable operation at a resolution or frame rate beyond the datasheet figures still cannot be guaranteed.

br
ChenJian

Understood. That is what I thought, and it is why I was surprised that I was still able to do 8x with H.264.