Some questions about TX1/TX2 H.265 video encode performance

Hi,

We are preparing to use TX1/TX2 for video encoding, and we have some questions about H.265 encode performance that we would appreciate your help in clarifying.

  1. According to the H.264/H.265 standards, H.265 should improve compression efficiency by 30 ~ 50% compared to H.264 HP. Can the H.265 encoder in TX1/TX2 reach this level?

  2. Could you provide NVIDIA’s test reports on H.265 encoding performance? Alternatively, YUV files and the corresponding encoded .h265 bitstream files would be fine. If neither test reports nor YUV/.h265 files are available, could we send you raw YUV files (as website links) and have you return the encoded bitstream files?

  3. Below are the two maximum-load scenarios we may run on TX1/TX2:
    1). TX1: 4KP30 encode + 4KP60 decode + 5 channel 4KP60–>1080P60 scale + 4KP60 GUI output
    2). TX2: 4KP60 encode + 4KP60 decode + 5 channel 4KP60–>1080P60 scale + 4KP60 GUI output
    We require encode, decode, scale and output to run at the same time. Can TX1/TX2 handle the scenarios above?

Thx.

This post claims it only does 4KP30:

https://devtalk.nvidia.com/default/topic/1003950/jetson-tx2/encoding-yuv420-received-from-csi/post/5139336/#5139336

I have the same questions.

RASAP

Thanks.

Hi,
For TX2, please refer to the link for achieving 4Kp30 H265 encoding:
https://devtalk.nvidia.com/default/topic/1004950/jetson-tx2/multimedia-api-scale-encode/post/5135482/#5135482

5 channel 4Kp60 encoding/decoding exceeds the HW limitation.

Here is a case of 4 channel 1080p25 transcoding:
https://devtalk.nvidia.com/default/topic/979908/jetson-tx1/gstreamer-transcoding-performance-issue/post/5033461/#5033461

Hi DaneLLL,

Sorry for the confusion. The 5 channels are only for scaling from 4KP60 to 1080P60, not for encoding/decoding.

1). TX1: 1 channel 4KP30 encode + 1 channel 4KP60 decode + 5 channel 4KP60–>1080P60 scale + 4KP60 GUI output
2). TX2: 1 channel 4KP60 encode + 1 channel 4KP60 decode + 5 channel 4KP60–>1080P60 scale + 4KP60 GUI output

If so, can TX1/TX2 meet our requirements?

Hi Yugui,
1). TX1: 1 channel 4KP30 encode + 1 channel 4KP60 decode + 5 channel 4KP60–>1080P60 scale + 4KP60 GUI output
How do you implement ‘5 channel 4KP60–>1080P60 scale’? Apart from this, the rest looks fine.

2). TX2: 1 channel 4KP60 encode + 1 channel 4KP60 decode + 5 channel 4KP60–>1080P60 scale + 4KP60 GUI output
1 channel 4KP60 encode should be fine, but it is close to the HW limitation (up to 120Mbps).
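
To put the 120Mbps figure in context, here is a rough sketch of how much compression a 4KP60 stream needs to fit under it. It assumes "4K" means 3840x2160, 8-bit YUV 4:2:0 input, and that 120Mbps is the encoder's maximum output bitrate; none of these are confirmed in this thread.

```python
def raw_yuv420_bitrate(width, height, fps, bit_depth=8):
    """Uncompressed bitrate in bits/s for 4:2:0 video (1.5 samples per pixel)."""
    samples_per_frame = width * height * 3 // 2
    return samples_per_frame * bit_depth * fps


# Assumption: 4KP60 = 3840x2160 at 60 fps, 8-bit 4:2:0.
raw_bps = raw_yuv420_bitrate(3840, 2160, 60)

# Assumption: the "up to 120Mbps" figure is the encoded output bitrate cap.
encoded_bps = 120_000_000

compression_ratio = raw_bps / encoded_bps
print(f"raw: {raw_bps / 1e9:.2f} Gbit/s, "
      f"ratio needed at 120 Mbit/s: {compression_ratio:.1f}:1")
```

This only shows the arithmetic; it says nothing about the quality the encoder delivers at that ratio.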

Hi DaneLLL,

In our scenario, “+” means running concurrently. In other words, we need to process video encode, decode, scale, and GUI output at the same time. Moreover, ‘5 channel 4KP60–>1080P60 scale’ means we need to scale five video streams at the same time.

Can TX1/TX2 run these combined scenarios without problems?

Hi Yugui,
We don’t have full test cases similar to yours, but I think ‘5 channel 4KP60–>1080P60 scale’ will be an issue if using the HW converter. Here is a relevant thread:
https://devtalk.nvidia.com/default/topic/992751/jetson-tx1/multimedia-api-videoconverter/post/5081466/#5081466

Hi DaneLLL,

Thanks for your patient reply.
What about my questions 1&2? I look forward to your update.

Hi DaneLLL,

As we know, some manufacturers can achieve 720P30 video encoding within 512K of bandwidth. If we use TX1/TX2 for 720P30/1080P30/4KP30 video encoding, what is the minimum bandwidth each will consume? Do you have any suggested bandwidth for these encodes?
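
For a first-order feel of the numbers, one naive approach is to hold bits per pixel constant and scale the 720P30 figure to the larger resolutions. The sketch below assumes "512K" means 512 kbit/s and that 4K is 3840x2160; both are assumptions. Real encoders usually need fewer bits per pixel at higher resolutions, and the result depends heavily on content, so treat these as rough estimates, not NVIDIA recommendations.

```python
def bits_per_pixel(bitrate_bps, width, height, fps):
    """Average encoded bits spent on each pixel."""
    return bitrate_bps / (width * height * fps)


def scale_bitrate(bpp, width, height, fps):
    """Bitrate in bits/s that keeps the same bits per pixel at another format."""
    return bpp * width * height * fps


# Assumption: "512K" = 512 kbit/s for 1280x720 at 30 fps.
ref_bpp = bits_per_pixel(512_000, 1280, 720, 30)

# Assumption: 4KP30 = 3840x2160 at 30 fps.
formats = {"720P30": (1280, 720), "1080P30": (1920, 1080), "4KP30": (3840, 2160)}
estimates = {name: scale_bitrate(ref_bpp, w, h, 30) for name, (w, h) in formats.items()}

for name, bps in estimates.items():
    print(f"{name}: about {bps / 1000:.0f} kbit/s at {ref_bpp:.4f} bits/pixel")
```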

The texture units can render 5 textures of size 4096x2048, down-sampled to 2048x1024, at 60 Hz, with no problem.
If the destination of the scaled streams is the GUI output, then you could just use the streams at the full resolution, and scale them in the display code, using texture units rather than video scaler units.
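
As a rough count of what that workload means, the sketch below works out texels read and pixels written per second, both for the sizes quoted above and for a 3840x2160 to 1920x1080 case (my assumption for what "4KP60–>1080P60" means). It counts one read per source texel, which is a simplification; it is not a measurement of what the GPU can sustain.

```python
def texel_rates(channels, src_w, src_h, dst_w, dst_h, fps):
    """Return (source texels read per second, destination pixels written per second)."""
    read = channels * src_w * src_h * fps
    written = channels * dst_w * dst_h * fps
    return read, written


# Sizes quoted in the post: 5 textures, 4096x2048 down to 2048x1024, 60 Hz.
quoted = texel_rates(5, 4096, 2048, 2048, 1024, 60)

# Assumption: the original scenario is 3840x2160 down to 1920x1080.
uhd = texel_rates(5, 3840, 2160, 1920, 1080, 60)

for label, (read, written) in (("4096x2048", quoted), ("3840x2160", uhd)):
    print(f"{label}: {read / 1e9:.2f} Gtexel/s read, {written / 1e6:.0f} Mpixel/s written")
```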

The question then is: Where does the data come from? Are you capturing 5 separate 4KP60 inputs over CSI/MIPI? Do they arrive over the network? USB3? Are they being decoded, or are they raw? What format? Bayer, RGB, YCbCr, or what?

I think a better understanding of the specific requirements (like, a flow chart, with data formats marked up) would make it easier to understand what parts could be done by what bits of hardware, because these details really do matter.

To snarky,

Thanks for sharing this important info. Our five scaling sources are two MIPI inputs (YCbCr) + two decoded streams (YCbCr) + the GUI display (RGBA).

Do you have more information to share with us? We are most concerned with questions 1&2:

  1. According to the H.264/H.265 standards, H.265 should improve compression efficiency by 30 ~ 50% compared to H.264 HP. Can the H.265 encoder in TX1/TX2 reach this level?

  2. Could you provide NVIDIA’s test reports on H.265 encoding performance? Alternatively, YUV files and the corresponding encoded .h265 bitstream files would be fine. If neither test reports nor YUV/.h265 files are available, could we send you raw YUV files (as website links) and have you return the encoded bitstream files?

Hi Yugui,
For question 1&2, we do not have the reports. Do you have available test reports to share with us? We would like to know more about what test reports you refer to.

Hi DaneLLL,

We don’t have reports either, so we hope NVIDIA can provide some files to us, including raw YUV files and *.h265 encoded bitstream files. You can attach the *.h265 bitstream files in this topic, or send them to my e-mail: xuyugui@huawei.com.

In our experience, there are two ways to test H.265 encode efficiency and performance:

  1. Send the same YUV to both encoders and use CQP rate control to check whether H.265 improves encode efficiency by 30%-50% compared with H.264 HP.
  2. Send the same YUV to both encoders, set VBR with the same bitrate on both the H.264 and H.265 encoders, then check whether H.265 gives better encoded image quality than H.264.
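
The two checks above could be scored with something like the sketch below: a size comparison for the CQP test and PSNR for the equal-bitrate test. PSNR is my choice of quality metric here, not something stated in this thread, and the sample values are made up for illustration. Note that the same QP does not mean the same quality across the two codecs, so the size comparison is only meaningful when quality is also matched.

```python
import math


def bitrate_saving_percent(h264_bytes, h265_bytes):
    """Percentage by which the H.265 stream is smaller than the H.264 stream."""
    return 100.0 * (h264_bytes - h265_bytes) / h264_bytes


def psnr(reference, decoded, max_value=255):
    """PSNR in dB between two equal-length sequences of 8-bit samples."""
    if len(reference) != len(decoded):
        raise ValueError("sample sequences must have the same length")
    mse = sum((a - b) ** 2 for a, b in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical samples
    return 10 * math.log10(max_value ** 2 / mse)


# Made-up stream sizes in bytes, for illustration only.
saving = bitrate_saving_percent(1000, 600)

# Made-up luma samples standing in for a source frame and a decoded frame.
source_luma = bytes([100, 120, 140, 160])
decoded_luma = bytes([101, 119, 142, 158])
quality_db = psnr(source_luma, decoded_luma)

print(f"saving: {saving:.1f}%, PSNR: {quality_db:.2f} dB")
```

For real files, the samples would come from the raw YUV input and from each bitstream decoded back to YUV, compared frame by frame.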

So it sounds like your scaling + GUI can totally be solved by the GPU half of the Jetson.
I know nothing about the internals of the video encoder, unfortunately.
If the specifics are really important to you, I’d suggest buying a devkit and encoding the specific files you need, to run the experiment yourself.

Regarding encode and decode, do TX1 and TX2 only support yuv420? What about yuv422 and yuv444?

Please refer to
https://devtalk.nvidia.com/default/topic/1003950/jetson-tx2/encoding-yuv420-received-from-csi/post/5139336/#5139336

yuv422 and yuv444 are not supported.