TX2 with JetPack 3.1: GPU utilization issue for video encoding (MSENC/NVENC)

Hi,

  1. I am trying to get performance numbers and GPU utilization for H.265 encoding using the hardware accelerators on the Tegra TX2 board.

1.a) tegrastats reports an MSENC number. How does it relate to % utilization of the hardware accelerator? The output is pasted below.
1.b) Command for encoding: {./video_encode COSTARICA4k.yuv 3840 2160 H265 out1.h265 -rc vbr -br 10000000 -fps 30 1}
1.c) Command for checking resource utilization:

root@tegra-ubuntu:~# ./tegrastats
RAM 3548/7851MB (lfb 1x2MB) cpu [0%@1575,off,off,0%@1574,0%@1574,0%@1572] EMC 20%@1600 APE 150 MSENC 1113 GR3D 0%@114
RAM 3518/7851MB (lfb 1x2MB) cpu [16%@653,off,off,33%@652,58%@652,13%@654] EMC 27%@1062 APE 150 MSENC 1113 GR3D 0%@318
RAM 3488/7851MB (lfb 1x2MB) cpu [32%@806,off,off,22%@806,43%@806,34%@805] EMC 27%@1062 APE 150 MSENC 1113 GR3D 0%@114
RAM 3462/7851MB (lfb 1x1MB) cpu [47%@1114,off,off,25%@1113,32%@1113,54%@1113] EMC 22%@1331 APE 150 MSENC 1113 GR3D 30%@114
RAM 3445/7851MB (lfb 4x256kB) cpu [27%@806,off,off,49%@806,15%@806,31%@806] EMC 24%@1331 APE 150 MSENC 1113 GR3D 0%@420
RAM 3440/7851MB (lfb 4x256kB) cpu [19%@1153,off,off,54%@1153,4%@1156,19%@1157] EMC 23%@1600 APE 150 MSENC 1113 GR3D 0%@114
RAM 3441/7851MB (lfb 4x256kB) cpu [25%@1581,off,off,21%@1573,7%@1574,37%@1575] EMC 25%@1600 APE 150 MSENC 1113 GR3D 20%@114
RAM 3411/7851MB (lfb 5x256kB) cpu [36%@806,off,off,35%@806,32%@806,44%@806] EMC 24%@1331 APE 150 MSENC 1113 GR3D 24%@114
RAM 3402/7851MB (lfb 3x256kB) cpu [35%@652,off,off,34%@653,35%@652,31%@652] EMC 24%@1331 APE 150 MSENC 1113 GR3D 0%@420
RAM 3395/7851MB (lfb 3x256kB) cpu [35%@652,off,off,28%@805,46%@806,37%@806] EMC 24%@1331 APE 150 MSENC 1113 GR3D 24%@114

  2. In the sample code there are two encoders: "01_video_encode" and "03_video_cuda_enc".
    2.a) What is the difference?
    2.b) Which one uses the CUDA cores and which uses the hardware accelerator (NVENC)?
    {Both samples look similar; 01_video_encode just has more configuration options.}

  3. For H.264, tegrastats shows only MSENC. Does that mean the hardware accelerator for H.264 is MSENC (a dedicated hardware chip)?

Hi meRaza,
Please run ‘sudo ./tegrastats’ to get full information.

[01_video_encode]
input YUVs -> H264 encoder
[03_video_cuda_enc]
input YUVs -> CUDA-processing -> H264 encoder

The HW engine is shown as MSENC in tegrastats. It is independent of the GPU.

Hi DaneLLL,

Thanks for the quick reply.
My queries are still not fully answered. Could you kindly help clarify them ASAP?

  1. I tried "sudo ./tegrastats", and also ran it as root via sudo -s; both give the same result.

root@tegra-ubuntu:~# sudo ./tegrastats
RAM 4024/7851MB (lfb 2x256kB) cpu [0%@1037,off,off,0%@1036,0%@1037,0%@1037] EMC 24%@1062 APE 150 MSENC 1113 GR3D 0%@216
RAM 4023/7851MB (lfb 2x256kB) cpu [20%@806,off,off,38%@806,30%@806,16%@808] EMC 20%@1600 APE 150 MSENC 1113 GR3D 0%@216
RAM 4022/7851MB (lfb 2x256kB) cpu [33%@652,off,off,20%@652,22%@654,45%@653] EMC 22%@1331 APE 150 MSENC 1113 GR3D 0%@114

I want to find the hardware accelerator utilization as a percentage (%) for MSENC 1113.
How do I relate MSENC 1113 to a percentage {10%, 20%…}?
I need this to calculate the maximum number of simultaneous encodes supported at different resolutions.

  2. If I understood your reply correctly: "01_video_encode" uses the hardware accelerator (MSENC/NVENC) and 03_video_cuda_enc uses the 256 CUDA cores (GPU)?

  3. Which API is better for 4K encoding: OpenMAX IL, MMAPI, GStreamer, or V4L2?
    My product needs to support multiple real-time 4K/2K/1080p encodes (on TX2 with the latest JetPack 3.1).

Hi meRaza,
tegrastats shows the current frequency of the MSENC engine. It cannot show utilization as a percentage. There are two posts about encoder capability:
https://devtalk.nvidia.com/default/topic/1008984/jetson-tx2/video-encode-speed/post/5149942/#5149942
https://devtalk.nvidia.com/default/topic/1009082/jetson-tx1/multiple-h-265-video-encoding-/post/5150527/#5150527
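
For anyone who wants to log these values while an encode is running, here is a minimal Python sketch that pulls the MSENC, GR3D and EMC fields out of a tegrastats line. The field layout is assumed from the lines pasted in this thread, and other JetPack releases may print different fields. The function name and dictionary keys are made up for illustration. The MSENC value is the engine frequency (presumably in MHz), not a utilization figure.

```python
import re

# Patterns match the field layout seen in the tegrastats lines quoted in
# this thread (an assumption; the layout can differ between releases).
MSENC_RE = re.compile(r"\bMSENC (\d+)")
GR3D_RE = re.compile(r"\bGR3D (\d+)%@(\d+)")
EMC_RE = re.compile(r"\bEMC (\d+)%@(\d+)")


def parse_tegrastats_line(line):
    """Extract encoder, GPU and memory-controller fields from one line.

    Fields that are absent from the line are returned as None.
    """
    msenc = MSENC_RE.search(line)
    gr3d = GR3D_RE.search(line)
    emc = EMC_RE.search(line)
    return {
        # Current MSENC clock (a frequency, not a percentage).
        "msenc_freq": int(msenc.group(1)) if msenc else None,
        # GPU (GR3D) load in percent and its current clock.
        "gr3d_util_pct": int(gr3d.group(1)) if gr3d else None,
        "gr3d_freq": int(gr3d.group(2)) if gr3d else None,
        # Memory controller (EMC) load in percent and its current clock.
        "emc_util_pct": int(emc.group(1)) if emc else None,
        "emc_freq": int(emc.group(2)) if emc else None,
    }


# One of the lines posted above.
sample = ("RAM 3462/7851MB (lfb 1x1MB) cpu [47%@1114,off,off,25%@1113,"
          "32%@1113,54%@1113] EMC 22%@1331 APE 150 MSENC 1113 GR3D 30%@114")
print(parse_tegrastats_line(sample))
```

When the encoder is idle the MSENC field may not appear at all; in that case the sketch returns None for it.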

Both 01_video_encode and 03_video_cuda_enc use MSENC to do the encoding. Encoding via GPU is only valid on desktop GPUs, not on Tegra.
03_video_cuda_enc demonstrates doing CUDA processing on the input YUVs before sending them to MSENC.

We support MMAPI (partly implemented in V4L2) and GStreamer. We don’t support OpenMAX IL.

Hi DaneLLL,
Thanks for the quick reply, it really helped clear up many doubts and confusions.

I have the following additional queries:

  1. Is the input to the hardware encoder (MSENC) YUV420 (I420 or NV12)?
    • What other formats are supported?

API selection
2) Tegra MMAPI
2a) In “03_video_cuda_enc”, is the CUDA processing (EGLImage) used to convert the raw camera input to YUV420 (I420 or NV12) in MMAPI?
2b) Is the CUDA processing necessary for hardware encoding of raw camera output? For example, “01_video_encode” does not use any CUDA processing.

  3. Regarding the GStreamer API, as I read in another post:
    (https://devtalk.nvidia.com/default/topic/979908/jetson-tx1/gstreamer-transcoding-performance-issue/post/5033461/#5033461)
    a) omxh264dec, omxh265enc, etc. use the HW engines in TX2 for encoding/decoding
    b) GStreamer uses OMX internally
    Based on this understanding, is gstreamer omx the same as OpenMAX IL? If so, OpenMAX IL should be supported on TX2 through GStreamer.

  4. Which is better for 4K encoding of raw camera output: MMAPI, GStreamer, or V4L2?

Hi meRaza,

No other formats are supported. Only I420 and NV12.
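
Since both supported formats are 8-bit 4:2:0, the raw input size per frame is easy to work out, which helps when budgeting memory bandwidth for several streams. A small Python sketch follows, using the 3840x2160 at 30 fps case from the encode command earlier in the thread. The function name is made up for illustration, and 8 bits per sample with even dimensions is assumed.

```python
def yuv420_frame_bytes(width, height):
    """Bytes in one 8-bit 4:2:0 frame (I420 or NV12), assuming even dimensions.

    I420 stores Y, U and V as three separate planes; NV12 stores Y followed
    by one interleaved UV plane. The total size is the same for both.
    """
    luma = width * height                       # full-resolution Y plane
    chroma = 2 * (width // 2) * (height // 2)   # U and V, each subsampled 2x2
    return luma + chroma


frame = yuv420_frame_bytes(3840, 2160)
print(frame)        # 12441600 bytes per 4K frame
print(frame * 30)   # 373248000 bytes per second of raw input at 30 fps
```

This only covers the raw data fed to the encoder. It says nothing about how many such streams MSENC can encode in real time.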

The CUDA processing is not necessary. It is there to demonstrate the functionality.

OpenMAX IL is supported through GStreamer. It is not supported if you call the OMX-IL APIs directly.

Either one is good. If you are familiar with GStreamer, you can go with GStreamer. If your use case is likely to be limited by GStreamer, you can go with MMAPI.

Hi DaneLLL,
Thanks a lot, your expert info helped resolve my doubts.