Performance issue while 14_multivideo_decode + output change to RGB

hobin0920 · July 9, 2020, 8:21am

Hi Sir:

 I followed this thread: https://forums.developer.nvidia.com/t/tx2-decide-h264-with-tegra-multimedia-api/121421/32
to get RGB frames from the hardware codec, but when I run it with 2 input files, the frame rate drops below 24 fps. Can we make it faster?

thank you

DaneLLL · July 9, 2020, 11:17pm

Hi,
Do you save the YUVs to a file? The performance should be sufficient if you convert the video frame from NV12 to RGBA through NvBufferTransform(). Since a 4K RGBA frame is 3840x2160x4 bytes, saving it to disk will take some time.
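To put numbers on the frame size mentioned above, here is a minimal sketch of the arithmetic. The helper names are made up for illustration, and the figures ignore any row padding (pitch) that a hardware buffer may add, so treat them as lower bounds.

```cpp
#include <cstdint>

// Bytes in one packed frame: width x height x bytes per pixel.
// RGBA uses 4 bytes per pixel. (Hypothetical helper, not part of the sample.)
constexpr std::uint64_t packed_frame_bytes(std::uint64_t width,
                                           std::uint64_t height,
                                           std::uint64_t bytes_per_pixel) {
    return width * height * bytes_per_pixel;
}

// Bytes in one NV12 frame: a full-size luma plane plus a half-size
// interleaved chroma plane, i.e. 1.5 bytes per pixel.
constexpr std::uint64_t nv12_frame_bytes(std::uint64_t width,
                                         std::uint64_t height) {
    return width * height * 3 / 2;
}

// Sustained write rate in bytes per second for `streams` streams at `fps`.
constexpr std::uint64_t write_rate_bytes_per_sec(std::uint64_t bytes_per_frame,
                                                 std::uint64_t fps,
                                                 std::uint64_t streams) {
    return bytes_per_frame * fps * streams;
}

// Compile-time checks of the 4K figures.
static_assert(packed_frame_bytes(3840, 2160, 4) == 33177600ULL,
              "4K RGBA frame size");
static_assert(nv12_frame_bytes(3840, 2160) == 12441600ULL,
              "4K NV12 frame size");
```

With these helpers, two 4K RGBA streams at 24 fps come to write_rate_bytes_per_sec(33177600, 24, 2) = 1,592,524,800 bytes per second, against 597,196,800 for the same streams in NV12. That gap is why writing RGBA to disk can become the bottleneck before the conversion itself does.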

hobin0920 · July 10, 2020, 1:32am

Hi Dane:

The YUV output is okay, but when I use NvBufferTransform() to save RGBA at 4K resolution, the performance is not enough to reach 24 fps. Do you have any idea how to speed it up?

thank you

DaneLLL · July 10, 2020, 2:37am

Hi,
It looks like you are running:

4K stream -> decode to YUV -> convert to RGBA via NvBufferTransform() -> save to a video file

Please remove the save-to-file step and check again.

hobin0920 · July 10, 2020, 7:21am

Hi Dane:

I have already removed it, but the result is still the same when I decode more than 1 file and convert to RGBA. The frame rate is lower than 24 fps for each stream. Do you have any idea how to improve it?

thank you

DaneLLL · July 14, 2020, 2:29am

Hi,
We don’t observe the issue when running 14_multivideo_decode. The log is:

14_multivideo_decode$ ./multivideo_decode num_files 2 /home/nvidia/4k.h264 H264 -o ~/a1.yuv /home/nvidia/4k.h264 H264 -o ~/a2.yuv --disable-rendering --stats
Set governor to performance before enabling profiler
Creating decoder in blocking mode
Creating decoder in blocking mode
Opening in BLOCKING MODE
Set governor to performance before enabling profiler
NvMMLiteOpen : Block : BlockType = 261
Opening in BLOCKING MODE
Set governor to performance before enabling profiler
NvMMLiteOpen : Block : BlockType = 261
NVMEDIA: Reading vendor.tegra.display-size : status: 6
NVMEDIA: Reading vendor.tegra.display-size : status: 6
NvMMLiteBlockCreate : Block : BlockType = 261
NvMMLiteBlockCreate : Block : BlockType = 261
Setting frame input mode to 1
Setting frame input mode to 1
Starting decoder capture loop thread
Starting decoder capture loop thread
Video Resolution: 3840x2160
Decoder colorspace ITU-R BT.709 with standard range luma (16-235)
Video Resolution: 3840x2160
Decoder colorspace ITU-R BT.709 with standard range luma (16-235)
Query and set capture successful
Query and set capture successful
Input file read complete
Input file read complete
Exiting decoder capture loop thread
Instance 0 executed sucessfully.
Exiting decoder capture loop thread
Instance 1 executed sucessfully.
*****************************************
Stream = /home/nvidia/4k.h264
Total Profiling time = 16.4949
Average FPS = 54.5017
Average latency(usec) = 0
Minimum latency(usec) = 18446744073709551615
Maximum latency(usec) = 0
*****************************************
*****************************************
Stream = /home/nvidia/4k.h264
Total Profiling time = 16.4675
Average FPS = 54.5923
Average latency(usec) = 0
Minimum latency(usec) = 18446744073709551615
Maximum latency(usec) = 0
*****************************************
App run was successful
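
A side note on the stats above: the minimum latency value 18446744073709551615 is the largest value an unsigned 64-bit integer can hold, while the average and maximum are both 0. That is the pattern a running minimum shows when it starts at the maximum value and no latency sample is ever recorded. The sketch below illustrates the pattern; the LatencyStats struct is hypothetical and is not taken from the sample's source, so this is a plausible reading of the log rather than a confirmed explanation.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical latency accumulator, for illustration only.
struct LatencyStats {
    // A running minimum usually starts at the largest representable value
    // so that the first real sample always replaces it.
    std::uint64_t min_usec = std::numeric_limits<std::uint64_t>::max();
    std::uint64_t max_usec = 0;
    std::uint64_t total_usec = 0;
    std::uint64_t samples = 0;

    // Record one latency sample in microseconds.
    constexpr void add(std::uint64_t usec) {
        if (usec < min_usec) min_usec = usec;
        if (usec > max_usec) max_usec = usec;
        total_usec += usec;
        ++samples;
    }

    // Average latency, or 0 when nothing was recorded.
    constexpr std::uint64_t average() const {
        return samples ? total_usec / samples : 0;
    }
};

// With no samples, the struct reports the same three values as the log.
static_assert(LatencyStats{}.min_usec == 18446744073709551615ULL,
              "untouched minimum equals UINT64_MAX");
static_assert(LatencyStats{}.max_usec == 0, "untouched maximum is 0");
static_assert(LatencyStats{}.average() == 0, "average with no samples is 0");
```

Read this way, the latency lines in the log carry no information about the run, and only the Average FPS figures are meaningful.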

The patch for converting decoded NV12 to RGBA:

diff --git a/multimedia_api/ll_samples/samples/14_multivideo_decode/multivideo_decode_main.cpp b/multimedia_api/ll_samples/samples/14_multivideo_decode/multivideo_decode_main.cpp
index ebc4095..24f15ba 100644
--- a/multimedia_api/ll_samples/samples/14_multivideo_decode/multivideo_decode_main.cpp
+++ b/multimedia_api/ll_samples/samples/14_multivideo_decode/multivideo_decode_main.cpp
@@ -600,8 +600,7 @@ query_and_set_capture(context_t * ctx)
     input_params.width = crop.c.width;
     input_params.height = crop.c.height;
     input_params.layout = NvBufferLayout_Pitch;
-    input_params.colorFormat = ctx->out_pixfmt == 1 ? NvBufferColorFormat_NV12 :
-                                            NvBufferColorFormat_YUV420;
+    input_params.colorFormat = NvBufferColorFormat_ABGR32;
     input_params.nvbuf_tag = NvBufferTag_VIDEO_DEC;
 
     ret = NvBufferCreateEx (&ctx->dst_dma_fd, &input_params);
@@ -1069,7 +1068,7 @@ dec_capture_loop_fcn(void *arg)
              /* If we need to write to file or display the buffer, give
                the buffer to video converter output plane instead of
                returning the buffer back to decoder capture plane. */
-            if (ctx->out_file || (!ctx->disable_rendering && !ctx->stats))
+            if (1)
             {
 #ifndef USE_NVBUF_TRANSFORM_API
                 NvBuffer *conv_buffer;

hobin0920 · July 14, 2020, 2:38am

DaneLLL:

Creating decoder in blocking mode
Creating decoder in blocking mode
Opening in BLOCKING MODE
Set governor to performance before enabling profiler
NvMMLiteOpen : Block : BlockType = 261
Opening in BLOCKING MODE
Set governor to performance before enabling profiler
NvMMLiteOpen : Block : BlockType = 261

Hi Dane:
I get the same result with only 2x 4K streams, but when I go up to 4x 4K streams, it is slower than 24 fps per channel. So the differences are 2 versus 4 channels, and also blocking versus non-blocking mode. Do you have any idea how to improve the performance with 4x 4K channels, or is it a limitation of this platform?

DaneLLL · July 14, 2020, 2:44am

Hi,
It is a hardware limitation of TX2. 4x 4Kp30 is not supported.
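
As a rough sanity check against the log earlier in the thread, which reports about 54.5 fps for each of two 4K streams: the sketch below compares that aggregate rate with what N streams at a target frame rate would need. It assumes the aggregate throughput is shared evenly across streams, which is an assumption made here and not a documented property of the decoder, and the function names are made up for illustration.

```cpp
// Aggregate 4K frame rate measured in the two-stream log above.
constexpr double kMeasuredAggregateFps = 54.5017 + 54.5923;

// Total frames per second that `streams` streams at `target_fps` require.
constexpr double required_fps(int streams, double target_fps) {
    return streams * target_fps;
}

// True if the measured aggregate rate covers the requirement.
// Assumption: throughput divides evenly across streams.
constexpr bool can_sustain(double aggregate_fps, int streams,
                           double target_fps) {
    return aggregate_fps >= required_fps(streams, target_fps);
}

// 2 x 30 = 60 fits within roughly 109 fps; 4 x 30 = 120 does not.
static_assert(can_sustain(kMeasuredAggregateFps, 2, 30.0),
              "two 4Kp30 streams fit the measured rate");
static_assert(!can_sustain(kMeasuredAggregateFps, 4, 30.0),
              "four 4Kp30 streams exceed the measured rate");
```

The measured aggregate of roughly 109 frames per second falls short of the 120 that 4x 4Kp30 would need, which is consistent with the reply above. The estimate is optimistic, since it ignores any extra cost of running four decoder instances and four RGBA conversions at once.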

hobin0920 · July 14, 2020, 2:54am

Hi Dane:

How about NX? Is its decoder more powerful than the TX2's?

thank you

DaneLLL · July 14, 2020, 4:59am

Hi,
For Xavier NX, please check
https://developer.nvidia.com/jetson-xavier-nx-data-sheet
It can run 4x 4Kp30 in HEVC decoding.
