Jetson hardware-accelerated encoders: maximum performance

• Hardware Platform (Jetson / GPU) Jetson
• DeepStream Version 6.2
• JetPack Version (valid for Jetson only) JetPack 5.1.1
• TensorRT Version 8.5.2.2
• Issue Type( questions, new requirements, bugs) Questions

Hello,
We want to build a GStreamer pipeline whose input source is 100 MP at 10 fps. The data will then be encoded to H.264, H.265, and AV1.
We are using an Orin NX with 16 GB of RAM.
It would help to know the limits of the hardware-accelerated encoders (nvv4l2h265enc and others) and the maximum bandwidth they can sustain.
It would also help to know how to get the best performance from them in terms of buffer allocation and memory management.
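
As a rough sizing aid for the bandwidth question above, the raw input rate can be estimated from the frame geometry. The sketch below assumes NV12 (12 bits per pixel) as the encoder input format, which is an assumption and not something stated in this thread; the function name raw_rate_nv12 is only illustrative.

```shell
#!/bin/sh
# Hypothetical helper: raw bytes per second for an NV12 stream.
# NV12 (assumed here) stores 12 bits per pixel, so one frame is w * h * 3 / 2 bytes.
raw_rate_nv12() {
    width=$1
    height=$2
    fps=$3
    echo $(( width * height * 3 / 2 * fps ))
}

# Geometry used in the failing pipeline: 6000x3750 at 10 fps.
raw_rate_nv12 6000 3750 10
```

For 6000x3750 at 10 fps this comes to 337500000 bytes/s, roughly 337.5 MB/s of uncompressed input. Note that 6000x3750 is 22.5 MP per frame, so a true 100 MP source would be about 4.4 times larger.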

This pipeline already fails:

gst-launch-1.0 videotestsrc ! video/x-raw, width=6000, height=3750, framerate=10/1  ! nvvidconv compute-hw=GPU nvbuf-memory-type=nvbuf-mem-cuda-device  ! 'video/x-raw(memory:NVMM)'  ! nvv4l2h265enc maxperf-enable=1 vbv-size=1 ! h265parse ! fakesink 

logs:

nvbufsurftransform: Could not get EGL display connection
Setting pipeline to PAUSED ...
nvbuf_utils: Could not get EGL display connection
Opening in BLOCKING MODE 
Pipeline is PREROLLING ...
Redistribute latency...
NvMMLiteOpen : Block : BlockType = 8 
===== NVMEDIA: NVENC =====
NvMMLiteBlockCreate : Block : BlockType = 8 
ERROR: from element /GstPipeline:pipeline0/nvv4l2h265enc:nvv4l2h265enc0: Failed to process frame.
Additional debug info:
/dvs/git/dirty/git-master_linux/3rdparty/gst/gst-v4l2/gst-v4l2/gstv4l2videoenc.c(1513): gst_v4l2_video_enc_handle_frame (): /GstPipeline:pipeline0/nvv4l2h265enc:nvv4l2h265enc0:
Maybe be due to not enough memory or failing driver
ERROR: pipeline doesn't want to preroll.
Setting pipeline to NULL ...
Freeing pipeline ...

With GST_DEBUG=4

0:00:01.143249546 22184 0xaaaafdf51060 ERROR          v4l2allocator gstv4l2allocator.c:366:gst_v4l2_memory_group_new: buffer size 99876864 is smaller then negotiated size 99893248, this is usually the result of a bug in the v4l2 driver or libv4l.

0:00:01.143274859 22184 0xaaaafdf51060 ERROR       v4l2bufferpool gstv4l2bufferpool.c:1217:gst_v4l2_buffer_pool_start:<nvv4l2h264enc0:pool:sink> we received 0 buffer from device '/dev/nvhost-msenc', we want at least 2

0:00:01.143286187 22184 0xaaaafdf51060 ERROR           bufferpool gstbufferpool.c:559:gst_buffer_pool_set_active:<nvv4l2h264enc0:pool:sink> start failed

0:00:01.143295115 22184 0xaaaafdf51060 WARN            v4l2videoenc gstv4l2videoenc.c:1492:gst_v4l2_video_enc_handle_frame:<nvv4l2h264enc0> error: Failed to allocate required memory.

0:00:01.143300523 22184 0xaaaafdf51060 WARN            v4l2videoenc gstv4l2videoenc.c:1492:gst_v4l2_video_enc_handle_frame:<nvv4l2h264enc0> error: Buffer pool activation failed

Thanks

The hardware encoder on the Orin NX does not support 6000x3750.

Refer to this table.

Thank you,
When I ran it at 4K, 10 fps, I got this error message:

/dvs/git/dirty/git-master_linux/nvutils/nvbufsurftransform/nvbufsurftransform_copy.cpp:335: => Failed in mem copy

Thanks,
I am trying a pipeline with 5 input sources at 1080p, 10 fps.
The pipeline reaches PLAYING, but nothing is happening on the GPU.
Also, the NVENC HW engine starts at roughly 700 MHz and stops immediately (as seen in jtop).
This is the GStreamer command:

gst-launch-1.0 compositor name=mix sink_0::xpos=0 sink_1::xpos=10 sink_2::xpos=20 sink_3::xpos=30 sink_4::xpos=40 ! nvvidconv  ! 'video/x-raw(memory:NVMM),width=9600,height=1080,framerate=10/1' ! nvv4l2h265enc maxperf-enable=1 ! fakesink videotestsrc pattern=0 ! video/x-raw,width=1920,height=1080,framerate=10/1 ! mix.sink_0 videotestsrc pattern=1 ! video/x-raw,width=1920,height=1080,framerate=10/1 ! mix.sink_1 videotestsrc ! video/x-raw,width=1920,height=1080,framerate=10/1 ! mix.sink_2 videotestsrc pattern=3 ! video/x-raw,width=1920,height=1080,framerate=10/1 ! mix.sink_3 videotestsrc pattern=4 ! video/x-raw,width=1920,height=1080,framerate=10/1 ! mix.sink_4

The Orin NX can support 4K encoding. I misread the data in the table.

For the 4K test, please use nvvideoconvert with the VIC to convert the memory format. The GPU path reports an error due to insufficient video memory.

gst-launch-1.0 videotestsrc ! video/x-raw, width=3260, height=2160, framerate=10/1  ! nvvideoconvert ! 'video/x-raw(memory:NVMM)'  ! nvv4l2h265enc maxperf-enable=1 vbv-size=1 ! h265parse ! fakesink

In your 5-source pipeline, the compositor combines the 5 input sources into a 9600x1080 frame, which exceeds the hardware specification. For 5 input sources, please use the following command line instead, which gives each source its own encoder:

gst-launch-1.0 videotestsrc ! video/x-raw, width=3260, height=2160, framerate=10/1  ! nvvideoconvert ! 'video/x-raw(memory:NVMM)'  ! nvv4l2h265enc maxperf-enable=1 vbv-size=1 ! fakesink videotestsrc ! video/x-raw, width=3260, height=2160, framerate=10/1  ! nvvideoconvert ! 'video/x-raw(memory:NVMM)'  ! nvv4l2h265enc maxperf-enable=1 vbv-size=1 ! fakesink videotestsrc ! video/x-raw, width=3260, height=2160, framerate=10/1  ! nvvideoconvert ! 'video/x-raw(memory:NVMM)'  ! nvv4l2h265enc maxperf-enable=1 vbv-size=1 ! fakesink videotestsrc ! video/x-raw, width=3260, height=2160, framerate=10/1  ! nvvideoconvert ! 'video/x-raw(memory:NVMM)'  ! nvv4l2h265enc maxperf-enable=1 vbv-size=1 ! fakesink videotestsrc ! video/x-raw, width=3260, height=2160, framerate=10/1  ! nvvideoconvert ! 'video/x-raw(memory:NVMM)'  ! nvv4l2h265enc maxperf-enable=1 vbv-size=1 ! fakesink videotestsrc ! video/x-raw, width=3260, height=2160, framerate=10/1  ! nvvideoconvert ! 'video/x-raw(memory:NVMM)'  ! nvv4l2h265enc maxperf-enable=1 vbv-size=1 ! fakesink
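
The 9600x1080 figure mentioned above corresponds to five 1920-pixel-wide inputs placed side by side. As an illustration only, and assuming side-by-side tiling was the intent of the compositor pipeline, the per-pad xpos offsets and the resulting frame width can be computed as below. The function name tile_offsets is hypothetical; sink_N::xpos is the compositor pad property already used earlier in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print compositor xpos offsets for N tiles of equal width
# laid out left to right, followed by the total width of the combined frame.
tile_offsets() {
    count=$1
    tile_width=$2
    i=0
    while [ "$i" -lt "$count" ]; do
        echo "sink_$i::xpos=$(( i * tile_width ))"
        i=$(( i + 1 ))
    done
    echo "total width: $(( count * tile_width ))"
}

# Five 1920-wide inputs, as in the compositor pipeline.
tile_offsets 5 1920
```

With these inputs the last line reports a total width of 9600, which is the combined width that the reply says exceeds the encoder's specification.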

Hello,

Thank you for your quick response.
I have a quick question: since the Orin NX is capable of 3x4K@30fps, can we assume that it would be able to encode 5x4K@10fps?
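
One back-of-the-envelope way to frame this question is to compare aggregate pixel rates. The sketch below takes the 3x4K@30fps figure at face value and assumes 4K means 3840x2160; the function name pixel_rate is illustrative. It says nothing about limits on the number of encoder instances or memory, which pixel rate alone cannot capture.

```shell
#!/bin/sh
# Hypothetical helper: aggregate pixels per second for N identical streams.
# Arguments: streams width height fps
pixel_rate() {
    echo $(( $1 * $2 * $3 * $4 ))
}

rated=$(pixel_rate 3 3840 2160 30)    # figure quoted for the encoder
wanted=$(pixel_rate 5 3840 2160 10)   # configuration being asked about

echo "rated:  $rated pixels/s"
echo "wanted: $wanted pixels/s"

if [ "$wanted" -le "$rated" ]; then
    echo "requested load is within the rated pixel rate"
else
    echo "requested load exceeds the rated pixel rate"
fi
```

By this measure 5x4K@10fps (414720000 pixels/s) is below 3x4K@30fps (746496000 pixels/s), so raw throughput alone would not rule it out.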
I have already tried the pipeline with the VIC option for nvvideoconvert and 5x4K@10fps (video/x-raw, width=3840, height=2160, framerate=10/1). The pipeline seems to work and logs as expected:

Setting pipeline to PAUSED ...
Opening in BLOCKING MODE 
Opening in BLOCKING MODE 
Opening in BLOCKING MODE 
Opening in BLOCKING MODE 
Opening in BLOCKING MODE 
Pipeline is PREROLLING ...
Redistribute latency...
NvMMLiteOpen : Block : BlockType = 8 
Redistribute latency...
NvMMLiteOpen : Block : BlockType = 8 
NvMMLiteOpen : Block : BlockType = 8 
Redistribute latency...
NvMMLiteOpen : Block : BlockType = 8 
Redistribute latency...
===== NVMEDIA: NVENC =====
Redistribute latency...
NvMMLiteBlockCreate : Block : BlockType = 8 
NvMMLiteOpen : Block : BlockType = 8 
===== NVMEDIA: NVENC =====
===== NVMEDIA: NVENC =====
NvMMLiteBlockCreate : Block : BlockType = 8 
NvMMLiteBlockCreate : Block : BlockType = 8 
===== NVMEDIA: NVENC =====
NvMMLiteBlockCreate : Block : BlockType = 8 
===== NVMEDIA: NVENC =====
NvMMLiteBlockCreate : Block : BlockType = 8 
NVMEDIA: H265 : Profile : 1 
NVMEDIA: Need to set EMC bandwidth : 2872000 
NVMEDIA_ENC: bBlitMode is set to TRUE 
NVMEDIA: H265 : Profile : 1 
NVMEDIA: H265 : Profile : 1 
NVMEDIA: H265 : Profile : 1 
NVMEDIA: Need to set EMC bandwidth : 2872000 
NVMEDIA_ENC: bBlitMode is set to TRUE 
NVMEDIA: Need to set EMC bandwidth : 2872000 
NVMEDIA_ENC: bBlitMode is set to TRUE 
NVMEDIA: Need to set EMC bandwidth : 2872000 
NVMEDIA_ENC: bBlitMode is set to TRUE 
NVMEDIA: H265 : Profile : 1 
NVMEDIA: Need to set EMC bandwidth : 2872000 
NVMEDIA_ENC: bBlitMode is set to TRUE 
Pipeline is PREROLLED ...
Setting pipeline to PLAYING ...
New clock: GstSystemClock

The HW engines are also working fine, but I got this message in dmesg:

nvmap_alloc_handle: PID 252211: gst-launch-1.0: WARNING: All NvMap Allocations must have a tag to identify the subsystem allocating memory.Please pass the tag to the API call NvRmMemHanldeAllocAttr() or relevant.

This needs to be tested; I’m not sure whether the BSP limits the number of 4K encoder instances.

Perhaps you can get more information here on the BSP forum.

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks