Decode and multiple renderers on TX2 NX

I create two decoders:
dec0 = NvVideoDecoder::createVideoDecoder("dec0");
dec1 = NvVideoDecoder::createVideoDecoder("dec1");

two threads:
pthread_create(&dec0_tid, NULL, dec_capture_loop_fcn, nvidia_data);
pthread_create(&dec1_tid, NULL, dec_capture_loop_fcn, nvidia_data);

two renders:
auto renderer0 = NvEglRenderer::createEglRenderer(render_name.c_str(), w, h, x, y);
auto renderer1 = NvEglRenderer::createEglRenderer(render_name.c_str(), w, h, x, y);

in dec_capture_loop_fcn, each thread renders its own stream:
renderer0->render(fd);   // thread 0
renderer1->render(fd);   // thread 1

This is on the newest JetPack 4.6. There are two threads, each doing decode -> render, but the two displayed videos are abnormal. I find that render() sometimes takes a long time (maybe 30-60 ms). If I remove the render(fd) call, each loop iteration is fast (about 10 ms).
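
For reference, I measured that roughly like this, wrapping the render call with std::chrono (the instrumentation itself is just a sketch):

#include <chrono>
#include <cstdio>

// Inside the capture loop, around the render call:
auto t0 = std::chrono::steady_clock::now();
renderer0->render(fd);
auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(
              std::chrono::steady_clock::now() - t0).count();
printf("render() took %lld ms\n", (long long)ms);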

Hi,
Creating multiple renderers in a single process may not achieve the target performance. Please composite the sources into a single video plane through NvBufferComposite(), so that you can create a single renderer to render that plane; see the sketch below.
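
A minimal sketch of the idea, assuming two sources composited side by side into a single 1920x1080 NV12 plane. The buffer size, dec0_fd/dec1_fd, and renderer are illustrative assumptions; the flag and struct fields are as I recall them from nvbuf_utils.h on JetPack 4.x, so please verify against the 13_multi_camera sample:

#include <nvbuf_utils.h>
#include "NvEglRenderer.h"

// Destination buffer that both streams are composited into (assumed 1080p NV12).
int composited_fd = -1;
NvBufferCreateParams create_params = {0};
create_params.width = 1920;
create_params.height = 1080;
create_params.layout = NvBufferLayout_Pitch;
create_params.colorFormat = NvBufferColorFormat_NV12;
create_params.payloadType = NvBufferPayload_SurfArray;
create_params.nvbuf_tag = NvBufferTag_VIDEO_CONVERT;
NvBufferCreateEx(&composited_fd, &create_params);

// Place the two sources side by side in the destination plane.
NvBufferCompositeParams composite_params = {0};
composite_params.composite_flag = NVBUFFER_COMPOSITE;
composite_params.input_buf_count = 2;
for (uint32_t i = 0; i < composite_params.input_buf_count; ++i)
{
    composite_params.dst_comp_rect[i].top = 270;
    composite_params.dst_comp_rect[i].left = i * 960;
    composite_params.dst_comp_rect[i].width = 960;
    composite_params.dst_comp_rect[i].height = 540;
}

// dec0_fd and dec1_fd are the dmabuf fds of the latest decoded frames (assumed).
int src_fds[2] = { dec0_fd, dec1_fd };
NvBufferComposite(src_fds, composited_fd, &composite_params);
renderer->render(composited_fd);   // single NvEglRenderer for the whole plane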

@DaneLLL
Hello!
How do I use NvBufferComposite()?
Is there any sample?

Hi,
There is code calling NvBufferComposite() in the 13_multi_camera sample. Please take a look.

I see that 14_multivideo_decode creates several renderers and several threads (one thread per renderer). Does that give the same performance as NvBufferComposite()?

Hi,
By default, rendering is not supported in 14_multivideo_decode. If you run it with --help, you will see this NOTE:

        NOTE: Currently multivideo_decode to be only run with --disable-rendering Mandatory

For rendering frames from multiple sources, we would suggest using NvBufferComposite() to composite the sources into a single video plane. We have a similar implementation, called nvmultistreamtiler, in the DeepStream SDK. We would suggest doing the same with jetson_multimedia_api; a sketch of such a tiled layout follows.
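
As an illustration of that nvmultistreamtiler-style layout, the destination rectangles for N sources can be computed as a grid. The function below is my own sketch, not code from the samples:

#include <cmath>
#include <nvbuf_utils.h>

// Fill dst_comp_rect[] with an nvmultistreamtiler-style grid:
// the smallest square grid that fits num_streams tiles.
static void setup_tile_rects(NvBufferCompositeParams &params,
                             uint32_t num_streams,
                             uint32_t dst_w, uint32_t dst_h)
{
    if (num_streams == 0)
        return;
    uint32_t cols = (uint32_t)std::ceil(std::sqrt((double)num_streams));
    uint32_t rows = (num_streams + cols - 1) / cols;
    uint32_t tile_w = dst_w / cols;
    uint32_t tile_h = dst_h / rows;

    params.input_buf_count = num_streams;
    for (uint32_t i = 0; i < num_streams; ++i)
    {
        params.dst_comp_rect[i].left   = (i % cols) * tile_w;
        params.dst_comp_rect[i].top    = (i / cols) * tile_h;
        params.dst_comp_rect[i].width  = tile_w;
        params.dst_comp_rect[i].height = tile_h;
    }
}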

I see that
NvBufferComposite(m_dmabufs, m_compositedFrame, &m_compositeParam);
g_renderer->render(m_compositedFrame);

If I receive several RTSP streams and decode them in different threads, the fds in m_dmabufs come from different threads. That means I should have a dedicated render thread. Do I need a mutex to lock m_dmabufs, or can I just do it like this:
while (1)
{
    NvBufferComposite(m_dmabufs, m_compositedFrame, &m_compositeParam);
    g_renderer->render(m_compositedFrame);
}


Hi,
For this use case, you would need multiple decoding threads and one rendering thread. You would need a queue to hold the NvBuffers, and a mutex to protect NvBuffer reads/writes; see the sketch below.
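
A minimal sketch of that structure, using one slot per stream with std::mutex and std::condition_variable (all names here are illustrative):

#include <condition_variable>
#include <mutex>
#include <vector>

struct FrameExchange
{
    std::mutex lock;
    std::condition_variable frame_ready;
    std::vector<int> latest_fd;   // one dmabuf fd slot per stream, -1 = empty
    bool updated = false;

    explicit FrameExchange(size_t streams) : latest_fd(streams, -1) {}

    // Called from a decoder thread whenever a frame is decoded.
    void push(size_t stream, int fd)
    {
        std::lock_guard<std::mutex> guard(lock);
        latest_fd[stream] = fd;
        updated = true;
        frame_ready.notify_one();
    }

    // Called from the render thread; copies the slots out under the lock.
    std::vector<int> wait_snapshot()
    {
        std::unique_lock<std::mutex> guard(lock);
        frame_ready.wait(guard, [this] { return updated; });
        updated = false;
        return latest_fd;
    }
};

The render thread would then loop: call wait_snapshot(), run NvBufferComposite() over the valid fds, and call g_renderer->render(). A real implementation also has to make sure a decoder does not requeue a dmabuf while it is still being composited, for example by handing fds back to the decoder only after NvBufferComposite() returns.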

haha! I see.
NvBufferComposite(m_dmabufs, m_compositedFrame, &m_compositeParam);
How many input buffers can I pass in m_dmabufs?

Hi,
The limit is defined in nvbuf_utils.h as:

/**
 * Defines the maximum number of input video frames that can be used for composition.
 */
#define MAX_COMPOSITE_FRAME 16
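
So a caller-side guard (illustrative sketch) is simply to reject or clamp anything above that limit:

#include <nvbuf_utils.h>
#include <cassert>

// input_buf_count passed to NvBufferComposite() must never exceed the limit.
assert(num_streams <= MAX_COMPOSITE_FRAME);
composite_params.input_buf_count = num_streams;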

Can I change it to 32 or more?

Hi,
It is a hardware constraint. Please check this topic:
Can I change the MAX_COMPOSITE_FRAME - #3 by DaneLLL

I see.
As far as I know, does it depend on GPU performance? Is it lower on the Nano and higher on the NX?

Hi,
The task is done on the hardware converter (VIC), not the GPU. To achieve maximum performance, please refer to this post:
Nvvideoconvert issue, nvvideoconvert in DS4 is better than Ds5? - #3 by DaneLLL

This command shows available frequencies:

$ cat /sys/devices/13e10000.host1x/15340000.vic/devfreq/15340000.vic/available_frequencies

The maximum frequency differs across Jetson platforms. You can run the command to get the maximum frequency of your platform.
