Hi
I used nvcompositor and nvoverlaysink to composite 4 UDP streams (1920x1080, 30 fps) onto a 4K display. CPU usage was 230% across 4 cores, and the frame rate dropped to 5 fps. I checked CPU usage with the top command.
gst-launch-1.0 nvcompositor name=comp \
sink_0::xpos=0 sink_0::ypos=0 sink_0::width=1920 sink_0::height=1080 \
sink_1::xpos=1920 sink_1::ypos=0 sink_1::width=1920 sink_1::height=1080 \
sink_2::xpos=0 sink_2::ypos=1080 sink_2::width=1920 sink_2::height=1080 \
sink_3::xpos=1920 sink_3::ypos=1080 sink_3::width=1920 sink_3::height=1080 ! nvoverlaysink \
udpsrc multicast-iface="eth1" multicast-group=224.1.1.5 "port=40000" ! "application/x-rtp, media=video, encoding-name=H264" ! rtph264depay ! queue ! h264parse ! omxh264dec enable-low-outbuffer=1 ! comp.sink_0 \
udpsrc multicast-iface="eth1" multicast-group=224.1.1.5 "port=40010" ! "application/x-rtp, media=video, encoding-name=H264" ! rtph264depay ! queue ! h264parse ! omxh264dec enable-low-outbuffer=1 ! comp.sink_1 \
udpsrc multicast-iface="eth1" multicast-group=224.1.1.5 "port=40000" ! "application/x-rtp, media=video, encoding-name=H264" ! rtph264depay ! queue ! h264parse ! omxh264dec enable-low-outbuffer=1 ! comp.sink_2 \
udpsrc multicast-iface="eth1" multicast-group=224.1.1.5 "port=40010" ! "application/x-rtp, media=video, encoding-name=H264" ! rtph264depay ! queue ! h264parse ! omxh264dec enable-low-outbuffer=1 ! comp.sink_3 -e
This was considerably higher than rendering 4 X11 displays with nv3dsink, which used about 160% CPU.
gst-launch-1.0 udpsrc multicast-group=224.1.1.5 "port=40010" ! "application/x-rtp, media=video, encoding-name=H264" ! rtph264depay ! queue ! h264parse ! nvv4l2decoder ! nv3dsink -e
Hi,
It is expected and optimized, since nvcompositor utilizes hardware acceleration (the VIC engine), as we commented at
https://devtalk.nvidia.com/default/topic/1055591/jetson-nano/two-nvoverlaysink-problem-on-jetson-nano/post/5351525/#5351525
If nv3dsink works in your use case, you may use nv3dsink.
Hi DaneLLL
Thank you for your reply.
Why is the CPU percentage so high? Is there something wrong with my settings?
I need nvcompositor to set the picture positions.
I don't think I can set the window position if I use nv3dsink.
Testing nv3dsink was just to compare its CPU percentage with nvcompositor's.
Hi,
We can reveal that nvcompositor is implemented on top of NvBufferComposite() (defined in nvbuf_utils.h). However, the source code is not open to the public.
We will support configuring w, h, x, y on nv3dsink in the next r32 release.
Hi
Thank you for your support.
When is the next r32 release due?
I tested nvcompositor with an mp4 file, similar to the nvcompositor example. The result was the same: nvcompositor causes high CPU usage.
① Using nvcompositor with only one source, CPU usage was 100%:
gst-launch-1.0 nvcompositor name=comp \
sink_3::xpos=1920 sink_3::ypos=1080 sink_3::width=1920 sink_3::height=816 ! nvoverlaysink \
filesrc location=/home/vsdc/TheBourneUltimatumTrailer.mp4 ! qtdemux name=demux0 \
! h264parse ! omxh264dec ! comp.sink_3 -e
② Using nvoverlaysink alone, CPU usage was 15%:
gst-launch-1.0 filesrc location=/home/vsdc/TheBourneUltimatumTrailer.mp4 ! qtdemux name=demux0 \
! h264parse ! omxh264dec ! nvoverlaysink overlay-x=0 overlay-y=0 overlay-w=1920 overlay-h=816 overlay=2
cat /etc/nv_tegra_release
R32 (release), REVISION: 1.0, GCID: 14531094, BOARD: t210ref, EABI: aarch64, DATE: Wed Mar 13 07:46:13 UTC 2019
Hi,
We have checked and found that the CPU usage comes from the GStreamer framework. The nvcompositor plugin is based on GstVideoAggregator:
https://gstreamer.freedesktop.org/documentation/video/gstvideoaggregator.html?gi-language=c
GstVideoAggregator takes some CPU. If you want to eliminate it, we suggest you try tegra_multimedia_api: you can call NvBufferComposite() to achieve the same function as the nvcompositor plugin.
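For reference, a rough sketch of what an NvBufferComposite() call might look like for this 2x2 use case. This is only my reading of nvbuf_utils.h, not code from this thread: the struct field names (composite_flag, input_buf_count, dst_comp_rect, src_comp_rect) and the NVBUFFER_COMPOSITE constant should be verified against the header in your release.

```c
#include <string.h>
#include "nvbuf_utils.h"   /* tegra_multimedia_api header */

/* Sketch: composite four 1080p dmabufs into one 4K dmabuf as a 2x2 grid.
 * Field names are assumptions taken from nvbuf_utils.h; check your header. */
static int composite_2x2(int src_fds[4], int dst_fd)
{
    NvBufferCompositeParams params;
    memset(&params, 0, sizeof(params));

    params.composite_flag = NVBUFFER_COMPOSITE;
    params.input_buf_count = 4;

    for (int i = 0; i < 4; i++) {
        /* Destination quadrant for stream i. */
        params.dst_comp_rect[i].left   = (i % 2) * 1920;
        params.dst_comp_rect[i].top    = (i / 2) * 1080;
        params.dst_comp_rect[i].width  = 1920;
        params.dst_comp_rect[i].height = 1080;

        /* Use the full source frame. */
        params.src_comp_rect[i].left   = 0;
        params.src_comp_rect[i].top    = 0;
        params.src_comp_rect[i].width  = 1920;
        params.src_comp_rect[i].height = 1080;
    }

    return NvBufferComposite(src_fds, dst_fd, &params);
}
```

This only runs on a Jetson with the multimedia API installed, so treat it as a starting point to check against the nvbuf_utils.h documentation rather than a tested implementation.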
Hi DaneLLL
I really appreciate your checking.
I am very interested in using NvBufferComposite().
Is there any documentation or sample code on using tegra_multimedia_api to create a GStreamer plugin?
Hi
Thank you for your advice. I tried the following flow and composited 8 streams, but found that sometimes one stream's image is a bit fuzzy.
Using GStreamer to decode 8 H.264 streams
↓
Getting 8 dmabuf_fds from appsink
↓
Using NvBufferComposite() to build 1 composited frame
↓
Rendering the composited frame
Is it a data synchronization problem?
I have the following questions:
- How do I synchronize between the dmabuf_fds and the composited frame, to avoid reading from a buffer that is still being written?
- How many FIFOs (banks) are there in the dmabuf_fds and the composited frame when I use omxh264dec, nvvidconv, and NvBufferComposite()?
- How do I increase the buffer size (frame FIFO) of the dmabuf_fds and the composited frame?
Hi,
For synchronization, you may utilize the pts and dts in GstBuffer. Before executing the composite, please call gst_buffer_ref() to keep the buffers; after the composite is done, call gst_buffer_unref() to return them.
Please also set the property below on nvvidconv to test with more working buffers:
output-buffers : number of output buffers
flags: readable, writable, changeable in NULL, READY, PAUSED or PLAYING state
Unsigned Integer. Range: 1 - 4294967295 Default: 4
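For example, an illustrative pipeline fragment only (the source address, port, and buffer count are placeholders from earlier in this thread, to be adapted to your setup):

```shell
# Hypothetical example: raise nvvidconv's working buffers from the
# default 4 to 10 before handing NVMM frames to appsink.
gst-launch-1.0 udpsrc multicast-iface="eth1" multicast-group=224.1.1.5 port=40000 ! \
  "application/x-rtp, media=video, encoding-name=H264" ! rtph264depay ! queue ! \
  h264parse ! omxh264dec ! nvvidconv output-buffers=10 ! \
  "video/x-raw(memory:NVMM)" ! appsink
```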
Hi
Thank you for your reply.
Does GstBuffer work with the hardware buffer?
How do I extract the dmabuf_fd from a GstBuffer, the way ExtractFdFromNvBuffer() does from an NvBuffer?
/**
* This method must be used to extract dmabuf_fd of the hardware buffer.
* @param[in] nvbuf Specifies the `hw_buffer`.
* @param[out] dmabuf_fd Returns DMABUF FD of `hw_buffer`.
*
* @returns 0 for success, -1 for failure.
*/
int ExtractFdFromNvBuffer (void *nvbuf, int *dmabuf_fd);
Hi,
You can get it from GstSample:
buffer = gst_sample_get_buffer (sample);
g_print("PTS= %lu\n", buffer->pts);
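Building on that, the usual pattern in the tegra_multimedia_api samples, as I understand it, is to map the NVMM GstBuffer and pass the mapped data pointer to ExtractFdFromNvBuffer(). This is an assumption to verify against your release, not code confirmed in this thread:

```c
#include <gst/gst.h>
#include "nvbuf_utils.h"   /* tegra_multimedia_api header */

/* Returns the DMABUF fd backing an NVMM GstBuffer, or -1 on failure. */
static int dmabuf_fd_from_gst_buffer(GstBuffer *buffer)
{
    GstMapInfo map;
    int dmabuf_fd = -1;

    if (!gst_buffer_map(buffer, &map, GST_MAP_READ))
        return -1;

    /* For NVMM memory, map.data points at the hardware buffer handle,
     * which is what ExtractFdFromNvBuffer() expects. */
    if (ExtractFdFromNvBuffer((void *)map.data, &dmabuf_fd) != 0)
        dmabuf_fd = -1;

    gst_buffer_unmap(buffer, &map);
    return dmabuf_fd;
}
```

Note that this only applies when the pipeline keeps frames in NVMM memory (e.g. with a "video/x-raw(memory:NVMM)" caps filter before appsink); for plain system memory there is no dmabuf_fd to extract.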