Accelerated GStreamer text overlay

pettair · May 8, 2021, 8:38pm

Hi,

Can you recommend a solution for displaying/burning in a dynamic text overlay on video using GStreamer?

We have a very simple pipeline:

nvarguscamerasrc sensor-id=<DCL_SENSOR_ID> sensor-mode=0 gainrange="1 16" ispdigitalgainrange="1 1"
! video/x-raw(memory:NVMM), width=(int)1920, height=(int)1080, format=(string)NV12, framerate=(fraction)30/1
! nvvidconv ! textoverlay name=text_overlay ! video/x-raw,format=I420
! nvvidconv ! nvv4l2vp8enc bitrate=8000000 control-rate=1 ! rtpvp8pay mtu=1400 ! udpsink auto-multicast=true clients=<DCL_UDP_SINK_CLIENTS>

where we change the text 5 times per second. This consumes ~50% of a CPU core on a Nano, whereas without the text overlay usage is below 20%.

Do you have any idea how we could make it more efficient / add less overhead?

Thanks!

Bests,
Peter

DaneLLL · May 10, 2021, 2:40am

Hi,
You may try this:
Tx2-4g r32.3.1 nvivafilter performance - #16 by DaneLLL

pettair · May 10, 2021, 7:56pm

Looks very promising! Thanks!

I'm new to CUDA. Will nvivafilter execute the linked code the same way as a simple standalone CUDA kernel, or are there limitations I should take into account? Based on the nvsample_cudaprocess.cu example, it seems so.

Also, can you point me to documentation or an example showing how to pass data between the CPU and GPU? Our use case is a simple C++ app that receives events and updates the video overlay based on them. Is there any GStreamer-style asynchronous, event-based communication, or should I use cudaMallocManaged to allocate variables in shared memory?

Thanks!

Bests,
Peter
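
For the host side of the event-to-overlay handoff asked about above, here is a minimal sketch in plain C++. It assumes the application builds the pipeline in its own process, so the event thread and the per-frame code loaded by nvivafilter can share ordinary host memory. `OverlayState` and its methods are hypothetical names, not part of GStreamer or nvivafilter, and how the shared library gets a reference to the object is left open.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical helper (not a GStreamer or nvivafilter API): holds the latest
// overlay text plus a version counter, so the per-frame code can tell cheaply
// whether anything changed since the previous frame.
class OverlayState {
public:
    // Called from the application's event thread, e.g. 5 times per second.
    void set_text(std::string text) {
        std::lock_guard<std::mutex> lock(mutex_);
        text_ = std::move(text);
        ++version_;
    }

    // Called once per frame. Copies the text into `out` only when it changed
    // since `seen_version`; returns true when `out` was updated.
    bool fetch_if_changed(std::uint64_t& seen_version, std::string& out) const {
        std::lock_guard<std::mutex> lock(mutex_);
        assert(seen_version <= version_);  // a reader can never be ahead
        if (seen_version == version_) return false;
        out = text_;
        seen_version = version_;
        return true;
    }

private:
    mutable std::mutex mutex_;
    std::string text_;
    std::uint64_t version_ = 0;
};
```

The per-frame function would keep its own `seen_version` and re-upload the text to the GPU (for example with cudaMemcpy, or by writing into a cudaMallocManaged buffer) only when `fetch_if_changed` returns true, so a 30 fps stream with 5 updates per second does the copy on roughly one frame in six.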

pettair · May 19, 2021, 7:44pm

Hi,

Using this approach, we have now implemented a much more performant solution. Thank you for the suggestion!

But as I'm pretty new to GPU programming, the solution is a naive one and I'm confident there is a lot of room for further improvement.

What I'm not sure about is how to profile or debug the nvivafilter code. I looked around, and there seem to be very nice tools for this for standalone CUDA applications; nvprof and cuda-memcheck come up most often. Can I use these with GStreamer, and if so, how? Or is there any alternative for gaining performance insights into nvivafilter?

Thanks!

Honey_Patouceul · May 19, 2021, 8:17pm

I think that native profiling on Jetson is on stand-by for now.

You may use host-side tools such as Nsight to get GPU stats. Someone else may be better placed to advise.

pettair · May 19, 2021, 8:53pm

I see :( That's a bummer; it looked like a very useful tool.

But my question would still apply for these Nsight tools as well. Can these tools be used to profile CUDA code when it is referenced from a GStreamer pipeline as a shared library?
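
Independently of whether a profiler can attach, a coarse first measurement is to time the per-frame call from inside the shared library itself. The sketch below is plain C++ and only an illustration: `FrameTimer` is a hypothetical helper, and where it would be called from in the nvivafilter library is an assumption. It is not a replacement for Nsight.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical helper: accumulates the wall-clock time spent in a per-frame
// callable and reports the mean per call.
class FrameTimer {
public:
    // Runs `fn` once and adds its duration to the running total.
    // Note: CUDA kernel launches are asynchronous, so `fn` only includes GPU
    // time if it synchronizes (e.g. cudaDeviceSynchronize()) before returning.
    template <typename Fn>
    void measure(Fn&& fn) {
        const auto start = std::chrono::steady_clock::now();
        fn();
        const auto stop = std::chrono::steady_clock::now();
        total_ms_ +=
            std::chrono::duration<double, std::milli>(stop - start).count();
        ++frames_;
    }

    std::size_t frames() const { return frames_; }

    // Mean duration per measured call, in milliseconds.
    double mean_ms() const {
        assert(frames_ > 0);
        return total_ms_ / static_cast<double>(frames_);
    }

private:
    double total_ms_ = 0.0;
    std::size_t frames_ = 0;
};
```

Wrapping the body of the per-frame function in `timer.measure([&] { ... });` and printing `mean_ms()` every few hundred frames gives a rough number to compare before and after a change. Synchronizing inside the timed region can itself slow the pipeline, so it is best kept to debug builds.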

Honey_Patouceul · May 19, 2021, 9:26pm

You may create a new topic for this as it gets far from original title ;-)

For anyone getting here, here are some modified files from public_sources/nvsample_cuda_process:
nvsample_cudaprocess.cu (6.5 KB)
Makefile (5.0 KB)

This may still need some cleanup to avoid a fault when closing…

gst-launch-1.0 nvarguscamerasrc num-buffers=300 ! nvivafilter customer-lib-name="libnvsample_cudaprocess.so" post-process=true ! 'video/x-raw(memory:NVMM),format=RGBA' ! nvvidconv ! nvoverlaysink

pettair · May 24, 2021, 2:57pm

Hi!

Sorry for the late response!

Thank you for the code examples! In the meantime I created a solution for the text overlay based on this:
https://github.com/dusty-nv/jetson-video/blob/master/cuda/cudaFont.cu

Not as flexible as the one you mentioned, but I guess it will cut it for now.

So my last remaining question is how I can profile these applications.

But you are right, this is drifting away from the current topic :D so I created a new one:

Thanks for the help!
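
For readers new to the glyph-based overlay mentioned above, the general idea is to blend a pre-rasterized glyph mask into the frame. The sketch below is a CPU reference in plain C++, not the code from the linked cudaFont.cu: `blend_glyph`, `Rgb`, and the tightly packed RGBA8 layout are assumptions made for illustration. In a CUDA version, the body of the two loops would typically become one thread per glyph pixel.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Rgb { std::uint8_t r, g, b; };

// Alpha-blends one glyph, given as an 8-bit coverage mask (0 = transparent,
// 255 = opaque), into a tightly packed RGBA8 frame with its top-left corner
// at (x0, y0). Pixels falling outside the frame are skipped. The frame's
// alpha channel is left unchanged.
void blend_glyph(std::vector<std::uint8_t>& frame, int frame_w, int frame_h,
                 const std::vector<std::uint8_t>& glyph, int glyph_w,
                 int glyph_h, int x0, int y0, Rgb color) {
    assert(frame.size() == static_cast<std::size_t>(frame_w) * frame_h * 4);
    assert(glyph.size() == static_cast<std::size_t>(glyph_w) * glyph_h);

    for (int gy = 0; gy < glyph_h; ++gy) {
        const int y = y0 + gy;
        if (y < 0 || y >= frame_h) continue;
        for (int gx = 0; gx < glyph_w; ++gx) {
            const int x = x0 + gx;
            if (x < 0 || x >= frame_w) continue;

            const unsigned a = glyph[static_cast<std::size_t>(gy) * glyph_w + gx];
            std::uint8_t* px =
                &frame[(static_cast<std::size_t>(y) * frame_w + x) * 4];

            // Rounded integer blend: out = (color * a + dst * (255 - a)) / 255
            px[0] = static_cast<std::uint8_t>((color.r * a + px[0] * (255 - a) + 127) / 255);
            px[1] = static_cast<std::uint8_t>((color.g * a + px[1] * (255 - a) + 127) / 255);
            px[2] = static_cast<std::uint8_t>((color.b * a + px[2] * (255 - a) + 127) / 255);
        }
    }
}
```

Drawing a string then amounts to calling this once per character with an advancing `x0`. Because the glyph masks are rasterized once up front, changing the text several times per second only changes which masks are blended, not how much work is done per frame.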

pixilottkiss · October 12, 2021, 1:41pm

Hi Honey_Patouceul
When I compile your nvsample_cudaprocess.cu code with make, customer_functions.h is not found.
Can you help me to find this header file?
fatal error: customer_functions.h: No such file or directory

Ty.
