Best approach for dynamic text overlay


My goal is to create an application that reads data from a USB H.264 camera, decodes it, adds some text/image overlays, and then streams it over RTMP. I already have proof-of-concepts working well for the first and last parts, but I’m having difficulties choosing the best option for the middle part. The processing involves adding text/image overlays based on dynamically computed data, mainly sensor data read over serial/USB.

  1. From what I understand, in order to get the most out of Jetson’s architecture, the overlaying should be done using CUDA, thus saving CPU time on copying those frames across. Is that correct?

  2. I saw some mentions of using the nvivafilter plugin and found the sample source files for ‘nvsample_cudaprocess’. Is there any guide on what must happen in the pre_process() and post_process() functions? Also, what API can I use in the main gpu_process() function in order to add my overlays?

PS: I’m using a Jetson Nano Dev Kit setup.

Thank you!

A possible solution is to use nvivafilter plugin. You may refer to the patches:
Tx2-4g r32.3.1 nvivafilter performance - #16 by DaneLLL
Unable to overlay text when using udpsrc - #13 by DaneLLL
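For orientation, the nvsample_cudaprocess flow is roughly: the plugin loads your shared library, calls an init hook once to register callbacks, then invokes pre_process(), gpu_process() (where the frame is accessible for CUDA or via a CPU pointer), and post_process() for every frame. The sketch below imitates only that callback pattern in plain Python; all class and function names are hypothetical stand-ins, not the real C/CUDA signatures (those are in the nvsample_cudaprocess sources shipped with JetPack):

```python
# Conceptual sketch of the per-frame callback flow used by nvivafilter's
# custom library. No Jetson APIs involved; names are illustrative only.

class Frame:
    def __init__(self, width, height):
        self.width, self.height = width, height
        self.pixels = bytearray(width * height * 4)  # RGBA, CPU-visible

class OverlayFilter:
    def __init__(self):
        self.overlay_text = "sensor: --"   # updated dynamically at runtime

    def pre_process(self, frame):
        # Per-frame CPU-side setup (e.g. refresh cached sensor readings).
        pass

    def gpu_process(self, frame):
        # In the real filter this is where you map the frame and run a
        # CUDA kernel or draw via a CPU pointer; here we just mark a pixel.
        frame.pixels[0] = 0xFF
        return f"overlaying '{self.overlay_text}' on {frame.width}x{frame.height}"

    def post_process(self, frame):
        # Per-frame cleanup / synchronization.
        pass

def run_one_frame(filt, frame):
    """Simulates the plugin driving the three hooks for a single frame."""
    filt.pre_process(frame)
    msg = filt.gpu_process(frame)
    filt.post_process(frame)
    return msg

filt = OverlayFilter()
filt.overlay_text = "temp: 23.5 C"   # e.g. data read over serial/USB
frame = Frame(4, 4)
print(run_one_frame(filt, frame))    # → overlaying 'temp: 23.5 C' on 4x4
```

The point of the pre/post hooks is simply per-frame setup and teardown around the main processing step; the overlay content itself can be swapped at any time before the next frame is processed.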

Or you can access the buffer in the nvvidconv plugin through the NvBuffer APIs. The source code is in
You can add the text overlay in the nvvidconv plugin, then rebuild and replace the plugin.

Am I correct in thinking that, if I modify the nvvidconv plugin I won’t be able to share data dynamically from the main application as I could do with the shared nv cuda process library?


You can add it as a property and construct the pipeline like:

… ! nvvidconv ! tee name=t ! queue ! RTMP_streaming t. ! queue ! nvvidconv enable_text_overlay=1 ! …

Thank you! Is there an example / guide to recompiling a GStreamer plugin like nvvidconv? I assume I won’t be able to use the original nvvidconv plugin for its initial functions afterwards?

Secondly, as I don’t have any extensive experience with either of the two solutions, am I right with this pro/con list?

  1. The nvivafilter can be faster as it relies on CUDA processing, but only in very specific scenarios. The downside is the extra steps required for compiling a CUDA shared library. Debugging is only possible through generic “printf()” statements.

  2. A GStreamer plugin can also only be debugged through printf() statements. Other than that, the only small advantage that I can see is that it’s probably a bit easier to set up the development tools around it.

All source code is in

Please download it and follow READMEs to build/replace the plugins.

Using nvivafilter plugin, you will get the CPU/GPU pointer and can access the frame data directly. Using nvvidconv plugin, you would need to check NvBuffer APIs and do some customization. Either way is fine and depends on your use-case.
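On the earlier question about sharing dynamically computed data with the filter: since the custom library runs inside the GStreamer process, one common pattern is for the main application to publish sensor values through POSIX shared memory (or a socket/file) and for the per-frame callback to read the latest value. A minimal stand-in sketch using Python’s stdlib shared memory, where in C you would use shm_open()/mmap(); the segment name and format are hypothetical:

```python
import struct
from multiprocessing import shared_memory

SHM_NAME = "sensor_overlay_demo"   # hypothetical segment name
FMT = "<d"                         # one little-endian double, e.g. a temperature

def publish_reading(value):
    """Main-application side: create the segment (once) and write a reading."""
    size = struct.calcsize(FMT)
    try:
        shm = shared_memory.SharedMemory(name=SHM_NAME, create=True, size=size)
    except FileExistsError:
        shm = shared_memory.SharedMemory(name=SHM_NAME)
    shm.buf[:size] = struct.pack(FMT, value)
    shm.close()

def read_reading():
    """Filter-library side: called per frame, reads the latest value."""
    size = struct.calcsize(FMT)
    shm = shared_memory.SharedMemory(name=SHM_NAME)
    (value,) = struct.unpack(FMT, bytes(shm.buf[:size]))
    shm.close()
    return value

publish_reading(23.5)
print(f"overlay text: temp: {read_reading():.1f} C")

# Remove the segment when the application shuts down.
shared_memory.SharedMemory(name=SHM_NAME).unlink()
```

With this pattern the nvvidconv-property approach and the nvivafilter approach both get access to live data: the plugin side only ever reads the most recent value, so no locking is strictly required for a single small value, though a sequence counter or mutex helps for multi-field records.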

Thank you, that part is now clear!

Is there any guide or can you provide at least a hint of what APIs I can use to overlay text/images on the frame for each option?

Secondly, did I understand correctly about the debugging process? That it’s only done through printf() statements and you can’t step through the code? Would it be possible to have a GStreamer application, use appsink/appsrc to short-circuit the pipeline, and then use the NvBuffer APIs to retrieve the frames, modify them, and put them back in the pipeline?

You can get the CPU pointer of the NvBuffer and use Cairo APIs. You may refer to the patch:
Tx2-4g r32.3.1 nvivafilter performance - #16 by DaneLLL

And yes, you can set, for example, export GST_DEBUG=*:4 to get debug prints for further debugging.
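To make the Cairo route concrete: after mapping the NvBuffer you have an ordinary CPU pointer to the pixels, which you can wrap with cairo_image_surface_create_for_data() and draw on with the usual Cairo text calls. The sketch below imitates only the buffer-access pattern in plain Python, with a filled rectangle standing in for the text Cairo would render; everything here is illustrative, not a Jetson or Cairo API:

```python
# Stand-in for drawing on a mapped frame: treat the frame as a flat RGBA
# byte buffer (as you would after mapping the NvBuffer and wrapping it in
# a Cairo surface) and "render" an opaque white box where text would go.

def draw_box(pixels, frame_w, x, y, w, h, rgba=(255, 255, 255, 255)):
    """Fill a w*h rectangle at (x, y) in an RGBA byte buffer."""
    for row in range(y, y + h):
        for col in range(x, x + w):
            off = (row * frame_w + col) * 4
            pixels[off:off + 4] = bytes(rgba)

frame_w, frame_h = 16, 8
pixels = bytearray(frame_w * frame_h * 4)   # all black, fully transparent

# Where Cairo would run its text-drawing calls, we stamp a placeholder box.
draw_box(pixels, frame_w, x=2, y=2, w=6, h=3)

# Pixel (2, 2) is now opaque white; pixel (0, 0) is untouched.
print(pixels[(2 * frame_w + 2) * 4], pixels[0])   # → 255 0
```

The same access pattern applies whether the frames come from nvivafilter, a modified nvvidconv, or an appsink/appsrc loop: once you hold a CPU pointer to the pixel data, any CPU-side drawing library can write into it before the buffer continues downstream.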

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.