Looking for a simple deepstream pipeline (python) which does both super-resolution and detection of objects

I’m a beginner learning to build a DeepStream pipeline for video input files (e.g., “sample_720p.h264” under samples).
First of all, I am not sure where to start, but let me explain what I have done so far with my current dev environment.

For practice, I just want to extend “deepstream-test1.py” under “/opt/nvidia/deepstream/deepstream-6.1/sources/deepstream_python_apps/apps/” of DeepStream 6.1 (or 6.2; the exact version doesn’t matter).
→ This example already uses resnet10.caffemodel for object detection. I ran test1 successfully and learned how detection works with the resnet10 model. The only problem is that I need to know how to combine this with another model that does “super resolution” on the video frames.

For the super-resolution part, I have a PyTorch model (e.g., “SRCNN”), and I converted it into an .onnx model for the DeepStream pipeline.
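In case it helps others, an export along these lines works for a model like this (a minimal sketch; the module path, checkpoint name, input shape, and opset below are placeholders, not exact values):

import torch
from srcnn import SRCNN  # hypothetical module containing the SRCNN definition

# Load the trained weights (checkpoint path is illustrative)
model = SRCNN()
model.load_state_dict(torch.load("srcnn_x2.pth", map_location="cpu"))
model.eval()

# Dummy input in NCHW layout; SRCNN variants take 1 (Y channel) or 3 channels,
# so match this to your checkpoint
dummy = torch.randn(1, 3, 720, 1280)

torch.onnx.export(
    model,
    dummy,
    "srcnn_x2.onnx",
    input_names=["input"],
    output_names=["output"],
    opset_version=13,
    dynamic_axes={"input": {0: "batch"}, "output": {0: "batch"}},
)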

Then I’m stuck at this point, not sure what to do next.
It seems like this “super-resolution” model can be used with nvinfer or some other element in the pipeline.
Any reference documentation or examples will be appreciated. Thanks!

My environment setting is:
Jetson AGX Orin
JetPack 5.0.2 (L4T R35.1.0)
DeepStream 6.1
Python 3.8

Do you want to integrate an “upsampling images” model into DeepStream? If so, the model will involve video scaling in the pipeline. The original gst-nvinfer is an “in-place” transforming plugin; it cannot do video scaling. So you need to customize the functions yourself.

A customized gst-nvdsvideotemplate plugin may be used to customize the inferencing based on TensorRT. Gst-nvdsvideotemplate — DeepStream 6.2 Release documentation

Can you tell us which model you are using for “super resolution”?

Hi Fiona, I have the same question: how can I apply a super-resolution model in a DeepStream pipeline so that I get upscaled video in the pipeline? I have already converted my SR model to TensorRT.

Hello Fiona,
Thanks a lot for references!
I’m using the IDN model (GitHub - yjn870/IDN-pytorch: PyTorch Implementation of Fast and Accurate Single Image Super-Resolution via Information Distillation Network (CVPR 2018)) for “super resolution” and want to integrate this model into a DeepStream pipeline together with Yolo (already built into the pipeline).
Hopefully I can get a bit more detail about how to integrate this model into the pipeline; e.g., which .cpp files I should create, which parts of the Python code I should modify, and how to set these up in the configuration file.
Any comments or references will be appreciated.

BTW, I have learned that “super resolution” is the same thing as “upsampling images” in DeepStream terminology.

@jshin10129

For your model, it seems it is a PyTorch model; you’d better generate an ONNX model from the PyTorch model if you want to run it on our GPU. You can get all model-related parameters from the repo you posted, or just consult the author of the repo.

For the customization, please make sure you are already familiar with GStreamer and TensorRT - Get Started | NVIDIA Developer before you start with DeepStream.

The C/C++ parts you should implement:

  • You need to implement the TensorRT inferencing with your model by yourself; you can refer to the inferencing implementation in the nvinfer library: /opt/nvidia/deepstream/deepstream/sources/libs/nvdsinfer
  • You need to wrap the TensorRT inferencing module with the nvdsvideotemplate customlib_impl interfaces. Gst-nvdsvideotemplate — DeepStream 6.1.1 Release documentation

The Python part:
You can build your own pipeline or use any pipeline in the samples: NVIDIA-AI-IOT/deepstream_python_apps: DeepStream SDK Python bindings and sample applications (github.com)
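As a rough, untested sketch (the custom library name, config path, and element placement below are only illustrative), such a Python pipeline could look like this:

import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)

# The custom library name and the nvinfer config path are placeholders.
pipeline = Gst.parse_launch(
    "filesrc location=./sample_720p.mp4 ! qtdemux ! h264parse ! nvv4l2decoder ! "
    "m.sink_0 nvstreammux name=m batch-size=1 width=1280 height=720 ! "
    "nvdsvideotemplate customlib-name=./libcustom_impl.so ! "
    "nvinfer name=detect_object config-file-path=./my_custom_detection_configuration.txt ! "
    "nvvideoconvert ! nvdsosd ! nvegltransform ! nveglglessink"
)

pipeline.set_state(Gst.State.PLAYING)
bus = pipeline.get_bus()
# Block until end-of-stream or an error is posted on the bus
bus.timed_pop_filtered(Gst.CLOCK_TIME_NONE,
                       Gst.MessageType.EOS | Gst.MessageType.ERROR)
pipeline.set_state(Gst.State.NULL)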

@Fiona.Chen
Thank you so much! This shows me where I should focus for the implementation.

Since I’ve just begun working with DeepStream 6.1, I need to clarify a few more things…

I’ve found NVIDIA’s super-resolution benchmark model in GitHub - NVIDIA-AI-IOT/jetson_benchmarks: Jetson Benchmark – e.g., “super_resolution_bsd500-bs1.onnx” – and I converted it into an engine, “super_resolution_bsd500-bs1.engine”, on the Jetson AGX Orin.
(The training code is GitHub - dusty-nv/super-resolution: PyTorch super resolution model with RGB support and ONNX exporter).
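For anyone else trying this, a conversion along these lines should produce the engine (a sketch using the TensorRT 8.x Python API; trtexec does the same job, and the workspace size below is just an example):

import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

with open("super_resolution_bsd500-bs1.onnx", "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise SystemExit("Failed to parse the ONNX file")

config = builder.create_builder_config()
# 1 GiB workspace; adjust for your board
config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 1 << 30)

engine_bytes = builder.build_serialized_network(network, config)
with open("super_resolution_bsd500-bs1.engine", "wb") as f:
    f.write(engine_bytes)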

Can I just use this engine for a simple implementation, for testing purposes? Is there a way to test this model in the pipeline without configuring model-related parameters? Since NVIDIA already benchmarks this model, it seems like I should be able to test it in my pipeline with default parameters somehow.

Another question is about the pipeline. I am unclear about how to construct a pipeline, following your suggestion, that runs multiple models for inference and then rebuilds the frames. For example, currently I’m using the following pipeline:

"gst-launch-1.0 filesrc location=./sample_720p.mp4 ! qtdemux ! h264parse ! nvv4l2decoder ! m.sink_0 nvstreammux name=m batch-size=1 width=1280 height=720 ! nvinfer name=detect_object config-file-path=./my_custom_detection_configuration.txt ! nvvideoconvert ! dsexample full-frame=0 ! nvdsosd ! nvegltransform ! nveglglessink"

→ In this pipeline, the nvinfer custom model detects objects for now.

Let’s say that I implemented two custom TensorRT inferencing modules as you guided.
How do I test this in the pipeline?
For example, should I use a pipeline like the following?

"gst-launch-1.0 filesrc location=./sample_720p.mp4 ! qtdemux ! h264parse ! nvv4l2decoder ! m.sink_0 nvstreammux name=m batch-size=1 width=1280 height=720 !
nvinfer name=super_resolution config-file-path=./my_custom_super_resolution_configuration.txt !
nvdsvideotemplate customlib-name="libcustom_impl.so" !
nvinfer name=detect_object config-file-path=./my_custom_detection_configuration.txt ! nvvideoconvert ! dsexample full-frame=0 ! nvdsosd ! nvegltransform ! nveglglessink"

Will this pipeline work, or is there a better way to include multiple plugins in the pipeline? I also need to know how to debug the implemented C/C++ parts in my pipeline, so please let me know how to do that if possible.

The last question is: how do I know which parameters are required for the super-resolution model? You said I can refer to the repo to get these parameters, but I need to know which parameters are required by nvdsinfer or nvdsvideotemplate. Are these parameters used only by nvdsinfer?

Any comments or feedback will be appreciated!
Thanks a lot.

Do you mean using the engine file with DeepStream? If the engine is generated on the same GPU and in the same environment (the same CUDA and TensorRT versions), it can work with DeepStream.

No. How to use the model depends on the model itself but not DeepStream.

It depends on the relationship between the two models.
We have the concept of primary GIE(PGIE) and secondary GIE(SGIE). PGIE inferences on the whole frame while SGIE inferences on the objects.
There is a PGIE + SGIE sample in GitHub - NVIDIA-AI-IOT/deepstream_lpr_app: Sample app code for LPR deployment on DeepStream.
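As an illustration only (the config file names below are made up), a PGIE + SGIE chain is just two nvinfer elements in series; the SGIE config file sets process-mode=2 and operate-on-gie-id to match the PGIE’s gie-unique-id so it runs on the detected objects:

import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)

# Config file names below are illustrative, not sample files shipped with DeepStream.
pipeline = Gst.parse_launch(
    "filesrc location=./sample_720p.mp4 ! qtdemux ! h264parse ! nvv4l2decoder ! "
    "m.sink_0 nvstreammux name=m batch-size=1 width=1280 height=720 ! "
    "nvinfer name=pgie config-file-path=./pgie_detector_config.txt ! "
    "nvinfer name=sgie config-file-path=./sgie_classifier_config.txt ! "
    "nvvideoconvert ! nvdsosd ! nvegltransform ! nveglglessink"
)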

We don’t know since we don’t know the relationship between the models.

For the customization, please make sure you are already familiar with GStreamer and TensorRT - Get Started | NVIDIA Developer before you start with DeepStream.

And please read the DeepStream document Welcome to the DeepStream Documentation — DeepStream 6.3 Release documentation; all the things you want to know are already covered in the document.

The parameters needed by gst-nvinfer are described in Gst-nvinfer — DeepStream 6.3 Release documentation

nvdsvideotemplate is just a template plugin to encapsulate your customized inferencing functions; its parameters are designed and defined by you.
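For example (only a sketch: customlib-name and customlib-props are the plugin’s properties, but the key/value names passed through customlib-props are entirely defined by your custom library):

import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)

sr = Gst.ElementFactory.make("nvdsvideotemplate", "super_resolution")
sr.set_property("customlib-name", "./libcustom_impl.so")
# "engine-file" is an illustrative key; your custom library defines and parses it
sr.set_property("customlib-props", "engine-file:./super_resolution_bsd500-bs1.engine")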

The C/C++ plugins are normal C/C++ libraries and they are also GStreamer plugins, so the GStreamer debugging tools work on these plugins, and C/C++ debugging tools work as well. Basic tutorial 11: Debugging tools (gstreamer.freedesktop.org)
Debug a C or C++ Program on Linux Using gdb (maketecheasier.com)
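For example, from a Python app you can raise the GStreamer log level before Gst is initialized (the category name below is only an example):

import os
# "3" is the default level for everything; "nvdsvideotemplate:5" enables DEBUG for that category
os.environ["GST_DEBUG"] = "3,nvdsvideotemplate:5"
# Optionally write the logs to a file instead of stderr
os.environ["GST_DEBUG_FILE"] = "/tmp/gst_debug.log"

import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst
Gst.init(None)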

@Fiona.Chen
Hello Fiona,
This is great! Thanks a lot. I’ll read all the references you provided and move forward.

BTW, in the tutorial (Gst-nvinfer — DeepStream 6.2 Release documentation), there is a list of the network types Gst-nvinfer currently works on:

  • Multi-class object detection
  • Multi-label classification
  • Segmentation (semantic)
  • Instance Segmentation

Actually, “super-resolution” restores and reconstructs a noisy, low-quality image into a high-quality, high-resolution image. Therefore, this “super-resolution” network doesn’t belong to any of the 4 network types mentioned above (which Gst-nvinfer currently works on).

As you pointed out, there are two plugins involved in using a “super resolution” model in the DeepStream pipeline: nvinfer and nvdsvideotemplate.
And “nvdsvideotemplate” is nothing but an encapsulation of inference functions, as you said.
So,
Question #1: is it going to work if I implement it with the “nvinfer” plugin?
Question #2: should I implement only the “nvdsvideotemplate” plugin to call the “super-resolution” model, get the result, and upscale the pixels of the frames?
Question #3: regardless of the 4 network types the tutorial mentions above, should I still implement both the “nvinfer” and “nvdsvideotemplate” plugins?

Thanks for any comments and feedback!

@Fiona.Chen

Regarding the parameters needed by gst-nvinfer when I use a “super-resolution” model:

I see that I can get the model parameters like “model paths”, “input resolutions”, or “output resolutions”.

However, I’m a bit confused by some other parameters in the configuration file.

For example, the Yolo model (object detection) in
“/opt/nvidia/deepstream/deepstream-6.1/sources/deepstream_python_apps/apps/deepstream-test1” has a configuration file.
And there are multiple parameter groups in that configuration file: [property] and [class-attrs-all].
It seems like the [property] parameters are required by DeepStream, and the [class-attrs-all] parameters are required by the Yolo model.

Looking at the configuration example below, how do I know that “maintain-aspect-ratio” is required by DeepStream for the model?
I mean, are there default parameters I should collect from the model repository for the DeepStream pipeline?
If possible, could you briefly explain how these parameters are used in the plugins, with a simple example?
Thanks a lot!

Example:
[property]
gpu-id=0
num-detected-classes=10
network-type=0
cluster-mode=2
maintain-aspect-ratio=1

[class-attrs-all]
nms-iou-threshold=0.3
threshold=0.7

No. gst-nvinfer is an “in-place” transforming plugin; the output is the same buffer as the input. You cannot do “super-resolution” scaling with such a plugin.

Yes. Currently this is the only way you can customize the scaling function in DeepStream.

There is a fifth type of network, which we define as “others”. The “super-resolution” model is an “others” model. We have some “others” model samples in deepstream_tao_apps/apps/tao_others at master · NVIDIA-AI-IOT/deepstream_tao_apps · GitHub. Please refer to the table in NVIDIA-AI-IOT/deepstream_tao_apps: Sample apps to demonstrate how to deploy models trained with TAO on DeepStream (github.com).

For the customization, please make sure you are already familiar with GStreamer and TensorRT - Get Started | NVIDIA Developer before you start with DeepStream.

And please read the DeepStream document Welcome to the DeepStream Documentation — DeepStream 6.2 Release documentation; all the things you want to know are already covered in the document and sample code.

@Fiona.Chen
It is great to know that I only need to implement nvdsvideotemplate to load the “super-resolution” model and upscale the frames.
Thanks a lot for the references and all the detailed guidance.

@Fiona.Chen
All right, I still need to become familiar with GStreamer and TensorRT - Get Started | NVIDIA Developer before I go further with DeepStream.
Appreciated!
