How to crop and resize different input sources separately?


I’m looking to create a multistream pipeline that takes multiple RTSP streams as input.
For each camera, I want to crop a specific region and then resize all the crops to the same size so that they fit my model.
I’m unsure how to implement this preprocessing step, preferably using the Python API.

I looked at the existing Python apps, specifically test1 and imagedata_multistream.

For a single input source, I’m cropping / resizing like this:

    # Sink
    nvvidconv2 = make_element('nvvideoconvert', 2)
    # Left:Top:Width:Height
    CROP = "75:670:2425:384"
    nvvidconv2.set_property('src-crop', CROP)
    # Add
    print("Adding elements to pipeline...")

It seems to visually crop as expected, but I’m not sure if it’s behaving as desired under the hood, or if there’s a better way to do it.

For the multistream case, could I use the same method? Where should I insert the nvvideoconvert?

    for i in range(number_sources):
        os.mkdir(folder_name + "/stream_" + str(i))
        frame_count["stream_" + str(i)] = 0
        saved_count["stream_" + str(i)] = 0
        print("Creating source_bin ", i, " \n ")
        uri_name = args[i + 1]
        if uri_name.find("rtsp://") == 0:
            is_live = True
        source_bin = create_source_bin(i, uri_name)
        if not source_bin:
            sys.stderr.write("Unable to create source bin \n")
        padname = "sink_%u" % i
        sinkpad = streammux.get_request_pad(padname)
        if not sinkpad:
            sys.stderr.write("Unable to create sink pad bin \n")
        srcpad = source_bin.get_static_pad("src")
        if not srcpad:
            sys.stderr.write("Unable to create src pad bin \n")


Set up the following pipeline.

    uridecodebin | --> nvvideoconvert --> |
                 |                        |
    uridecodebin | --> nvvideoconvert --> | --> nvstreammux --> ......
                 |                        |
    uridecodebin | --> nvvideoconvert --> |
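As a rough illustration of the layout above, here is a minimal sketch of inserting one nvvideoconvert per source between the source bin and nvstreammux. It assumes the `make_element(factory_name, index)` helper from the question (its definition is not shown in this thread, so it is passed in as a parameter), that `source_bin` exposes a ghost pad named "src" as in the sample apps, and that `source_bin` and `streammux` have already been added to the pipeline. `crop_string` and `insert_crop_converter` are hypothetical names used only for this example.

```python
def crop_string(left, top, width, height):
    """Format a crop rectangle as left:top:width:height, as used for src-crop above."""
    if left < 0 or top < 0:
        raise ValueError("left and top must be non-negative")
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return f"{left}:{top}:{width}:{height}"


def insert_crop_converter(pipeline, source_bin, streammux, index, crop, make_element):
    """Place a cropping nvvideoconvert between source_bin and streammux.

    Assumes source_bin and streammux are already in the pipeline, and that
    make_element(factory_name, index) returns a new, unparented element.
    """
    conv = make_element("nvvideoconvert", index)
    conv.set_property("src-crop", crop_string(*crop))
    pipeline.add(conv)

    # Link the source bin's ghost pad ("src") to the converter's sink pad.
    if not source_bin.link_pads("src", conv, "sink"):
        raise RuntimeError(f"could not link source {index} to nvvideoconvert")

    # Link the converter to streammux; "sink_<index>" is a request pad on the mux.
    if not conv.link_pads("src", streammux, f"sink_{index}"):
        raise RuntimeError(f"could not link nvvideoconvert {index} to streammux")
    return conv
```

In the source loop from the question, this would replace the manual `get_request_pad` / `srcpad.link(sinkpad)` steps, with one crop tuple per camera. Note that nvstreammux may still scale each input to its own configured width and height, so the final size is set there rather than by the crop.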

First, I need to know what you are trying to do. For the nvinfer plugin, there is no need to use nvvideoconvert for scaling. I’m not sure whether you need ROIs; if you do, you can refer to this.

Please provide complete information as applicable to your setup.
• Hardware Platform (Jetson / GPU)
• DeepStream Version
• JetPack Version (valid for Jetson only)
• TensorRT Version
• NVIDIA GPU Driver Version (valid for GPU only)
• Issue Type( questions, new requirements, bugs)
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing)
• Requirement details( This is for new requirement. Including the module name-for which plugin or for which sample application, the function description)
• The pipeline being used

Thank you, I will try this out!

The model I will be using is trained and optimized for specific content. In our training pipeline, we crop the images so that only the ROI is processed by the network.

The cameras are installed in different locations, so each one needs to be cropped differently.
Finally, the model (built in TF2 and converted to ONNX) expects a certain input size, so we resize/rescale the ROI to this expected H/W.

I am unclear about how DeepStream deals with ROIs and resizing. Could you clarify this for me?

Am I correct in assuming that ROI, in this DeepStream/Jetson context, refers to the actual cropping/resizing mechanism?


Please provide complete information as applicable to your setup.
• Hardware Platform (Jetson / GPU): Jetson Xavier AGX
• DeepStream Version: 6.3 (*edit)
• JetPack Version (valid for Jetson only): JetPack 5.1.2 (*edit)
• Issue Type( questions, new requirements, bugs): Questions, maybe new requirements
• Requirement details( This is for new requirement. Including the module name-for which plugin or for which sample application, the function description): Crop and resize input sources independently to a common final size (H/W)

During inference, only the ROI set in the preprocess is detected; this is not cropping or resizing.
You can read the above document for details.

By the way, DS-6.2 only runs on JetPack 5.1 GA.


Yes, totally, it’s DS-6.3; that was a typo on my part. As for JetPack, I’m not 100% sure: I ran $ cat /etc/nv_tegra_release and got # R35 Revision 4.1, which I suppose is JetPack 5.1.2?
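For anyone doing the same check, here is a small sketch that reads the L4T version out of the first line of /etc/nv_tegra_release and looks up the matching JetPack release. It assumes the line has the usual "# R35 (release), REVISION: 4.1, ..." shape; the lookup table only covers the L4T 35.x releases relevant to this thread, and `l4t_version` is a hypothetical helper name.

```python
import re

# L4T release -> JetPack version, limited to the releases relevant to this thread.
L4T_TO_JETPACK = {
    "35.2.1": "5.1",
    "35.3.1": "5.1.1",
    "35.4.1": "5.1.2",
}


def l4t_version(release_line):
    """Return the L4T version (e.g. '35.4.1') parsed from an nv_tegra_release line, or None."""
    match = re.search(r"R(\d+)\s*\(release\),\s*REVISION:\s*(\d+(?:\.\d+)*)", release_line)
    if match is None:
        return None
    return f"{match.group(1)}.{match.group(2)}"


# Abbreviated example line; the real file carries more fields after REVISION.
line = "# R35 (release), REVISION: 4.1"
version = l4t_version(line)
print(version, "->", L4T_TO_JETPACK.get(version, "unknown"))
```

A version missing from the table simply prints "unknown" rather than guessing.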

I will correct my initial entry!

I will read it again and come back to you. Can I use this plugin with the Python API only, or will I need to write custom C++/CUDA code myself?


    sudo apt-cache show nvidia-jetpack | grep "Version"

Yes, you can use preprocess in Python like other plugins.
You can refer to the deepstream-preprocess-test example.

Hi again,

>Version: 5.1.2-b104

So I had time to try the ROI options with the sample video. As I understand it, we supply the ROI and only what’s inside it will be processed, right?
Can you share some insight into how the model receives those ROIs?
Is the ROI cropped, resized, and sent to the model, or is the whole image processed and detections outside the ROI simply ignored?

More specifically, for my use case, let’s say I record images at 680x420 resolution.
I preprocess those images by cropping only the left part, resulting in a cropped image of resolution 200x420.
I train my network on a detection / anomaly detection task on this cropped part using TensorFlow or PyTorch.
I then convert my model to ONNX for the Jetson/DeepStream deployment.

For the live-inference part, can I use the full 680x420 resolution image and just give DeepStream a preprocess (nvdspreprocess) ROI of roi-params-src-0=0;0;200;420?

Would this work with the aforementioned model? Am I missing something?

Further, if I add, let’s say, a second camera with roi-params-src-1=0;0;400;200, will the network still accept this ROI as input?

I hope this helps clarify my question a little.

I’m not yet in the process of integrating my custom model, so I can’t test it right away, but it would be helpful to know whether I need to change something already.

Thank you very much!


I guess your model input size is 200x420? This has nothing to do with ROI; it can be set through the infer-dims property of nvinfer, without additional scaling.
nvinfer will handle it all.
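To make the resizing question concrete, here is a sketch of the arithmetic involved when an ROI is scaled to a fixed network input. This is generic fit-and-pad math, not DeepStream's actual implementation; whether the aspect ratio is preserved (with padding) or the ROI is stretched depends on configuration such as the maintain-aspect-ratio setting. `fit_roi` is a hypothetical helper, and the 200x420 network input is the size guessed above.

```python
def fit_roi(roi_w, roi_h, net_w, net_h, keep_aspect=True):
    """Return (scaled_w, scaled_h, pad_w, pad_h) for fitting an ROI into the network input.

    With keep_aspect=True the ROI is scaled uniformly and the remainder is padding;
    with keep_aspect=False the ROI is stretched to fill the input exactly.
    """
    if min(roi_w, roi_h, net_w, net_h) <= 0:
        raise ValueError("all dimensions must be positive")
    if keep_aspect:
        scale = min(net_w / roi_w, net_h / roi_h)
        scaled_w = round(roi_w * scale)
        scaled_h = round(roi_h * scale)
    else:
        scaled_w, scaled_h = net_w, net_h
    return scaled_w, scaled_h, net_w - scaled_w, net_h - scaled_h


# ROIs from the question, fitted into an assumed 200x420 network input.
for roi in [(200, 420), (400, 200)]:
    print(roi, "->", fit_roi(*roi, 200, 420))
```

The first ROI already matches the input size, so nothing changes. The second one is either shrunk and padded or stretched, so the network sees content at a different scale than it was trained on, even though the tensor shape is accepted.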

gst-nvinfer is open source; its documentation and code are available for reference.

Yes, you are right.

If you need to specify ROIs for different sources, you can add group-* entries in the preprocess configuration file.

gst-nvdspreprocess is open source.
There are configuration files in the directory for reference.
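As a sketch of what such group entries could look like, the helper below renders a [group-N] section from per-source ROIs. The key names (src-ids, process-on-roi, roi-params-src-N) are written as I recall them from the sample preprocess config and should be checked against the reference files; a real group needs further keys from that sample (for example the custom transformation function), which are left out here. `roi_group_section` is a hypothetical name.

```python
def roi_group_section(group_index, rois_by_source):
    """Render one [group-N] section of a preprocess config.

    rois_by_source maps a source id to a list of (left, top, width, height) ROIs.
    Only the ROI-related keys are produced; other required keys are omitted.
    """
    src_ids = sorted(rois_by_source)
    lines = [
        f"[group-{group_index}]",
        "src-ids=" + ";".join(str(src) for src in src_ids),
        "process-on-roi=1",
    ]
    for src in src_ids:
        # Multiple ROIs for one source are flattened into a single ;-separated list.
        flat = [str(value) for roi in rois_by_source[src] for value in roi]
        lines.append(f"roi-params-src-{src}=" + ";".join(flat))
    return "\n".join(lines)


# The two ROIs discussed in this thread, one per camera.
print(roi_group_section(0, {0: [(0, 0, 200, 420)], 1: [(0, 0, 400, 200)]}))
```

Generating the section from a dictionary keeps the per-camera ROIs in one place in the Python app, which helps when cameras are added or moved.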

Could you maybe show me how to connect the nvvideoconvert to the uridecodebin?
Do I need to link it to the ghost pad?

I tried looking at the rtsp-in-out example and at imagedata_multistream for support, but I’m unsure what I’m doing wrong.

Thank you!

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks

The ghost pad is necessary, so your approach is completely correct.
