Known way to tile images for segmentation of large images


We are trying to segment very large images on a Jetson Xavier NX using a deep neural network. When we run inference on a whole image, the network needs more memory than the Jetson has. We could upgrade to a Jetson with more memory, but that wouldn’t be enough either.
Currently, we are cropping/tiling the images with a sliding-window approach in Python.
I have seen this approach used quite frequently in practice.
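
To make the sliding-window step concrete, here is a minimal sketch of how the crop boxes can be computed. The function names (`tile_origins`, `tile_boxes`) and the tile/stride values are only illustrative assumptions, not our actual code, and it assumes `0 < stride <= tile` so that neighbouring windows touch or overlap:

```python
def tile_origins(length, tile, stride):
    """Start offsets of sliding windows covering [0, length) along one axis."""
    if tile >= length:
        # The image is smaller than one tile along this axis: a single window.
        return [0]
    origins = list(range(0, length - tile + 1, stride))
    # Add a final window flush with the border if the stride did not land on it.
    if origins[-1] != length - tile:
        origins.append(length - tile)
    return origins


def tile_boxes(height, width, tile, stride):
    """All (y, x, y_end, x_end) crop boxes for a height x width image."""
    return [
        (y, x, min(y + tile, height), min(x + tile, width))
        for y in tile_origins(height, tile, stride)
        for x in tile_origins(width, tile, stride)
    ]


# Illustrative sizes only: a 1000 x 1500 image, 512-pixel tiles, 384-pixel stride.
boxes = tile_boxes(height=1000, width=1500, tile=512, stride=384)
print(len(boxes), boxes[0], boxes[-1])
```

Each box can then be used to slice the image (for example `image[y:y_end, x:x_end]`) before passing the crop to the network.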

If possible, I’d like to do that more efficiently (i.e. not in Python with NumPy).
Is there a known DeepStream module or other NVIDIA SDK function that does this?

For the lack of a better example, I quickly found this paper which also uses this process:

As a bonus, I’d also like to combine the results into a single image, averaging the confidence values in the overlapping regions.
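
For reference, the merging I have in mind looks roughly like the NumPy sketch below. It assumes each tile yields a 2-D confidence map of the same size as its crop box; `merge_tiles` and the toy values are hypothetical and only there to show the averaging:

```python
import numpy as np


def merge_tiles(tiles, boxes, height, width):
    """Average per-pixel confidences of overlapping tiles into one map.

    tiles: list of 2-D float arrays, one per box (assumed network output)
    boxes: list of (y, x, y_end, x_end) crop boxes in full-image coordinates
    """
    acc = np.zeros((height, width), dtype=np.float32)    # sum of confidences
    count = np.zeros((height, width), dtype=np.float32)  # tiles covering each pixel
    for tile, (y, x, y_end, x_end) in zip(tiles, boxes):
        acc[y:y_end, x:x_end] += tile
        count[y:y_end, x:x_end] += 1.0
    # Pixels not covered by any tile stay at 0 instead of dividing by zero.
    return np.divide(acc, count, out=np.zeros_like(acc), where=count > 0)


# Toy example: two 2 x 3 tiles that overlap in the two middle columns.
boxes = [(0, 0, 2, 3), (0, 1, 2, 4)]
tiles = [
    np.full((2, 3), 0.2, dtype=np.float32),
    np.full((2, 3), 0.6, dtype=np.float32),
]
merged = merge_tiles(tiles, boxes, height=2, width=4)
print(merged)
```

In the overlapping columns the result is the mean of the two tile values; elsewhere it is the value of the single covering tile.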

• Hardware Platform (Jetson / GPU): Jetson Xavier NX
• DeepStream Version: 6.0.0
• JetPack Version (valid for Jetson only): 4.6.1
• TensorRT Version: 8.0.1
• Issue Type: Question

I’d be thankful for any hints about existing or similar functionality.

The nvvideoconvert plugin supports cropping video, and the nvmultistreamtiler plugin supports tiling multiple videos into one video.