Multiple inputs to a TensorRT engine from a single input stream


I have a PyTorch deep learning model that predicts a certain activity from an input video. I have converted the model from PyTorch to ONNX and then to a TensorRT engine file.

To predict that activity, I have a working PyTorch pipeline that feeds the model three inputs with shapes (1,3,12,640,640), (10,3,640,640) and (10,3,640,640).

The first input is a set of 12 consecutive frames, each of shape 3x640x640, extracted from a single video stream. The other two inputs are the first 10 frames and the last 10 frames of the first input. This pre-processing is repeated iteratively to feed the three inputs to the model.

My objective is to replicate the same PyTorch pipeline in DeepStream using the TensorRT ‘.engine’ file.

Could you please shed some light on this? What can we use from DeepStream, GStreamer, or anything else to extract sets of consecutive frames from a single input video stream and process them so they can be fed to the model in a DeepStream pipeline?

Let me know if you need any other details or clarification from my end.

Any help or suggestions would be really appreciated.

Thank you,


I just want to clarify a few things first.

Is your first input dimension (12,3,640,640)?
Or is it (1,3,12,640,640), as you list above, which has five axes?

Also, do the last 10 images refer to the frames right before the end of the stream?
Or to the last few frames in a pre-defined period?


Hi @AastaLLL

Thanks for the reply.

The model expects this input dimension: (1,3,12,640,640)

However, we take 12 consecutive images of shape 3x640x640 and preprocess them (PyTorch permute and PyTorch unsqueeze) into (1,3,12,640,640), which is the shape the model expects.

The last 10 images are not the frames right before the end of the stream, although they will be in the last iteration. We extract them from this input only: (12,3,640,640)

Basically, we take the last frames from the first input itself. A sliding window of 12 images moves forward with a defined step, and all three inputs are extracted from this window until the stream ends.
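For illustration, the sliding window described here could be sketched in plain Python as below. This is only a sketch under assumptions: the function name `sliding_windows`, its parameters, and the default step of 1 are hypothetical (the actual step is not stated in this thread), and `frames` stands for any iterable of decoded frames.

```python
from collections import deque


def sliding_windows(frames, size=12, step=1):
    """Yield lists of `size` consecutive frames, advancing by `step` frames.

    `frames` is any iterable of decoded frames (hypothetical input).
    `size` and `step` are assumptions for illustration only.
    """
    if size < 1 or step < 1:
        raise ValueError("size and step must be positive")

    buf = deque(maxlen=size)  # keeps only the most recent `size` frames
    pending = 0               # frames seen since the last emitted window

    for frame in frames:
        buf.append(frame)
        if len(buf) < size:
            continue          # not enough frames yet for a full window
        if pending == 0:
            yield list(buf)   # copy so later appends do not alter it
        pending = (pending + 1) % step


if __name__ == "__main__":
    # Integers stand in for frames to make the windowing visible.
    for window in sliding_windows(range(7), size=3, step=2):
        print(window)
```

A trailing partial window at the end of the stream is dropped in this sketch; whether that matches the original pipeline is not specified here.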

Hence, from this shape, (12,3,640,640), we extract the three inputs of the model: (1,3,12,640,640), (10,3,640,640) and (10,3,640,640).
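As a rough sketch of that split, the three inputs can be derived from one window as follows. NumPy is used here in place of PyTorch (`np.transpose` for permute, `np.expand_dims` for unsqueeze). The function name `split_window` is hypothetical, and the axis order (1, 0, 2, 3) is an assumption: it is the permutation that maps (12,3,H,W) to (3,12,H,W), but the exact PyTorch call is not shown in this thread.

```python
import numpy as np


def split_window(window, size=12, sub=10):
    """Build the three model inputs from one window of frames.

    window: array of shape (size, 3, H, W), frames in temporal order.
    Returns (clip, first_frames, last_frames) with shapes
    (1, 3, size, H, W), (sub, 3, H, W) and (sub, 3, H, W).
    """
    window = np.asarray(window)
    if window.ndim != 4 or window.shape[0] != size:
        raise ValueError(f"expected shape ({size}, C, H, W), got {window.shape}")

    # (size, C, H, W) -> (C, size, H, W): assumed equivalent of the permute.
    channels_first = np.transpose(window, (1, 0, 2, 3))
    # (C, size, H, W) -> (1, C, size, H, W): equivalent of unsqueeze(0).
    clip = np.expand_dims(channels_first, axis=0)

    first_frames = window[:sub]   # first 10 frames of the window
    last_frames = window[-sub:]   # last 10 frames of the window
    return clip, first_frames, last_frames


if __name__ == "__main__":
    dummy = np.zeros((12, 3, 640, 640), dtype=np.float32)
    for part in split_window(dummy):
        print(part.shape)
```

With a window size of 12 and sub-inputs of 10 frames, the second and third inputs overlap in 8 frames (frames 2 to 9 of the window).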

I hope I have answered your questions!

Let me know if anything else needs clarification.


Hi @AastaLLL

Is there any update or suggestion from your end?



Thanks for your clarification.
Could you also share the following information about your setup first?

• Hardware Platform (Jetson / GPU)
• DeepStream Version
• JetPack Version (valid for Jetson only)
• TensorRT Version
• NVIDIA GPU Driver Version (valid for GPU only)
• Issue Type (questions, new requirements, bugs)
• How to reproduce the issue? (This is for bugs. Include which sample app is used, the configuration file contents, the command line used, and other details for reproducing.)
• Requirement details (This is for new requirements. Include the module name, i.e. for which plugin or sample application, and the function description.)



Please find the setup information below, as requested.

• Hardware Platform: GPU (Quadro RTX 5000)
• DeepStream Version: 5.1
• TensorRT Version: 7.2.3-1+cuda11.1
• NVIDIA GPU Driver Version: 460.91.03
• Triton+DeepStream docker image: 5.1-21.02-triton
• Issue Type: questions
• Requirement details: An approach to feed multiple inputs to a TensorRT engine from a single input stream. Details are given in this thread.

Thanks for the reply. Looking forward to your suggestions.



Thanks for your feedback.

You will need temporal batching, which is not currently supported.
However, the nvdsinfer component is open source, so you can add the support directly:




@AastaLLL Thank you for the help!

Hi @AastaLLL

Can you please confirm whether temporal batching is supported for nvinferserver/Triton Inference Server?

If not, are the plugins open source, like nvdsinfer, so that the support can be added?



Temporal batching is not available for nvinferserver either.
And nvinferserver is not open source.


Hey @AastaLLL

Can we use this sample app for the above requirement?

Is my understanding correct that temporal batching is now supported in the nvinfer plugin in DeepStream 6.0?

Can we use the preprocess plugin provided in this sample app to prepare the inputs and feed them to the model for nvinfer-based execution?

I would really appreciate your suggestions for the same.



Sorry for the late update.

Yes. Temporal batching is supported from DeepStream 6.0.
You can try it with the nvdspreprocess component:

