We are using GStreamer together with NVIDIA's DeepStream ‘nvinfer’ GstElement for inference. The idea is to have one video stream and be able to instantaneously and dynamically select which combination of inference engines is applied to it. For instance, we might want no inference running at all, or both a person-detection and a car-detection model/engine running at once. Note that the inference elements are not configured to modify the video in any way - we only need the metadata they produce.
The key requirement is that switching happens almost instantly. We therefore hope to keep a static GStreamer pipeline layout and simply enable/disable certain elements at runtime as needed. When an inference element is disabled, it should not consume any resources.
So far we have attempted to use a ‘tee’ element branching off our video source, with a separate inference engine on each branch. A ‘valve’ element in front of each inference element lets us select at runtime which inference engines are in use.
Source ----- [Tee] ---------------------------- [Aggregator?] ----- Output
               |                                      |
               |----- [valve] ----- [inference1] -----|
               |                                      |
               |----- [valve] ----- [inference2] -----|
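In gst-launch syntax, the layout we are experimenting with looks roughly like the sketch below. This is only an illustration of the topology, not a working command: the source URI, the `person.txt`/`car.txt` nvinfer config files, and especially the `???` aggregator element are placeholders - the aggregator is exactly the part we have not been able to find.

```
gst-launch-1.0 \
  uridecodebin uri=... ! tee name=t \
  t. ! queue ! agg. \
  t. ! queue ! valve name=v1 drop=false ! nvinfer config-file-path=person.txt ! agg. \
  t. ! queue ! valve name=v2 drop=false ! nvinfer config-file-path=car.txt ! agg. \
  ??? name=agg ! fakesink
```

At runtime we would flip each valve's `drop` property to true/false to disable/enable that branch's inference engine.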
However, we are struggling to get the metadata produced by each inference branch back onto the main branch. The standard aggregator elements do not seem to work for this, as they wait for video frames on all of their input branches before aggregating - and a disabled branch (whose valve is dropping buffers) never delivers any.
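To make the desired behaviour concrete, here is a toy, plain-Python model of the merge we are after (no GStreamer involved): each enabled branch stashes its detections keyed by the frame's presentation timestamp, and the main branch collects whatever is available for that PTS. All names here (`MetadataMerger`, `on_branch_buffer`, `on_main_buffer`) are hypothetical stand-ins for pad-probe callbacks; the sketch deliberately ignores the real synchronisation problem of the main-branch buffer arriving before the inference branches have finished.

```python
from collections import defaultdict

class MetadataMerger:
    """Toy model: enabled inference branches deposit detections keyed
    by frame PTS; the main branch later merges whatever was stored."""

    def __init__(self, enabled_branches):
        self.enabled = set(enabled_branches)   # branches whose valves are open
        self.pending = defaultdict(dict)       # pts -> {branch: detections}

    def on_branch_buffer(self, branch, pts, detections):
        # Would be a pad probe on an inference branch: stash its metadata.
        if branch in self.enabled:
            self.pending[pts][branch] = detections

    def on_main_buffer(self, pts):
        # Would be a pad probe on the main branch: merge everything
        # stored for this PTS, from however many branches are enabled.
        merged = self.pending.pop(pts, {})
        return [d for dets in merged.values() for d in dets]

merger = MetadataMerger(enabled_branches={"person", "car"})
merger.on_branch_buffer("person", pts=1000, detections=["person@(10,20)"])
merger.on_branch_buffer("car", pts=1000, detections=["car@(50,60)"])
print(merger.on_main_buffer(1000))  # both branches' detections for frame 1000
```

Note that a disabled branch simply contributes nothing for a given PTS, and the merge still succeeds - which is the behaviour we cannot get out of the stock aggregators.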
So our question is: what is the best way to achieve this? Is our ‘parallel’ approach viable, and if so, how do we merge the metadata from the different branches back together? If not, what is an alternative way to achieve the same goal?