Parallel execution of branches

pawel.kaczmarczyk · November 24, 2020, 9:26am

• Hardware Platform: Jetson
• DeepStream Version: 5.0
• JetPack Version (valid for Jetson only): 4.4
• TensorRT Version: 7.0
• Issue Type: Questions

Hi,

I’m currently developing an IVA application for the Jetson.
I want to use DeepStream to fully utilize the underlying hardware. The application will consist of a few CV tasks, some of which are independent of each other.

I want to split the pipeline into branches that can be processed independently, but in the end I want the results to be assigned to the corresponding frame.
Not every frame must be inferred: if the pipeline is overloaded, older frames may be dropped.
I’d like to draw the inference results and expose them on an output RTSP stream.
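
A minimal sketch of the "drop older frames when overloaded" behaviour described above, written in plain Python rather than against the DeepStream API. The class name `LeakyFrameQueue` and its methods are hypothetical and only illustrate the idea. In a GStreamer pipeline the closest built-in equivalent is probably the `queue` element with `leaky=downstream` and a small `max-size-buffers`, though how that interacts with batching in a given DeepStream pipeline would need to be verified.

```python
from collections import deque


class LeakyFrameQueue:
    """Bounded FIFO that discards the oldest frame when it is full.

    Hypothetical helper for illustration only; not part of DeepStream.
    """

    def __init__(self, max_frames):
        self._frames = deque(maxlen=max_frames)
        self.dropped = 0  # number of frames discarded so far

    def push(self, frame):
        # A full deque with maxlen silently evicts its oldest entry on append,
        # so count the drop before appending.
        if len(self._frames) == self._frames.maxlen:
            self.dropped += 1
        self._frames.append(frame)

    def pop(self):
        # Return the oldest surviving frame, or None when the queue is empty.
        return self._frames.popleft() if self._frames else None


# The producer pushes 5 frames before the consumer reads anything.
q = LeakyFrameQueue(max_frames=3)
for frame_num in range(5):
    q.push(frame_num)
remaining = [q.pop() for _ in range(3)]
```

Here the two oldest frames are discarded and the consumer only sees the three most recent ones.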

The main flow:

[RTSP] ----> [Detector] ---> [Tracker] ---> [First classifier]
        |                               +-> [Additional processing(this will push downstream different buffer)] ---> [Pose estimation] ---> [Classifier]
        |                                                                                                        +-> [Cascade detector] ---> [Classifier]
        +-->[Scene classifier]

* Each component will produce metadata, basically in a custom format (every single one will operate in place)
** Each component will work with batches that come from multiple camera streams

I was looking at the nvinfer component and I’ve seen that it has an option to run a classifier in asynchronous mode. I’m wondering if it would fit this use case: with async inference, how can I ensure that inference has completed for every frame that goes into the output pipeline?
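
One way to think about the requirement above is a join step that releases a frame only after every branch has reported a result for it. The sketch below is plain Python, not a DeepStream component. `FrameResultJoiner` and the branch names are hypothetical, and keying on `(source_id, frame_num)` is an assumption modelled on the fields that DeepStream frame metadata carries.

```python
from collections import defaultdict


class FrameResultJoiner:
    """Collects per-branch results and releases a frame once all are present.

    Hypothetical helper for illustration only; not a DeepStream API.
    """

    def __init__(self, expected_branches):
        self._expected = frozenset(expected_branches)
        self._pending = defaultdict(dict)  # (source_id, frame_num) -> {branch: result}

    def add(self, source_id, frame_num, branch, result):
        key = (source_id, frame_num)
        self._pending[key][branch] = result
        # Release the frame only when every expected branch has reported.
        if self._expected.issubset(self._pending[key]):
            return self._pending.pop(key)
        return None


joiner = FrameResultJoiner({"scene", "person"})
first = joiner.add(0, 7, "scene", "indoor")      # frame 7 still waits for "person"
second = joiner.add(0, 8, "person", "standing")  # frame 8 still waits for "scene"
third = joiner.add(0, 7, "person", "walking")    # frame 7 is now complete
```

A real implementation would also need a policy for frames that never complete, for example because one branch dropped them under load.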

Could you please advise how I should build the workload using DeepStream to ensure concurrent execution of independent tasks?

Thanks

Fiona.Chen · November 25, 2020, 3:55am

The inference result (for any model, such as a detector, classifier or segmentation model) is passed downstream as DsMetaData attached to the GstBuffer of the frames.
https://docs.nvidia.com/metropolis/deepstream/dev-guide/text/DS_plugin_metadata.html

The definition of ‘classifier-async-mode’ of nvinfer is in Gst-nvinfer — DeepStream 6.1.1 Release documentation

I don’t understand the pipeline you listed. Which of them are inference models? Is ‘[Additional processing(this will push downstream different buffer)]’ an inference too? How many inference models are there in your pipeline? What are the inputs and outputs of the models? What is the relationship between these models? Have you evaluated the performance of the models on Jetson before using them with DeepStream?

pawel.kaczmarczyk · November 25, 2020, 10:51am

Thank you for the response!

Which of them are inference models?
How many inference models are there in your pipeline?
What are the relationship between these models?

So there are 4 main tasks for this pipeline (independent, but they use the same detector and additional processing step). I have added numbers to indicate which models are unique.

Scene classifier (1) - takes the whole frame and attaches classification results
Detector (2) → Classifier (3)
Detector (2) → Additional processing → Pose Estimation (4) → Classifier (5)
Detector (2) → Additional processing → Face Detector (6) → Classifier (7)

There are 7 different inference models.
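
The dependencies listed above can be written down as a small graph to see which steps could, in principle, run concurrently. This is only a sketch of the dependency structure using Python's standard `graphlib` module (Python 3.9+). The node names are hypothetical labels for the numbered models, and the result says nothing about how DeepStream actually schedules the work.

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps it depends on (labels are illustrative).
DEPENDS_ON = {
    "scene_classifier_1": set(),
    "detector_2": set(),
    "classifier_3": {"detector_2"},
    "additional_processing": {"detector_2"},
    "pose_estimation_4": {"additional_processing"},
    "classifier_5": {"pose_estimation_4"},
    "face_detector_6": {"additional_processing"},
    "classifier_7": {"face_detector_6"},
}


def concurrent_stages(depends_on):
    """Group steps into stages; steps within one stage do not depend on each other."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything whose inputs are done
        stages.append(ready)
        sorter.done(*ready)
    return stages


STAGES = concurrent_stages(DEPENDS_ON)
for number, stage in enumerate(STAGES, start=1):
    print(number, stage)
```

Under these assumptions the graph has four stages, with two independent steps in each.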

Is ‘[Additional processing(this will push downstream different buffer)]’ a inference too?

Additional processing is a step that produces new frames to be consumed by Pose Estimation and Face Detector. This step needs to produce a new image (which will not be displayed anywhere, only used for inference, so I am also considering attaching the produced image as custom metadata). It’s not an inference step, but it’s required by Pose Estimation (4) and Face Detector (6).

What are the input and output of the models?

Scene classifier (1) - takes an RGB frame and should return probabilities for each class as well as some raw data from specified tensors (this step’s output will be custom metadata)
Detector (2) - standard detector input / output
Classifier (3) - standard classification of detected objects (people only)
Pose Estimation (4) - takes an RGB frame and returns a list of keypoints for each detected skeleton
Classifier (5) - takes a sequence of skeletons and returns standard classification output
Face Detector (6) - takes an RGB frame and returns the positions of detected faces
Classifier (7) - standard classification of faces

Have you evaluated the performance of the models on Jetson before you use them with DeepStream?

Yes, we have run the performance tests: we benchmarked the models in TensorRT on a Jetson AGX in MAXN mode. The slowest model in our workload achieved 60 FPS with batch size = 1 (the final solution will introduce batching to improve this result).

Fiona.Chen · November 25, 2020, 1:01pm

The 7 models can be used in the same pipeline. You can refer to the sample code in /opt/nvidia/deepstream/deepstream-5.0/sources/apps/sample_apps/deepstream-test2, which uses 4 models: one model to detect cars and persons, and three classifiers to identify the car’s color, type and manufacturer.
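
As a rough illustration of how a secondary classifier is tied to a detector in that kind of setup, the sketch below generates a minimal `[property]` section for an nvinfer config file. The helper `secondary_classifier_config` is hypothetical. The key names (`gie-unique-id`, `process-mode`, `network-type`, `operate-on-gie-id`, `operate-on-class-ids`, `classifier-async-mode`) are the ones I recall from the Gst-nvinfer configuration documentation and should be checked against the version in use. A real config would also need the model and label entries, which are omitted here.

```python
import configparser
import io


def secondary_classifier_config(gie_id, detector_gie_id, class_ids, async_mode=False):
    """Return the text of a minimal [property] section for a secondary classifier.

    Hypothetical helper; key names are assumed from the Gst-nvinfer docs.
    """
    cfg = configparser.ConfigParser()
    cfg["property"] = {
        "gie-unique-id": str(gie_id),
        "process-mode": "2",   # assumed: 2 = secondary (operates on objects)
        "network-type": "1",   # assumed: 1 = classifier
        "operate-on-gie-id": str(detector_gie_id),
        "operate-on-class-ids": ";".join(str(c) for c in class_ids),
        "classifier-async-mode": "1" if async_mode else "0",
    }
    out = io.StringIO()
    cfg.write(out, space_around_delimiters=False)  # emit key=value lines
    return out.getvalue()


# A classifier with id 2 that runs on class 0 objects from the detector with id 1.
config_text = secondary_classifier_config(2, 1, [0], async_mode=True)
print(config_text)
```

Each additional classifier would get its own file with a distinct `gie-unique-id`.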

The only problem is what is your ‘Additional processing’? Is it the pre-processing for ‘Pose estimation’ and ‘face detector’? If so, what kind of pre-processing is needed? Scaling, color format conversion, normalization or any other pre-processing?

pawel.kaczmarczyk · November 26, 2020, 11:30am

The 7 models can be used in the same pipeline, you can refer to the sample codes of /opt/nvidia/deepstream/deepstream-5.0/sources/apps/sample_apps/deepstream-test2 which use 4 models, one model to detect cars and persons, three classifiers to identify the car’s color, type and manufacturers.

Thanks for the sample, but I’m afraid it would not fit our case: we would like to execute whole branches in parallel, and this example only shows how to run one model asynchronously.

The only problem is what is your ‘Additional processing’? Is it the pre-processing for ‘Pose estimation’ and ‘face detector’? If so, what kind of pre-processing is needed? Scaling, color format conversion, normalization or any other pre-processing?

The ‘Additional processing’ step consists of several steps and will be custom developed.
During this stage we will produce a new image that we plan to attach as buffer metadata. The new image will be created from specified parts of the original frame; this is a project requirement.

Fiona.Chen · November 26, 2020, 11:49am

Why do you say so?

Why do you insert the new images into the metadata? How will you use the metadata downstream? Will these images be used as the input of ‘Pose Estimation’ and ‘Face Detector’?

pawel.kaczmarczyk · November 26, 2020, 12:00pm

OK, so maybe I misunderstood something :) I will give it a try.

For PoseEstimation we still need to develop custom input/output parsers for our models, as in the apps/sample_apps/deepstream-infer-tensor-meta-test sample, and provide nvinfer with functions to correctly parse the input/output tensors.

Fiona.Chen · November 27, 2020, 3:09am

So it is just a part of inference; you don’t need to list it as a separate step in your DeepStream pipeline.
