What is the correct way to implement post-processing with DeepStream

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU) Jetson Xavier AGX
• DeepStream Version 6.0.1
• JetPack Version (valid for Jetson only) 4.6
• TensorRT Version 8.2.1
• Issue Type( questions, new requirements, bugs) Question
• How to reproduce the issue ? Based on this code: gpubootcamp/Introduction_to_Multi-DNN_pipeline.ipynb at a647a2c3fc75828cbbf1cbd5ab29f865c491a35c · openhackathons-org/gpubootcamp · GitHub

I am wondering what the appropriate way is to perform some kind of post-processing (CPU work) once the DeepStream pipeline finishes a detection on a frame, without hurting the pipeline throughput too much. For instance:

  1. The DS pipeline detects, crops and OCRs a car license plate, and I need to look up information about this plate in a database, or;

  2. The DS pipeline detects a person, and I need to check whether this person is inside a forbidden area, or;

  3. The DS pipeline recognizes a face, and I need to check whether this person is allowed to enter a room, or;

  4. … and so on.

I suppose it would be necessary to do it with a probe, like the osd_sink_pad_buffer_probe(pad,info,u_data) from the tutorial above?

And then? Should I:
a) Call a routine inside this probe to perform this post-processing?
b) Use python multiprocessing to send the necessary data to a different process?
c) Create another filter, sequentially positioned inside DS pipeline, that would perform this post-processing?
d) Use some kind of message broker like MQTT, to deliver that to another subscriber?

According to the description, I don't think these tasks are "post-processing". You want to use the inferencing results to implement business logic that has nothing to do with the video or the inferencing process.

Yes. A probe function is one of the ways to get the data out of the pipeline.

a) No. The probe function is a callback that runs in the main pipeline thread, so it blocks the whole pipeline while it executes. Time-consuming tasks must not be done in the probe function. It is recommended to create your own worker thread and, in the probe function, hand the data you get from the pipeline over to that thread.
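The hand-off to a worker thread can be sketched as follows. This is a minimal illustration, not DeepStream code: `on_detection` stands for whatever the probe calls after it has copied the values it needs out of the metadata, and `handle_detection`, the dictionary layout and the queue size are hypothetical names and choices made for the example.

```python
import queue
import threading

# Bounded queue: if the worker falls behind, new results are dropped
# instead of stalling the thread that runs the probe.
results = queue.Queue(maxsize=64)
_STOP = object()  # sentinel that tells the worker to exit


def handle_detection(det):
    """Hypothetical business logic (database lookup, access check, ...)."""
    return f"frame {det['frame']}: {det['label']}"


def worker(out):
    """Runs in its own thread and consumes results until _STOP arrives."""
    while True:
        det = results.get()
        if det is _STOP:
            break
        out.append(handle_detection(det))


def on_detection(det):
    """Called from the probe; returns immediately, never blocks."""
    try:
        results.put_nowait(det)
        return True
    except queue.Full:
        return False  # result dropped because the worker is too slow


processed = []
t = threading.Thread(target=worker, args=(processed,), daemon=True)
t.start()

# Stand-in for three probe invocations.
for i in range(3):
    on_detection({"frame": i, "label": "person"})

results.put(_STOP)
t.join()
```

Only plain Python values should be put on the queue; the assumption here is that the probe copies what it needs (labels, coordinates, frame number) before returning, rather than passing references to pipeline buffers.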

b) It is OK.

c) It depends on the workload. As I mentioned before, your business logic does not process video directly, so it is not recommended to use a GStreamer plugin (filter) to handle it.

d) DeepStream already supports a message broker: see the Gst-nvmsgbroker page of the DeepStream 6.2 Release documentation. Whether to use it depends on your business logic. DeepStream is just an SDK, and the DeepStream APIs can be used together with other SDKs and APIs in an application.


BTW, ROI enter/leave analysis is already supported in the DeepStream SDK: see the Gst-nvdsanalytics page of the DeepStream 6.2 Release documentation.

Thank you, Fiona, for your several remarks. So I must keep processing by the book: business rules go to the CPU, video analytics to the GPU.

The nvosd.set_property('process-mode', <0 CPU, 1 GPU>) is set to CPU by default. Thus, at first, it seems to me that I should keep process-mode set to CPU in order to free the GPU for the inference pipeline. However, drawing bounding boxes, labels and OSD text onto an image sits somewhere between CPU and GPU work. What are the pros and cons of each process mode?

And how do I check the CPU and GPU load? Today I only use jtop.

Even after you set "process-mode" of gst-nvosd to CPU, there is still some work to be done on the GPU, because the input video buffer to gst-nvosd is a GPU buffer.
OSD operations put a very low load on the GPU compared to inferencing, and we have optimized the algorithms in gst-nvosd to minimize GPU consumption.

There are several ways to check the GPU load:

  1. Use “tegrastats” to measure the GPU usage in realtime.
  2. For detailed GPU kernel profiling, please refer to Nsight Systems | NVIDIA Developer
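For logging the GPU load alongside pipeline statistics, the `GR3D_FREQ` field that tegrastats prints can be extracted with a small parser. This is only a sketch: the sample line below is abbreviated and illustrative rather than captured output, the function name is hypothetical, and the exact field layout can differ between JetPack releases.

```python
import re

# tegrastats reports GPU utilization in a field such as "GR3D_FREQ 37%@1377".
_GPU_RE = re.compile(r"GR3D_FREQ (\d+)%")


def gpu_load_percent(line):
    """Return the GPU load in percent, or None if the field is absent."""
    match = _GPU_RE.search(line)
    return int(match.group(1)) if match else None


# Abbreviated, illustrative line (not real captured output).
sample = "RAM 4722/31918MB CPU [12%@1190,8%@1190] GR3D_FREQ 37%@1377"
load = gpu_load_percent(sample)
```

In practice the lines would come from the standard output of a running tegrastats process, read one at a time and passed to `gpu_load_percent`.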