Could I get the time spent by each module in the deepstream-app or deepstream-parallel-inference-app runtime?

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU) GPU
• DeepStream Version Docker 6.3

When using deepstream-app or deepstream-parallel-infer-app, I want to measure the elapsed time of each module, for example video decoding, the detection model, and the classification model. How can I get these timings? Also, in a single-source, multi-model setup (detection model + classification model), how do the two models share GPU resources? (For example, 50% for the detection model and 50% for the classification model?)

You can refer to the FAQ:
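As a rough sketch of what enabling latency measurement looks like in practice: to my understanding, DeepStream switches on its latency printouts through the environment variables `NVDS_ENABLE_LATENCY_MEASUREMENT` and `NVDS_ENABLE_COMPONENT_LATENCY_MEASUREMENT`. The helper names `latency_env` and `run_with_latency` and the log file name are hypothetical and only serve this illustration.

```python
import os
import subprocess


def latency_env(base_env=None, per_component=True):
    """Return a copy of the environment with latency measurement switched on."""
    env = dict(os.environ if base_env is None else base_env)
    # Assumption: these variable names are the ones the DeepStream FAQ describes.
    env["NVDS_ENABLE_LATENCY_MEASUREMENT"] = "1"  # per-frame latency
    if per_component:
        env["NVDS_ENABLE_COMPONENT_LATENCY_MEASUREMENT"] = "1"  # per-plugin latency
    return env


def run_with_latency(config_path, log_path="latency.log"):
    """Run deepstream-app with latency printing and capture its output in a file."""
    with open(log_path, "w") as log:
        result = subprocess.run(
            ["deepstream-app", "-c", config_path],
            env=latency_env(),
            stdout=log,
            stderr=subprocess.STDOUT,
            check=False,
        )
    return result.returncode
```

Calling `run_with_latency("source4_1080p_dec_infer-resnet_tracker_sgie_tiled_display_int8.txt")` from the sample config directory would then leave the latency lines in `latency.log` for later analysis.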

Thanks for your reply. I got the latency measurements successfully. What are the units of the component latency (ms or s)? Also, when I change num-sources in [source0] from 1 to 16, batch-num no longer equals frame-num, whereas with a single source the two were equal. What is the relationship between batch-num and frame-num?

And for pgie and secondary-gie, or for different secondary-gie instances, how do they allocate GPU resources?


About this question, you can refer to our source code directly.
source code: sources\gst-plugins\gst-nvinfer

Thanks, I will try it later. Can I understand the BATCH-NUM shown in the terminal as a batch coming out of gst-nvstreammux? And when several sources are decoded at the same time, different sources are at different frame numbers; is that because of Gst-Nvstreammux?

And when I run the following command:


the FPS is displayed on the terminal. How can I get it displayed? By the way, the following command is opening


When Frame_num = 1441, the frame latency is 531 ms, but the displayed component latencies add up to only 138 ms. What is the remaining time spent on? Looking forward to your reply, thanks.
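The gap in question can be made explicit with a few lines of arithmetic. This is only a sketch using the two numbers quoted in the post; the function name `unaccounted_latency` is hypothetical.

```python
def unaccounted_latency(frame_latency_ms, component_sum_ms):
    """Return the latency not covered by the printed components, and its share."""
    remainder_ms = frame_latency_ms - component_sum_ms
    share = remainder_ms / frame_latency_ms
    return remainder_ms, share


# Values quoted in the post: 531 ms frame latency, 138 ms summed component latency.
remainder_ms, share = unaccounted_latency(531.0, 138.0)
print(f"{remainder_ms:.0f} ms unaccounted ({share:.0%} of the frame latency)")
# prints: 393 ms unaccounted (74% of the frame latency)
```

So roughly three quarters of the frame latency is spent outside the components whose latency is printed.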

Which config file did you run? How did you calculate the total latency? Could you attach the config file and the log file?

cd /opt/nvidia/deepstream/deepstream-6.3/samples/configs/deepstream-app
deepstream-app -c source4_1080p_dec_infer-resnet_tracker_sgie_tiled_display_int8.txt

num-sources = 1: this is the original config file; I didn’t make any changes to it.
num-sources = 16: the only change is setting every batch-size from 1 to 16.
I’m loading the video stream from a local file; is that the reason?
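The config change described here (num-sources and every batch-size set to 16) can be scripted. The following is a minimal sketch that assumes the deepstream-app config can be read as an INI-style file by Python's `configparser`; the `SAMPLE` fragment is a made-up excerpt, not the real sample config, and `set_batch_size` is a hypothetical helper. Note that `configparser` drops comments when writing the file back.

```python
import configparser
import io

# Hypothetical, heavily shortened config fragment used only for illustration.
SAMPLE = """\
[source0]
enable=1
num-sources=1

[streammux]
batch-size=1

[primary-gie]
batch-size=1

[tiled-display]
rows=1
"""


def set_batch_size(config_text, n):
    """Set num-sources in source groups and batch-size wherever it is already defined."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key names exactly as written
    parser.read_string(config_text)
    for section in parser.sections():
        if parser.has_option(section, "batch-size"):
            parser.set(section, "batch-size", str(n))
        if section.startswith("source") and parser.has_option(section, "num-sources"):
            parser.set(section, "num-sources", str(n))
    out = io.StringIO()
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()


print(set_batch_size(SAMPLE, 16))
```

Groups that do not define batch-size, such as the tiled display group in the fragment, are left untouched.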

No. You can refer to the link Generate GStreamer Pipeline Graph to get the graph of the pipeline. You will find that, in addition to the few plugins whose latency is printed, there are many other plugins in the whole pipeline.
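For reference, a pipeline graph is usually obtained through GStreamer's `GST_DEBUG_DUMP_DOT_DIR` environment variable, with the dumped `.dot` files rendered by Graphviz. The sketch below only prepares the environment and the `dot` commands; it assumes the application actually writes dot files when the variable is set and that Graphviz is installed. The helper names are hypothetical.

```python
import os
from pathlib import Path


def dot_dump_env(dump_dir, base_env=None):
    """Return an environment in which GStreamer dumps pipeline graphs to dump_dir."""
    env = dict(os.environ if base_env is None else base_env)
    env["GST_DEBUG_DUMP_DOT_DIR"] = str(dump_dir)
    return env


def render_commands(dot_files):
    """Build one Graphviz `dot -Tpng` command per dumped .dot file."""
    commands = []
    for name in dot_files:
        path = Path(name)
        commands.append(["dot", "-Tpng", str(path), "-o", str(path.with_suffix(".png"))])
    return commands


# Usage sketch: run the app with dot_dump_env("/tmp"), then render the dumped
# files, for example by passing sorted(Path("/tmp").glob("*.dot")) to render_commands.
for cmd in render_commands(["pipeline.dot"]):
    print(" ".join(cmd))
```

Each resulting command can be passed to `subprocess.run` to produce a PNG of the full pipeline, including the elements whose latency is not printed.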

I know. I’ve exported the pipeline graph and can see that it contains many more plugins; I’ve also found many more in… /deepstream/lib/gst-plugins.

Can I understand it this way: when sources go into the pipeline, all the green elements in the pipeline below, such as GstTee, GstQueue, and Gstnvvideoconvert, run again for every frame?

Meanwhile, when I run deepstream-app -c source4_1080p_dec_infer-resnet_tracker_sgie_tiled_display_int8.txt, I see that the classifiers take only about 0.01 ms of elapsed time. I think that is amazing!

Yes. This graph shows the flow of each buffer. The performance of hardware-accelerated processing is indeed fast.

Thank you very much, DS is a great development tool.
One more question: I found that when the number of sources increases, for example sources=16, the frame latency differs a lot between sources. What causes the difference?
For example:

Source id = 2 Frame_num = 1763 Frame latency = 131.454102 (ms) 
Source id = 5 Frame_num = 1759 Frame latency = 236.458008 (ms) 
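To compare sources over a whole run rather than on two individual lines, the frame latency lines can be aggregated per source. This is a minimal sketch that assumes every frame latency line follows the format of the two lines quoted above; `per_source_latency` is a hypothetical helper.

```python
import re
from collections import defaultdict
from statistics import mean

# Matches lines like the two quoted above.
FRAME_RE = re.compile(
    r"Source id\s*=\s*(\d+)\s+Frame_num\s*=\s*(\d+)\s+Frame latency\s*=\s*([\d.]+)\s*\(ms\)"
)


def per_source_latency(log_text):
    """Return {source_id: (sample_count, mean_ms, max_ms)} from the log text."""
    samples = defaultdict(list)
    for match in FRAME_RE.finditer(log_text):
        samples[int(match.group(1))].append(float(match.group(3)))
    return {
        source_id: (len(values), mean(values), max(values))
        for source_id, values in sorted(samples.items())
    }


LOG = """\
Source id = 2 Frame_num = 1763 Frame latency = 131.454102 (ms)
Source id = 5 Frame_num = 1759 Frame latency = 236.458008 (ms)
"""

for source_id, (count, mean_ms, max_ms) in per_source_latency(LOG).items():
    print(f"source {source_id}: n={count} mean={mean_ms:.1f} ms max={max_ms:.1f} ms")
```

Run over a full log, this shows whether one source is consistently slower or whether the gap only appears on occasional frames.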

It may be due to a certain plugin requiring synchronization processing or the special processing on encoding and decoding of some frames. This requires the specific analysis of the latency of each plugin.

What should I do to get the latency of each plugin? Also, does DS natively impose any resource limits on deepstream-app / deepstream_parallel_inference_app, for example allowing only 80% of the resources for decoding and 70% for detection? Thanks.

There has been no update from you for a while, so we assume this is no longer an issue. Hence we are closing this topic. If you need further support, please open a new one. Thanks

Each component is printed in the image you posted earlier, in lines like Comp name = ....
We have no limitations on resources. Usage is limited by the hardware and by the processing in some plugins.
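The per-component lines can be summarized with a short script. The sketch below assumes each such line contains `Comp name = <name>` followed later by `component latency= <value>`; the exact line layout and the sample values are assumptions for illustration, not output copied from a real run, and `component_means` is a hypothetical helper.

```python
import re
from collections import defaultdict

# Assumed layout: "Comp name = <name> ... component latency= <milliseconds>"
COMP_RE = re.compile(r"Comp name\s*=\s*(\S+).*?component latency\s*=\s*([\d.]+)")


def component_means(log_text):
    """Return the mean latency per component name found in the log text."""
    per_component = defaultdict(list)
    for line in log_text.splitlines():
        match = COMP_RE.search(line)
        if match:
            per_component[match.group(1)].append(float(match.group(2)))
    return {name: sum(values) / len(values) for name, values in per_component.items()}


# Made-up sample lines in the assumed layout.
SAMPLE_LOG = """\
Comp name = nvv4l2decoder0 in_system_timestamp = 100.0 out_system_timestamp = 104.0 component latency= 4.0
Comp name = primary_gie in_system_timestamp = 110.0 out_system_timestamp = 122.5 component latency= 12.5
Comp name = nvv4l2decoder0 in_system_timestamp = 140.0 out_system_timestamp = 146.0 component latency= 6.0
"""

for name, mean_ms in component_means(SAMPLE_LOG).items():
    print(f"{name}: {mean_ms:.2f}")
```

Comparing these means with the frame latency shows which printed plugin dominates and how much time falls outside the printed components.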
