Decoder Latency increases when enabling DLA

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU) Jetson AGX Orin
• DeepStream Version 6.2
• JetPack Version (valid for Jetson only) JetPack 5.1
• TensorRT Version 8.5.2

I have a pipeline which utilizes DeepStream plugins as follows:
Decoder → nvstreammux → detect → classify → encode
I have noticed that when running the detect/classify models on DLA, the decoding time increases to almost double compared with running the same models on the GPU. Is there any reason for this? Additionally, can we run the model without using nvstreammux in DeepStream, since it introduces latency into the pipeline?
The decode/encode stages use the hardware NVDEC/NVENC engines.
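Roughly, the launch line looks like the following (a simplified sketch only; the file names, resolution, and nvinfer config paths are placeholders, not my exact command):

$ gst-launch-1.0 filesrc location=sample.h264 ! h264parse ! nvv4l2decoder ! \
    mux.sink_0 nvstreammux name=mux batch-size=1 width=1920 height=1080 ! \
    nvinfer config-file-path=detector_config.txt ! \
    nvinfer config-file-path=classifier_config.txt ! \
    nvvideoconvert ! nvv4l2h264enc ! h264parse ! filesink location=out.h264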

How did you measure the “decoding time”?

No.

  1. The nvstreammux is the key to inferencing, since it generates the batched data for TensorRT.
  2. You may need to set the “width” and “height” properties of nvstreammux to the same values as the video’s original resolution; then the nvstreammux latency will be very small (see the example below).
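For example, if you are using deepstream-app and the source is 1920x1080, the relevant config group would look like this (a sketch; the values are assumptions, match them to your own stream):

[streammux]
# set these to the original resolution of the input to avoid scaling inside the mux
width=1920
height=1080
batch-size=1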

Thank you for the clarification.

I have enabled component latency measurement by:
$ export NVDS_ENABLE_COMPONENT_LATENCY_MEASUREMENT=1
Please also note that some layers of the model fell back to the GPU.
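For reference, this is roughly how I run it (a sketch assuming the stock deepstream-app; the config file name is a placeholder):

$ # frame-level latency measurement is typically enabled together with the component-level one
$ export NVDS_ENABLE_LATENCY_MEASUREMENT=1
$ export NVDS_ENABLE_COMPONENT_LATENCY_MEASUREMENT=1
$ deepstream-app -c my_pipeline_config.txt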

This method measures the latency of the GstBuffer from entering the sink pad to leaving the src pad; it is not the processing time alone. Sometimes the downstream element consumes the buffer late, and then the latency of the upstream element becomes long. It does not mean the upstream element processes the buffer slowly.

Please measure the GPU/DLA loading with the pipeline running to find the real bottleneck of the pipeline.

Can you please clarify this point? As per the documentation, this command gives the latency of each plugin in the pipeline, and the decoder is one of the plugins I use. The command outputs a time in ms for each plugin in the pipeline for each GstBuffer (a frame, in my case).

Please also advise how I can measure the loading.

As described in the document, the latency of each plugin in the pipeline can be measured in this way. However, your concern that the “decoding time increases to almost double compared with running the same models on GPU” may not be a performance drop of the decoder; a drop in inferencing speed can also cause the decoder plugin's measured latency to become larger.

The easiest way is to use the “Jetson Power GUI”. The command “tegrastats” may also help.
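For example (a rough sketch; the exact output fields differ between JetPack releases):

$ # print the system load once per second; GR3D_FREQ shows GPU utilization,
$ # and DLA activity reporting depends on the JetPack release
$ sudo tegrastats --interval 1000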

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.