Run YOLOv9e with DeepStream Python bindings

Is there any sample Python code for running YOLOv9e with the DeepStream Python bindings (like deepstream_python_apps/apps/deepstream-test1)?
My environment: CUDA 12.2, DeepStream 7.0.

Refer to this repository; just use YOLOv9 as the PGIE of deepstream-test1.

Some awesome work from the community.

Hi, thanks for the reply.
I tried to run YOLOv9e using the DeepStream Python bindings. It runs successfully, but for multistream (6 video inputs) the FPS is much lower than with the C++ method (I referred to Ultralytics YOLO11 on NVIDIA Jetson using DeepStream SDK and TensorRT), with which I got 23 FPS on each stream in multistream (6 input videos); the model is YOLOv9e FP32 in both the C++ and Python cases. Is there any issue with the code and config? I have attached my code, config, and system details.

Warnings:

System details: NVIDIA TITAN Xp, CUDA 12.2, GPU driver 535.183.06, GPU memory 12 GB

test.zip (8.4 KB)
I referred to the code from deepstream_python_apps/apps/deepstream-test3 at master · NVIDIA-AI-IOT/deepstream_python_apps · GitHub

  1. Make sure the pipelines of both are exactly the same. You can dump the pipeline as a dot file to confirm it.
    DeepStream SDK FAQ - #10 by mchi
  2. From the code you provided, the Python project uses FP16 instead of FP32.
  3. Although Python programs are usually much slower, you can compare the GPU usage of the Python and native programs.
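The dot-file dump from point 1 works the same way for the Python app and deepstream-app; a minimal sketch, assuming the directory name is arbitrary and graphviz is installed for rendering:

```shell
# GStreamer writes one .dot file per pipeline state change when this is set:
export GST_DEBUG_DUMP_DOT_DIR=/tmp/ds-dots
mkdir -p "$GST_DEBUG_DUMP_DOT_DIR"

# Then run either app with the variable exported, e.g.:
#   python3 deepstream_test_3.py -i file:///... --no-display
#   deepstream-app -c deepstream_app_config.txt

# Render the PLAYING-state graph to an image for side-by-side comparison:
#   dot -Tpng "$GST_DEBUG_DUMP_DOT_DIR"/*PLAYING*.dot -o pipeline.png
```

Diffing the two rendered graphs makes it easy to spot extra or differently configured elements between the pipelines.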

Hi, thanks for the quick reply.
Sorry, I meant the binary approach (reference: Ultralytics YOLO11 on NVIDIA Jetson using DeepStream SDK and TensorRT); with it I am able to get 23 FPS on each stream in multistream (6 video inputs).
But in the config I set network-mode=0, which is FP32, and also commented out the model engine path, so the FP32 engine is created when the application runs.
I tried both the C++ and Python approaches with the FP32 engine model, and both get 5-6 FPS on each stream in multistream (6 video inputs). But on the binary side I get around 23 FPS for each stream.
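For reference, this is how those nvinfer keys would look in the infer config (a sketch only; the filenames are illustrative):

```ini
[property]
onnx-file=yolov9e.onnx
# Leaving model-engine-file commented out forces nvinfer to (re)build the engine
#model-engine-file=model_b1_gpu0_fp32.engine
# network-mode: 0=FP32, 1=INT8, 2=FP16
network-mode=0
```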
The image below shows the GPU usage of the Python bindings with the YOLOv9e engine.


The image below shows the GPU usage of the binary option (Ultralytics YOLO11 on NVIDIA Jetson using DeepStream SDK and TensorRT) with the YOLOv9e engine.

Why are there multiple Python instances? This seems to result in different video memory usage (742M vs 2894M).

Hi,
1. Why are there multiple Python instances? Sorry, I don't know about that.
Also, I ran the C++ sample application deepstream-test3 with YOLOv9e (multistream, 6 video inputs) and observed the same FPS as with the Python bindings, about 5-7 FPS.
Below is the GPU usage of deepstream-test3 (C++) with multistream (6):


Below is the GPU usage of the binary (deepstream-app) with multistream (6); the observed FPS is 23-24:

Could you please help find a solution to this low-FPS issue? The requirement is to run YOLOv9e multistream at a decent FPS (above 21) using the DeepStream Python bindings or C++ application code.
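As a back-of-the-envelope check on what that requirement implies for the engine, assuming 6 streams at 21+ FPS each and a single batch-6 engine (as in the configs discussed in this thread):

```python
# What must the engine sustain for 6 streams at 21+ FPS each?
streams = 6
target_fps = 21
batch = 6

total_fps = streams * target_fps          # frames/s the whole pipeline must infer
per_frame_ms = 1000 / total_fps           # average inference budget per frame
per_batch_ms = per_frame_ms * batch       # budget per batched nvinfer call

print(total_fps, round(per_frame_ms, 2), round(per_batch_ms, 2))
```

So the FP32 engine would need to finish each batch-6 inference in under roughly 48 ms; a quick `trtexec` latency run on the engine would show whether the GPU can meet that at all.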

Please make sure to compare FPS under the same GPU load; otherwise the comparison is meaningless.

1. For deepstream-app, just add multiple sources in the configuration file and start one process. There is no need to start multiple instances, which uses too much CPU/GPU.
2. Use fakesink to make the pipeline run as fast as possible and avoid the impact of clock synchronization.
3. Can you share your yolov9e.onnx model?
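For suggestion 2, the deepstream-app sink group would look like this (a sketch):

```ini
[sink0]
enable=1
# type: 1=FakeSink 2=EglSink/nv3dsink 3=File
type=1
# sync=0 disables clock synchronization so the pipeline runs as fast as it can
sync=0
```

For the Python deepstream-test3, running with `--no-display` serves the same purpose, as far as I recall, by swapping in a fakesink.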

Hi, I tried to attach the ONNX model, but it exceeds the allowed upload size (100 MB).
Steps to convert to ONNX below:

  1. Download the model using a Python script:
    from ultralytics import YOLO
    model = YOLO("yolov9e.pt")
  2. Following Ultralytics YOLO11 on NVIDIA Jetson using DeepStream SDK and TensorRT, I converted it to ONNX:
    /ultralytics# python3 export_yoloV9.py -w yolov9e.pt --dynamic

I used an A40 GPU and I can get the same result. Here are my changes:

diff --git a/config_infer_primary_yoloV9.txt b/config_infer_primary_yoloV9.txt
index 8db48a0..733f490 100644
--- a/config_infer_primary_yoloV9.txt
+++ b/config_infer_primary_yoloV9.txt
@@ -2,11 +2,11 @@
 gpu-id=0
 net-scale-factor=0.0039215697906911373
 model-color-format=0
-onnx-file=yolov9-c.onnx
-model-engine-file=model_b1_gpu0_fp32.engine
+onnx-file=/root/yolov9-s-converted.pt.onnx
+model-engine-file=/opt/nvidia/deepstream/deepstream/sources/deepstream_python_apps/apps/deepstream-test3/model_b6_gpu0_fp32.engine
 #int8-calib-file=calib.table
 labelfile-path=labels.txt
-batch-size=1
+batch-size=6
 network-mode=0
 num-detected-classes=80
 interval=0
diff --git a/deepstream_app_config.txt b/deepstream_app_config.txt
index 8c6822f..9e8878a 100644
--- a/deepstream_app_config.txt
+++ b/deepstream_app_config.txt
@@ -1,6 +1,6 @@
 [application]
 enable-perf-measurement=1
-perf-measurement-interval-sec=5
+perf-measurement-interval-sec=1
 
 [tiled-display]
 enable=1
@@ -11,17 +11,19 @@ height=720
 gpu-id=0
 nvbuf-memory-type=0
 
+
 [source0]
 enable=1
 type=3
 uri=file:///opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4
-num-sources=1
+num-sources=6
 gpu-id=0
 cudadec-memtype=0
 
 [sink0]
+#Type - 1=FakeSink 2=EglSink/nv3dsink (Jetson only) 3=File
 enable=1
-type=2
+type=1
 sync=0
 gpu-id=0
 nvbuf-memory-type=0
@@ -44,7 +46,7 @@ nvbuf-memory-type=0
 [streammux]
 gpu-id=0
 live-source=0
-batch-size=1
+batch-size=6
 batched-push-timeout=40000
 width=1920
 height=1080
@@ -56,7 +58,7 @@ enable=1
 gpu-id=0
 gie-unique-id=1
 nvbuf-memory-type=0
-config-file=config_infer_primary.txt
+config-file=config_infer_primary_yoloV9.txt
 
 [tests]
 file-loop=0
diff --git a/apps/deepstream-test3/deepstream_test_3.py b/apps/deepstream-test3/deepstream_test_3.py
index 1d04ebc..c3a05dc 100755
--- a/apps/deepstream-test3/deepstream_test_3.py
+++ b/apps/deepstream-test3/deepstream_test_3.py
@@ -109,7 +109,7 @@ def pgie_src_pad_buffer_probe(pad,info,u_data):
                 obj_meta=pyds.NvDsObjectMeta.cast(l_obj.data)
             except StopIteration:
                 break
-            obj_counter[obj_meta.class_id] += 1
+            # obj_counter[obj_meta.class_id] += 1
             try: 
                 l_obj=l_obj.next
             except StopIteration:
@@ -345,7 +345,7 @@ def main(args, requested_pgie=None, config=None, disable_probe=False):
     elif requested_pgie == "nvinfer" and config != None:
         pgie.set_property('config-file-path', config)
     else:
-        pgie.set_property('config-file-path', "dstest3_pgie_config.txt")
+        pgie.set_property('config-file-path', "/root/DeepStream-Yolo/config_infer_primary_yoloV9.txt")
     pgie_batch_size=pgie.get_property("batch-size")
     if(pgie_batch_size != number_sources):
         print("WARNING: Overriding infer-config batch-size",pgie_batch_size," with number of sources ", number_sources," \n")
@@ -399,7 +399,7 @@ def main(args, requested_pgie=None, config=None, disable_probe=False):
         if not disable_probe:
             pgie_src_pad.add_probe(Gst.PadProbeType.BUFFER, pgie_src_pad_buffer_probe, 0)
             # perf callback function to print fps every 5 sec
-            GLib.timeout_add(5000, perf_data.perf_print_callback)
+            GLib.timeout_add(1000, perf_data.perf_print_callback)
 
     # Enable latency measurement via probe if environment variable NVDS_ENABLE_LATENCY_MEASUREMENT=1 is set.
     # To enable component level latency measurement, please set environment variable

python3 deepstream_test_3.py -i file:///opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 file:///opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 file:///opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 file:///opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 file:///opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 file:///opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 --no-display |grep PERF

**PERF:  {'stream0': 74.87, 'stream1': 79.19, 'stream2': 75.2, 'stream3': 74.87, 'stream4': 75.21, 'stream5': 79.2} 
**PERF:  {'stream0': 78.95, 'stream1': 78.96, 'stream2': 78.96, 'stream3': 78.96, 'stream4': 78.96, 'stream5': 78.96} 
**PERF:  {'stream0': 78.89, 'stream1': 78.89, 'stream2': 78.89, 'stream3': 78.89, 'stream4': 78.89, 'stream5': 78.89} 
**PERF:  {'stream0': 77.95, 'stream1': 77.95, 'stream2': 77.95, 'stream3': 77.95, 'stream4': 77.95, 'stream5': 77.95} 
**PERF:  {'stream0': 78.91, 'stream1': 78.91, 'stream2': 78.91, 'stream3': 78.91, 'stream4': 78.91, 'stream5': 78.91}

deepstream-app -c deepstream_app_config.txt

**PERF:  FPS 0 (Avg)	FPS 1 (Avg)	FPS 2 (Avg)	FPS 3 (Avg)	FPS 4 (Avg)	FPS 5 (Avg)	
**PERF:  79.05 (75.87)	77.43 (74.10)	77.43 (74.10)	77.43 (74.10)	77.43 (74.10)	77.43 (74.10)	
**PERF:  77.44 (77.46)	77.44 (77.09)	77.44 (77.09)	77.44 (77.09)	77.44 (77.09)	77.44 (77.09)	
**PERF:  77.14 (77.22)	77.14 (77.02)	77.14 (77.02)	77.14 (77.02)	77.14 (77.02)	77.14 (77.02)	
**PERF:  78.37 (77.74)	78.37 (77.60)	78.37 (77.60)	78.37 (77.60)	78.37 (77.60)	78.37 (77.60)	
**PERF:  78.34 (77.78)	78.34 (77.67)	78.34 (77.67)	78.34 (77.67)	78.34 (77.67)	78.34 (77.67)	
**PERF:  78.19 (77.81)	78.19 (77.72)	78.19 (77.72)	78.19 (77.72)	78.19 (77.72)	78.19 (77.72)	
**PERF:  77.81 (77.84)	77.81 (77.76)	77.81 (77.76)	77.81 (77.76)	77.81 (77.76)	77.81 (77.76)
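To compare the two runs apples-to-apples, the **PERF lines from both tools can be reduced to per-stream averages. A small sketch (`parse_perf` is a hypothetical helper, not part of either app; it handles the Python app's dict format and deepstream-app's tab-separated table):

```python
import ast
import re

def parse_perf(lines):
    """Collect per-stream FPS samples from **PERF lines of either tool."""
    stats = {}
    for line in lines:
        body = line.strip()
        if not body.startswith("**PERF:"):
            continue
        body = body[len("**PERF:"):].strip()
        if body.startswith("{"):
            # Python app format: {'stream0': 74.87, 'stream1': 79.19, ...}
            for name, fps in ast.literal_eval(body).items():
                stats.setdefault(int(name.replace("stream", "")), []).append(fps)
        elif not body.startswith("FPS"):
            # deepstream-app format: "79.05 (75.87)\t77.43 (74.10)..." (skip header row)
            for idx, m in enumerate(re.finditer(r"([\d.]+)\s*\(([\d.]+)\)", body)):
                stats.setdefault(idx, []).append(float(m.group(1)))
    return stats

sample = [
    "**PERF:  {'stream0': 74.87, 'stream1': 79.19}",
    "**PERF:  78.95 (75.87)\t78.96 (74.10)",
]
averages = {i: sum(v) / len(v) for i, v in parse_perf(sample).items()}
print(averages)
```

Feeding each app's full log through this gives one average per stream, which is easier to compare than eyeballing the raw lines.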

Hi, thanks for the reply. Is the yolov9e.pt model you used the same one I mentioned above, or a different one?

I tested yolov9s/yolov9e, and there is no difference in fps between native and python. This problem should have nothing to do with the model. Please debug it yourself.

yolov9e

python
**PERF:  {'stream0': 14.27, 'stream1': 18.26, 'stream2': 14.38, 'stream3': 18.26, 'stream4': 14.27, 'stream5': 14.26} 
**PERF:  {'stream0': 14.99, 'stream1': 13.99, 'stream2': 14.99, 'stream3': 14.99, 'stream4': 14.99, 'stream5': 14.99} 
**PERF:  {'stream0': 14.99, 'stream1': 14.99, 'stream2': 14.99, 'stream3': 14.99, 'stream4': 14.99, 'stream5': 14.99} 
**PERF:  {'stream0': 15.99, 'stream1': 15.99, 'stream2': 15.99, 'stream3': 15.99, 'stream4': 14.99, 'stream5': 15.99} 
**PERF:  {'stream0': 14.98, 'stream1': 14.98, 'stream2': 14.98, 'stream3': 14.98, 'stream4': 14.98, 'stream5': 14.98} 
**PERF:  {'stream0': 14.99, 'stream1': 14.99, 'stream2': 14.99, 'stream3': 14.99, 'stream4': 14.99, 'stream5': 14.99} 

native
**PERF:  FPS 0 (Avg)	FPS 1 (Avg)	FPS 2 (Avg)	FPS 3 (Avg)	FPS 4 (Avg)	FPS 5 (Avg)	
**PERF:  20.81 (14.78)	14.98 (7.96)	14.98 (7.96)	14.98 (7.96)	14.98 (7.96)	0.00 (0.00)	
**PERF:  15.15 (15.79)	15.15 (15.10)	15.15 (15.10)	15.15 (15.10)	15.15 (15.10)	15.15 (15.10)	
**PERF:  15.10 (15.42)	15.10 (15.04)	15.10 (15.04)	15.10 (15.04)	15.10 (15.04)	15.10 (15.05)	
**PERF:  15.06 (15.29)	15.06 (15.03)	15.06 (15.03)	15.06 (15.03)	15.06 (15.03)	15.06 (15.03)

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks.

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.