DeepStream 4 + YOLOv3 slow with multiple sources

I need to work with 4-5 RTSP streams, but the performance is very bad…
With two local .mp4 videos I get only 4-5 fps per source.

How can I get the performance stated on your website?
https://developer.nvidia.com/deepstream-sdk

Jetson AGX Xavier
h.264 -> 32
h.265 -> 49

Thanks

Hi,
Please share which config you are running:

deepstream_sdk_v4.0_jetson\sources\objectDetector_Yolo\deepstream_app_config_yoloV3.txt
deepstream_sdk_v4.0_jetson\sources\objectDetector_Yolo\deepstream_app_config_yoloV3_tiny.txt

For multiple sources, it can be better to run yoloV3 tiny.

My deepstream_app_config_yoloV3.txt

[application]
enable-perf-measurement=1
perf-measurement-interval-sec=5
#gie-kitti-output-dir=streamscl

[server]
port=2222
wait-for-client-output=0
dump-client-image-crop=0
crop-width=640
crop-height=480
crop-format=1
crop-color-format=1

[tiled-display]
enable=0
rows=1
columns=1
width=1920
height=1080
gpu-id=0
#(0): nvbuf-mem-default - Default memory allocated, specific to particular platform
#(1): nvbuf-mem-cuda-pinned - Allocate Pinned/Host cuda memory, applicable for Tesla
#(2): nvbuf-mem-cuda-device - Allocate Device cuda memory, applicable for Tesla
#(3): nvbuf-mem-cuda-unified - Allocate Unified cuda memory, applicable for Tesla
#(4): nvbuf-mem-surface-array - Allocate Surface Array memory, applicable for Jetson
nvbuf-memory-type=0

[source0]
enable=1
#Type - 1=CameraV4L2 2=URI 3=MultiURI
type=4
# uri=file://../../samples/streams/sample_1080p_h264.mp4
uri=rtsp://admin:Password@192.168.30.61/
num-sources=1
gpu-id=0
# (0): memtype_device   - Memory type Device
# (1): memtype_pinned   - Memory type Host Pinned
# (2): memtype_unified  - Memory type Unified
cudadec-memtype=0

[sink0]
enable=1
#Type - 1=FakeSink 2=EglSink 3=File
##RTSP streaming, otherwise 2
type=4
##----
codec=1
sync=0
source-id=0
gpu-id=0
##STREAMING RTSP
bitrate=2000000
#----
nvbuf-memory-type=0
##STREAMING RTSP
udp-port=5400
rtsp-port=8554
#---

[sink2]
enable=1
type=3
#1=mp4 2=mkv
container=1
#1=h264 2=h265
codec=1
sync=0
#iframeinterval=10
bitrate=2000000
width=1280
height=720
output-file=/media/nvidia/xavier_ssd/output_file/out.mp4
source-id=0

[osd]
enable=1
gpu-id=0
border-width=1
text-size=15
text-color=1;1;1;1;
text-bg-color=0.3;0.3;0.3;1
font=Serif
show-clock=0
clock-x-offset=800
clock-y-offset=820
clock-text-size=12
clock-color=1;0;0;0
nvbuf-memory-type=0

[streammux]
gpu-id=0
##Boolean property to inform muxer that sources are live 
##STREAMING RTSP
live-source=1
##----
batch-size=1
##time out in usec, to wait after the first buffer is available
##to push the batch even if the complete batch is not formed
batched-push-timeout=40000
## Set muxer output width and height
width=1280
height=720
##Enable to maintain aspect ratio wrt source, and allow black borders, works
##along with width, height properties
enable-padding=0
nvbuf-memory-type=0

# config-file property is mandatory for any gie section.
# Other properties are optional and if set will override the properties set in
# the infer config file.
[primary-gie]
enable=1
gpu-id=0
#model-engine-file=model_b1_int8.engine
labelfile-path=labels.txt
batch-size=1
#Required by the app for OSD, not a plugin property
bbox-border-color0=1;0;0;1
bbox-border-color1=0;1;1;1
bbox-border-color2=0;0;1;1
bbox-border-color3=0;1;0;1
#interval=0
gie-unique-id=1
nvbuf-memory-type=0
config-file=config_infer_primary_yoloV3.txt

[tests]
file-loop=0

My config_infer_primary_yoloV3

[property]
gpu-id=0
net-scale-factor=1
#0=RGB, 1=BGR
model-color-format=0
custom-network-config=/opt/nvidia/deepstream/deepstream-4.0/sources/objectDetector_Yolo/yolov3.cfg
model-file=/opt/nvidia/deepstream/deepstream-4.0/sources/objectDetector_Yolo/yolov3.weights
#model-engine-file=model_b1_int8.engine
labelfile-path=labels.txt
int8-calib-file=yolov3-calibration.table.trt5.1
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=0
num-detected-classes=80
gie-unique-id=1
is-classifier=0
maintain-aspect-ratio=1
parse-bbox-func-name=NvDsInferParseCustomYoloV3
custom-lib-path=nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so

[server]
port=2222
wait-for-client-output=0
dump-client-image-crop=0
crop-width=640
crop-height=480
crop-format=1
crop-color-format=1

The result with the local sample file, with RTSP streaming output, isn’t acceptable…
[img]https://ibb.co/d7LyJcZ[/img]
[img]https://ibb.co/c1zXqD6[/img]

Hi,
Could you try config_infer_primary_yoloV3_tiny.txt? It should give better performance. Also, have you run ‘sudo nvpmodel -m 0’ and ‘sudo jetson_clocks’?
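
As a rough sketch of that switch, assuming the rest of the app config stays as posted above, only the inference config referenced from the [primary-gie] group would need to change:

```ini
[primary-gie]
enable=1
gpu-id=0
batch-size=1
gie-unique-id=1
# Point the primary detector at the tiny model's inference config
# instead of config_infer_primary_yoloV3.txt.
config-file=config_infer_primary_yoloV3_tiny.txt
```

Alternatively, the sample's own deepstream_app_config_yoloV3_tiny.txt can serve as the starting point.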

Yes, it works, but detection accuracy with yolov3 tiny is lower than with yolov3.

What is the maximum number of sources that Xavier can process?

Thanks

Hi,

We got around 26 fps on the Jetson Xavier.
There have been several earlier discussions on YOLO performance. You can check them for some ideas:

1. Update deepstream config
https://devtalk.nvidia.com/default/topic/1058668/deepstream-sdk/-tx2-yolo-v2-tiny-handles-video-slowly-

2. Update model size directly
https://devtalk.nvidia.com/default/topic/1058408/deepstream-sdk/yolov3-fps-on-xavier/post/5378683/#5378683
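
For illustration, a config change of the first kind might look like the sketch below. This is an assumption about typical multi-source tuning, not a summary of the linked threads: the second camera URI is a placeholder, and interval=2 is only an example value.

```ini
# One [sourceN] group per RTSP camera.
[source0]
enable=1
type=4
uri=rtsp://admin:Password@192.168.30.61/
num-sources=1
gpu-id=0
cudadec-memtype=0

[source1]
enable=1
type=4
# Placeholder address, replace with the second camera's URI.
uri=rtsp://<second-camera-address>/
num-sources=1
gpu-id=0
cudadec-memtype=0

[streammux]
gpu-id=0
live-source=1
# Batch size matched to the number of enabled sources.
batch-size=2
batched-push-timeout=40000
width=1280
height=720

[primary-gie]
enable=1
gpu-id=0
# Run inference on both streams in one batch.
batch-size=2
# Skip 2 consecutive batches between inference runs to reduce GPU load.
interval=2
gie-unique-id=1
config-file=config_infer_primary_yoloV3.txt
```

Skipped batches are not run through the detector, so a higher interval trades detection rate for throughput.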

Thanks.

Hi,

It looks like the application is running the model with float32 precision.

## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=0

Switching to INT8 or FP16 should improve performance.
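
A minimal sketch of that change in config_infer_primary_yoloV3.txt, assuming the other properties stay as posted and model-engine-file remains commented out:

```ini
[property]
## 0=FP32, 1=INT8, 2=FP16 mode
# FP16: no calibration table is needed.
network-mode=2
# For INT8 instead, set network-mode=1 and keep the calibration table:
#int8-calib-file=yolov3-calibration.table.trt5.1
```

The TensorRT engine is rebuilt for the new precision on the next start, so the first launch after the change can take a while.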
Thanks.

Hi,

I’ve heard DeepStream’s Yolo sample is only optimized for Tegra, right?

Take a look at the following link:
https://devtalk.nvidia.com/default/topic/1049402/deepstream-sdk/deepstream-yolo-app-performance-vs-tensor-core-optimized-yolo-darknet/

Is this solved in the newest version of the DeepStream SDK (version 4)?

Hi,

DeepStream now natively supports the standard YOLO models (yolov2, v2-tiny, v3 and v3-tiny) in the nvinfer plugin, which works on all platforms. You can find this in the SDK sources under sources/objectDetector_Yolo/