I have been benchmarking several models on DeepStream with RTSP streams. The results indicate that I cannot run more than 5 real-time streams without a drop in frame rate, and the drop becomes significant as I add more streams.
The model I am using is ResNet-10, although I see the same behaviour with the custom YOLO implementation provided with DeepStream.
With ResNet, the frame rate drops from 30 FPS per stream (near real time) with 1 RTSP stream to 18 FPS per stream with 10 streams. With YOLO, it drops from 30 FPS to 8 FPS per stream as the number of streams increases from 1 to 10.
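I am not sure whether the claimed throughput is quoted per stream or as a total across all streams, so here is a small sketch that converts my per-stream figures into aggregate numbers. The function name `aggregate_fps` and the dictionary layout are only for illustration; the FPS values are the ones reported above.

```python
# Per-stream FPS figures reported above, keyed by number of RTSP streams.
measurements = {
    "ResNet-10": {1: 30, 10: 18},
    "YOLO": {1: 30, 10: 8},
}


def aggregate_fps(num_streams, fps_per_stream):
    """Total frames processed per second across all streams."""
    return num_streams * fps_per_stream


for model, by_streams in measurements.items():
    for num_streams, fps in sorted(by_streams.items()):
        total = aggregate_fps(num_streams, fps)
        print(f"{model}: {num_streams} stream(s) x {fps} FPS = {total} FPS aggregate")
```

So the pipeline still processes more frames in total at 10 streams than at 1 stream; it is the per-stream rate that falls below real time.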
Here is the DeepStream config file:
[application]
enable-perf-measurement=1
perf-measurement-interval-sec=2
kitti-track-output-dir=/nvme/test/metadata_fahad_rtsp_1
#gie-kitti-output-dir=streamscl
[tiled-display]
enable=0
rows=1
columns=1
width=1280
height=720
gpu-id=2
#(0): nvbuf-mem-default - Default memory allocated, specific to particular platform
#(1): nvbuf-mem-cuda-pinned - Allocate Pinned/Host cuda memory
#(2): nvbuf-mem-cuda-device - Allocate Device cuda memory
#(3): nvbuf-mem-cuda-unified - Allocate Unified cuda memory
#(4): nvbuf-mem-surface-array - Allocate Surface Array memory, applicable for Jetson
#(5): nvbuf-mem-handle - Allocate Surface Handle memory, applicable for Jetson
#(6): nvbuf-mem-system - Allocate Surface System memory, allocated using calloc
nvbuf-memory-type=0
[source0]
enable=1
#Type - 1=CameraV4L2 2=URI 3=MultiURI 4=RTSP
type=4
#uri=file:/dfs/AutomationWorkspace/EncodedVideos/20191016-150001/camera16/cam16Concat_28fps.mp4
uri=rtsp://153.64.131.17/stream
gpu-id=2
# (0): memtype_device - Memory type Device
# (1): memtype_pinned - Memory type Host Pinned
# (2): memtype_unified - Memory type Unified
cudadec-memtype=0
[sink0]
enable=1
type=1
#1=mp4 2=mkv
#1=h264 2=h265 3=mpeg4
## only SW mpeg4 is supported right now.
qos=0
sync=0
gpu-id=2
iframeinterval=10
output-file=/software/Video_Output_Fahad/Out_RTSP_0.mp4
container=1
codec=3
source-id=0
#end
[osd]
enable=1
gpu-id=2
border-width=1
text-size=20
text-color=1;1;1;1;
text-bg-color=0.3;0.3;0.3;1
font=Arial
process-mode=1
show-clock=0
clock-x-offset=800
clock-y-offset=820
clock-text-size=12
clock-color=1;0;0;0
nvbuf-memory-type=0
[streammux]
gpu-id=2
##Boolean property to inform muxer that sources are live
live-source=1
batch-size=4
##time out in usec, to wait after the first buffer is available
##to push the batch even if the complete batch is not formed
batched-push-timeout=1000
## Set muxer output width and height
width=1280
height=720
#num-surfaces-per-frame=31
##Enable to maintain aspect ratio wrt source, and allow black borders, works
##along with width, height properties
enable-padding=0
nvbuf-memory-type=0
# config-file property is mandatory for any gie section.
# Other properties are optional and if set will override the properties set in
# the infer config file.
[primary-gie]
enable=1
gpu-id=2
#model-engine-file=model_b4_int8.engine
labelfile-path=labels.txt
batch-size=4
#Required by the app for OSD, not a plugin property
bbox-border-color0=1;0;0;1
bbox-border-color1=0;1;1;1
bbox-border-color2=0;0;1;1
bbox-border-color3=0;1;0;1
interval=1
gie-unique-id=1
nvbuf-memory-type=0
config-file=config_infer_primary_yoloV3_Fahad.txt
[tracker]
enable=0
tracker-width=320
tracker-height=180
#ll-lib-file=/usr/local/deepstream/libnvds_mot_iou.so
#ll-lib-file=/opt/nvidia/deepstream/deepstream-4.0/lib/libnvds_mot_klt.so
#ll-lib-file=/usr/local/deepstream/libnvds_mot_klt.so
#ll-lib-file=/usr/local/deepstream/libnvds_tracker.so
ll-lib-file=/opt/nvidia/deepstream/deepstream-4.0/lib/libnvds_nvdcf.so
#ll-config-file required for IOU only
ll-config-file=/root/deepstream_sdk_v4.0_x86_64/samples/configs/deepstream-app/tracker_config.yml
#ll-config-file=iou_config.txt
gpu-id=2
enable-batch-process=1
[tests]
file-loop=0
And here is the inference config file:
[property]
net-scale-factor=1
#0=RGB, 1=BGR
model-color-format=0
custom-network-config=/root/deepstream_sdk_v4.0_x86_64/sources/objectDetector_Yolo/yolov3.cfg
model-file=/root/deepstream_sdk_v4.0_x86_64/sources/objectDetector_Yolo/yolo-obj_20000.weights
#model-engine-file=model_b1_int8.engine
labelfile-path=labels.txt
#int8-calib-file=yolov3-calibration.table.trt5.1
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=2
num-detected-classes=80
gie-unique-id=1
is-classifier=0
maintain-aspect-ratio=1
parse-bbox-func-name=NvDsInferParseCustomYoloV3
custom-lib-path=/root/deepstream_sdk_v4.0_x86_64/sources/objectDetector_Yolo/nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
These numbers do not match the throughput claimed for DeepStream. Is there a problem with my config files?
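One thing I can check myself is whether the batch sizes line up with the number of sources. My understanding, which may be wrong, is that the `[streammux]` and `[primary-gie]` `batch-size` values are normally expected to match the number of input streams. The sketch below is only a rough check under that assumption: `check_batch_sizes` and `SAMPLE` are hypothetical names, `SAMPLE` is a trimmed excerpt of the config above, and the parsing uses Python's standard `configparser` rather than anything from DeepStream. It counts only enabled `[sourceN]` sections and ignores multi-URI sources.

```python
import configparser

# Trimmed excerpt of the deepstream-app config shown above.
SAMPLE = """
[source0]
enable=1
type=4
uri=rtsp://153.64.131.17/stream

[streammux]
live-source=1
batch-size=4

[primary-gie]
enable=1
batch-size=4
"""


def check_batch_sizes(config_text):
    """Return (enabled_sources, streammux_batch, gie_batch) for a config string."""
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(config_text)
    # Count [sourceN] sections that have enable=1.
    enabled_sources = sum(
        1
        for name in parser.sections()
        if name.startswith("source")
        and parser[name].get("enable", "0").strip() == "1"
    )
    mux_batch = parser.getint("streammux", "batch-size", fallback=None)
    gie_batch = parser.getint("primary-gie", "batch-size", fallback=None)
    return enabled_sources, mux_batch, gie_batch


sources, mux_batch, gie_batch = check_batch_sizes(SAMPLE)
print(f"enabled sources={sources}, streammux batch-size={mux_batch}, "
      f"primary-gie batch-size={gie_batch}")
if mux_batch != sources or gie_batch != sources:
    print("batch-size does not match the number of enabled sources")
```

For the excerpt above this reports one enabled source against a batch size of 4 in both sections, which may or may not be related to the frame rate drop.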