Deepstream 5.1 inference caps at 30fps

EbinJose · October 25, 2021, 11:41am

I launch DeepStream 5.1 using the following Docker command:

`docker run --gpus all -it --rm -v /tmp/.X11-unix:/tmp/.X11-unix -e DISPLAY=$DISPLAY -w /opt/nvidia/deepstream/deepstream-5.1 nvcr.io/nvidia/deepstream:5.1-21.02-triton`

I run inference using:

deepstream-app -c /opt/nvidia/deepstream/deepstream-5.1/samples/configs/deepstream-app/source4_1080p_dec_infer-resnet_tracker_sgie_tiled_display_int8_gpu1.txt

The problem is that I get 4 streams, each averaging 30 fps, for a combined 120 fps.

I want to run inference on a single sample video only. How do I change the number of sample videos?
Only one GPU is being utilised, and only at about 15%; the fps does not go over 30.
I tried setting the sinks to FakeSink and EglSink and disabled the tiled display. How do I maximise the fps, possibly to 1000+?

TensorRT Version: 7.2.1.6
Quadro RTX 5000 dual GPU
Driver Version: 455.23.05
CUDA Version: 11.1
Ubuntu 18.04
python 3.6

Amycao · October 26, 2021, 2:32am

Please refer to this for how to reach maximum fps: the "Performance" page of the DeepStream 6.1.1 Release documentation.

EbinJose · October 26, 2021, 5:18am

Thank you. I had followed that, but it only raised the fps by 3 to 4. So a single stream caps at around 30 fps, and a more powerful GPU lets us run more streams, such as 30 or 40, according to its compute power.

Can you please let me know:
1. Is there any script to run at 1000 fps (inside DeepStream)?
2. Is there a YOLO inference script or prebuilt model?
3. How do I run inference simultaneously on 30 or 40 streams?
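The arithmetic behind "30 fps per stream, many streams per GPU" can be sketched as follows. This is only an illustration: it assumes every stream holds roughly 30 fps and that the GPU is not yet saturated, and the function names are hypothetical.

```python
import math


def aggregate_fps(per_stream_fps, num_streams):
    """Total frames inferred per second across all streams."""
    return per_stream_fps * num_streams


def streams_needed(target_fps, per_stream_fps):
    """Smallest stream count whose combined rate reaches target_fps."""
    return math.ceil(target_fps / per_stream_fps)


print(aggregate_fps(30, 4))      # 120, the combined rate reported above
print(streams_needed(1000, 30))  # 34 streams for 1000 fps combined
```

Under these assumptions, a combined 1000 fps corresponds to about 34 streams at 30 fps each, rather than one stream running at 1000 fps.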

Amycao · October 26, 2021, 6:54am

1. Is there any script to run at 1000 fps (inside DeepStream)?
2. Is there a YOLO inference script or prebuilt model?
[amycao] You can run with multiple streams to reach your GPU's compute capability.
3. How do I run inference simultaneously on 30 or 40 streams?
[amycao] Set multiple streams in the config file.

One example:

[source0]
enable=1
#Type - 1=CameraV4L2 2=URI 3=MultiURI 4=RTSP
type=3
uri=file://…/…/streams/sample_1080p_h264.mp4
num-sources=15
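A rough way to see how many streams a config like this asks for is to sum `num-sources` over the enabled `[sourceN]` groups. The sketch below uses Python's standard `configparser`; the file path in the sample text is a placeholder, `count_streams` is a hypothetical helper, and it assumes `num-sources` only applies to MultiURI sources (`type=3`).

```python
import configparser
import re

# Sample text modelled on the example above; the URI path is a placeholder.
SAMPLE = """
[source0]
enable=1
type=3
uri=file:///path/to/sample_1080p_h264.mp4
num-sources=15

[source1]
enable=0
type=3
uri=file:///path/to/sample_1080p_h264.mp4
num-sources=15
"""


def count_streams(config_text):
    """Sum the streams requested by every enabled [sourceN] group."""
    # interpolation=None keeps '%' characters in URIs from being interpreted.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(config_text)
    total = 0
    for name in parser.sections():
        if not re.fullmatch(r"source\d+", name):
            continue
        section = parser[name]
        if section.getint("enable", fallback=0) != 1:
            continue  # disabled groups contribute no streams
        if section.getint("type", fallback=0) == 3:
            # Assumption: MultiURI replicates the URI num-sources times.
            total += section.getint("num-sources", fallback=1)
        else:
            total += 1
    return total


print(count_streams(SAMPLE))  # 15
```

Only `[source0]` is enabled in the sample, so the disabled `[source1]` group does not add to the total.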

EbinJose · October 26, 2021, 11:07am

I ran 30 streams using file1:

source30_1080p_dec_infer-resnet_tiled_display_int8.txt (4.7 KB)

and 4 streams using file2:

source4_1080p_dec_infer-resnet_tracker_sgie_tiled_display_int8_gpu1.txt (5.8 KB)

So, as you stated:

[source0]
enable=1
#Type - 1=CameraV4L2 2=URI 3=MultiURI 4=RTSP
type=3
uri=file://…/…/streams/sample_1080p_h264.mp4
num-sources=15

Changing num-sources doesn't change the number of streams; it is still 4. I want to know which line causes the app to run 4 streams or 30 streams.

I think batch-size is where the number of streams is reflected, as file1 has 30 and file2 has 4, but that doesn't make much sense. Playing around with batch-size in both the source and config files did not give any useful result.

[amycao] You can run with multi streams, to reach your GPU computation capability.
[amycao] set multi streams in config file.
I can't see any multi-stream option in the config file.

Amycao · October 27, 2021, 3:00am

Yes, you also need to change batch-size in the pgie and the streammux batch-size to the number of sources. Setting batch-size to the number of sources in the nvinfer element lets the GPU run inference on all of them simultaneously.
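The advice above amounts to keeping three numbers equal: the number of sources, the `[streammux]` batch-size, and the `[primary-gie]` batch-size. A minimal sketch of that check, using Python's standard `configparser`, might look like this; the sample text and the helper name `batch_size_mismatches` are illustrative only.

```python
import configparser

# Illustrative config: the primary-gie batch-size was left at 30 by mistake.
APP_CONFIG = """
[source0]
enable=1
type=3
num-sources=60

[streammux]
batch-size=60

[primary-gie]
enable=1
batch-size=30
"""


def batch_size_mismatches(config_text, num_streams):
    """Return (section, batch-size) pairs that differ from num_streams."""
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(config_text)
    problems = []
    for section in ("streammux", "primary-gie"):
        if not parser.has_section(section):
            continue
        # None means the key is absent, which is reported as a mismatch too.
        size = parser[section].getint("batch-size", fallback=None)
        if size != num_streams:
            problems.append((section, size))
    return problems


print(batch_size_mismatches(APP_CONFIG, 60))  # [('primary-gie', 30)]
```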

EbinJose · October 27, 2021, 7:27am

What do you mean by pgie and nvinfer element?

Here is my source 30 file:

[application]
enable-perf-measurement=1
perf-measurement-interval-sec=5
#gie-kitti-output-dir=streamscl

[tiled-display]
enable=0
rows=1
columns=6
width=1280
height=720
gpu-id=1
#(0): nvbuf-mem-default - Default memory allocated, specific to particular platform
#(1): nvbuf-mem-cuda-pinned - Allocate Pinned/Host cuda memory, applicable for Tesla
#(2): nvbuf-mem-cuda-device - Allocate Device cuda memory, applicable for Tesla
#(3): nvbuf-mem-cuda-unified - Allocate Unified cuda memory, applicable for Tesla
#(4): nvbuf-mem-surface-array - Allocate Surface Array memory, applicable for Jetson
nvbuf-memory-type=0

[source0]
enable=1
#Type - 1=CameraV4L2 2=URI 3=MultiURI 4=RTSP
type=3
uri=file://…/…/streams/sample_qHD.mp4
num-sources=60
#drop-frame-interval=2
gpu-id=1

# (0): memtype_device - Memory type Device
# (1): memtype_pinned - Memory type Host Pinned
# (2): memtype_unified - Memory type Unified
cudadec-memtype=0

[source1]
enable=0
#Type - 1=CameraV4L2 2=URI 3=MultiURI 4=RTSP
type=3
uri=file://…/…/streams/sample_1080p_h264.mp4
num-sources=60
gpu-id=0

# (0): memtype_device - Memory type Device
# (1): memtype_pinned - Memory type Host Pinned
# (2): memtype_unified - Memory type Unified
cudadec-memtype=0

[sink0]
enable=1
#Type - 1=FakeSink 2=EglSink 3=File
type=2
sync=1
source-id=0
gpu-id=1
nvbuf-memory-type=0

[sink1]
enable=0
type=2
#1=mp4 2=mkv
container=1
#1=h264 2=h265
codec=1
#encoder type 0=Hardware 1=Software
enc-type=0
sync=0
#iframeinterval=10
bitrate=2000000
#H264 Profile - 0=Baseline 2=Main 4=High
#H265 Profile - 0=Main 1=Main10
profile=0
output-file=out.mp4
source-id=0

[sink2]
enable=1
#Type - 1=FakeSink 2=EglSink 3=File 4=RTSPStreaming
type=2
#1=h264 2=h265
codec=1
#encoder type 0=Hardware 1=Software
enc-type=0
sync=0
bitrate=4000000
#H264 Profile - 0=Baseline 2=Main 4=High
#H265 Profile - 0=Main 1=Main10
profile=4

# set below properties in case of RTSPStreaming
rtsp-port=8554
udp-port=5400

[osd]
enable=1
gpu-id=1
border-width=1
text-size=15
text-color=1;1;1;1;
text-bg-color=0.3;0.3;0.3;1
font=Serif
show-clock=0
clock-x-offset=800
clock-y-offset=820
clock-text-size=12
clock-color=1;0;0;0
nvbuf-memory-type=0

[streammux]
gpu-id=1
##Boolean property to inform muxer that sources are live
live-source=0
batch-size=60
##time out in usec, to wait after the first buffer is available
##to push the batch even if the complete batch is not formed
batched-push-timeout=40000

# Set muxer output width and height
width=1920
height=1080
##Enable to maintain aspect ratio wrt source, and allow black borders, works
##along with width, height properties
enable-padding=0
nvbuf-memory-type=0

# If set to TRUE, system timestamp will be attached as ntp timestamp
# If set to FALSE, ntp timestamp from rtspsrc, if available, will be attached
# attach-sys-ts-as-ntp=1

# config-file property is mandatory for any gie section.
# Other properties are optional and if set will override the properties set in
# the infer config file.

[primary-gie]
enable=1
gpu-id=1
model-engine-file=…/…/models/Primary_Detector/resnet10.caffemodel_b30_gpu0_int8.engine
#Required to display the PGIE labels, should be added even when using config-file
#property
batch-size=60
#Required by the app for OSD, not a plugin property
bbox-border-color0=1;0;0;1
bbox-border-color1=0;1;1;1
bbox-border-color2=0;0;1;1
bbox-border-color3=0;1;0;1
interval=0
#Required by the app for SGIE, when used along with config-file property
gie-unique-id=1
nvbuf-memory-type=1
config-file=config_infer_primary.txt

[tests]

###################
###################
##################

here is my config file

################################################################################
# Copyright (c) 2018-2020, NVIDIA CORPORATION. All rights reserved.
#
# Permission is hereby granted, free of charge, to any person obtaining a
# copy of this software and associated documentation files (the "Software"),
# to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense,
# and/or sell copies of the Software, and to permit persons to whom the
# Software is furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
# DEALINGS IN THE SOFTWARE.
################################################################################

# Following properties are mandatory when engine files are not specified:
#   int8-calib-file(Only in INT8)
#   Caffemodel mandatory properties: model-file, proto-file, output-blob-names
#   UFF: uff-file, input-dims, uff-input-blob-name, output-blob-names
#   ONNX: onnx-file
#
# Mandatory properties for detectors:
#   num-detected-classes
#
# Optional properties for detectors:
#   cluster-mode(Default=Group Rectangles), interval(Primary mode only, Default=0)
#   custom-lib-path,
#   parse-bbox-func-name
#
# Mandatory properties for classifiers:
#   classifier-threshold, is-classifier
#
# Optional properties for classifiers:
#   classifier-async-mode(Secondary mode only, Default=false)
#
# Optional properties in secondary mode:
#   operate-on-gie-id(Default=0), operate-on-class-ids(Defaults to all classes),
#   input-object-min-width, input-object-min-height, input-object-max-width,
#   input-object-max-height
#
# Following properties are always recommended:
#   batch-size(Default=1)
#
# Other optional properties:
#   net-scale-factor(Default=1), network-mode(Default=0 i.e FP32),
#   model-color-format(Default=0 i.e. RGB) model-engine-file, labelfile-path,
#   mean-file, gie-unique-id(Default=0), offsets, process-mode (Default=1 i.e. primary),
#   custom-lib-path, network-mode(Default=0 i.e FP32)
#
# The values in the config file are overridden by values set through GObject
# properties.

[property]
gpu-id=1
net-scale-factor=0.0039215697906911373
model-file=…/…/models/Primary_Detector/resnet10.caffemodel
proto-file=…/…/models/Primary_Detector/resnet10.prototxt
model-engine-file=…/…/models/Primary_Detector/resnet10.caffemodel_b30_gpu0_int8.engine
labelfile-path=…/…/models/Primary_Detector/labels.txt
int8-calib-file=…/…/models/Primary_Detector/cal_trt.bin
batch-size=60
process-mode=1
model-color-format=0

# 0=FP32, 1=INT8, 2=FP16 mode

network-mode=1
num-detected-classes=4
interval=0
gie-unique-id=1
output-blob-names=conv2d_bbox;conv2d_cov/Sigmoid
force-implicit-batch-dim=1
#parse-bbox-func-name=NvDsInferParseCustomResnet
#custom-lib-path=/path/to/libnvdsparsebbox.so

# 0=Group Rectangles, 1=DBSCAN, 2=NMS, 3= DBSCAN+NMS Hybrid, 4 = None(No clustering)

#cluster-mode=1
#scaling-filter=0
#scaling-compute-hw=0

#Use these config params for group rectangles clustering mode
[class-attrs-all]
pre-cluster-threshold=0.2
group-threshold=1
eps=0.2
roi-top-offset=0
roi-bottom-offset=0
detected-min-w=0
detected-min-h=0
detected-max-w=0
detected-max-h=0

#Use the config params below for dbscan clustering mode
#[class-attrs-all]
#detected-min-w=4
#detected-min-h=4
#minBoxes=3

# Per class configurations

#[class-attrs-0]
#pre-cluster-threshold=0.05
#eps=0.7
#dbscan-min-score=0.95

#[class-attrs-1]
#pre-cluster-threshold=0.05
#eps=0.7
#dbscan-min-score=0.5

#[class-attrs-2]
#pre-cluster-threshold=0.1
#eps=0.6
#dbscan-min-score=0.95

#[class-attrs-3]
#pre-cluster-threshold=0.05
#eps=0.7
#dbscan-min-score=0.5

I have set the number of sources and the batch size to 60 in both of these files, but it is still giving me 30 streams.
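Two details in the files as posted may be worth checking, although neither is confirmed in this thread. First, `[sink0]` is enabled with `sync=1`; a sink that synchronises to the clock generally plays a file at its recorded rate, which would hold a 30 fps video at about 30 fps per stream. Second, the engine file name contains `_b30_`, which presumably encodes the batch size the engine was built for, while batch-size is set to 60. The sketch below flags both using only Python's standard library; the helper names are hypothetical and the engine path is shortened to its file name.

```python
import configparser
import re

# Reduced copy of the relevant groups from the posted file.
POSTED = """
[sink0]
enable=1
type=2
sync=1

[sink1]
enable=0
type=2
sync=0

[sink2]
enable=1
type=2
sync=0

[primary-gie]
enable=1
batch-size=60
model-engine-file=resnet10.caffemodel_b30_gpu0_int8.engine
"""


def load(config_text):
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(config_text)
    return parser


def clock_synced_sinks(parser):
    """Enabled [sinkN] groups that explicitly set sync=1."""
    return [
        name
        for name in parser.sections()
        if re.fullmatch(r"sink\d+", name)
        and parser[name].getint("enable", fallback=0) == 1
        and parser[name].getint("sync", fallback=0) == 1
    ]


def engine_batch_vs_config(parser):
    """(batch size parsed from the engine file name, configured batch-size)."""
    gie = parser["primary-gie"]
    # Assumption: '_b<N>_' in the file name is the batch size of the engine.
    match = re.search(r"_b(\d+)_", gie.get("model-engine-file", fallback=""))
    built_for = int(match.group(1)) if match else None
    return built_for, gie.getint("batch-size", fallback=None)


parser = load(POSTED)
print(clock_synced_sinks(parser))      # ['sink0']
print(engine_batch_vs_config(parser))  # (30, 60)
```

If the first check reports a sink, trying `sync=0` on it is a low-cost experiment when measuring maximum throughput from file sources.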

Amycao · October 27, 2021, 9:00am

nvinfer is the element (plugin) type; pgie and sgie are the names given to the created element instances, short for primary and secondary GPU inference engine.

EbinJose · November 1, 2021, 6:08am

Sorry, but I still don't understand. When I look for nvinfer, there are multiple files with the same name. Do I have to edit one of these files? What are pgie and sgie? I am kind of lost.

Amycao · November 1, 2021, 6:14am

For your reference:
GstElement *pgie = NULL, *sgie1 = NULL;

pgie = gst_element_factory_make ("nvinfer", "primary-nvinference-engine");

sgie1 = gst_element_factory_make ("nvinfer", "secondary1-nvinference-engine");

system · November 23, 2021, 2:38am

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
