DeepStream GST-NVSTREAMMUX change width and height doesn't affect FPS

Hi,
I’m trying to test YoloV3 with DeepStream 4.0.1.
I also use “perf” element from “gst-perf” plugin: https://github.com/RidgeRun/gst-perf
What I got is 1.9 FPS with 1920x1080 video.
I tried to change the width and height of Gst-Nvstreammux to width=480, height=320.
The output is exactly that size, but the fps doesn’t change.
Is changing Gst-Nvstreammux a way to improve fps?
Is there any other way to improve fps?

Hi,

I assume your platform in Jetson nano since you report 2FPS for yolov3. This is a compute intensive model, and those computations are independent of input video resolution. To increase the FPS you can try the following -

  1. switch to yolov3-tiny
  2. Use fp16 mode for inference
  3. Reduce the network input resolution (in yolov3.cfg) of the network from
width=608
height=608

to

width=416
height=416
  1. You can also use a tracker and avoid doing inference every frame this will reduce GPU utilization and increase throughput. Use nvinfer plugins “interval” property to set and tune it according to your use-case.

deepstream-app config file

################################################################################
# Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved.
#
# Permission is hereby granted, free of charge, to any person obtaining a
# copy of this software and associated documentation files (the "Software"),
# to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense,
# and/or sell copies of the Software, and to permit persons to whom the
# Software is furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
# DEALINGS IN THE SOFTWARE.
################################################################################

[application]
enable-perf-measurement=1
perf-measurement-interval-sec=5
#gie-kitti-output-dir=streamscl

[tiled-display]
enable=1
rows=1
columns=1
width=1280
height=720
gpu-id=0
#(0): nvbuf-mem-default - Default memory allocated, specific to particular platform
#(1): nvbuf-mem-cuda-pinned - Allocate Pinned/Host cuda memory, applicable for Tesla
#(2): nvbuf-mem-cuda-device - Allocate Device cuda memory, applicable for Tesla
#(3): nvbuf-mem-cuda-unified - Allocate Unified cuda memory, applicable for Tesla
#(4): nvbuf-mem-surface-array - Allocate Surface Array memory, applicable for Jetson
nvbuf-memory-type=0

[source0]
enable=1
#Type - 1=CameraV4L2 2=URI 3=MultiURI
type=3
uri=file://../../samples/streams/sample_1080p_h264.mp4
num-sources=1
gpu-id=0
# (0): memtype_device   - Memory type Device
# (1): memtype_pinned   - Memory type Host Pinned
# (2): memtype_unified  - Memory type Unified
cudadec-memtype=0

[sink0]
enable=1
#Type - 1=FakeSink 2=EglSink 3=File
type=2
sync=0
source-id=0
gpu-id=0
nvbuf-memory-type=0

[osd]
enable=1
gpu-id=0
border-width=1
text-size=15
text-color=1;1;1;1;
text-bg-color=0.3;0.3;0.3;1
font=Serif
show-clock=0
clock-x-offset=800
clock-y-offset=820
clock-text-size=12
clock-color=1;0;0;0
nvbuf-memory-type=0

[streammux]
gpu-id=0
##Boolean property to inform muxer that sources are live
live-source=0
batch-size=1
##time out in usec, to wait after the first buffer is available
##to push the batch even if the complete batch is not formed
batched-push-timeout=40000
## Set muxer output width and height
width=1920
height=1080
##Enable to maintain aspect ratio wrt source, and allow black borders, works
##along with width, height properties
enable-padding=0
nvbuf-memory-type=0

# config-file property is mandatory for any gie section.
# Other properties are optional and if set will override the properties set in
# the infer config file.
[primary-gie]
enable=1
gpu-id=0
model-engine-file=model_b1_fp16.engine
labelfile-path=labels.txt
batch-size=1
#Required by the app for OSD, not a plugin property
bbox-border-color0=1;0;0;1
bbox-border-color1=0;1;1;1
bbox-border-color2=0;0;1;1
bbox-border-color3=0;1;0;1
interval=5
gie-unique-id=1
nvbuf-memory-type=0
config-file=config_infer_primary_yoloV3.txt

[tracker]
enable=1
tracker-width=480
tracker-height=272
ll-lib-file=/opt/nvidia/deepstream/deepstream-4.0/lib/libnvds_mot_klt.so
gpu-id=0

[tests]
file-loop=0

nvinfer config file

################################################################################
# Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved.
#
# Permission is hereby granted, free of charge, to any person obtaining a
# copy of this software and associated documentation files (the "Software"),
# to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense,
# and/or sell copies of the Software, and to permit persons to whom the
# Software is furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
# DEALINGS IN THE SOFTWARE.
################################################################################

# Following properties are mandatory when engine files are not specified:
#   int8-calib-file(Only in INT8), model-file-format
#   Caffemodel mandatory properties: model-file, proto-file, output-blob-names
#   UFF: uff-file, input-dims, uff-input-blob-name, output-blob-names
#   ONNX: onnx-file
#
# Mandatory properties for detectors:
#   num-detected-classes
#
# Optional properties for detectors:
#   enable-dbscan(Default=false), interval(Primary mode only, Default=0)
#   custom-lib-path
#   parse-bbox-func-name
#
# Mandatory properties for classifiers:
#   classifier-threshold, is-classifier
#
# Optional properties for classifiers:
#   classifier-async-mode(Secondary mode only, Default=false)
#
# Optional properties in secondary mode:
#   operate-on-gie-id(Default=0), operate-on-class-ids(Defaults to all classes),
#   input-object-min-width, input-object-min-height, input-object-max-width,
#   input-object-max-height
#
# Following properties are always recommended:
#   batch-size(Default=1)
#
# Other optional properties:
#   net-scale-factor(Default=1), network-mode(Default=0 i.e FP32),
#   model-color-format(Default=0 i.e. RGB) model-engine-file, labelfile-path,
#   mean-file, gie-unique-id(Default=0), offsets, gie-mode (Default=1 i.e. primary),
#   custom-lib-path, network-mode(Default=0 i.e FP32)
#
# The values in the config file are overridden by values set through GObject
# properties.

[property]
gpu-id=0
net-scale-factor=1
#0=RGB, 1=BGR
model-color-format=0
custom-network-config=yolov3.cfg
model-file=yolov3.weights
model-engine-file=model_b1_fp16.engine
labelfile-path=labels.txt
int8-calib-file=yolov3-calibration.table.trt5.1
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=2
num-detected-classes=80
gie-unique-id=1
is-classifier=0
maintain-aspect-ratio=1
parse-bbox-func-name=NvDsInferParseCustomYoloV3
custom-lib-path=nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so

I have made changes specified the comment above in these config files and we can achieve a throughput of 20 FPS. Please make sure to change the height and width to 416 in yolov3.cfg before generating the engine file. If the tracking results are bad for your test video, you can reduce the interval to improve the accuracy further but the FPS will drop as well. Its trade-off that needs to be tuned for your use-case.

Also, run this command before running the application to increase clock frequencies

sudo jetson_clocks

Thanks for you help.
Changing the height and width to 416 in yolov3.cfg works.

Hi @CJR
I’m using yolov3 on my jetson nano currently at ~1-2 FPS and want to achieve a faster result and I found your post!

Did you ge the 20 FPS performance (with your the configuration files) using a yolov3 or a yolov3-tiny model? I want to understand this before deciding to implement it in Deepstream.

cheers and thanks in advance,
Enrique

yolov3

thanks @CJR !

When you say you “achieve a throughput of 20 FPS”, you do not mean that you are doing 20 yolov3 inferences per second. Rather, because of the “interval=5” setting, my understanding is that yolov3 inference is happening every fifth (or sixth?) frame. So the actual number of yolov3 inferences per second is 3-4? Can you clarify?

Hi gabriel.nell,

Please open a new topic for your issue. Thanks

That’s right, but the tracker is used for all the frames when inference is skipped by the nvinfer plugin.