Deepstream Nvtracker, bounding boxes issues

Hello Rik,

I’ve used YoloV3 provided in DeepStream and ran the following deepstream-app config with detection interval of 2 for 20FPS stream you provided. I verify that everything looked fine with these pipeline. No bbox ID changes for moving cars, and bboxes are following cars properly. I notice a bit of drag of bboxes for the frames detection is not done, but I believe you can tweak the params more, you should be able to improve the behavior. I generated an output video, but I couldn’t upload it here due to restrictions in forum reply.

Below is deepstream-app config I used. Please note that I used 960x544 in tracker width/height:

################################################################################
# Copyright (c) 2019-2020, NVIDIA CORPORATION. All rights reserved.
#
# Permission is hereby granted, free of charge, to any person obtaining a
# copy of this software and associated documentation files (the "Software"),
# to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense,
# and/or sell copies of the Software, and to permit persons to whom the
# Software is furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
# DEALINGS IN THE SOFTWARE.
################################################################################

[application]
enable-perf-measurement=1
perf-measurement-interval-sec=5
#gie-kitti-output-dir=streamscl

[tiled-display]
enable=1
rows=1
columns=1
width=1280
height=720
gpu-id=0
#(0): nvbuf-mem-default - Default memory allocated, specific to particular platform
#(1): nvbuf-mem-cuda-pinned - Allocate Pinned/Host cuda memory, applicable for Tesla
#(2): nvbuf-mem-cuda-device - Allocate Device cuda memory, applicable for Tesla
#(3): nvbuf-mem-cuda-unified - Allocate Unified cuda memory, applicable for Tesla
#(4): nvbuf-mem-surface-array - Allocate Surface Array memory, applicable for Jetson
nvbuf-memory-type=0

[source0]
enable=1
#Type - 1=CameraV4L2 2=URI 3=MultiURI
type=3
#uri=file://../../../streams/sample_1080p_h264.mp4
#uri=file:///home/pshin/Downloads/NvBUGs/Issue_122834/axisOriginal_6FPS.mkv
uri=file:///home/pshin/Downloads/NvBUGs/Issue_122834/axisOriginal_20FPS.mkv
num-sources=1
gpu-id=0
# (0): memtype_device   - Memory type Device
# (1): memtype_pinned   - Memory type Host Pinned
# (2): memtype_unified  - Memory type Unified
cudadec-memtype=0

[sink0]
enable=1
#Type - 1=FakeSink 2=EglSink 3=File
type=2
sync=0
source-id=0
gpu-id=0
nvbuf-memory-type=0

[sink2]
enable=1
type=3
#1=mp4 2=mkv
container=1
#1=h264 2=h265 3=mpeg4
## only SW mpeg4 is supported right now.
codec=3
sync=0
bitrate=2000000
output-file=out_issue_122834.mp4
source-id=0
qos=0

[osd]
enable=1
gpu-id=0
border-width=1
text-size=15
text-color=1;1;1;1;
text-bg-color=0.3;0.3;0.3;1
font=Serif
show-clock=0
clock-x-offset=800
clock-y-offset=820
clock-text-size=12
clock-color=1;0;0;0
nvbuf-memory-type=0

[streammux]
gpu-id=0
##Boolean property to inform muxer that sources are live
live-source=0
batch-size=1
##time out in usec, to wait after the first buffer is available
##to push the batch even if the complete batch is not formed
batched-push-timeout=40000
## Set muxer output width and height
width=1920
height=1080
##Enable to maintain aspect ratio wrt source, and allow black borders, works
##along with width, height properties
enable-padding=0
nvbuf-memory-type=0

# config-file property is mandatory for any gie section.
# Other properties are optional and if set will override the properties set in
# the infer config file.
[primary-gie]
enable=1
gpu-id=0
model-engine-file=model_b1_gpu0_int8.engine
labelfile-path=labels.txt
batch-size=1
#Required by the app for OSD, not a plugin property
bbox-border-color0=1;0;0;1
bbox-border-color1=0;1;1;1
bbox-border-color2=0;0;1;1
bbox-border-color3=0;1;0;1
interval=2
gie-unique-id=1
nvbuf-memory-type=0
config-file=config_infer_primary_yoloV3.txt

[tracker]
enable=1
tracker-width=960 #640
tracker-height=544 #368
#ll-lib-file=/opt/nvidia/deepstream/deepstream/lib/libnvds_mot_iou.so
#ll-lib-file=/opt/nvidia/deepstream/deepstream/lib/libnvds_mot_klt.so
ll-lib-file=/opt/nvidia/deepstream/deepstream/lib/libnvds_nvdcf.so
ll-config-file=tracker_config.yml
#ll-config-file=iou_config.txt
gpu-id=0
#enable-batch-process applicable to DCF only
enable-batch-process=1
enable-past-frame=0

[tests]
file-loop=0

Below is the config file I used for tracker_config.yml for NvDCF tracker:

%YAML:1.0
  
NvDCF:
  # [General]
  useUniqueID: 1    # Use 64-bit long Unique ID when assignining tracker ID. Default is [true]
  maxTargetsPerStream: 99 # Max number of targets to track per stream. Recommended to set >10. Note: this value should account for the targets being tracked in shadow mode as well. Max value depends on the GPU memory capacity
  
  # [Feature Extraction]
  useColorNames: 1     # Use ColorNames feature
  useHog: 1            # Use Histogram-of-Oriented-Gradient (HOG) feature
  useHighPrecisionFeature: 1   # Use high-precision in feature extraction. Default is [true]

  # [DCF]
  filterLr: 0.15 # learning rate for DCF filter in exponential moving average. Valid Range: [0.0, 1.0]
  filterChannelWeightsLr: 0.22 # learning rate for the channel weights among feature channels. Valid Range: [0.0, 1.0]
  gaussianSigma: 0.75 # Standard deviation for Gaussian for desired response when creating DCF filter [pixels]
  featureImgSizeLevel: 2 # Size of a feature image. Valid range: {1, 2, 3, 4, 5}, from the smallest to the largest
  SearchRegionPaddingScale: 1 # Search region size. Determines how large the search region should be scaled from the target bbox.  Valid range: {1, 2, 3}, from the smallest to the largest
  
  # [MOT] [False Alarm Handling]
  maxShadowTrackingAge: 30  # Max length of shadow tracking (the shadow tracking age is incremented when (1) there's detector input yet no match or (2) tracker confidence is lower than minTrackerConfidence). Once reached, the tracker will be terminated.
  probationAge: 3           # Once the tracker age (incremented at every frame) reaches this, the tracker is considered to be valid
  earlyTerminationAge: 1    # Early termination age (in terms of shadow tracking age) during the probation period. If reached during the probation period, the tracker will be terminated prematurely.

  # [Tracker Creation Policy] [Target Candidacy]
  minDetectorConfidence: -1  # If the confidence of a detector bbox is lower than this, then it won't be considered for tracking
  minTrackerConfidence: 0.7  # If the confidence of an object tracker is lower than this on the fly, then it will be tracked in shadow mode. Valid Range: [0.0, 1.0]
  minTargetBboxSize: 10      # If the width or height of the bbox size gets smaller than this threshold, the target will be terminated.
  minDetectorBboxVisibilityTobeTracked: 0.0  # If the detector-provided bbox's visibility (i.e., IOU with image) is lower than this, it won't be considered.  
  minVisibiilty4Tracking: 0.0  # If the visibility of the tracked object (i.e., IOU with image) is lower than this, it will be terminated immediately, assuming it is going out of scene.
  
  # [Tracker Termination Policy]
  targetDuplicateRunInterval: 5 # The interval in which the duplicate target detection removal is carried out. A Negative value indicates indefinite interval. Unit: [frames]
  minIou4TargetDuplicate: 0.9 # If the IOU of two target bboxes are higher than this, the newer target tracker will be terminated.

  # [Data Association] Matching method
  useGlobalMatching: 0   # If true, enable a global matching algorithm (i.e., Hungarian method). Otherwise, a greedy algorithm wll be used.

  # [Data Association] Thresholds in matching scores to be considered as a valid candidate for matching
  minMatchingScore4Overall: 0.0   # Min total score
  minMatchingScore4SizeSimilarity: 0.5    # Min bbox size similarity score
  minMatchingScore4Iou: 0.1       # Min IOU score
  minMatchingScore4VisualSimilarity: 0.2    # Min visual similarity score
  minTrackingConfidenceDuringInactive: 0.8 #1.0  # Min tracking confidence during INACTIVE period. If tracking confidence is higher than this, then tracker will still output results until next detection 

  # [Data Association] Weights for each matching score term
  matchingScoreWeight4VisualSimilarity: 0.8  # Weight for the visual similarity (in terms of correlation response ratio)
  matchingScoreWeight4SizeSimilarity: 0.0    # Weight for the Size-similarity score
  matchingScoreWeight4Iou: 0.1               # Weight for the IOU score
  matchingScoreWeight4Age: 0.1               # Weight for the tracker age

  # [State Estimator]
  useTrackSmoothing: 1    # Use a state estimator
  stateEstimatorType: 1   # The type of state estimator among { moving_avg:1, kalman_filter:2 }

  # [State Estimator] [MovingAvgEstimator]
  trackExponentialSmoothingLr_loc: 0.5       # Learning rate for new location
  trackExponentialSmoothingLr_scale: 0.3     # Learning rate for new scale
  trackExponentialSmoothingLr_velocity: 0.05  # Learning rate for new velocity

  # [State Estimator] [Kalman Filter] 
  kfProcessNoiseVar4Loc: 0.1   # Process noise variance for location in Kalman filter
  kfProcessNoiseVar4Scale: 0.04   # Process noise variance for scale in Kalman filter
  kfProcessNoiseVar4Vel: 0.04   # Process noise variance for velocity in Kalman filter
  kfMeasurementNoiseVar4Trk: 9   # Measurement noise variance for tracker's detection in Kalman filter
  kfMeasurementNoiseVar4Det: 9   # Measurement noise variance for detector's detection in Kalman filter
  
  # [Past-frame Data] 
  useBufferedOutput: 0   # Enable storing of past-frame data in a buffer and report it back
  
  # [Instance-awareness]
  useInstanceAwareness: 0 # Use instance-awareness for multi-object tracking
  lambda_ia: 2            # Regularlization factor for each instance
  maxInstanceNum_ia: 4    # The number of nearby object instances to use for instance-awareness