Deepstream Nvtracker, bounding boxes issues

Hello everyone,

I would like to know if I made any mistake during my tests, because this is what happens to me:

Hardware: Jetson Xavier, JetPack 4.2.2 [L4T 32.2.1]

PGIE model: YOLOv3 detector

PGIE batch size: 64

Tracker: NvDCF

Detection interval: 2

Video source: full HD, 6 FPS

Video source: full HD, 20 FPS

When I run the DeepStream sample app with a non-live video source at low FPS (the same happens at 20 FPS, but the issue is less visible), the app initially detects the object correctly and the associated bounding box has the right coordinates. The next two frames, which go through nvtracker only and not through nvinfer, have bounding boxes with the wrong coordinates for the same object, although the object's ID stays correct. On the following frame, when nvinfer runs inference again, the ID of the same object is unchanged and the associated bounding box is correct again.
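To put rough numbers on why low FPS makes this worse, here is a small sketch (the pixel speed is an invented illustrative figure, not measured from my videos) of how much motion the tracker has to bridge on its own between two inference frames when the detection interval is 2:

```python
# Sketch: how far an object travels during the tracker-only frames.
# "pixels_per_second" is an invented illustrative value, not from the videos.
pixels_per_second = 100.0

def per_frame_motion(fps: float) -> float:
    """Pixels an object moves between two consecutive frames."""
    return pixels_per_second / fps

# With interval=2, two consecutive frames are handled by the tracker alone.
gap_6fps = 2 * per_frame_motion(6.0)    # the tracker must bridge ~33 px
gap_20fps = 2 * per_frame_motion(20.0)  # only 10 px at 20 FPS

print(f"6 FPS:  {gap_6fps:.1f} px without detector input")
print(f"20 FPS: {gap_20fps:.1f} px without detector input")
```

More than three times as much unassisted motion at 6 FPS would be consistent with the drift being far more visible in the 6 FPS video.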

Then I created a folder with the following content:

Original video at 6 FPS

Same original video at 20 FPS

Tracking error folder, where you can find the behaviour I described above, JPEG after JPEG (direct link to the folder: 6FPSInterval2 - Google Drive).

20FPS folder, where you can find all my tests with different intervals and different trackers, using a 20 FPS full-HD video source.

6FPS folder, where you can find all my tests with different intervals and different trackers, using a 6 FPS full-HD video source.

A README file that describes the environment I used for my tests.

You can reach the described folder at the following link:

https://drive.google.com/drive/folders/11ouUceTTMLkm7p8XNGuog8-ItQdTmZ7C?usp=sharing

Could you please help me understand how to solve it?

PS.

I get the same results with a Jetson Nano.

Thanks

Hello, Rik!
I have a similar problem with the tracker, but in my case I get frequent ID switches. I am trying to tune the tracker options in tracker_config.yml.
Can you share your config file? I guess you should increase the maxShadowTrackingAge and SearchRegionPaddingScale parameters.

Hello @mmv047,
in these cases I did not use a config file for the trackers. I get the same results with the NvDCF and MOT KLT trackers, and the latter does not need a configuration file at all.

Hello,
I would like to share with you the pictures that illustrate the issue I described above.
Look at the car with initial ID 20.
(Image sequence: Detection → Tracking → Tracking → Detection → Tracking → Tracking → Detection)
Hey Customer,

We have improved the NvDCF tracker in DS 5.0; would you mind trying it with DS 5.0?

BTW, what is the result if you remove the tracker from your pipeline?

Hello Customer,

Thanks for reporting the issue and the repro setup. We will investigate locally and get back to you shortly.

By the way, which DeepStream version did you use? If you have not tried the recently released DS 5.0 DP, could you try that out?

Hello Customer,

For the 6 FPS video, objects move a lot farther between frames than in the 20 FPS video, so we have to adjust the parameters related to motion modeling in the tracker. Starting from the default tracker_config.yml, I changed the parameters as below, and I verified that it works much better.

SearchRegionPaddingScale: 3
probationAge: 2
earlyTerminationAge: 1

trackExponentialSmoothingLr_loc: 0.9 #0.5 # Learning rate for new location
trackExponentialSmoothingLr_scale: 0.9 #0.3 # Learning rate for new scale
trackExponentialSmoothingLr_velocity: 0.9 #0.05 # Learning rate for new velocity

I am not saying these are the best parameters for your setting, but I hope they give you an idea of which parameters to play with to find the best values for your case.
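As I read the config comments, the trackExponentialSmoothingLr_* values are learning rates of an exponential moving average. A minimal sketch (my interpretation of those comments, not the actual NvDCF source) of why a higher rate follows fast motion more closely:

```python
# Exponential moving average update, as described by the config comments.
# A higher learning rate weights the new measurement more heavily.
def smooth(prev: float, measured: float, lr: float) -> float:
    return lr * measured + (1.0 - lr) * prev

# Suppose a bbox's x-position jumps from 100 px to 140 px between detections.
prev_x, new_x = 100.0, 140.0
print(smooth(prev_x, new_x, lr=0.5))  # default location rate: lags at 120.0
print(smooth(prev_x, new_x, lr=0.9))  # suggested rate: follows to 136.0
```

With the 0.9 rate the estimate stays much closer to the new detection, which is why it suits the large per-frame motion of a 6 FPS stream.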

Hello @bcao,
Currently I am using DS 4.0.
In the links I shared above you can find all the results of my tests: processed videos (20 FPS and 6 FPS) with different intervals and also without the tracker; obviously I also shared the original videos with you.
In the 6FPS folder you can find out6FPS0IntervalNoTracker.mp4, which is what you are asking for. Without the tracker I could not reproduce the bounding-box position problem.

Hello @pshin,
I am going to try the tracker config file for NvDCF as you and @mmv047 suggested, and then we will see the results.
Thanks

Hello guys,
I have done the tests (6 FPS and 20 FPS) and uploaded all the results to the same repo. You can reach them here:
6FPS/20FPS -> libnvds_nvdcf.so -> WithTrackerConfigFile -> DefaultTrackerConfigFile (default config file provided by NVIDIA)
6FPS/20FPS -> libnvds_nvdcf.so -> WithTrackerConfigFile -> CustomTrackerConfigFile (config file modified with the params that @pshin suggested to me)
There you can also find the other intervals I tried, from 0 to 3.
In the end I think the bounding-box problem is still there; for example, have a look at these 3 files:

out20FPS2IntervalYESTracker.mp4 - Google Drive (interval 2, no tracker configuration file, NVDCF tracker used)

out20FPS2IntervalDEFConfigFile.mp4 - Google Drive (interval 2, default tracker configuration file used, NVDCF tracker used)

out20FPS2IntervalConfigFile.mp4 - Google Drive (interval 2, custom tracker configuration file used, NVDCF tracker used)

I don't know if I made any mistake, but I did not see any improvement with the configuration file.
I am using the INT8-precision YOLOv3 model as the engine, and the resolution of my stream muxer is 1920x1080.

Could you please share your configuration file, @pshin?
Which resolution are you using for the tracker?
Which model are you using for inference?

Thank you guys.

Hello Rik,

I used the YOLOv3 model provided in DeepStream and ran the following deepstream-app config with a detection interval of 2 on the 20 FPS stream you provided. I verified that everything looked fine with this pipeline: no bbox ID changes for moving cars, and the bboxes follow the cars properly. I do notice a bit of drag in the bboxes on the frames where detection is not done, but I believe that if you tweak the parameters some more, you should be able to improve the behavior. I generated an output video, but I could not upload it here due to restrictions on forum replies.

Below is the deepstream-app config I used. Please note that I used 960x544 for the tracker width/height:

################################################################################
# Copyright (c) 2019-2020, NVIDIA CORPORATION. All rights reserved.
#
# Permission is hereby granted, free of charge, to any person obtaining a
# copy of this software and associated documentation files (the "Software"),
# to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense,
# and/or sell copies of the Software, and to permit persons to whom the
# Software is furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
# DEALINGS IN THE SOFTWARE.
################################################################################

[application]
enable-perf-measurement=1
perf-measurement-interval-sec=5
#gie-kitti-output-dir=streamscl

[tiled-display]
enable=1
rows=1
columns=1
width=1280
height=720
gpu-id=0
#(0): nvbuf-mem-default - Default memory allocated, specific to particular platform
#(1): nvbuf-mem-cuda-pinned - Allocate Pinned/Host cuda memory, applicable for Tesla
#(2): nvbuf-mem-cuda-device - Allocate Device cuda memory, applicable for Tesla
#(3): nvbuf-mem-cuda-unified - Allocate Unified cuda memory, applicable for Tesla
#(4): nvbuf-mem-surface-array - Allocate Surface Array memory, applicable for Jetson
nvbuf-memory-type=0

[source0]
enable=1
#Type - 1=CameraV4L2 2=URI 3=MultiURI
type=3
#uri=file://../../../streams/sample_1080p_h264.mp4
#uri=file:///home/pshin/Downloads/NvBUGs/Issue_122834/axisOriginal_6FPS.mkv
uri=file:///home/pshin/Downloads/NvBUGs/Issue_122834/axisOriginal_20FPS.mkv
num-sources=1
gpu-id=0
# (0): memtype_device   - Memory type Device
# (1): memtype_pinned   - Memory type Host Pinned
# (2): memtype_unified  - Memory type Unified
cudadec-memtype=0

[sink0]
enable=1
#Type - 1=FakeSink 2=EglSink 3=File
type=2
sync=0
source-id=0
gpu-id=0
nvbuf-memory-type=0

[sink2]
enable=1
type=3
#1=mp4 2=mkv
container=1
#1=h264 2=h265 3=mpeg4
## only SW mpeg4 is supported right now.
codec=3
sync=0
bitrate=2000000
output-file=out_issue_122834.mp4
source-id=0
qos=0

[osd]
enable=1
gpu-id=0
border-width=1
text-size=15
text-color=1;1;1;1;
text-bg-color=0.3;0.3;0.3;1
font=Serif
show-clock=0
clock-x-offset=800
clock-y-offset=820
clock-text-size=12
clock-color=1;0;0;0
nvbuf-memory-type=0

[streammux]
gpu-id=0
##Boolean property to inform muxer that sources are live
live-source=0
batch-size=1
##time out in usec, to wait after the first buffer is available
##to push the batch even if the complete batch is not formed
batched-push-timeout=40000
## Set muxer output width and height
width=1920
height=1080
##Enable to maintain aspect ratio wrt source, and allow black borders, works
##along with width, height properties
enable-padding=0
nvbuf-memory-type=0

# config-file property is mandatory for any gie section.
# Other properties are optional and if set will override the properties set in
# the infer config file.
[primary-gie]
enable=1
gpu-id=0
model-engine-file=model_b1_gpu0_int8.engine
labelfile-path=labels.txt
batch-size=1
#Required by the app for OSD, not a plugin property
bbox-border-color0=1;0;0;1
bbox-border-color1=0;1;1;1
bbox-border-color2=0;0;1;1
bbox-border-color3=0;1;0;1
interval=2
gie-unique-id=1
nvbuf-memory-type=0
config-file=config_infer_primary_yoloV3.txt

[tracker]
enable=1
tracker-width=960 #640
tracker-height=544 #368
#ll-lib-file=/opt/nvidia/deepstream/deepstream/lib/libnvds_mot_iou.so
#ll-lib-file=/opt/nvidia/deepstream/deepstream/lib/libnvds_mot_klt.so
ll-lib-file=/opt/nvidia/deepstream/deepstream/lib/libnvds_nvdcf.so
ll-config-file=tracker_config.yml
#ll-config-file=iou_config.txt
gpu-id=0
#enable-batch-process applicable to DCF only
enable-batch-process=1
enable-past-frame=0

[tests]
file-loop=0

Below is the config file I used for tracker_config.yml for NvDCF tracker:

%YAML:1.0
  
NvDCF:
  # [General]
  useUniqueID: 1    # Use 64-bit long Unique ID when assigning tracker ID. Default is [true]
  maxTargetsPerStream: 99 # Max number of targets to track per stream. Recommended to set >10. Note: this value should account for the targets being tracked in shadow mode as well. Max value depends on the GPU memory capacity
  
  # [Feature Extraction]
  useColorNames: 1     # Use ColorNames feature
  useHog: 1            # Use Histogram-of-Oriented-Gradient (HOG) feature
  useHighPrecisionFeature: 1   # Use high-precision in feature extraction. Default is [true]

  # [DCF]
  filterLr: 0.15 # learning rate for DCF filter in exponential moving average. Valid Range: [0.0, 1.0]
  filterChannelWeightsLr: 0.22 # learning rate for the channel weights among feature channels. Valid Range: [0.0, 1.0]
  gaussianSigma: 0.75 # Standard deviation for Gaussian for desired response when creating DCF filter [pixels]
  featureImgSizeLevel: 2 # Size of a feature image. Valid range: {1, 2, 3, 4, 5}, from the smallest to the largest
  SearchRegionPaddingScale: 1 # Search region size. Determines how large the search region should be scaled from the target bbox.  Valid range: {1, 2, 3}, from the smallest to the largest
  
  # [MOT] [False Alarm Handling]
  maxShadowTrackingAge: 30  # Max length of shadow tracking (the shadow tracking age is incremented when (1) there's detector input yet no match or (2) tracker confidence is lower than minTrackerConfidence). Once reached, the tracker will be terminated.
  probationAge: 3           # Once the tracker age (incremented at every frame) reaches this, the tracker is considered to be valid
  earlyTerminationAge: 1    # Early termination age (in terms of shadow tracking age) during the probation period. If reached during the probation period, the tracker will be terminated prematurely.

  # [Tracker Creation Policy] [Target Candidacy]
  minDetectorConfidence: -1  # If the confidence of a detector bbox is lower than this, then it won't be considered for tracking
  minTrackerConfidence: 0.7  # If the confidence of an object tracker is lower than this on the fly, then it will be tracked in shadow mode. Valid Range: [0.0, 1.0]
  minTargetBboxSize: 10      # If the width or height of the bbox size gets smaller than this threshold, the target will be terminated.
  minDetectorBboxVisibilityTobeTracked: 0.0  # If the detector-provided bbox's visibility (i.e., IOU with image) is lower than this, it won't be considered.  
  minVisibiilty4Tracking: 0.0  # If the visibility of the tracked object (i.e., IOU with image) is lower than this, it will be terminated immediately, assuming it is going out of scene.
  
  # [Tracker Termination Policy]
  targetDuplicateRunInterval: 5 # The interval in which the duplicate target detection removal is carried out. A Negative value indicates indefinite interval. Unit: [frames]
  minIou4TargetDuplicate: 0.9 # If the IOU of two target bboxes are higher than this, the newer target tracker will be terminated.

  # [Data Association] Matching method
  useGlobalMatching: 0   # If true, enable a global matching algorithm (i.e., Hungarian method). Otherwise, a greedy algorithm will be used.

  # [Data Association] Thresholds in matching scores to be considered as a valid candidate for matching
  minMatchingScore4Overall: 0.0   # Min total score
  minMatchingScore4SizeSimilarity: 0.5    # Min bbox size similarity score
  minMatchingScore4Iou: 0.1       # Min IOU score
  minMatchingScore4VisualSimilarity: 0.2    # Min visual similarity score
  minTrackingConfidenceDuringInactive: 0.8 #1.0  # Min tracking confidence during INACTIVE period. If tracking confidence is higher than this, then tracker will still output results until next detection 

  # [Data Association] Weights for each matching score term
  matchingScoreWeight4VisualSimilarity: 0.8  # Weight for the visual similarity (in terms of correlation response ratio)
  matchingScoreWeight4SizeSimilarity: 0.0    # Weight for the Size-similarity score
  matchingScoreWeight4Iou: 0.1               # Weight for the IOU score
  matchingScoreWeight4Age: 0.1               # Weight for the tracker age

  # [State Estimator]
  useTrackSmoothing: 1    # Use a state estimator
  stateEstimatorType: 1   # The type of state estimator among { moving_avg:1, kalman_filter:2 }

  # [State Estimator] [MovingAvgEstimator]
  trackExponentialSmoothingLr_loc: 0.5       # Learning rate for new location
  trackExponentialSmoothingLr_scale: 0.3     # Learning rate for new scale
  trackExponentialSmoothingLr_velocity: 0.05  # Learning rate for new velocity

  # [State Estimator] [Kalman Filter] 
  kfProcessNoiseVar4Loc: 0.1   # Process noise variance for location in Kalman filter
  kfProcessNoiseVar4Scale: 0.04   # Process noise variance for scale in Kalman filter
  kfProcessNoiseVar4Vel: 0.04   # Process noise variance for velocity in Kalman filter
  kfMeasurementNoiseVar4Trk: 9   # Measurement noise variance for tracker's detection in Kalman filter
  kfMeasurementNoiseVar4Det: 9   # Measurement noise variance for detector's detection in Kalman filter
  
  # [Past-frame Data] 
  useBufferedOutput: 0   # Enable storing of past-frame data in a buffer and report it back
  
  # [Instance-awareness]
  useInstanceAwareness: 0 # Use instance-awareness for multi-object tracking
  lambda_ia: 2            # Regularization factor for each instance
  maxInstanceNum_ia: 4    # The number of nearby object instances to use for instance-awareness

Hello @pshin, thank you for your time.
I did the same test with the same model and the same configuration you provided; I uploaded the result to the same repo and you can reach it at the following link: 20FPS2IntervalPshinTrackerConfig.mp4 - Google Drive
Could you please confirm that we are getting the same results?
The problem I mentioned above regarding the bounding-box positions is more visible with the 6 FPS video.
Which version of DeepStream are you using?

Thanks

Hello Rik,

It seems like your output video has a lot more jitter. Below is my output video, so please check it out. I included the Interval=5 case as well, in case you are interested.

I am using the DeepStream 5.0 DP version, which was released recently.

Interval=2

Interval=5

Hi pshin,
I really appreciate the time you are spending on this topic.
I saw your video, and we are definitely seeing different results :). Using DS 4.0, I get results similar to yours only when I use no tracker configuration file, increase the tracker's width and height to 960x544, and reduce the tiled-display width and height to 1280x720. If I use any tracker configuration file (even the default one), the results are worse than without one (you can find everything in the repo I created), but that does not matter at the moment. The main thing I would really like to understand is: can you reproduce the bounding-box position problem I mentioned at the beginning of this topic with the 6 FPS video I provided?
Thanks

Hello Rik,

I advise you to use the newly released DS 5.0 DP version. There have been a substantial number of bug fixes and stability & accuracy improvements, so I would definitely use DS 5.0 DP.

Regarding the 6 FPS issue, I observed that tracking was not done properly with the default config settings. That is why I suggested different parameters to accommodate the larger per-frame motion in the 6 FPS video: Deepstream Nvtracker, bounding boxes issues - #9 by pshin

With these suggested parameters, I saw much more robust tracking in the 6 FPS video.

Hi pshin,
thanks for your time.
With the config parameters you suggested I did not see any improvement, so we can conclude that the issue can be reproduced with DS 4.0 but not with the 5.0 version.
One more thing, guys: what are the best criteria for choosing the right width and height for the tracker?

Riccardo

Hello Rik,

I guess a major factor is the minimum size of the object bboxes. If the tracker width/height is too small, the tracker is more likely to miss objects, and the visual information within each bbox is also reduced, resulting in lower accuracy.

Increasing the tracker width/height would not add much overhead to the NvDCF tracker (unlike the KLT tracker), so I recommend using at least 960x544 for the tracker. Note that the tracker height is expected to be a multiple of 32.
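To pick a height that satisfies that constraint, a tiny hypothetical helper (align_to_32 is my own name, not a DeepStream API) can round a candidate dimension up to the next multiple of 32:

```python
# Hypothetical helper: round a tracker dimension up to a multiple of 32.
def align_to_32(size: int) -> int:
    return ((size + 31) // 32) * 32

print(align_to_32(544))   # 544: the recommended height is already aligned
print(align_to_32(1080))  # 1088: a full-HD height would be rounded up
```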

Hello everyone,
I would like to share with you some results from my tests.
With DS 5 and the tracker config file you suggested above, I confirm that the bounding-box issue is less visible than in the previous DS version.
In the screenshot attached below you can see some benchmarks I did with 3 different scenarios:
1 Jetson Xavier with DS 4, JetPack 4.2.2
1 Jetson Xavier with DS 5, JetPack 4.4 DP
1 PC with an NVIDIA Quadro P2000, JetPack 4.4 DP
and I noticed that:
with DS 5 on the Jetson Xavier the performance is worse than with DS 4; even when increasing the inference interval there is a visible gap between DS 4 and DS 5. On the other hand, the results using the NVIDIA Quadro with DS 5 are consistent with DS 4 on the Jetson Xavier.
Have I made any mistakes in my tests? Could you please share your results with me?
Moreover, I noticed other odd things in DS 5.
The first one is:
using multiple live sources (HTTP), I noticed a worsening of inference and tracking accuracy.

Rik


@Rik Thank you for doing these tests; I wish there were more benchmarks like these done by the community. It should be noted, though, that NvDCF in DS 5.0 DP was an attempt by NVIDIA to make a more accurate tracker, and performance may have been the trade-off in the developer-preview version. If you look at the config files for both versions, you'll notice a large increase in features in the 5.0 NvDCF, and thus more information to process and a lower FPS.

If you set maxTargetsPerStream to, say, 25 or 30 instead of 99, you'll see a large performance increase. Also, nvstreammux will allocate (x) MB of GPU memory if it has to scale the images up or down, so it is best to set the input resolution of your pipeline to the native resolution of the feed itself; you want to save all that GPU memory for nvinfer and NvDCF. And if you have multiple IP cameras, they might have different native resolutions and aspect ratios, so you'll want to set enable-padding=true on nvstreammux to keep everything at the same aspect ratio. This will improve inference and tracker results.
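As a sketch, the streammux part of that advice would look roughly like this in the deepstream-app config (the width/height values are illustrative; match them to your own cameras):

```ini
# deepstream-app config sketch: keep mixed-resolution IP cameras at a
# consistent aspect ratio and avoid unnecessary scaling
[streammux]
enable-padding=1
# set these to the native resolution of the feed
width=1920
height=1080
```

On the tracker side, the corresponding change would be lowering maxTargetsPerStream in tracker_config.yml from 99 to roughly the number of objects you actually expect per stream.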

In my experience, though, once you understand how NvDCF works, it's better to design (or at least try to design) your own tracker that fits your use cases. You'll see better results than with the reference (NvDCF) tracker that NVIDIA kindly provided.
