Hello Janos,
Thanks for checking out DS again for this release. We’ve worked hard to make the tracker more useful to our users.
Some background helps in understanding how the NvDCF tracker behaves and how to use it well. The jitter you reported in the number of objects per frame is actually expected behavior. To minimize false alarms, NvDCF employs a technique called ‘Shadow Tracking’: whenever a target is not associated with a detector-generated bbox, it is immediately put into INACTIVE mode and tracked in the background. (Please refer to the DS 5.0 documentation on the NvDCF tracker here.)
During shadow tracking, the tracking output is not reported, in case the target turns out not to be valid. Once the target is associated with a detector-generated bbox again, it is put back into ACTIVE mode and the tracking output is reported again with the same ID.
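To make the mode switching concrete, here is a minimal Python sketch of the behavior described above. It is only an illustration, not the actual NvDCF implementation: the `Target` class, the `step` function, and the reset of the shadow age on re-association are my assumptions for the example.

```python
from dataclasses import dataclass


@dataclass
class Target:
    """Toy target record; field names are illustrative, not NvDCF internals."""
    target_id: int
    mode: str = "ACTIVE"      # "ACTIVE" or "INACTIVE" (shadow tracking)
    shadow_age: int = 0       # compared against maxShadowTrackingAge


def step(target: Target, detector_ran: bool, matched: bool,
         max_shadow_age: int = 30) -> bool:
    """Advance one frame. Returns False once the target should be terminated."""
    if matched:
        # Associated with a detector bbox: report again with the same ID.
        target.mode = "ACTIVE"
        target.shadow_age = 0   # assumption: age resets on re-association
        return True
    # No association on this frame: keep tracking in the background.
    target.mode = "INACTIVE"
    if detector_ran:
        # Detector input was available but nothing matched this target.
        target.shadow_age += 1
    return target.shadow_age < max_shadow_age
```

Note that the target keeps its `target_id` throughout, which is why the same ID comes back once the target is matched again.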
In the pipeline, you still have access to the tracking outputs stored during shadow tracking. Please refer to enable-past-frame in the documentation for how to do it.
You may still see some targets reported even in INACTIVE mode. This is because those objects are tracked with very high confidence. You can adjust the confidence threshold by adding the following param:
minTrackingConfidenceDuringInactive: 0.7
(the default value is 0.9)
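The effect of this threshold can be sketched in a few lines of Python. This is a simplified illustration under my own assumptions (the `is_reported` helper is hypothetical and ignores the probation period), not the tracker's real code:

```python
def is_reported(mode: str, tracking_confidence: float,
                min_conf_inactive: float = 0.9) -> bool:
    """Decide whether a target's bbox shows up in the output for this frame.

    min_conf_inactive plays the role of minTrackingConfidenceDuringInactive.
    """
    if mode == "ACTIVE":
        return True
    # INACTIVE (shadow tracking): only very confident targets are still reported.
    return tracking_confidence > min_conf_inactive
```

For example, an INACTIVE target with a tracking confidence of 0.8 is hidden with the default of 0.9, but is reported once the threshold is lowered to 0.7.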
Below are my output videos. You can see that most of the targets are still shown even with Interval=3:
Interval=0
Interval=3
I used the params below in the deepstream_app config file:
[tracker]
tracker-width=960
tracker-height=544
Below is the tracker_config.yml I used to generate the videos.
%YAML:1.0
NvDCF:
# [General]
useUniqueID: 1 # Use 64-bit long Unique ID when assigning tracker ID. Default is [true]
maxTargetsPerStream: 99 # Max number of targets to track per stream. Recommended to set >10. Note: this value should account for the targets being tracked in shadow mode as well. Max value depends on the GPU memory capacity
# [Feature Extraction]
useColorNames: 1 # Use ColorNames feature
useHog: 1 # Use Histogram-of-Oriented-Gradient (HOG) feature
useHighPrecisionFeature: 1 # Use high-precision in feature extraction. Default is [true]
# [DCF]
filterLr: 0.15 # learning rate for DCF filter in exponential moving average. Valid Range: [0.0, 1.0]
filterChannelWeightsLr: 0.22 # learning rate for the channel weights among feature channels. Valid Range: [0.0, 1.0]
gaussianSigma: 0.75 # Standard deviation for Gaussian for desired response when creating DCF filter [pixels]
featureImgSizeLevel: 3 # Size of a feature image. Valid range: {1, 2, 3, 4, 5}, from the smallest to the largest
SearchRegionPaddingScale: 1 # Search region size. Determines how large the search region should be scaled from the target bbox. Valid range: {1, 2, 3}, from the smallest to the largest
# [MOT] [False Alarm Handling]
maxShadowTrackingAge: 30 # Max length of shadow tracking (the shadow tracking age is incremented when (1) there's detector input yet no match or (2) tracker confidence is lower than minTrackerConfidence). Once reached, the tracker will be terminated.
probationAge: 3 # Once the tracker age (incremented at every frame) reaches this, the tracker is considered to be valid
earlyTerminationAge: 1 # Early termination age (in terms of shadow tracking age) during the probation period. If reached during the probation period, the tracker will be terminated prematurely.
# [Tracker Creation Policy] [Target Candidacy]
minDetectorConfidence: -1 # If the confidence of a detector bbox is lower than this, then it won't be considered for tracking
minTrackerConfidence: 0.7 # If the confidence of an object tracker is lower than this on the fly, then it will be tracked in shadow mode. Valid Range: [0.0, 1.0]
minTargetBboxSize: 10 # If the width or height of the bbox size gets smaller than this threshold, the target will be terminated.
minDetectorBboxVisibilityTobeTracked: 0.0 # If the detector-provided bbox's visibility (i.e., IOU with image) is lower than this, it won't be considered.
minVisibiilty4Tracking: 0.0 # If the visibility of the tracked object (i.e., IOU with image) is lower than this, it will be terminated immediately, assuming it is going out of scene.
# [Tracker Termination Policy]
targetDuplicateRunInterval: 5 # The interval in which the duplicate target detection removal is carried out. A Negative value indicates indefinite interval. Unit: [frames]
minIou4TargetDuplicate: 0.9 # If the IOU of two target bboxes are higher than this, the newer target tracker will be terminated.
# [Data Association] Matching method
useGlobalMatching: 0 # If true, enable a global matching algorithm (i.e., Hungarian method). Otherwise, a greedy algorithm will be used.
# [Data Association] Thresholds in matching scores to be considered as a valid candidate for matching
minMatchingScore4Overall: 0.0 # Min total score
minMatchingScore4SizeSimilarity: 0.5 # Min bbox size similarity score
minMatchingScore4Iou: 0.1 # Min IOU score
minMatchingScore4VisualSimilarity: 0.2 # Min visual similarity score
minTrackingConfidenceDuringInactive: 0.7 # Min tracking confidence during INACTIVE period. If tracking confidence is higher than this, then tracker will still output results until next detection
# [Data Association] Weights for each matching score term
matchingScoreWeight4VisualSimilarity: 0.8 # Weight for the visual similarity (in terms of correlation response ratio)
matchingScoreWeight4SizeSimilarity: 0.0 # Weight for the Size-similarity score
matchingScoreWeight4Iou: 0.1 # Weight for the IOU score
matchingScoreWeight4Age: 0.1 # Weight for the tracker age
# [State Estimator]
useTrackSmoothing: 1 # Use a state estimator
stateEstimatorType: 1 # The type of state estimator among { moving_avg:1, kalman_filter:2 }
# [State Estimator] [MovingAvgEstimator]
trackExponentialSmoothingLr_loc: 0.5 # Learning rate for new location
trackExponentialSmoothingLr_scale: 0.3 # Learning rate for new scale
trackExponentialSmoothingLr_velocity: 0.05 # Learning rate for new velocity
# [State Estimator] [Kalman Filter]
kfProcessNoiseVar4Loc: 0.1 # Process noise variance for location in Kalman filter
kfProcessNoiseVar4Scale: 0.04 # Process noise variance for scale in Kalman filter
kfProcessNoiseVar4Vel: 0.04 # Process noise variance for velocity in Kalman filter
kfMeasurementNoiseVar4Trk: 9 # Measurement noise variance for tracker's detection in Kalman filter
kfMeasurementNoiseVar4Det: 9 # Measurement noise variance for detector's detection in Kalman filter
# [Past-frame Data]
useBufferedOutput: 0 # Enable storing of past-frame data in a buffer and report it back
# [Instance-awareness]
useInstanceAwareness: 0 # Use instance-awareness for multi-object tracking
lambda_ia: 2 # Regularization factor for each instance
maxInstanceNum_ia: 4 # The number of nearby object instances to use for instance-awareness
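To give a feel for how the [Data Association] thresholds and weights in this config interact, here is a rough Python sketch. It assumes the total matching score is a weighted sum of the individual scores and that a pair is rejected if any individual score is below its minimum. That is my simplification for illustration; the `matching_score` helper and the way the age score is normalized are hypothetical, not the library's actual code.

```python
from typing import Dict, Optional

# Values copied from the config above.
WEIGHTS = {"visual": 0.8, "size": 0.0, "iou": 0.1, "age": 0.1}
MIN_SCORES = {"visual": 0.2, "size": 0.5, "iou": 0.1}
MIN_OVERALL = 0.0


def matching_score(scores: Dict[str, float]) -> Optional[float]:
    """Return the weighted total score, or None if the pair is not a valid candidate.

    `scores` holds per-term scores in [0, 1] for one (target, detection) pair.
    """
    for term, minimum in MIN_SCORES.items():
        if scores[term] < minimum:
            return None  # fails one of the minMatchingScore4* thresholds
    total = sum(WEIGHTS[term] * scores[term] for term in WEIGHTS)
    return total if total >= MIN_OVERALL else None
```

With these weights the visual similarity dominates, and the size similarity only acts as a gate (its weight is 0.0 but its minimum is 0.5).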
In the Interval=3 video, we may still see some jitter in the bboxes. This is mainly because our tracker doesn’t adjust its scale until it is re-associated with a detector bbox. That’s why the bbox sizes jump at every 4th frame.
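The "every 4th frame" pattern follows from the interval setting: the detector skips `interval` frames between runs, so it runs once every `interval + 1` frames. A small Python sketch, assuming inference starts on frame 0 (the `inference_frames` helper is just for illustration):

```python
from typing import List


def inference_frames(num_frames: int, interval: int) -> List[int]:
    """Frame indices on which the detector runs when `interval` frames are skipped."""
    return [f for f in range(num_frames) if f % (interval + 1) == 0]


print(inference_frames(12, 3))  # [0, 4, 8]
print(inference_frames(4, 0))   # [0, 1, 2, 3]
```

Only on those frames can a target be re-associated with a detector bbox, so that is when its scale gets corrected.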
This sample video is itself pretty tricky to track because both (1) the perspective changes of the targets and (2) the bbox size changes are severe as the objects come toward the camera.
However, we can observe that the IDs are quite stable and persist even in the interval=3 case.
Please let me know if you need any help from us.