Enable-batch-process=1 slows down the NvDCF tracker

fre_deric · June 17, 2021, 3:14pm

Description

TensorRT Version : 7.2.3
GPU Type : RTX 3090 24 GB
Nvidia Driver Version : 460.73.01
CUDA Version : 11.1
cuDNN Version : 8.1.1
Operating System + Version : Ubuntu 18.04

Description

From the NvDCF Low-Level Tracker documentation I understood that enabling batch processing (enable-batch-process=1) should speed up the tracking algorithm. However, in my case it does the opposite. With the same settings (same detection model, same config files and same input video file), NvDCF is slower with enable-batch-process=1.

Without batch-processing:
enable-batch-process=0
enable-past-frame=0
GPU utilization is ~80%

**PERF:  FPS 0 (Avg)	FPS 1 (Avg)	FPS 2 (Avg)	FPS 3 (Avg)	FPS 4 (Avg)	FPS 5 (Avg)	FPS 6 (Avg)	FPS 7 (Avg)	
**PERF:  37.02 (36.90)	34.60 (34.50)	34.60 (34.50)	41.80 (41.64)	37.02 (36.90)	41.80 (41.64)	41.80 (41.64)	34.60 (34.50)	
**PERF:  37.45 (37.25)	37.45 (36.51)	37.45 (36.51)	37.45 (38.49)	37.45 (37.25)	37.45 (38.49)	37.45 (38.49)	37.45 (36.51)	
**PERF:  35.56 (36.56)	35.56 (36.13)	35.56 (36.13)	35.56 (37.25)	35.56 (36.56)	35.56 (37.25)	35.56 (37.25)	35.56 (36.13)	
**PERF:  33.15 (35.57)	33.15 (35.28)	33.15 (35.28)	33.15 (36.04)	33.15 (35.57)	33.15 (36.04)	33.15 (36.04)	33.15 (35.28)	
**PERF:  34.82 (35.40)	34.82 (35.17)	34.82 (35.17)	34.82 (35.75)	34.82 (35.40)	34.82 (35.75)	34.82 (35.75)	34.82 (35.17)	
**PERF:  31.97 (34.77)	31.97 (34.59)	31.97 (34.59)	31.97 (35.05)	31.97 (34.77)	31.97 (35.05)	31.97 (35.05)	31.97 (34.59)	
**PERF:  35.30 (34.83)	35.30 (34.68)	35.30 (34.68)	35.30 (35.07)	35.30 (34.83)	35.30 (35.07)	35.30 (35.07)	35.30 (34.68)	
**PERF:  36.00 (34.99)	36.00 (34.86)	36.00 (34.86)	36.00 (35.20)	36.00 (34.99)	36.00 (35.20)	36.00 (35.20)	36.00 (34.86)	
**PERF:  35.01 (34.99)	35.01 (34.87)	35.01 (34.87)	35.01 (35.17)	35.01 (34.99)	35.01 (35.17)	35.01 (35.17)	35.01 (34.87)	
**PERF:  34.38 (34.93)	34.38 (34.82)	34.38 (34.82)	34.38 (35.09)	34.38 (34.93)	34.38 (35.09)	34.38 (35.09)	34.38 (34.82)

With batch-processing:
enable-batch-process=1
enable-past-frame=1
GPU utilization is only ~15%

**PERF:  FPS 0 (Avg)	FPS 1 (Avg)	FPS 2 (Avg)	FPS 3 (Avg)	FPS 4 (Avg)	FPS 5 (Avg)	FPS 6 (Avg)	FPS 7 (Avg)	
**PERF:  11.52 (10.94)	11.52 (10.94)	11.47 (10.83)	11.52 (10.94)	11.52 (10.94)	11.47 (10.83)	11.53 (10.97)	11.52 (10.94)	
**PERF:  8.34 (9.05)	8.34 (9.05)	8.34 (8.98)	8.34 (9.05)	8.34 (9.05)	8.34 (8.98)	8.34 (9.09)	8.34 (9.05)	
**PERF:  7.39 (8.35)	7.39 (8.35)	7.39 (8.30)	7.39 (8.35)	7.39 (8.35)	7.39 (8.30)	7.39 (8.37)	7.39 (8.35)	
**PERF:  7.16 (8.01)	7.16 (8.01)	7.16 (7.97)	7.16 (8.01)	7.16 (8.01)	7.16 (7.97)	7.16 (8.02)	7.16 (8.01)	
**PERF:  6.71 (7.68)	6.71 (7.68)	6.71 (7.65)	6.71 (7.68)	6.71 (7.68)	6.71 (7.65)	6.71 (7.70)	6.71 (7.68)	
**PERF:  7.23 (7.59)	7.23 (7.59)	7.23 (7.56)	7.23 (7.59)	7.23 (7.59)	7.23 (7.56)	7.23 (7.60)	7.23 (7.59)	
**PERF:  6.99 (7.50)	6.99 (7.50)	6.99 (7.47)	6.99 (7.50)	6.99 (7.50)	6.99 (7.47)	6.99 (7.51)	6.99 (7.50)	
**PERF:  6.87 (7.43)	6.87 (7.43)	6.87 (7.41)	6.87 (7.43)	6.87 (7.43)	6.87 (7.41)	6.87 (7.44)	6.87 (7.43)	
**PERF:  6.56 (7.31)	6.56 (7.31)	6.56 (7.29)	6.56 (7.31)	6.56 (7.31)	6.56 (7.29)	6.56 (7.32)	6.56 (7.31)

Is it true that ‘enable-batch-process=1’ should speed up the tracking? Did I miss anything? Does the NvDCF tracker need to be recompiled for a specific batch size?

deepstream_app_config:

[application]
enable-perf-measurement=1
perf-measurement-interval-sec=5

[source0]
enable=1
type=3
uri=file:///home/video.mp4
num-sources=8
gpu-id=0
cudadec-memtype=0

[sink0]
enable=1
#1: Fakesink 2: EGL based windowed sink (nveglglessink)
type=1
sync=0
source-id=0
gpu-id=0
nvbuf-memory-type=0

[streammux]
gpu-id=0
live-source=0
batch-size=8
batched-push-timeout=33333
width=1920
height=1080
enable-padding=0
nvbuf-memory-type=0

[primary-gie]
enable=1
gpu-id=0
gie-unique-id=1
nvbuf-memory-type=0
config-file=/opt/nvidia/deepstream/deepstream-5.1/sources/yolov4-csp/config_infer_primary.txt

[tracker]
enable=1
tracker-width=640
tracker-height=384
# NvDCF tracker
ll-lib-file=/opt/nvidia/deepstream/deepstream-5.1/lib/libnvds_nvdcf.so
ll-config-file=nvdcf_tracker_config.yml
gpu-id=0
# Enable
enable-batch-process=1
enable-past-frame=1
# Disable
#enable-batch-process=0
#enable-past-frame=0
display-tracking-id=1

nvdcf_tracker_config.yml:

NvDCF:
  # [General]
  useUniqueID: 0    # Use 64-bit long Unique ID when assigning tracker ID. Default is [true]
  maxTargetsPerStream: 200 # Max number of targets to track per stream. Recommended to set >10. Note: this value should account for the targets being tracked in shadow mode as well. Max value depends on the GPU memory capacity

  # [Feature Extraction]
  useColorNames: 1     # Use ColorNames feature
  useHog: 1            # Use Histogram-of-Oriented-Gradient (HOG) feature
  useHighPrecisionFeature: 1   # Use high-precision in feature extraction. Default is [true]

  # [DCF]
  filterLr: 0.15 # learning rate for DCF filter in exponential moving average. Valid Range: [0.0, 1.0]
  filterChannelWeightsLr: 0.22 # learning rate for the channel weights among feature channels. Valid Range: [0.0, 1.0]
  gaussianSigma: 0.75 # Standard deviation for Gaussian for desired response when creating DCF filter [pixels]
  featureImgSizeLevel: 5 # Size of a feature image. Valid range: {1, 2, 3, 4, 5}, from the smallest to the largest
  SearchRegionPaddingScale: 3 # Search region size. Determines how large the search region should be scaled from the target bbox.  Valid range: {1, 2, 3}, from the smallest to the largest

  # [MOT] [False Alarm Handling]
  maxShadowTrackingAge: 30  # Max length of shadow tracking (the shadow tracking age is incremented when (1) there's detector input yet no match or (2) tracker confidence is lower than minTrackerConfidence). Once reached, the tracker will be terminated.
  probationAge: 3           # Once the tracker age (incremented at every frame) reaches this, the tracker is considered to be valid
  earlyTerminationAge: 1    # Early termination age (in terms of shadow tracking age) during the probation period. If reached during the probation period, the tracker will be terminated prematurely.

  # [Tracker Creation Policy] [Target Candidacy]
  minDetectorConfidence: -1  # If the confidence of a detector bbox is lower than this, then it won't be considered for tracking
  minTrackerConfidence: 0.7  # If the confidence of an object tracker is lower than this on the fly, then it will be tracked in shadow mode. Valid Range: [0.0, 1.0]
  minTargetBboxSize: 10      # If the width or height of the bbox size gets smaller than this threshold, the target will be terminated.
  minDetectorBboxVisibilityTobeTracked: 0.0  # If the detector-provided bbox's visibility (i.e., IOU with image) is lower than this, it won't be considered.
  minVisibiilty4Tracking: 0.0  # If the visibility of the tracked object (i.e., IOU with image) is lower than this, it will be terminated immediately, assuming it is going out of scene.

  # [Tracker Termination Policy]
  targetDuplicateRunInterval: 5 # The interval in which the duplicate target detection removal is carried out. A Negative value indicates indefinite interval. Unit: [frames]
  minIou4TargetDuplicate: 0.9 # If the IOU of two target bboxes are higher than this, the newer target tracker will be terminated.

  # [Data Association] Matching method
  useGlobalMatching: 1   # If true, enable a global matching algorithm (i.e., Hungarian method). Otherwise, a greedy algorithm will be used.

  # [Data Association] Thresholds in matching scores to be considered as a valid candidate for matching
  minMatchingScore4Overall: 0.0   # Min total score
  minMatchingScore4SizeSimilarity: 0.5    # Min bbox size similarity score
  minMatchingScore4Iou: 0.1       # Min IOU score
  minMatchingScore4VisualSimilarity: 0.2    # Min visual similarity score
  minTrackingConfidenceDuringInactive: 1.0  # Min tracking confidence during INACTIVE period. If tracking confidence is higher than this, then tracker will still output results until next detection 

  # [Data Association] Weights for each matching score term
  matchingScoreWeight4VisualSimilarity: 0.8  # Weight for the visual similarity (in terms of correlation response ratio)
  matchingScoreWeight4SizeSimilarity: 0.0    # Weight for the Size-similarity score
  matchingScoreWeight4Iou: 0.1               # Weight for the IOU score
  matchingScoreWeight4Age: 0.1               # Weight for the tracker age

  # [State Estimator]
  useTrackSmoothing: 1    # Use a state estimator
  stateEstimatorType: 2   # The type of state estimator among { moving_avg:1, kalman_filter:2 }

  # [State Estimator] [MovingAvgEstimator]
  trackExponentialSmoothingLr_loc: 0.5       # Learning rate for new location
  trackExponentialSmoothingLr_scale: 0.3     # Learning rate for new scale
  trackExponentialSmoothingLr_velocity: 0.05  # Learning rate for new velocity

  # [State Estimator] [Kalman Filter]
  kfProcessNoiseVar4Loc: 0.1   # Process noise variance for location in Kalman filter
  kfProcessNoiseVar4Scale: 0.04   # Process noise variance for scale in Kalman filter
  kfProcessNoiseVar4Vel: 0.04   # Process noise variance for velocity in Kalman filter
  kfMeasurementNoiseVar4Trk: 9   # Measurement noise variance for tracker's detection in Kalman filter
  kfMeasurementNoiseVar4Det: 9   # Measurement noise variance for detector's detection in Kalman filter

  # [Past-frame Data]
  useBufferedOutput: 1   # Enable storing of past-frame data in a buffer and report it back

  # [Instance-awareness]
  useInstanceAwareness: 1 # Use instance-awareness for multi-object tracking
  lambda_ia: 2            # Regularization factor for each instance
  maxInstanceNum_ia: 4    # The number of nearby object instances to use for instance-awareness

bcao · June 21, 2021, 7:49am

Hey customer, what’s your pipeline and could you share a simple repro?

fre_deric · June 21, 2021, 10:25am

Hi @bcao,
thank you for your time!

According to the deepstream_app_config that I posted above, the pipeline is:
video decoding → streammux → nvinfer → fakesink

The config_infer_primary.txt is here:

[property]
gpu-id=0
net-scale-factor=0.0039215697906911373
model-color-format=0
custom-network-config=/opt/nvidia/deepstream/deepstream-5.1/sources/yolov4/yolov4-csp/my_weights/yolov4-csp.cfg
model-file=/opt/nvidia/deepstream/deepstream-5.1/sources/yolov4/yolov4-csp/my_weights/yolov4-csp_best.weights
model-engine-file=/opt/nvidia/deepstream/deepstream-5.1/sources/yolov4/yolov4-csp/my_weights/model_b8_gpu0_fp16.engine
labelfile-path=/opt/nvidia/deepstream/deepstream-5.1/sources/yolov4/yolov4-csp/my_weights/obj.names
batch-size=8
# 0=FP32, 1=INT8, 2=FP16 mode
network-mode=2
num-detected-classes=1
interval=0
gie-unique-id=1
# Infer Processing Mode 1=Primary Mode 2=Secondary Mode
process-mode=1
# Integer 0: Detector 1: Classifier 2: Segmentation 3: Instance Segmentation
network-type=0
# Integer 0: OpenCV groupRectangles() 1: DBSCAN 2: Non Maximum Suppression 3: DBSCAN + NMS Hybrid 4: No clustering
cluster-mode=4
maintain-aspect-ratio=0
parse-bbox-func-name=NvDsInferParseYolo
custom-lib-path=/opt/nvidia/deepstream/deepstream-5.1/sources/yolov4/nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
engine-create-func-name=NvDsInferYoloCudaEngineGet

[class-attrs-all]
pre-cluster-threshold=0.25

What else should I post here? (The deepstream_app_config and nvdcf_tracker_config.yml files are above.)

Thank you!

fre_deric · June 25, 2021, 12:49pm

@bcao, I mistakenly wrote the pipeline without the tracker. The tracker is of course part of it, as you can see in the deepstream_app_config.

The pipeline is:
video decoding → streammux → nvinfer → nvtracker → fakesink

Do you have any idea why the ‘enable-batch-process=1’ slows down the pipeline?

bcao · June 29, 2021, 4:55am

Sorry for the late reply. I mean, can you share a simple repro with us, maybe a gst-launch-1.0 command with your config file, or a simple app that reproduces the issue? That way we will be on the same page. Would you mind doing that?
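One way to make such a repro easy to share is to generate the two config variants from a single base file, so that only the tracker flags differ between runs. The following is a minimal sketch, assuming the config is in the deepstream-app key-file format shown earlier in the thread; the helper name `make_variant` and the trimmed `BASE` text are hypothetical. It takes the two flags as separate arguments so they can also be varied independently. Note that `configparser` drops `#` comment lines when writing.

```python
import configparser
import io

# Trimmed copy of the [tracker] section posted above (illustration only).
BASE = """\
[tracker]
enable=1
tracker-width=640
tracker-height=384
ll-lib-file=/opt/nvidia/deepstream/deepstream-5.1/lib/libnvds_nvdcf.so
ll-config-file=nvdcf_tracker_config.yml
gpu-id=0
enable-batch-process=1
enable-past-frame=1
display-tracking-id=1
"""


def make_variant(config_text: str, batch_process: int, past_frame: int) -> str:
    """Return the config text with the two [tracker] flags set as requested."""
    # Only "=" is a delimiter, so values such as file:/// URIs stay intact.
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    parser.read_string(config_text)
    parser["tracker"]["enable-batch-process"] = str(batch_process)
    parser["tracker"]["enable-past-frame"] = str(past_frame)
    out = io.StringIO()
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()


# One variant per setting; every other key is identical between them.
variants = {flag: make_variant(BASE, flag, flag) for flag in (0, 1)}
print(variants[0])
```

Each variant can then be written to its own file and run with `deepstream-app -c <file>`, which keeps the comparison limited to the flags under test.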

fre_deric · June 29, 2021, 12:08pm

Hi @bcao,
thank you for your reply!

I think I found the issue.

In my nvdcf_tracker_config.yml file, I use all visual feature types (useColorNames=1, useHog=1, useHighPrecisionFeature=1)

fre_deric:

  # [Feature Extraction]
  useColorNames: 1     # Use ColorNames feature
  useHog: 1            # Use Histogram-of-Oriented-Gradient (HOG) feature
  useHighPrecisionFeature: 1   # Use high-precision in feature extraction. Default is [true]

When useHog is enabled, tracking is indeed slower with ‘enable-batch-process=1’ than with ‘enable-batch-process=0’.

For demonstration, I took the default NvDCF tracker config file stored at:
/opt/nvidia/deepstream/deepstream-5.1/samples/configs/deepstream-app/tracker_config.yml
and enabled only useHog:

  # [Feature Extraction]
  useColorNames: 0     # Use ColorNames feature
  useHog: 1            # Use Histogram-of-Oriented-Gradient (HOG) feature
  useHighPrecisionFeature: 0   # Use high-precision in feature extraction. Default is [true]

With ‘enable-batch-process=0’, the average FPS for each stream is around 68 FPS:

[NvDCF] Initialized
[NvDCF] Initialized
[NvDCF] Initialized
[NvDCF] Initialized
[NvDCF] Initialized
[NvDCF] Initialized
[NvDCF] Initialized
[NvDCF] Initialized
**PERF:  FPS 0 (Avg)	FPS 1 (Avg)	FPS 2 (Avg)	FPS 3 (Avg)	FPS 4 (Avg)	FPS 5 (Avg)	FPS 6 (Avg)	FPS 7 (Avg)	
**PERF:  0.00 (0.00)	0.00 (0.00)	0.00 (0.00)	0.00 (0.00)	0.00 (0.00)	0.00 (0.00)	0.00 (0.00)	0.00 (0.00)
**PERF:  56.22 (55.85)	56.22 (55.85)	74.13 (73.37)	74.13 (73.37)	59.76 (59.33)	56.22 (55.85)	74.13 (73.37)	56.22 (55.85)	
**PERF:  69.10 (64.80)	69.10 (64.80)	69.10 (69.98)	69.10 (69.98)	69.10 (66.11)	69.10 (64.80)	69.10 (69.98)	69.10 (64.80)	
**PERF:  71.22 (67.31)	71.22 (67.31)	71.22 (70.42)	71.22 (70.42)	71.22 (68.12)	71.22 (67.31)	71.22 (70.42)	71.22 (67.31)	
**PERF:  70.38 (68.07)	70.38 (68.07)	70.38 (70.28)	70.38 (70.28)	70.38 (68.66)	70.38 (68.07)	70.38 (70.28)	70.38 (68.07)	
**PERF:  70.30 (68.72)	70.30 (68.72)	70.30 (70.44)	70.30 (70.44)	70.30 (69.18)	70.30 (68.72)	70.30 (70.44)	70.30 (68.72)	
**PERF:  68.30 (68.58)	68.30 (68.58)	68.30 (69.97)	68.30 (69.97)	68.30 (68.95)	68.30 (68.58)	68.30 (69.97)	68.30 (68.58)	
**PERF:  68.77 (68.64)	68.77 (68.64)	68.77 (69.81)	68.77 (69.81)	68.77 (68.96)	68.77 (68.64)	68.77 (69.81)	68.77 (68.64)	
**PERF:  68.78 (68.42)	68.78 (68.42)	68.78 (69.42)	68.78 (69.42)	68.78 (68.69)	68.78 (68.42)	68.78 (69.42)	68.78 (68.42)	
**PERF:  67.55 (68.49)	67.55 (68.49)	67.55 (69.37)	67.55 (69.37)	67.55 (68.73)	67.55 (68.49)	67.55 (69.37)	67.55 (68.49)	
**PERF:  66.21 (68.33)	66.21 (68.33)	66.21 (69.11)	66.21 (69.11)	66.21 (68.54)	66.21 (68.33)	66.21 (69.11)	66.21 (68.33)	
**PERF:  69.03 (68.29)	69.03 (68.29)	69.03 (69.00)	69.03 (69.00)	69.03 (68.49)	69.03 (68.29)	69.03 (69.00)	69.03 (68.29)	
**PERF:  67.94 (68.27)	67.94 (68.27)	67.94 (68.91)	67.94 (68.91)	67.94 (68.44)	67.94 (68.27)	67.94 (68.91)	67.94 (68.27)	
**PERF:  67.91 (68.24)	67.91 (68.24)	67.91 (68.84)	67.91 (68.84)	67.91 (68.40)	67.91 (68.24)	67.91 (68.84)	67.91 (68.24)	
**PERF:  68.27 (68.29)	68.27 (68.29)	68.27 (68.84)	68.27 (68.84)	68.27 (68.44)	68.27 (68.29)	68.27 (68.84)	68.27 (68.29)	
**PERF:  68.43 (68.27)	68.43 (68.27)	68.43 (68.78)	68.43 (68.78)	68.43 (68.41)	68.43 (68.27)	68.43 (68.78)	68.43 (68.27)	
**PERF:  70.53 (68.38)	70.53 (68.38)	70.53 (68.85)	70.53 (68.85)	70.53 (68.51)	70.53 (68.38)	70.53 (68.85)	70.53 (68.38)	
**PERF:  70.41 (68.53)	70.41 (68.53)	70.41 (68.98)	70.41 (68.98)	70.41 (68.66)	70.41 (68.53)	70.41 (68.98)	70.41 (68.53)	
**PERF:  68.29 (68.56)	68.29 (68.56)	68.29 (68.98)	68.29 (68.98)	68.29 (68.68)	68.29 (68.56)	68.29 (68.98)	68.30 (68.56)	
**PERF:  69.57 (68.58)	69.57 (68.58)	69.57 (68.99)	69.57 (68.99)	69.57 (68.69)	69.57 (68.58)	69.57 (68.99)	69.57 (68.58)

With ‘enable-batch-process=1’, the average FPS for each stream is around 55 FPS:

**PERF:  FPS 0 (Avg)	FPS 1 (Avg)	FPS 2 (Avg)	FPS 3 (Avg)	FPS 4 (Avg)	FPS 5 (Avg)	FPS 6 (Avg)	FPS 7 (Avg)	
**PERF:  0.00 (0.00)	0.00 (0.00)	0.00 (0.00)	0.00 (0.00)	0.00 (0.00)	0.00 (0.00)	0.00 (0.00)	0.00 (0.00)
**PERF:  56.86 (55.50)	60.93 (59.24)	60.93 (59.24)	56.86 (55.50)	56.86 (55.50)	56.86 (55.50)	60.93 (59.24)	56.86 (55.50)	
**PERF:  60.37 (58.79)	60.37 (60.35)	60.37 (60.35)	60.37 (58.79)	60.37 (58.79)	60.37 (58.79)	60.37 (60.35)	60.37 (58.79)	
**PERF:  54.95 (57.36)	54.95 (58.26)	54.95 (58.26)	54.95 (57.36)	54.95 (57.36)	54.95 (57.36)	54.95 (58.26)	54.95 (57.36)	
**PERF:  57.48 (57.26)	57.48 (57.90)	57.48 (57.90)	57.48 (57.26)	57.48 (57.26)	57.48 (57.26)	57.48 (57.90)	57.48 (57.26)	
**PERF:  56.88 (57.19)	56.88 (57.70)	56.88 (57.70)	56.88 (57.19)	56.88 (57.19)	56.88 (57.19)	56.88 (57.70)	56.88 (57.19)	
**PERF:  53.33 (56.62)	53.33 (57.02)	53.33 (57.02)	53.33 (56.62)	53.33 (56.62)	53.33 (56.62)	53.33 (57.02)	53.33 (56.62)	
**PERF:  54.00 (56.23)	54.00 (56.56)	54.00 (56.56)	54.00 (56.23)	54.00 (56.23)	54.00 (56.23)	54.00 (56.56)	54.00 (56.23)	
**PERF:  55.47 (56.06)	55.47 (56.35)	55.47 (56.35)	55.47 (56.06)	55.47 (56.06)	55.47 (56.06)	55.47 (56.35)	55.47 (56.06)	
**PERF:  52.51 (55.71)	52.51 (55.96)	52.51 (55.96)	52.51 (55.71)	52.51 (55.71)	52.51 (55.71)	52.51 (55.96)	52.51 (55.71)	
**PERF:  50.27 (55.12)	50.27 (55.34)	50.27 (55.34)	50.27 (55.12)	50.27 (55.12)	50.27 (55.12)	50.27 (55.34)	50.27 (55.12)	
**PERF:  53.60 (54.92)	53.60 (55.11)	53.60 (55.11)	53.60 (54.92)	53.60 (54.92)	53.60 (54.92)	53.60 (55.11)	53.60 (54.92)	
**PERF:  53.95 (54.84)	53.95 (55.02)	53.95 (55.02)	53.95 (54.84)	53.95 (54.84)	53.95 (54.84)	53.95 (55.02)	53.95 (54.84)	
**PERF:  55.88 (54.93)	55.88 (55.09)	55.88 (55.09)	55.88 (54.93)	55.88 (54.93)	55.88 (54.93)	55.88 (55.09)	55.88 (54.93)	
**PERF:  55.28 (55.00)	55.28 (55.16)	55.28 (55.16)	55.28 (55.00)	55.28 (55.00)	55.28 (55.00)	55.28 (55.16)	55.28 (55.00)	
**PERF:  52.00 (54.80)	52.00 (54.94)	52.00 (54.94)	52.00 (54.80)	52.00 (54.80)	52.00 (54.80)	52.00 (54.94)	52.00 (54.80)	
**PERF:  53.54 (54.68)	53.54 (54.81)	53.54 (54.81)	53.54 (54.68)	53.54 (54.68)	53.54 (54.68)	53.54 (54.81)	53.54 (54.68)	
**PERF:  51.63 (54.52)	51.63 (54.64)	51.63 (54.64)	51.63 (54.52)	51.63 (54.52)	51.63 (54.52)	51.63 (54.64)	51.63 (54.52)	
**PERF:  52.63 (54.43)	52.63 (54.55)	52.63 (54.55)	52.63 (54.43)	52.63 (54.43)	52.63 (54.43)	52.63 (54.55)	52.63 (54.43)	
**PERF:  52.68 (54.30)	52.68 (54.41)	52.68 (54.41)	52.68 (54.30)	52.68 (54.30)	52.68 (54.30)	52.68 (54.41)	52.68 (54.30)

When useHog is disabled and, for example, only useColorNames and useHighPrecisionFeature are enabled:

  # [Feature Extraction]
  useColorNames: 1     # Use ColorNames feature
  useHog: 0            # Use Histogram-of-Oriented-Gradient (HOG) feature
  useHighPrecisionFeature: 1   # Use high-precision in feature extraction. Default is [true]

then the tracking algorithm reaches very similar FPS with both ‘enable-batch-process=1’ and ‘enable-batch-process=0’.
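The feature-by-feature isolation described here can be scripted so that every on/off combination of the three feature flags is covered. This is a minimal sketch, assuming the YAML layout of the tracker config shown in this thread; the helper names `set_flag` and `feature_matrix` and the output file names are hypothetical, and whether the tracker accepts every combination (for example, all features off) is not confirmed here.

```python
import itertools
import re

FEATURE_KEYS = ("useColorNames", "useHog", "useHighPrecisionFeature")

# Feature block as posted in the thread (illustration only).
SAMPLE = """\
NvDCF:
  # [Feature Extraction]
  useColorNames: 0     # Use ColorNames feature
  useHog: 1            # Use Histogram-of-Oriented-Gradient (HOG) feature
  useHighPrecisionFeature: 0   # Use high-precision in feature extraction. Default is [true]
"""


def set_flag(yaml_text: str, key: str, value: int) -> str:
    """Rewrite `key: <int>` in place, keeping indentation and trailing comment."""
    pattern = re.compile(rf"^([ \t]*{re.escape(key)}:[ \t]*)\d+", re.MULTILINE)
    new_text, count = pattern.subn(rf"\g<1>{value}", yaml_text)
    if count != 1:
        raise ValueError(f"expected exactly one '{key}' entry, found {count}")
    return new_text


def feature_matrix(yaml_text: str):
    """Yield (flags, text) for every on/off combination of FEATURE_KEYS."""
    for values in itertools.product((0, 1), repeat=len(FEATURE_KEYS)):
        text = yaml_text
        for key, value in zip(FEATURE_KEYS, values):
            text = set_flag(text, key, value)
        yield dict(zip(FEATURE_KEYS, values)), text


for flags, text in feature_matrix(SAMPLE):
    # Hypothetical naming scheme: one digit per feature flag.
    name = "tracker_" + "".join(str(flags[k]) for k in FEATURE_KEYS) + ".yml"
    print(name, flags)
```

Running each generated file once with ‘enable-batch-process=0’ and once with ‘enable-batch-process=1’ gives a full table of which feature settings are affected.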

So my conclusion is that the NvDCF tracker is slower with ‘enable-batch-process=1’ than with ‘enable-batch-process=0’ when useHog is enabled. Also, with useHog disabled, I did not see any speed-up from ‘enable-batch-process=1’ compared to ‘enable-batch-process=0’.
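To put a number on the difference, the `**PERF` output can be parsed and the running averages compared. The sketch below assumes the log layout shown in this thread (one `current (average)` pair per stream); the function name `final_avg_fps` is hypothetical, and the two sample logs contain only the last PERF line of each useHog run posted above.

```python
import re
from statistics import mean

# Matches one "current (average)" pair such as "68.27 (68.29)".
PAIR = re.compile(r"(\d+(?:\.\d+)?)\s*\((\d+(?:\.\d+)?)\)")

# Last PERF line of each useHog run, copied from the thread.
BATCH_OFF = """\
**PERF:  FPS 0 (Avg)  FPS 1 (Avg)  FPS 2 (Avg)  FPS 3 (Avg)  FPS 4 (Avg)  FPS 5 (Avg)  FPS 6 (Avg)  FPS 7 (Avg)
**PERF:  69.57 (68.58)  69.57 (68.58)  69.57 (68.99)  69.57 (68.99)  69.57 (68.69)  69.57 (68.58)  69.57 (68.99)  69.57 (68.58)
"""
BATCH_ON = """\
**PERF:  FPS 0 (Avg)  FPS 1 (Avg)  FPS 2 (Avg)  FPS 3 (Avg)  FPS 4 (Avg)  FPS 5 (Avg)  FPS 6 (Avg)  FPS 7 (Avg)
**PERF:  52.68 (54.30)  52.68 (54.41)  52.68 (54.41)  52.68 (54.30)  52.68 (54.30)  52.68 (54.30)  52.68 (54.41)  52.68 (54.30)
"""


def final_avg_fps(log_text: str) -> float:
    """Mean over streams of the running average on the last **PERF data line."""
    rows = []
    for line in log_text.splitlines():
        if not line.startswith("**PERF:"):
            continue
        pairs = PAIR.findall(line)
        if pairs:  # the header line contains no numeric pairs
            rows.append([float(avg) for _, avg in pairs])
    if not rows:
        raise ValueError("no **PERF data lines found")
    return mean(rows[-1])


off = final_avg_fps(BATCH_OFF)
on = final_avg_fps(BATCH_ON)
slowdown = 1.0 - on / off
print(f"batch off: {off:.2f} FPS, batch on: {on:.2f} FPS, drop: {slowdown:.1%}")
```

On these two lines the per-stream average falls from roughly 68.7 to 54.3 FPS, a drop of about one fifth, which matches the "around 68 FPS" and "around 55 FPS" figures quoted above.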

bcao · July 7, 2021, 1:14am

Great work

system · September 5, 2021, 1:15am

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.
