Please provide complete information as applicable to your setup.
• Hardware Platform: Jetson Orin NX
• DeepStream Version: 7.0
• JetPack Version: 6.0
• Issue Type: bug
• How to reproduce: use a YOLO .onnx model that was functioning normally on DeepStream 6.4 and build an engine from it
• Requirement details: YOLOv4-tiny custom model
I’m encountering an issue with excessive false-positive detections after migrating my object detection pipeline from DeepStream 6.4 to DeepStream 7.0.
Background:
- I have a YOLOv4-tiny object detection model trained and exported using the NVIDIA TAO Toolkit.
- The exported ONNX model includes the `BatchedNMS` node, meaning Non-Maximum Suppression is performed within the model graph itself.
- This setup worked well on DeepStream 6.4, producing an expected number of detections.
Migration and Problem:
- I recently migrated my system to DeepStream 7.0 (and its corresponding TensorRT, CUDA versions).
- I am using the same `.onnx` model file as before.
- Since migrating to DS 7.0, the pipeline produces significantly more detections, including many clear false positives, even though the input video/images are the same.
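For completeness, this is how I am recording the exact library versions behind each install so the DS 6.4 and DS 7.0 environments can be compared side by side (a minimal sketch, assuming the standard TensorRT and onnx Python packages are installed on the Jetson):

```python
# Sketch: print the TensorRT and onnx versions used on each setup.
import tensorrt as trt
print("TensorRT:", trt.__version__)

try:
    import onnx
    print("onnx:", onnx.__version__)
except ImportError:
    pass
```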
Config file:
[property]
gpu-id=0
net-scale-factor=0.0039215686 # 1 does not change anything either
model-color-format=0 #0 RGB 1 BGR
labelfile-path=filter_classes.txt
model-engine-file=filter_recognition_epoch_80.onnx_b1_gpu0_fp32.engine
onnx-file=filter_recognition_epoch_80.onnx
infer-dims=3;416;416
maintain-aspect-ratio=1
uff-input-order=0
uff-input-blob-name=Input
batch-size=1
network-mode=0
num-detected-classes=2
interval=0
gie-unique-id=1
is-classifier=0
cluster-mode=3
output-blob-names=BatchedNMS
parse-bbox-func-name=NvDsInferParseCustomBatchedNMSTLT
#custom-lib-path=libnvds_infercustomparser_tao.so
custom-lib-path=libnvds_infercustomparser.so
[class-attrs-all]
#pre-cluster-threshold=0.5 # (it is already in the onnx? commenting it out or changing it has no effect)
roi-top-offset=0
roi-bottom-offset=0
detected-min-w=0
detected-min-h=0
detected-max-w=0
detected-max-h=0
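To quantify the regression rather than eyeballing it, I attach a pad probe to the PGIE source pad and log the detection count and confidences per frame under both DeepStream versions. This is a sketch using the DeepStream Python bindings (`pyds`) in a small test app; the element name `pgie` in the usage comment is a placeholder.

```python
# Sketch: a GStreamer pad probe that counts detections and prints their
# confidences per frame, using the DeepStream Python bindings (pyds).
# Attach it to the source pad of the nvinfer (PGIE) element.
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst
import pyds

def pgie_src_pad_probe(pad, info, u_data):
    gst_buffer = info.get_buffer()
    if not gst_buffer:
        return Gst.PadProbeReturn.OK

    batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))
    l_frame = batch_meta.frame_meta_list
    while l_frame is not None:
        frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
        confidences = []
        l_obj = frame_meta.obj_meta_list
        while l_obj is not None:
            obj_meta = pyds.NvDsObjectMeta.cast(l_obj.data)
            confidences.append(round(obj_meta.confidence, 3))
            try:
                l_obj = l_obj.next
            except StopIteration:
                break
        print(f"frame {frame_meta.frame_num}: {len(confidences)} detections {confidences}")
        try:
            l_frame = l_frame.next
        except StopIteration:
            break
    return Gst.PadProbeReturn.OK

# Usage (placeholder element name "pgie"):
#   pgie.get_static_pad("src").add_probe(Gst.PadProbeType.BUFFER, pgie_src_pad_probe, 0)
```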
Troubleshooting Steps Taken:
- Custom parser: I have successfully rebuilt the TAO custom parser library (`libnvds_infercustomparser_tao.so` or equivalent) against the DeepStream 7.0 SDK. Using `nm -D` I verified that the required `NvDsInferParseCustomBatchedNMSTLT` function exists in the compiled library being used.
- `pre-cluster-threshold`: I tested setting the `pre-cluster-threshold` value in the `[class-attrs-all]` section of the config file (e.g., to 0.5). Commenting out or changing this value has no apparent effect on the number of output detections. This strongly suggests the effective confidence threshold is determined internally by the `BatchedNMS` node in the model.
- `net-scale-factor`: I tested both `net-scale-factor=1` and `net-scale-factor=0.0039215686` (after verifying my model's expected input). While using the correct value is important, changing it did not resolve the excessive detection issue. (I also deleted the `.engine` file after changing it.)
- Engine regeneration: I have deleted the TensorRT `.engine` file multiple times to ensure it is regenerated cleanly by DeepStream 7.0's version of TensorRT from the original `.onnx` file. This did not change the outcome.
- Parser libraries: I rebuilt `libnvds_infercustomparser.so` and also `libnvds_infercustomparser_tao.so`, without success.
Hypothesis & Questions:
I think that differences or behavior changes in the `BatchedNMS` implementation in the newer TensorRT version (used by DS 7.0), compared to the older one (used by DS 6.4), are causing more detections to pass the fixed confidence threshold embedded within the ONNX model's `BatchedNMS` node. Could that be right?
- Controlling the internal threshold: How can I effectively control or increase the confidence threshold that is applied internally by the `BatchedNMS` node when exporting the model from TAO Toolkit? What specific parameters should I look for in the `tao export` (or possibly `tao prune`/`tao deploy`) command or spec file? (As a possible stopgap, see the sketch further below for patching the threshold directly in the ONNX graph.)
- TensorRT behavior changes: Are there any known changes in TensorRT between the versions used by DS 6.4 and DS 7.0 regarding the `BatchedNMS` operation (or general layer precision/numerics) that might explain why a previously embedded threshold now lets more detections through?
- Other DS 7.0 factors: Are there any other DeepStream 7.0 `nvinfer` configuration parameters or parser behaviors I might be overlooking that could influence the final detections when using a model with internal NMS?
Is there a need to retrain or re-export the YOLOv4-tiny ONNX model specifically for DeepStream 7.0?
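In case it helps the discussion, this is the workaround I am considering while waiting for a proper TAO-side answer: editing the NMS node's score threshold directly in the ONNX file and letting DeepStream 7.0 rebuild the engine from the patched model. The attribute name `scoreThreshold` follows the TensorRT BatchedNMS plugin convention and is an assumption; please correct me if the TAO export stores it differently.

```python
# Sketch: raise the score threshold embedded in the BatchedNMS node,
# save a patched .onnx, and let DeepStream 7.0 rebuild the engine.
# Attribute name "scoreThreshold" is assumed (TensorRT BatchedNMS plugin
# convention); verify it with the inspection snippet earlier in this post.
import onnx

model = onnx.load("filter_recognition_epoch_80.onnx")

for node in model.graph.node:
    if "BatchedNMS" in node.op_type or "BatchedNMS" in node.name:
        for attr in node.attribute:
            if attr.name == "scoreThreshold":
                print(f"{node.name}: scoreThreshold {attr.f} -> 0.5")
                attr.f = 0.5  # new confidence threshold to try

onnx.save(model, "filter_recognition_epoch_80_thr05.onnx")
# Point onnx-file= at the patched model and delete the old .engine file
# so nvinfer regenerates it.
```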
Any guidance on how to adjust the internal NMS threshold during TAO export or other potential solutions would be greatly appreciated!
Thanks!