Please provide complete information as applicable to your setup.
• Hardware Platform (Jetson / GPU) Jetson Orin NX (8GB)
• DeepStream Version 6.3
• JetPack Version (valid for Jetson only) 5.1.2
• TensorRT Version JP 5.1.2 standard (8.5?)
• NVIDIA GPU Driver Version (valid for GPU only) JP 5.1.2 standard
• Issue Type( questions, new requirements, bugs) Question
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing)
I am trying to create an INT8-optimized TensorRT engine from my YOLOv4 ONNX model for use with DeepStream-Yolo / nvinfer. I am using the same model file and the same calibration images as with DeepStream 6.0.0. I run the INT8 calibration inside deepstream-app as described here: DeepStream-Yolo/docs/YOLOv5.md at master · marcoslucianops/DeepStream-Yolo · GitHub (without the Ultralytics part at the beginning).
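For reference, the calibration image list is built with a small script roughly like the one below (a minimal Python sketch, equivalent to the shell loop in the DeepStream-Yolo INT8 instructions; the directory name and output file name are placeholders, not my exact paths):

# build_calib_list.py - write one absolute image path per line into the
# plain-text list file that the INT8 calibrator reads
from pathlib import Path

calib_dir = Path("calibration")        # placeholder: folder with representative JPEGs
out_file = Path("calibration.txt")     # placeholder: list file given to the calibrator

with out_file.open("w") as f:
    for img in sorted(calib_dir.glob("*.jpg")):
        f.write(f"{img.resolve()}\n")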
With DS 6.0 on a Jetson NX (not Orin) this resulted in a model with 94% recall over my test dataset. With DS 6.3 on the Orin NX I only get 87% (I have tested many calibration runs with different parameters).
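For context, the recall numbers come from greedily matching detections to ground-truth boxes per class at a fixed IoU threshold; a minimal sketch of that per-image evaluation (function names and the 0.5 IoU threshold are illustrative, not my actual test harness):

# recall = TP / (TP + FN), greedy per-class IoU matching for one image
def iou(a, b):
    # boxes as (x1, y1, x2, y2)
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def image_recall(gt_boxes, det_boxes, iou_thr=0.5):
    # gt_boxes / det_boxes: lists of (class_id, x1, y1, x2, y2)
    tp, matched = 0, set()
    for cls, *g in gt_boxes:
        for i, (dcls, *d) in enumerate(det_boxes):
            if i in matched or dcls != cls:
                continue
            if iou(g, d) >= iou_thr:
                tp += 1
                matched.add(i)
                break
    return tp / len(gt_boxes) if gt_boxes else 1.0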
• Requirement details( This is for new requirement. Including the module name-for which plugin or for which sample application, the function description)
config_nvinfer_primary.txt
[property]
gpu-id=0
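# net-scale-factor is ~1/255, i.e. 8-bit pixel values normalized to [0,1]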
net-scale-factor=0.0039215697906911373
model-color-format=0
custom-network-config=yolov4.cfg
model-file=yolov4.weights
model-engine-file=model_b1_gpu0_int8.engine
int8-calib-file=calib.table
labelfile-path=dc_vehicles.training.names
batch-size=1
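# network-mode: 0=FP32, 1=INT8, 2=FP16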
network-mode=1
num-detected-classes=14
interval=0
gie-unique-id=1
process-mode=1
network-type=0
cluster-mode=2
maintain-aspect-ratio=0
symmetric-padding=1
force-implicit-batch-dim=0
workspace-size=6000
#parse-bbox-func-name=NvDsInferParseYolo
parse-bbox-func-name=NvDsInferParseYoloCuda
custom-lib-path=nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
engine-create-func-name=NvDsInferYoloCudaEngineGet
# DLA does not make sense for us with JP 5.1.2
# because our model requires a lot of GPU fallbacks, reducing speed by 40%.
#enable-dla=1
#use-dla-core=0
#gpu-fallback=1
[class-attrs-all]
nms-iou-threshold=0.3
pre-cluster-threshold=0.3
topk=300
Is this something others have experienced as well? This reduction in detection rate makes it a problem for us to upgrade to JetPack 5.