Please provide complete information as applicable to your setup.
• Hardware Platform (Jetson / GPU) T4 GPU
• DeepStream Version 7.1
• TensorRT Version 10.7
• NVIDIA GPU Driver Version (valid for GPU only) 535.183.01
• Issue Type( questions, new requirements, bugs) Bugs
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing)
I am encountering an accuracy discrepancy when using a custom-trained YOLOv6s model in a DeepStream pipeline. Specifically, the detection results differ depending on whether the rescaling to the network input size is performed by nvstreammux or during the preprocessing stage inside nvinfer. When the rescaling is performed by nvstreammux, the detection drops are no longer observed, whereas when it is performed by nvinfer I can see failed detections.
Source - 1280x720 PNG images
Model - YOLOv6s (640x640)
# Scaling with nvstreammux
gst-launch-1.0 multifilesrc start-index=1 location=dt_issue%d.png ! pngdec ! videorate ! "video/x-raw,framerate=10/1" ! videoconvert ! nvvideoconvert ! \
capsfilter caps="video/x-raw(memory:NVMM), format=RGBA" ! m.sink_0 nvstreammux enable-padding=1 name=m width=640 height=640 batch-size=1 ! \
nvvideoconvert ! capsfilter caps="video/x-raw(memory:NVMM), format=RGBA" ! \
nvinfer config-file-path=/yolo/deepstream_cfg/yolov6_model_config_nvinfer.txt ! \
nvvideoconvert ! nvdsosd ! nvmultistreamtiler width=640 height=640 ! nvvideoconvert ! nvv4l2h264enc bitrate=400000 \
! h264parse ! mp4mux ! filesink location=output.mp4
# Scaling with nvinfer
gst-launch-1.0 multifilesrc start-index=1 location=dt_issue%d.png ! pngdec ! videorate ! "video/x-raw,framerate=10/1" ! videoconvert ! nvvideoconvert ! \
capsfilter caps="video/x-raw(memory:NVMM), format=RGBA" ! m.sink_0 nvstreammux enable-padding=1 name=m width=1280 height=720 batch-size=1 ! \
nvvideoconvert ! capsfilter caps="video/x-raw(memory:NVMM), format=RGBA" ! \
nvinfer config-file-path=/yolo/deepstream_cfg/yolov6_model_config_nvinfer.txt ! \
nvvideoconvert ! nvdsosd ! nvmultistreamtiler width=1280 height=720 ! nvvideoconvert ! nvv4l2h264enc bitrate=400000 \
! h264parse ! mp4mux ! filesink location=output.mp4
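For reference, here is a minimal sketch (my own arithmetic, not DeepStream source code) of the letterbox geometry that should apply when a 1280x720 frame is scaled to the 640x640 network input with aspect ratio maintained and symmetric padding, which is what both pipelines are asked to do (via enable-padding=1 on nvstreammux in the first case, and maintain-aspect-ratio=1 / symmetric-padding=1 in nvinfer in the second):

```python
# Letterbox geometry for aspect-preserving scaling with symmetric padding.
# Sketch of the expected math only; not taken from DeepStream code.

def letterbox(src_w, src_h, net_w, net_h):
    scale = min(net_w / src_w, net_h / src_h)
    new_w, new_h = round(src_w * scale), round(src_h * scale)
    pad_x = (net_w - new_w) // 2   # symmetric left/right padding
    pad_y = (net_h - new_h) // 2   # symmetric top/bottom padding
    return scale, new_w, new_h, pad_x, pad_y

print(letterbox(1280, 720, 640, 640))  # -> (0.5, 640, 360, 0, 140)
```

If both elements implement this same mapping, the tensor reaching the engine should be identical up to the interpolation filter; a difference in padding placement or scaling filter between the two paths would be one explanation for the accuracy gap.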
# Configuration file (yolov6_model_config_nvinfer.txt)
[property]
gpu-id=0
net-scale-factor=0.0039215697906911373
model-color-format=0
onnx-file=/yolo/deepstream_cfg/yolov6s_vehicle_2024_3_fp32_default.pt.onnx
model-engine-file=/yolo/deepstream_cfg/model_b1_gpu0_fp32.engine
#int8-calib-file=calib.table
labelfile-path=/yolo/deepstream_cfg/labels_yolov6_vehicle.txt
batch-size=1
# 0=FP32, 1=INT8, 2=FP16
network-mode=0
num-detected-classes=1
interval=0
gie-unique-id=1
scaling-filter=1
process-mode=1
network-type=0
cluster-mode=2
maintain-aspect-ratio=1
symmetric-padding=1
workspace-size=2000
parse-bbox-func-name=NvDsInferParseYolo
#parse-bbox-func-name=NvDsInferParseYoloCuda
custom-lib-path=/yolo/deepstream_cfg/libnvdsinfer_custom_impl_Yolo.so
engine-create-func-name=NvDsInferYoloCudaEngineGet
[class-attrs-all]
nms-iou-threshold=0.45
pre-cluster-threshold=0.25
#post-cluster-threshold=0.4
topk=300
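With maintain-aspect-ratio=1, the parser's output boxes are in 640x640 letterboxed network coordinates and must be mapped back to the 1280x720 frame. A hedged sketch of that inverse mapping (hypothetical helper of mine, not part of the DeepStream API or the custom parser library):

```python
# Map a box (x1, y1, x2, y2) from 640x640 letterboxed network coordinates
# back to the original frame. Assumes the symmetric-padding letterbox
# described above (scale 0.5, 140 px pad top and bottom for 1280x720).

def unletterbox(box, src_w=1280, src_h=720, net_w=640, net_h=640):
    scale = min(net_w / src_w, net_h / src_h)
    pad_x = (net_w - round(src_w * scale)) // 2
    pad_y = (net_h - round(src_h * scale)) // 2
    return tuple((v - p) / scale for v, p in zip(box, (pad_x, pad_y, pad_x, pad_y)))

print(unletterbox((0, 140, 640, 500)))  # -> (0.0, 0.0, 1280.0, 720.0)
```

A quick sanity check: the network canvas minus the padding should map back exactly to the full frame. If boxes from the nvinfer-scaling pipeline come out shifted by roughly 140 px vertically or off by a factor of two, the inverse mapping is the place to look.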
Inference result [left: scaling by nvstreammux (mux at 640x640); right: scaling by nvinfer (mux at 1280x720)]
References
• YOLOv6
• DeepStream Postprocessing