Error in output of YOLOv5 models when using Triton + DeepStream integration

Please provide complete information as applicable to your setup.

• GPU: 1650ti
• DeepStream Version: 6.2
• NVIDIA GPU Driver Version (valid for GPU only): Driver Version: 525.60.13 CUDA Version: 12.0
• Issue Type: question

When using DeepStream + Triton, I am able to run inference on YOLOv5 models, but the detection output is not correct: some objects are missed and the bounding-box locations are inaccurate. The results with plain DeepStream using the TensorRT backend are fine, but with Triton the outputs are wrong. NMS also does not seem to be working.

Below are the configuration files used:

deepstream_app_config.txt

[application]
enable-perf-measurement=1
perf-measurement-interval-sec=5

[tiled-display]
enable=0
rows=2
columns=2
width=1280
height=720

[source0]
enable=1
type=3
uri=file:///opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4
#uri=file:///opt/nvidia/deepstream/deepstream/samples/streams/sample_office.mp4
num-sources=1
gpu-id=0
cudadec-memtype=0

[sink0]
enable=1
#Type - 1=FakeSink 2=EglSink 3=File 4=RTSPStreaming 5=Overlay
type=3
output-file = …/umar/ds_out/out_12.mp4
container = 1
#1=h264 2=h265
codec=3
enc-type=0
sync=1
bitrate=4000000
#profile = 0
#nvbuf-memory-type=0

[osd]
enable=1
border-width=2
text-size=15
text-color=1;1;1;1;
text-bg-color=0.3;0.3;0.3;1
font=Serif
show-clock=0
clock-x-offset=800
clock-y-offset=820
clock-text-size=12
clock-color=1;0;0;0

[streammux]
##Boolean property to inform muxer that sources are live
live-source=1
batch-size=1
##time out in usec, to wait after the first buffer is available
##to push the batch even if the complete batch is not formed
batched-push-timeout=40000

## Set muxer output width and height
width=1280
height=720

# config-file property is mandatory for any gie section.
# Other properties are optional and if set will override the properties set in
# the infer config file.

[primary-gie]
enable=1
gpu-id=0
gie-unique-id=1
nvbuf-memory-type=0
config-file=config_yolov5_tis.txt
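# plugin-type=1 selects the Gst-nvinferserver (Triton) plugin instead of the default Gst-nvinfer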
plugin-type=1

config_yolov5_tis.txt

infer_config {
  unique_id: 1
  gpu_ids: [0]
  max_batch_size: 1
  backend {
    trt_is {
      model_name: "yolov5s"
      version: -1
      model_repo {
        root: "trtis_model_repo_v3"
        log_level: 2
      }
    }
  }
  preprocess {
    tensor_order: TENSOR_ORDER_LINEAR
  }
  postprocess {
    labelfile_path: "labels.txt"
    detection {
      num_detected_classes: 2
      custom_parse_bbox_func: "NvDsInferParseYolo"
      nms {
        confidence_threshold: 0.7
        iou_threshold: 0.25
        topk: 300
      }
    }
  }
  custom_lib {
    path: "nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so"
  }
}

config.pbtxt for the Triton model repo:

name: "yolov5s"
platform: "tensorrt_plan"
max_batch_size: 0
input [
  {
    name: "input"
    data_type: TYPE_FP32
    dims: [1, 3, 640, 640]
  }
]
output [
  {
    name: "boxes"
    data_type: TYPE_FP32
    dims: [1, 25200, 4]
  },
  {
    name: "scores"
    data_type: TYPE_FP32
    dims: [1, 25200, 1]
  },
  {
    name: "classes"
    data_type: TYPE_FP32
    dims: [1, 25200, 1]
  }
]
dynamic_batching { }
instance_group [
  {
    count: 1
    kind: KIND_GPU
  }
]

default_model_filename: "model_b1_gpu0_fp32.engine"
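
One thing worth noting about this config: Triton's dynamic batcher only applies to models with max_batch_size >= 1, so with max_batch_size: 0 the dynamic_batching { } block is at best a no-op, and some Triton versions may reject the combination at load time. Since the engine here was apparently built with an explicit batch dimension of 1 (the dims include the leading 1), a consistent sketch under that assumption simply drops the batcher:

# with max_batch_size: 0 the model does not support Triton-side batching,
# so no dynamic_batching block is needed
max_batch_size: 0
# ... input/output sections unchanged ...
instance_group [
  { count: 1, kind: KIND_GPU }
]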


What do you mean by the comment below?

Could you attach the resulting video?

Sure, here is the video link: out_13.mp4 - Google Drive

Note: My model has two output classes: person, head.

It is a fine-tuned YOLOv5 model trained on the CrowdHuman dataset.

Could you share the model with us? Or could you use our model to reproduce this problem?

Sure, here is the link to the model:

It contains both the ONNX file and the engine file. I have also included the labels file.

https://drive.google.com/drive/folders/1eauWNre53RPubpMqp9Z4H_U8C9IR79m5?usp=sharing

Could you attach the command used to generate the engine file? An engine file can only run on the same GPU model that generated it.
Also, please attach the config file for plain DeepStream with the TensorRT backend, since you said it works well with that.

I first ran the ONNX model with plain DeepStream, which automatically generated the engine file. I then used that engine file for the Triton model repo.

In any case, the model loads (it was converted on the same GPU) and inference runs; it just does not produce the expected results.
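
For reference, a roughly equivalent manual build with trtexec would look something like the line below (file names are the ones from this thread; FP32 is trtexec's default precision, so no extra precision flag is needed):

trtexec --onnx=crowdhuman_yolov5m.onnx --saveEngine=model_b1_gpu0_fp32.engine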

The config file for plain DeepStream is below.
This is the (correct) output when using plain DeepStream without Triton:

deep_stream_app_config.txt

[application]
enable-perf-measurement=1
perf-measurement-interval-sec=5

[tiled-display]
enable=0
rows=2
columns=2
width=1280
height=720

[source0]
enable=1
type=3
uri=file:///opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4
#uri=file:///opt/nvidia/deepstream/deepstream/samples/streams/sample_office.mp4
num-sources=1
gpu-id=0
cudadec-memtype=0

[sink0]
enable=1
#Type - 1=FakeSink 2=EglSink 3=File 4=RTSPStreaming 5=Overlay
type=3
output-file = out_6.mp4
container = 1
#1=h264 2=h265
codec=3
enc-type=0
sync=1
bitrate=4000000
#profile = 0
#nvbuf-memory-type=0

[osd]
enable=1
border-width=2
text-size=15
text-color=1;1;1;1;
text-bg-color=0.3;0.3;0.3;1
font=Serif
show-clock=0
clock-x-offset=800
clock-y-offset=820
clock-text-size=12
clock-color=1;0;0;0

[streammux]
##Boolean property to inform muxer that sources are live
live-source=1
batch-size=1
##time out in usec, to wait after the first buffer is available
##to push the batch even if the complete batch is not formed
batched-push-timeout=40000

## Set muxer output width and height
width=1280
height=720

# config-file property is mandatory for any gie section.
# Other properties are optional and if set will override the properties set in
# the infer config file.

[primary-gie]
enable=1
gpu-id=0
gie-unique-id=1
nvbuf-memory-type=0
config-file=config_infer_primary_yoloV5.txt

config_infer_primary_yoloV5.txt:

[property]
gpu-id=0
net-scale-factor=0.0039215697906911373
model-color-format=0
onnx-file=crowdhuman_yolov5m.onnx
model-engine-file=model_b1_gpu0_fp32.engine
#int8-calib-file=calib.table
labelfile-path=labels.txt
batch-size=1
network-mode=0
num-detected-classes=2
interval=0
gie-unique-id=1
process-mode=1
network-type=0
cluster-mode=2
maintain-aspect-ratio=1
symmetric-padding=1
#force-implicit-batch-dim=1
#workspace-size=1000
parse-bbox-func-name=NvDsInferParseYolo
#parse-bbox-func-name=NvDsInferParseYoloCuda
custom-lib-path=nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
engine-create-func-name=NvDsInferYoloCudaEngineGet

[class-attrs-all]
nms-iou-threshold=0.45
pre-cluster-threshold=0.25
topk=300
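
A detail that stands out when comparing the two setups: the nvinferserver config earlier in this thread uses confidence_threshold: 0.7 and iou_threshold: 0.25, while this nvinfer config clusters with pre-cluster-threshold=0.25 and nms-iou-threshold=0.45. The much higher confidence threshold alone could account for missed detections with Triton. Assuming the goal is parity with nvinfer, a matching nms block would be:

nms {
  # matches pre-cluster-threshold=0.25 from [class-attrs-all]
  confidence_threshold: 0.25
  # matches nms-iou-threshold=0.45
  iou_threshold: 0.45
  topk: 300
}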

Where did you implement and register this function, NvDsInferParseYolo? Could you attach the source code?

Yes, I used NvDsInferParseYolo from this repo; the source code can be found here:

GitHub - marcoslucianops/DeepStream-Yolo: NVIDIA DeepStream SDK 6.2 / 6.1.1 / 6.1 / 6.0.1 / 6.0 / 5.1 implementation for YOLO models
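
For anyone following along: assuming the standard build flow documented in that repo, the custom parser library is compiled with CUDA_VER set to the installed CUDA toolkit version (CUDA 11.8 for DeepStream 6.2), for example:

cd DeepStream-Yolo
CUDA_VER=11.8 make -C nvdsinfer_custom_impl_Yolo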

You need to set the normalize field in the preprocess group in your config file:

  normalize {
      scale_factor: 0.0039215697906911373
      channel_offsets: [0, 0, 0]
  }
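
For what it's worth, this scale_factor is just 1/255 ≈ 0.0039215686, which rescales 8-bit pixel values into [0, 1]; it is the same value as net-scale-factor in the nvinfer config above. Combined with the existing setting, the preprocess group becomes:

preprocess {
  tensor_order: TENSOR_ORDER_LINEAR
  # 1/255: map 0-255 pixel values to 0-1, matching net-scale-factor in nvinfer
  normalize {
    scale_factor: 0.0039215697906911373
    channel_offsets: [0, 0, 0]
  }
}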

Thanks for the reply.

The result is better with the config change you suggested, but it is still not on par with the plain DeepStream output.

output with config change: out_25_nvidia_suggested.mp4 - Google Drive

output with plain deepstream: out_23_plain_deepstream.mp4 - Google Drive

If you compare both videos, you can clearly see several missed detections when using DeepStream with Triton compared with plain DeepStream.

I am not able to figure out the reason for this; please help.

Also, can you please specify how you calculated the value of scale_factor? It would be really helpful.

Apart from that, is there a way to filter out detections of a specific class in the final output? In the documentation I found that in output_control we can specify specific_class_filters in the format:

specific_class_filters: [
  { key: 1, value {…} },
  { key: 2, value {…} }
]

but I am not able to figure out how to actually fill in the key and value.
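
A sketch of how the key/value pairs can be filled in, based on the Gst-nvinferserver plugin-control proto: key is the zero-based class ID from labels.txt, and value is a per-class detection filter. The thresholds below are illustrative values, not taken from this thread:

output_control {
  detect_control {
    default_filter {
      bbox_filter { min_width: 32, min_height: 32 }
    }
    specific_class_filters [
      # key = class id from labels.txt (e.g. 0 = person, 1 = head for this model);
      # this entry drops class-1 boxes smaller than 64x64
      { key: 1, value { bbox_filter { min_width: 64, min_height: 64 } } }
    ]
  }
}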

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks.

There are many samples in our open-source code under samples/configs/deepstream-app-triton; you can refer to those. About the differences between nvinfer and nvinferserver: there are still some parameters you haven't set for nvinferserver, such as the equivalents of cluster-mode=2, maintain-aspect-ratio=1, and symmetric-padding=1. Please refer to our guide to learn how to set them: nvinferserver
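
Assuming the DeepStream 6.2 nvinferserver proto, those nvinfer settings map onto the infer_config roughly as follows (cluster-mode=2 corresponds to the nms block already present in postprocess; exact field availability should be verified against the Gst-nvinferserver documentation):

preprocess {
  network_format: IMAGE_FORMAT_RGB
  tensor_order: TENSOR_ORDER_LINEAR
  # counterparts of maintain-aspect-ratio=1 and symmetric-padding=1 in nvinfer
  maintain_aspect_ratio: 1
  symmetric_padding: 1
  normalize {
    scale_factor: 0.0039215697906911373
    channel_offsets: [0, 0, 0]
  }
}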
