YOLOv4 DS-Triton: got an error about input shape mismatch

Hi!
I am deploying the model to my Jetson TX2 with DeepStream through Triton Inference Server.
I am using a Jetson TX2 with DeepStream 5.0.

  1. Got the engine file
    Followed the steps in GitHub - Tianxiaomo/pytorch-YOLOv4: PyTorch, ONNX and TensorRT implementation of YOLOv4:
python demo_darknet2onnx.py /workspace/pytorch-YOLOv4/cfg/yolov4.cfg /workspace/pytorch-YOLOv4/yolov4.weights /workspace/pytorch-YOLOv4/data/dog.jpg 1
/usr/src/tensorrt/bin/trtexec --onnx=yolov4_1_3_608_608_static.onnx --explicitBatch --saveEngine=yolov4_1_3_608_608_static.engine --workspace=2048 --fp16
  2. Configured the Triton config files:
    source1_primary_yoloV4_original.txt (3.1 KB)
    config_infer_primary_yoloV4_original.txt (2.4 KB)
    config.pbtxt (568 Bytes)
  3. Built the custom parser NvDsInferParseCustomYoloV4:
git clone https://github.com/NVIDIA-AI-IOT/yolov4_deepstream.git
/yolov4_deepstream/deepstream_yolov4/nvdsinfer_custom_impl_Yolo
  4. Got the following error:

Opening in BLOCKING MODE
Opening in BLOCKING MODE
NvMMLiteOpen : Block : BlockType = 261
NVMEDIA: Reading vendor.tegra.display-size : status: 6
NvMMLiteBlockCreate : Block : BlockType = 261
** INFO: <bus_callback:167>: Pipeline running

ERROR: TRTIS: TrtServerRequest failed to create inference request providerV2, trtis_err_str:INVALID_ARG, err_msg:unexpected shape for input 'input' for model 'yolov4_original'. Expected [1,3,608,608], got [3,608,608]
ERROR: TRTIS failed to create request for model: yolov4_original version:-1
ERROR: TRT-IS run failed to create request for model: yolov4_original
ERROR: TRT-IS failed to run inference on model yolov4_original, nvinfer error:NVDSINFER_TRTIS_ERROR
0:00:14.451512879 20883 0x7e886e5530 WARN nvinferserver gstnvinferserver.cpp:519:gst_nvinfer_server_push_buffer:<primary_gie> error: inference failed with unique-id:1
ERROR from primary_gie: inference failed with unique-id:1
Debug info: /dvs/git/dirty/git-master_linux/deepstream/sdk/src/gst-plugins/gst-nvinferserver/gstnvinferserver.cpp(519): gst_nvinfer_server_push_buffer (): /GstPipeline:pipeline/GstBin:primary_gie_bin/GstNvInferServer:primary_gie
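
For reference, the error indicates that Triton expects the full 4-D shape [1,3,608,608] (batch dimension included, i.e. max_batch_size: 0 for this explicit-batch engine) while the request only carries a 3-D [3,608,608] tensor. A config.pbtxt for this engine would look roughly like the sketch below; the output names and shapes are illustrative and must match the engine's actual bindings:

name: "yolov4_original"
platform: "tensorrt_plan"
max_batch_size: 0
default_model_filename: "yolov4_1_3_608_608_static.engine"
input [
  {
    name: "input"
    data_type: TYPE_FP32
    # with max_batch_size: 0 the leading batch dimension is part of dims,
    # so every request must send it too
    dims: [ 1, 3, 608, 608 ]
  }
]
output [
  {
    name: "boxes"   # illustrative; check the engine bindings
    data_type: TYPE_FP32
    dims: [ 1, 22743, 1, 4 ]
  },
  {
    name: "confs"   # illustrative; check the engine bindings
    data_type: TYPE_FP32
    dims: [ 1, 22743, 80 ]
  }
]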

I made a workable YOLOv4 version (here) based on the GitHub - NVIDIA-AI-IOT/yolov4_deepstream project. Could you compare your config files with the files in deepstream_yolov4.tgz to check what the issue is?

Since the YOLOv4 model is already built as a TensorRT engine, do you still want to run it with Triton/nvinferserver?

We need to deploy multiple inference models on one source video. Could you give us some advice on how to implement this more efficiently?

Hi,
The config files you sent me are for deepstream-app, while mine are for deepstream-app-trtis. I cannot apply your config files to Triton Inference Server.

That’s why I asked, “Since the YOLOv4 model is already built as a TensorRT engine, do you still want to run it with Triton/nvinferserver?”
Using TensorRT (nvinfer) directly should give better performance than Triton.
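
For example, a minimal nvinfer config for your prebuilt engine could look like this sketch (paths and values are placeholders; the custom parser is the one built from yolov4_deepstream):

[property]
gpu-id=0
net-scale-factor=0.0039215697906911373
model-engine-file=yolov4_1_3_608_608_static.engine
labelfile-path=labels.txt
batch-size=1
# network-mode=2 selects FP16, matching the --fp16 trtexec build
network-mode=2
num-detected-classes=80
gie-unique-id=1
process-mode=1
network-type=0
parse-bbox-func-name=NvDsInferParseCustomYoloV4
custom-lib-path=nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so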

How can we deploy multiple inference models through TensorRT? Our inference models include YOLOv4, YOLOv4-tiny, and YOLOv5.

You can refer to the DeepStream deepstream-test2 sample, whose pipeline is as follows:

Sample of how to use DeepStream elements for a single H.264 stream: filesrc → decode → nvstreammux → nvinfer (primary detector) → nvtracker → nvinfer (secondary classifier) → nvdsosd → renderer.
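
A rough gst-launch-1.0 sketch of that pipeline on Jetson (the file, config, and tracker-library names follow the DeepStream 5.0 sample layout and are placeholders):

gst-launch-1.0 filesrc location=sample_720p.h264 ! h264parse ! nvv4l2decoder ! m.sink_0 \
  nvstreammux name=m batch-size=1 width=1280 height=720 ! \
  nvinfer config-file-path=dstest2_pgie_config.txt ! \
  nvtracker ll-lib-file=/opt/nvidia/deepstream/deepstream-5.0/lib/libnvds_mot_klt.so ! \
  nvinfer config-file-path=dstest2_sgie1_config.txt ! \
  nvvideoconvert ! nvdsosd ! nvegltransform ! nveglglessink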

If the relationship between your models is different from the above, please elaborate so that we can give a more accurate suggestion.


The relationship between our models is like the above.

I think you can refer to deepstream_reference_apps/back-to-back-detectors at master · NVIDIA-AI-IOT/deepstream_reference_apps · GitHub, which runs two detectors in parallel on the source data from nvstreammux.
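
As a sketch, running two full-frame detectors on the same stream could look like the line below, assuming each nvinfer config sets process-mode=1 (full frame) and a distinct gie-unique-id (the config file names are placeholders):

gst-launch-1.0 filesrc location=sample_720p.h264 ! h264parse ! nvv4l2decoder ! m.sink_0 \
  nvstreammux name=m batch-size=1 width=1280 height=720 ! \
  nvinfer config-file-path=yolov4_pgie_config.txt ! \
  nvinfer config-file-path=yolov4_tiny_pgie_config.txt ! \
  nvvideoconvert ! nvdsosd ! nvegltransform ! nveglglessink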

Thank you so much! I will try this one right now~

One question: why not use one detector to detect all classes of objects?

Different teams are responsible for different inference models. To be honest, I think this arrangement is not reasonable.

Yes, this wastes computing resources, since it uses three models to do work that could be done by one model.