Inference server failing with YOLOv3 object detection serialized TensorRT engine

Docker command:
nvidia-docker run --rm --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 -p8000:8000 -p8001:8001 -p8002:8002 -v /home/seattle/Atchyutha/DeepStream:/data -v /home/seattle/Atchyutha/DeepStream/trt_server/YOLOv3_tensorrt_server/tensorrt_configs/:/opt/tensorrtserver/models -e LD_PRELOAD="/data/TensorRT-github/build/out/libnvinfer_plugin.so /data/trt_server/YOLOv3_tensorrt_server/lib/nvdsinfer_customparser_yolov3_tlt/libnvds_infercustomparser_yolov3_tlt.so" nvcr.io/nvidia/tensorrtserver:20.02-py3 trtserver --model-repository=/opt/tensorrtserver/models

Folder structure:
├── tensorrt_configs
│   ├── yolo_medical_mask
│   │   ├── 1
│   │   │   └── model.plan
│   │   └── config.pbtxt

config.pbtxt:
name: "yolo_medical_mask"
platform: "tensorrt_plan"
max_batch_size: 1
input [
  {
    name: "Input"
    data_type: TYPE_FP32
    dims: [ 3, 480, 640 ]
  }
]
output [
  {
    name: "BatchedNMS"
    data_type: TYPE_INT32
    dims: [ ]
  }
]

ERROR:

I0625 13:05:21.120477 1 server.cc:112] Initializing TensorRT Inference Server
E0625 13:05:21.797973 1 model_repository_manager.cc:1505] model output must specify 'dims' for yolo_medical_mask
..
..
error: creating server: INTERNAL - failed to load all models

Steps I followed:
1. Trained YOLOv3 from the "/workspace/examples/yolo" sample on a custom dataset (20 classes) and saved the serialized model as "model.engine" (used nvcr.io/nvidia/tlt-streamanalytics:v2.0_dp_py2).
2. Renamed "model.engine" to "model.plan", then tried to load this model in tensorrtserver (used nvcr.io/nvidia/tensorrtserver:20.02-py3).

That ended with the error above. I also tried "dims: [ -1 ]" for the output, and then got the error below:

E0625 04:45:19.091502 1 model_repository_manager.cc:832] failed to load 'yolo_medical_mask' version 1: Invalid argument: model 'yolo_medical_mask_0_gpu0', tensor 'BatchedNMS': the model expects 0 dimensions (shape []) but the model configuration specifies 1 dimensions (shape [-1])
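
For reference, the binding shapes the serialized engine actually exposes can be checked by deserializing it with the TensorRT Python API. This is only a rough sketch (file and plugin paths are placeholders), and the custom-built plugin library must be loadable, or deserialization will fail:

import ctypes
import tensorrt as trt

# Load the custom-built plugin library first (placeholder path); without it,
# deserializing an engine that uses the BatchedNMS/BatchTile plugins fails.
ctypes.CDLL("./libnvinfer_plugin.so")

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
trt.init_libnvinfer_plugins(TRT_LOGGER, "")

with open("model.plan", "rb") as f:
    plan = f.read()

runtime = trt.Runtime(TRT_LOGGER)
engine = runtime.deserialize_cuda_engine(plan)

# Print each binding's name, per-item shape, and dtype; with an
# implicit-batch TLT engine, BatchedNMS should report an empty shape ().
for i in range(engine.num_bindings):
    kind = "input" if engine.binding_is_input(i) else "output"
    print(kind, engine.get_binding_name(i),
          engine.get_binding_shape(i), engine.get_binding_dtype(i))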

Is there any way to fix this issue?
Any help would be appreciated.

Hi atchyutha

I am doing a similar thing and came across your post while doing research. Did you manage to make progress?

I have noticed from the documentation that "dims" is a mandatory field which must be supplied. -1 is also acceptable, but only if the model supports variable-length input/output.
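
The documentation also describes a reshape field that seems aimed at exactly this mismatch: the engine reports a zero-dimensional (shape []) tensor per batch item, while a config with max_batch_size > 0 needs at least one dimension. A hedged sketch of the output block, untested on my side:

output [
  {
    name: "BatchedNMS"
    data_type: TYPE_INT32
    dims: [ 1 ]
    reshape: { shape: [ ] }
  }
]

With this, the configuration advertises [batch, 1] to clients while telling the server that the underlying engine produces a scalar per batch item.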

Secondly, I noticed two libraries in your command, libnvinfer_plugin.so and libnvds_infercustomparser_yolov3_tlt.so. I managed to build them from source; however, when I try to preload them, the attempt is unsuccessful:

nvidia-docker run --gpus=1 --rm --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 -p8000:8000 -p8001:8001 -p8002:8002 -v/home/wd500gb/test_model_repository:/models -e LD_PRELOAD="/usr/src/tensorrt/libnvinfer_plugin.so /home/deepstream_tlt_apps.github/deepstream_tlt_apps/nvdsinfer_customparser_yolov3_tlt/libnvds_infercustomparser_yolov3_tlt.so" nvcr.io/nvidia/tritonserver:20.03-py3 trtserver --model-repository=/models

I get the following errors:

ERROR: ld.so: object '/usr/src/tensorrt/libnvinfer_plugin.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.

ERROR: ld.so: object '/home/deepstream_tlt_apps.github/deepstream_tlt_apps/nvdsinfer_customparser_yolov3_tlt/libnvds_infercustomparser_yolov3_tlt.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.

There is another error, "INVALID_ARGUMENT: getPluginCreator could not find plugin BatchTilePlugin_TRT version 1", which is probably a consequence of the above.
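
My guess is that ld.so cannot preload these because those are host paths that do not exist inside the container: LD_PRELOAD is resolved by the server process, so the libraries have to be mounted in and referenced by their container-side paths. Something along these lines might work (the /plugins mount point is purely illustrative):

nvidia-docker run --gpus=1 --rm --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 \
  -p8000:8000 -p8001:8001 -p8002:8002 \
  -v /home/wd500gb/test_model_repository:/models \
  -v /home/deepstream_tlt_apps.github/deepstream_tlt_apps/nvdsinfer_customparser_yolov3_tlt:/plugins \
  -e LD_PRELOAD="/plugins/libnvds_infercustomparser_yolov3_tlt.so" \
  nvcr.io/nvidia/tritonserver:20.03-py3 trtserver --model-repository=/models

The custom-built libnvinfer_plugin.so would need to be mounted and preloaded the same way.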

Any help will be much appreciated.

Thanks.

Update:

OK, I spent today on it and resolved the preloading issue. I have now reached the same point as you:

Trained on TLT, converted to an engine file, and my config.pbtxt is:

name: "yolov3_resnet18"
platform: "tensorrt_plan"
max_batch_size: 16

input [
  {
    name: "Input"
    data_type: TYPE_FP32
    format: FORMAT_NCHW
    dims: [ 3, 384, 1248 ]
  }
]
output [
  {
    name: "BatchedNMS"
    data_type: TYPE_INT32
    dims: [ 1 ]
    label_filename: "yolov3_labels.txt"
  }
]

Batch size 16 comes from the training, and there are 4 classes, but I am still getting an error. The last few lines of the output are:

2020-08-14 16:45:43.050065: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.2
I0814 16:45:43.076721 1 metrics.cc:164] found 1 GPUs supporting NVML metrics
I0814 16:45:43.082242 1 metrics.cc:173] GPU 0: GeForce RTX 2080 Ti
I0814 16:45:43.082674 1 server.cc:120] Initializing Triton Inference Server
I0814 16:45:43.252504 1 server_status.cc:55] New status tracking for model 'yolov3_resnet18'
I0814 16:45:43.252556 1 model_repository_manager.cc:680] loading: yolov3_resnet18:1
I0814 16:45:45.333894 1 plan_backend.cc:267] Creating instance yolov3_resnet18_0_gpu0 on GPU 0 (7.5) using model.plan
W0814 16:45:45.336591 1 logging.cc:46] Current optimization profile is: 0. Please ensure there are no enqueued operations pending in this context prior to switching profiles
E0814 16:45:45.343801 1 model_repository_manager.cc:840] failed to load 'yolov3_resnet18' version 1: Invalid argument: model 'yolov3_resnet18_0_gpu0', tensor 'BatchedNMS': the model expects 0 dimensions (shape []) but the model configuration specifies 1 dimensions (shape [1])
error: creating server: INTERNAL - failed to load all models

I am assuming you have resolved the issue. Could you please help me with it? Thank you.