Inference server failing with YOLOv3 object detection serialized TensorRT engine

Docker command:
nvidia-docker run --rm --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 -p8000:8000 -p8001:8001 -p8002:8002 -v /home/seattle/Atchyutha/DeepStream:/data -v /home/seattle/Atchyutha/DeepStream/trt_server/YOLOv3_tensorrt_server/tensorrt_configs/:/opt/tensorrtserver/models -e LD_PRELOAD="/data/TensorRT-github/build/out/libnvinfer_plugin.so /data/trt_server/YOLOv3_tensorrt_server/lib/nvdsinfer_customparser_yolov3_tlt/libnvds_infercustomparser_yolov3_tlt.so" nvcr.io/nvidia/tensorrtserver:20.02-py3 trtserver --model-repository=/opt/tensorrtserver/models

Folder structure:
├── tensorrt_configs
│   ├── yolo_medical_mask
│   │   ├── 1
│   │   │   └── model.plan
│   │   └── config.pbtxt

config.pbtxt:
name: "yolo_medical_mask"
platform: "tensorrt_plan"
max_batch_size: 1
input [
  {
    name: "Input"
    data_type: TYPE_FP32
    dims: [ 3, 480, 640 ]
  }
]
output [
  {
    name: "BatchedNMS"
    data_type: TYPE_INT32
    dims: [ ]
  }
]
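
For reference, a minimal sketch (assuming the standard TensorRT 7 Python API, run somewhere the custom plugin library from the LD_PRELOAD above is available) of how the engine's binding names and shapes could be inspected:

import ctypes
import tensorrt as trt

# Load the custom BatchedNMS plugin library before deserializing the engine
# (same library that is preloaded via LD_PRELOAD in the Docker command above).
ctypes.CDLL("/data/TensorRT-github/build/out/libnvinfer_plugin.so")

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
trt.init_libnvinfer_plugins(TRT_LOGGER, "")

# Deserialize model.plan and print each binding's direction, name, shape, and dtype.
with open("model.plan", "rb") as f, trt.Runtime(TRT_LOGGER) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())
    for i in range(engine.num_bindings):
        kind = "input" if engine.binding_is_input(i) else "output"
        print(kind, engine.get_binding_name(i),
              engine.get_binding_shape(i), engine.get_binding_dtype(i))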

ERROR:

I0625 13:05:21.120477 1 server.cc:112] Initializing TensorRT Inference Server
E0625 13:05:21.797973 1 model_repository_manager.cc:1505] model output must specify 'dims' for yolo_medical_mask
..
..
error: creating server: INTERNAL - failed to load all models

Steps I followed:
1. Trained YOLOv3 from the "/workspace/examples/yolo" sample on a custom dataset (20 classes) and saved the serialized model as "model.engine" (using nvcr.io/nvidia/tlt-streamanalytics:v2.0_dp_py2).
2. Renamed "model.engine" to "model.plan" and tried to load the model in tensorrtserver (using nvcr.io/nvidia/tensorrtserver:20.02-py3).

That is how I ended up with the error above. I also tried the following output definition with dims: [ -1 ]:
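
output [
  {
    name: "BatchedNMS"
    data_type: TYPE_INT32
    dims: [ -1 ]
  }
]

which then produces this error instead: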

E0625 04:45:19.091502 1 model_repository_manager.cc:832] failed to load 'yolo_medical_mask' version 1: Invalid argument: model 'yolo_medical_mask_0_gpu0', tensor 'BatchedNMS': the model expects 0 dimensions (shape []) but the model configuration specifies 1 dimensions (shape [-1])

Is there any way to fix this issue?
I would appreciate any help.