Model exported from TLT2 fails to load on Triton

tlt2 image: nvcr.io/nvidia/tlt-streamanalytics:v2.0_dp_py2
+
nvcr.io/nvidia/tritonserver:20.03-py3

mobilenet_v2_ssd trained, pruned, and retrained, then exported with fp16.
tlt_infer works in the tlt2 docker, but when the engine is placed in the model_repository as a tensorrt_plan, Triton fails with:

I0510 18:20:12.415923 1 server_status.cc:55] New status tracking for model 'mobilenet_v2_ssd'
I0510 18:20:12.415982 1 model_repository_manager.cc:680] loading: mobilenet_v2_ssd:1
E0510 18:20:12.436904 1 logging.cc:43] INVALID_ARGUMENT: getPluginCreator could not find plugin BatchTilePlugin_TRT version 1
E0510 18:20:12.436943 1 logging.cc:43] safeDeserializationUtils.cpp (293) - Serialization Error in load: 0 (Cannot deserialize plugin since corresponding IPluginCreator not found in Plugin Registry)
E0510 18:20:12.441026 1 logging.cc:43] INVALID_STATE: std::exception
E0510 18:20:12.441126 1 logging.cc:43] INVALID_CONFIG: Deserialize the cuda engine failed.
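For context, Triton expects a serialized TensorRT engine to sit in the model repository as `model.plan` under a numbered version directory, next to a `config.pbtxt`. A minimal sketch of that layout follows; the tensor names, dims, and shapes in the config are assumptions for a TLT SSD model and must be checked against your exported network:

```shell
# Minimal Triton model repository layout for a tensorrt_plan model.
# Directory and file names follow Triton conventions; tensor names/dims
# in config.pbtxt are assumptions -- verify against your own model.
mkdir -p model_repository/mobilenet_v2_ssd/1
# Copy your serialized engine in as model.plan, e.g.:
#   cp mobilenet_v2_ssd.engine model_repository/mobilenet_v2_ssd/1/model.plan
cat > model_repository/mobilenet_v2_ssd/config.pbtxt <<'EOF'
name: "mobilenet_v2_ssd"
platform: "tensorrt_plan"
max_batch_size: 1
input [
  {
    name: "Input"            # assumed TLT SSD input tensor name
    data_type: TYPE_FP32
    dims: [ 3, 300, 300 ]    # assumed training resolution
  }
]
output [
  {
    name: "NMS"              # assumed TLT SSD output tensor name
    data_type: TYPE_FP32
    dims: [ 1, 200, 7 ]     # assumed NMS output shape
  }
]
EOF
find model_repository
```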

Could this be a TensorRT version mismatch?

SSD requires batchTilePlugin. This plugin is available in the TensorRT open source (OSS) repo, but not in the stock TensorRT 7.0 libraries. Please build libnvinfer_plugin.so from the OSS repo as described here:

Reference:
https://docs.nvidia.com/metropolis/TLT/tlt-getting-started-guide/index.html#intg_ssd_model
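The OSS plugin build roughly follows the steps below. This is only a sketch: the branch name, GPU_ARCHS value, and library paths are assumptions, and must match your installed TensorRT version (e.g. a release/7.0 branch for TRT 7.0) and your GPU's compute capability.

```shell
# Sketch of building libnvinfer_plugin.so from the TensorRT OSS repo.
# Assumes CUDA, cuDNN, TensorRT, and cmake are already installed.
# Branch, GPU_ARCHS, and paths below are examples -- adjust to your setup.
git clone -b release/7.0 https://github.com/NVIDIA/TensorRT.git
cd TensorRT
git submodule update --init --recursive
mkdir -p build && cd build
cmake .. -DGPU_ARCHS="61" \
         -DTRT_LIB_DIR=/usr/lib/x86_64-linux-gnu \
         -DCMAKE_BUILD_TYPE=Release
make nvinfer_plugin -j"$(nproc)"
# Back up the stock library, then replace it with the freshly built one
# (the exact .so version suffix depends on your TensorRT install):
sudo cp libnvinfer_plugin.so.7.* /usr/lib/x86_64-linux-gnu/
```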

Can you give a short description of the steps to make it work with Triton Inference Server?
Thanks

Please refer to https://docs.nvidia.com/deeplearning/sdk/triton-inference-server-guide/docs/quickstart.html

Hi, what I mean is: how do I get the TensorRT OSS plugins working with Triton Inference Server?

Do I need to build TensorRT OSS inside the Triton server docker, or can I get matching prebuilt binaries from somewhere?

You can follow https://github.com/NVIDIA-AI-IOT/deepstream_tlt_apps/tree/master/TRT-OSS to build libnvinfer_plugin.so. Then generate the TRT engine with the tlt-converter tool, and place the engine in the model repository so Triton Inference Server can load it.
Reference:
https://docs.nvidia.com/metropolis/TLT/tlt-getting-started-guide/index.html#gen_eng_tlt_converter
Using TensorRT Inference Server with TLT models
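Putting the steps together, a hedged sketch follows. The encryption key, input dims, and output node name passed to tlt-converter are assumptions for a TLT SSD model (use your own NGC key and network settings), and the engine must be built on the same GPU and TensorRT version that Triton will run on:

```shell
# 1. Convert the exported .etlt model to a TensorRT engine with
#    tlt-converter. Key, dims, and output node name below are
#    assumptions -- substitute your own values.
tlt-converter mobilenet_v2_ssd.etlt \
    -k "$NGC_KEY" \
    -d 3,300,300 \
    -o NMS \
    -t fp16 \
    -e model_repository/mobilenet_v2_ssd/1/model.plan

# 2. Launch the Triton 20.03 container with the rebuilt plugin library
#    mounted over the stock one, so BatchTilePlugin_TRT is found in the
#    plugin registry at deserialization time. The .so version suffix is
#    an assumption -- match it to the library inside the container.
docker run --gpus all --rm \
    -v "$(pwd)/model_repository:/models" \
    -v "$(pwd)/libnvinfer_plugin.so.7.0.0:/usr/lib/x86_64-linux-gnu/libnvinfer_plugin.so.7.0.0" \
    nvcr.io/nvidia/tritonserver:20.03-py3 \
    trtserver --model-repository=/models
```

Note that in the 20.03 release the server binary is still named `trtserver`; later releases renamed it to `tritonserver`.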