Model exported from TLT 2 fails to load on Triton

TLT 2 image: v2.0_dp_py2
Triton server image: 20.03-py3

mobilenet_v2_ssd was trained, pruned, and retrained, then exported with FP16.
tlt_infer works inside the TLT 2 docker, but when the engine is placed in the Triton model_repository as tensorrt_plan I get this error:

I0510 18:20:12.415923 1] New status tracking for model 'mobilenet_v2_ssd'
I0510 18:20:12.415982 1] loading: mobilenet_v2_ssd:1
E0510 18:20:12.436904 1] INVALID_ARGUMENT: getPluginCreator could not find plugin BatchTilePlugin_TRT version 1
E0510 18:20:12.436943 1] safeDeserializationUtils.cpp (293) - Serialization Error in load: 0 (Cannot deserialize plugin since corresponding IPluginCreator not found in Plugin Registry)
E0510 18:20:12.441026 1] INVALID_STATE: std::exception
E0510 18:20:12.441126 1] INVALID_CONFIG: Deserialize the cuda engine failed.
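
For reference, this is roughly the repository layout being used — a minimal sketch of a standard tensorrt_plan setup, with the model name taken from the log above and everything else assumed:

```
model_repository/
└── mobilenet_v2_ssd/
    ├── config.pbtxt
    └── 1/
        └── model.plan        # engine exported from TLT

# config.pbtxt (minimal; platform must be tensorrt_plan for a serialized engine)
name: "mobilenet_v2_ssd"
platform: "tensorrt_plan"
max_batch_size: 1
```

The deserialization error occurs regardless of the config, since the plugin lookup happens when Triton loads model.plan.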

Could this be caused by mismatched TensorRT versions?

SSD requires the batchTilePlugin. This plugin is available in the TensorRT open source (OSS) repo, but not in the TensorRT 7.0 release. Please build the plugin accordingly.
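
As a rough sketch (the branch, paths, and library name here are assumptions and must match the TensorRT version inside your Triton container, not verified against your setup), building libnvinfer_plugin from the OSS repo looks like:

```shell
# Build the OSS plugin library containing BatchTilePlugin_TRT.
# The branch must match the TensorRT version Triton ships with
# (assumed 7.0 here, to match the 20.03-py3 container).
git clone -b release/7.0 https://github.com/NVIDIA/TensorRT.git
cd TensorRT
git submodule update --init --recursive
mkdir -p build && cd build
cmake .. -DTRT_LIB_DIR=/usr/lib/x86_64-linux-gnu -DTRT_OUT_DIR=`pwd`/out
make -j$(nproc) nvinfer_plugin
# Produces out/libnvinfer_plugin.so.7.x.x, which replaces the stock
# libnvinfer_plugin inside the Triton container.
```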


Can you give a short description of the steps to make this work with Triton Inference Server?

Please refer to the TensorRT OSS repo.

Hi, what I mean is: how do I get the TensorRT OSS plugins working with Triton Inference Server?

Do I need to build TensorRT inside the Triton server docker?
Or can I get prebuilt matching binaries from somewhere?

You can follow the OSS instructions to build the plugin library. Then generate the TRT engine via the tlt-converter tool. Then Triton Inference Server will be able to recognize the TRT engine. See:
Using TensorRT Inference Server with TLT models
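
Putting the steps together, one way to do this is sketched below. The NGC key, input dimensions, output node, container tag, library version, and server binary name are all placeholders/assumptions, not taken from the thread — adjust them to your export and container:

```shell
# 1. Convert the .etlt export to a TRT engine, using the same TensorRT
#    version the Triton container ships with (flags are placeholders):
./tlt-converter -k <your_ngc_key> \
    -d 3,300,300 \
    -o NMS \
    -e model_repository/mobilenet_v2_ssd/1/model.plan \
    mobilenet_v2_ssd.etlt

# 2. Launch Triton with the OSS-built libnvinfer_plugin mounted over the
#    stock one, so BatchTilePlugin_TRT is registered before the engine
#    is deserialized:
docker run --gpus all --rm \
  -v $(pwd)/model_repository:/models \
  -v $(pwd)/TensorRT/build/out/libnvinfer_plugin.so.7.0.0:/usr/lib/x86_64-linux-gnu/libnvinfer_plugin.so.7 \
  nvcr.io/nvidia/tritonserver:20.03-py3 \
  trtserver --model-repository=/models
```

The key point is that the engine must be generated and deserialized with the same TensorRT version, and the plugin library visible to the server must contain the OSS plugins the engine references.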