Hi,
Do you want to use Triton server or native TensorRT?
For Triton server, you can pass the TensorFlow model to it directly.
For TensorRT, both the ONNX and UFF formats are supported.
If the model is trained with TF 2.x, please use ONNX as the intermediate format for better support.
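For example, a minimal conversion sketch with the tf2onnx Python API (the model path, input shape, and opset below are assumptions, adjust them to your model):

```python
import tensorflow as tf
import tf2onnx

# Load the trained TF-2.x model (path is a placeholder)
model = tf.keras.models.load_model("facenet_savedmodel")

# Input signature is an assumption; match it to your network's input
spec = (tf.TensorSpec((None, 160, 160, 3), tf.float32, name="input"),)

# Export to ONNX as the intermediate format for TensorRT
tf2onnx.convert.from_keras(model, input_signature=spec, opset=13,
                           output_path="facenet.onnx")
```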
You can also mark a specific layer as a TensorRT output to get the face feature, as sketched further below.
For an ONNX model, you can find the corresponding layer/tensor name with Netron:
https://netron.app/
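Here is a rough sketch of marking that tensor as an extra engine output with the TensorRT Python API (the ONNX file name and the feature tensor name are assumptions, please check the real name in Netron):

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

with open("facenet.onnx", "rb") as f:      # placeholder file name
    parser.parse(f.read())

feature_name = "embedding"                 # assumed name, look it up in Netron
for i in range(network.num_layers):
    layer = network.get_layer(i)
    for j in range(layer.num_outputs):
        tensor = layer.get_output(j)
        if tensor.name == feature_name:
            network.mark_output(tensor)    # expose the face feature tensor
```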
For dynamic input shapes, please check this document for more information:
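As a starting point, here is a minimal sketch of a dynamic-shape build with an optimization profile (the input tensor name "input" and the shape ranges are assumptions):

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)
with open("facenet.onnx", "rb") as f:      # placeholder file name
    parser.parse(f.read())

config = builder.create_builder_config()
profile = builder.create_optimization_profile()
# Assumed input tensor name and batch/shape ranges; adjust to your model
profile.set_shape("input",
                  min=(1, 160, 160, 3),
                  opt=(8, 160, 160, 3),
                  max=(32, 160, 160, 3))
config.add_optimization_profile(profile)

engine_bytes = builder.build_serialized_network(network, config)
with open("facenet_dynamic.engine", "wb") as f:
    f.write(engine_bytes)
```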
To skip mean subtraction, you can simply set the mean value to zero.
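To illustrate (plain NumPy, values are arbitrary): with the mean set to 0, the usual (pixel - mean) * scale preprocessing reduces to pure scaling, so the subtraction is effectively skipped:

```python
import numpy as np

image = np.random.randint(0, 256, (160, 160, 3)).astype(np.float32)  # dummy frame
mean, scale = 0.0, 1.0 / 255.0             # mean = 0 disables the subtraction
preprocessed = (image - mean) * scale      # identical to image * scale
```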
Thanks.