How to apply a custom int8 quantization method with TensorRT?

Hi, I’ve been looking for a way to quantize my DNN model with my own int8 weight and activation
quantization method and then run inference with TensorRT.

I can train the model with quantization-aware training using this method, and I can also save the trained low-precision weights. At inference time, however, I have no idea how to apply the same activation quantization method that I used during training instead of TensorRT’s built-in method.
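The closest hook I have found so far is ITensor::setDynamicRange, which, as far as I understand, lets me supply my own per-tensor ranges instead of running TensorRT’s calibrator. Below is a rough sketch of what I have in mind; the range table and its values are placeholders standing in for the statistics my training code would export, and I am not sure this is the intended usage:

```cpp
// Rough sketch (TensorRT C++ API): inject per-tensor activation
// ranges from my own training run instead of running TensorRT's
// calibrator. The range table below is a placeholder.
#include <NvInfer.h>
#include <string>
#include <unordered_map>

using namespace nvinfer1;

// Placeholder: ranges exported from my quantization-aware training,
// keyed by tensor name.
static std::unordered_map<std::string, float> gTrainedRanges = {
    {"conv1_output", 4.0f},
    {"relu1_output", 6.0f},
};

void setCustomDynamicRanges(INetworkDefinition* network)
{
    for (int i = 0; i < network->getNbLayers(); ++i)
    {
        ILayer* layer = network->getLayer(i);
        for (int j = 0; j < layer->getNbOutputs(); ++j)
        {
            ITensor* t = layer->getOutput(j);
            auto it = gTrainedRanges.find(t->getName());
            if (it != gTrainedRanges.end())
                t->setDynamicRange(-it->second, it->second);
        }
    }
}
```

My understanding is that I would then build the engine with BuilderFlag::kINT8 set on the builder config so that no calibrator is needed, but this seems limited to symmetric linear quantization, so I’m not sure it can express my method.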

So my question is: how can I apply a custom activation quantization method during TensorRT inference?

Thank you.

Is it impossible? Do I need to implement the quantized layers and activation functions myself, without TensorRT?
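For reference, the activation quantization I apply during training looks roughly like the simplified stand-in below. My actual method differs; this is just to show the kind of op I need to reproduce at inference time:

```cpp
// Simplified stand-in for my training-time activation quantization
// (symmetric per-tensor int8 fake quantization). Not my actual method.
#include <algorithm>
#include <cmath>

float fakeQuantize(float x, float range)
{
    float scale = range / 127.0f;               // int8 step size
    float q = std::round(x / scale);            // snap to the int8 grid
    q = std::min(std::max(q, -127.0f), 127.0f); // clamp to int8 range
    return q * scale;                           // dequantize back to float
}
```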