TensorRT INT8 conversion shows no performance increase.

I currently have a Keras model for YoloV3 (https://github.com/qqwweee/keras-yolo3) and I have extracted the underlying TensorFlow frozen graph. I am also able to successfully convert the frozen graph into TensorRT-optimized graphs using the TF-TRT API (TF version is 1.14). I have tried both FP16 and INT8 precision modes, but both give me the same performance (4-5 FPS) on the Jetson Nano. I have also calibrated the INT8 model against some training data. What could be the issue here?


Please note that not all platforms support INT8 operations.
Among the Jetson family, only Xavier has Tensor Core hardware and is able to support INT8 mode.

If you get similar performance across precision modes, some layers of your model may have fallen back to the TensorFlow implementation rather than running in TensorRT.
You can check this information in the conversion output log first.
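Besides reading the log, one way to quantify fallback is to count how many nodes in the converted graph are `TRTEngineOp` (segments running in TensorRT) versus native TF ops. A minimal sketch of that idea, using a plain list of op names (in practice you would collect `node.op` for each node in the converted `GraphDef`; the op names in the example below are made up for illustration):

```python
from collections import Counter

def summarize_ops(graph_op_names):
    """Split op counts into TRT-engine nodes vs. native TF nodes.

    A well-converted graph has a small number of TRTEngineOp nodes
    covering most of the model; many leftover native ops suggest
    layers fell back to the TensorFlow implementation.
    """
    counts = Counter(graph_op_names)
    trt_nodes = counts.get("TRTEngineOp", 0)
    native_nodes = sum(counts.values()) - trt_nodes
    return trt_nodes, native_nodes

# Hypothetical op list, as you might gather from a converted GraphDef:
ops = ["TRTEngineOp", "Identity", "Placeholder", "TRTEngineOp", "Conv2D"]
trt_nodes, native_nodes = summarize_ops(ops)
print(trt_nodes, native_nodes)  # → 2 3
```

If the native count dominates, check the conversion log for messages about unsupported ops to see which layers were excluded from the TensorRT engines.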