I am running the image_classification example from your Docker image nvcr.io/nvidia/tensorflow:19.11-tf2-py3 as follows:
The TensorRT conversion completes successfully, but I see no speedup relative to FP32. On closer examination of the generated model, the graph nodes still carry FP32 types, so the lack of speedup is not surprising. Given that this is running on a compute capability 6.1 GPU (Quadro P6000), which does support INT8 inference, why did the converted model not use INT8 as requested above? How do I reproduce the INT8 performance for this model that is described in your documentation?
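For reference, this is roughly the TF-TRT INT8 path I expected the example to take (a sketch only, using the TF 2.0 `trt_convert` API; the saved-model paths, input shape, and batch size below are placeholders, not the exact values from your example script). My understanding is that INT8 mode additionally requires a calibration pass, so if the example never feeds calibration data, that might explain why the nodes stay FP32:

```python
import numpy as np

BATCH_SIZE = 8
INPUT_SHAPE = (224, 224, 3)  # assumed ResNet-style input; placeholder

def calibration_input_fn(num_batches=10):
    """Yield representative input batches.

    INT8 conversion needs these to compute per-tensor scale factors;
    real images from the validation set would normally be used here.
    """
    for _ in range(num_batches):
        batch = np.random.uniform(0.0, 1.0,
                                  size=(BATCH_SIZE,) + INPUT_SHAPE)
        yield (batch.astype(np.float32),)

def convert_int8(saved_model_dir, output_dir):
    """Sketch of TrtGraphConverterV2 usage with INT8 calibration."""
    # Imported lazily so the calibration helper stays importable without TF.
    from tensorflow.python.compiler.tensorrt import trt_convert as trt

    params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(
        precision_mode=trt.TrtPrecisionMode.INT8)
    converter = trt.TrtGraphConverterV2(
        input_saved_model_dir=saved_model_dir,
        conversion_params=params)
    # Without a calibration pass the converter cannot emit INT8 engines.
    converter.convert(calibration_input_fn=calibration_input_fn)
    converter.save(output_dir)
```

Is this the intended flow, and does the shipped example actually run the calibration step?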