I am new to TensorRT and am working on optimizing our detection model with it. I did a rough test of an FP16 build of a TRT engine converted from .onnx: inference time dropped to 7 ms, compared with the default FP32 inference time of 24 ms, which looks good.
However, the TensorRT documentation says that TensorRT 5.0.x does not support INT8 and INT8 calibration for ONNX models. Is this true? Our workflow is: export a PyTorch detection model to .onnx, then optimize it on TensorRT 5.0.2.x by converting the .onnx model to a .trt engine. Is there any way to do INT8 quantization and INT8 calibration in this pipeline?
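For context, this is roughly what I had in mind for feeding calibration data. The `CalibrationBatcher` helper below is a hypothetical name of my own; the commented TensorRT part uses class and attribute names from the TRT 5.x Python API docs (`IInt8EntropyCalibrator`, `builder.int8_mode`, `builder.int8_calibrator`) plus PyCUDA for device buffers, and is only a sketch I have not been able to run:

```python
import numpy as np

class CalibrationBatcher:
    """Streams fixed-size batches of preprocessed images for INT8 calibration.

    Hypothetical helper: TensorRT's calibrator interface repeatedly asks
    for the next batch until the supplier signals exhaustion with None.
    """
    def __init__(self, samples, batch_size):
        self.samples = samples          # e.g. an (N, C, H, W) float32 array
        self.batch_size = batch_size
        self.index = 0

    def next_batch(self):
        # Return None once there is no full batch left; TensorRT stops
        # calibrating when get_batch() returns None.
        if self.index + self.batch_size > len(self.samples):
            return None
        batch = self.samples[self.index:self.index + self.batch_size]
        self.index += self.batch_size
        return np.ascontiguousarray(batch, dtype=np.float32)

# Sketch of how this would plug into TensorRT (untested, API names from the
# TRT 5.x Python documentation; pycuda manages the device-side staging buffer):
#
#   import tensorrt as trt
#   import pycuda.driver as cuda
#
#   class EntropyCalibrator(trt.IInt8EntropyCalibrator):
#       def __init__(self, batcher):
#           trt.IInt8EntropyCalibrator.__init__(self)
#           self.batcher = batcher
#           self.d_input = cuda.mem_alloc(
#               batcher.samples[0].nbytes * batcher.batch_size)
#
#       def get_batch_size(self):
#           return self.batcher.batch_size
#
#       def get_batch(self, names):
#           batch = self.batcher.next_batch()
#           if batch is None:
#               return None             # calibration finished
#           cuda.memcpy_htod(self.d_input, batch)
#           return [int(self.d_input)]  # device pointer per input binding
#
#       def read_calibration_cache(self):
#           return None                 # no cached scales on first run
#
#       def write_calibration_cache(self, cache):
#           pass                        # could persist scales to disk here
#
#   # When building the engine from the parsed ONNX network:
#   #   builder.int8_mode = True
#   #   builder.int8_calibrator = EntropyCalibrator(batcher)
```

My uncertainty is whether the ONNX parser path in 5.0.2.x accepts such a calibrator at all, or whether INT8 calibration there is limited to the Caffe/UFF paths.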