TensorRT 5.0.x INT8 for ONNX?

Hi,

I am new to TensorRT and am working on optimizing our detection model with it. I did a rough test with an FP16 engine built from the .onnx export: inference time was about 7 ms, compared with about 24 ms for the default FP32 engine, which looks good.
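
For reference, the FP16 engine was built along the lines of the sketch below (a minimal sketch against the TensorRT 5 Python API; the file name "detector.onnx", the batch size, and the workspace size are placeholders for our actual settings):

[code]
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def build_fp16_engine(onnx_path, max_batch_size=1):
    """Parse an ONNX model and build an FP16 TensorRT engine."""
    builder = trt.Builder(TRT_LOGGER)
    network = builder.create_network()
    parser = trt.OnnxParser(network, TRT_LOGGER)

    # Parse the exported ONNX graph into a TensorRT network.
    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            for i in range(parser.num_errors):
                print(parser.get_error(i))
            raise RuntimeError("Failed to parse ONNX model")

    builder.max_batch_size = max_batch_size
    builder.max_workspace_size = 1 << 30  # 1 GB of build scratch space
    builder.fp16_mode = True              # enable FP16 kernels where available

    return builder.build_cuda_engine(network)

engine = build_fp16_engine("detector.onnx")  # placeholder path to our exported model
[/code]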

However, the TensorRT documentation says that TensorRT 5.0.x does not support INT8 and INT8 calibration for ONNX models. Is this true? We have a PyTorch detection model that we can save/export to .onnx format, and we want to optimize it with TensorRT 5.0.2.x by converting the .onnx model to a .trt engine. Is there a way to do INT8 inference and INT8 calibration in this workflow?
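
If INT8 does turn out to be possible for this ONNX path, the calibration setup we are hoping to use would look roughly like the sketch below. This is only a sketch modelled on the INT8 calibration samples shipped with TensorRT; the DetectorCalibrator class, the calibration batch list, and the cache file name are hypothetical, and the exact calibrator base class and method signatures should be checked against the samples for the installed TensorRT version.

[code]
import os

import numpy as np
import pycuda.driver as cuda
import pycuda.autoinit  # noqa: F401 -- creates a CUDA context
import tensorrt as trt


class DetectorCalibrator(trt.IInt8EntropyCalibrator2):
    """Feeds preprocessed calibration batches to the TensorRT builder."""

    def __init__(self, batches, cache_file="calibration.cache"):
        trt.IInt8EntropyCalibrator2.__init__(self)
        self.batches = batches            # list of float32 arrays, shape (N, C, H, W)
        self.index = 0
        self.cache_file = cache_file
        self.device_input = cuda.mem_alloc(batches[0].nbytes)

    def get_batch_size(self):
        return self.batches[0].shape[0]

    def get_batch(self, names):
        if self.index >= len(self.batches):
            return None                   # no more calibration data
        batch = np.ascontiguousarray(self.batches[self.index])
        cuda.memcpy_htod(self.device_input, batch)
        self.index += 1
        return [int(self.device_input)]

    def read_calibration_cache(self):
        if os.path.exists(self.cache_file):
            with open(self.cache_file, "rb") as f:
                return f.read()
        return None

    def write_calibration_cache(self, cache):
        with open(self.cache_file, "wb") as f:
            f.write(cache)

# Used together with the same ONNX parsing/build code as in the FP16 sketch, except:
#   builder.int8_mode = True
#   builder.int8_calibrator = DetectorCalibrator(calibration_batches)
[/code]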

Thanks.

Wendy

Hi,

INT8 support depends on the hardware rather than on the library version.
May I know which device you are using? Jetson or desktop?

Here is our support matrix; it is recommended to check whether your device has INT8 support first:
[url]https://docs.nvidia.com/deeplearning/sdk/tensorrt-support-matrix/index.html#hardware-precision-matrix[/url]
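
If it helps, one quick way to check programmatically is to print the GPU's compute capability and compare it against the matrix above (a small sketch using pycuda; the deviceQuery CUDA sample gives the same information):

[code]
import pycuda.driver as cuda

cuda.init()
dev = cuda.Device(0)
major, minor = dev.compute_capability()
print("%s: compute capability %d.%d" % (dev.name(), major, minor))
[/code]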

Thanks.