Hello, I have trained a ResNet-50 classifier on my own dataset (num_classes=7). I tried running inference with the trained model in INT8 mode, but the inference speed is almost the same as in FP16 mode. I don't know why. Can anyone tell me the possible reasons? Thanks.
Hi,
Could you share more details about your performance test?
Do you use TensorFlow or our TensorRT trtexec binary?
Thanks.
I used PyTorch -> ONNX -> TensorRT. So how can I see the inference precision of each layer? I used Python; where can I find the log?
Hi,
You can get the layer-level performance result with trtexec + onnx input directly.
For example:
/usr/src/tensorrt/bin/trtexec --onnx=/usr/src/tensorrt/data/resnet50/ResNet50.onnx --dumpProfile
[11/30/2020-14:54:20] [I] === Profile (187 iterations ) ===
[11/30/2020-14:54:20] [I] Layer Time (ms) Avg. Time (ms) Time %
[11/30/2020-14:54:20] [I] (Unnamed Layer* 0) [Convolution] + (Unnamed Layer* 2) [Activation] 86.29 0.46 2.7
[11/30/2020-14:54:20] [I] (Unnamed Layer* 3) [Pooling] 13.92 0.07 0.4
..
[11/30/2020-14:54:20] [I] (Unnamed Layer* 179) [ElementWise] 1.58 0.01 0.0
[11/30/2020-14:54:20] [I] (Unnamed Layer* 181) [Softmax] 1.66 0.01 0.1
[11/30/2020-14:54:20] [I] Total 3209.13 17.16 100.0
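If the goal is to check whether INT8 is actually faster than FP16 for your model, you can also build and profile the engine in each precision and compare the two tables. A minimal sketch, assuming the same sample ONNX file as above (--fp16, --int8, and --dumpProfile are standard trtexec options; note that without a calibration cache, --int8 uses placeholder scales, which is fine for timing but not for accuracy):
/usr/src/tensorrt/bin/trtexec --onnx=/usr/src/tensorrt/data/resnet50/ResNet50.onnx --fp16 --dumpProfile
/usr/src/tensorrt/bin/trtexec --onnx=/usr/src/tensorrt/data/resnet50/ResNet50.onnx --int8 --dumpProfile
If the per-layer timings come out nearly identical, TensorRT may be falling back to higher-precision kernels for some layers, or the GPU may simply offer similar INT8 and FP16 throughput.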
Or you can add your own profiler with the API below:
https://docs.nvidia.com/deeplearning/tensorrt/api/python_api/infer/Core/Profiler.html
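For reference, a minimal sketch of such a profiler (the class and function names here are my own, and the engine/context setup is assumed to exist already; profiling is only triggered by the synchronous execute path, not by the async enqueue path):

import tensorrt as trt

class LayerTimeProfiler(trt.IProfiler):
    """Accumulates the per-layer times that TensorRT reports."""
    def __init__(self):
        trt.IProfiler.__init__(self)  # required: initialize the C++ base class
        self.layer_times = {}

    def report_layer_time(self, layer_name, ms):
        # TensorRT calls this once per layer after each synchronous execute.
        self.layer_times[layer_name] = self.layer_times.get(layer_name, 0.0) + ms

def profile_inference(context, bindings):
    """Run one synchronous inference and return {layer_name: time in ms}.

    `context` is an IExecutionContext and `bindings` the usual list of
    device-buffer pointers; both are assumed to be created elsewhere.
    """
    profiler = LayerTimeProfiler()
    context.profiler = profiler
    context.execute_v2(bindings)  # synchronous; execute_async does not invoke the profiler
    return profiler.layer_times

Printing the returned dictionary gives a per-layer breakdown similar to the trtexec output above, which you can compare between the INT8 and FP16 engines.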
Thanks.