The impact of network input data types on inference speed and accuracy

Description

While accelerating my own model with TensorRT, I have run into a problem: when I feed the network different input data types for inference, both the inference speed and the inference results change, as follows:

The code is as follows:
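(My original script was attached as a screenshot, so here is a minimal sketch of the comparison I am making. The engine path, the binding order, the input shape, and the float32 input/output bindings are all placeholders or assumptions, not my exact code.)

```python
import numpy as np
import tensorrt as trt
import pycuda.autoinit  # noqa: F401  (creates a CUDA context)
import pycuda.driver as cuda

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

# Deserialize the FP16 engine (path is a placeholder).
with open("model.engine", "rb") as f, trt.Runtime(TRT_LOGGER) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())
context = engine.create_execution_context()

# Device buffers sized from the engine bindings (assumed to be float32),
# so the allocations are large enough no matter which host dtype is tested.
in_shape = tuple(engine.get_binding_shape(0))
out_shape = tuple(engine.get_binding_shape(1))
d_input = cuda.mem_alloc(int(np.prod(in_shape)) * np.float32().itemsize)
d_output = cuda.mem_alloc(int(np.prod(out_shape)) * np.float32().itemsize)

def infer(host_input):
    # Copy the host array as-is (float32 or uint8) and run one inference.
    cuda.memcpy_htod(d_input, np.ascontiguousarray(host_input))
    context.execute_v2([int(d_input), int(d_output)])
    host_output = np.empty(out_shape, dtype=np.float32)
    cuda.memcpy_dtoh(host_output, d_output)
    return host_output

image = np.random.randint(0, 256, size=in_shape)   # stand-in for a real image
out_f32 = infer(image.astype(np.float32))
out_u8 = infer(image.astype(np.uint8))
print(out_f32)
print(out_u8)  # in my case this differs from the float32 result
```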


The impact of different data types on inference time:

Different predictions:
The float32 (input data type) prediction results:

The uint8 (input data type) prediction results:

In addition, I also compared the uint8 and float32 input data themselves, as follows:

I found that the values are the same except for the data type!
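In text form, the kind of check I ran looks like this (`img_u8` / `img_f32` are stand-ins for my actual preprocessed arrays):

```python
import numpy as np

# Stand-ins for the two preprocessed inputs from my script.
img_u8 = np.random.randint(0, 256, size=(1, 3, 224, 224)).astype(np.uint8)
img_f32 = img_u8.astype(np.float32)

print(img_u8.dtype, img_f32.dtype)                         # uint8 float32
print(np.array_equal(img_u8.astype(np.float32), img_f32))  # True: same values
```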

Both predictions use the same TensorRT-accelerated model, built with FP16 precision.

Looking forward to your help, thanks a lot!

Environment

TensorRT Version: 8.2
GPU Type: GeForce RTX 3060 Ti
Nvidia Driver Version: 510
CUDA Version: 11.3
CUDNN Version: 8.2
Operating System + Version: Ubuntu 20.04
Python Version (if applicable): 3.7.9
TensorFlow Version (if applicable): not used
PyTorch Version (if applicable): 1.10.1+cu113
Baremetal or Container (if container which image + tag):

Hi,

We request you to share the model, script, profiler, and performance output (if not already shared) so that we can help you better.

Alternatively, you can try running your model with the trtexec command.
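For example, one way to invoke it from Python (the ONNX file name, input tensor name, and shape below are placeholders for your own model; you can equally run the same command directly in a terminal):

```python
# A sketch of invoking trtexec from Python; the file name, input tensor name,
# and shape are placeholders. trtexec must be on PATH.
import subprocess

subprocess.run(
    [
        "trtexec",
        "--onnx=model.onnx",           # or --loadEngine=<your engine file>
        "--fp16",                      # build/run with FP16 enabled
        "--shapes=input:1x3x224x224",  # only needed for dynamic input shapes
        "--dumpProfile",               # per-layer timing breakdown
    ],
    check=True,
)
```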

While measuring the model performance, make sure you consider the latency and throughput of the network inference only, excluding the data pre- and post-processing overhead.
Please refer to the below links for more details:
https://docs.nvidia.com/deeplearning/tensorrt/archives/tensorrt-803/best-practices/index.html#measure-performance

https://docs.nvidia.com/deeplearning/tensorrt/archives/tensorrt-803/best-practices/index.html#model-accuracy
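As a rough sketch of what "network inference only" means (reusing the `context`, `d_input`, and `d_output` from the snippet in your post; the names and iteration counts are placeholders):

```python
# Time only context.execute_v2(), excluding pre/post-processing.
import time
import pycuda.driver as cuda

WARMUP, RUNS = 10, 100

for _ in range(WARMUP):                      # warm-up runs, not measured
    context.execute_v2([int(d_input), int(d_output)])
cuda.Context.synchronize()

start = time.perf_counter()
for _ in range(RUNS):
    context.execute_v2([int(d_input), int(d_output)])
cuda.Context.synchronize()                   # make sure all GPU work is done
elapsed = time.perf_counter() - start

print(f"mean latency: {1000 * elapsed / RUNS:.3f} ms")
print(f"throughput:   {RUNS / elapsed:.1f} inferences/s")
```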

Thanks!