TensorRT 4.0.1 - INT8 precision vs. FP32 precision object detection inference results


The TensorRT UFF model was generated and used on the following platform:
Linux distro and version - Linux-x86_64, Ubuntu, 16.04
GPU type - GeForce GTX 1080
nvidia driver version - 396.26
CUDA version - Release 9.0, V9.0.252
CUDNN version - 7.1.4
Python version – 3.5.2
Tensorflow version – 1.9
TensorRT version – 4.0.1

I’m using the TensorRT C++ API to run inference with a CNN model (YOLOv3) that was developed and trained using the TensorFlow and TensorRT Python APIs.

The model has three outputs, each based on a different grid-size division of the input images, so that it can detect objects at three different resolutions - let’s call them Large, Medium and Small.

When inference is run with TensorFlow C++ in FP32 and with TensorRT C++ in FP32 precision, the final detection results are identical and all resolutions work well.

When inference is run with TensorRT C++ in INT8 precision, the Small resolution doesn’t work well, which causes small or far objects to go undetected.

I implemented the TensorRT calibration logic myself and performed the process according to the TensorRT developer guide instructions, using part of the original training set as the calibration set.
The CUDA engine file was generated using the resulting calibration table.

When I save the CNN inference outputs of both TensorRT FP32 and TensorRT INT8 to disk and measure how much their values differ using a simple mean squared error calculation, I find it is between 0.2 and 0.4.
When I compare TensorFlow C++ FP32 with TensorRT C++ FP32, the mean squared error is infinitesimal - ~5e-10.

I think this is the root cause of why my post-processing phase cannot recover the small objects.

These are my questions:

  1. What can be the root cause of this large accuracy difference between TensorRT FP32 and TensorRT INT8?
  2. Is there a way to zoom in and identify which layers in the graph are not working well with INT8 precision?
  3. Can I mark specific layers (using my TensorRT version) to stay in FP16 or FP32 precision instead of INT8?


This could be due to a number of reasons. Is the pre-processing the same for doing calibration and doing inference? Is your calibration data representative of the data you’re doing inference on?

Before exploring this any further though, I would strongly suggest updating to TensorRT 6.0, as TensorRT 4 is quite old at this point, and there may have been improvements that will just fix your issue.

NVIDIA Enterprise Support

Thanks NVES_R for your quick response!

The pre-processing for calibration and inference is exactly the same (I’m using the same class method, which I implemented myself).

For the inference process, I decided to take some data from my calibration set, which means that my inference data was also used by the calibration process.

I did that to keep things as simple as possible.

I will try to update my platform to the new TensorRT version ASAP and update you with the results.

No problem. I forgot to mention this, but for simplicity, if you have Docker + Nvidia-Docker installed, you can easily run various versions of TensorRT using our NGC containers.

See here for our TensorRT containers: https://ngc.nvidia.com/catalog/containers/nvidia:tensorrt

nvcr.io/nvidia/tensorrt:19.09-py3 contains TensorRT 6.0 and removes the need to handle all of the dependencies on the host side.


I moved to the new TensorRT version on my local workstation (I didn’t work with the NGC container), choosing the one compatible with my CUDA version 9.0, but left cuDNN at 7.1.4, despite seeing that this TensorRT version was tested with cuDNN 7.6.3.
I assumed cuDNN wasn’t critical for my problem (please correct me if I’m wrong).

I have an object detection CNN with three outputs; each one is used to detect objects at a different resolution - large, medium and small.
I executed the CNN with TRT6 and TRT4 in two modes, FP32 and INT8, and also with TF, but only in FP32.
When I run the CNN, some of the objects cannot be detected, especially the small ones.
I saved the CNN outputs to disk as binary files.

I calculated the mean squared error between the outputs of the different running modes and got these results:
• TF FP32 vs. TRT6 FP32 - ~1.731e-06 - ~9.44e-07 - detected objects are OK for both TRT and TF.
• TRT6 FP32 vs. TRT4 FP32 - ~7.26e-07 - ~0.0 - detected objects are OK for both TRT6 and TRT4.
• TRT6 INT8 vs. TRT4 INT8 - ~0.19 - ~0.15 - detected objects are not OK for both TRT6 and TRT4.
• TRT6 FP32 vs. TRT6 INT8 - ~0.408 - ~0.285 - detected objects are OK for FP32 and not OK for INT8.

The calibration process was done for TRT6 and TRT4 in the same way, with the same calibration set, which is a subset of the training set.

Thanks for your help!

Are there any suggestions on how to deal with my TRT INT8 accuracy problems?

My design depends on the TRT INT8 capability, which has a huge impact on my application’s performance.


Hi orong13,

Sorry for the delay in response.

I believe int8 calibration is somewhat of a balancing act, similar to hyperparameter searching with neural networks. You’ll probably need to play around with the amount of calibration data you provide.


(1) You should make sure that the calibration data is fully representative of the test data you’re using for measuring accuracy.
(2) You mention that your calibration data is a subset of your training data, but for real-world applications I believe the calibration dataset should be mutually exclusive from the training/validation/test sets.

Here’s a snippet from a blog post (https://devblogs.nvidia.com/int8-inference-autonomous-vehicles-tensorrt/) on this:

“You want to make sure that the calibration dataset covers all the expected scenarios; for example, clear weather, rainy day, night scenes, etc. If you are creating your own dataset, we recommend creating a separate calibration dataset. The calibration dataset shouldn’t overlap with the training, validation or test datasets, in order to avoid a situation where the calibrated model only works well on these datasets.”

Hello NVES_R,

Thanks for your response!

I understand all your recommendations; they make sense, of course.
But still, I think that my case is much simpler, and I will try to clarify why…

The calibration dataset is a subset of the training/validation dataset.
The test dataset is a subset of my calibration set.

For now, I don’t care what the results will be for a frame that is mutually exclusive from the training/validation/test sets.
First, I want to make sure the simple case works, where the training/validation/test/calibration datasets are taken from the same frames.

Once I achieve satisfactory accuracy results for the INT8 inference outputs compared to the FP32/FP16 inference outputs, I will upgrade my calibration dataset to include totally new frames that are mutually exclusive from the training/validation/test sets.

So, actually, my problem is: why can’t I achieve good results for INT8 inference even when my calibration, training and test datasets overlap (no new frames)?



Could you share the error function you used for measuring the difference?
Are you calculating the MSE on the difference of the bounding box values?

Here is a slide explaining the details of quantization in INT8 mode.

Please note that INT8 may introduce error but, in most cases, this error won’t affect the real classification result.
However, values that go through INT8 mode will be slightly different from FP32 mode.

So instead of MSE, it’s recommended to use IoU to measure the difference between INT8 and FP32.



The CNN inference operation generates three outputs; let’s call them output1-output3.
The MSE is computed between the INT8-mode outputX and the FP32-mode outputX of the CNN inference.

The MSE is computed over the entire outputX data, where the data is the detected objects’ information.
The outputX data is obviously taken before the post-processing phase, which filters the valid data based on my configuration, such as confidence level, NMS, etc.

This is the comparison code (note that, despite the name, it computes the mean absolute difference - the squared-error term is commented out):

#include <cmath>
#include <cstddef>
#include <fstream>

struct Result
{
	float difference = 0.f;
	float percentage = 0.f;
};

Result Compare(std::ifstream& f1, std::ifstream& f2, float epsilon = 0.f)
{
	Result result;
	float datum1, datum2;
	float sum = 0.f;
	size_t count = 0;
	size_t differentCounter = 0;
	// Read both streams in lockstep; stop at the end of the shorter one.
	while (f1.read(reinterpret_cast<char*>(&datum1), sizeof(float)) &&
	       f2.read(reinterpret_cast<char*>(&datum2), sizeof(float)))
	{
		const float d = datum1 - datum2;
		sum += /*d * d*/ std::fabs(d);
		differentCounter += (std::fabs(d) > epsilon) ? 1 : 0;
		++count; // this was missing - without it both averages divide by zero
	}
	result.difference = (count > 0) ? sum / static_cast<float>(count) : 0.f;
	result.percentage = (count > 0) ? static_cast<float>(differentCounter) / static_cast<float>(count) : 0.f;
	return result;
}
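For reference, here is an equivalent self-contained version of this check that can be run end to end (the file names are hypothetical); it writes two small binary float files and measures their mean absolute difference the same way:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <fstream>
#include <vector>

// Mean absolute difference between two binary files of raw floats.
float meanAbsDiff(const char* path1, const char* path2) {
    std::ifstream f1(path1, std::ios::binary), f2(path2, std::ios::binary);
    float a, b, sum = 0.0f;
    std::size_t count = 0;
    // Read both files in lockstep until the shorter one ends.
    while (f1.read(reinterpret_cast<char*>(&a), sizeof(float)) &&
           f2.read(reinterpret_cast<char*>(&b), sizeof(float))) {
        sum += std::fabs(a - b);
        ++count;
    }
    return count ? sum / static_cast<float>(count) : 0.0f;
}

// Helper to dump a float vector as a raw binary file.
void writeFloats(const char* path, const std::vector<float>& v) {
    std::ofstream f(path, std::ios::binary);
    f.write(reinterpret_cast<const char*>(v.data()),
            static_cast<std::streamsize>(v.size() * sizeof(float)));
}
```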

Is IoU Intersection over Union?
Can you please explain how I can use IoU to measure the difference between INT8 and FP32?

Attached is a zip file that contains all CNN inference Int8 bits mode and fp32 bits mode outputs.

TRT6_fp32bits_VS_Int8bits.zip (6.84 MB)


There are some quantization and approximation steps inside INT8 mode.
Due to these steps, the INT8 operation is expected to be lossy, meaning that the output won’t be exactly the same as FP32.
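To illustrate why, here is a rough, self-contained sketch of symmetric per-tensor quantization (the real TensorRT internals are more involved; amax stands for the dynamic range chosen by calibration):

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Symmetric quantization sketch: calibration picks a dynamic range
// [-amax, amax], mapped linearly onto the INT8 levels [-127, 127].
// The round trip back to float shows the inherent loss.
float quantizeDequantize(float x, float amax) {
    const float scale = amax / 127.0f;       // size of one quantization step
    long q = std::lround(x / scale);         // round to the nearest level
    q = std::max(-127L, std::min(127L, q));  // clamp to the INT8 range
    return static_cast<float>(q) * scale;    // back to float: x plus rounding error
}
```

For any |x| <= amax the round-trip error is bounded by scale/2 = amax/254, but it grows with the calibrated range, and values beyond amax saturate entirely; that is one reason a per-element epsilon comparison against FP32 can legitimately fail.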

In general, we measure the difference between INT8 and FP32 via accuracy rather than value difference.
That’s why I recommend using IoU to check if there is any accuracy degradation in INT8 mode.
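To make that concrete: run the post-processing on both the FP32 and the INT8 outputs, match the resulting detections pairwise, and compute the IoU of each matched pair; values near 1.0 mean the two precisions agree on that object. A minimal sketch (the Box layout is an assumption - adapt it to however your post-processing represents detections):

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Axis-aligned bounding box in corner format (hypothetical layout).
struct Box { float x1, y1, x2, y2; };

// Intersection over Union: overlap area divided by combined area.
// 1.0 means identical boxes, 0.0 means no overlap.
float iou(const Box& a, const Box& b) {
    const float iw = std::max(0.0f, std::min(a.x2, b.x2) - std::max(a.x1, b.x1));
    const float ih = std::max(0.0f, std::min(a.y2, b.y2) - std::max(a.y1, b.y1));
    const float inter = iw * ih;
    const float unionArea = (a.x2 - a.x1) * (a.y2 - a.y1)
                          + (b.x2 - b.x1) * (b.y2 - b.y1) - inter;
    return unionArea > 0.0f ? inter / unionArea : 0.0f;
}
```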

In our TensorRT samples, there is also a difference measure for INT8, but it tolerates tiny precision differences.

In /usr/src/tensorrt/samples/sampleINT8

auto isApproximatelyEqual = [](float a, float b, double tolerance)
{
    return (std::abs(a - b) <= tolerance);
};
double fp16tolerance{0.5}, int8tolerance{1.0};

Does this measurement make sense to you?


I understand why there is an accuracy gap between FP32/FP16 and INT8; it all makes sense.

My expectation was that this gap would be negligible, so that despite the fact that it exists, post-processing the CNN outputs would produce the same results.

Maybe I’m missing something, but I cannot find any difference between my comparison implementation and the one implemented in sampleINT8.

Can you please clarify what IoU is?

Additionally, when I run my comparison check with epsilon equal to 0.5 or 1.0, I still get many cases with a value gap bigger than this epsilon.

No matter what the comparison method is, is there an expected limit to the value gap between FP32/FP16 and INT8?

If this limit exists, and my CNN generates a bigger gap than it, maybe that tells something about the way I’m controlling INT8 mode?!


An additional important point that I forgot to share is that I’m using a custom layer.

My CNN model has several upsample layers.
Each of them is replaced with a custom layer that I implemented via nvinfer1::IPluginExt, nvuffparser::IPluginFactoryExt & nvinfer1::IPluginFactory, according to the samples provided as part of the TensorRT package.

The enqueue method of my UpSamplePlugin class (inherited from nvinfer1::IPluginExt) includes an explicit implementation for both FP32 and FP16 modes.

It doesn’t include any explicit implementation for INT8, because I thought this would be handled for me transparently as a TensorRT user and considered automatically when buildCudaEngine is activated after the calibration table is created.

  • Should I update the enqueue method to have a specific implementation for INT8 mode?
  • What value does the nvinfer1::DataType type argument of the configureWithFormat & supportsFormat methods take?

I just want to remind you that my accuracy problem is mostly related to the small-resolution outputs (see my problem description above), which means it occurs after the upsample custom layers.
The upsample custom layer isn’t used by the large resolution at all.
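For clarity, here is a mock of the kind of format support my plugin currently declares, using local stand-in enums instead of the real nvinfer1 types (illustration only, not my actual code): since kINT8 is never advertised, my assumption was that TensorRT would reformat around the plugin rather than feed it INT8 data.

```cpp
#include <cassert>

// Local stand-ins for nvinfer1::DataType / PluginFormat, just to
// illustrate the idea without the TensorRT headers (hypothetical mirror).
enum class DataType { kFLOAT, kHALF, kINT8, kINT32 };
enum class PluginFormat { kNCHW };

// The plugin only advertises FP32/FP16 in NCHW layout; kINT8 is never
// reported as supported, so enqueue has no INT8 path.
bool supportsFormat(DataType type, PluginFormat format) {
    return (type == DataType::kFLOAT || type == DataType::kHALF)
        && format == PluginFormat::kNCHW;
}
```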