What will the engine's precision be if it is built with an unsupported precision?

cocoyen1995 · April 19, 2021, 10:06am

Description

Hi there,

I use TensorRT to run my inference process.
During the process of converting the model (.onnx) to engine (.trt), I set the precision like this:

	if (run_fp16)
	{
		std::cout << "***USING FP16***\n";
		config->setFlag(BuilderFlag::kFP16);
	}
	else
	{
		std::cout << "***USING INT8***\n";
		config->setFlag(BuilderFlag::kINT8);

		// provided by @le8888e at https://github.com/NVIDIA/TensorRT/issues/557
		// File listing the calibration image paths, one sample per line.
		std::string calibration_imgs_list = Jconfig["cali_image_path"].get<std::string>();
		// Path where the calibration table is saved.
		std::string calibration_table_save_path = "./secret_path/cache_data.cache";
		std::vector<int> MD_size = Jconfig["input_size"];
		float beta = Jconfig["int8_beta"].get<float>();
		std::cout << "beta:" << beta << std::endl;

		// The calibrator must stay alive until the engine has been built.
		int8EntroyCalibrator *calibrator = new int8EntroyCalibrator(1, calibration_imgs_list, calibration_table_save_path, MD_size, beta);
		config->setInt8Calibrator(calibrator);
	}

	samplesCommon::enableDLA(builder, config, -1);

	ICudaEngine* engine = builder->buildEngineWithConfig(*network, *config);

The thing is, I noticed there are functions for checking whether the GPU supports the precision set during the conversion, such as

Builder->platformHasFastInt8()

and

Builder->platformHasFastFp16()

For example, when I built the engine with the precision set to FP16 on a GPU with compute capability 6.1 (GTX 1050), without calling either of the two functions above, I didn’t see any error.
However, according to this table, that GPU doesn’t support FP16.
When I call Builder->platformHasFastFp16(), it returns false.

Since the engine is still built successfully and inference runs as expected,
I’m wondering what precision the engine actually uses for inference in this case.
Does buildEngineWithConfig() automatically fall back to FP32 if it finds that the GPU doesn’t support FP16?
Or does it handle the problem in some other way?

(Thanks in advance for any help or advice!)
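
The two capability queries mentioned above can also be used to gate the precision flags explicitly, instead of relying on whatever the builder does on its own. The following is a minimal, self-contained sketch of that idea and does not call TensorRT itself: `PlatformCaps`, `Precision`, `choosePrecision`, and `toString` are hypothetical names, the two booleans stand in for the results of `platformHasFastFp16()` and `platformHasFastInt8()`, and the fall-back-to-FP32 policy is an assumption of this sketch.

```cpp
#include <string>

// Hypothetical stand-in for the results of builder->platformHasFastFp16()
// and builder->platformHasFastInt8(); the caller fills these in.
struct PlatformCaps {
    bool fastFp16;
    bool fastInt8;
};

enum class Precision { FP32, FP16, INT8 };

// Returns the requested precision only when the platform reports a fast
// path for it; otherwise falls back to FP32 (assumed policy for this sketch).
inline Precision choosePrecision(Precision requested, const PlatformCaps& caps) {
    if (requested == Precision::FP16 && caps.fastFp16) return Precision::FP16;
    if (requested == Precision::INT8 && caps.fastInt8) return Precision::INT8;
    return Precision::FP32;
}

// Human-readable name, e.g. for logging which mode is actually used.
inline std::string toString(Precision p) {
    switch (p) {
        case Precision::FP16: return "FP16";
        case Precision::INT8: return "INT8";
        default:              return "FP32";
    }
}
```

In the conversion code, the returned value would decide whether `config->setFlag(BuilderFlag::kFP16)` or `config->setFlag(BuilderFlag::kINT8)` is called at all. With `fastFp16 == false`, a request for FP16 yields FP32 under this policy.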

Environment

TensorRT Version: 7.2.3.4
GPU Type: 2080Ti (develop) / 1050 (deploy)
Nvidia Driver Version: -
CUDA Version: 11.1
CUDNN Version: 8.1.0
Operating System + Version: win10
Python Version (if applicable): -
TensorFlow Version (if applicable): 1.13.1

NVES · April 19, 2021, 10:07am

Hi,
We recommend checking the supported features at the link below.

You can refer to the link below for the full list of supported operators.
For unsupported operators, you need to create a custom plugin to support the operation.

https://github.com/onnx/onnx-tensorrt/blob/main/docs/operators.md

<!--- SPDX-License-Identifier: Apache-2.0 -->

# Supported ONNX Operators

TensorRT 8.4 supports operators up to Opset 17. Latest information of ONNX operators can be found [here](https://github.com/onnx/onnx/blob/master/docs/Operators.md)

TensorRT supports the following ONNX data types: DOUBLE, FLOAT32, FLOAT16, INT8, and BOOL

> Note: There is limited support for INT32, INT64, and DOUBLE types. TensorRT will attempt to cast down INT64 to INT32 and DOUBLE down to FLOAT, clamping values to `+-INT_MAX` or `+-FLT_MAX` if necessary.
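
The clamping arithmetic described in the note can be illustrated with a small sketch. This only shows what a saturating down-cast means, under the assumption that the bounds are symmetric as the note writes them (`+-INT_MAX`, `+-FLT_MAX`); `castDownInt64` and `castDownDouble` are hypothetical helper names, not functions from TensorRT or ONNX-TensorRT.

```cpp
#include <cfloat>
#include <cstdint>
#include <limits>

// Illustrative saturating cast: INT64 -> INT32, clamped to
// [-INT32_MAX, +INT32_MAX] (symmetric, so INT32_MIN is never produced).
inline std::int32_t castDownInt64(std::int64_t v) {
    const std::int64_t hi = std::numeric_limits<std::int32_t>::max();
    if (v > hi) return static_cast<std::int32_t>(hi);
    if (v < -hi) return static_cast<std::int32_t>(-hi);
    return static_cast<std::int32_t>(v);
}

// Illustrative saturating cast: DOUBLE -> FLOAT, clamped to
// [-FLT_MAX, +FLT_MAX]; in-range values are rounded to float as usual.
inline float castDownDouble(double v) {
    if (v > FLT_MAX) return FLT_MAX;
    if (v < -FLT_MAX) return -FLT_MAX;
    return static_cast<float>(v);
}
```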

See below for the support matrix of ONNX operators in ONNX-TensorRT.

## Operator Support Matrix

| Operator | Supported | Supported Types   | Restrictions |
|----------|-----------|-------------------|--------------|
| Abs      | Y         | FP32, FP16, INT32 |              |
| Acos     | Y         | FP32, FP16        |              |
| Acosh    | Y         | FP32, FP16        |              |
| Add      | Y         | FP32, FP16, INT32 |              |

This file has been truncated.

Thanks!

cocoyen1995 · April 19, 2021, 10:15am

Hi @NVES,

Thanks for your quick reply!
Sorry if I didn’t make it clear.
I’m just wondering why the engine can be generated successfully when it is set to an unsupported precision, without first checking whether the GPU supports it,
and what the engine’s precision will be under these circumstances.
I’m not asking about “unsupported operators” Q_Q

spolisetty · April 20, 2021, 6:46am

Hi @cocoyen1995,

It looks like you’re using a GTX 1050, which is a Pascal chip.
It has functional support for FP16 but no FP16 Tensor Cores, so you can still run in FP16. If a GPU has FastFp16 mode, it can use Tensor Cores to accelerate FP16.

Thank you.
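
A toy model may help with the distinction between being allowed to use FP16 and actually benefiting from it. It assumes the documented meaning of `BuilderFlag::kFP16` (enable FP16 layer selection, with FP32 fallback) and that the builder picks among candidate kernels by measured speed. `LayerTiming` and `selectPrecisions` are hypothetical names and the timings are supplied by the caller, so this is an illustration only, not TensorRT's actual algorithm.

```cpp
#include <string>
#include <vector>

// Hypothetical per-layer timing record. A negative fp16Ms means
// "no FP16 kernel is available for this layer".
struct LayerTiming {
    std::string name;
    double fp32Ms;
    double fp16Ms;
};

// For each layer, pick FP16 only if FP16 is allowed, an FP16 kernel exists,
// and it was measured faster than FP32; otherwise fall back to FP32.
inline std::vector<std::string> selectPrecisions(const std::vector<LayerTiming>& layers,
                                                 bool fp16Allowed) {
    std::vector<std::string> chosen;
    chosen.reserve(layers.size());
    for (const LayerTiming& l : layers) {
        const bool useFp16 = fp16Allowed && l.fp16Ms >= 0.0 && l.fp16Ms < l.fp32Ms;
        chosen.push_back(useFp16 ? "FP16" : "FP32");
    }
    return chosen;
}
```

Under this model, an engine built with the FP16 flag on hardware without a fast FP16 path could end up with mostly or entirely FP32 layers, which would be consistent with the build succeeding. The thread does not confirm what the GTX 1050 engine actually contains.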

cocoyen1995 · April 21, 2021, 3:17am

Hi @spolisetty,

Thanks for your reply!
Although I don’t fully understand it yet, I’ll study this topic further.
Thanks again, have a nice day!
