What will the engine's precision be if it is built with an unsupported precision?

Description

Hi there,

I use TensorRT to run my inference process.
When converting the model (.onnx) to an engine (.trt), I set the precision like this:

	if (run_fp16)
	{
		std::cout << "***USING FP16***\n";
		config->setFlag(BuilderFlag::kFP16);
	}
	else
	{
		std::cout << "***USING INT8***\n";
		config->setFlag(BuilderFlag::kINT8);

		// provided by @le8888e at https://github.com/NVIDIA/TensorRT/issues/557
		std::string calibration_imgs_list = Jconfig["cali_image_path"].get<std::string>();  // file listing the calibration images' paths, one sample per line
		// std::string calibration_table_save_path = Jconfig["cali_save_path"].get<std::string>();  // path to save the calibration table
		std::string calibration_table_save_path = "./secret_path/cache_data.cache";  // path to save the calibration table
		std::vector<int> MD_size = Jconfig["input_size"];
		float beta = Jconfig["int8_beta"].get<float>();
		std::cout << "beta:" << beta << std::endl;

		int8EntroyCalibrator *calibrator = nullptr;
		calibrator = new int8EntroyCalibrator(1, calibration_imgs_list, calibration_table_save_path, MD_size, beta);
		config->setInt8Calibrator(calibrator);

	}

	samplesCommon::enableDLA(builder, config, -1);

	ICudaEngine* engine = builder->buildEngineWithConfig(*network, *config);

The thing is, I saw that there are functions for checking whether the GPU supports the precision you set during the conversion, such as

builder->platformHasFastInt8()

and

builder->platformHasFastFp16()

For example, I didn’t see any error when I set the precision to FP16 on a GPU with compute capability 6.1 (GTX 1050) and built the engine without calling the two functions mentioned above.
However, according to this table, the GPU I used doesn’t support FP16.
If I call builder->platformHasFastFp16(), it returns false.
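
For readers who want to gate the precision flags on these queries, here is a minimal sketch. It assumes the two booleans come from builder->platformHasFastFp16() and builder->platformHasFastInt8(); the enum and the helper choosePrecision are hypothetical names made up for illustration and are not part of TensorRT.

```cpp
#include <cassert>

// Hypothetical type, not a TensorRT API: the precision the caller decides to request.
enum class RequestedPrecision { kFP32, kFP16, kINT8 };

// Decide which precision to request from the builder.
// preferFp16  : mirrors run_fp16 in the snippet above (false means INT8 is preferred).
// hasFastFp16 : assumed to be the result of builder->platformHasFastFp16().
// hasFastInt8 : assumed to be the result of builder->platformHasFastInt8().
// Falls back to FP32 when the platform does not report fast support
// for the preferred reduced precision.
RequestedPrecision choosePrecision(bool preferFp16, bool hasFastFp16, bool hasFastInt8)
{
    if (preferFp16)
    {
        return hasFastFp16 ? RequestedPrecision::kFP16 : RequestedPrecision::kFP32;
    }
    return hasFastInt8 ? RequestedPrecision::kINT8 : RequestedPrecision::kFP32;
}
```

In the snippet above this could be called as choosePrecision(run_fp16, builder->platformHasFastFp16(), builder->platformHasFastInt8()), setting BuilderFlag::kFP16 or BuilderFlag::kINT8 only when the corresponding value comes back and leaving the config at its FP32 default otherwise.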

Since the engine is still built successfully and inference runs as expected,
I’m wondering what precision the engine actually uses for inference in this case.
Does buildEngineWithConfig() automatically change the precision to FP32 if it finds that the GPU doesn’t support FP16?
Or does it handle this situation in some other way?

(Thanks in advance for any help or advice!)

Environment

TensorRT Version: 7.2.3.4
GPU Type: 2080Ti (develop) / 1050 (deploy)
Nvidia Driver Version: -
CUDA Version: 11.1
CUDNN Version: 8.1.0
Operating System + Version: win10
Python Version (if applicable): -
TensorFlow Version (if applicable): 1.13.1

Hi,
We recommend checking the supported features at the link below.
https://docs.nvidia.com/deeplearning/tensorrt/support-matrix/index.html
You can refer to the link below for the list of all supported operators.
For unsupported operators, you need to create a custom plugin to support the operation.

Thanks!

Hi @NVES ,

Thanks for your quick reply!
Sorry if I didn’t make it clear.
I’m just wondering why the engine can be built successfully when it is set to an unsupported precision, without checking whether the GPU supports it.
And what will the engine’s precision be under these circumstances?
I’m not asking about “unsupported operators” Q_Q

Hi @cocoyen1995,

It looks like you’re using a GTX 1050.
It is a Pascal chip with functional support for FP16 but no FP16 Tensor Cores, so the engine can still run in FP16, just without that acceleration. If a GPU has the FastFp16 mode, it can use Tensor Cores to accelerate FP16.
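
As a rough mental model of the point above, the following toy sketch may help. It rests on the assumption, taken from the TensorRT header comment for BuilderFlag::kFP16 ("Enable FP16 layer selection, with FP32 fallback"), that the flag allows FP16 rather than forcing it, and that the builder keeps whichever candidate times fastest per layer. All types, names, and timing values here are hypothetical and are not TensorRT APIs or measurements.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model only: hypothetical types, not TensorRT APIs.
struct LayerTimings
{
    std::string name;
    double fp32Ms;  // time of the best FP32 kernel for this layer (made-up value)
    double fp16Ms;  // time of the best FP16 kernel for this layer (made-up value)
};

enum class Chosen { kFP32, kFP16 };

// With FP16 allowed, each layer takes whichever precision timed faster;
// without it, FP32 is the only candidate.
std::vector<Chosen> pickPerLayer(const std::vector<LayerTimings>& layers, bool fp16Allowed)
{
    std::vector<Chosen> out;
    out.reserve(layers.size());
    for (const auto& layer : layers)
    {
        const bool useFp16 = fp16Allowed && layer.fp16Ms < layer.fp32Ms;
        out.push_back(useFp16 ? Chosen::kFP16 : Chosen::kFP32);
    }
    return out;
}
```

Under this model, an engine built with kFP16 on a GPU without fast FP16 can end up mostly or entirely in FP32 if the FP16 kernels time slower, which would be consistent with the build succeeding and inference running normally.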

Thank you.

Hi @spolisetty ,

Thanks for your reply!
Although I don’t fully understand it yet, I’ll study this topic further.
Thanks again, have a nice day!
