Description
Hi there,
I use TensorRT to run my inference process.
During conversion of the model (.onnx) to an engine (.trt), I set the precision like this:
```cpp
if (run_fp16)
{
    std::cout << "***USING FP16***\n";
    config->setFlag(BuilderFlag::kFP16);
}
else
{
    std::cout << "***USING INT8***\n";
    config->setFlag(BuilderFlag::kINT8);
    // provided by @le8888e at https://github.com/NVIDIA/TensorRT/issues/557
    std::string calibration_imgs_list = Jconfig["cali_image_path"].get<std::string>(); // file listing the calibration images' paths, one sample per line
    //std::string calibration_table_save_path = Jconfig["cali_save_path"].get<std::string>(); // path to save the calibration table
    std::string calibration_table_save_path = "./secret_path/cache_data.cache"; // path to save the calibration table
    std::vector<int> MD_size = Jconfig["input_size"];
    float beta = Jconfig["int8_beta"].get<float>();
    std::cout << "beta:" << beta << std::endl;
    int8EntroyCalibrator *calibrator = nullptr;
    calibrator = new int8EntroyCalibrator(1, calibration_imgs_list, calibration_table_save_path, MD_size, beta);
    config->setInt8Calibrator(calibrator);
}
samplesCommon::enableDLA(builder, config, -1);
ICudaEngine* engine = builder->buildEngineWithConfig(*network, *config);
```
The thing is, I saw there are functions for checking whether your GPU supports the precision you set during conversion, namely `builder->platformHasFastInt8()` and `builder->platformHasFastFp16()`.
For example, I didn't see any error when I set the precision to FP16 and produced the engine on a GPU with compute capability 6.1 (GTX 1050), without calling either of the two functions mentioned above.
However, according to this table, the GPU I used doesn't support FP16.
When I do call `builder->platformHasFastFp16()`, it returns false.
Since the engine can still be built successfully, and inference runs through as expected, I'm wondering what precision the engine actually uses at inference time in this case.
Does `buildEngineWithConfig()` automatically fall back to FP32 if it finds that the GPU doesn't support FP16, or does it handle this situation some other way?
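In the meantime, one option I'm considering is guarding the flags with those queries before setting them. A minimal sketch, assuming the same `builder`, `config`, and `run_fp16` variables as in my snippet above:

```cpp
// Only request a reduced-precision mode if the platform reports native
// support for it; otherwise leave the builder at its default (FP32).
// Note the flags are hints to the builder, not guarantees: individual
// layers may still run FP32 kernels if that is faster.
if (run_fp16 && builder->platformHasFastFp16())
{
    std::cout << "***USING FP16***\n";
    config->setFlag(BuilderFlag::kFP16);
}
else if (!run_fp16 && builder->platformHasFastInt8())
{
    std::cout << "***USING INT8***\n";
    config->setFlag(BuilderFlag::kINT8);
    // ... set up the INT8 calibrator as in the snippet above ...
}
else
{
    std::cout << "Requested precision not natively supported; building in FP32\n";
}
```

Is this guard the recommended pattern, or is it redundant because the builder already handles unsupported precisions internally?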
(Thanks in advance for any help or advice!)
Environment
TensorRT Version: 7.2.3.4
GPU Type: 2080Ti (develop) / 1050 (deploy)
Nvidia Driver Version: -
CUDA Version: 11.1
CUDNN Version: 8.1.0
Operating System + Version: Windows 10
Python Version (if applicable): -
TensorFlow Version (if applicable): 1.13.1