Does TensorRT 5 automatically enable Tensor Cores for INT8 and FP16 mode?

Hi,

From https://devblogs.nvidia.com/tensorrt-integration-speeds-tensorflow-inference/, it seems Tensor Cores are automatically enabled when the model is configured in INT8 or FP16 mode. The article shows a Python example as well.

May I know if Tensor Cores can also be turned on when I use the C++ API, e.g. via

virtual void setInt8Mode(bool mode) = 0;
and
virtual void setFp16Mode(bool mode) = 0;

Thanks!

Hello,

Yes, you can set both INT8 and FP16 mode if the platform supports them. TensorRT will then choose the most performant kernel to perform inference. For more information, please refer to:

https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html#mixed_precision_c
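For reference, here is a minimal sketch of how those builder calls fit together in the TensorRT 5 C++ API. The names `builder` and `calibrator` are placeholders supplied by the caller; network construction and error handling are omitted.

```cpp
// Sketch: enabling reduced precision on a TensorRT 5 IBuilder.
// Assumes TensorRT 5 headers are available; "calibrator" is a
// user-provided IInt8Calibrator (required for INT8 mode).
#include "NvInfer.h"

void configureBuilder(nvinfer1::IBuilder* builder,
                      nvinfer1::IInt8Calibrator* calibrator)
{
    // Only enable reduced precision if the GPU actually supports it;
    // Tensor Cores are used automatically by the selected kernels.
    if (builder->platformHasFastFp16())
        builder->setFp16Mode(true);

    if (builder->platformHasFastInt8())
    {
        builder->setInt8Mode(true);
        builder->setInt8Calibrator(calibrator);  // INT8 needs calibration data
    }
}
```

Setting both modes lets the builder pick, per layer, whichever precision gives the best performance on the target GPU.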

Thanks for the confirmation.

Also, two related questions:

If my model weights are trained in FP32, will TensorRT automatically convert them in FP16 mode?

Also, if my plugins for the model are all implemented in FP32, can FP16 still be used for the non-plugin stages (e.g. convolution)?

Thanks!

Hello,

TRT will convert the FP32 weights automatically if you specify FP16 mode on the builder.

Thanks. What about the plugin implementation: will TensorRT run the default layers in FP16 mode and the customized layers in FP32 mode?

Hi cjluo, have you figured out whether TensorRT runs plugins in FP16 mode? I am trying to run my custom plugin in FP16 mode, but there seem to be some technical issues with the plugin implementation. I am wondering if we need to do the data type conversion manually.

I think the custom layers are still FP32 by default. I haven't spent time on FP16 plugins yet.
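One way a plugin advertises FP32-only support in the TensorRT 5 C++ API is through `IPluginV2::supportsFormat()`. The sketch below is a hypothetical fragment, not a complete plugin; the remaining `IPluginV2` methods (`enqueue`, `serialize`, and so on) are omitted.

```cpp
// Sketch: an IPluginV2 plugin that only accepts FP32 NCHW tensors.
// When the rest of the engine is built in FP16 mode, the builder
// sees this restriction and keeps the plugin's inputs/outputs in FP32.
#include "NvInfer.h"

class MyFp32Plugin : public nvinfer1::IPluginV2
{
public:
    bool supportsFormat(nvinfer1::DataType type,
                        nvinfer1::PluginFormat format) const override
    {
        // Advertise support for FP32 NCHW only.
        return type == nvinfer1::DataType::kFLOAT
            && format == nvinfer1::PluginFormat::kNCHW;
    }

    // ... the other IPluginV2 methods (getNbOutputs, enqueue,
    //     serialize, clone, etc.) are omitted in this sketch ...
};
```

With this restriction in place, the non-plugin layers can still run in FP16 while the plugin stage stays in FP32.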