Does TensorRT currently support quantization to bfloat16? I don't see such an option. If not, will that option be added in the future? Thanks!
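For context, here is a minimal sketch of what enabling BF16 looks like when building an engine from ONNX. This assumes a TensorRT release recent enough to expose `trt.BuilderFlag.BF16` (introduced in the 9.x line); the `model.onnx` path is a placeholder, and the `hasattr` guard is just an illustrative way to probe for the flag on older builds:

```python
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

builder = trt.Builder(TRT_LOGGER)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
)
parser = trt.OnnxParser(network, TRT_LOGGER)

# "model.onnx" is a placeholder path for whatever model you are converting.
with open("model.onnx", "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("failed to parse ONNX model")

config = builder.create_builder_config()

# Recent TensorRT releases expose a BF16 builder flag alongside FP16/INT8;
# guard with hasattr so this degrades gracefully on versions that predate it.
if hasattr(trt.BuilderFlag, "BF16"):
    config.set_flag(trt.BuilderFlag.BF16)
else:
    print("This TensorRT build does not expose a BF16 builder flag")

serialized_engine = builder.build_serialized_network(network, config)
```

Worth noting: unlike INT8 or FP8, bfloat16 is a 16-bit floating-point format, so enabling it is a reduced-precision mode (no calibration or scale factors) rather than quantization in the strict sense.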