Do I have to use FP16 in the preprocessing of an image when the engine has been quantized to FP16 using "trtexec"?

Hi

I have used “trtexec” to create a .trt file. This is the command that I used:
/usr/src/tensorrt/bin/trtexec --onnx=inception_v1_2016_08_28_frozen.onnx --saveEngine=inception_v1_2016_08_28_fp16.trt --workspace=4096 --fp16

Now I want to use that file to run inference (WITHOUT “trtexec”). I am using the “preprocessing” and “postprocessing” from this GitHub repo: tf_to_trt_image_classification

  1. Do I have to change the dtype=np.float32 to dtype=np.float16 in the preprocessing?

Thank you

Hi,

The --fp16 flag only changes the precision of the weights and internal computation.
Please still use float32 for the input buffer.

Thanks.

“The --fp16 flag only changes the precision of the weights and internal computation.”

So this means the quantization is FP16, right?

So to sum up, I have to use FP32 both in the preprocessing of the input image and when allocating the buffers, right?

This is the code I use to allocate buffers:
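Since the original snippet was not included, here is a minimal sketch of how the host buffers would be sized for this engine. The input shape (1, 3, 224, 224) and output size (1001 classes) are assumptions based on the stock Inception v1 ONNX model; the key point from the answer above is that the buffers stay np.float32 even though the engine was built with --fp16:

```python
import numpy as np

# Assumed I/O dimensions for inception_v1_2016_08_28_frozen.onnx.
INPUT_SHAPE = (1, 3, 224, 224)
OUTPUT_SIZE = 1001

# The engine was built with --fp16, but the I/O bindings remain FP32,
# so the host buffers are allocated as np.float32, not np.float16.
h_input = np.empty(INPUT_SHAPE, dtype=np.float32)
h_output = np.empty(OUTPUT_SIZE, dtype=np.float32)

# With PyCUDA, the matching device buffers would then be allocated as:
#   d_input = cuda.mem_alloc(h_input.nbytes)
#   d_output = cuda.mem_alloc(h_output.nbytes)
print(h_input.nbytes)  # 1 * 3 * 224 * 224 * 4 bytes = 602112
```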

This is the code I use to preprocess the input image:
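Again, since the snippet was not included, here is a sketch of Inception-style preprocessing modeled on the tf_to_trt_image_classification repo: cast to float32, scale pixels to roughly [-1, 1], and reorder to CHW. The exact scaling constants are an assumption; the point relevant to the question is the dtype=np.float32, which does not change for an --fp16 engine:

```python
import numpy as np

def preprocess(image_uint8):
    # image_uint8: HxWx3 uint8 array, already resized to 224x224.
    # Note np.float32 here: the input buffer stays FP32 for an --fp16 engine.
    x = image_uint8.astype(np.float32)
    x = (x - 128.0) / 128.0           # scale to approximately [-1, 1] (assumed constants)
    x = np.transpose(x, (2, 0, 1))    # HWC -> CHW to match the engine binding
    return np.ascontiguousarray(x[np.newaxis, ...])  # add batch dimension

# Dummy image stands in for a real decoded JPEG.
img = np.random.randint(0, 256, (224, 224, 3), dtype=np.uint8)
batch = preprocess(img)
print(batch.shape, batch.dtype)  # (1, 3, 224, 224) float32
```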

Hi,

Yes, please use FP32 when allocating the buffers.

Thanks.
