Do I have to use FP16 in the preprocessing of an image when the engine has been quantized to FP16 using "trtexec"?

Hi

I have used “trtexec” to create a .trt file. This is the command that I used:
/usr/src/tensorrt/bin/trtexec --onnx=inception_v1_2016_08_28_frozen.onnx --saveEngine=inception_v1_2016_08_28_fp16.trt --workspace=4096 --fp16

Now I want to use that file to run inference (WITHOUT “trtexec”). I am using the “preprocessing” and “postprocessing” from this GitHub repo: tf_to_trt_image_classification

  1. Do I have to change the dtype=np.float32 to dtype=np.float16 in the preprocessing?

Thank you

Hi,

The --fp16 flag only changes the precision of the weights and internal computation.
Please still use float32 for the input buffer.

Thanks.

“The --fp16 flag only changes the precision of the weights and internal computation.”

So this means the quantization is FP16, right?

So to sum up, I have to use FP32 both in the preprocessing of the input image and when allocating the buffers, right?

This is the code I use to allocate buffers:
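Since the original snippet was not included, here is a minimal sketch of how the host buffers would be sized for this engine. The input shape (1, 3, 224, 224) and output size (1001 classes) are assumptions based on the stock Inception v1 ONNX model; the key point from the answer above is that the buffers stay np.float32 even though the engine was built with --fp16:

```python
import numpy as np

# Assumed I/O dimensions for inception_v1_2016_08_28_frozen.onnx.
INPUT_SHAPE = (1, 3, 224, 224)
OUTPUT_SIZE = 1001

# The engine was built with --fp16, but the I/O bindings remain FP32,
# so the host buffers are allocated as np.float32, not np.float16.
h_input = np.empty(INPUT_SHAPE, dtype=np.float32)
h_output = np.empty(OUTPUT_SIZE, dtype=np.float32)

# With PyCUDA, the matching device buffers would then be allocated as:
#   d_input = cuda.mem_alloc(h_input.nbytes)
#   d_output = cuda.mem_alloc(h_output.nbytes)
print(h_input.nbytes)  # 1 * 3 * 224 * 224 * 4 bytes = 602112
```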

This is the code I use to preprocess the input image:
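Again, since the snippet was not included, here is a sketch of Inception-style preprocessing modeled on the tf_to_trt_image_classification repo: cast to float32, scale pixels to roughly [-1, 1], and reorder to CHW. The exact scaling constants are an assumption; the point relevant to the question is the dtype=np.float32, which does not change for an --fp16 engine:

```python
import numpy as np

def preprocess(image_uint8):
    # image_uint8: HxWx3 uint8 array, already resized to 224x224.
    # Note np.float32 here: the input buffer stays FP32 for an --fp16 engine.
    x = image_uint8.astype(np.float32)
    x = (x - 128.0) / 128.0           # scale to approximately [-1, 1] (assumed constants)
    x = np.transpose(x, (2, 0, 1))    # HWC -> CHW to match the engine binding
    return np.ascontiguousarray(x[np.newaxis, ...])  # add batch dimension

# Dummy image stands in for a real decoded JPEG.
img = np.random.randint(0, 256, (224, 224, 3), dtype=np.uint8)
batch = preprocess(img)
print(batch.shape, batch.dtype)  # (1, 3, 224, 224) float32
```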

Hi,

Yes, please use FP32 when allocating the buffers.

Thanks.
