I am trying to convert an .onnx model to an FP16 .engine using the command below:

trtexec --onnx=-beit-base-patch16-224.onnx --fp16 --saveEngine=model.engine --minShapes=\'pixel_values\':1x3x224x224 --optShapes=\'pixel_values\':8x3x224x224 --maxShapes=\'pixel_values\':8x3x224x224 --precisionConstraints=obey --layerPrecisions=/beit/embeddings/patch_embeddings/projection/Conv:fp32

The engine is built as expected when using fp32; however, when the --fp16 flag is set, the warnings below appear and the model outputs are wrong:

Even when setting --precisionConstraints=obey together with --layerPrecisions=/beit/embeddings/patch_embeddings/projection/Conv:fp32 or --layerPrecisions=/beit/embeddings/patch_embeddings/projection/Conv.weight:fp32, the weights are still cast to fp16, which results in a bad model conversion.

Am I doing something wrong or is there any way to fix the above issue?

ONNX model link: Download

(Using the DeepStream 6.1.1 Docker image)



RTX 3050TI
Please see the related post for more information.

Could you please try the latest TensorRT version, 8.5.3, and let us know if you still face this issue?

Thank you.

8.5.3 works as expected - thanks.

As a suggestion (if at all possible), it would be convenient to have a flag that automatically avoids casting a layer to fp16 if its weights would get clamped, e.g. --avoid-clamp.

Currently I have ~20 layers that get clamped, and I have to list every one of them explicitly in --layerPrecisions to avoid FP16 clamping.
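
Until something like that exists, one workaround is to generate the long --layerPrecisions value from a list of layer names instead of typing it by hand. Below is a minimal Python sketch of that idea. It assumes the comma-separated layerName:precision format that trtexec accepts for --layerPrecisions. The helper name build_layer_precisions_flag, the model.onnx path, and the contents of clamped_layers are placeholders for illustration; the layer names have to be copied from the clamping warnings in your own build log.

```python
import shlex


def build_layer_precisions_flag(layer_names, precision="fp32"):
    """Return a --layerPrecisions argument that pins every named layer to `precision`."""
    # trtexec expects a comma-separated list of layerName:precision pairs.
    spec = ",".join(f"{name}:{precision}" for name in layer_names)
    return f"--layerPrecisions={spec}"


# Hypothetical list: replace with the layers named in your own clamping warnings.
clamped_layers = [
    "/beit/embeddings/patch_embeddings/projection/Conv",
]

args = [
    "trtexec",
    "--onnx=model.onnx",  # placeholder path, not the real model file name
    "--fp16",
    "--saveEngine=model.engine",
    "--precisionConstraints=obey",
    build_layer_precisions_flag(clamped_layers),
]

# Print a shell-safe command line that can be pasted into a terminal.
print(shlex.join(args))
```

Add the --minShapes/--optShapes/--maxShapes arguments from the original command to args as needed. Keeping the layer names in a list also makes it easy to re-check after a TensorRT upgrade which of them still need to be pinned to fp32.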


