NvInfer Mixed-Precision ONNX


• Hardware Platform (Jetson / GPU)
GPU
• DeepStream Version
6.1.1

Hello,

I am trying to deploy an AMP-trained ONNX model to DeepStream in FP16 mode. Some of the layers (specifically the conv layers) require FP32 precision, since TensorRT will clamp the very small conv weights when converting to FP16 (weights below FP16's minimum representable magnitude underflow), causing faulty results.
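To illustrate the underflow, here is a quick sanity check with NumPy (not TensorRT itself; the weight values are made up for demonstration):

```python
import numpy as np

fp16 = np.finfo(np.float16)
print("smallest normal FP16 magnitude:", fp16.tiny)  # ~6.1e-05

# Weights below the FP16 subnormal range underflow to zero when cast:
w = np.array([1e-2, 1e-5, 1e-8], dtype=np.float32)
print(w.astype(np.float16))  # the 1e-8 weight becomes 0.0
```
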

It should be possible to avoid this by setting layer-device-precision=/beit/embeddings/patch_embeddings/projection/Conv:fp32:gpu in the configuration file; however, this does not seem to work.
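For reference, the relevant part of my nvinfer config looks like the fragment below (the layer name is copied from the ONNX graph; exact names in your export may differ):

```
[property]
# network-mode=2 selects FP16 engine building in nvinfer
network-mode=2
# Keep the patch-embedding conv in FP32 while the rest runs in FP16
layer-device-precision=/beit/embeddings/patch_embeddings/projection/Conv:fp32:gpu
```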

Do you have any suggestions on how I can keep the conv layers in FP32 and run the model in mixed precision?

ONNX model link: https://file.io/qvvjn27luBkt and config file is attached for reference.

Thanks!
beit-base-patch16-224-pt22k-ft22k.config (648 Bytes)

I cannot reproduce the same build log with your model and configuration file. What is your GPU?

I'm testing on a Dell XPS with an RTX 3050 Ti Laptop GPU, CUDA 11.8, and driver 522.30.

Please make sure the compatibility requirements are met on your machine: Quickstart Guide — DeepStream 6.1.1 Release documentation

I think this is a TensorRT issue rather than a DeepStream issue, so I have raised it on the TensorRT forum instead.

Thanks!